Algorithm 1: The online LyDROO algorithm for solving (P1).

Input: parameters $V$, $\{\gamma_i, c_i\}_{i=1}^{N}$, $K$, training interval $\delta_T$, $M_t$ update interval $\delta_M$.
Output: control actions $\{x^t, y^t\}_{t=1}^{K}$.

1. Initialize the DNN with random parameters $\theta^1$ and empty replay memory, $M_1 \leftarrow 2N$;
2. Empty initial data queue $Q_i(1) = 0$ and energy queue $Y_i(1) = 0$, for $i = 1, \cdots, N$;
3. for $t = 1, 2, \ldots, K$ do
4. Observe the input $\xi^t = \{h_i^t, Q_i(t), Y_i(t)\}_{i=1}^{N}$ and update $M_t$ using (8) if $\mathrm{mod}(t, \delta_M) = 0$;
5. Generate a relaxed offloading action $\hat{x}^t = \Pi_{\theta^t}(\xi^t)$ with the DNN;
6. Quantize $\hat{x}^t$ into $M_t$ binary actions $\{x_i^t \mid i = 1, \cdots, M_t\}$ using the NOP method;
7. Compute $G(x_i^t, \xi^t)$ by optimizing resource allocation $y_i^t$ in (P2) for each $x_i^t$;
8. Select the best solution $x^t = \arg\max_{\{x_i^t\}} G(x_i^t, \xi^t)$ and execute the joint action $(x^t, y^t)$;
9. Update the replay memory by adding $(\xi^t, x^t)$;
10. if $\mathrm{mod}(t, \delta_T) = 0$ then
11. Uniformly sample a batch of data set $\{(\xi^\tau, x^\tau) \mid \tau \in S^t\}$ from the memory;
12. Train the DNN with $\{(\xi^\tau, x^\tau) \mid \tau \in S^t\}$ and update $\theta^t$ using the Adam algorithm;
13. end
14. $t \leftarrow t + 1$;
15. Update $\{Q_i(t), Y_i(t)\}_{i=1}^{N}$ based on $(x^{t-1}, y^{t-1})$ and data arrival observation $\{A_i^{t-1}\}_{i=1}^{N}$ using (5) and (7).
16. end

How does this algorithm decide on the optimal action?
Time: 2023-06-17 22:01:59
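The optimal action is chosen by evaluating the objective value of every candidate discrete action. Specifically, in step 7, for each candidate binary offloading action $x_i^t$ produced by the quantization in step 6, the algorithm optimizes the resource allocation $y_i^t$ in (P2) and obtains the corresponding value $G(x_i^t, \xi^t)$. In step 8 it selects the candidate with the largest $G$ value as $x^t$ and executes the joint action $(x^t, y^t)$. Step 9 then adds $(\xi^t, x^t)$ to the replay memory.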
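The selection in steps 7 and 8 can be sketched as a plain arg-max over the candidate actions. This is only an illustration: `BestActionSelector`, `selectBest`, and the toy scoring function are hypothetical names, and the score used here is a made-up per-device gain, not the (P2) objective, which in the real algorithm requires solving a resource-allocation problem for each candidate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

class BestActionSelector {
    /** Returns the candidate with the largest score; on ties the earliest candidate wins. */
    static int[] selectBest(List<int[]> candidates, ToDoubleFunction<int[]> score) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("at least one candidate action is required");
        }
        int[] best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int[] x : candidates) {
            double g = score.applyAsDouble(x); // stand-in for G(x_i^t, xi^t)
            if (best == null || g > bestScore) {
                best = x;
                bestScore = g;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical per-device gains, used only to make the example runnable.
        double[] offloadGain = {3.0, 1.0};
        double[] localGain = {2.0, 2.5};
        ToDoubleFunction<int[]> toyScore = x -> {
            double sum = 0.0;
            for (int i = 0; i < x.length; i++) {
                sum += (x[i] == 1) ? offloadGain[i] : localGain[i];
            }
            return sum;
        };
        // Three candidate binary offloading actions for two devices.
        List<int[]> candidates = List.of(new int[]{0, 0}, new int[]{1, 0}, new int[]{1, 1});
        System.out.println(Arrays.toString(selectBest(candidates, toyScore))); // prints [1, 0]
    }
}
```

Passing the scoring function in as a parameter keeps the arg-max separate from however $G$ is actually computed, so the same loop works whether the score comes from a toy formula or from a solver for (P2).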
Related questions
For Algorithm 1 (the online LyDROO algorithm for solving (P1)), what are the inputs and outputs of the algorithm?
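Input: parameters $V$, $\{\gamma_i, c_i\}_{i=1}^{N}$, $K$, training interval $\delta_T$, and $M_t$ update interval $\delta_M$.
Output: control actions $\{x^t, y^t\}_{t=1}^{K}$.
The listing itself does not define these symbols. In the Lyapunov drift-plus-penalty setting that LyDROO builds on, $V$ is a single scalar control weight (not a set of parameters) that trades the objective off against queue stability. $\gamma_i$ and $c_i$ are per-user constants for user $i$; in the LyDROO formulation of (P1) they are usually read as the user's energy (power) threshold and the weight on its computation rate, respectively. $K$ is the number of time frames the loop runs for, $\delta_T$ is the number of frames between DNN training steps, and $\delta_M$ is the number of frames between updates of $M_t$, the number of candidate binary actions. The output is the sequence of control actions over all $K$ frames: the offloading decision $x^t$ and the resource allocation $y^t$ for each $t = 1, \ldots, K$.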
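To make the role of $K$, $\delta_T$, and $\delta_M$ concrete, here is a minimal sketch of the schedule they imply: the frames at which the `mod(t, δ) = 0` tests in steps 4 and 10 fire. `LydrooSchedule` and `triggerFrames` are hypothetical names, and the values in `main` are illustrative only, not settings taken from the algorithm.

```java
import java.util.ArrayList;
import java.util.List;

class LydrooSchedule {
    /** Returns every frame t in 1..K for which t mod interval == 0. */
    static List<Integer> triggerFrames(int K, int interval) {
        if (K < 0 || interval <= 0) {
            throw new IllegalArgumentException("K must be >= 0 and interval must be > 0");
        }
        List<Integer> frames = new ArrayList<>();
        for (int t = 1; t <= K; t++) {
            if (t % interval == 0) {
                frames.add(t);
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        int K = 10;      // number of time frames (illustrative)
        int deltaT = 4;  // training interval (illustrative)
        int deltaM = 5;  // M_t update interval (illustrative)
        System.out.println("train DNN at frames: " + triggerFrames(K, deltaT)); // [4, 8]
        System.out.println("update M_t at frames: " + triggerFrames(K, deltaM)); // [5, 10]
    }
}
```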
algorithm identifier in public key not recognised:1.2.156.10197.1.301
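This error usually appears when the Bouncy Castle library parses an SM2 public key whose algorithm identifier is the OID 1.2.156.10197.1.301, and the key factory in use has no mapping for that identifier.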
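One possible workaround, assuming the key material is an ordinary EC point on the SM2 curve, is to register the Bouncy Castle provider and re-tag the key as a standard EC key (`id-ecPublicKey` with the `sm2p256v1` curve) before passing it to the EC key factory. The classes below come from `java.security`, `java.security.spec`, `org.bouncycastle.jce.provider`, `org.bouncycastle.asn1.x509`, `org.bouncycastle.asn1.x9`, and `org.bouncycastle.asn1.gm`:
```
// Sketch: assumes bcprov is on the classpath and `der` holds the Base64-decoded key bytes.
Security.addProvider(new BouncyCastleProvider());
SubjectPublicKeyInfo original = SubjectPublicKeyInfo.getInstance(der);
// Re-tag the key as a standard EC key (id-ecPublicKey) on the sm2p256v1 curve.
AlgorithmIdentifier ecOnSm2 = new AlgorithmIdentifier(X9ObjectIdentifiers.id_ecPublicKey, GMObjectIdentifiers.sm2p256v1);
SubjectPublicKeyInfo retagged = new SubjectPublicKeyInfo(ecOnSm2, original.getPublicKeyData().getBytes());
PublicKey publicKey = KeyFactory.getInstance("EC", "BC").generatePublic(new X509EncodedKeySpec(retagged.getEncoded()));
```
Run this in place of the direct `generatePublic` call on the original bytes. It does not change how Bouncy Castle interprets the SM2 identifier; it rewrites the key's header into a form the EC key factory already understands, which should avoid the "algorithm identifier in public key not recognised" error.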
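In addition, check that the PEM file for your public key is well formed and that the encoded key carries the algorithm identifier you expect. A PEM public key has this layout: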
```
-----BEGIN PUBLIC KEY-----
MIICITCCAYoCCQD2yTo1T8ZQvzANBgkqhkiG9w0BAQsFADBFMQswCQYDVQQGEwJB
....
-----END PUBLIC KEY-----
```
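Here `-----BEGIN PUBLIC KEY-----` and `-----END PUBLIC KEY-----` are the start and end markers of the PEM file, and the text between them is the Base64 encoding of the key in ASN.1 DER format. The Base64 shown above only illustrates the layout: its prefix `MIICITCCAYoCCQD2yTo1T8ZQvzANBgkqhkiG9w0BAQsFADBFMQswCQYDVQQGEwJB` decodes to a structure carrying the sha256WithRSAEncryption OID (1.2.840.113549.1.1.11), so it is not an actual SM2 public key. If the algorithm identifier inside your PEM file is missing, malformed, or not the one your code expects, that can also lead to the "algorithm identifier in public key not recognised" error.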
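A quick way to see which identifier a PEM file carries is to decode its body and look for the DER encoding of the OID. The sketch below uses only the Java standard library; `PemOidCheck` and its methods are hypothetical names. It is a heuristic byte search rather than a real ASN.1 parse, so it can in principle match the same byte pattern inside key data. The DER bytes for 1.2.156.10197.1.301 are `06 08 2a 81 1c cf 55 01 82 2d`.

```java
import java.util.Base64;

class PemOidCheck {
    // DER encoding of OID 1.2.156.10197.1.301: tag 0x06, length 0x08, then the value bytes.
    static final byte[] SM2_OID_DER = {
        0x06, 0x08, 0x2a, (byte) 0x81, 0x1c, (byte) 0xcf, 0x55, 0x01, (byte) 0x82, 0x2d
    };

    /** Drops the BEGIN/END marker lines and Base64-decodes the remaining PEM body. */
    static byte[] decodePem(String pem) {
        StringBuilder body = new StringBuilder();
        for (String line : pem.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("-----")) {
                body.append(trimmed);
            }
        }
        return Base64.getDecoder().decode(body.toString());
    }

    /** Naive search for needle inside haystack. */
    static boolean contains(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    /** True if the decoded PEM body contains the DER bytes of the SM2 OID. */
    static boolean mentionsSm2Oid(String pem) {
        return contains(decodePem(pem), SM2_OID_DER);
    }

    public static void main(String[] args) {
        // Header-only fragment (id-ecPublicKey + SM2 curve OID); not a complete key.
        String pem = "-----BEGIN PUBLIC KEY-----\n"
                + "MFkwEwYHKoZIzj0CAQYIKoEcz1UBgi0DQgAE\n"
                + "-----END PUBLIC KEY-----";
        System.out.println(mentionsSm2Oid(pem)); // prints true
    }
}
```

If the check returns `false` for a file you believe holds an SM2 key, the file may be a different kind of object (for example a certificate or an RSA key), which is worth ruling out before changing any provider configuration.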