Algorithm 1: The online LyDROO algorithm for solving (P1).
input: Parameters $V$, $\{\gamma_i, c_i\}_{i=1}^N$, $K$, training interval $\delta_T$, $M_t$ update interval $\delta_M$;
output: Control actions $\{x^t, y^t\}_{t=1}^K$;
1 Initialize the DNN with random parameters $\theta^1$ and empty replay memory, $M_1 \leftarrow 2N$;
2 Empty initial data queue $Q_i(1) = 0$ and energy queue $Y_i(1) = 0$, for $i = 1, \cdots, N$;
3 for $t = 1, 2, \ldots, K$ do
4   Observe the input $\xi^t = \{h^t, Q_i(t), Y_i(t)\}_{i=1}^N$ and update $M_t$ using (8) if $\mathrm{mod}(t, \delta_M) = 0$;
5   Generate a relaxed offloading action $\hat{x}^t = \pi_{\theta^t}(\xi^t)$ with the DNN;
6   Quantize $\hat{x}^t$ into $M_t$ binary actions $\{x_i^t \mid i = 1, \cdots, M_t\}$ using the NOP method;
7   Compute $G(x_i^t, \xi^t)$ by optimizing resource allocation $y_i^t$ in (P2) for each $x_i^t$;
8   Select the best solution $x^t = \arg\max_{\{x_i^t\}} G(x_i^t, \xi^t)$ and execute the joint action $(x^t, y^t)$;
9   Update the replay memory by adding $(\xi^t, x^t)$;
10   if $\mathrm{mod}(t, \delta_T) = 0$ then
11     Uniformly sample a batch of data set $\{(\xi^\tau, x^\tau) \mid \tau \in \mathcal{S}^t\}$ from the memory;
12     Train the DNN with $\{(\xi^\tau, x^\tau) \mid \tau \in \mathcal{S}^t\}$ and update $\theta^t$ using the Adam algorithm;
13   end
14   $t \leftarrow t + 1$;
15   Update $\{Q_i(t), Y_i(t)\}_{i=1}^N$ based on $(x^{t-1}, y^{t-1})$ and data arrival observation $\{A_i^{t-1}\}_{i=1}^N$ using (5) and (7).
16 end

With the above actor-critic-update loop, the DNN consistently learns from the best and most recent state-action pairs, leading to a better policy $\pi_{\theta^t}$ that gradually approximates the optimal mapping to solve (P3). We summarize the pseudo-code of LyDROO in Algorithm 1, where the major computational complexity is in line 7 that computes $G(x_i^t, \xi^t)$ by solving the optimal resource allocation problems. This in fact indicates that the proposed LyDROO algorithm can be extended to solve (P1) when considering a general non-decreasing concave utility $U(r_i^t)$ in the objective, because the per-frame resource allocation problem to compute $G(x_i^t, \xi^t)$ is a convex problem that can be efficiently solved, where the detailed analysis is omitted. In the next subsection, we propose a low-complexity algorithm to obtain $G(x_i^t, \xi^t)$.

B. Low-complexity Algorithm for Optimal Resource Allocation

Given the value of $x^t$ in (P2), we denote the index set of users with $x_i^t = 1$ as $\mathcal{M}_1^t$, and the complementary user set as $\mathcal{M}_0^t$. For simplicity of exposition, we drop the superscript $t$ and express the optimal resource allocation problem that computes $G(x^t, \xi^t)$ as follows:

$$\text{(P4)}: \quad \underset{\tau,\, f,\, e_O,\, r_O}{\text{maximize}} \quad \sum_{j \in \mathcal{M}_0} \left\{ a_j f_j/\phi - Y_j(t)\,\kappa f_j^3 \right\} + \sum_{i \in \mathcal{M}_1} \left\{ a_i r_{i,O} - Y_i(t)\, e_{i,O} \right\} \qquad \text{(28a)}$$

Where do the model-based DRL algorithm, the optimization-free DRL algorithm, and DNN deep learning each show up in this algorithm?

Time: 2023-06-16 20:05:51 Views: 62

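In this algorithm, the model-based and the optimization-free (model-free) DRL parts both show up in how the discrete decision $x^t$ is produced. In line 5, a DNN-based actor generates a relaxed (continuous) decision $\hat{x}^t$. In line 6, the optimization-free NOP (noisy order-preserving) quantization method turns $\hat{x}^t$ into $M_t$ binary candidate decisions $x_i^t$. The model-based part is lines 7-8, where each candidate is scored by solving the resource allocation problem (P2) and the best-scoring candidate is kept. DNN deep learning shows up in line 12, where the Adam algorithm updates the DNN parameters $\theta^t$ so that the policy gradually approaches the optimal policy $\pi^*$.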
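As a rough illustration of lines 6-8, the sketch below quantizes a relaxed action into a few binary candidates and keeps the one with the highest score. It assumes a plain order-preserving rule (the first candidate thresholds at 0.5, each later one thresholds at the next entry closest to 0.5) and leaves out the noise step of NOP. The class name `QuantizeAndSelect`, both method names, and the `critic` function standing in for $G$ are hypothetical; in the algorithm itself the score comes from solving (P2).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical helper: simplified order-preserving quantization plus best-candidate selection.
class QuantizeAndSelect {

    // Line 6 (simplified, no noise): build up to numCandidates binary actions from relaxed[].
    static List<int[]> quantize(double[] relaxed, int numCandidates) {
        int n = relaxed.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort user indices by distance of their relaxed value from 0.5 (closest first).
        Arrays.sort(order, Comparator.comparingDouble(i -> Math.abs(relaxed[i] - 0.5)));

        List<int[]> candidates = new ArrayList<>();
        int limit = Math.min(numCandidates, n + 1); // at most n + 1 distinct thresholds
        for (int m = 0; m < limit; m++) {
            int[] x = new int[n];
            if (m == 0) {
                // First candidate: plain rounding at 0.5.
                for (int i = 0; i < n; i++) x[i] = relaxed[i] > 0.5 ? 1 : 0;
            } else {
                // Later candidates: threshold at the m-th entry closest to 0.5.
                double pivot = relaxed[order[m - 1]];
                for (int i = 0; i < n; i++) {
                    if (relaxed[i] > pivot) x[i] = 1;
                    else if (relaxed[i] < pivot) x[i] = 0;
                    else x[i] = pivot <= 0.5 ? 1 : 0; // tie: flip the pivot entry itself
                }
            }
            candidates.add(x);
        }
        return candidates;
    }

    // Lines 7-8: score every candidate with the critic (a stand-in for G) and keep the best.
    static int[] selectBest(List<int[]> candidates, ToDoubleFunction<int[]> critic) {
        int[] best = candidates.get(0);
        double bestScore = critic.applyAsDouble(best);
        for (int[] candidate : candidates) {
            double score = critic.applyAsDouble(candidate);
            if (score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```

For the relaxed action (0.2, 0.4, 0.7, 0.9) and three candidates, this rule yields (0,0,1,1), (0,1,1,1) and (0,0,0,1); the critic then decides which of them is executed and stored in the replay memory.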

Algorithm 1: The online LyDROO algorithm for solving (P1) (steps 1-16 are the same listing as quoted in the first question).
input: Parameters $V$, $\{\gamma_i, c_i\}_{i=1}^N$, $K$, training interval $\delta_T$, $M_t$ update interval $\delta_M$;
output: Control actions $\{x^t, y^t\}_{t=1}^K$.

What are the inputs and outputs of this algorithm?

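Input: the parameters $V$, $\{\gamma_i, c_i\}_{i=1}^N$ and $K$, the training interval $\delta_T$, and the $M_t$ update interval $\delta_M$. Output: the control actions $\{x^t, y^t\}_{t=1}^K$. In the LyDROO formulation, $V$ is the Lyapunov drift-plus-penalty weight that trades off the objective against queue stability, $\gamma_i$ is the average energy (power) threshold of user $i$, $c_i$ is the weight of user $i$'s computation rate, and $K$ is the number of time frames. $\delta_T$ sets how often the DNN is trained (line 10) and $\delta_M$ sets how often the number of candidate actions $M_t$ is updated (line 4). The output is the sequence of offloading decisions $x^t$ and resource allocations $y^t$ executed over the $K$ frames.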
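To make the role of the two interval inputs concrete, here is a small sketch of when the periodic steps fire over the $K$ frames. The class `LydrooSchedule` and its method names are hypothetical; only the two `mod` checks are taken from lines 4 and 10 of Algorithm 1.

```java
// Hypothetical holder for the scheduling inputs K, delta_T and delta_M of Algorithm 1.
final class LydrooSchedule {
    final int numFrames;       // K: number of time frames
    final int trainInterval;   // delta_T: DNN training interval
    final int mUpdateInterval; // delta_M: M_t update interval

    LydrooSchedule(int numFrames, int trainInterval, int mUpdateInterval) {
        this.numFrames = numFrames;
        this.trainInterval = trainInterval;
        this.mUpdateInterval = mUpdateInterval;
    }

    // Line 10: the DNN is trained whenever mod(t, delta_T) = 0.
    boolean trainsAt(int t) {
        return t % trainInterval == 0;
    }

    // Line 4: M_t is updated whenever mod(t, delta_M) = 0.
    boolean updatesMAt(int t) {
        return t % mUpdateInterval == 0;
    }

    // Number of training steps executed over frames t = 1..K.
    int countTrainingSteps() {
        int count = 0;
        for (int t = 1; t <= numFrames; t++) {
            if (trainsAt(t)) count++;
        }
        return count;
    }
}
```

With $K = 10$, $\delta_T = 4$ and $\delta_M = 5$, for instance, training happens in frames 4 and 8, and $M_t$ is updated in frames 5 and 10.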

algorithm identifier in public key not recognised: 1.2.156.10197.1.301

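This error usually appears when the Bouncy Castle library parses an SM2 public key and has no handler for the algorithm identifier it finds in the key; 1.2.156.10197.1.301 is the OID assigned to SM2. A reasonable first step is to register the Bouncy Castle provider and parse the key through it explicitly, using a recent Bouncy Castle release. Whether this is enough depends on the library version and on how the key was encoded:

```
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.Security;
import java.security.spec.X509EncodedKeySpec;
import org.bouncycastle.jce.provider.BouncyCastleProvider;

// derBytes: the Base64-decoded body of the PEM file (assumed to be loaded already)
Security.addProvider(new BouncyCastleProvider());
KeyFactory keyFactory = KeyFactory.getInstance("EC", "BC");
PublicKey publicKey = keyFactory.generatePublic(new X509EncodedKeySpec(derBytes));
```

Register the provider before you create the SM2 PublicKey object, so that the key is parsed by Bouncy Castle rather than by another provider. You should also check that your public key PEM file is well-formed and really contains an SM2 key. A PEM public key file has the following layout:

```
-----BEGIN PUBLIC KEY-----
MIICITCCAYoCCQD2yTo1T8ZQvzANBgkqhkiG9w0BAQsFADBFMQswCQYDVQQGEwJB
....
-----END PUBLIC KEY-----
```

Here `-----BEGIN PUBLIC KEY-----` and `-----END PUBLIC KEY-----` are the start and end markers of the PEM file, and the lines between them are the Base64 encoding of the key data in ASN.1 DER format (the body shown above only illustrates the layout and is not a real SM2 key). If the algorithm identifier inside that DER data is missing or is not encoded the way the library expects, the same "algorithm identifier in public key not recognised" error can occur.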
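One way to check whether a PEM file actually carries the 1.2.156.10197.1.301 identifier is to decode it and look for the DER encoding of that OID. The sketch below uses only the Java standard library. The class `PemOidCheck` and its methods are hypothetical names, and the byte scan is a quick heuristic, not an ASN.1 parser: it reports that the OID occurs somewhere in the data, not where in the structure it sits.

```java
import java.util.Base64;

// Hypothetical diagnostic helper: decode a PEM block and look for the SM2 OID bytes.
class PemOidCheck {

    // DER encoding of OID 1.2.156.10197.1.301: tag 0x06, length 0x08, then the content bytes.
    static final byte[] SM2_OID = {
        0x06, 0x08, 0x2A, (byte) 0x81, 0x1C, (byte) 0xCF, 0x55, 0x01, (byte) 0x82, 0x2D
    };

    // Strip the BEGIN/END marker lines and all whitespace, then Base64-decode the body.
    static byte[] pemToDer(String pem) {
        String body = pem.replaceAll("-----[A-Z ]+-----", "").replaceAll("\\s", "");
        return Base64.getDecoder().decode(body);
    }

    // Return true if the encoded SM2 OID occurs anywhere in the DER bytes.
    static boolean containsSm2Oid(byte[] der) {
        outer:
        for (int i = 0; i + SM2_OID.length <= der.length; i++) {
            for (int j = 0; j < SM2_OID.length; j++) {
                if (der[i + j] != SM2_OID[j]) continue outer;
            }
            return true;
        }
        return false;
    }
}
```

If `containsSm2Oid(pemToDer(pemText))` returns false, the file does not contain an SM2 key in the usual encoding, and the problem lies with the key material, not with the provider setup.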
