Recall that to solve (P2) in the t-th time frame, we observe ξ_t ≜ {h_i^t, Q_i(t), Y_i(t)}_{i=1}^N, consisting of the channel gains {h_i^t}_{i=1}^N and the system queue states {Q_i(t), Y_i(t)}_{i=1}^N, and accordingly decide the control action {x_t, y_t}, including the binary offloading decision x_t and the continuous resource allocation y_t ≜ {τ_i^t, f_i^t, e_{i,O}^t, r_{i,O}^t}_{i=1}^N. A close observation shows that although (P2) is a non-convex optimization problem, the resource allocation problem to optimize y_t is in fact an "easy" convex problem if x_t is fixed. In Section IV.B, we will propose a customized algorithm to efficiently obtain the optimal y_t given x_t in (P2). Here, we denote G(x_t, ξ_t) as the optimal value of (P2) obtained by optimizing y_t given the offloading decision x_t and parameter ξ_t. Therefore, solving (P2) is equivalent to finding the optimal offloading decision (x_t)*, where

(P3): (x_t)* = arg max_{x_t ∈ {0,1}^N} G(x_t, ξ_t). (20)

In general, obtaining (x_t)* requires enumerating 2^N offloading decisions, which leads to significantly high computational complexity even when N is moderate (e.g., N = 10). Other search-based methods, such as branch-and-bound and block coordinate descent [29], are also time-consuming when N is large. In practice, neither method is applicable to online decision-making under fast-varying channel conditions. Leveraging the DRL technique, we propose a LyDROO algorithm to construct a policy π that maps from the input ξ_t to the optimal action (x_t)*, i.e., π: ξ_t → (x_t)*, with very low complexity, e.g., tens of milliseconds of computation time (i.e., the time duration from observing ξ_t to producing a control action {x_t, y_t}) when N = 10.

Why is deep reinforcement learning needed here?
Posted: 2024-04-27 15:25:10
In this problem, deep reinforcement learning is used to solve an online decision-making problem: under fast-varying channel conditions, at every time frame the system must decide which tasks to execute locally and which to offload to the edge server. Because the number of devices can be large, traditional search- and optimization-based methods are computationally expensive. Deep reinforcement learning can instead learn a policy that returns the (near-)optimal action for a given state, and the training process lets it adapt to different task and channel conditions. By using deep reinforcement learning, we can therefore make near-optimal decisions quickly in a fast-changing environment, which is essential for real-time applications such as edge computing.
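For intuition only, here is a minimal, untrained sketch (not the paper's actual LyDROO implementation) of such a policy: a small neural network maps the observation ξ_t to relaxed offloading values in [0, 1], which are then quantized to a binary decision x_t. All sizes, distributions, and weights below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10  # number of devices (example value)

# Observation xi_t: channel gains and two queue states per device.
h = rng.rayleigh(size=N)           # hypothetical channel gains
Q = rng.integers(0, 5, size=N)     # data-queue backlogs
Y = rng.integers(0, 5, size=N)     # virtual energy-queue backlogs
xi = np.concatenate([h, Q, Y]).astype(float)  # input of size 3N

# A tiny one-hidden-layer policy network with random (untrained) weights.
W1 = rng.normal(scale=0.1, size=(32, 3 * N))
W2 = rng.normal(scale=0.1, size=(N, 32))

def policy(xi):
    """Map the observation to relaxed offloading values in (0, 1)."""
    hidden = np.tanh(W1 @ xi)
    return 1.0 / (1.0 + np.exp(-(W2 @ hidden)))  # sigmoid output

probs = policy(xi)
x_t = (probs >= 0.5).astype(int)  # quantize to a binary offloading decision
print(x_t)
```

In practice the network would be trained (e.g., from past optimal decisions), and the quantization step would typically generate several candidate binary actions rather than a single threshold.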
Related questions
(Quoting the same passage from the paper as above.) What determines the content of y_t?
The content of y_t is determined by solving the resource allocation problem in (P2). Specifically, given the binary offloading decision x_t and the parameter ξ_t, we optimize y_t = {τ_i^t, f_i^t, e_{i,O}^t, r_{i,O}^t}_{i=1}^N to maximize the objective of (P2). Although (P2) is a non-convex optimization problem, the resource allocation subproblem over y_t becomes convex once x_t is fixed. Section IV.B therefore proposes a customized algorithm that efficiently obtains the optimal y_t for a given x_t in (P2). In short, y_t is obtained by solving this convex resource allocation subproblem, and the optimal offloading decision (x_t)* is then the one whose resulting objective value G(x_t, ξ_t) is largest.
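To make the cost of brute-force search concrete, the sketch below enumerates all 2^N candidate offloading decisions for a small N and keeps the one maximizing a toy stand-in for G(x_t, ξ_t). In the paper, each evaluation of G would itself require solving the convex resource allocation subproblem for y_t; the closed-form function used here is purely illustrative.

```python
import itertools
import numpy as np

N = 4  # kept small on purpose; the loop below runs 2**N times

rng = np.random.default_rng(1)
h = rng.rayleigh(size=N)  # hypothetical channel gains

def G(x, h):
    """Hypothetical stand-in for the optimal value of (P2) given x.

    In the paper, evaluating G(x_t, xi_t) means solving a convex
    resource-allocation problem for y_t with x_t fixed; here we use a
    toy separable reward (offloading gain minus a fixed cost) purely
    to show the 2**N enumeration structure of (P3).
    """
    x = np.asarray(x, dtype=float)
    return float(np.sum(x * np.log1p(h) - 0.1 * x))

best_x, best_val = None, -np.inf
for x in itertools.product([0, 1], repeat=N):  # all 2**N candidates
    val = G(x, h)
    if val > best_val:
        best_x, best_val = x, val

print(best_x, round(best_val, 4))
```

Even this toy version makes the scaling obvious: doubling N squares the number of candidates, which is why the paper replaces enumeration with a learned policy.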
Design a DHNN to store a pattern, then use the designed DHNN to recall the corrupted pattern.
Sure, here's a brief overview of how you can design a DHNN (discrete Hopfield neural network) to store and recall patterns:
1. Set the number of neurons n equal to the dimension of the patterns you want to store. A DHNN is a single fully connected recurrent layer, so there is no separate hidden layer.
2. Encode each pattern as a bipolar vector with entries +1 and -1.
3. Compute the weight matrix with the Hebbian outer-product rule, W = (1/n) Σ_p x_p x_p^T, which strengthens the connections between neurons that take the same value across the stored patterns.
4. Zero the diagonal of W so that no neuron feeds back directly onto itself; note that W is symmetric.
5. Repeat steps 2-4 for each pattern you want to store, keeping the number of patterns well below the network's capacity (roughly 0.14n for random patterns), or recall becomes unreliable.
6. To recall a pattern, initialize the network state with the corrupted version of the pattern.
7. Update the neurons, typically asynchronously one at a time, with the sign rule s_i ← sign(Σ_j w_ij s_j).
8. Repeat the updates until the state stops changing. Convergence to a fixed point is guaranteed because each asynchronous update can only decrease the network energy E = -(1/2) s^T W s.
9. The fixed point reached should be the stored pattern closest to the corrupted input (provided the corruption is not too severe).
10. Repeat steps 6-9 for each corrupted pattern you want to recall.
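The steps above can be sketched as a small self-contained example: a standard discrete Hopfield network with Hebbian storage and asynchronous sign updates (the pattern and corruption below are arbitrary choices for illustration).

```python
import numpy as np

def train_dhnn(patterns):
    """Store bipolar (+1/-1) patterns with the Hebbian outer-product rule."""
    P = np.asarray(patterns, dtype=float)
    n = P.shape[1]
    W = P.T @ P / n          # sum of outer products, scaled by 1/n
    np.fill_diagonal(W, 0)   # no self-connections
    return W

def recall(W, state, max_sweeps=100):
    """Asynchronously update neurons until the state stops changing."""
    s = np.asarray(state, dtype=float).copy()
    for _ in range(max_sweeps):
        prev = s.copy()
        for i in range(len(s)):                 # one asynchronous sweep
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0
        if np.array_equal(s, prev):             # reached a fixed point
            break
    return s

# Store one pattern, then recall it from a corrupted copy.
pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])
W = train_dhnn([pattern])

corrupted = pattern.copy()
corrupted[:2] *= -1                             # flip two bits
print(recall(W, corrupted))                     # recovers the stored pattern
```

With a single stored pattern and two flipped bits out of eight, the update rule drives every neuron back to its stored sign, so the network settles exactly on the original pattern.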
I hope this helps! Let me know if you have any other questions.