periods=100
Date: 2023-08-25 10:04:48
Here, `periods=100` means we want to generate a sequence of length 100. Concretely, `pd.date_range("1/1/2022", periods=100)` generates a sequence of 100 dates starting at "2022-01-01", spaced one day apart, ending at "2022-04-10". Likewise, `np.random.randn(100)` generates a sequence of 100 random numbers.
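A quick way to verify the endpoints described above (assuming pandas is available):

```python
import pandas as pd

idx = pd.date_range("1/1/2022", periods=100)
print(idx[0])    # 2022-01-01 00:00:00
print(idx[-1])   # 2022-04-10 00:00:00
print(len(idx))  # 100
```

The default frequency for `pd.date_range` is daily (`freq='D'`), which is why the 100th date lands on April 10.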
Related question
Optimize this:

```python
import pandas as pd
import numpy as np
w = pd.date_range(start='2018/6/1', periods=100, freq='D')
y = pd.Series(np.arange(100), index=w)
print(y.head(8))
ps = y.resample('M').mean()
print(ps)
y.index = y.index.strftime('%Y%m')
y.groupby(y.index).mean()
y.groupby(y.index.strftime('%Y%m')).mean()
```
You can merge the last two lines into one and use the `astype` method to convert the date strings to integers, which avoids mutating the index and calling `strftime` a second time when grouping. The revised code:
```python
import pandas as pd
import numpy as np
w = pd.date_range(start='2018/6/1', periods=100, freq='D')
y = pd.Series(np.arange(100), index=w)
print(y.head(8))
ps = y.resample('M').mean()
print(ps)
print(y.groupby(y.index.strftime('%Y%m').astype(int)).mean())
```
This makes the code more concise and efficient.
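As an alternative worth noting (my suggestion, not part of the original answer): grouping by a `PeriodIndex` via `to_period('M')` avoids string formatting entirely and keeps genuine calendar semantics on the group keys:

```python
import numpy as np
import pandas as pd

w = pd.date_range(start='2018/6/1', periods=100, freq='D')
y = pd.Series(np.arange(100), index=w)

# Group by calendar month directly; the keys are Period objects
# (2018-06, 2018-07, ...) rather than integers like 201806.
monthly = y.groupby(y.index.to_period('M')).mean()
print(monthly)
```

This produces the same four monthly means as the `strftime`-based version, and the `Period` keys sort chronologically by construction.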
Implement the SGD algorithm and plot the expected regret as a function of T using 1000 sample-path simulations with Python. Total number of periods T = 2000, initial decision p = 0, step size η_t = D/(G√t), where D = 20, G = 100
Sure, here is an implementation of the SGD algorithm in Python:
```python
import numpy as np
import matplotlib.pyplot as plt

# Parameters
T = 2000        # total number of periods
D = 20          # diameter of the decision set [0, D]
G = 100         # upper bound on the gradient
n_paths = 1000  # number of sample paths to average over

# Reward model: f(p) = 10 - 0.5 * p + noise, maximized at p = 0,
# so the per-period benchmark reward is f(0) = 10.
best_reward = 10.0

regret = np.zeros(T)
for _ in range(n_paths):
    p = 0.0           # initial decision
    cum_reward = 0.0
    for t in range(1, T + 1):
        reward = 10 - 0.5 * p + np.random.normal(0, 0.04)
        grad = -0.5                          # gradient of the reward w.r.t. p
        eta_t = D / (G * np.sqrt(t))         # step size eta_t = D / (G * sqrt(t))
        p = np.clip(p + eta_t * grad, 0, D)  # projected gradient-ascent update
        cum_reward += reward
        regret[t - 1] += best_reward * t - cum_reward
regret /= n_paths  # average over sample paths -> expected regret

# Plot results
plt.plot(regret)
plt.xlabel("Time")
plt.ylabel("Expected Regret")
plt.show()
```
This code runs 1000 independent sample paths of the SGD algorithm for 2000 periods each and plots the expected (path-averaged) regret as a function of time.
The algorithm updates the decision variable `p` at each time step along the gradient of the reward function, projecting back onto the feasible set. The step size follows `eta_t = D / (G * sqrt(t))`, where `D` is the diameter of the feasible decision set and `G` is an upper bound on the gradient.
Note that the gradient of the reward function is simply -0.5 in this case, since the reward function is linear in `p`.
You can run this code to see the results for yourself. Let me know if you have any questions!
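As a sanity check on the scale of the regret curve, the classical online-gradient analysis with this step-size schedule bounds the regret by a constant multiple of D·G·√T (assuming a concave reward with gradients bounded by G, which is not stated in the original question and is my assumption here):

```python
import math

D, G, T = 20, 100, 2000
bound = D * G * math.sqrt(T)
# An order-of-magnitude ceiling for the regret at time T = 2000
print(f"D*G*sqrt(T) = {bound:.1f}")
```

Any simulated regret far above this scale would suggest a bug in the update or in the regret bookkeeping.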