```python
with torch.no_grad():
    buf_reward, buf_mask, buf_action, buf_log_probe, buf_state = buffer.sample_all()
    buf_len = buf_state.size(0)
    bs = 2 ** 10  # set a smaller 'bs: batch size' when out of GPU memory.
    buf_value = torch.cat([self.cri(buf_state[i:i + bs])
                           for i in range(0, buf_state.size(0), bs)], dim=0)
    buf_logprob = torch.cat([buf_log_probe[i:i + bs]
                             for i in range(0, buf_state.size(0), bs)], dim=0)
    buf_r_sum, buf_advantage = self.compute_reward(buf_len, buf_reward, buf_mask, buf_value)
    del buf_reward, buf_mask
```
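The chunked `torch.cat` calls above can be sketched in isolation. This is a minimal, self-contained illustration of the same pattern: evaluate a network over a large tensor in slices of size `bs` and concatenate the results, rather than in one huge forward pass. The network `net` and the tensor sizes here are illustrative assumptions, not the article's actual critic.

```python
import torch

# Illustrative stand-ins (assumptions): a tiny "critic" and a batch of states.
net = torch.nn.Linear(4, 1)
states = torch.randn(5000, 4)   # stand-in for buf_state
bs = 2 ** 10                    # smaller slice size to bound peak GPU memory

with torch.no_grad():
    # Evaluate the network slice by slice, then stitch the outputs back together.
    values = torch.cat(
        [net(states[i:i + bs]) for i in range(0, states.size(0), bs)],
        dim=0,
    )

print(values.shape)  # one value per state, same order as the input
```

Because each slice's output is concatenated along dim 0, the result is identical to a single full-batch forward pass, only with a smaller peak memory footprint.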
Date: 2023-04-07 21:02:51
0695 - 极智开发 - Explaining the role of `with torch.no_grad()` in PyTorch
This snippet uses PyTorch's `torch.no_grad()` context manager, which disables gradient tracking for everything computed inside it; that is appropriate here because the buffer is only being evaluated, not backpropagated through. Inside the context, all stored rewards, masks, actions, log-probabilities, and states are fetched from the buffer at once. The batch size `bs` is set to 1024 (`2 ** 10`) so that the critic `self.cri` can evaluate the states in slices and concatenate the results with `torch.cat`, keeping peak GPU memory bounded; the log-probabilities are gathered the same way. Finally, `compute_reward` converts the rewards, masks, and values into discounted reward sums and advantages, and the no-longer-needed reward and mask tensors are freed with `del`.
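The core effect of `torch.no_grad()` described above can be demonstrated directly: tensors produced inside the context do not require gradients, so no autograd graph is built for them. A minimal sketch:

```python
import torch

x = torch.ones(3, requires_grad=True)

y = x * 2            # normal mode: y is tracked by autograd
with torch.no_grad():
    z = x * 2        # inside no_grad: z is detached from the graph

print(y.requires_grad)  # True
print(z.requires_grad)  # False
```

Skipping graph construction is what makes the pattern in the article's snippet cheaper: during pure evaluation, activations needed only for backpropagation are never stored.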