Write a grid environment with moving obstacles that runs under gym
Posted: 2024-04-09 17:28:17
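Below is example code for a basic grid environment that runs under gym. As written, it models only the agent, a start cell, and a target cell; it does not yet include obstacles: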
```python
import gym
from gym import spaces
import numpy as np


class GridWorldEnv(gym.Env):
    """Minimal grid world: the agent walks from (0, 0) to the opposite corner.

    Written against the classic Gym API: reset() returns the observation and
    step() returns (observation, reward, done, info). Newer Gym/Gymnasium
    releases use a different reset/step signature.
    """

    def __init__(self, grid_size=5):
        super().__init__()
        self.grid_size = grid_size
        self.start_position = (0, 0)
        self.target_position = (grid_size - 1, grid_size - 1)
        self.current_position = self.start_position
        self.action_space = spaces.Discrete(4)  # four actions: up, down, left, right
        # The observation is the agent's (x, y) cell.
        self.observation_space = spaces.Tuple((
            spaces.Discrete(grid_size),
            spaces.Discrete(grid_size)
        ))

    def reset(self):
        self.current_position = self.start_position
        return self.current_position

    def step(self, action):
        x, y = self.current_position
        if action == 0:  # up
            y += 1
        elif action == 1:  # down
            y -= 1
        elif action == 2:  # left
            x -= 1
        elif action == 3:  # right
            x += 1
        # Only accept the move if the new cell is inside the grid;
        # otherwise the agent stays where it is.
        if 0 <= x < self.grid_size and 0 <= y < self.grid_size:
            self.current_position = (x, y)
        done = self.current_position == self.target_position
        reward = 1.0 if done else 0.0
        return self.current_position, reward, done, {}

    def render(self):
        # The array is indexed as grid[x, y], so each printed row is one x value.
        grid = np.zeros((self.grid_size, self.grid_size))
        grid[self.start_position] = 0.5
        grid[self.target_position] = 0.5
        grid[self.current_position] = 1.0
        for row in grid:
            print(' '.join(['{:1.1f}'.format(cell) for cell in row]))


# Usage example
env = GridWorldEnv(grid_size=5)
observation = env.reset()
for _ in range(10):
    env.render()
    action = env.action_space.sample()
    observation, reward, done, _ = env.step(action)
    if done:
        break
```
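Running the code above prints a 5x5 grid at each step. The start and target cells are marked 0.5 and the agent's current cell is marked 1.0 (so the start cell shows 1.0 while the agent is standing on it). On each iteration the loop samples a random action and steps the environment, stopping after 10 steps or as soon as the agent reaches the target, whichever comes first.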
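The environment above has no obstacles, so the following is a minimal sketch of how moving obstacles could be layered onto the same reset/step logic. Everything specific here is an assumption made for illustration and is not part of the original answer: the class name `MovingObstacleGrid`, the `num_obstacles` and `seed` parameters, the one-random-step-per-turn obstacle motion, and the -1.0 collision penalty that ends the episode. To keep the sketch runnable on its own it uses only the standard library and does not import gym; to use it with gym, one would presumably subclass `gym.Env` and declare the action and observation spaces as in the code above.

```python
import random


class MovingObstacleGrid:
    """Grid world whose obstacles take one random step per turn (illustrative sketch)."""

    # Action index -> (dx, dy), using the same encoding as GridWorldEnv.
    MOVES = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}

    def __init__(self, grid_size=5, num_obstacles=2, seed=None):
        self.grid_size = grid_size
        self.num_obstacles = num_obstacles  # hypothetical parameter
        self.start_position = (0, 0)
        self.target_position = (grid_size - 1, grid_size - 1)
        self.rng = random.Random(seed)
        self.reset()

    def _in_bounds(self, x, y):
        return 0 <= x < self.grid_size and 0 <= y < self.grid_size

    def _observation(self):
        # Agent cell plus the current obstacle cells.
        return self.current_position, tuple(self.obstacles)

    def reset(self):
        self.current_position = self.start_position
        # Place obstacles on distinct cells, never on the start or target cell.
        free = [(x, y)
                for x in range(self.grid_size)
                for y in range(self.grid_size)
                if (x, y) not in (self.start_position, self.target_position)]
        self.obstacles = self.rng.sample(free, self.num_obstacles)
        return self._observation()

    def _move_obstacles(self):
        for i, (ox, oy) in enumerate(self.obstacles):
            dx, dy = self.rng.choice(list(self.MOVES.values()))
            new = (ox + dx, oy + dy)
            # An obstacle stays put if the move would leave the grid,
            # cover the target, or overlap another obstacle.
            blocked = (not self._in_bounds(*new)
                       or new == self.target_position
                       or new in self.obstacles)
            if not blocked:
                self.obstacles[i] = new

    def step(self, action):
        dx, dy = self.MOVES[action]
        x, y = self.current_position
        if self._in_bounds(x + dx, y + dy):
            self.current_position = (x + dx, y + dy)
        # Check for a collision both before and after the obstacles move.
        hit = self.current_position in self.obstacles
        if not hit:
            self._move_obstacles()
            hit = self.current_position in self.obstacles
        if hit:
            reward, done = -1.0, True  # assumed collision penalty
        elif self.current_position == self.target_position:
            reward, done = 1.0, True
        else:
            reward, done = 0.0, False
        return self._observation(), reward, done, {}


if __name__ == "__main__":
    env = MovingObstacleGrid(grid_size=5, num_obstacles=2, seed=0)
    observation = env.reset()
    for _ in range(10):
        observation, reward, done, _ = env.step(env.rng.randrange(4))
        print(observation, reward, done)
        if done:
            break
```

In this sketch obstacles are never allowed onto the target cell, so the goal stays reachable, and the observation includes the obstacle positions so that an agent can react to them. Other choices, such as obstacles following fixed patrol routes or a collision that only blocks the move instead of ending the episode, would fit the same structure.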