from sb3_contrib import MaskablePPO
Date: 2023-09-23 13:03:08 Views: 246
Python library | sb3_contrib-0.11.0a4-py3-none-any.whl
`MaskablePPO` is a reinforcement learning algorithm implemented in `sb3_contrib`, the contrib package for Stable Baselines3 (SB3), which collects experimental algorithms built on top of SB3.
`MaskablePPO` extends the Proximal Policy Optimization (PPO) algorithm with action masking (also called invalid action masking): at each step the environment supplies a mask of the currently valid actions, and the policy only chooses among those, during both training and inference. This is useful in environments where the set of legal actions depends on the state, such as board games in which some moves are illegal in a given position.
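To make the idea of action masking concrete, here is a minimal sketch in plain Python. It is not the `sb3_contrib` implementation; the function name `masked_softmax` and the example numbers are made up for illustration. It shows one common way to mask: replace the logits of invalid actions with negative infinity before the softmax, so those actions end up with probability zero.

```python
import math


def masked_softmax(logits, mask):
    """Turn logits into probabilities, forcing masked-out actions to 0.

    `mask[i]` is True when action i is valid. Illustrative helper only,
    not part of sb3_contrib.
    """
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp() maps them to 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Action 1 has the highest logit but is marked invalid.
probs = masked_softmax([1.0, 2.0, 0.5], [True, False, True])
print(probs)
```

The remaining probability mass is renormalized over the valid actions, so sampling from `probs` can never return the masked action.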
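To use `MaskablePPO`, install the `sb3_contrib` package (published on PyPI as `sb3-contrib`) and import the class with `from sb3_contrib import MaskablePPO`. The environment must also expose its action mask, for example through an `action_masks()` method or the `ActionMasker` wrapper. You can then train and evaluate models in the same way as with other SB3 algorithms.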