Description
Motivation
I recently started learning TorchRL and am creating a custom environment (using `torchrl.envs.EnvBase`) based on the documentation (https://pytorch.org/rl/reference/envs.html). For my environment, I would like to apply an action mask so that the environment does not allow actions that are infeasible given the current observation. For example, suppose the action is to choose a playing card: the number of `A`s is limited, so `A` can no longer be chosen once all the `A`s have been drawn. So far, I could not find a way to implement action masking for an environment, but it would be a convenient feature for implementing similar environments.
Solution
It would be convenient if I could include a mask as part of the `observation_spec`, so that the environment can tell feasible from infeasible actions based on the observation (even when a random action is chosen). Currently, my environment cannot pass `torchrl.envs.utils.check_env_specs()` because infeasible actions are chosen.
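To make the request concrete, here is a minimal, library-independent sketch of the behaviour I have in mind, written in plain Python rather than against the TorchRL API. The names `RANKS`, `action_mask`, `sample_masked_action`, and the `"action_mask"` observation key are all hypothetical and only serve the illustration; in an actual environment the lists would presumably be tensors described by the specs.

```python
import random

# Hypothetical action space: one action per card rank.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def action_mask(remaining):
    """Return True for each rank that still has at least one card left."""
    return [count > 0 for count in remaining]


def sample_masked_action(mask, rng=random):
    """Sample a random action index among the feasible ones only."""
    feasible = [index for index, allowed in enumerate(mask) if allowed]
    if not feasible:
        raise ValueError("no feasible action available")
    return rng.choice(feasible)


# Four cards of every rank at the start; then all aces get drawn.
remaining = [4] * len(RANKS)
remaining[RANKS.index("A")] = 0

# The mask travels with the observation, so any sampler can respect it.
observation = {"remaining": remaining, "action_mask": action_mask(remaining)}
action = sample_masked_action(observation["action_mask"])
```

With a mask like this available alongside the observation, a random rollout (such as the one a spec check performs) could restrict its sampling to feasible actions instead of drawing from the full action space.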
If implementing this feature is not reasonable, any alternative way to implement an action mask would be appreciated.
Checklist
I searched with the keyword 'Action Masking', but could not find relevant issues. Sorry if I missed something.
- [x] I have checked that there is no similar issue in the repo (required)