[Feature Request] Action Masking #1404

@Kang-SungKu

Description

Motivation

I recently started learning TorchRL and am creating a custom environment (subclassing torchrl.envs.EnvBase) based on the documentation (https://pytorch.org/rl/reference/envs.html). For my environment, I would like to apply an action mask so that the environment does not allow actions that are infeasible given the current observation. For example, suppose the action is to choose a trump card: the number of A's is limited, so an A cannot be chosen once all the A's have been drawn. So far, I have not found a way to implement action masking for an environment, but it would be a convenient feature for implementing similar environments.

Solution

It would be convenient if I could include a mask as part of the observation_spec, so that the environment can distinguish feasible from infeasible actions based on the observation (even when a random action is chosen). Currently, my environment does not pass torchrl.envs.utils.check_env_specs() because infeasible actions are chosen.
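To illustrate the behavior I have in mind, here is a minimal sketch in plain PyTorch. It is not an existing TorchRL API: the function names `sample_masked_action` and `mask_logits` and the four-action example are hypothetical, and it assumes a discrete action space with a boolean mask where `True` means the action is feasible.

```python
import torch


def sample_masked_action(mask: torch.Tensor, generator=None) -> torch.Tensor:
    """Sample a uniformly random action index among the feasible entries.

    `mask` is a boolean tensor of shape (n_actions,) or (batch, n_actions).
    """
    if not mask.any(dim=-1).all():
        raise ValueError("each mask must allow at least one action")
    # multinomial treats its input as unnormalized weights, so an entry
    # with weight 0 (infeasible action) is never sampled.
    weights = mask.to(torch.float32)
    return torch.multinomial(weights, num_samples=1, generator=generator).squeeze(-1)


def mask_logits(logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Set the logits of infeasible actions to -inf so softmax gives them probability 0."""
    return logits.masked_fill(~mask, float("-inf"))


# Hypothetical example: four card ranks, where rank 0 has been exhausted.
mask = torch.tensor([False, True, True, True])
random_action = sample_masked_action(mask)  # always one of 1, 2, 3
policy_probs = torch.softmax(mask_logits(torch.zeros(4), mask), dim=-1)
```

The first function covers the random-action case (what a spec's random sampling would need to respect), and the second covers a learned policy that outputs logits. In both cases the mask would come from the observation.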

If it is not reasonable to implement this feature, I would appreciate any suggestions for an alternative way to implement an action mask.

Checklist

I searched with the keyword 'Action Masking', but could not find relevant issues. Sorry if I missed something.

  • [x] I have checked that there is no similar issue in the repo (required)

Labels

enhancement (New feature or request)
