Motivation
Vectorized environments run simulations in batches, which lets them benefit from parallel computation on GPUs. Such environments have their own batch size, and its dimensions can carry different meanings.
For example:
- Brax observations have shape (n_vectorized_envs, obs_size)
- Vmas observations have shape (n_vectorized_envs, n_agents, obs_size)
Currently, the torchrl environment infrastructure has some issues with environments that have non-empty batch sizes or a batch dimension for agents.
Ideally, we would like to use vectorized environments freely in torchrl and leverage features such as ParallelEnv and Collectors on top of them. This would create tensordicts with many dimensions in the batch_size, for example:
tensordict.batch_size = (
n_parallel_envs, # from ParallelEnv
n_agents, # from env.batch_size
n_vectorized_envs, # from env.batch_size
*other_env_dimensions, # from env.batch_size
n_rollout_samples # from env.rollout()
)
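As a rough illustration of how these dimensions compose, the sketch below builds the expected batch size from its parts. The helper name expected_batch_size and the concrete numbers are hypothetical and not part of torchrl; the ordering simply mirrors the tuple above.

```python
from typing import Sequence, Tuple


def expected_batch_size(
    n_parallel_envs: int,
    env_batch_size: Sequence[int],
    n_rollout_samples: int,
) -> Tuple[int, ...]:
    """Compose the batch size in the order listed above.

    Hypothetical helper for illustration only; it is not a torchrl API.
    """
    return (n_parallel_envs, *env_batch_size, n_rollout_samples)


# Assumed example values: 4 parallel workers, an environment whose own
# batch size is (n_agents=3, n_vectorized_envs=32), and 100 rollout steps.
batch_size = expected_batch_size(4, (3, 32), 100)
print(batch_size)  # (4, 3, 32, 100)
```

An environment with an empty batch size reduces to the familiar (n_parallel_envs, n_rollout_samples) case.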
I created this issue to list and organize all the issues that need to be addressed in order to generalize to BaseEnvs with general batch sizes in torchrl:
Issues
Stacking tensordicts of heterogeneous shapes and NestedTensor compatibility (#766)(PR)
When some of the dimensions of the vectorized environment are heterogeneous (agents with different observation and action spaces that still share the other batch dimensions), we need to carry this heterogeneous data in a suitable data structure.
NestedTensors provide a natural candidate for this task. Here is a list of the operations that need to be supported by NestedTensors in order to enable this feature:
- stacking along any dim
- shape (not only size)
- indexing along any dim that is compatible
- stacking nested tensors together (currently we can't combine two nested tensors containing tensors of shape [[a, b], [a, c]] into a single one of shape [[[a, b], [a, c]], [[a, b], [a, c]]])
- NestedTensors of NestedTensors
- Nested tensor arithmetic and algebraic operations
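To make the stacking requirement concrete, here is a minimal pure-Python sketch of the shape bookkeeping involved. It does not use NestedTensors at all; stacked_shape is a hypothetical helper, and marking a heterogeneous dimension with -1 is an assumption made for this illustration only.

```python
from typing import Sequence, Tuple


def stacked_shape(shapes: Sequence[Tuple[int, ...]]) -> Tuple[int, ...]:
    """Shape obtained by stacking items of the given shapes along dim 0.

    Dimensions on which the items disagree are reported as -1.
    Hypothetical helper for illustration; not a torch or torchrl API.
    """
    if not shapes:
        raise ValueError("need at least one shape")
    ndim = len(shapes[0])
    if any(len(s) != ndim for s in shapes):
        raise ValueError("all items must have the same number of dimensions")
    merged = tuple(
        sizes[0] if len(set(sizes)) == 1 else -1
        for sizes in zip(*shapes)
    )
    return (len(shapes), *merged)


# Two agents sharing a leading dimension a=5 but with different
# observation sizes b=3 and c=7 (assumed values).
per_agent = stacked_shape([(5, 3), (5, 7)])
print(per_agent)  # (2, 5, -1)

# Stacking two such heterogeneous groups, the case described above.
print(stacked_shape([per_agent, per_agent]))  # (2, 2, 5, -1)
```

A real container would also need to keep the per-item sizes behind each -1, which is what the "shape (not only size)" item above asks for.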