Make HorovodPlugin.all_gather return a tensor
#9696
Merged
What does this PR do?
Fixes #9695
The Horovod plugin is the only plugin whose all_gather returns a List[torch.Tensor] instead of a regular tensor. This is despite Horovod's own support for allgather here: https://horovod.readthedocs.io/en/stable/_modules/horovod/torch/mpi_ops.html#allgather
This PR updates the implementation of all_gather so that it returns a tensor, in line with the other plugins.
I'm unclear why it was implemented like this before.
This is part of #7534 and came up during review of #9414 and #9677.
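To make the change in return type concrete, here is a minimal, dependency-free sketch. It is not the plugin's code: nested Python lists stand in for 2-D tensors, `fake_allgather` is a hypothetical stand-in for `hvd.allgather` (which concatenates the per-worker tensors along the first dimension), and the way `old_style` builds its list is an assumption made only for illustration.

```python
from typing import List

Row = List[float]
Tensor2D = List[Row]  # stand-in for a 2-D torch.Tensor (assumption for this sketch)


def fake_allgather(per_worker: List[Tensor2D]) -> Tensor2D:
    """Hypothetical stand-in for hvd.allgather: concatenate along dim 0."""
    gathered: Tensor2D = []
    for worker_tensor in per_worker:
        gathered.extend(worker_tensor)
    return gathered


def old_style(per_worker: List[Tensor2D]) -> List[Tensor2D]:
    """Old return type: a list of tensors.

    Assumed here to be one single-row chunk per gathered row.
    """
    gathered = fake_allgather(per_worker)
    return [[row] for row in gathered]


def new_style(per_worker: List[Tensor2D]) -> Tensor2D:
    """New return type: the gathered tensor itself."""
    return fake_allgather(per_worker)


# Two workers, each holding one row of two values.
workers = [[[1.0, 2.0]], [[3.0, 4.0]]]
as_list = old_style(workers)    # list with one "tensor" per row
as_tensor = new_style(workers)  # single "tensor" with both rows
```

Code that indexed or iterated over the list result would need to index into the first dimension of the returned tensor instead.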
Note: TrainingTypePlugin.all_gather is not called anywhere in the trainer. It exists purely to power LightningModule.all_gather. So this would be a breaking change iff someone was training with Horovod and depended on LightningModule.all_gather.
Does your PR introduce any breaking changes? If yes, please list them.
Yes, this changes the return type of HorovodPlugin.all_gather in place: it now returns a tensor instead of a List[torch.Tensor].
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃