Labels: bug, help wanted
Description
🐛 Bug
I have a custom distributed plugin, but it currently does not work with PTL's automatic distributed sampler.
The plugin looks like this:
```python
class MyPlugin(ParallelPlugin):
    @property
    def distributed_sampler_kwargs(self):
        ...
```
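For context, the built-in parallel plugins fill this property with the keyword arguments that `torch.utils.data.DistributedSampler` expects. A minimal sketch of what a custom plugin might return, assuming it tracks `world_size` and `global_rank` the way the built-in plugins do (those attribute names are my assumption, not taken from this issue):

```python
from pytorch_lightning.plugins import ParallelPlugin

class MyPlugin(ParallelPlugin):
    @property
    def distributed_sampler_kwargs(self):
        # Forwarded as DistributedSampler(dataset, **kwargs); world_size and
        # global_rank are assumed to be maintained by the plugin itself.
        return dict(num_replicas=self.world_size, rank=self.global_rank)
```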
But in data_loading.py, when deciding whether to inject a distributed sampler, PTL looks at accelerator_connector.is_distributed:
```python
need_dist_sampler = self.accelerator_connector.is_distributed and not isinstance(
    dataloader.sampler, DistributedSampler
)
```
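When that flag is true, Lightning builds the replacement sampler from the plugin-supplied kwargs, roughly like this (a sketch of the mechanism, not a verbatim copy of data_loading.py):

```python
from torch.utils.data import DistributedSampler

# The kwargs come from the training type plugin's
# distributed_sampler_kwargs property shown above.
kwargs = self.distributed_sampler_kwargs
sampler = DistributedSampler(dataloader.dataset, **kwargs)
```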
And self.accelerator_connector.is_distributed only returns True when one of the built-in plugins is used, never for a custom plugin:
```python
@property
def is_distributed(self) -> bool:
    is_distributed = self.use_ddp or self.use_ddp2 or self.use_horovod
    if self.on_tpu:
        is_distributed |= self.training_type_plugin.is_distributed
    return is_distributed
```
Therefore, with a custom plugin, the distributed sampler is not set.
How can a custom plugin mark itself as distributed, so that this property, and any other properties related to distributed training, are automatically set to the correct values?
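One direction I could imagine (a sketch only, not a committed design) is for the connector to defer to the training type plugin unconditionally, which assumes an is_distributed property is added to the plugin base class:

```python
# In accelerator_connector.py (hypothetical change):
@property
def is_distributed(self) -> bool:
    # Defer to the training type plugin instead of only the built-in
    # use_ddp / use_ddp2 / use_horovod flags, so a custom plugin can
    # declare itself distributed.
    return self.training_type_plugin.is_distributed
```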
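In the meantime, a possible workaround (my own suggestion, not an official recipe) is to disable Lightning's automatic sampler replacement with the replace_sampler_ddp Trainer flag and attach the DistributedSampler by hand:

```python
import pytorch_lightning as pl
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler

# Bypass the is_distributed check entirely: PTL will leave samplers alone.
trainer = pl.Trainer(plugins=[MyPlugin()], replace_sampler_ddp=False)

# In the LightningModule / DataModule (self.train_dataset is hypothetical):
def train_dataloader(self):
    sampler = DistributedSampler(
        self.train_dataset,
        num_replicas=dist.get_world_size(),
        rank=dist.get_rank(),
    )
    return DataLoader(self.train_dataset, batch_size=32, sampler=sampler)
```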