DataModule PrepareData Only Called on Node 0 #7927

@Queuecumber

🐛 Bug

The LightningDataModule prepare_data function should be callable on the local rank 0 process of every node if the user chooses. There are APIs for controlling this, but they don't work. I've reproduced the bug and traced it to one (rather obvious) line:

https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/core/datamodule.py#L384

This line wraps the prepare_data function in rank_zero_only, which restricts execution to global rank 0, not local rank 0. The data connector already contains logic to make sure the function runs only when it is supposed to, so the extra wrapper is redundant and silently blocks the call on every node other than node 0. Apparently the multi-node setup was never tested?

The fix is simple: just delete the rank_zero_only wrapper.
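To make the global-versus-local distinction concrete, here is a minimal sketch that simulates 2 nodes with 2 processes each in plain Python. It is not Lightning's code: `global_zero_only`, `should_prepare`, `simulate`, and the `per_node` flag are hypothetical stand-ins for the wrapper and the data connector check described above, and ranks are passed in explicitly instead of being read from the environment.

```python
from functools import wraps
from typing import Callable, List, Tuple


def global_zero_only(fn: Callable, global_rank: int) -> Callable:
    """Hypothetical stand-in for a rank_zero_only-style wrapper:
    the wrapped function runs only when the GLOBAL rank is 0."""
    @wraps(fn)
    def wrapped(*args, **kwargs):
        if global_rank == 0:
            return fn(*args, **kwargs)
        return None
    return wrapped


def should_prepare(local_rank: int, node_rank: int, per_node: bool) -> bool:
    """Hypothetical stand-in for the data connector check."""
    if per_node:
        # One call per node: local rank 0 on every node.
        return local_rank == 0
    # One call overall: local rank 0 on node 0 only.
    return node_rank == 0 and local_rank == 0


def simulate(num_nodes: int, procs_per_node: int,
             per_node: bool, wrap_global_zero: bool) -> List[Tuple[int, int]]:
    """Return the (node_rank, local_rank) pairs where prepare_data ran."""
    ran: List[Tuple[int, int]] = []
    for node_rank in range(num_nodes):
        for local_rank in range(procs_per_node):
            global_rank = node_rank * procs_per_node + local_rank

            def prepare_data(node_rank=node_rank, local_rank=local_rank):
                ran.append((node_rank, local_rank))

            fn = prepare_data
            if wrap_global_zero:
                # The behavior reported in this issue: an extra global-zero gate.
                fn = global_zero_only(prepare_data, global_rank)
            if should_prepare(local_rank, node_rank, per_node):
                fn()
    return ran


# With the extra wrapper, node 1 never prepares its data.
print(simulate(2, 2, per_node=True, wrap_global_zero=True))   # [(0, 0)]
# Without it, local rank 0 on each node runs prepare_data.
print(simulate(2, 2, per_node=True, wrap_global_zero=False))  # [(0, 0), (1, 0)]
```

In this sketch, local rank 0 on node 1 has global rank 2, so the global-zero gate swallows the call even though the per-node check allowed it. Removing the wrapper leaves the per-node check as the only gate, which is the behavior the user asked for.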

Additional context

This is a quick one-line fix that I'm happy to PR, but someone needs to engage with me on it so that it actually gets merged.

Labels

bug, data handling, help wanted
