Keeping DDP override in sync with upstream torch #4630

@edenlightning

From @ananthsub:
How should Lightning keep its DDP override in sync with the upstream torch DistributedDataParallel? The two implementations have now diverged. I think this leads to performance degradation when Lightning is used with gradient accumulation, since the require_backward_grad_sync attribute isn't checked before the backward pass.
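
To illustrate the suspected cost, here is a minimal pure-Python sketch. It is a toy model, not the torch or Lightning implementation. `ToyDDP`, `respects_sync_flag`, `forward_backward`, and `run` are hypothetical names made up for this example. Only `require_backward_grad_sync` and `no_sync()` mirror names from torch's DistributedDataParallel. The sketch assumes that a wrapper which ignores the flag triggers one gradient all-reduce on every backward pass, and simply counts those simulated calls.

```python
from contextlib import contextmanager


class ToyDDP:
    """Hypothetical stand-in for a DDP wrapper; counts simulated all-reduces."""

    def __init__(self, respects_sync_flag):
        # Assumption: the diverged override behaves as if this were False.
        self.respects_sync_flag = respects_sync_flag
        self.require_backward_grad_sync = True
        self.allreduce_calls = 0

    @contextmanager
    def no_sync(self):
        # Disable gradient synchronization inside the block and
        # restore the previous value on exit.
        old = self.require_backward_grad_sync
        self.require_backward_grad_sync = False
        try:
            yield
        finally:
            self.require_backward_grad_sync = old

    def forward_backward(self):
        # A wrapper that ignores the flag communicates on every backward pass.
        if self.require_backward_grad_sync or not self.respects_sync_flag:
            self.allreduce_calls += 1


def run(model, num_batches, accumulate_every):
    """Simulate gradient accumulation and return the number of all-reduces."""
    for i in range(num_batches):
        is_optimizer_step = (i + 1) % accumulate_every == 0
        if is_optimizer_step:
            model.forward_backward()      # gradients must be synced here
        else:
            with model.no_sync():         # accumulate locally, skip the sync
                model.forward_backward()
    return model.allreduce_calls


in_sync = run(ToyDDP(respects_sync_flag=True), num_batches=8, accumulate_every=4)
diverged = run(ToyDDP(respects_sync_flag=False), num_batches=8, accumulate_every=4)
print(in_sync, diverged)
```

Under these assumptions, the wrapper that honors the flag communicates only on the batches that end an accumulation window, while the one that ignores it communicates on every batch. Whether the real Lightning override behaves this way would need to be confirmed against its source.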
