Do not force sync_dist=True on epoch end #13210

@carmocca

Proposed refactor

https://github.com/PyTorchLightning/pytorch-lightning/blob/fb40cbce2ea4afbcf5842e9ec524f665a3faa95d/pytorch_lightning/trainer/connectors/logger_connector/result.py#L524-L530

Metrics logged inside epoch_end hooks are forcefully synced. This makes the user's sync_dist flag irrelevant and causes communication overhead that might be undesired.
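
For illustration, a minimal sketch of how this surfaces to users (assuming a typical validation_epoch_end hook; the module and key names are illustrative):

```python
import torch
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def validation_epoch_end(self, outputs):
        # Average the per-step losses collected during the epoch.
        val_loss = torch.stack([o["loss"] for o in outputs]).mean()
        # The user explicitly opts out of cross-rank reduction here, but
        # because this is an epoch_end hook the value gets all-reduced
        # anyway: sync_dist is effectively forced to True.
        self.log("val_loss_epoch", val_loss, sync_dist=False)
```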

Motivation

This was added in #7966.

The reasoning was that users would likely forget to set the flag themselves. Currently, the suggested solution for extra control is to use torchmetrics.Metric.
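
For example, a torchmetrics metric keeps its own state and handles cross-process synchronization when it is computed, so the logged value does not depend on the forced sync. A minimal sketch (the metric choice and the `_compute_loss` helper are illustrative):

```python
import torchmetrics
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        # The Metric object owns its state; synchronization across ranks
        # happens when the metric is computed, under the metric's control.
        self.val_mean_loss = torchmetrics.MeanMetric()

    def validation_step(self, batch, batch_idx):
        loss = self._compute_loss(batch)  # illustrative helper
        self.val_mean_loss.update(loss)

    def validation_epoch_end(self, outputs):
        # Logging the Metric object lets Lightning call compute()/reset()
        # at the right time; no sync_dist flag is needed here.
        self.log("val_loss", self.val_mean_loss)
```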

Pitch

Do not force this sync; instead, show a warning for one epoch suggesting that sync_dist=True be set for such metrics.
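
A rough sketch of what that could look like, using a hypothetical helper in the logger connector (the function name, arguments, and warning text are illustrative, not the actual implementation):

```python
import warnings

_warned_keys = set()


def _warn_if_unsynced(key: str, sync_dist: bool, is_distributed: bool) -> None:
    # Keep the user's sync_dist choice instead of overriding it, and warn
    # only once per logged key (i.e. the first epoch it occurs in).
    if is_distributed and not sync_dist and key not in _warned_keys:
        _warned_keys.add(key)
        warnings.warn(
            f"`{key}` was logged in an `*_epoch_end` hook without `sync_dist=True`; "
            "the value reflects only this process. Pass `sync_dist=True` to reduce "
            "it across ranks.",
            UserWarning,
        )
```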

Additional context

Proposed by @rohitgr7


cc @Borda @justusschock @awaelchli @rohitgr7 @carmocca @edward-io @ananthsub @kamil-kaczmarek @Raalsky @Blaizzy
