
Commit fc61804

Merge branch 'master' into bugfix/sanitize-array
2 parents dca3dcf + c8e9fb4 commit fc61804

28 files changed: +704 −335 lines changed


CHANGELOG.md

Lines changed: 22 additions & 4 deletions
@@ -60,6 +60,15 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Added sanitization of tensors when they get logged as hyperparameters in `TensorBoardLogger` ([#9031](https://github.com/PyTorchLightning/pytorch-lightning/pull/9031))
 
 
+- Added `InterBatchParallelDataFetcher` ([#9020](https://github.com/PyTorchLightning/pytorch-lightning/pull/9020))
+
+
+- Added `DataLoaderIterDataFetcher` ([#9020](https://github.com/PyTorchLightning/pytorch-lightning/pull/9020))
+
+
+- Added a friendly error message when DDP attempts to spawn new distributed processes with rank > 0 ([#9005](https://github.com/PyTorchLightning/pytorch-lightning/pull/9005))
+
+
 ### Changed
 
 - Parsing of the `gpus` Trainer argument has changed: `gpus="n"` (str) no longer selects the GPU index n and instead selects the first n devices. ([#8770](https://github.com/PyTorchLightning/pytorch-lightning/pull/8770))
@@ -114,10 +123,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Deprecated `DataModule` properties: `train_transforms`, `val_transforms`, `test_transforms`, `size`, `dims` ([#8851](https://github.com/PyTorchLightning/pytorch-lightning/pull/8851))
 
 
--
-
-
--
+- Deprecated `prepare_data_per_node` flag on Trainer and set it as a property of `DataHooks`, accessible in the `LightningModule` and `LightningDataModule` [#8958](https://github.com/PyTorchLightning/pytorch-lightning/pull/8958)
 
 
 -
@@ -139,6 +145,12 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Removed the deprecated `optimizer_idx` from `training_step` as an accepted argument in manual optimization ([#8576](https://github.com/PyTorchLightning/pytorch-lightning/pull/8576))
 
 
+- Removed support for the deprecated `on_save_checkpoint` signature. The hook now takes a `checkpoint` positional parameter ([#8697](https://github.com/PyTorchLightning/pytorch-lightning/pull/8697))
+
+
+- Removed support for the deprecated `on_load_checkpoint` signature. The hook now takes a `pl_module` positional parameter ([#8697](https://github.com/PyTorchLightning/pytorch-lightning/pull/8697))
+
+
 - Removed the deprecated `save_function` property in `ModelCheckpoint` ([#8680](https://github.com/PyTorchLightning/pytorch-lightning/pull/8680))
 
 
@@ -160,9 +172,15 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Removed deprecated `GradInformation` module in favor of `pytorch_lightning.utilities.grads` ([#8831](https://github.com/PyTorchLightning/pytorch-lightning/pull/8831/))
 
 
+- Removed `TrainingTypePlugin.on_save` and `Accelerator.on_save` ([#9023](https://github.com/PyTorchLightning/pytorch-lightning/pull/9023))
+
+
 - Removed deprecated `connect_precision_plugin` and `connect_training_type_plugin` from `Accelerator` ([#9019](https://github.com/PyTorchLightning/pytorch-lightning/pull/9019))
 
 
+- Removed `on_train_epoch_end` from `Accelerator` ([#9035](https://github.com/PyTorchLightning/pytorch-lightning/pull/9035))
+
+
 ### Fixed
 
 - Ensure the existence of `DDPPlugin._sync_dir` in `reconciliate_processes` ([#8939](https://github.com/PyTorchLightning/pytorch-lightning/pull/8939))
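
As a rough sketch of the `gpus` parsing change listed under "Changed" (device counts here are illustrative and assume a machine with at least three GPUs):

```python
from pytorch_lightning import Trainer

# After #8770, a plain numeric string behaves like an int:
# it requests the first n visible devices rather than the device with index n.
trainer = Trainer(gpus="2")   # trains on the first 2 GPUs, same as gpus=2

# To target a specific device index, pass a list of indices instead.
trainer = Trainer(gpus=[2])   # trains only on the GPU with index 2
```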

docs/source/governance.rst

Lines changed: 57 additions & 3 deletions
@@ -1,7 +1,14 @@
 .. _governance:
 
-Lightning Governance | Persons of interest
-==========================================
+Lightning Governance
+####################
+
+This document describes governance processes we follow in developing PyTorch Lightning.
+
+Persons of Interest
+*******************
+
+.. _governance_bdfl:
 
 BDFL
 ----
@@ -14,7 +21,7 @@ Leads
 -----
 - Jirka Borovec (`Borda <https://github.com/Borda>`_)
 - Ethan Harris (`ethanwharris <https://github.com/ethanwharris>`_) (Torchbearer founder)
-- Justus Schock (`justusschock <https://github.com/justusschock>`_) (Former Core Member PyTorch Ignite)
+- Justus Schock (`justusschock <https://github.com/justusschock>`_)
 - Adrian Wälchli (`awaelchli <https://github.com/awaelchli>`_)
 - Thomas Chaton (`tchaton <https://github.com/tchaton>`_)
 - Sean Narenthiran (`SeanNaren <https://github.com/SeanNaren>`_)
@@ -44,3 +51,50 @@ Alumni
 - Teddy Koker (`teddykoker <https://github.com/teddykoker>`_)
 - Nate Raw (`nateraw <https://github.com/nateraw>`_)
 - Peter Yu (`yukw777 <https://github.com/yukw777>`_)
+
+
+Releases
+********
+
+We release a new minor version (e.g., 1.5.0) every three months and bugfix releases every week.
+The minor versions contain new features, API changes, deprecations, removals, potential backward-incompatible
+changes and also all previous bugfixes included in any bugfix release. With every release, we publish a changelog
+where we list additions, removals, changed functionality and fixes.
+
+Project Management and Decision Making
+**************************************
+
+The decision about what goes into a release is governed by the :ref:`staff contributors and leaders <governance>` of
+Lightning development. Whenever possible, discussion happens publicly on GitHub and includes the whole community.
+For controversial changes, it is mandatory to seek consultation from :ref:`governance_bdfl` for a final decision.
+When a consensus is reached, staff and core contributors assign milestones and labels to the issue and/or pull request
+and start tracking the development. It is possible that priorities change over time.
+
+Commits to the project are exclusively to be added by pull requests on GitHub and anyone in the community is welcome to
+review them. However, reviews submitted by
+`code owners <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/.github/CODEOWNERS>`_
+have higher weight and it is necessary to get the approval of code owners before a pull request can be merged.
+Additional requirements may apply case by case.
+
+API Evolution
+*************
+
+Lightning's development is driven by research and best practices in a rapidly developing field of AI and machine
+learning. Change is inevitable and when it happens, the Lightning team is committed to minimizing user friction and
+maximizing ease of transition from one version to the next. We take backward compatibility and reproducibility very
+seriously.
+
+For API removal, renaming or other forms of backward-incompatible changes, the procedure is:
+
+#. A deprecation process is initiated at version X, producing warning messages at runtime and in the documentation.
+#. Calls to the deprecated API remain unchanged in their function during the deprecation phase.
+#. Two minor versions in the future, at version X+2, the breaking change takes effect.
+
+The "X+2" rule is a recommendation and not a strict requirement. Longer deprecation cycles may apply for some cases.
+
+New API and features are declared as:
+
+- *Experimental*: Anything labelled as *experimental* or *beta* in the documentation is considered unstable and should
+  not be used in production. The community is encouraged to test the feature and report issues directly on GitHub.
+- *Stable*: Everything not specifically labelled as experimental should be considered stable. Reported issues will be
+  treated with priority.
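
As a rough illustration of the X → X+2 deprecation window described under "API Evolution" (the function names and version numbers below are hypothetical; only `rank_zero_deprecation` is an existing Lightning utility, visible in the `wandb.py` diff further down):

```python
from pytorch_lightning.utilities.warnings import rank_zero_deprecation


def new_api(x):
    return 2 * x


# Deprecated at version X: behaviour is unchanged, but a warning is emitted at runtime.
# At version X+2 the function (and this shim) is removed entirely.
def old_api(x):
    rank_zero_deprecation("`old_api` is deprecated in v1.5 and will be removed in v1.7. Use `new_api` instead.")
    return new_api(x)
```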

pytorch_lightning/accelerators/accelerator.py

Lines changed: 0 additions & 7 deletions
@@ -371,9 +371,6 @@ def lightning_module_state_dict(self) -> Dict[str, Union[Any, Tensor]]:
         """
         return self.training_type_plugin.lightning_module_state_dict()
 
-    def on_save(self, checkpoint: Dict[str, Union[Any, Tensor]]) -> Dict[str, Union[Any, Tensor]]:
-        return self.training_type_plugin.on_save(checkpoint)
-
     def barrier(self, name: Optional[str] = None) -> None:
         self.training_type_plugin.barrier(name=name)
 
@@ -479,10 +476,6 @@ def restore_checkpoint_after_pre_dispatch(self) -> bool:
     def update_global_step(self, total_batch_idx: int, current_global_step: int) -> int:
         return self.training_type_plugin.update_global_step(total_batch_idx, current_global_step)
 
-    def on_train_epoch_end(self) -> None:
-        """Hook to do something on the end of an training epoch."""
-        pass
-
     def on_train_start(self) -> None:
         """Called when train begins."""
         return self.training_type_plugin.on_train_start()

pytorch_lightning/callbacks/early_stopping.py

Lines changed: 3 additions & 1 deletion
@@ -159,7 +159,9 @@ def on_save_checkpoint(
             "patience": self.patience,
         }
 
-    def on_load_checkpoint(self, callback_state: Dict[str, Any]) -> None:
+    def on_load_checkpoint(
+        self, trainer: "pl.Trainer", pl_module: "pl.LightningModule", callback_state: Dict[str, Any]
+    ) -> None:
         self.wait_count = callback_state["wait_count"]
         self.stopped_epoch = callback_state["stopped_epoch"]
         self.best_score = callback_state["best_score"]
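
For custom callbacks, a minimal sketch written against the updated hook signatures used here and in `timer.py` below (the callback and its state key are hypothetical):

```python
from typing import Any, Dict

import pytorch_lightning as pl
from pytorch_lightning.callbacks import Callback


class CounterCallback(Callback):
    """Toy callback whose only persistent state is a counter."""

    def __init__(self) -> None:
        self.batches_seen = 0

    def on_save_checkpoint(
        self, trainer: "pl.Trainer", pl_module: "pl.LightningModule", checkpoint: Dict[str, Any]
    ) -> Dict[str, Any]:
        # The returned dict is stored as this callback's state inside the checkpoint.
        return {"batches_seen": self.batches_seen}

    def on_load_checkpoint(
        self, trainer: "pl.Trainer", pl_module: "pl.LightningModule", callback_state: Dict[str, Any]
    ) -> None:
        # `callback_state` is exactly the dict returned by `on_save_checkpoint`.
        self.batches_seen = callback_state["batches_seen"]
```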

pytorch_lightning/callbacks/timer.py

Lines changed: 3 additions & 1 deletion
@@ -158,7 +158,9 @@ def on_save_checkpoint(
     ) -> Dict[str, Any]:
         return {"time_elapsed": {stage.value: self.time_elapsed(stage) for stage in list(RunningStage)}}
 
-    def on_load_checkpoint(self, callback_state: Dict[str, Any]) -> None:
+    def on_load_checkpoint(
+        self, trainer: "pl.Trainer", pl_module: "pl.LightningModule", callback_state: Dict[str, Any]
+    ) -> None:
         time_elapsed = callback_state.get("time_elapsed", {})
         self._offset = time_elapsed.get(RunningStage.TRAINING.value, 0)

pytorch_lightning/core/hooks.py

Lines changed: 14 additions & 0 deletions
@@ -372,6 +372,16 @@ def configure_sharded_model(self) -> None:
 class DataHooks:
     """Hooks to be used for data related stuff."""
 
+    def __init__(self) -> None:
+        """
+        Attributes:
+            prepare_data_per_node:
+                If True, each LOCAL_RANK=0 will call prepare data.
+                Otherwise only NODE_RANK=0, LOCAL_RANK=0 will prepare data.
+        """
+        super().__init__()
+        self.prepare_data_per_node: bool = True
+
     def prepare_data(self) -> None:
         """
         Use this to download and prepare data.
@@ -405,6 +415,10 @@ def prepare_data(self):
             # call on GLOBAL_RANK=0 (great for shared file systems)
             Trainer(prepare_data_per_node=False)
 
+        Note:
+            Setting ``prepare_data_per_node`` with the trainer flag is deprecated and will be removed in v1.7.0.
+            Please set ``prepare_data_per_node`` in LightningDataModule or LightningModule directly instead.
+
         This is called before requesting the dataloaders:
 
         .. code-block:: python
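
Following the deprecation note added in this hunk, a minimal sketch of setting the attribute on a DataModule instead of passing the Trainer flag (the DataModule itself is hypothetical):

```python
from pytorch_lightning import LightningDataModule, Trainer


class MyDataModule(LightningDataModule):
    def __init__(self) -> None:
        super().__init__()
        # Replaces the deprecated Trainer(prepare_data_per_node=...) flag:
        # True  -> prepare_data() runs once per node (every LOCAL_RANK=0)
        # False -> prepare_data() runs only on NODE_RANK=0, LOCAL_RANK=0
        self.prepare_data_per_node = True

    def prepare_data(self) -> None:
        # download / tokenize / write to disk here; do not assign model state
        ...


trainer = Trainer()  # no prepare_data_per_node argument needed anymore
```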

pytorch_lightning/core/saving.py

Lines changed: 0 additions & 19 deletions
@@ -212,25 +212,6 @@ def _load_model_state(cls, checkpoint: Dict[str, Any], strict: bool = True, **cl
 
         return model
 
-    def on_load_checkpoint(self, checkpoint: Dict[str, Any]) -> None:
-        """
-        Do something with the checkpoint.
-        Gives model a chance to load something before ``state_dict`` is restored.
-
-        Args:
-            checkpoint: A dictionary with variables from the checkpoint.
-        """
-
-    def on_save_checkpoint(self, checkpoint: Dict[str, Any]) -> None:
-        """
-        Give the model a chance to add something to the checkpoint.
-        ``state_dict`` is already there.
-
-        Args:
-            checkpoint: A dictionary in which you can save variables to save in a checkpoint.
-                Contents need to be pickleable.
-        """
-
     # -------------------------
     # OPTIONAL HOOKS
     # -------------------------
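
These definitions are removed from the saving mixin; the equivalent model-level hooks remain overridable on a `LightningModule`. A minimal sketch (the module and the extra state key are hypothetical):

```python
from typing import Any, Dict

import torch
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def __init__(self) -> None:
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)
        self.secret_number = 42  # extra state not covered by state_dict

    def on_save_checkpoint(self, checkpoint: Dict[str, Any]) -> None:
        # state_dict is already in `checkpoint`; add extra pickleable entries here
        checkpoint["secret_number"] = self.secret_number

    def on_load_checkpoint(self, checkpoint: Dict[str, Any]) -> None:
        # called before state_dict is restored
        self.secret_number = checkpoint["secret_number"]
```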

pytorch_lightning/loggers/mlflow.py

Lines changed: 26 additions & 4 deletions
@@ -171,14 +171,24 @@ def experiment(self) -> MlflowClient:
         return self._mlflow_client
 
     @property
-    def run_id(self):
-        # create the experiment if it does not exist to get the run id
+    def run_id(self) -> str:
+        """
+        Create the experiment if it does not exist to get the run id.
+
+        Returns:
+            The run id.
+        """
         _ = self.experiment
         return self._run_id
 
     @property
-    def experiment_id(self):
-        # create the experiment if it does not exist to get the experiment id
+    def experiment_id(self) -> str:
+        """
+        Create the experiment if it does not exist to get the experiment id.
+
+        Returns:
+            The experiment id.
+        """
         _ = self.experiment
         return self._experiment_id
 
@@ -239,8 +249,20 @@ def save_dir(self) -> Optional[str]:
 
     @property
     def name(self) -> str:
+        """
+        Get the experiment id.
+
+        Returns:
+            The experiment id.
+        """
         return self.experiment_id
 
     @property
     def version(self) -> str:
+        """
+        Get the run id.
+
+        Returns:
+            The run id.
+        """
         return self.run_id
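
A short usage sketch for the newly documented properties (the experiment name and tracking URI are illustrative values):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import MLFlowLogger

mlf_logger = MLFlowLogger(experiment_name="default", tracking_uri="file:./ml-runs")

# Accessing either property lazily creates the MLflow experiment/run if needed.
print(mlf_logger.experiment_id)  # also returned by mlf_logger.name
print(mlf_logger.run_id)         # also returned by mlf_logger.version

trainer = Trainer(logger=mlf_logger)
```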

pytorch_lightning/loggers/wandb.py

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@
 from pytorch_lightning.utilities import _module_available, rank_zero_only
 from pytorch_lightning.utilities.exceptions import MisconfigurationException
 from pytorch_lightning.utilities.imports import _compare_version
-from pytorch_lightning.utilities.warnings import rank_zero_deprecation, rank_zero_warn
+from pytorch_lightning.utilities.warnings import rank_zero_warn
 
 _WANDB_AVAILABLE = _module_available("wandb")
 _WANDB_GREATER_EQUAL_0_10_22 = _compare_version("wandb", operator.ge, "0.10.22")

pytorch_lightning/loops/epoch/training_epoch_loop.py

Lines changed: 4 additions & 2 deletions
@@ -135,8 +135,10 @@ def advance(self, dataloader_iter: Iterator, **kwargs: Any) -> None:
         # ------------------------------------
         # TRAINING_STEP + TRAINING_STEP_END
         # ------------------------------------
-        with self.trainer.profiler.profile("training_batch_to_device"):
-            batch = self.trainer.accelerator.batch_to_device(batch)
+        # FIXME: Remove with InterBatchProcessor.
+        if not self.trainer.data_connector.data_fetcher.store_on_device:
+            with self.trainer.profiler.profile("training_batch_to_device"):
+                batch = self.trainer.accelerator.batch_to_device(batch)
 
         self.batch_progress.increment_ready()
