
Commit 2a19336

Merge branch 'master' into s-rog-patch-2
2 parents d1cd509 + 5f34f2b commit 2a19336

26 files changed (+528, -134 lines)

.github/workflows/ci_test-base.yml

Lines changed: 1 addition & 2 deletions
@@ -76,8 +76,7 @@ jobs:
       with:
         name: pytest-results-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.requires }}
         path: junit/test-results-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.requires }}.xml
-      # Use always() to always run this step to publish test results when there are test failures
-      if: always()
+      if: failure()

     - name: Statistics
       if: success()

.github/workflows/ci_test-conda.yml

Lines changed: 1 addition & 2 deletions
@@ -50,5 +50,4 @@ jobs:
       with:
         name: pytest-results-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.requires }}
         path: junit/test-results-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.requires }}.xml
-      # Use always() to always run this step to publish test results when there are test failures
-      if: always()
+      if: failure()

.github/workflows/ci_test-full.yml

Lines changed: 1 addition & 2 deletions
@@ -129,8 +129,7 @@ jobs:
       with:
         name: pytest-results-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.requires }}
         path: junit/test-results-${{ runner.os }}-${{ matrix.python-version }}-${{ matrix.requires }}.xml
-      # Use always() to always run this step to publish test results when there are test failures
-      if: always()
+      if: failure()

     - name: Statistics
       if: success()

CHANGELOG.md

Lines changed: 7 additions & 2 deletions
@@ -39,6 +39,11 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 ### Fixed
 
+- Fixed trainer by default `None` in `DDPAccelerator` ([#4915](https://github.com/PyTorchLightning/pytorch-lightning/pull/4915))
+
+
+- Fixed `LightningOptimizer` exposes optimizer attributes ([#5095](https://github.com/PyTorchLightning/pytorch-lightning/pull/5095))
+
 
 
 ## [1.1.0] - 2020-12-09
@@ -80,9 +85,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
 ### Changed
 
-- Removed `multiclass_roc` and `multiclass_precision_recall_curve`, use `roc` and `precision_recall_curve` instead ([#4549](https://github.com/PyTorchLightning/pytorch-lightning/pull/4549))
 - Tuner algorithms will be skipped if `fast_dev_run=True` ([#3903](https://github.com/PyTorchLightning/pytorch-lightning/pull/3903))
-- WandbLogger does not force wandb `reinit` arg to True anymore and creates a run only when needed ([#4648](https://github.com/PyTorchLightning/pytorch-lightning/pull/4648))
+- `WandbLogger` does not force wandb `reinit` arg to True anymore and creates a run only when needed ([#4648](https://github.com/PyTorchLightning/pytorch-lightning/pull/4648))
 - Changed `automatic_optimization` to be a model attribute ([#4602](https://github.com/PyTorchLightning/pytorch-lightning/pull/4602))
 - Changed `Simple Profiler` report to order by percentage time spent + num calls ([#4880](https://github.com/PyTorchLightning/pytorch-lightning/pull/4880))
 - Simplify optimization Logic ([#4984](https://github.com/PyTorchLightning/pytorch-lightning/pull/4984))
@@ -100,6 +104,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 ### Removed
 
 - Removed `reorder` parameter of the `auc` metric ([#5004](https://github.com/PyTorchLightning/pytorch-lightning/pull/5004))
+- Removed `multiclass_roc` and `multiclass_precision_recall_curve`, use `roc` and `precision_recall_curve` instead ([#4549](https://github.com/PyTorchLightning/pytorch-lightning/pull/4549))
 
 ### Fixed

docs/source/multi_gpu.rst

Lines changed: 1 addition & 1 deletion
@@ -663,7 +663,7 @@ It is highly recommended to use Sharded Training in multi-GPU environments where
 A technical note: as batch size scales, storing activations for the backwards pass becomes the bottleneck in training. As a result, sharding optimizer state and gradients becomes less impactful.
 Work within the future will bring optional sharding to activations and model parameters to reduce memory further, but come with a speed cost.
 
-To use Sharded Training, you need to first install FairScale using the command below or install all extras using ``pip install pytorch-lightning["extra"]``.
+To use Sharded Training, you need to first install FairScale using the command below.
 
 .. code-block:: bash

docs/source/optimizers.rst

Lines changed: 36 additions & 13 deletions
@@ -191,46 +191,69 @@ override the :meth:`optimizer_step` function.
 
 For example, here step optimizer A every 2 batches and optimizer B every 4 batches
 
-.. testcode::
+.. note:: When using Trainer(enable_pl_optimizer=True), there is no need to call `.zero_grad()`.
 
-    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, second_order_closure=None, on_tpu=False, using_native_amp=False, using_lbfgs=False):
-        optimizer.step()
+.. testcode::
 
     def optimizer_zero_grad(self, current_epoch, batch_idx, optimizer, opt_idx):
        optimizer.zero_grad()
 
     # Alternating schedule for optimizer steps (ie: GANs)
-    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, second_order_closure=None, on_tpu=False, using_native_amp=False, using_lbfgs=False):
+    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, closure, on_tpu=False, using_native_amp=False, using_lbfgs=False):
         # update generator opt every 2 steps
         if optimizer_i == 0:
             if batch_nb % 2 == 0 :
-                optimizer.step()
-                optimizer.zero_grad()
+                optimizer.step(closure=closure)
 
         # update discriminator opt every 4 steps
         if optimizer_i == 1:
             if batch_nb % 4 == 0 :
-                optimizer.step()
-                optimizer.zero_grad()
+                optimizer.step(closure=closure)
+
+.. note:: When using ``Trainer(enable_pl_optimizer=True)``, ``.step`` accepts a boolean ``make_optimizer_step`` which can be used as follow.
+
+.. testcode::
+
+    def optimizer_zero_grad(self, current_epoch, batch_idx, optimizer, opt_idx):
+        optimizer.zero_grad()
+
+    # Alternating schedule for optimizer steps (ie: GANs)
+    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, closure, on_tpu=False, using_native_amp=False, using_lbfgs=False):
+        # update generator opt every 2 steps
+        if optimizer_i == 0:
+            optimizer.step(closure=closure, make_optimizer_step=(batch_nb % 2) == 0)
 
-    # ...
-    # add as many optimizers as you want
+        # update discriminator opt every 4 steps
+        if optimizer_i == 1:
+            optimizer.step(closure=closure, make_optimizer_step=(batch_nb % 4) == 0)
 
 Here we add a learning-rate warm up
 
 .. testcode::
 
     # learning rate warm-up
-    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, second_order_closure=None, on_tpu=False, using_native_amp=False, using_lbfgs=False):
+    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, closure, on_tpu=False, using_native_amp=False, using_lbfgs=False):
         # warm up lr
         if self.trainer.global_step < 500:
             lr_scale = min(1., float(self.trainer.global_step + 1) / 500.)
             for pg in optimizer.param_groups:
                 pg['lr'] = lr_scale * self.hparams.learning_rate
 
         # update params
-        optimizer.step()
-        optimizer.zero_grad()
+        optimizer.step(closure=closure)
+
+The default ``optimizer_step`` is relying on the internal ``LightningOptimizer`` to properly perform a step.
+
+.. testcode::
+
+    from pytorch_lightning.core.optimizer import LightningOptimizer
+
+    # function hook in LightningModule
+    def optimizer_step(self, current_epoch, batch_nb, optimizer, optimizer_idx, closure, on_tpu=False, using_native_amp=False, using_lbfgs=False):
+        if not isinstance(optimizer, LightningOptimizer):
+            # wraps into LightingOptimizer only for running step
+            optimizer = LightningOptimizer.to_lightning_optimizer(optimizer, self.trainer)
+        optimizer.step(closure=closure)
 
 ----------
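The updated examples above replace the manual `optimizer.step(); optimizer.zero_grad()` pair with `optimizer.step(closure=closure)`. For readers unfamiliar with closure-driven stepping, the standalone sketch below (not part of the commit; the model and data are made up) shows the plain `torch.optim.Optimizer.step(closure)` contract that the new docs rely on:

    import torch

    # toy model and data, made up purely for illustration
    model = torch.nn.Linear(4, 1)
    x, y = torch.randn(8, 4), torch.randn(8, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    def closure():
        # re-evaluate the loss and backpropagate; the optimizer calls this inside step()
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        return loss

    # step() runs the closure first, then applies the parameter update
    optimizer.step(closure)

Because the optimizer itself invokes the closure, a wrapper can decide when the actual parameter update happens while the loss/backward logic stays inside the closure, which is the idea behind the `make_optimizer_step` flag described above.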

pytorch_lightning/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 """Root package info."""
 
-__version__ = '1.1.0'
+__version__ = '1.1.1rc0'
 __author__ = 'William Falcon et al.'
 __author_email__ = '[email protected]'
 __license__ = 'Apache-2.0'

pytorch_lightning/core/lightning.py

Lines changed: 1 addition & 3 deletions
@@ -1170,7 +1170,6 @@ def toggle_optimizer(self, optimizer: Optimizer, optimizer_idx: int):
 
     def optimizer_step(
         self,
-        *args,
         epoch: int = None,
         batch_idx: int = None,
         optimizer: Optimizer = None,
@@ -1179,7 +1178,6 @@ def optimizer_step(
         on_tpu: bool = None,
         using_native_amp: bool = None,
         using_lbfgs: bool = None,
-        **kwargs,
     ) -> None:
         r"""
         Override this method to adjust the default way the
@@ -1254,7 +1252,7 @@ def optimizer_step(self, epoch, batch_idx, optimizer, optimizer_idx,
         if not isinstance(optimizer, LightningOptimizer):
             # wraps into LightingOptimizer only for running step
             optimizer = LightningOptimizer.to_lightning_optimizer(optimizer, self.trainer)
-        optimizer.step(closure=optimizer_closure, *args, **kwargs)
+        optimizer.step(closure=optimizer_closure)
 
     def optimizer_zero_grad(
         self, epoch: int, batch_idx: int, optimizer: Optimizer, optimizer_idx: int

pytorch_lightning/core/optimizer.py

Lines changed: 25 additions & 3 deletions
@@ -57,12 +57,35 @@ def __init__(self,
         else:
             self.__class__ = type("Lightning" + optimizer.__class__.__name__, (self.__class__, optimizer.__class__), {})
 
-        self._trainer = None
         self._optimizer = optimizer
+        self._trainer = None
         self._accumulate_grad_batches = accumulate_grad_batches
-        self._automatic_optimization = None
         self._optimizer_idx = None
 
+    @property
+    def defaults(self):
+        return self._optimizer.defaults
+
+    @defaults.setter
+    def defaults(self, defaults):
+        self._optimizer.defaults = defaults
+
+    @property
+    def state(self):
+        return self._optimizer.state
+
+    @state.setter
+    def state(self, state):
+        self._optimizer.state = state
+
+    @property
+    def param_groups(self):
+        return self._optimizer.param_groups
+
+    @param_groups.setter
+    def param_groups(self, param_groups):
+        self._optimizer.param_groups = param_groups
+
     @property
     def accumulate_grad_batches(self):
         return self._accumulate_grad_batches
@@ -73,7 +96,6 @@ def accumulate_grad_batches(self, accumulate_grad_batches):
 
     def _on_trainer_init(self, trainer):
         self._trainer = proxy(trainer)
-        self._automatic_optimization = trainer.train_loop.automatic_optimization
         for opt_idx, opt in enumerate(trainer.optimizers):
             if opt == self._optimizer:
                 self._optimizer_idx = opt_idx
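The new `defaults`, `state` and `param_groups` properties simply forward reads and writes to the wrapped optimizer, which is what the changelog entry "Fixed `LightningOptimizer` exposes optimizer attributes" refers to. A stripped-down sketch of that delegation pattern (illustrative only, not the real `LightningOptimizer`, which does more):

    import torch

    class OptimizerProxy:
        """Minimal sketch of forwarding attribute access to a wrapped optimizer."""

        def __init__(self, optimizer):
            self._optimizer = optimizer

        @property
        def param_groups(self):
            return self._optimizer.param_groups

        @param_groups.setter
        def param_groups(self, param_groups):
            self._optimizer.param_groups = param_groups

        @property
        def state(self):
            return self._optimizer.state

        @state.setter
        def state(self, state):
            self._optimizer.state = state

    # reads and writes go straight through to the wrapped SGD instance
    params = [torch.nn.Parameter(torch.zeros(3))]
    proxy = OptimizerProxy(torch.optim.SGD(params, lr=0.01))
    proxy.param_groups[0]["lr"] = 0.001
    print(proxy.param_groups[0]["lr"])  # 0.001

Forwarding the setters as well keeps code that assigns to these attributes, rather than mutating them in place, working against the wrapped optimizer.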

pytorch_lightning/metrics/classification/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
from pytorch_lightning.metrics.classification.accuracy import Accuracy
1515
from pytorch_lightning.metrics.classification.average_precision import AveragePrecision
1616
from pytorch_lightning.metrics.classification.confusion_matrix import ConfusionMatrix
17-
from pytorch_lightning.metrics.classification.f_beta import FBeta, F1
17+
from pytorch_lightning.metrics.classification.f_beta import FBeta, Fbeta, F1
1818
from pytorch_lightning.metrics.classification.precision_recall import Precision, Recall
1919
from pytorch_lightning.metrics.classification.precision_recall_curve import PrecisionRecallCurve
2020
from pytorch_lightning.metrics.classification.roc import ROC
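`Fbeta` is re-exported here alongside `FBeta` and `F1`; all of them are built around the F-beta score. As a reminder of the quantity these metrics compute, here is the binary case worked out by hand in plain PyTorch (made-up data, independent of Lightning's metric classes):

    import torch

    # tiny binary example; the values are made up
    preds  = torch.tensor([1, 0, 1, 1, 0, 1])
    target = torch.tensor([1, 0, 0, 1, 0, 0])
    beta = 0.5

    tp = ((preds == 1) & (target == 1)).sum().float()
    fp = ((preds == 1) & (target == 0)).sum().float()
    fn = ((preds == 0) & (target == 1)).sum().float()

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)

    # F-beta: beta < 1 favours precision, beta > 1 favours recall; beta = 1 gives F1
    f_beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    print(f_beta)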
