Description
🐛 Bug
For distributed training, if a subset of ranks fails during some training step, the current setting tries to shut down gracefully by calling `self.train_loop.on_train_end()` in a `finally` block.
However, since not all ranks enter `on_train_end`, the model-checkpoint logic performed there hangs while broadcasting.
See also discussions #6807 and #6791.
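To see why a partial failure hangs the remaining ranks, here is a minimal, framework-free sketch. It uses `threading.Barrier` as a stand-in for the collective broadcast; all names are illustrative, and a timeout is added so the example terminates instead of hanging forever:

```python
import threading

# Simulate two "ranks" that must all reach a collective (the barrier).
# If one rank fails before the collective, the other blocks until timeout,
# mirroring the checkpoint-broadcast hang described above.
barrier = threading.Barrier(2)
results = {}

def rank(rank_id, fail):
    if fail:
        results[rank_id] = "failed before collective"
        return  # this rank never reaches the barrier
    try:
        barrier.wait(timeout=0.5)  # stands in for the broadcast
        results[rank_id] = "broadcast ok"
    except threading.BrokenBarrierError:
        results[rank_id] = "hung waiting for peers"

threads = [threading.Thread(target=rank, args=(0, True)),
           threading.Thread(target=rank, args=(1, False))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

In a real job there is no timeout on the collective, so the surviving ranks block indefinitely.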
Pitch
Keep the special case for the KeyboardInterrupt exception; for all other exceptions, re-raise and remove the `finally` block:
```python
try:
    # ... (run training epochs)
    # hook
    self.train_loop.on_train_end()
except KeyboardInterrupt:
    rank_zero_warn('Detected KeyboardInterrupt, attempting graceful shutdown...')
    # user could press Ctrl+C many times... only shutdown once
    if not self.interrupted:
        self.state = TrainerState.INTERRUPTED
        self.on_keyboard_interrupt()
        self.train_loop.on_train_end()
except BaseException:
    print_exc()
    raise
```
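The proposed control flow can be exercised outside Lightning with a small standalone sketch. The class and method names below (`MiniTrainer`, `run_epochs`) are placeholders, not the actual Trainer API; the point is that `on_train_end` runs only on normal completion or on a (single) KeyboardInterrupt, while any other exception is re-raised without touching the checkpoint hook:

```python
from traceback import print_exc

class MiniTrainer:
    """Hypothetical stand-in for the Trainer, tracking which hooks fire."""

    def __init__(self):
        self.interrupted = False
        self.calls = []

    def on_train_end(self):
        self.calls.append("on_train_end")

    def on_keyboard_interrupt(self):
        self.calls.append("on_keyboard_interrupt")

    def fit(self, run_epochs):
        try:
            run_epochs()
            self.on_train_end()  # hook runs only on normal completion
        except KeyboardInterrupt:
            # user could press Ctrl+C many times... only shut down once
            if not self.interrupted:
                self.interrupted = True
                self.on_keyboard_interrupt()
                self.on_train_end()
        except BaseException:
            # any other failure: log it and re-raise; with no finally block,
            # on_train_end (and its checkpoint broadcast) is skipped
            print_exc()
            raise
```

With this structure, a rank that hits a RuntimeError propagates the error immediately instead of entering the checkpoint broadcast and hanging.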
To Reproduce
Use the following BoringModel and post it here.
Expected behavior
Environment
Note: Bugs with code are solved faster! Colab Notebooks should be made public!
- IDE: Please use our python bug_report_model.py template.
- Colab Notebook: Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:

```shell
wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master/tests/collect_env_details.py
# For security purposes, please check the contents of collect_env_details.py before running it.
python collect_env_details.py
```
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (`conda`, `pip`, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information: