🚀 Feature
Allow specification of the gradient clipping norm_type, which is currently fixed to the Euclidean (L2) norm.
Motivation
We are using PyTorch Lightning to improve training performance in a standalone Federated Learning context (experimental setting). In this context the locally trained models diverge from their underlying data and are aggregated on the server side, which in general leads to larger gradients. To preserve the direction of the gradient while limiting the magnitude per single dimension, we need to apply the inf norm.
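For illustration, here is a small standalone snippet (not Lightning code) using `torch.nn.utils.clip_grad_norm_`, which already accepts a `norm_type` argument: with `norm_type=float("inf")` the gradient is rescaled so that its largest absolute component equals the clip value, while its direction is unchanged.

```python
import torch
from torch.nn.utils import clip_grad_norm_

# Toy parameter with a gradient that has one dominant component.
param = torch.nn.Parameter(torch.zeros(4))
param.grad = torch.tensor([10.0, 0.1, 0.2, 0.3])

# Default Euclidean (L2) clipping: scale factor is max_norm / ||g||_2.
clip_grad_norm_([param], max_norm=1.0, norm_type=2.0)
print(param.grad)  # roughly tensor([0.9993, 0.0100, 0.0200, 0.0300])

# Inf-norm clipping: scale factor is max_norm / max|g_i|, so the largest
# component is capped exactly at max_norm and the direction is preserved.
param.grad = torch.tensor([10.0, 0.1, 0.2, 0.3])
clip_grad_norm_([param], max_norm=1.0, norm_type=float("inf"))
print(param.grad)  # tensor([1.0000, 0.0100, 0.0200, 0.0300])
```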
Pitch
Add a parameter gradient_clipping_norm_type: float = 2.0 to the Trainer and pass it through to the _clip_gradients method, changing the call from
_clip_gradients(optimizer, grad_clip_val)
to something like
_clip_gradients(optimizer, grad_clip_val, grad_clip_norm_type)
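To make the scope concrete, here is a minimal standalone sketch of the proposed plumbing. The function below is a stand-in for Lightning's internal helper (its real signature may differ); since `torch.nn.utils.clip_grad_norm_` already supports `norm_type`, the change is mostly about forwarding the new argument.

```python
import torch
from torch.nn.utils import clip_grad_norm_

def _clip_gradients(optimizer, grad_clip_val, grad_clip_norm_type=2.0):
    """Stand-in for the Lightning helper: clip all gradients handled by
    this optimizer with the requested p-norm (2.0 keeps today's behaviour,
    float('inf') covers our Federated Learning use case)."""
    parameters = [p for group in optimizer.param_groups for p in group["params"]]
    clip_grad_norm_(parameters, max_norm=grad_clip_val, norm_type=grad_clip_norm_type)

# Example usage with a toy model and optimizer.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
_clip_gradients(optimizer, grad_clip_val=1.0, grad_clip_norm_type=float("inf"))
optimizer.step()
```

A matching gradient_clipping_norm_type argument on the Trainer would default to 2.0, so existing behaviour stays unchanged.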
Additional context
The impact is minimal and only affects the _clip_gradients function and the underlying clipping call it makes.