support for bf16 in lightning trainer #8874

@yuvalkirstain

Description

🚀 Feature

Many models (e.g. T5) were pretrained using bf16, and as such, training them with fp16 results in NaNs. PyTorch now supports bf16. I think that supporting bf16 is a must-have feature (and since my ongoing research builds upon PyTorch Lightning, it would be amazing if it were integrated quickly :) )
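For context, here is a minimal sketch of what native bf16 mixed precision looks like in plain PyTorch, i.e. what the Trainer would need to wrap. The toy model, data, and optimizer are illustrative only, and it assumes a PyTorch version and GPU with bf16 autocast support:

```python
import torch

# Illustrative toy model and batch; a real use case would be a bf16-pretrained
# model such as T5.
model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 512, device="cuda")

# bf16 keeps the fp32 exponent range, so unlike fp16 mixed precision
# no GradScaler is needed.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    out = model(x)
    loss = out.float().pow(2).mean()

loss.backward()
optimizer.step()
optimizer.zero_grad()
```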

Motivation

Many models (especially in NLP, e.g. T5) cannot be trained with fp16 when using pytorch-lightning, so they have to fall back to fp32. This makes some of them practically unusable (due to memory consumption), and others simply slow to train.

Pitch

I want to pass an argument to the Trainer, something like precision="bf16", and have it train the model in bf16, as sketched below.
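A minimal sketch of the proposed usage. The "bf16" value for precision is the suggestion of this issue rather than an existing Trainer option at the time of writing, and the toy module, data, and gpus=1 are illustrative placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

# Minimal stand-in module; in practice this would wrap a bf16-pretrained
# model such as T5.
class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-3)

train_loader = DataLoader(
    TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))),
    batch_size=8,
)

# Proposed API: reuse the Trainer's existing `precision` argument
# (precision=16 selects fp16 today) and accept "bf16" as well.
trainer = pl.Trainer(gpus=1, max_epochs=1, precision="bf16")
trainer.fit(LitModel(), train_loader)
```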

Labels

feature, help wanted, priority: 0
