Support Habana's Gaudi Accelerator #10214

@SeanNaren

🚀 Feature

Recently Intel's Habana Labs announced AWS nodes for their Gaudi Accelerator. As described in the reddit post, there seems to be a lot of potential for model-parallel solutions that require a fast interconnect between processors.

Looking at the PyTorch migration guide, there is room for Lightning to make this process seamless, so that users do not have to worry about these details (especially the KCRS -> RSCK weight format conversion for convolutions).
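To make the KCRS -> RSCK point concrete, here is a minimal sketch of the layout change. It assumes the usual reading of the letters (K = output channels, C = input channels, R = kernel height, S = kernel width) and uses plain nested lists so it runs without any framework. The helper name `kcrs_to_rsck` is hypothetical and is not taken from the migration guide or from Lightning. On a PyTorch tensor the same reordering would be `weight.permute(2, 3, 1, 0)`.

```python
def kcrs_to_rsck(weight):
    """Reorder a 4D nested list from [K][C][R][S] to [R][S][C][K].

    Hypothetical helper for illustration only; the axis meanings
    (K=out channels, C=in channels, R=kernel height, S=kernel width)
    are an assumption stated in the text above.
    """
    K = len(weight)
    C = len(weight[0])
    R = len(weight[0][0])
    S = len(weight[0][0][0])
    # Build the output so that out[r][s][c][k] == weight[k][c][r][s].
    return [
        [
            [[weight[k][c][r][s] for k in range(K)] for c in range(C)]
            for s in range(S)
        ]
        for r in range(R)
    ]


# Tiny example: 2 output channels, 1 input channel, 1x2 kernel.
kcrs = [
    [[[1.0, 2.0]]],  # k = 0
    [[[3.0, 4.0]]],  # k = 1
]
rsck = kcrs_to_rsck(kcrs)
# The element at k=1, c=0, r=0, s=1 moves to r=0, s=1, c=0, k=1.
assert rsck[0][1][0][1] == kcrs[1][0][0][1]
```

If Lightning handled this, the conversion would presumably happen inside the accelerator rather than in user code, which is the kind of detail the proposal below aims to hide.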

I suggest we introduce a new accelerator type so that this can be enabled in Lightning through the accelerator flag, i.e.:

trainer = Trainer(accelerator='hpu')

HPU is what the accelerator seems to be called within the migration guide.

cc @Borda

Labels: feature, help wanted
