Add 2d-2d support to MXFP8 Grouped GEMM #4816
Conversation
This pull request was exported from Phabricator. Differential Revision: D81362680
Summary:
Pull Request resolved: pytorch#4816
X-link: facebookresearch/FBGEMM#1846

## MXFP8 grouped GEMM updates to (1) handle the 2d-2d case and (2) provide a PyTorch-compliant API
- Add support for 2d-2d inputs with dynamic groups along the K dimension (see the reference sketch below)
- Add tests verifying correct numerics for both the 2d-2d and 2d-3d cases, with random group sizes
- Add benchmarks for both the 2d-3d and 2d-2d cases

Reviewed By: ngimel, cthi

Differential Revision: D81362680
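For readers unfamiliar with the 2d-2d layout, the sketch below illustrates the semantics the new kernel targets: both operands are 2D, group boundaries are given as offsets along the shared K dimension, and each group produces an independent [M, N] result. This is a minimal plain-PyTorch reference in bf16 that ignores MXFP8 quantization and block scaling entirely; the function name, the cumulative-offsets convention, and the shapes are illustrative assumptions, not the FBGEMM kernel API.

```python
import torch

def grouped_mm_2d2d_reference(A, B, k_offsets):
    """Illustrative reference only: grouped GEMM where both operands are 2D
    and groups partition the shared K dimension.

    A: [M, K_total], B: [K_total, N], k_offsets: [G] cumulative group ends.
    Returns: [G, M, N], one independent matmul per K-slice.
    """
    G = k_offsets.numel()
    M, N = A.shape[0], B.shape[1]
    out = torch.empty(G, M, N, dtype=torch.bfloat16, device=A.device)
    start = 0
    for g in range(G):
        end = int(k_offsets[g])
        # Each group contracts over its own K-slice (e.g., the tokens routed
        # to one expert when computing weight gradients in MoE training).
        out[g] = A[:, start:end] @ B[start:end, :]
        start = end
    return out

# Example with 3 groups of uneven (dynamic) sizes along K.
device = "cuda" if torch.cuda.is_available() else "cpu"
M, N = 64, 128
group_ks = [96, 32, 64]  # per-group K sizes (illustrative)
k_offsets = torch.tensor(group_ks).cumsum(0).to(device)
A = torch.randn(M, sum(group_ks), dtype=torch.bfloat16, device=device)
B = torch.randn(sum(group_ks), N, dtype=torch.bfloat16, device=device)
out = grouped_mm_2d2d_reference(A, B, k_offsets)
print(out.shape)  # torch.Size([3, 64, 128])
```

In MoE training, the backward pass for the expert weights maps onto exactly this shape: the token dimension becomes the contraction dimension and is partitioned per expert, which is why the 2d-2d case is needed for the mxfp8 MoE backward pass described in the PyTorch-side PR below.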
This pull request has been merged in c6a8daf.
…ump (#162209)

## Summary
- We just landed 2d-2d support for mxfp8 grouped gemm in FBGEMM: pytorch/FBGEMM#4816
- This is needed for the backward pass of mxfp8 MoE training with grouped gemms
- Changes:
  - Add dispatching + input validation for mxfp8 grouped gemm in `torch._scaled_grouped_mm` (see the sketch below)
  - Add meta registration input validation for mxfp8 grouped gemm, for composability with compile
  - Add unit tests exercising `torch._scaled_grouped_mm` with mxfp8 inputs
  - Bump the FBGEMM third-party submodule to include:
    - pytorch/FBGEMM#4816
    - pytorch/FBGEMM#4820
    - pytorch/FBGEMM#4821
    - pytorch/FBGEMM#4823

#### How the fbgemm dependency was bumped
Documenting this since I haven't found it documented elsewhere:
```
cd ~/pytorch/third_party/fbgemm
git fetch
git checkout <hash>
cd ~/pytorch
git add third_party/fbgemm
```

## Test plan

#### Test build
```
USE_FBGEMM_GENAI=1 python -m pip install --no-build-isolation -v -e .
...
Successfully installed torch-2.9.0a0+gitf5070f3
```
[full build log](https://www.internalfb.com/phabricator/paste/view/P1933787581)

#### Unit tests
```
pytest test/test_matmul_cuda.py -k test_mxfp8_scaled_grouped_mm_
...
test/test_matmul_cuda.py ......... [100%]
========== 9 passed, 1668 deselected in 5.34s ==========
```

Pull Request resolved: #162209
Approved by: https://github.com/ngimel
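The "dispatching + input validation" and "meta registration" items above boil down to checking mxfp8-specific invariants before (or without) launching the kernel. The snippet below is a hypothetical, standalone sketch of the kinds of checks involved for the 2d-2d case (e4m3 data, e8m0 block scales, K divisible by the 32-element MX block, 1-D int32 group offsets); the function name and the exact set of checks are assumptions, not the actual validation code in `torch._scaled_grouped_mm`.

```python
import torch

MX_BLOCK_SIZE = 32  # MXFP8 uses one e8m0 exponent per 32 elements along K

def check_mxfp8_grouped_mm_inputs(mat_a, mat_b, scale_a, scale_b, offs):
    """Hypothetical validation sketch for the 2d-2d mxfp8 grouped GEMM case.

    Mirrors the *kind* of checks a meta registration performs so that
    torch.compile can shape-check the op without running the CUDA kernel;
    not the actual checks in torch._scaled_grouped_mm.
    """
    assert mat_a.dtype == torch.float8_e4m3fn and mat_b.dtype == torch.float8_e4m3fn, \
        "mxfp8 grouped GEMM expects fp8 e4m3 data tensors"
    assert scale_a.dtype == torch.float8_e8m0fnu and scale_b.dtype == torch.float8_e8m0fnu, \
        "MX block scales are e8m0 exponents"
    assert mat_a.dim() == 2 and mat_b.dim() == 2, "2d-2d case: both operands are 2D"
    assert mat_a.shape[1] == mat_b.shape[0], "shared K dimension must match"
    assert offs is not None and offs.dtype == torch.int32 and offs.dim() == 1, \
        "group offsets must be a 1-D int32 tensor"
    assert mat_a.shape[1] % MX_BLOCK_SIZE == 0, \
        f"K must be divisible by the MX block size ({MX_BLOCK_SIZE})"
    # The 2d-2d case produces one [M, N] result per group.
    return (offs.numel(), mat_a.shape[0], mat_b.shape[1])
```

Shape checks of this kind in a meta registration let torch.compile trace the op and infer the [num_groups, M, N] output shape without touching the GPU.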