Arm backend: Add support for single input matmul #10654

oscarandersson8218 · 2025-05-02T10:24:43Z

Summary

AnnotateDecomposedMatmul makes sure that a decomposed matmul has two dq-nodes before and a q-node after its mm/bmm-node. Previously it assumed that the partition always had two input nodes (two dq-nodes), but this is not the case for a single-input matmul, e.g. torch.matmul(x, x). In such a case we must copy the dq-node and insert one copy before each of the mm/bmm's two inputs.

Before pass:
         -> expand -> view ->
       /                      \
x -> dq                        bmm -> view -> q
       \                      /
         -> expand -> view ->

After pass:
   -> expand -> view -> dq
 /                        \
x                          bmm -> q -> view
 \                        /
   -> expand -> view -> dq
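
The input-side rewrite in the diagrams above can be sketched on a toy graph. This is only an illustration under stated assumptions: `Node`, `path_up_to`, `give_each_input_its_own_dq`, and `describe` are hypothetical names, the graph is plain Python rather than the torch.fx graph the real pass works on, and the output-side move of the q-node is not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a graph node: an op name plus its producers."""
    op: str
    inputs: List["Node"] = field(default_factory=list)


def path_up_to(node: Node, stop_op: str) -> List[Node]:
    """Follow first inputs upward from `node` until a `stop_op` node is reached.

    Assumes such a node exists on the path (true for the toy graph below).
    """
    path = [node]
    while path[-1].op != stop_op:
        path.append(path[-1].inputs[0])
    return path


def give_each_input_its_own_dq(bmm: Node) -> None:
    """Bypass the shared dq on every branch and put a fresh dq right before bmm."""
    for i, branch_end in enumerate(list(bmm.inputs)):
        path = path_up_to(branch_end, "dq")
        if len(path) == 1:
            continue  # this input is already fed directly by a dq node
        shared_dq, first_after_dq = path[-1], path[-2]
        first_after_dq.inputs = list(shared_dq.inputs)  # x now feeds expand
        bmm.inputs[i] = Node("dq", [branch_end])        # one dq copy per input


def describe(node: Node) -> str:
    """Render the single-input chain that ends at `node`, source first."""
    ops = []
    while True:
        ops.append(node.op)
        if not node.inputs:
            break
        node = node.inputs[0]
    return " -> ".join(reversed(ops))


# "Before pass": one dq-node shared by both bmm inputs, as in torch.matmul(x, x).
x = Node("x")
dq = Node("dq", [x])
left = Node("view", [Node("expand", [dq])])
right = Node("view", [Node("expand", [dq])])
bmm = Node("bmm", [left, right])

give_each_input_its_own_dq(bmm)
for branch in bmm.inputs:
    print(describe(branch), "-> bmm")
```

After the call, each bmm input is its own dq-node, which is the property the annotation step relies on.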

cc @digantdesai @freddan80 @per @zingo

Signed-off-by: Oscar Andersson <[email protected]>
Change-Id: I5ac381ccd712a535736fa16d1ee864dc76ae2b30

pytorch-bot · 2025-05-02T10:24:46Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10654

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

CI workflows being skipped on PR

❌ 2 New Failures

As of commit aedd502 with merge base e88aafc:

NEW FAILURES - The following jobs have failed:

pull / unittest-editable / macos / macos-job (gh)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1
trunk / test-llama-runner-mac (fp32, coreml) / macos-job (gh)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zingo · 2025-05-05T08:48:38Z

The macOS failures are unrelated.

- ConvertMmToBmmPass converts an MM node to BMM nodes, turns input and output tensors from rank-2 to rank-3 via unsqueeze/squeeze, and inserts q-dq before and after BMM node when necessary.
- After ConvertMmToBmmPass:

```
x -> q   -> dq   -> unsqueeze -> q_2 -> dq_2 ->
                                                \
                                                  bmm -> q_4 -> dq_4
                                                /
y -> q_1 -> dq_1 -> unsqueeze -> q_3 -> dq_3 ->
```

- Therefore, if the original matmul was 2D, the bmm already has DQ nodes on its inputs and Q node on its output. If AnnotateDecomposedMatmulPass (#10654) is still applied in this case, it produces illegal sequences such as: x -> q -> unsqueeze -> q_2 (invalid)
- Fix by checking whether the BMM is already surrounded by DQ nodes on its inputs and Q nodes on its output.

Change-Id: I9949d59b0b4a96fa34a88b0734014567ea6f24cc
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218
Signed-off-by: Yufeng Shi <[email protected]>
Co-authored-by: Oscar Andersson <[email protected]>
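
The check described in that fix (skip the annotation when the BMM already has DQ nodes on its inputs and Q nodes on its output) might look roughly like the sketch below. It is a plain-Python stand-in, not the code from the follow-up change: `Node`, `connect`, and `already_quantized` are hypothetical names, and ops are matched by a simple string instead of real quantize/dequantize targets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """Hypothetical graph node that tracks both its producers and its users."""
    op: str
    inputs: List["Node"] = field(default_factory=list)
    users: List["Node"] = field(default_factory=list)


def connect(op: str, *inputs: Node) -> Node:
    """Create a node and register it as a user of each of its inputs."""
    node = Node(op, list(inputs))
    for inp in inputs:
        inp.users.append(node)
    return node


def already_quantized(bmm: Node) -> bool:
    """True if every bmm input is a dq-node and every bmm user is a q-node."""
    inputs_ok = all(inp.op == "dq" for inp in bmm.inputs)
    outputs_ok = len(bmm.users) > 0 and all(user.op == "q" for user in bmm.users)
    return inputs_ok and outputs_ok


def quantized_branch(name: str) -> Node:
    """Build: name -> q -> dq -> unsqueeze -> q -> dq (shape after ConvertMmToBmmPass)."""
    node = Node(name)
    for op in ("q", "dq", "unsqueeze", "q", "dq"):
        node = connect(op, node)
    return node


# Case 1: 2D matmul after ConvertMmToBmmPass, bmm sits between dq and q nodes.
bmm_2d = connect("bmm", quantized_branch("x"), quantized_branch("y"))
connect("dq", connect("q", bmm_2d))

# Case 2: decomposed matmul, views sit between the dq/q nodes and the bmm.
x = Node("x")
dq = connect("dq", x)
bmm_decomposed = connect(
    "bmm",
    connect("view", connect("expand", dq)),
    connect("view", connect("expand", dq)),
)
connect("q", connect("view", bmm_decomposed))

print(already_quantized(bmm_2d))          # annotation can be skipped
print(already_quantized(bmm_decomposed))  # annotation is still needed
```

Checking only the immediate neighbours of the bmm keeps the test cheap and separates the two graph shapes shown in this thread.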


oscarandersson8218 requested a review from digantdesai as a code owner May 2, 2025 10:24

facebook-github-bot added the CLA Signed label May 2, 2025

oscarandersson8218 added the partner: arm, ciflow/trunk, and topic: not user facing labels May 2, 2025

zingo and others added 2 commits May 2, 2025 19:00

Merge branch 'main' into matmul_single_input

9a2a9e7

Merge branch 'main' into matmul_single_input

aedd502

zingo approved these changes May 5, 2025


zingo merged commit 6260921 into pytorch:main May 5, 2025
172 of 174 checks passed

YufengShi-dudu mentioned this pull request Sep 26, 2025

Arm backend: Fix torch.matmul() failures for 2D tensor inputs #14624

Merged

pytorchbot mentioned this pull request Oct 7, 2025

Arm backend: Fix torch.matmul() failures for 2D tensor inputs #14845

Merged
