Skip to content

Conversation

@oscarandersson8218
Copy link
Collaborator

@oscarandersson8218 oscarandersson8218 commented May 2, 2025

Summary

AnnotateDecomposedMatmul makes sure that a decomposed matmul will two dq-nodes before and a q-node after it's mm/bmm-node. Previously it assumed that the partition always had two input nodes (two dq-nodes), but this is not the case for a single input matmul, e.g. torch.matmul(x, x). In such a case we must copy the dq-node and insert it before the mm/bmm's two inputs.

Before pass:
         -> expand -> view ->
       /                      \
x -> dq                        bmm -> view -> q
       \                      /
         -> expand -> view ->

After pass:
   -> expand -> view -> dq
 /                        \
x                          bmm -> q -> view
 \                        /
   -> expand -> view -> dq

cc @digantdesai @freddan80 @per @zingo

AnnotateDecomposedMatmul makes sure that a decomposed matmul will two
dq-nodes before and a q-node after it's mm/bmm-node. Previously it
assumed that the partition always had two input nodes (two dq-nodes),
but this is not the case for a single input matmul, e.g.
torch.matmul(x, x). In such a case we must copy the dq-node and insert
it before the mm/bmm's two inputs.

Before pass:
         -> expand -> view ->
       /                      \
x -> dq                        bmm -> view -> q
       \                      /
         -> expand -> view ->

After pass:
   -> expand -> view -> dq
 /                        \
x                          bmm -> q -> view
 \                        /
   -> expand -> view -> dq

Signed-off-by: Oscar Andersson <[email protected]>
Change-Id: I5ac381ccd712a535736fa16d1ee864dc76ae2b30
@pytorch-bot
Copy link

pytorch-bot bot commented May 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10654

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 2 New Failures

As of commit aedd502 with merge base e88aafc (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 2, 2025
@oscarandersson8218 oscarandersson8218 added partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm ciflow/trunk topic: not user facing labels May 2, 2025
@zingo
Copy link
Collaborator

zingo commented May 5, 2025

MacOS fails is unrelated

@zingo zingo merged commit 6260921 into pytorch:main May 5, 2025
172 of 174 checks passed
YufengShi-dudu added a commit to YufengShi-dudu/executorch that referenced this pull request Sep 26, 2025
- ConvertMmToBmmPass converts an MM node to BMM nodes, turns input
  and output tensors from rank-2 to rank-3 via unsqueeze/squeeze,
  and inserts q-dq before and after BMM node when necessary.

- After ConvertMmToBmmPass:
  x -> q   -> dq   -> unsqueeze -> q_2 -> dq_2 ->
                                                 \
                                                bmm -> q_4 -> dq_4
                                                 /
  y -> q_1 -> dq_1 -> unsqueeze -> q_3 -> dq_3 ->

- Therefore, if the original matmul was 2D, the bmm already has
  DQ nodes on its inputs and Q node on its output.
  If AnnotateDecomposedMatmulPass (pytorch#10654) is still applied in
  this case, it produces illegal sequences such as:
  x -> q -> unsqueeze -> q_2 (invalid)

- Fix by checking whether the BMM is already surrounded by DQ nodes
  on its inputs and Q nodes on its output.

Change-Id: I9949d59b0b4a96fa34a88b0734014567ea6f24cc
Signed-off-by: Yufeng Shi <[email protected]>
Co-authored-by: Oscar Andersson <[email protected]>
zingo pushed a commit that referenced this pull request Oct 6, 2025
- ConvertMmToBmmPass converts an MM node to BMM nodes, turns input and
output tensors from rank-2 to rank-3 via unsqueeze/squeeze, and inserts
q-dq before and after BMM node when necessary.

- After ConvertMmToBmmPass:
```
  x -> q   -> dq   -> unsqueeze -> q_2 -> dq_2 ->
                                                 \
                                                bmm -> q_4 -> dq_4
                                                 /
  y -> q_1 -> dq_1 -> unsqueeze -> q_3 -> dq_3 ->
```

- Therefore, if the original matmul was 2D, the bmm already has DQ nodes
on its inputs and Q node on its output. If AnnotateDecomposedMatmulPass
(#10654) is still applied in this case, it produces illegal sequences
such as: x -> q -> unsqueeze -> q_2 (invalid)

- Fix by checking whether the BMM is already surrounded by DQ nodes on
its inputs and Q nodes on its output.

Change-Id: I9949d59b0b4a96fa34a88b0734014567ea6f24cc


cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Signed-off-by: Yufeng Shi <[email protected]>
Co-authored-by: Oscar Andersson <[email protected]>
pytorchbot pushed a commit that referenced this pull request Oct 7, 2025
- ConvertMmToBmmPass converts an MM node to BMM nodes, turns input and
output tensors from rank-2 to rank-3 via unsqueeze/squeeze, and inserts
q-dq before and after BMM node when necessary.

- After ConvertMmToBmmPass:
```
  x -> q   -> dq   -> unsqueeze -> q_2 -> dq_2 ->
                                                 \
                                                bmm -> q_4 -> dq_4
                                                 /
  y -> q_1 -> dq_1 -> unsqueeze -> q_3 -> dq_3 ->
```

- Therefore, if the original matmul was 2D, the bmm already has DQ nodes
on its inputs and Q node on its output. If AnnotateDecomposedMatmulPass
(#10654) is still applied in this case, it produces illegal sequences
such as: x -> q -> unsqueeze -> q_2 (invalid)

- Fix by checking whether the BMM is already surrounded by DQ nodes on
its inputs and Q nodes on its output.

Change-Id: I9949d59b0b4a96fa34a88b0734014567ea6f24cc

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Signed-off-by: Yufeng Shi <[email protected]>
Co-authored-by: Oscar Andersson <[email protected]>
(cherry picked from commit 9a7fb42)
GregoryComer pushed a commit that referenced this pull request Oct 7, 2025
- ConvertMmToBmmPass converts an MM node to BMM nodes, turns input and
output tensors from rank-2 to rank-3 via unsqueeze/squeeze, and inserts
q-dq before and after BMM node when necessary.

- After ConvertMmToBmmPass:
```
  x -> q   -> dq   -> unsqueeze -> q_2 -> dq_2 ->
                                                 \
                                                bmm -> q_4 -> dq_4
                                                 /
  y -> q_1 -> dq_1 -> unsqueeze -> q_3 -> dq_3 ->
```

- Therefore, if the original matmul was 2D, the bmm already has DQ nodes
on its inputs and Q node on its output. If AnnotateDecomposedMatmulPass
(#10654) is still applied in this case, it produces illegal sequences
such as: x -> q -> unsqueeze -> q_2 (invalid)

- Fix by checking whether the BMM is already surrounded by DQ nodes on
its inputs and Q nodes on its output.

Change-Id: I9949d59b0b4a96fa34a88b0734014567ea6f24cc


cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Signed-off-by: Yufeng Shi <[email protected]>
Co-authored-by: Yufeng Shi <[email protected]>
Co-authored-by: Oscar Andersson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm topic: not user facing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants