Arm backend: Annotate ADD/SUB with independent observers #13516

gggekov · 2025-08-19T14:53:08Z

We were annotating ADD/SUB with a shared observer, which produced the same quantisation parameters for both inputs even when the two inputs covered different ranges (for example, adding a positive tensor to a tensor with both positive and negative values). As a result, the quantisation parameters were suboptimal. This change annotates the operator with independent observers and changes how we rescale the two inputs to bring them to the same range. Added a unit test of a ResNet model. Lowered the number of channels in a few unit tests to keep the Total SRAM Used below 2MB for the Ethos-U55, so that they fit within the memory limit of the Corstone-300.
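To illustrate why a shared observer hurts here, the following is a minimal sketch of min/max affine int8 quantisation. It is not the Arm quantizer's code: `affine_qparams` and the two example ranges are hypothetical, chosen only to show how much resolution the narrow, positive-only input loses when it is forced to share the range of the wider, signed input.

```python
def affine_qparams(min_val, max_val, qmin=-128, qmax=127):
    """Return (scale, zero_point) for an int8 affine quantiser over [min_val, max_val].

    Assumes a non-degenerate range (max_val > min_val after including 0.0).
    """
    # The range is widened to include 0.0 so that zero is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = qmin - round(min_val / scale)
    return scale, max(qmin, min(qmax, zero_point))


# Hypothetical observed ranges: a positive-only tensor and a signed tensor.
range_a = (0.0, 1.0)
range_b = (-6.0, 6.0)

# Shared observer: both inputs get parameters derived from the union of the ranges.
shared_scale, shared_zp = affine_qparams(
    min(range_a[0], range_b[0]), max(range_a[1], range_b[1])
)

# Independent observers: each input gets parameters from its own range.
scale_a, zp_a = affine_qparams(*range_a)
scale_b, zp_b = affine_qparams(*range_b)

print(f"shared:      scale={shared_scale:.5f} zero_point={shared_zp}")
print(f"independent: scale_a={scale_a:.5f} zero_point_a={zp_a}")
print(f"independent: scale_b={scale_b:.5f} zero_point_b={zp_b}")
```

With these example ranges the positive-only input gets a step size about 12 times finer under its own observer than under the shared one, while the signed input is unaffected.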

Fixes #12959

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

pytorch-bot · 2025-08-19T14:53:12Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13516

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

ROCm MI2xx CI/CD workflows failing due to : download from https://api.github.com/repos/pytorch/pytorch timed out.

❌ 2 New Failures, 1 Unrelated Failure

As of commit 87ac448 with merge base 71a7806 ():

NEW FAILURES - The following jobs have failed:

Build documentation / build (buck2) / Build doc (gh)
At least one of the pre-conditions you specified did not hold
trunk / test-llama-runner-mac (fp32, coreml) / macos-job (gh)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest-nxp-neutron / linux-job (gh) (trunk failure)
test_split_group_convolution__applied_by_default

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-19T20:31:13Z

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D80562110.

digantdesai · 2025-08-20T11:37:26Z

backends/arm/operators/op_add.py

            # pyre-ignore
            tqutils.insert_rescale_op_to_int8(
-                tosa_graph, add_output, scale_back, node, self.tosa_spec
+                tosa_graph, add_output, scale_back, node, False, self.tosa_spec


Readability nit

Suggested change

-                tosa_graph, add_output, scale_back, node, False, self.tosa_spec
+                tosa_graph, add_output, scale_back, node, compute_rescale=False, self.tosa_spec

Will do, thanks. We have a long weekend in the UK, will be back on Tuesday.
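A note on applying the suggestion: Python does not allow a positional argument after a keyword argument, so the suggested line would not parse as written; the trailing `self.tosa_spec` would have to be passed by keyword as well. The sketch below uses a hypothetical stand-in, `insert_rescale_stub`, and the parameter name `tosa_spec` is an assumption, since the real signature of `insert_rescale_op_to_int8` is not shown in this thread.

```python
def insert_rescale_stub(graph, tensor, scale, node, compute_rescale, tosa_spec):
    """Hypothetical stand-in; parameter names are assumptions for illustration."""
    return {"compute_rescale": compute_rescale, "tosa_spec": tosa_spec}


# A keyword argument followed by a positional one is rejected at compile time.
bad_call = "insert_rescale_stub(g, t, s, n, compute_rescale=False, spec)"
try:
    compile(bad_call, "<suggestion>", "eval")
    suggestion_compiles = True
except SyntaxError:
    suggestion_compiles = False

# Passing every argument after the first keyword by keyword keeps the call
# readable and valid.
result = insert_rescale_stub(
    "graph", "tensor", 0.5, "node", compute_rescale=False, tosa_spec="spec"
)
print(suggestion_compiles, result)
```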

backends/arm/test/ops/test_conv_combos.py

digantdesai · 2025-08-20T11:41:03Z

backends/arm/test/ops/test_sub.py

 }
 fvp_sub2_xfails = {"rand_4D_2x2x4x4": "MLETORCH-517 : Multiple batches not supported"}

+# Sub and tan - the tan has a really steep curve just before Pi/2 and a point of discontinuity at Pi/2


digantdesai · 2025-08-20T11:42:43Z

backends/arm/tosa_quant_utils.py

 from tosa.RoundingMode import RoundingMode  # type: ignore


+def insert_rescale_ops_to_int32_for_add_sub(


Nit: why should the fn name have the _for_add_sub suffix?

The insert_rescale_ops_to_int32_for_add_sub function is only called for the ADD and SUB ops, because only for these operators do we use a common scale of 2*max(scale_A, scale_B) and then multiply the original scales by 1<<20, which does not overflow a 32-bit accumulator.

I mean the function name shouldn't list its call sites :)
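The arithmetic described in the reply above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in `tosa_quant_utils.py`: the helper names, the example scales and zero points, and the use of floating-point factors (instead of the integer multiplier/shift pairs a TOSA RESCALE would use) are all hypothetical.

```python
SHIFT_BITS = 20  # extra precision bits, per the "1<<20" in the reply above


def add_sub_rescale_factors(scale_a, scale_b):
    """Return per-input rescale factors and the scale that maps the sum back to real values."""
    common_scale = 2.0 * max(scale_a, scale_b)
    factor_a = scale_a / common_scale * (1 << SHIFT_BITS)
    factor_b = scale_b / common_scale * (1 << SHIFT_BITS)
    scale_back = common_scale / (1 << SHIFT_BITS)
    return factor_a, factor_b, scale_back


def quantised_add(qa, zp_a, scale_a, qb, zp_b, scale_b):
    """Add two int8-quantised values via a 32-bit accumulator; returns a real-valued sum."""
    factor_a, factor_b, scale_back = add_sub_rescale_factors(scale_a, scale_b)
    acc = round((qa - zp_a) * factor_a) + round((qb - zp_b) * factor_b)
    # Each factor is at most 2**19 and each zero-point-corrected int8 input has
    # magnitude at most 255, so the sum stays far below the int32 limit.
    assert -(1 << 31) <= acc < (1 << 31)
    return acc * scale_back


# Hypothetical inputs: a positive-only tensor (scale 0.02, zero point -128)
# and a signed tensor (scale 0.05, zero point 0).
factor_a, factor_b, _ = add_sub_rescale_factors(0.02, 0.05)
total = quantised_add(qa=72, zp_a=-128, scale_a=0.02, qb=-40, zp_b=0, scale_b=0.05)
print(total)  # close to 4.0 + (-2.0)
```

Dividing by twice the larger scale keeps each rescale factor at or below 1<<19, which is what leaves headroom for the sum of the two rescaled inputs.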


zingo · 2025-09-01T15:08:14Z

Rebased after a fix for some broken Arm tests was merged.

zingo · 2025-09-01T22:52:40Z

The test failures are unrelated.

zingo · 2025-09-01T22:57:26Z

@digantdesai I can't merge this. Is there an older version of this PR internally that is blocking it?

facebook-github-bot · 2025-09-02T10:20:47Z

@digantdesai has imported this pull request. If you are a Meta employee, you can view this in D80562110.

gggekov requested a review from digantdesai as a code owner August 19, 2025 14:53

meta-cla bot added the CLA Signed label Aug 19, 2025

gggekov added the partner: arm, ciflow/trunk, and topic: not user facing labels Aug 19, 2025

gggekov mentioned this pull request Aug 19, 2025

Incorrect Observer Sharing/Derivation at Conv-ReLU+ Residual with Arm Ethos Quantizer #12959

Closed

zingo added the release notes: arm label and removed the topic: not user facing label Aug 19, 2025

digantdesai reviewed Aug 20, 2025


backends/arm/test/ops/test_conv_combos.py (resolved)

digantdesai reviewed Aug 20, 2025


digantdesai approved these changes Aug 20, 2025


gggekov force-pushed the Meta_fusing_conv1d_relu_residual_add branch 6 times, most recently from 80e0d2b to 03d88e1 Compare August 29, 2025 15:12

gggekov force-pushed the Meta_fusing_conv1d_relu_residual_add branch from 03d88e1 to c87751e Compare August 29, 2025 16:40

Merge branch 'main' into Meta_fusing_conv1d_relu_residual_add

cad772c

zingo approved these changes Sep 1, 2025


Merge branch 'main' into Meta_fusing_conv1d_relu_residual_add

87ac448

zingo merged commit 8ba92a9 into pytorch:main Sep 2, 2025
245 of 248 checks passed
