Skip to content

Conversation

@mcr229
Copy link
Contributor

@mcr229 mcr229 commented Dec 4, 2024

Resolving the following issue:

#7019

XNNPACK has split its targets to XNNPACK (core orchestration of kernels) and microkernels-prod to separate static libraries. In upstream PyTorch we updated to this change, and had to make changes throughout the PyTorch CMake to install and link both XNNPACK and microkernels-prod.

Originally I had thought the issue stemmed from a mistake in PyTorch in which we weren't correctly linking microkernels-prod to torch_cpu, which when linked with portable pybindings was causing missing symbol issues related to ADRP. This led me to spend a lot of time messing with the pytorch cmake, rebuilding the wheels locally, pip installing them and trying to rebuild the executorch wheels. However, after much troubleshooting, I realized that the issue wasn't that the microkernels-prod static library was missing, but rather because it was there. This was likely causing problems duplicate linking problems with the microkernels-prod from our xnnpack_backend target as we were looking for microkernel function pointers in torch_cpu.

Previously we explicitly linked XNNPACK along with the xnnpack_backend in order to avoid using the XNNPACK symbols libtorch_cpu.dylib(from pytorch). Since Executorch and Pytorch now have the change of split targets between XNNPACK and microkernels-prod, we also have to explicitly link microkernel-prod we built for the xnnpack_backend to avoid using the static library from upstream torch_cpu. This is likely what caused the ld: invalid use of ADRP in ... because the function pointer to the ukernel (in microkernel-prod) was pointing to the instance of torch_cpu.

I tested it out by finally rebuilding the wheels with the latest torch nightly, and it was able to succeed.

@pytorch-bot
Copy link

pytorch-bot bot commented Dec 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7173

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 10a1288 with merge base d89d3e7 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@mcr229 mcr229 requested a review from larryliu0820 December 4, 2024 09:34
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 4, 2024
@mcr229 mcr229 requested a review from huydhn December 4, 2024 09:34
@huydhn
Copy link
Contributor

huydhn commented Dec 4, 2024

I'm adding ciflow/binaries to test this out on PR. It would trigger the nightly build here.

Copy link
Contributor

@huydhn huydhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list(APPEND _dep_libs xnnpack_backend XNNPACK)
# need to explicitly specify XNNPACK and microkernels-prod
# here otherwise uses XNNPACK and microkernel-prod symbols from libtorch_cpu
list(APPEND _dep_libs xnnpack_backend XNNPACK microkernels-prod)
Copy link
Contributor

@digantdesai digantdesai Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprised it didn't cause more problems before TBH

Copy link
Contributor

@guangy10 guangy10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for troubleshooting and fix!

@guangy10
Copy link
Contributor

guangy10 commented Dec 4, 2024

Curious how it is related to the broken libtorch-cpu-shared-with-deps-cxx11-abi-build in pytorch

@facebook-github-bot
Copy link
Contributor

@mcr229 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@larryliu0820 larryliu0820 merged commit dd08abc into pytorch:main Dec 7, 2024
57 of 59 checks passed
@mcr229 mcr229 deleted the xnn_linkage branch July 25, 2025 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/binaries CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. topic: not user facing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants