Support non-CTK Nvidia libraries #864

rwgk · 2025-08-19T20:53:52Z

Closes #776, #823, #856

Bump cuda-pathfinder version to 1.1.1a3

Adds support for non-CTK Nvidia libraries, as needed for nvmath:

CUDA-related:

mathdx ([FEA]: Support for finding libmathdx through pathfinder #776)
cufftMp ([FEA]: pathfinder support for cufftMp wheel #856)
nvshmem_host

Non-CUDA:

nvpl_fftw

A general approach for finding .so/.dll files under site-packages was implemented in these files:

cuda/pathfinder/_utils/find_site_packages_dll.py
cuda/pathfinder/_utils/find_site_packages_so.py

However, this turned out to be relatively slow (~1.9 seconds in a certain environment with a network drive), even though only RECORD files are inspected, via the stdlib importlib.metadata API. (An initial version that simply walked all site-packages directories was prohibitively slow, ~100 seconds in the same environment.)
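For illustration, a RECORD-based search along these lines can be sketched with `importlib.metadata` alone. This is a minimal sketch, not the code in find_site_packages_so.py: the function names `looks_like_so` and `sketch_find_so_files_via_metadata` are hypothetical, and restricting the search to the top-level `nvidia` and `nvpl` directories is taken from one of the commit titles in this PR.

```python
import importlib.metadata


def looks_like_so(filename):
    """True for names such as libfoo.so or libfoo.so.12 (hypothetical helper)."""
    return filename.endswith(".so") or ".so." in filename


def sketch_find_so_files_via_metadata(top_level_dirs=("nvidia", "nvpl")):
    """Return {so_filename: [absolute_path, ...]} based on each wheel's RECORD file."""
    results = {}
    for dist in importlib.metadata.distributions():
        # dist.files is parsed from RECORD; it is None when RECORD is missing.
        for relpath in dist.files or ():
            parts = relpath.parts
            if not parts or parts[0] not in top_level_dirs:
                continue
            if looks_like_so(relpath.name):
                abs_path = str(dist.locate_file(relpath))
                results.setdefault(relpath.name, []).append(abs_path)
    return results
```

Because every installed distribution's RECORD has to be opened and parsed, the cost grows with the number of installed packages, which is consistent with the slowdown reported above on a network drive.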

To maximize performance, and to make supporting more libraries more intuitive at the same time, the approach to searching site-packages was changed to rely on new dictionaries added in supported_nvidia_libs.py:

SITE_PACKAGES_LIBDIRS_LINUX
SITE_PACKAGES_LIBDIRS_WINDOWS

These dictionaries are based on the output from two pairs of helper scripts added under the toolshed/ directory:

Linux: collect_site_packages_so_files.sh, make_site_packages_libdirs_linux.py
Windows: collect_site_packages_dll_files.ps1, make_site_packages_libdirs_windows.py
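As a rough idea of what the generator step could look like, the sketch below turns a listing of relative `.so` paths into a libname-to-directories mapping. It is a hypothetical illustration, not the toolshed script: the function name, the regular expression, and the input format (one site-packages-relative path per line, with forward slashes) are all assumptions, and Windows DLL names would need their own name-extraction rule.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed naming convention: lib<name>.so followed by optional numeric components.
_SO_NAME_RE = re.compile(r"^lib(?P<name>.+?)\.so(?:\.\d+)*$")


def sketch_make_libdirs(relative_so_paths):
    """Group site-packages-relative .so paths into {libname: (dir, ...)}."""
    dirs_by_libname = defaultdict(set)
    for line in relative_so_paths:
        path = PurePosixPath(line.strip())
        match = _SO_NAME_RE.match(path.name)
        if match:
            dirs_by_libname[match.group("name")].add(path.parent.as_posix())
    # Sort for stable, diff-friendly output.
    return {name: tuple(sorted(dirs)) for name, dirs in sorted(dirs_by_libname.items())}
```

Sorting both the keys and the directory tuples keeps regenerated dictionaries stable, so changes in wheel layouts show up as small diffs.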

With this, the same test that took ~1.9 seconds with the slower importlib.metadata-based approach completes in ~0.2 seconds (comparable to the status quo). The slower approach is now used only from test_load_nvidia_dynamic_lib.py, as an easy way to determine dynamically which wheels are available on each platform.
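The dictionary-based lookup can be sketched as follows. The two example entries are copied from the diff excerpt quoted later in this conversation; the dictionary name `EXAMPLE_SITE_PACKAGES_LIBDIRS` and the function `candidate_lib_dirs` are hypothetical and only illustrate why the lookup is cheap: it probes a handful of known directories instead of scanning metadata.

```python
import os
import site

# Example entries only; the real dictionaries live in supported_nvidia_libs.py.
EXAMPLE_SITE_PACKAGES_LIBDIRS = {
    "nvrtc": ("nvidia/cu13/lib", "nvidia/cuda_nvrtc/lib"),
    "nvvm": ("nvidia/cu13/lib", "nvidia/cuda_nvcc/nvvm/lib64"),
}


def candidate_lib_dirs(libname, site_packages_dirs=None, libdirs=EXAMPLE_SITE_PACKAGES_LIBDIRS):
    """Return existing directories that may contain libname, in dictionary order."""
    if site_packages_dirs is None:
        site_packages_dirs = [*site.getsitepackages(), site.getusersitepackages()]
    found = []
    for rel_dir in libdirs.get(libname, ()):
        for sp_dir in site_packages_dirs:
            # Entries always use "/" and are converted to the native separator here.
            abs_dir = os.path.join(sp_dir, *rel_dir.split("/"))
            if os.path.isdir(abs_dir):
                found.append(abs_dir)
    return found
```

Each call costs a few `isdir` checks per site-packages directory, independent of how many packages are installed.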

A simple LIBNAMES_REQUIRING_RTLD_DEEPBIND feature was added to support cufftMp.
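A minimal, Linux-only sketch of how such a per-library flag could be applied when loading via `ctypes`. With `RTLD_DEEPBIND`, glibc makes a library resolve symbols against its own dependencies before the global scope. The base mode `RTLD_NOW | RTLD_GLOBAL` and the function names are assumptions for illustration, not a description of the actual loader code.

```python
import ctypes
import os

# Mirrors the tuple added in this PR; the name here is deliberately different.
EXAMPLE_LIBNAMES_REQUIRING_RTLD_DEEPBIND = ("cufftMp",)


def dlopen_mode(libname):
    """Compute dlopen flags for libname (assumed base mode: RTLD_NOW | RTLD_GLOBAL)."""
    mode = os.RTLD_NOW | os.RTLD_GLOBAL
    if libname in EXAMPLE_LIBNAMES_REQUIRING_RTLD_DEEPBIND:
        # RTLD_DEEPBIND is glibc-specific; fall back to 0 where it is unavailable.
        mode |= getattr(os, "RTLD_DEEPBIND", 0)
    return mode


def load_shared_library(libname, path):
    """Load the shared library at path with the flags chosen for libname."""
    return ctypes.CDLL(path, mode=dlopen_mode(libname))
```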

Piggy-backed: tests/child_load_nvidia_dynamic_lib_helper.py was factored out of tests/test_load_nvidia_dynamic_lib.py to improve test performance, especially on Windows (roughly a 2x speedup).

copy-pr-bot · 2025-08-19T20:53:57Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk · 2025-08-19T20:54:32Z

/ok to test

rwgk · 2025-08-20T01:00:10Z

/ok to test

rwgk · 2025-08-20T04:36:15Z

/ok to test

rwgk · 2025-08-20T19:15:55Z

/ok to test

leofang

Made a quick pass. Will resume tomorrow.

leofang · 2025-08-25T04:20:53Z

toolshed/make_site_packages_libdirs_windows.py

I believe the two make_site_packages_libdirs_*.py files can be merged into one. You already handled the directory separators (/ vs \ or \\) properly, and always use the former in supported_nvidia_libs.py.

@copilot Could you please merge the newly added toolshed/make_site_packages_libdirs_*.py scripts into one?

(Both scripts were auto-generated by ChatGPT with very minimal effort; I decided it's not worth the manual effort to consolidate them.)

The copilot doesn't listen here. I tried via the Spaces feature, but that's only GPT 4.1, and it prompted me to attach files manually. So I went back to ChatGPT 5 Pro, the thinking version, that I mostly use.

I'm attaching what it generated. I'll have time to try it out only later. The line counts:

76 make_site_packages_libdirs_linux.py
110 make_site_packages_libdirs_windows.py
155 make_site_packages_libdirs.py

Do you think it's worth more effort? The original scripts were generated in ~5 minutes each, including writing the prompts.

Throwing them away seems wrong.

Working on them more, I'm not sure.

make_site_packages_libdirs.py.txt

I've manually consolidated the two scripts now: commit 69d7c98

Even ChatGPT 5 Pro didn't get it right the first time through; two iterations later it was a pretty big mess, so I just did it manually.

I guess we need to make Copilot an assignee before we can "at" it, but we probably can't do so in an existing PR, only before a PR is created.

cuda_pathfinder/pyproject.toml

toolshed/collect_site_packages_dll_files.ps1

.github/workflows/test-wheel-linux.yml

.github/workflows/test-wheel-windows.yml

rwgk · 2025-08-27T00:06:36Z

As discussed in our meeting today: I removed the fallback code paths in find_nvidia_dynamic_lib.py

commit 5711602

rwgk · 2025-08-27T00:41:50Z

/ok to test

rwgk · 2025-08-27T21:48:38Z

@leofang I'm holding off rerunning the tests, pending your full review. (I think the delta since the last full test run is small.)

leofang

Looks great, thanks a lot for the hard and thorough work Ralf!

leofang · 2025-08-27T23:53:07Z

cuda_pathfinder/cuda/pathfinder/_dynamic_libs/supported_nvidia_libs.py


 LIBNAMES_REQUIRING_OS_ADD_DLL_DIRECTORY = (
    "cufft",
    "nvrtc",
 )

+LIBNAMES_REQUIRING_RTLD_DEEPBIND = ("cufftMp",)


Q: is deepbind needed so as to avoid symbol collision with libcufft?

leofang · 2025-08-28T00:38:33Z

cuda_pathfinder/cuda/pathfinder/_utils/find_site_packages_dll.py

Not a blocker, but would be nice to revisit after the dust is settled.

I believe this is another file that can be merged with find_site_packages_so.py. If the names dll vs so are bothering you, just refer to them as, say, dso (dynamic shared objects).

Even on Windows, we have the same version "suffix" concept, even though it is not literally a suffix there. Taking cuBLASLt as an example: cublasLt64_12.dll is the Windows equivalent of libcublasLt.so.12 on Linux, and 12 is the "suffix". So we could have a split_dso_version_suffix that handles both platforms, and then find_all_dso_files_via_metadata can be unified.

The reasoning behind find_all_so_files_via_metadata returning a dict of dicts also applies, strictly speaking, to Windows, so a later unification of these two files can also make our treatment more robust.
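One possible shape for the suggested split_dso_version_suffix, as a hypothetical sketch only: no such function exists in this PR, and the regular expressions below encode assumed naming patterns (`<stem>.so[.N[.N...]]` on Linux, `<stem>[_N[_N...]].dll` on Windows).

```python
import re

# Linux: libcublasLt.so.12 -> ("libcublasLt.so", "12")
_SO_RE = re.compile(r"^(?P<stem>.+?\.so)(?:\.(?P<version>\d+(?:\.\d+)*))?$")
# Windows: cublasLt64_12.dll -> ("cublasLt64", "12")
_DLL_RE = re.compile(r"^(?P<stem>.+?)(?:_(?P<version>\d+(?:_\d+)*))?\.dll$", re.IGNORECASE)


def split_dso_version_suffix(filename):
    """Return (stem, version) for a .so or .dll filename, or None if it is neither.

    version is "" when the filename carries no version component.
    """
    for pattern in (_SO_RE, _DLL_RE):
        match = pattern.match(filename)
        if match:
            return match.group("stem"), match.group("version") or ""
    return None
```

Mapping the platform-specific stems (`libcublasLt.so` vs `cublasLt64`) to a common library name would be a separate step that this sketch does not attempt.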

leofang · 2025-08-28T00:45:52Z

cuda_pathfinder/cuda/pathfinder/_dynamic_libs/supported_nvidia_libs.py

+    "nvfatbin": ("nvidia/cu13/lib", "nvidia/nvfatbin/lib"),
+    "nvjpeg": ("nvidia/cu13/lib", "nvidia/nvjpeg/lib"),
+    "nvrtc": ("nvidia/cu13/lib", "nvidia/cuda_nvrtc/lib"),
+    "nvvm": ("nvidia/cu13/lib", "nvidia/cuda_nvcc/nvvm/lib64"),


oh man, so in CUDA 13 the nvvm subdir still exists, but only libdevice is still kept there while everything else is moved to bin/include/lib... 😞

This is annoying... so the wheel layout for NVVM is changed, but not the system CTK which still has $CUDA_PATH/nvvm...

$ tree /usr/local/cuda-13.0/nvvm/
/usr/local/cuda-13.0/nvvm/
├── bin
│   └── cicc
├── include
│   └── nvvm.h
├── lib64
│   ├── libnvvm.so -> libnvvm.so.4
│   ├── libnvvm.so.4 -> libnvvm.so.4.0.0
│   └── libnvvm.so.4.0.0
└── libdevice
    └── libdevice.10.bc

4 directories, 6 files

leofang · 2025-08-28T00:47:44Z

/ok to test 69d7c98

rwgk · 2025-08-28T01:52:53Z

Thanks Leo!

github-actions · 2025-08-28T02:07:19Z

Doc Preview CI
Preview removed because the pull request was closed or merged.

rwgk added 3 commits August 19, 2025 13:42

Add find_site_packages_so.py

eeef550

Use find_all_so_files_under_all_site_packages() from _find_so_using_n…

678947b

…vidia_lib_dirs()

Bump cuda-pathfinder version to 1.1.1a3

c3e5d33

github-project-automation bot added this to CCCL Aug 19, 2025

github-project-automation bot moved this to Todo in CCCL Aug 19, 2025

rwgk self-assigned this Aug 19, 2025

rwgk added the cuda.pathfinder Everything related to the cuda.pathfinder module label Aug 19, 2025

rwgk added this to the pathfinder-nvmath-support milestone Aug 19, 2025

rwgk added 3 commits August 19, 2025 15:28

Limit site-packages search to ("nvidia", "nvpl") subdirs.

6421b49

Replace find_all_so_files_under_all_site_packages → find_all_so_files…

8dc6dd5

…_via_metadata

Add find_site_packages_dll.py and use from _find_dll_using_nvidia_bin…

7376bef

…_dirs()

rwgk added 2 commits August 19, 2025 21:17

Add mathdx, cufftMp DIRECT_DEPENDENCIES

37b8822

Add LIBNAMES_REQUIRING_RTLD_DEEPBIND feature (for cufftMp)

d12cc2c

rwgk and others added 11 commits August 19, 2025 23:41

pyproject.toml: add libmathdx, cufftmpm nvshmem, nvpl-fft wheels for …

f3db887

…testing.

Add SITE_PACKAGES_LIBDIRS_LINUX

1951ab7

Add make_site_packages_libdirs_linux.py

c6fe20a

Use SITE_PACKAGES_LIBDIRS_LINUX in _find_so_using_nvidia_lib_dirs, ke…

fcc7a7c

…ep find_all_so_files_via_metadata as fallback

Add SITE_PACKAGES_LIBDIRS_WINDOWS and toolshed/make_site_packages_lib…

72fa759

…dirs_windows.py

chmod 755 make_site_packages_libdirs_windows.py

0b06db9

Adds paths for the CUDA static library based on CUDA_HOME (NVIDIA#608).

0eaecd3

Removes LIB and LIBRARY_PATH environment variables from the build-whe…

1f2a917

…el workflow.

Updates Linux install to search both lib and lib64 directories for CU…

959d34b

…DA libraries.

Removes LIBRARY_PATH environment variable from installation docs (no …

6adc349

…longer needed due to resolution of NVIDIA#608).

Use SITE_PACKAGES_LIBDIRS_WINDOWS in _find_dll_using_nvidia_bin_dirs,…

a793517

… keep find_all_dll_files_via_metadata as fallback

rwgk requested a review from kkraus14 August 22, 2025 02:27

rwgk changed the title ~~Find all dynamic libs under all site packages~~ Support non-CTK Nvidia libraries, add general fallback for unsupported libs under site-packages Aug 22, 2025

Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages

784723f

rwgk added a commit to rwgk/cuda-python that referenced this pull request Aug 22, 2025

Transfer ci/, .github/ changes from PR NVIDIA#864

673c38c

rwgk mentioned this pull request Aug 22, 2025

Initial version of cuda.pathfinder._find_nvidia_headers for nvshmem #661

Open

leofang requested changes Aug 25, 2025

github-project-automation bot moved this from Todo to In Progress in CCCL Aug 25, 2025

rwgk added 6 commits August 26, 2025 16:01

Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages

68b411e

Fix existing (on main) pre-commit error

a8a9506

Do not install nvidia-cufftmp-cu12 on Windows (it is only a placehold…

ad72088

…er package)

Leo's --only-binary=:all: suggestions

0c3cd20

Leo's --only-binary=:all: suggestions (toolshed scripts)

1c0a0e5

Remove fallback code paths in _find_so_using_nvidia_lib_dirs, _find_d…

5711602

…ll_using_nvidia_bin_dirs and associated foreign_wheels unit test

rwgk added 2 commits August 27, 2025 12:02

Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages

02a9bea

Consolidate make_site_packages_libdirs_linux.py + make_site_packages_…

69d7c98

…libdirs_windows.py → make_site_packages_libdirs.py

rwgk force-pushed the find_all_dynamic_libs_under_all_site_packages branch from 901dc47 to 69d7c98 Compare August 27, 2025 21:36

leofang approved these changes Aug 28, 2025

github-project-automation bot moved this from In Progress to In Review in CCCL Aug 28, 2025

rwgk merged commit 1137e15 into NVIDIA:main Aug 28, 2025
50 checks passed

github-project-automation bot moved this from In Review to Done in CCCL Aug 28, 2025

rwgk deleted the find_all_dynamic_libs_under_all_site_packages branch August 28, 2025 01:53

leofang linked an issue Aug 28, 2025 that may be closed by this pull request

[FEA]: pathfinder support for cufftMp wheel #856

Closed

1 task

leofang mentioned this pull request Aug 28, 2025

[FEA]: pathfinder support for cufftMp wheel #856

Closed

1 task

rwgk changed the title ~~Support non-CTK Nvidia libraries, add general fallback for unsupported libs under site-packages~~ Support non-CTK Nvidia libraries Aug 28, 2025
