Merged
46 commits
eeef550
Add find_site_packages_so.py
rwgk Aug 19, 2025
678947b
Use find_all_so_files_under_all_site_packages() from _find_so_using_n…
rwgk Aug 19, 2025
c3e5d33
Bump cuda-pathfinder version to `1.1.1a3`
rwgk Aug 19, 2025
6421b49
Limit site-packages search to ("nvidia", "nvpl") subdirs.
rwgk Aug 19, 2025
8dc6dd5
Replace find_all_so_files_under_all_site_packages → find_all_so_files…
rwgk Aug 19, 2025
7376bef
Add find_site_packages_dll.py and use from _find_dll_using_nvidia_bin…
rwgk Aug 20, 2025
37b8822
Add mathdx, cufftMp DIRECT_DEPENDENCIES
rwgk Aug 20, 2025
d12cc2c
Add LIBNAMES_REQUIRING_RTLD_DEEPBIND feature (for cufftMp)
rwgk Aug 20, 2025
f3db887
pyproject.toml: add libmathdx, cufftmpm nvshmem, nvpl-fft wheels for …
rwgk Aug 20, 2025
1951ab7
Add SITE_PACKAGES_LIBDIRS_LINUX
rwgk Aug 20, 2025
c6fe20a
Add make_site_packages_libdirs_linux.py
rwgk Aug 20, 2025
fcc7a7c
Use SITE_PACKAGES_LIBDIRS_LINUX in _find_so_using_nvidia_lib_dirs, ke…
rwgk Aug 20, 2025
72fa759
Add SITE_PACKAGES_LIBDIRS_WINDOWS and toolshed/make_site_packages_lib…
rwgk Aug 20, 2025
0b06db9
chmod 755 make_site_packages_libdirs_windows.py
rwgk Aug 20, 2025
0eaecd3
Adds paths for the CUDA static library based on CUDA_HOME (#608).
Andy-Jost Aug 14, 2025
1f2a917
Removes LIB and LIBRARY_PATH environment variables from the build-whe…
Andy-Jost Aug 19, 2025
959d34b
Updates Linux install to search both lib and lib64 directories for CU…
Andy-Jost Aug 19, 2025
6adc349
Removes LIBRARY_PATH environment variable from installation docs (no …
Andy-Jost Aug 20, 2025
a793517
Use SITE_PACKAGES_LIBDIRS_WINDOWS in _find_dll_using_nvidia_bin_dirs,…
rwgk Aug 20, 2025
11ff2d3
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 20, 2025
c5f2d33
Factor out SITE_PACKAGES_LIBDIRS_*_CTK, add test_supported_libnames_*…
rwgk Aug 20, 2025
2a6eda3
Also exercise "other" (non-CTK) libnames in test_load_nvidia_dynamic_…
rwgk Aug 20, 2025
7957fe4
Exercise fallback code path using pygit2 wheel.
rwgk Aug 21, 2025
6a7f570
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 21, 2025
12485f1
Add other_wheels,foreign_wheels to pip install nvidia_wheels_cu13
rwgk Aug 21, 2025
70b198c
Add toolshed/collect_site_packages_so_files.sh, with terse Usage comm…
rwgk Aug 21, 2025
6cea192
Add toolshed/collect_site_packages_dll_files.ps1 with terse Usage com…
rwgk Aug 21, 2025
f33d7c8
Add pygit2 comments.
rwgk Aug 21, 2025
1b216d7
Replace special-case workaround in tests/child_load_nvidia_dynamic_li…
rwgk Aug 21, 2025
3c900ae
Add anticipated CTK 13 paths for mathdx in SITE_PACKAGES_LIBDIRS_LINU…
rwgk Aug 21, 2025
58a2520
Rename other_wheels → nvidia_wheels_host
rwgk Aug 21, 2025
1c939a7
WIP
rwgk Aug 21, 2025
acc226b
Restore _no_such_file_in_sub_dirs error reporting
rwgk Aug 21, 2025
b5b12f0
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 21, 2025
9678a86
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 22, 2025
fcdd977
Use `pip install -v ".[nvidia_wheels_cu${TEST_CUDA_MAJOR},nvidia_whee…
rwgk Aug 22, 2025
87a8e43
Export TEST_CUDA_MAJOR to the GITHUB_ENV
rwgk Aug 22, 2025
784723f
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 22, 2025
68b411e
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 26, 2025
a8a9506
Fix existing (on main) pre-commit error
rwgk Aug 26, 2025
ad72088
Do not install nvidia-cufftmp-cu12 on Windows (it is only a placehold…
rwgk Aug 26, 2025
0c3cd20
Leo's --only-binary=:all: suggestions
rwgk Aug 26, 2025
1c0a0e5
Leo's --only-binary=:all: suggestions (toolshed scripts)
rwgk Aug 26, 2025
5711602
Remove fallback code paths in _find_so_using_nvidia_lib_dirs, _find_d…
rwgk Aug 27, 2025
02a9bea
Merge branch 'main' into find_all_dynamic_libs_under_all_site_packages
rwgk Aug 27, 2025
69d7c98
Consolidate make_site_packages_libdirs_linux.py + make_site_packages_…
rwgk Aug 27, 2025
9 changes: 4 additions & 5 deletions .github/workflows/test-wheel-linux.yml
@@ -321,16 +321,15 @@ jobs:
             pip install $(ls cuda_python*.whl)[all]
           fi

-      - name: Install cuda.pathfinder nvidia_wheels_cu13
-        if: startsWith(matrix.CUDA_VER, '13.')
+      - name: Install cuda.pathfinder extra wheels for testing
         run: |
+          set -euo pipefail
           pushd cuda_pathfinder
-          pip install -v .[nvidia_wheels_cu13]
-          pip freeze
+          pip install --only-binary=:all: -v ".[nvidia_wheels_cu${TEST_CUDA_MAJOR},nvidia_wheels_host]"
+          pip list
           popd

       - name: Run cuda.pathfinder tests with all_must_work
-        if: startsWith(matrix.CUDA_VER, '13.')
         env:
           CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS: all_must_work
         run: run-tests pathfinder
8 changes: 3 additions & 5 deletions .github/workflows/test-wheel-windows.yml
@@ -288,17 +288,15 @@ jobs:
             pip install "$((Get-ChildItem -Filter cuda_python*.whl).FullName)[all]"
           }

-      - name: Install cuda.pathfinder nvidia_wheels_cu13
-        if: startsWith(matrix.CUDA_VER, '13.')
+      - name: Install cuda.pathfinder extra wheels for testing
         shell: bash --noprofile --norc -xeuo pipefail {0}
         run: |
           pushd cuda_pathfinder
-          pip install -v .[nvidia_wheels_cu13]
-          pip freeze
+          pip install --only-binary=:all: -v ".[nvidia_wheels_cu${TEST_CUDA_MAJOR},nvidia_wheels_host]"
+          pip list
           popd

       - name: Run cuda.pathfinder tests with all_must_work
-        if: startsWith(matrix.CUDA_VER, '13.')
         env:
           CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS: all_must_work
         shell: bash --noprofile --norc -xeuo pipefail {0}
1 change: 1 addition & 0 deletions ci/tools/env-vars
@@ -69,6 +69,7 @@ elif [[ "${1}" == "test" ]]; then
     echo "SETUP_SANITIZER=${SETUP_SANITIZER}" >> $GITHUB_ENV
     echo "SKIP_CUDA_BINDINGS_TEST=${SKIP_CUDA_BINDINGS_TEST}" >> $GITHUB_ENV
     echo "SKIP_CYTHON_TEST=${SKIP_CYTHON_TEST}" >> $GITHUB_ENV
+    echo "TEST_CUDA_MAJOR=${TEST_CUDA_MAJOR}" >> $GITHUB_ENV
 fi

 echo "CUDA_BINDINGS_ARTIFACT_BASENAME=${CUDA_BINDINGS_ARTIFACT_BASENAME}" >> $GITHUB_ENV
Original file line number Diff line number Diff line change
@@ -10,6 +10,8 @@
 from cuda.pathfinder._dynamic_libs.load_dl_common import DynamicLibNotFoundError
 from cuda.pathfinder._dynamic_libs.supported_nvidia_libs import (
     IS_WINDOWS,
+    SITE_PACKAGES_LIBDIRS_LINUX,
+    SITE_PACKAGES_LIBDIRS_WINDOWS,
     is_suppressed_dll_file,
 )
 from cuda.pathfinder._utils.find_sub_dirs import find_sub_dirs, find_sub_dirs_all_sitepackages
@@ -28,22 +30,25 @@ def _no_such_file_in_sub_dirs(
 def _find_so_using_nvidia_lib_dirs(
     libname: str, so_basename: str, error_messages: list[str], attachments: list[str]
 ) -> Optional[str]:
-    file_wild = so_basename + "*"
-    nvidia_sub_dirs_list: list[tuple[str, ...]] = [("nvidia", "*", "lib")]  # works also for CTK 13 nvvm
-    if libname == "nvvm":
-        nvidia_sub_dirs_list.append(("nvidia", "*", "nvvm", "lib64"))  # CTK 12
-    for nvidia_sub_dirs in nvidia_sub_dirs_list:
-        for lib_dir in find_sub_dirs_all_sitepackages(nvidia_sub_dirs):
-            # First look for an exact match
-            so_name = os.path.join(lib_dir, so_basename)
-            if os.path.isfile(so_name):
-                return so_name
-            # Look for a versioned library
-            # Using sort here mainly to make the result deterministic.
-            for so_name in sorted(glob.glob(os.path.join(lib_dir, file_wild))):
+    rel_dirs = SITE_PACKAGES_LIBDIRS_LINUX.get(libname)
+    if rel_dirs is not None:
+        sub_dirs_searched = []
+        file_wild = so_basename + "*"
+        for rel_dir in rel_dirs:
+            sub_dir = tuple(rel_dir.split(os.path.sep))
+            for abs_dir in find_sub_dirs_all_sitepackages(sub_dir):
+                # First look for an exact match
+                so_name = os.path.join(abs_dir, so_basename)
                 if os.path.isfile(so_name):
                     return so_name
-        _no_such_file_in_sub_dirs(nvidia_sub_dirs, file_wild, error_messages, attachments)
+                # Look for a versioned library
+                # Using sort here mainly to make the result deterministic.
+                for so_name in sorted(glob.glob(os.path.join(abs_dir, file_wild))):
+                    if os.path.isfile(so_name):
+                        return so_name
+            sub_dirs_searched.append(sub_dir)
+        for sub_dir in sub_dirs_searched:
+            _no_such_file_in_sub_dirs(sub_dir, file_wild, error_messages, attachments)
     return None


@@ -59,18 +64,18 @@ def _find_dll_under_dir(dirpath: str, file_wild: str) -> Optional[str]:
 def _find_dll_using_nvidia_bin_dirs(
     libname: str, lib_searched_for: str, error_messages: list[str], attachments: list[str]
 ) -> Optional[str]:
-    nvidia_sub_dirs_list: list[tuple[str, ...]] = [
-        ("nvidia", "*", "bin"),  # CTK 12
-        ("nvidia", "*", "bin", "*"),  # CTK 13, e.g. site-packages\nvidia\cu13\bin\x86_64\
-    ]
-    if libname == "nvvm":
-        nvidia_sub_dirs_list.append(("nvidia", "*", "nvvm", "bin"))  # Only for CTK 12
-    for nvidia_sub_dirs in nvidia_sub_dirs_list:
-        for bin_dir in find_sub_dirs_all_sitepackages(nvidia_sub_dirs):
-            dll_name = _find_dll_under_dir(bin_dir, lib_searched_for)
-            if dll_name is not None:
-                return dll_name
-        _no_such_file_in_sub_dirs(nvidia_sub_dirs, lib_searched_for, error_messages, attachments)
+    rel_dirs = SITE_PACKAGES_LIBDIRS_WINDOWS.get(libname)
+    if rel_dirs is not None:
+        sub_dirs_searched = []
+        for rel_dir in rel_dirs:
+            sub_dir = tuple(rel_dir.split(os.path.sep))
+            for abs_dir in find_sub_dirs_all_sitepackages(sub_dir):
+                dll_name = _find_dll_under_dir(abs_dir, lib_searched_for)
+                if dll_name is not None:
+                    return dll_name
+            sub_dirs_searched.append(sub_dir)
+        for sub_dir in sub_dirs_searched:
+            _no_such_file_in_sub_dirs(sub_dir, lib_searched_for, error_messages, attachments)
     return None


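The table-driven lookup introduced in the two functions above can be sketched in isolation. This is a minimal illustration, not the cuda.pathfinder implementation: `LIBDIRS`, `find_so`, and `demo` are hypothetical names, the sketch takes a single explicit site-packages root instead of calling `find_sub_dirs_all_sitepackages`, and it assumes the table entries always use `/` as separator.

```python
import glob
import os
import tempfile
from typing import Optional

# Hypothetical stand-in for SITE_PACKAGES_LIBDIRS_LINUX: libname -> relative dirs,
# listed in the order they should be searched.
LIBDIRS = {"demo": ("nvidia/cu13/lib", "nvidia/demo/lib")}


def find_so(site_packages: str, libname: str, so_basename: str) -> Optional[str]:
    """Return the first matching shared library under the known relative dirs."""
    rel_dirs = LIBDIRS.get(libname)
    if rel_dirs is None:
        return None  # unknown libname: no wildcard fallback
    for rel_dir in rel_dirs:
        abs_dir = os.path.join(site_packages, *rel_dir.split("/"))
        # First look for an exact match.
        exact = os.path.join(abs_dir, so_basename)
        if os.path.isfile(exact):
            return exact
        # Then look for a versioned file; sorted() keeps the result deterministic.
        pattern = os.path.join(glob.escape(abs_dir), so_basename + "*")
        for candidate in sorted(glob.glob(pattern)):
            if os.path.isfile(candidate):
                return candidate
    return None


def demo() -> Optional[str]:
    """Build a throwaway site-packages tree and locate libdemo.so.1 in it."""
    with tempfile.TemporaryDirectory() as root:
        lib_dir = os.path.join(root, "nvidia", "demo", "lib")
        os.makedirs(lib_dir)
        open(os.path.join(lib_dir, "libdemo.so.1"), "w").close()
        found = find_so(root, "demo", "libdemo.so")
        return os.path.relpath(found, root) if found else None
```

Compared with the previous `("nvidia", "*", "lib")` wildcards, an explicit table means only directories that were actually observed in released wheels are searched, and a libname missing from the table returns `None` immediately.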
21 changes: 16 additions & 5 deletions cuda_pathfinder/cuda/pathfinder/_dynamic_libs/load_dl_linux.py
@@ -8,7 +8,10 @@
 from typing import Optional, cast

 from cuda.pathfinder._dynamic_libs.load_dl_common import LoadedDL
-from cuda.pathfinder._dynamic_libs.supported_nvidia_libs import SUPPORTED_LINUX_SONAMES
+from cuda.pathfinder._dynamic_libs.supported_nvidia_libs import (
+    LIBNAMES_REQUIRING_RTLD_DEEPBIND,
+    SUPPORTED_LINUX_SONAMES,
+)

 CDLL_MODE = os.RTLD_NOW | os.RTLD_GLOBAL

@@ -138,6 +141,13 @@ def check_if_already_loaded_from_elsewhere(libname: str, _have_abs_path: bool) -
     return None


+def _load_lib(libname: str, filename: str) -> ctypes.CDLL:
+    cdll_mode = CDLL_MODE
+    if libname in LIBNAMES_REQUIRING_RTLD_DEEPBIND:
+        cdll_mode |= os.RTLD_DEEPBIND
+    return ctypes.CDLL(filename, cdll_mode)
+
+
 def load_with_system_search(libname: str) -> Optional[LoadedDL]:
     """Try to load a library using system search paths.

@@ -152,13 +162,14 @@ def load_with_system_search(libname: str) -> Optional[LoadedDL]:
     """
     for soname in get_candidate_sonames(libname):
         try:
-            handle = ctypes.CDLL(soname, CDLL_MODE)
+            handle = _load_lib(libname, soname)
+        except OSError:
+            pass
+        else:
             abs_path = abs_path_for_dynamic_library(libname, handle)
             if abs_path is None:
                 raise RuntimeError(f"No expected symbol for {libname=!r}")
             return LoadedDL(abs_path, False, handle._handle)
-        except OSError:
-            pass
     return None


@@ -196,7 +207,7 @@ def load_with_abs_path(libname: str, found_path: str) -> LoadedDL:
     """
     _work_around_known_bugs(libname, found_path)
     try:
-        handle = ctypes.CDLL(found_path, CDLL_MODE)
+        handle = _load_lib(libname, found_path)
     except OSError as e:
         raise RuntimeError(f"Failed to dlopen {found_path}: {e}") from e
     return LoadedDL(found_path, False, handle._handle)
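The flag selection in `_load_lib` can be exercised without any NVIDIA library installed. The sketch below is illustrative only: `NEEDS_DEEPBIND` and `dlopen_mode` are hypothetical names, and it assumes a POSIX platform where `os.RTLD_NOW` and `os.RTLD_GLOBAL` exist. `RTLD_DEEPBIND` is a glibc extension that makes the loaded library look up symbols in its own dependency scope before the global scope; the `getattr` fallback to `0` is an assumption added here so the sketch still imports where the constant is missing (the diff uses `os.RTLD_DEEPBIND` directly).

```python
import os

# Same base mode as CDLL_MODE in the diff above.
BASE_MODE = os.RTLD_NOW | os.RTLD_GLOBAL

# Assumption for this sketch: degrade to a no-op flag where the constant is absent.
DEEPBIND = getattr(os, "RTLD_DEEPBIND", 0)

# Hypothetical stand-in for LIBNAMES_REQUIRING_RTLD_DEEPBIND.
NEEDS_DEEPBIND = ("cufftMp",)


def dlopen_mode(libname: str) -> int:
    """Return the dlopen flags that would be passed to ctypes.CDLL for libname."""
    mode = BASE_MODE
    if libname in NEEDS_DEEPBIND:
        mode |= DEEPBIND
    return mode
```

Keeping the decision in one helper means both `load_with_system_search` and `load_with_abs_path` apply the same flags for a given libname.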
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@
SUPPORTED_LIBNAMES = SUPPORTED_LIBNAMES_WINDOWS if IS_WINDOWS else SUPPORTED_LIBNAMES_LINUX

# Based on ldd output for Linux x86_64 nvidia-*-cu12 wheels (12.8.1)
DIRECT_DEPENDENCIES = {
DIRECT_DEPENDENCIES_CTK = {
"cublas": ("cublasLt",),
"cufftw": ("cufft",),
# "cufile_rdma": ("cufile",),
Expand All @@ -82,6 +82,10 @@
"npps": ("nppc",),
"nvblas": ("cublas", "cublasLt"),
}
DIRECT_DEPENDENCIES = DIRECT_DEPENDENCIES_CTK | {
"mathdx": ("nvrtc",),
"cufftMp": ("nvshmem_host",),
}

# Based on these released files:
# cuda_11.0.3_450.51.06_linux.run
Expand All @@ -104,7 +108,7 @@
# cuda_12.9.1_575.57.08_linux.run
# cuda_13.0.0_580.65.06_linux.run
# Generated with toolshed/build_pathfinder_sonames.py
SUPPORTED_LINUX_SONAMES = {
SUPPORTED_LINUX_SONAMES_CTK = {
"cublas": (
"libcublas.so.11",
"libcublas.so.12",
Expand Down Expand Up @@ -232,6 +236,13 @@
"libnvvm.so.4",
),
}
SUPPORTED_LINUX_SONAMES_OTHER = {
"cufftMp": ("libcufftMp.so.11",),
"mathdx": ("libmathdx.so.0",),
"nvpl_fftw": ("libnvpl_fftw.so.0",),
"nvshmem_host": ("libnvshmem_host.so.3",),
}
SUPPORTED_LINUX_SONAMES = SUPPORTED_LINUX_SONAMES_CTK | SUPPORTED_LINUX_SONAMES_OTHER

# Based on these released files:
# cuda_11.0.3_451.82_win10.exe
Expand All @@ -254,7 +265,7 @@
# cuda_12.9.1_576.57_windows.exe
# cuda_13.0.0_windows.exe
# Generated with toolshed/build_pathfinder_dlls.py
SUPPORTED_WINDOWS_DLLS = {
SUPPORTED_WINDOWS_DLLS_CTK = {
"cublas": (
"cublas64_11.dll",
"cublas64_12.dll",
Expand Down Expand Up @@ -384,12 +395,91 @@
"nvvm70.dll",
),
}
SUPPORTED_WINDOWS_DLLS_OTHER = {
"mathdx": ("mathdx64_0.dll",),
}
SUPPORTED_WINDOWS_DLLS = SUPPORTED_WINDOWS_DLLS_CTK | SUPPORTED_WINDOWS_DLLS_OTHER

LIBNAMES_REQUIRING_OS_ADD_DLL_DIRECTORY = (
"cufft",
"nvrtc",
)

LIBNAMES_REQUIRING_RTLD_DEEPBIND = ("cufftMp",)
Member

Q: Is RTLD_DEEPBIND needed here to avoid symbol collisions with libcufft?


# Based on output of toolshed/make_site_packages_libdirs_linux.py
SITE_PACKAGES_LIBDIRS_LINUX_CTK = {
"cublas": ("nvidia/cu13/lib", "nvidia/cublas/lib"),
"cublasLt": ("nvidia/cu13/lib", "nvidia/cublas/lib"),
"cudart": ("nvidia/cu13/lib", "nvidia/cuda_runtime/lib"),
"cufft": ("nvidia/cu13/lib", "nvidia/cufft/lib"),
"cufftw": ("nvidia/cu13/lib", "nvidia/cufft/lib"),
"cufile": ("nvidia/cu13/lib", "nvidia/cufile/lib"),
# "cufile_rdma": ("nvidia/cu13/lib", "nvidia/cufile/lib"),
"curand": ("nvidia/cu13/lib", "nvidia/curand/lib"),
"cusolver": ("nvidia/cu13/lib", "nvidia/cusolver/lib"),
"cusolverMg": ("nvidia/cu13/lib", "nvidia/cusolver/lib"),
"cusparse": ("nvidia/cu13/lib", "nvidia/cusparse/lib"),
"nppc": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppial": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppicc": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppidei": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppif": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppig": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppim": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppist": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppisu": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nppitc": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"npps": ("nvidia/cu13/lib", "nvidia/npp/lib"),
"nvJitLink": ("nvidia/cu13/lib", "nvidia/nvjitlink/lib"),
"nvblas": ("nvidia/cu13/lib", "nvidia/cublas/lib"),
"nvfatbin": ("nvidia/cu13/lib", "nvidia/nvfatbin/lib"),
"nvjpeg": ("nvidia/cu13/lib", "nvidia/nvjpeg/lib"),
"nvrtc": ("nvidia/cu13/lib", "nvidia/cuda_nvrtc/lib"),
"nvvm": ("nvidia/cu13/lib", "nvidia/cuda_nvcc/nvvm/lib64"),
Member

Oh man, so in CUDA 13 the nvvm subdir still exists, but only libdevice is kept there, while everything else has moved to bin/include/lib... 😞

Member

This is annoying... so the wheel layout for NVVM has changed, but not the system CTK, which still has $CUDA_PATH/nvvm:

$ tree /usr/local/cuda-13.0/nvvm/
/usr/local/cuda-13.0/nvvm/
├── bin
│   └── cicc
├── include
│   └── nvvm.h
├── lib64
│   ├── libnvvm.so -> libnvvm.so.4
│   ├── libnvvm.so.4 -> libnvvm.so.4.0.0
│   └── libnvvm.so.4.0.0
└── libdevice
    └── libdevice.10.bc

4 directories, 6 files

}
SITE_PACKAGES_LIBDIRS_LINUX_OTHER = {
"cufftMp": ("nvidia/cufftmp/cu12/lib",),
"mathdx": ("nvidia/cu13/lib", "nvidia/cu12/lib"),
"nvpl_fftw": ("nvpl/lib",),
"nvshmem_host": ("nvidia/nvshmem/lib",),
}
SITE_PACKAGES_LIBDIRS_LINUX = SITE_PACKAGES_LIBDIRS_LINUX_CTK | SITE_PACKAGES_LIBDIRS_LINUX_OTHER

# Based on output of toolshed/make_site_packages_libdirs_windows.py
SITE_PACKAGES_LIBDIRS_WINDOWS_CTK = {
"cublas": ("nvidia/cu13/bin/x86_64", "nvidia/cublas/bin"),
"cublasLt": ("nvidia/cu13/bin/x86_64", "nvidia/cublas/bin"),
"cudart": ("nvidia/cu13/bin/x86_64", "nvidia/cuda_runtime/bin"),
"cufft": ("nvidia/cu13/bin/x86_64", "nvidia/cufft/bin"),
"cufftw": ("nvidia/cu13/bin/x86_64", "nvidia/cufft/bin"),
"curand": ("nvidia/cu13/bin/x86_64", "nvidia/curand/bin"),
"cusolver": ("nvidia/cu13/bin/x86_64", "nvidia/cusolver/bin"),
"cusolverMg": ("nvidia/cu13/bin/x86_64", "nvidia/cusolver/bin"),
"cusparse": ("nvidia/cu13/bin/x86_64", "nvidia/cusparse/bin"),
"nppc": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppial": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppicc": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppidei": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppif": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppig": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppim": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppist": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppisu": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nppitc": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"npps": ("nvidia/cu13/bin/x86_64", "nvidia/npp/bin"),
"nvJitLink": ("nvidia/cu13/bin/x86_64", "nvidia/nvjitlink/bin"),
"nvblas": ("nvidia/cu13/bin/x86_64", "nvidia/cublas/bin"),
"nvfatbin": ("nvidia/cu13/bin/x86_64", "nvidia/nvfatbin/bin"),
"nvjpeg": ("nvidia/cu13/bin/x86_64", "nvidia/nvjpeg/bin"),
"nvrtc": ("nvidia/cu13/bin/x86_64", "nvidia/cuda_nvrtc/bin"),
"nvvm": ("nvidia/cu13/bin/x86_64", "nvidia/cuda_nvcc/nvvm/bin"),
}
SITE_PACKAGES_LIBDIRS_WINDOWS_OTHER = {
"mathdx": ("nvidia/cu13/bin/x86_64", "nvidia/cu12/bin"),
}
SITE_PACKAGES_LIBDIRS_WINDOWS = SITE_PACKAGES_LIBDIRS_WINDOWS_CTK | SITE_PACKAGES_LIBDIRS_WINDOWS_OTHER


def is_suppressed_dll_file(path_basename: str) -> bool:
if path_basename.startswith("nvrtc"):
Expand Down
26 changes: 26 additions & 0 deletions cuda_pathfinder/cuda/pathfinder/_utils/find_site_packages_dll.py
@leofang (Member), Aug 28, 2025

Not a blocker, but it would be nice to revisit this after the dust has settled.

I believe this is another file that can be merged with find_site_packages_so.py. If the names dll vs so are bothering you, just refer to them as, say, dso (dynamic shared objects).

Even on Windows we have the same version "suffix" concept, even though the version is not literally a suffix there. Taking cuBLASLt as an example: cublasLt64_12.dll is the Windows equivalent of libcublasLt.so.12 on Linux, and 12 is the "suffix". So we could have a split_dso_version_suffix that handles both platforms, and then find_all_dso_files_via_metadata could be unified.

The same reasoning that led find_all_so_files_via_metadata to return a dict of dicts also applies, strictly speaking, to Windows, so a later unification of these two files could also make our treatment more robust.
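One way the suggested `split_dso_version_suffix` could look is sketched below. It is hypothetical and not part of this PR: the function name comes from the comment above, but the return convention and the regular expression are assumptions, and the sketch only recognizes Windows names of the form `<name>64_<version>.dll`. Anything else is returned unchanged with an empty suffix.

```python
import re

# Assumed Windows naming pattern: <name>64_<version>.dll, e.g. cublasLt64_12.dll
# or nvrtc64_120_0.dll. Other DLL names are treated as unversioned.
_DLL_RE = re.compile(r"^(?P<base>.+?64)_(?P<version>\d+(?:_\d+)*)\.dll$", re.IGNORECASE)


def split_dso_version_suffix(filename: str) -> tuple[str, str]:
    """Split a shared-library filename into (unversioned name, version part)."""
    # Linux: libfoo.so or libfoo.so.1.2.3 (same split as split_so_version_suffix).
    idx = filename.rfind(".so")
    if idx > 0 and (idx + 3 == len(filename) or filename[idx + 3] == "."):
        return filename[: idx + 3], filename[idx + 3 :]
    # Windows: the version sits between the "64_" marker and the extension.
    match = _DLL_RE.match(filename)
    if match:
        return match.group("base") + ".dll", match.group("version")
    return filename, ""
```

With such a helper, both platforms could share one metadata scan that groups paths first by unversioned name and then by version, which is the dict-of-dicts shape `find_all_so_files_via_metadata` already returns.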

@@ -0,0 +1,26 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+import collections
+import functools
+import importlib.metadata
+
+
+@functools.cache
+def find_all_dll_files_via_metadata() -> dict[str, tuple[str, ...]]:
+    results: collections.defaultdict[str, list[str]] = collections.defaultdict(list)
+
+    # sort dists for deterministic output
+    for dist in sorted(importlib.metadata.distributions(), key=lambda d: (d.metadata.get("Name", ""), d.version)):
+        files = dist.files
+        if not files:
+            continue
+        for relpath in sorted(files, key=lambda p: str(p)):  # deterministic
+            relname = relpath.name.lower()
+            if not relname.endswith(".dll"):
+                continue
+            abs_path = str(dist.locate_file(relpath))
+            results[relname].append(abs_path)
+
+    # plain dicts; sort inner list for stability
+    return {k: tuple(sorted(v)) for k, v in results.items()}
39 changes: 39 additions & 0 deletions cuda_pathfinder/cuda/pathfinder/_utils/find_site_packages_so.py
@@ -0,0 +1,39 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+import collections
+import functools
+import importlib.metadata
+import re
+
+_SO_RE = re.compile(r"\.so(?:$|\.)")  # matches libfoo.so or libfoo.so.1.2.3
+
+
+def split_so_version_suffix(so_filename: str) -> tuple[str, str]:
+    idx = so_filename.rfind(".so")
+    assert idx > 0, so_filename
+    idx += 3
+    return (so_filename[:idx], so_filename[idx:])
+
+
+@functools.cache
+def find_all_so_files_via_metadata() -> dict[str, dict[str, tuple[str, ...]]]:
+    results: collections.defaultdict[str, collections.defaultdict[str, list[str]]] = collections.defaultdict(
+        lambda: collections.defaultdict(list)
+    )
+
+    # sort dists for deterministic output
+    for dist in sorted(importlib.metadata.distributions(), key=lambda d: (d.metadata.get("Name", ""), d.version)):
+        files = dist.files
+        if not files:
+            continue
+        for relpath in sorted(files, key=lambda p: str(p)):  # deterministic
+            relname = relpath.name
+            if not _SO_RE.search(relname):
+                continue
+            so_basename, so_version_suffix = split_so_version_suffix(relname)
+            abs_path = str(dist.locate_file(relpath))
+            results[so_basename][so_version_suffix].append(abs_path)
+
+    # plain dicts; sort inner lists for stability
+    return {k: {kk: tuple(sorted(vv)) for kk, vv in v.items()} for k, v in results.items()}
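To make the shape of the returned mapping concrete, the sketch below applies the same filter, split, and grouping to a plain list of paths instead of installed distributions. `split_so_version_suffix` and `_SO_RE` are copied from the new file; `group_so_paths` and the sample paths are hypothetical and exist only for this illustration.

```python
import collections
import re

_SO_RE = re.compile(r"\.so(?:$|\.)")  # matches libfoo.so or libfoo.so.1.2.3


def split_so_version_suffix(so_filename: str) -> tuple[str, str]:
    """Split 'libfoo.so.1.2' into ('libfoo.so', '.1.2')."""
    idx = so_filename.rfind(".so")
    assert idx > 0, so_filename
    idx += 3
    return (so_filename[:idx], so_filename[idx:])


def group_so_paths(paths: list[str]) -> dict[str, dict[str, tuple[str, ...]]]:
    """Group paths as {unversioned name: {version suffix: (paths, ...)}}."""
    results: collections.defaultdict = collections.defaultdict(
        lambda: collections.defaultdict(list)
    )
    for path in sorted(paths):
        name = path.rsplit("/", 1)[-1]
        if not _SO_RE.search(name):
            continue  # not a shared library
        so_basename, so_version_suffix = split_so_version_suffix(name)
        results[so_basename][so_version_suffix].append(path)
    # Convert to plain dicts with sorted tuples, as the real function does.
    return {k: {kk: tuple(sorted(vv)) for kk, vv in v.items()} for k, v in results.items()}


# Made-up input: two copies of libfoo.so.1, one unversioned libfoo.so, one non-library.
SAMPLE = ["a/lib/libfoo.so.1", "b/lib/libfoo.so.1", "a/lib/libfoo.so", "a/lib/readme.txt"]
GROUPED = group_so_paths(SAMPLE)
```

The inner dict is what distinguishes this from the DLL variant: the same unversioned name can map to several version suffixes, each with its own list of locations.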
2 changes: 1 addition & 1 deletion cuda_pathfinder/cuda/pathfinder/_version.py
@@ -1,4 +1,4 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0

-__version__ = "1.1.1a2"
+__version__ = "1.1.1a3"