Skip to content

Commit 75e9e73

Browse files
[SYCL] Lookup versioned OpenCL adapter library as fallback (#20229)
**Problem** SYCL RT loads `libur_adapter_opencl.so` (https://github.com/intel/llvm/blob/0ff1a5c2b4e4bc56799ec2dd17a89c3c57608890/sycl/include/sycl/detail/os_util.hpp#L129) while UR loads `libur_adapter_opencl.so.0` (https://github.com/intel/llvm/blob/0031df16e41bd0665e85af635b7bfd4e187ce7cd/unified-runtime/source/loader/ur_manifests.hpp#L35). Note that SYCL RT calls `dlopen()` with `RTLD_NOLOAD` flag, which causes `dlopen()` to fail if this library wasn’t loaded before. Now, in our Linux compiler packages, `libur_adapter_opencl.so` and `libur_adapter_opencl.so.0` are symlinked so they are the same file, that’s why call to `dlopen()` in SYCL RT succeeds. However, the problem happens with DPCPP PyPi package, which doesn’t support symlinked files, so call to dlopen() fails because these are two different files. **Proposed solution** Lookup `libur_adapter_opencl.so.0` as fallback. **Other potential solutions** 1. Why not just load `libur_adapter_opencl.so.0` always? Because that causes SYCL unit tests, which rely on mocked OpenCL adapter to fail. In unit tests, we actually want SYCL RT to load `libur_adapter_opencl.so` (mocked) and UR to load `libur_adapter_opencl.so.0`, both of which are different files. 2. Why not remove `RTLD_NOLOAD` flag? When using PyPi package, that can cause SYCL RT and UR to load two OpenCL adapters libraries. I'm not an expert on loaders, but that might lead to more bugs if, for example, OpenCL adapter functions that SYCL RT calls have side effects.
1 parent c355a3d commit 75e9e73

File tree

2 files changed

+54
-34
lines changed

2 files changed

+54
-34
lines changed

sycl/include/sycl/detail/os_util.hpp

Lines changed: 24 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@
1212

1313
#include <sycl/detail/export.hpp> // for __SYCL_EXPORT
1414

15+
#include <array>
1516
#include <cstdlib> // for size_t
1617
#include <functional>
1718
#include <string> // for string
@@ -106,27 +107,38 @@ void fileTreeWalk(const std::string Path,
106107
std::function<void(const std::string)> Func,
107108
bool ignoreErrors = false);
108109

109-
void *dynLookup(const char *WinName, const char *LinName, const char *FunName);
110-
111110
// Look up a function name that was dynamically linked
112-
// This is used by the runtime where it needs to manipulate native handles (e.g.
113-
// retaining OpenCL handles). On Windows, the symbol name is looked up in
114-
// `WinName`. In Linux, it uses `LinName`.
111+
// This is used by the runtime where it needs to manipulate native handles
112+
// (e.g. retaining OpenCL handles).
115113
//
116114
// The library must already have been loaded (perhaps by UR), otherwise this
117115
// function throws a SYCL runtime exception.
116+
void *dynLookup(const char *const *LibNames, size_t LibNameSizes,
117+
const char *FunName);
118+
118119
template <typename fn>
119-
fn *dynLookupFunction(const char *WinName, const char *LinName,
120+
fn *dynLookupFunction(const char *const *LibNames, size_t LibNameSize,
120121
const char *FunName) {
121-
return reinterpret_cast<fn *>(dynLookup(WinName, LinName, FunName));
122+
return reinterpret_cast<fn *>(dynLookup(LibNames, LibNameSize, FunName));
122123
}
123-
// On Linux, the name of OpenCL that was used to link against may be either
124-
// `OpenCL.so`, `OpenCL.so.1` or possibly anything else.
125-
// `libur_adapter_opencl.so` is a more stable name, since it is hardcoded into
126-
// the loader.
124+
125+
// On Linux, first try to load from libur_adapter_opencl.so, then
126+
// libur_adapter_opencl.so.0 if the first is not found. libur_adapter_opencl.so
127+
// and libur_adapter_opencl.so.0 might be different libraries if they are not
128+
// symlinked, which is the case with PyPi compiler distribution package.
129+
// We can't load libur_adapter_opencl.so.0 always as the first choice because
130+
// that would break SYCL unittests, which rely on mocking libur_adapter_opencl.
131+
#ifdef __SYCL_RT_OS_WINDOWS
132+
constexpr std::array<const char *, 1> OCLLibNames = {"OpenCL"};
133+
#else
134+
constexpr std::array<const char *, 2> OCLLibNames = {
135+
"libur_adapter_opencl.so", "libur_adapter_opencl.so.0"};
136+
#endif
137+
127138
#define __SYCL_OCL_CALL(FN, ...) \
128139
(sycl::_V1::detail::dynLookupFunction<decltype(FN)>( \
129-
"OpenCL", "libur_adapter_opencl.so", #FN)(__VA_ARGS__))
140+
sycl::detail::OCLLibNames.data(), sycl::detail::OCLLibNames.size(), \
141+
#FN)(__VA_ARGS__))
130142

131143
} // namespace detail
132144
} // namespace _V1

sycl/source/detail/os_util.cpp

Lines changed: 30 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -291,36 +291,44 @@ size_t getDirectorySize(const std::string &Path, bool ignoreErrors) {
291291
return DirSizeVar;
292292
}
293293

294-
// Look up a function name that was dynamically linked
295-
// This is used by the runtime where it needs to manipulate native handles (e.g.
296-
// retaining OpenCL handles). On Windows, the symbol name is looked up in
297-
// `WinName`. In Linux, it uses `LinName`.
294+
// Look up a function name from the given list of shared libraries.
298295
//
299-
// The library must already have been loaded (perhaps by UR), otherwise this
296+
// These library must already have been loaded (perhaps by UR), otherwise this
300297
// function throws a SYCL runtime exception.
301-
void *dynLookup([[maybe_unused]] const char *WinName,
302-
[[maybe_unused]] const char *LinName, const char *FunName) {
298+
void *dynLookup(const char *const *LibNames, size_t LibNameSizes,
299+
const char *FunName) {
303300
#ifdef __SYCL_RT_OS_WINDOWS
304-
auto handle = GetModuleHandleA(WinName);
305-
if (!handle) {
306-
throw sycl::exception(make_error_code(errc::runtime),
307-
std::string(WinName) + " library is not loaded");
308-
}
309-
auto *retVal = GetProcAddress(handle, FunName);
301+
HMODULE handle = nullptr;
302+
auto GetHandleF = [](const char *LibName) {
303+
return GetModuleHandleA(LibName);
304+
};
305+
auto GetProcF = [&]() { return GetProcAddress(handle, FunName); };
310306
#else
311-
auto handle = dlopen(LinName, RTLD_LAZY | RTLD_NOLOAD);
312-
if (!handle) {
313-
throw sycl::exception(make_error_code(errc::runtime),
314-
std::string(LinName) + " library is not loaded");
315-
}
316-
auto *retVal = dlsym(handle, FunName);
317-
dlclose(handle);
307+
void *handle = nullptr;
308+
auto GetHandleF = [](const char *LibName) {
309+
return dlopen(LibName, RTLD_LAZY | RTLD_NOLOAD);
310+
};
311+
auto GetProcF = [&]() {
312+
auto *retVal = dlsym(handle, FunName);
313+
dlclose(handle);
314+
return retVal;
315+
};
318316
#endif
319-
if (!retVal) {
317+
318+
// Iterate over the list of libraries and try to find one that is loaded.
319+
size_t LibNameIterator = 0;
320+
while (!handle && LibNameIterator < LibNameSizes)
321+
handle = GetHandleF(LibNames[LibNameIterator++]);
322+
if (!handle)
323+
throw sycl::exception(make_error_code(errc::runtime),
324+
"Libraries could not be loaded");
325+
326+
// Look up the function in the loaded library.
327+
auto *retVal = GetProcF();
328+
if (!retVal)
320329
throw sycl::exception(make_error_code(errc::runtime),
321330
"Symbol " + std::string(FunName) +
322331
" could not be found");
323-
}
324332
return reinterpret_cast<void *>(retVal);
325333
}
326334

0 commit comments

Comments
 (0)