Describe the bug
Compiling and linking in separate steps is important for real builds. When I do this, the link step warns that the expected SYCL target is missing. If I act on the warning and select the target it names, the resulting binary is missing the kernel and crashes at runtime with a CUDA error.
To Reproduce
$ clang++ -O3 -std=c++20 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --cuda-gpu-arch=sm_80 simple-sycl-app.cpp -c --gcc-toolchain=/lustre/projects/bristol/modules/gcc/12.1.0
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
$ clang++ -O3 -std=c++20 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --cuda-gpu-arch=sm_80 simple-sycl-app.o -o a.out --gcc-toolchain=/lustre/projects/bristol/modules/gcc/12.1.0
clang-16: warning: linked binaries do not contain expected 'nvptx64-nvidia-cuda' target; found targets: 'nvptx64-nvidia-cuda-sm_80' [-Wsycl-target]
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
$ ./a.out
NVIDIA A100-SXM4-40GB
The results are correct!
## The link-time warning says it expected 'nvptx64-nvidia-cuda' but found 'nvptx64-nvidia-cuda-sm_80', so I tell it to use exactly the target it found from the earlier step:
$ clang++ -O3 -std=c++20 -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sm_80 -Xsycl-target-backend --cuda-gpu-arch=sm_80 simple-sycl-app.o -o a.out --gcc-toolchain=/lustre/projects/bristol/modules/gcc/12.1.0
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
warning: overriding the module target triple with nvptx64-nvidia-cuda-sm_80 [-Woverride-module]
1 warning generated.
$ ./a.out
NVIDIA A100-SXM4-40GB
PI CUDA ERROR:
Value: 500
Name: CUDA_ERROR_NOT_FOUND
Description: named symbol not found
Function: cuda_piKernelCreate
Source Location: /lustre/projects/bristol/modules-phase3/sycl_workspace_22_11_21/llvm/sycl/plugins/cuda/pi_cuda.cpp:2826
terminate called after throwing an instance of 'sycl::_V1::exception'
what(): Native API failed. Native API returns: -999 (Unknown PI error)
Aborted (core dumped)
Expected behaviour: acting on the warning should make it go away and produce a binary that doesn't crash.
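For anyone who needs split compilation in the meantime, the sequence that ran correctly in this report (same generic `-fsycl-targets=nvptx64-nvidia-cuda` at both compile and link) can be sketched as a small script. This is only a sketch: the variable and function names are hypothetical, the flags are copied from the commands above, the site-specific `--gcc-toolchain` path is left out, and the `-Wsycl-target` warning is still expected to appear at link time.

```shell
#!/bin/sh
# Sketch: define the shared flags once so the compile and link steps always
# pass the same -fsycl-targets value. Names here are hypothetical.
set -eu

CXX="${CXX:-clang++}"
SYCL_TARGET="nvptx64-nvidia-cuda"   # generic triple, as in the run that worked
GPU_ARCH="sm_80"                    # A100, as in the report

# Flags taken from the report; add --gcc-toolchain=... here if your site needs it.
COMMON_FLAGS="-O3 -std=c++20 -fsycl -fsycl-targets=${SYCL_TARGET} -Xsycl-target-backend --cuda-gpu-arch=${GPU_ARCH}"

# Build the command lines as strings so they can be inspected before running.
compile_cmd() { echo "${CXX} ${COMMON_FLAGS} $1.cpp -c"; }
link_cmd()    { echo "${CXX} ${COMMON_FLAGS} $1.o -o a.out"; }

# Print the commands by default; set RUN=1 to actually execute them.
for cmd in "$(compile_cmd simple-sycl-app)" "$(link_cmd simple-sycl-app)"; do
  echo "+ ${cmd}"
  if [ "${RUN:-0}" = "1" ]; then ${cmd}; fi
done
```

Run it as is to print both commands, or with `RUN=1` to execute them. Deriving both steps from one flag string avoids the situation in the failing run above, where the link step was given a different target string than the compile step.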
Environment:
- OS: HPE/Cray RHEL 8.3
- Target device and vendor: Nvidia A100
- DPC++ version: 6aefd63
Additional context
Discovered during the SYCL Practitioners Hackathon 2022 (user group meeting).