
Acting on split compilation warning results in missing kernels in binary #7511

Closed
@tomdeakin

Description

Describe the bug
Split compilation and linking is important. When I compile and link in separate steps, the link step warns that the expected SYCL target is missing from the linked binary. If I act on the warning and select exactly the SYCL target it reports, the resulting binary is missing the kernel and crashes at runtime with a CUDA error.

To Reproduce

$ clang++ -O3 -std=c++20 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --cuda-gpu-arch=sm_80 simple-sycl-app.cpp -c --gcc-toolchain=/lustre/projects/bristol/modules/gcc/12.1.0
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
$ clang++ -O3 -std=c++20 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --cuda-gpu-arch=sm_80 simple-sycl-app.o -o a.out --gcc-toolchain=/lustre/projects/bristol/modules/gcc/12.1.0
clang-16: warning: linked binaries do not contain expected 'nvptx64-nvidia-cuda' target; found targets: 'nvptx64-nvidia-cuda-sm_80' [-Wsycl-target]
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
$ ./a.out 
NVIDIA A100-SXM4-40GB
The results are correct!

## The link warning says it couldn't find the expected 'nvptx64-nvidia-cuda' target and only found 'nvptx64-nvidia-cuda-sm_80', so I can tell it to use exactly the target it found in the earlier step:
$ clang++ -O3 -std=c++20 -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sm_80 -Xsycl-target-backend --cuda-gpu-arch=sm_80 simple-sycl-app.o -o a.out --gcc-toolchain=/lustre/projects/bristol/modules/gcc/12.1.0
clang-16: warning: CUDA version 11.7 is only partially supported [-Wunknown-cuda-version]
warning: overriding the module target triple with nvptx64-nvidia-cuda-sm_80 [-Woverride-module]
1 warning generated.
$ ./a.out 
NVIDIA A100-SXM4-40GB

PI CUDA ERROR:
        Value:           500
        Name:            CUDA_ERROR_NOT_FOUND
        Description:     named symbol not found
        Function:        cuda_piKernelCreate
        Source Location: /lustre/projects/bristol/modules-phase3/sycl_workspace_22_11_21/llvm/sycl/plugins/cuda/pi_cuda.cpp:2826

terminate called after throwing an instance of 'sycl::_V1::exception'
  what():  Native API failed. Native API returns: -999 (Unknown PI error)
Aborted (core dumped)
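
For reference, the original simple-sycl-app.cpp is not attached to this report; the following is only a minimal sketch of the kind of program that matches the transcript above (print the device name, run a trivial kernel, check the results):

// Hypothetical reconstruction of simple-sycl-app.cpp -- not the reporter's
// actual source, just a minimal SYCL program consistent with the output above.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  sycl::queue q{sycl::gpu_selector_v};
  std::cout << q.get_device().get_info<sycl::info::device::name>() << "\n";

  constexpr size_t N = 1024;
  std::vector<int> data(N, 0);
  {
    sycl::buffer<int> buf(data.data(), sycl::range<1>(N));
    q.submit([&](sycl::handler &h) {
      sycl::accessor acc(buf, h, sycl::write_only);
      // Trivial kernel: fill the buffer with its own indices.
      h.parallel_for(sycl::range<1>(N), [=](sycl::id<1> i) {
        acc[i] = static_cast<int>(i[0]);
      });
    });
  } // buffer goes out of scope here, copying results back to data

  bool ok = true;
  for (size_t i = 0; i < N; ++i)
    if (data[i] != static_cast<int>(i)) ok = false;

  std::cout << (ok ? "The results are correct!" : "Results are wrong!") << "\n";
  return ok ? 0 : 1;
}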

The expected behaviour is that acting on the warnings makes them go away and produces a binary that does not crash.

Environment (please complete the following information):

  • OS: HPE/Cray RHEL 8.3
  • Target device and vendor: NVIDIA A100
  • DPC++ version: 6aefd63

Additional context
Discovered during the SYCL Practitioners Hackathon 2022 (user group meeting).

Labels

bug (Something isn't working), cuda (CUDA back-end)
