Description
Describe the bug
sycl::exp(double) fails to compile with -ffast-math on the CUDA backend. The failure was introduced by patch #5801.
To Reproduce
Please describe the steps to reproduce the behavior:
- Build the compiler for the CUDA backend from head 0f0c5d1 or later
- Get sycl_test.txt, rename it to sycl_test.cpp, and compile it with the command:
clang++ -fsycl -fsycl-unnamed-lambda -fsycl-targets=nvptx64-nvidia-cuda -O2 -ffast-math sycl_test.cpp -o a.out
Environment (please complete the following information):
OS: Linux-Ubuntu 20.04
Target device and vendor: NVIDIA TITAN RTX
DPC++ version: clang version 15.0.0 (https://github.com/intel/llvm.git 0f0c5d1)
Dependencies version: NVIDIA-SMI 510.47.03, CUDA Version: 11.6
Additional context
Log:
sycl_test.cpp:13:26: error: call to 'exp' is ambiguous
double tmp = sycl::exp(2.0 * id);
^~~~~~~~~
include/sycl/CL/sycl/handler.hpp:212:5: note: in instantiation of function template specialization 'main(const int, const char **)::(anonymous class)::operator()(sycl::handler &)::(anonymous class)::operator()<sycl::item<1, true>>' requested here
KernelFunc(Arg);
^
include/sycl/CL/sycl/handler.hpp:1122:5: note: in instantiation of member function 'sycl::detail::RoundedRangeKernel<sycl::item<1, true>, 1, (lambda at sycl_test.cpp:12:29)>::operator()' requested here
KernelFunc(detail::Builder::getElement(detail::declptr()));
^
include/sycl/CL/sycl/handler.hpp:1242:5: note: in instantiation of function template specialization 'sycl::handler::kernel_parallel_for<sycl::detail::RoundedRangeKernel<sycl::item<1, true>, 1, (lambda at sycl_test.cpp:12:29)>, sycl::item<1, true>, sycl::detail::RoundedRangeKernel<sycl::item<1, true>, 1, (lambda at sycl_test.cpp:12:29)>>' requested here
kernel_parallel_for<KernelName, ElementType>(KernelFunc);
^
include/sycl/CL/sycl/handler.hpp:1030:7: note: in instantiation of function template specialization 'sycl::handler::kernel_parallel_for_wrapper<sycl::detail::RoundedRangeKernel<sycl::item<1, true>, 1, (lambda at sycl_test.cpp:12:29)>, sycl::item<1, true>, sycl::detail::RoundedRangeKernel<sycl::item<1, true>, 1, (lambda at sycl_test.cpp:12:29)>>' requested here
kernel_parallel_for_wrapper<KName, TransformedArgType>(Wrapper);
^
include/sycl/CL/sycl/handler.hpp:1460:5: note: in instantiation of function template specialization 'sycl::handler::parallel_for_lambda_impl<sycl::detail::auto_name, (lambda at sycl_test.cpp:12:29), 1>' requested here
parallel_for_lambda_impl(NumWorkItems, std::move(KernelFunc));
^
sycl_test.cpp:12:13: note: in instantiation of function template specialization 'sycl::handler::parallel_for<sycl::detail::auto_name, (lambda at sycl_test.cpp:12:29)>' requested here
cgh.parallel_for(N, [=](auto id) {
^
include/sycl/CL/sycl/builtins.hpp:154:49: note: candidate function [with T = double]
detail::enable_if_t<__FAST_MATH_GENFLOAT(T), T> exp(T x) __NOEXC {
^
include/sycl/CL/sycl/builtins.hpp:1582:55: note: candidate function [with T = double]
detail::enable_if_t<detail::is_genfloat<T>::value, T> exp(T x) __NOEXC {
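The two notes above show that both `exp` templates are viable for `T = double`, so overload resolution cannot pick one. This can be sketched outside SYCL in plain C++14. Everything in the `demo` namespace below is a hypothetical stand-in, not the actual contents of builtins.hpp: `precise_v` plays the role of a constraint that accepts `double`, and `is_genfloat_v` a constraint that accepts every floating-point type. The `fixed` variant is only one possible way to make the constraints mutually exclusive, not necessarily how the headers should be changed.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>
#include <utility>

namespace demo {

// Hypothetical stand-in for a "generic float" trait (float or double).
template <typename T>
constexpr bool is_genfloat_v =
    std::is_same<T, float>::value || std::is_same<T, double>::value;

// Hypothetical stand-in for the constraint on the first candidate:
// assumed here to accept double only.
template <typename T>
constexpr bool precise_v = std::is_same<T, double>::value;

namespace broken {
// Candidate 1: enabled for double.
template <typename T>
std::enable_if_t<precise_v<T>, T> exp(T x) { return std::exp(x); }

// Candidate 2: enabled for every genfloat, so it overlaps with
// candidate 1 when T = double.
template <typename T>
std::enable_if_t<is_genfloat_v<T>, T> exp(T x) { return std::exp(x); }
} // namespace broken

namespace fixed {
// Candidate 1: unchanged.
template <typename T>
std::enable_if_t<precise_v<T>, T> exp(T x) { return std::exp(x); }

// Candidate 2: explicitly excludes the types candidate 1 already covers.
template <typename T>
std::enable_if_t<is_genfloat_v<T> && !precise_v<T>, T> exp(T x) {
  return std::exp(x);
}
} // namespace fixed

// C++14-friendly void_t, used to detect whether a call is well-formed.
template <typename...> struct make_void { using type = void; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

template <typename T, typename = void>
struct broken_exp_callable : std::false_type {};
template <typename T>
struct broken_exp_callable<
    T, void_t<decltype(broken::exp(std::declval<T>()))>> : std::true_type {};

template <typename T, typename = void>
struct fixed_exp_callable : std::false_type {};
template <typename T>
struct fixed_exp_callable<
    T, void_t<decltype(fixed::exp(std::declval<T>()))>> : std::true_type {};

// float matches only candidate 2, so the call is fine.
static_assert(broken_exp_callable<float>::value, "float should resolve");
// double matches both candidates: the call is ambiguous, hence ill-formed.
static_assert(!broken_exp_callable<double>::value, "double should be ambiguous");
// With mutually exclusive constraints both types resolve.
static_assert(fixed_exp_callable<float>::value, "float should resolve");
static_assert(fixed_exp_callable<double>::value, "double should resolve");

// Runtime sanity check of the fixed overload set.
inline void run_checks() {
  assert(fixed::exp(0.0) == 1.0);
  assert(fixed::exp(0.0f) == 1.0f);
}

} // namespace demo
```

Calling `demo::broken::exp(2.0)` directly produces the same kind of "call to 'exp' is ambiguous" diagnostic as in the log, with two "candidate function [with T = double]" notes.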