Cannot compile some .cu files with latest Visual Studio and Cuda 12.9 #3965

@chacha21

This might be a pure CUDA problem, but I get a large number of compile failures in a fresh build of the CUDA modules of OpenCV 4.12.0.

Windows 10
CMakeGUI 4.0.3
Latest Visual Studio 2022 (17.14.7) or Visual Studio 2019 (16.11.48)
OpenCV 4.12 + opencv_contrib 4.12
CUDA SDK 12.9 Update 1

When compiling median_filter.cu, many errors are reported from cub/util_ptx.cuh.

A few examples:


1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\cub/util_ptx.cuh(73): error : expected a ")"
1>    asm("vshl.u32.u32.u32.clamp.add %0, %1, %2, %3;" : "=r"(ret) : "r"(x), "r"(shift), "r"(addend));
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\cub/util_ptx.cuh(208): error : calling a __device__ function("__syncthreads_and") from a __host__ function("CTA_SYNC_AND") is not allowed
1>    return __syncthreads_and(p);

This can be (partly) fixed by undefining __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__, but similar errors appear in other modules.
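
To illustrate what undefining the macro does, here is a minimal sketch in plain C++ of a feature macro gating a code path. Only the macro name comes from the build above; the helper `medianFilterPath` and the strings it returns are hypothetical, and this is not the actual layout of median_filter.cu. The assumption is that the macro selects the CUB-dependent implementation, so removing it keeps that code (and the failing CUB headers) out of the translation unit.

```cpp
#include <string>

// Macro name taken from the report. It is defined and then undefined here
// only to mimic the workaround; where the real build defines it is not shown.
#define __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__
#undef __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__

// Hypothetical helper: reports which branch the preprocessor kept.
inline std::string medianFilterPath()
{
#ifdef __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__
    // Assumed: this branch would pull in the CUB-based implementation.
    return "wavelet-matrix";
#else
    // Assumed: this branch never includes the CUB headers.
    return "fallback";
#endif
}
```

With the macro undefined as above, `medianFilterPath()` returns "fallback". This only sidesteps the failing headers for one file; it says nothing about why CUB fails to compile in the first place.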

Another one, from generalized_hough.cu:


C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/system/detail/generic/for_each.h(49): error : static assertion failed with "unimplemented for this system"
1>    static_assert((thrust::detail::depend_on_instantiation<InputIterator, false>::value), "unimplemented for this system");
1>    ^
1>          detected during:
1>            instantiation of "InputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::system::detail::generic::for_each(thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::execution_policy<DerivedPolicy> &, InputIterator, InputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::zip_iterator<cuda::std::__4::tuple<thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>>>, UnaryFunction=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::unary_transform_functor<cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>>]" at line 46 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/detail/for_each.inl
1>            instantiation of "InputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::for_each(const thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::zip_iterator<cuda::std::__4::tuple<thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>>>, UnaryFunction=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::unary_transform_functor<cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>>]" at line 62 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/system/detail/generic/transform.inl
1>            instantiation of "OutputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::system::detail::generic::transform(thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::execution_policy<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, OutputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, UnaryFunction=cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>]" at line 47 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/detail/transform.inl
1>            instantiation of "OutputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::transform(const thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, OutputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, UnaryFunction=cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>]" at line 148 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/detail/transform.inl
1>            instantiation of "OutputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::transform(InputIterator, InputIterator, OutputIterator, UnaryFunction) [with InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, OutputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, UnaryFunction=cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>]" at line 541 of D:\opencv_contrib-4.12.0\modules\cudaimgproc\src\cuda\generalized_hough.cu
1>            instantiation of "void cv::cuda::device::ght::Guil_Full_buildFeatureList_caller<FT,isTempl>(const unsigned int *, const float *, int, int *, int, float, float, int, float2, float) [with FT=cv::cuda::device::ght::TemplFeatureTable, isTempl=true]" at line 549 of D:\opencv_contrib-4.12.0\modules\cudaimgproc\src\cuda\generalized_hough.cu
1>
1>1 error detected in the compilation of "D:/opencv_contrib-4.12.0/modules/cudaimgproc/src/cuda/generalized_hough.cu".

I don't know which part of the toolchain is broken when using CUB.

I will also ask on the NVIDIA forums.
