Closed
Description
This may be purely a CUDA problem, but a fresh build of the CUDA modules of OpenCV 4.12.0 fails with a large number of errors. My environment:
Windows 10
CMake GUI 4.0.3
Visual Studio 2022 (17.14.7, the latest) or Visual Studio 2019 (16.11.48)
OpenCV 4.12 + opencv_contrib 4.12
CUDA SDK 12.9 Update 1
When compiling median_filter.cu, many errors are reported from util_ptx.cuh.
A few examples:
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\cub/util_ptx.cuh(73): error : expected a ")"
1> asm("vshl.u32.u32.u32.clamp.add %0, %1, %2, %3;" : "=r"(ret) : "r"(x), "r"(shift), "r"(addend));
1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\cub/util_ptx.cuh(208): error : calling a __device__ function("__syncthreads_and") from a __host__ function("CTA_SYNC_AND") is not allowed
1> return __syncthreads_and(p);
Undefining __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__ fixes this, but only partly: similar errors appear in other modules.
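For illustration, the macro workaround amounts to something like the sketch below. It only shows the preprocessor pattern. The `#define` stands in for wherever the build actually sets the macro (an assumption: it may come from a header or from a compiler flag), and `waveletMatrixPathEnabled` is a hypothetical helper, not an OpenCV function.

```cpp
// Sketch of the "undefine the feature macro" workaround (plain C++, no CUDA needed).

// Assumption: stand-in for the place where the real build enables the feature.
#define __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__

// The workaround: remove the macro before any code guarded by it is compiled.
#undef __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__

// Hypothetical helper that reports which code path the preprocessor selected.
constexpr bool waveletMatrixPathEnabled()
{
#ifdef __OPENCV_USE_WAVELET_MATRIX_FOR_MEDIAN_FILTER_CUDA__
    return true;   // CUB-based wavelet matrix path would be compiled
#else
    return false;  // guarded code is skipped, so the CUB headers are not pulled in by it
#endif
}
```

With the macro undefined, the guarded code is never seen by the compiler, which would explain why the util_ptx.cuh errors go away for this file while other modules that use CUB or Thrust still fail.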
Another failure, this time in generalized_hough.cu:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/system/detail/generic/for_each.h(49): error : static assertion failed with "unimplemented for this system"
1> static_assert((thrust::detail::depend_on_instantiation<InputIterator, false>::value), "unimplemented for this system");
1> ^
1> detected during:
1> instantiation of "InputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::system::detail::generic::for_each(thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::execution_policy<DerivedPolicy> &, InputIterator, InputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::zip_iterator<cuda::std::__4::tuple<thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>>>, UnaryFunction=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::unary_transform_functor<cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>>]" at line 46 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/detail/for_each.inl
1> instantiation of "InputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::for_each(const thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::zip_iterator<cuda::std::__4::tuple<thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>>>, UnaryFunction=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::unary_transform_functor<cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>>]" at line 62 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/system/detail/generic/transform.inl
1> instantiation of "OutputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::system::detail::generic::transform(thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::execution_policy<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, OutputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, UnaryFunction=cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>]" at line 47 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/detail/transform.inl
1> instantiation of "OutputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::transform(const thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::detail::execution_policy_base<DerivedPolicy> &, InputIterator, InputIterator, OutputIterator, UnaryFunction) [with DerivedPolicy=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::cuda_cub::tag, InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, OutputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, UnaryFunction=cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>]" at line 148 of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\include\thrust/detail/transform.inl
1> instantiation of "OutputIterator thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::transform(InputIterator, InputIterator, OutputIterator, UnaryFunction) [with InputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, OutputIterator=thrust::THRUST_200802_SM_500_520_600_610_700_750_800_860_890_900_1000_1200_NS::device_ptr<int>, UnaryFunction=cv::cuda::device::binder2nd<cv::cuda::device::minimum<int>>]" at line 541 of D:\opencv_contrib-4.12.0\modules\cudaimgproc\src\cuda\generalized_hough.cu
1> instantiation of "void cv::cuda::device::ght::Guil_Full_buildFeatureList_caller<FT,isTempl>(const unsigned int *, const float *, int, int *, int, float, float, int, float2, float) [with FT=cv::cuda::device::ght::TemplFeatureTable, isTempl=true]" at line 549 of D:\opencv_contrib-4.12.0\modules\cudaimgproc\src\cuda\generalized_hough.cu
1>
1>1 error detected in the compilation of "D:/opencv_contrib-4.12.0/modules/cudaimgproc/src/cuda/generalized_hough.cu".
I don't know which part of the toolchain is broken when using CUB.
I will also ask on the NVIDIA forums.