Skip to content

Conversation

@vchiluka5
Copy link
Contributor

@vchiluka5 vchiluka5 commented Aug 5, 2019

Added lazy loading of nvcuda.dll. Removed unwanted driver APIs as the final output remains same and replaced driver API with runtime API wherever possible

resolves #2210

force_builders=Custom
buildworker:Custom=linux-4
docker_image:Custom=ubuntu-cuda:18.04

@chacha21
Copy link
Contributor

chacha21 commented Aug 6, 2019

In your latest patch, GetProcAddress/dlsym are used to identify "cuCtxGetCurrent" and it is fine.

However, you can notice that "cuCtxPushCurrent" and "cuCtxPopCurrent" are defined as macros in cuda.h and are aliasing cuCtxPushCurrent_v2 and cuCtxPopCurrent_v2.
Thus, we can't be sure that in the future, "cuCtxGetCurrent" won't evolve the same.

That's why I suggest using a stringify macro to build the "cuCtxGetCurrent" symbol name, because it will protect from such changes and compatibility breaking.
`#define FUNCTION_NAME(name) MACRO_STRINGIFY(name)

#define MACRO_STRINGIFY(name) #name

...

const char* cuCtxGetCurrent_name = FUNCTION_NAME(cuCtxGetCurrent);

GetProcAddress(hNVCuda, cuCtxGetCurrent_name);
`

@vchiluka5
Copy link
Contributor Author

In your latest patch, GetProcAddress/dlsym are used to identify "cuCtxGetCurrent" and it is fine.

However, you can notice that "cuCtxPushCurrent" and "cuCtxPopCurrent" are defined as macros in cuda.h and are aliasing cuCtxPushCurrent_v2 and cuCtxPopCurrent_v2.
Thus, we can't be sure that in the future, "cuCtxGetCurrent" won't evolve the same.

That's why I suggest using a stringify macro to build the "cuCtxGetCurrent" symbol name, because it will protect from such changes and compatibility breaking.
`#define FUNCTION_NAME(name) MACRO_STRINGIFY(name)

#define MACRO_STRINGIFY(name) #name

...

const char* cuCtxGetCurrent_name = FUNCTION_NAME(cuCtxGetCurrent);

GetProcAddress(hNVCuda, cuCtxGetCurrent_name);
`

cuCtxGetCurrent() is used to only fetch the runtime context and there is only 1 runtime context. This function has no further evolution of v2 and above and hence we can keep it simple for this function.
Remaining functions like cuCtxPopCurrent and cuCtxPushCurrent were not needed in this scenario and hence removed.
This implementation is for SDK 1.0 and for next release we are trying to completely get rid of driver APIs so that integration becomes more easy.

@chacha21
Copy link
Contributor

chacha21 commented Aug 6, 2019

👍

@vchiluka5
Copy link
Contributor Author

Hi @alalek
Can you please elaborate build failure? I dont see error regarding nvidia optical flow and its on IOS.

Copy link
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

singleton's constructor/destructor should not be "public".

{
static LoadNvidiaModules* m_pInstance;
if (!m_pInstance)
m_pInstance = new LoadNvidiaModules;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pointer

which code should call object destruction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object destruction will happen by default right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added object destruction code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Destructor code has never been called. You can ensure by adding printf() there.

Please follow the example (link is above):

  • no instance pointers
  • just static (global) object in function

Copy link
Contributor Author

@vchiluka5 vchiluka5 Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Done and i ensured it by adding printf in destructor. Thanks :-)
Please take a look

Copy link
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thank you 👍

@opencv-pushbot opencv-pushbot merged commit c494a5d into opencv:master Aug 14, 2019
@alalek alalek mentioned this pull request Aug 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opencv_cudaoptflow 4.1.1 has a strong dependency to nvcuda.dll

4 participants