[SYCL] Fix issue of acquiring kernel twice #11953
Merged
In #11751, ref counting of kernel objects was changed to be more accurate so that in-memory caching can be disabled. When getting a kernel from the cache, the ref count of the kernel handle is now incremented (when caching is enabled). Thus, a method like `ProgramManager::getOrCreateKernel` will increment the ref count of the kernel it gets. However, in `enqueueImpKernel`, when enqueuing a kernel with a kernel bundle, `ProgramManager::getOrCreateKernel` is called twice, first indirectly by:
llvm/sycl/source/detail/scheduler/commands.cpp
Lines 2527 to 2528 in c43a90f
and second directly by:
llvm/sycl/source/detail/scheduler/commands.cpp
Lines 2538 to 2548 in c43a90f
This means that the ref count of the acquired kernel is incremented twice, yet the rest of the function releases it only once, which leaks the kernel. As the comment and asserts in the second snippet say, the only purpose of the second call to `getOrCreateKernel` is to fetch the mutex associated with the cached kernel retrieved by the first call, so this PR adjusts `get_kernel` to save this mutex, forgoing the extra `getOrCreateKernel` call and its unintentional additional ref count.
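To illustrate the leak described above, here is a minimal, self-contained sketch. It is not the SYCL runtime code: `ToyKernel`, `ToyCache`, `enqueueTwiceAcquire`, and `enqueueSingleAcquire` are hypothetical names, and the toy ref count tracks only references held by callers. The only assumption carried over from the description is that every cache lookup retains the kernel and that the enqueue path releases it once.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a ref-counted kernel handle (not a real SYCL type).
struct ToyKernel {
  int RefCount = 0;
  std::mutex Mutex;
};

// Hypothetical cache: every lookup retains the kernel, mirroring the
// behaviour the description attributes to getOrCreateKernel after #11751.
class ToyCache {
public:
  std::pair<ToyKernel *, std::mutex *>
  getOrCreateKernel(const std::string &Name) {
    ToyKernel &K = Kernels[Name]; // created on first use
    ++K.RefCount;                 // each lookup takes a reference
    return {&K, &K.Mutex};
  }
  void release(ToyKernel *K) { --K->RefCount; }
  int refCount(const std::string &Name) { return Kernels[Name].RefCount; }

private:
  std::unordered_map<std::string, ToyKernel> Kernels;
};

// Pattern before the fix: the second lookup is only there to obtain the
// mutex, but it retains the kernel a second time.
int enqueueTwiceAcquire(ToyCache &Cache) {
  ToyKernel *Kernel = Cache.getOrCreateKernel("k").first;
  std::mutex *KernelMutex = Cache.getOrCreateKernel("k").second;
  {
    std::lock_guard<std::mutex> Lock(*KernelMutex);
    // ... set arguments and launch the kernel here ...
  }
  Cache.release(Kernel);      // released once, retained twice
  return Cache.refCount("k"); // one reference is left behind
}

// Pattern after the fix: keep the mutex returned by the first lookup.
int enqueueSingleAcquire(ToyCache &Cache) {
  std::pair<ToyKernel *, std::mutex *> Acquired = Cache.getOrCreateKernel("k");
  {
    std::lock_guard<std::mutex> Lock(*Acquired.second);
    // ... set arguments and launch the kernel here ...
  }
  Cache.release(Acquired.first);
  return Cache.refCount("k"); // balanced
}
```

With a fresh `ToyCache`, `enqueueTwiceAcquire` returns 1 (one leaked reference) and `enqueueSingleAcquire` returns 0.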