Commit c74f05d

[SYCL] Memory tracking and deferred release in the Level Zero plugin (#3710)
A kernel with the indirect access flag set can access memory allocations indirectly. To be conservative, we currently set the indirect access flag for all kernels. When a kernel with indirect access is submitted, it starts to reference all existing USM memory allocations. A referenced memory allocation can't be released until the kernel has finished. That's why we need to track all memory allocations referenced by each kernel with indirect access and deallocate memory only when all kernels referencing that allocation have finished.

This patch implements memory tracking and deferred release. Records of all USM memory allocations are stored in the platform. When a kernel is submitted, we take a snapshot of all existing records, and the kernel starts to reference all of these allocations. As soon as a kernel finishes execution, all of its references to allocations are removed. If memory is not referenced by any kernel, it can be deallocated and removed from the list of tracked memory allocations.
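The tracking scheme described in the commit message can be illustrated with a minimal, single-threaded sketch. Every name here (`AllocRecord`, `Platform`, `submitKernel`, `finishKernel`, `requestFree`, `isTracked`) is hypothetical and is not taken from the plugin source. `std::malloc`/`std::free` stand in for the real USM allocation calls, and the `FreeRequested` flag is an assumption about how a free that arrives while kernels are still running might be deferred.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical record kept per tracked allocation (not the plugin's type).
struct AllocRecord {
  std::size_t RefCount = 0;   // unfinished kernels referencing the allocation
  bool FreeRequested = false; // a free was requested while still referenced
};

// Hypothetical stand-in for the platform that owns the allocation records.
class Platform {
public:
  // Allocate memory and start tracking it.
  void *allocate(std::size_t Size) {
    void *Ptr = std::malloc(Size);
    if (Ptr)
      Records.emplace(Ptr, AllocRecord{});
    return Ptr;
  }

  // Kernel submission: snapshot every existing record and reference it.
  std::vector<void *> submitKernel() {
    std::vector<void *> Snapshot;
    Snapshot.reserve(Records.size());
    for (auto &Entry : Records) {
      ++Entry.second.RefCount;
      Snapshot.push_back(Entry.first);
    }
    return Snapshot;
  }

  // Kernel completion: drop this kernel's references and perform any
  // release that was deferred until the last reference went away.
  void finishKernel(const std::vector<void *> &Snapshot) {
    for (void *Ptr : Snapshot) {
      auto It = Records.find(Ptr);
      assert(It != Records.end() && It->second.RefCount > 0);
      if (--It->second.RefCount == 0 && It->second.FreeRequested)
        release(It);
    }
  }

  // Free request: release immediately if unreferenced, otherwise defer.
  void requestFree(void *Ptr) {
    auto It = Records.find(Ptr);
    if (It == Records.end())
      return;
    if (It->second.RefCount == 0)
      release(It);
    else
      It->second.FreeRequested = true;
  }

  bool isTracked(void *Ptr) const { return Records.count(Ptr) != 0; }

private:
  using RecordMap = std::unordered_map<void *, AllocRecord>;

  // Deallocate the memory and stop tracking it.
  void release(RecordMap::iterator It) {
    std::free(It->first);
    Records.erase(It);
  }

  RecordMap Records;
};
```

In this sketch, an allocation freed while two kernels are in flight stays tracked until the second `finishKernel` call, and an allocation that no kernel references is released as soon as `requestFree` is called. Real code would also need synchronization around the record map, which is omitted here.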
1 parent 42690f9 commit c74f05d

3 files changed: +362 -22 lines changed


sycl/doc/EnvironmentVariables.md

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@ subject to change. Do not rely on these variables in production code.
 | SYCL_PI_LEVEL_ZERO_BATCH_SIZE | Integer | Sets a preferred number of commands to batch into a command list before executing the command list. A value of 0 causes the batch size to be adjusted dynamically. A value greater than 0 specifies fixed size batching, with the batch size set to the specified value. The default is 0. |
 | SYCL_PI_LEVEL_ZERO_FILTER_EVENT_WAIT_LIST | Integer | When set to 0, disables filtering of signaled events from wait lists when using the Level Zero backend. The default is 1. |
 | SYCL_PI_LEVEL_ZERO_USE_COPY_ENGINE | Integer | Allows the use of copy engine, if available in the device, in Level Zero plugin to transfer SYCL buffer or image data between the host and/or device(s) and to fill SYCL buffer or image data in device or shared memory. The default is 1. |
+| SYCL_PI_LEVEL_ZERO_TRACK_INDIRECT_ACCESS_MEMORY | Any(\*) | Enable support of the kernels with indirect access and corresponding deferred release of memory allocations in the Level Zero plugin. |
 | SYCL_PARALLEL_FOR_RANGE_ROUNDING_TRACE | Any(\*) | Enables tracing of parallel_for invocations with rounded-up ranges. |
 | SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING | Any(\*) | Disables automatic rounding-up of parallel_for invocation ranges. |
 | SYCL_ENABLE_PCI | Integer | When set to 1, enables obtaining the GPU PCI address when using the Level Zero backend. The default is 0. |
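
For illustration, a switch of type `Any(*)` such as `SYCL_PI_LEVEL_ZERO_TRACK_INDIRECT_ACCESS_MEMORY` could be read as sketched below. This assumes that `Any(*)` means the variable takes effect when it is set to any value at all; the excerpt does not spell that out. The helper names `isEnvSet` and `trackIndirectAccessMemory` are hypothetical and do not come from the plugin.

```cpp
#include <cassert>
#include <cstdlib>

// Assumption: an "Any(*)" variable is enabled merely by being present in
// the environment, regardless of its value.
inline bool isEnvSet(const char *Name) {
  assert(Name != nullptr);
  return std::getenv(Name) != nullptr;
}

// Hypothetical helper: true when memory tracking was requested.
inline bool trackIndirectAccessMemory() {
  return isEnvSet("SYCL_PI_LEVEL_ZERO_TRACK_INDIRECT_ACCESS_MEMORY");
}
```

Under that assumption, launching an application with `SYCL_PI_LEVEL_ZERO_TRACK_INDIRECT_ACCESS_MEMORY=1` in its environment would turn the feature on, and leaving the variable unset would keep it off.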
