
Implement cache for IPC handles #403

@vinser52


Rationale

UMF needs to employ efficient caching for IPC handles for the following reasons:

  • IPC handle creation might be expensive. For example, in the case of L0 it requires access to the GPU driver.
  • Program logic might request an IPC handle for the same memory region multiple times. Beyond the performance implications, this can cause inefficient resource usage. For example, the L0 implementation of IPC handles is based on file descriptors, so an inefficient implementation can quickly exhaust the file descriptors available on the system (oneCCL hit this issue in the past).

Description

PR #88 introduces initial support for IPC functionality. It allows creating an IPC handle for a memory buffer (allocated via UMF) on the producer side and opening (mapping into the address space) that handle on the consumer side, in the same or another process.

IPC handle functionality is memory provider-specific. When a UMF client requests an IPC handle for a fine-grain allocation (returned by umfPoolMalloc), the memory provider internally creates an IPC handle for the whole coarse-grain allocation. UMF needs to implement an IPC cache on both the producer and the consumer sides, so that on the producer side the memory provider creates an IPC handle only once per coarse-grain allocation. On the consumer side, the IPC handle is opened by the memory provider only once, and all subsequent open requests for the same IPC handle reuse the mapping created by the first one.

API Changes

N/A

Implementation details

On the consumer side, the unique key for caching is the address of the coarse-grain allocation. On the producer side, the key is the pair of the ID of the process that creates the handle and the address of the coarse-grain allocation represented by the IPC handle.

PR #88 already implements naive caching on the consumer side using the critnib data structure. But this is not enough: critnib is just a map, so it cannot limit the size of the cache. What we actually need is a cache implementation with an eviction policy (LRU, LFU, etc.) that bounds both the cache size and the underlying resource usage.

I would suggest using a combination of the uthash and utlist data structures. Both are implemented as C macros, and each implementation is a single header file: uthash.h and utlist.h.

The size of the caches on the producer and consumer sides should be configurable (e.g. via an environment variable).

Metadata

Labels: enhancement (New feature or request)