[SYCL][NFCI] Less shared_ptr for device_impl
#18270
After intel#18251, devices are guaranteed to stay alive until the SYCL RT library shuts down, so we don't have to pass everything as `std::shared_ptr<device_impl>` and can use raw pointers/references much more. That said, the constraints from intel#18143 (mostly unit tests linking statically, and the lifetimes of static/thread-local objects that follow from that) still apply, and I'm addressing them the same way: the ownership model isn't changed wholesale, `device_impl` uses `std::enable_shared_from_this`, and member objects keep creating shared pointers so that the graph of resource ownership stays intact.
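A minimal sketch of the pattern described above, not the actual runtime code: the `device_impl` below is a simplified stand-in, and `queue_like`, `name_length`, and `demo_use_count` are hypothetical names introduced only for illustration. Functions take a plain reference (no reference-count traffic), while a member object still re-creates a `std::shared_ptr` via `shared_from_this()` to keep co-owning the device.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for the runtime's device_impl (assumption, not the real class).
class device_impl : public std::enable_shared_from_this<device_impl> {
public:
  explicit device_impl(std::string Name) : MName(std::move(Name)) {}
  const std::string &name() const { return MName; }

private:
  std::string MName;
};

// Hypothetical member object: accepts a reference, but stores a shared_ptr
// so the ownership graph is unchanged.
class queue_like {
public:
  explicit queue_like(device_impl &Dev) : MDevice(Dev.shared_from_this()) {}
  device_impl &getDeviceImpl() const { return *MDevice; }

private:
  std::shared_ptr<device_impl> MDevice;
};

// Hypothetical helper: a raw reference is enough because the device is
// known to outlive the call; no atomic increment/decrement happens here.
inline std::size_t name_length(const device_impl &Dev) {
  return Dev.name().size();
}

// Creates a device owned by a shared_ptr, attaches one queue_like to it,
// and reports how many owners exist while both are alive.
inline long demo_use_count() {
  auto Dev = std::make_shared<device_impl>("gpu0");
  queue_like Q{*Dev};
  assert(name_length(Q.getDeviceImpl()) == 4);
  return Dev.use_count(); // Dev itself plus Q's member copy
}
```

Note that `shared_from_this()` is only valid on an object already owned by a `std::shared_ptr`, which is why the sketch creates the device with `std::make_shared`.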
jopperm
left a comment
Change to sycl/source/detail/jit_compiler.cpp LGTM.
MNextAvailableQueueID.fetch_add(1, std::memory_order_relaxed)} {
queue_impl_interop(UrQueue);
}
: queue_impl(UrQueue, Context, AsyncHandler, {}) {}
Using a delegating ctor here instead of the non-ctor helper `queue_impl_interop` (which was inlined into the ctor below).
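For readers unfamiliar with the idiom, here is a small sketch of a delegating constructor. The class `interop_queue` and its members are hypothetical and only mirror the shape of the change; they are not the real `queue_impl`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class used only to illustrate constructor delegation.
class interop_queue {
public:
  // Primary constructor: all member initialization lives here.
  interop_queue(int Handle, std::string Context, bool InOrder)
      : MHandle(Handle), MContext(std::move(Context)), MInOrder(InOrder) {}

  // Delegating constructor: forwards to the primary one with a default,
  // instead of calling a separate non-constructor helper from its body.
  interop_queue(int Handle, std::string Context)
      : interop_queue(Handle, std::move(Context), /*InOrder=*/false) {}

  int handle() const { return MHandle; }
  const std::string &context() const { return MContext; }
  bool in_order() const { return MInOrder; }

private:
  int MHandle;
  std::string MContext;
  bool MInOrder;
};
```

With delegation, members are initialized exactly once in the primary constructor, so they can be `const` or lack default constructors, which a post-construction helper would not allow.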
"Host task submissions should have an associated queue");
interop_handle IH{MReqToMem, HostTask.MQueue,
HostTask.MQueue->getDeviceImplPtr(),
HostTask.MQueue->getDeviceImpl().shared_from_this(),
interop_handle seems to be part of ABI, so not changing it here.
RTDeviceBinaryImage *
retrieveAMDGCNOrNVPTXKernelBinary(const DeviceImplPtr DeviceImpl,
const std::string &KernelName);
Seems dead, just removing.
@uditagarwal97, ping.
Refactored the `ProgramManager` to use `device_impl &` instead of `const device &`. See #18270 and #18251 that started the refactoring. Signed-off-by: Sergei Vinogradov <[email protected]>
…e_impl *` intel#18251 extended `device_impl`s' lifetimes until shutdown and intel#18270 started to pass devices as raw pointers in some of the APIs. This PR builds on top of that and extends the usage of raw pointers/references/`devices_range`: the devices are known to be alive, so the extra atomic increments of `std::shared_ptr` aren't necessary and can be avoided. This change mostly touches `device_image_impl` and `program_manager` and switches most of the APIs to use `devices_range`. A small number of other modifications are caused by these APIs' changes and are necessary to keep the code buildable.
…e_impl *` (#19459) #18251 extended `device_impl`s' lifetimes until shutdown and #18270 started to pass devices as raw pointers in some of the APIs. This PR builds on top of that and extends the usage of raw pointers/references/`devices_range`: the devices are known to be alive, so the extra atomic increments of `std::shared_ptr` aren't necessary and can be avoided. Since we change the type of `device_image_impl::MDevices`, other APIs in that class and in `program_manager` don't need to operate in terms of `sycl::device` or `std::shared_ptr<device_impl>` and we can switch them to use `devices_range` instead. A small number of other modifications are caused by these APIs' changes and are necessary to keep the code buildable. One extra change is the addition of a minor `devices_range::to<std::vector<ur_device_handle_t>>()` helper that we can use now that most of the arguments are `devices_range`. Technically, it could go in another PR, but then we'd just be modifying the exact same lines twice, so I decided to fuse it here.
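To illustrate the idea of a range with a `to<Container>()` conversion helper, here is a rough sketch. Everything in it is an assumption made for illustration: `devices_view`, `fake_handle_t`, the simplified `device_impl`, and `handle()` are hypothetical and do not reproduce the real `devices_range` or `ur_device_handle_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using fake_handle_t = int; // hypothetical stand-in for a native device handle

// Simplified stand-in for device_impl (assumption, not the real class).
struct device_impl {
  fake_handle_t Handle;
  fake_handle_t handle() const { return Handle; }
};

// Hypothetical non-owning view over raw device pointers. It is only valid
// while the underlying vector and the devices it points to stay alive.
class devices_view {
public:
  explicit devices_view(const std::vector<device_impl *> &Devs)
      : MBegin(Devs.data()), MEnd(Devs.data() + Devs.size()) {}

  device_impl *const *begin() const { return MBegin; }
  device_impl *const *end() const { return MEnd; }
  std::size_t size() const { return static_cast<std::size_t>(MEnd - MBegin); }

  // Collects each device's handle into a container that supports
  // reserve() and push_back(), such as std::vector.
  template <typename Container> Container to() const {
    Container Result;
    Result.reserve(size());
    for (device_impl *Dev : *this)
      Result.push_back(Dev->handle());
    return Result;
  }

private:
  device_impl *const *MBegin;
  device_impl *const *MEnd;
};

// Builds a two-device view and converts it to a vector of handles.
inline std::vector<fake_handle_t> demo_handles() {
  device_impl A{1}, B{2};
  std::vector<device_impl *> Devs{&A, &B};
  return devices_view{Devs}.to<std::vector<fake_handle_t>>();
}
```

A helper like this removes the repeated hand-written loops that copy handles out of a device list at each call site.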
…ce_impl *` intel#18251 extended `device_impl`s' lifetimes until shutdown and intel#18270 started to pass devices as raw pointers in some of the APIs. This PR builds on top of that and extends the usage of raw pointers/references/`devices_range`: the devices are known to be alive, so the extra atomic increments of `std::shared_ptr` aren't necessary and can be avoided. Since we change the type of `kernel_bundle_impl::MDevices`, other APIs in that class don't need to operate in terms of `sycl::device` or `std::shared_ptr<device_impl>` and we can switch them to use `devices_range` instead. A small number of other modifications are caused by these APIs' changes and are necessary to keep the code buildable.
…ce_impl *` (#19484) #18251 extended `device_impl`s' lifetimes until shutdown and #18270 started to pass devices as raw pointers in some of the APIs. This PR builds on top of that and extends the usage of raw pointers/references/`devices_range`: the devices are known to be alive, so the extra atomic increments of `std::shared_ptr` aren't necessary and can be avoided. Since we change the type of `kernel_bundle_impl::MDevices`, other APIs in that class don't need to operate in terms of `sycl::device` or `std::shared_ptr<device_impl>` and we can switch them to use `devices_range` instead. A small number of other modifications are caused by these APIs' changes and are necessary to keep the code buildable.