Skip to content

Conversation

@steffenlarsen
Copy link
Contributor

This commit adds the ability for doing "fast linking" of kernel bundles. Fast linking lets the implementation use AOT binaries from the underlying SYCLBIN files to dynamically link the images in the kernel bundles.

@steffenlarsen steffenlarsen force-pushed the steffen/dynamic_linking branch from 84ffa78 to b8c44d9 Compare September 23, 2025 14:34
@steffenlarsen steffenlarsen force-pushed the steffen/dynamic_linking branch from b8c44d9 to cc05fe9 Compare September 23, 2025 15:05
@steffenlarsen steffenlarsen force-pushed the steffen/dynamic_linking branch from cc05fe9 to aa08b13 Compare September 29, 2025 16:08
@steffenlarsen steffenlarsen force-pushed the steffen/dynamic_linking branch from aa08b13 to f84e78e Compare October 28, 2025 08:40
This commit adds the ability for doing "fast linking" of kernel bundles.
Fast linking lets the implementation use AOT binaries from the
underlying SYCLBIN files to dynamically link the images in the kernel
bundles.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen steffenlarsen force-pushed the steffen/dynamic_linking branch from f84e78e to 6fe8e6b Compare October 28, 2025 10:45
@steffenlarsen steffenlarsen changed the title [SYCL][UR][Docs] Add fast-link option for SYCLBIN [SYCL] Add fast-link option for SYCLBIN Oct 28, 2025
@steffenlarsen steffenlarsen marked this pull request as ready for review October 29, 2025 10:38
@steffenlarsen steffenlarsen requested a review from a team as a code owner October 29, 2025 10:38
@steffenlarsen
Copy link
Contributor Author

@intel/llvm-reviewers-runtime - This is ready for review! 🎉

@steffenlarsen
Copy link
Contributor Author

@intel/llvm-reviewers-runtime - Friendly ping.

1 similar comment
@steffenlarsen
Copy link
Contributor Author

@intel/llvm-reviewers-runtime - Friendly ping.


namespace detail {
__SYCL_EXPORT std::shared_ptr<detail::kernel_bundle_impl>
link_impl(const std::vector<kernel_bundle<bundle_state::object>> &ObjectBundles,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like the idea of using the specific container (std::vector in that case) when we need to pass the range of objects to the ABI entry point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! It was copy-pasted from the existing link_impl without further thought, but I agree that passing ranges here is better. I decided to go with C-style pointer + size arguments to avoid sycl::span becoming a part of the ABI for this. It is converted to a span in the library however, to simplify the usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we already use sycl::span as the ABI entry-point parameter. Maybe we should use it here as well for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. We could, but I prefer C-style arrays to make the ABI breaking surface smaller once we have the actual std::span.

Signed-off-by: Larsen, Steffen <[email protected]>
Comment on lines +56 to +57
return createSyclObjFromImpl<kernel_bundle<bundle_state::executable>>(
std::move(Impl));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason to do this here instead of inside libsycl.so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is templated, so we can't move all of link_common. If you mean change the return type of link_impl, it is currently a parallel (overload) to another link_impl, so it was chosen for consistency's sake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean change the return type of link_impl

Yes, I think most of ABI interfaces should operate in terms of real SYCL objects.

@@ -0,0 +1,39 @@
//==-------- syclbin_properties.hpp - SYCLBIN and tooling properties -------==//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, this part of the style guide has been dropped (having file name present in the very first line).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean it should be

Suggested change
//==-------- syclbin_properties.hpp - SYCLBIN and tooling properties -------==//
//==-------------------- SYCLBIN and tooling properties --------------------==//

or should the header be dropped entirely?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's usually

//==-----------------------------------------------------------------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
/// \file <Description>
///

or something like that now.

Comment on lines +416 to +417
// Collect all unique images.
std::vector<device_image_plain> DevImages;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm that else part is just indentation change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is indeed the intention! 😄


std::vector<const RTDeviceBinaryImage *> Result;
for (auto &SYCLBIN : MSYCLBINs) {
auto NativeBinImgs = SYCLBIN->getNativeBinaryImages(Dev);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use explicit type, it's not clear if there is a redundant copy or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does indeed do some redundant copying. I've changed it to use move iterators here as well.

Comment on lines 1 to 8
//==-------------- link_input.cpp --- SYCLBIN extension tests --------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do that in tests, AFAIK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the surrounding tests have it, but I don't mind either way. I can go remove it in the others after addressing the comments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#20626 removes the others.

Signed-off-by: Larsen, Steffen <[email protected]>
ext::oneapi::experimental::detail::link_props, PropertyListT>>>
kernel_bundle<bundle_state::executable>
link_common(const kernel_bundle<bundle_state::object> *ObjectBundles,
size_t NumObjectBundles, const std::vector<device> &Devs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to throw an exception if size is zero?

Comment on lines +66 to +67
// Create a map between exported symbols and their indices in the device
// images collection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs more clarification that LinkGraph expects the dependencies graph in terms of indices. Also being discussed offline on how to make it more readable.

One option is to add LinkGraph::addEdge, another is somethink like

// Format expected by the LinkGraph's ctor:
std::vector<..> Dependencies;
auto AddEdge = [&Dependencies](...) { ... };

Comment on lines +337 to +339
// When doing fast-linking, we insert the suitable AOT binaries from the
// object bundles. This needs to be done per-device, as AOT binaries may
// not be compatible across different architectures.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know the detail, but any chance the comment could be lifter to right before if (FastLink) to describe similarities/differences between two paths?

Comment on lines +345 to +346
const std::shared_ptr<detail::kernel_bundle_impl> &ObjectBundleImpl =
getSyclObjImpl(ObjectBundle);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const std::shared_ptr<detail::kernel_bundle_impl> &ObjectBundleImpl =
getSyclObjImpl(ObjectBundle);
detail::kernel_bundle_impl>&ObjectBundleImpl = *getSyclObjImpl(ObjectBundle);

Comment on lines +388 to +389
// If any of the exported symbols overlap with an AOT binary, skip
// this image.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add extra comment describing when that may happen and why? Same below.

Comment on lines +526 to +527
sycl::span<const device_image_plain> AllDevImgsSpan(AllDevImgs.data(),
AllDevImgs.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this? span is constructible from std::vector: https://godbolt.org/z/bxGde7WPr, or rather from Container,

template <class _Container>
_SYCL_SPAN_INLINE_VISIBILITY constexpr span(
_Container &__c,
std::enable_if_t<__is_span_compatible_container<_Container, _Tp>::value,
std::nullptr_t> = nullptr)
: __data{std::data(__c)}, __size{(size_type)std::size(__c)} {}

const std::vector<kernel_bundle<bundle_state::object>> &ObjectBundles,
devices_range Devs, const property_list &PropList, private_tag)
kernel_bundle_impl(sycl::span<const kernel_bundle<bundle_state::object>,
sycl::dynamic_extent> &ObjectBundles,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop sycl::dynamic_extent.

link_impl(const std::vector<kernel_bundle<bundle_state::object>> &ObjectBundles,
const std::vector<device> &Devs, const property_list &PropList) {
return detail::kernel_bundle_impl::create(ObjectBundles, Devs, PropList);
sycl::span<const kernel_bundle<bundle_state::object>, sycl::dynamic_extent>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop sycl::dynamic_extent.

link_impl(const kernel_bundle<bundle_state::object> *ObjectBundles,
size_t NumObjectBundles, const std::vector<device> &Devs,
bool FastLink) {
sycl::span<const kernel_bundle<bundle_state::object>, sycl::dynamic_extent>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop sycl::dynamic_extent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants