-
Notifications
You must be signed in to change notification settings - Fork 2
[SYCL][Doc][RTC] Add device global queries #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Doc][RTC] Add device global queries #2
Conversation
Signed-off-by: Julian Oppermann <[email protected]>
`bundle_state::ext_oneapi_source` in the language `source_language::sycl`, and | ||
* it defines a device global whose name is `name` and which was declared without | ||
the `device_image_scope` property, and | ||
* `dev` is contained by the context associated with this bundle. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is dev
a parameter of this function? It doesn't seem to serve any purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the signature straight from the presentation you gave a while back 😇 But I agree, checking the device only makes sense if we also test if the device supports device globals. Playing around with the E2E tests for DGs, such a check does not seem to happen for code compiled with the normal pipeline.
I'll drop the parameter.
* `dev` is contained by the context associated with this bundle. | ||
|
||
`name` must be a {cpp} identifier that is valid for referencing the device | ||
global at the bottom of the source code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really true? Can you handle cases like:
- The device global is defined in a namespace, and
name
includes the namespace qualifier likemynamespace::var
? - The device global is defined in a namespace, the application defines a namespace alias, and
name
uses the namespace alias likemyalias::var
? - The device global is defined as a static member of a struct, and
name
includes the name of the struct likemystruct::var
?
I suspect these would be hard to implement for the same reasons we cannot just pass an arbitrary C++ expression for the address of a kernel function to ext_oneapi_get_kernel
, It seems like we would want to use registered_kernel_names
also for the names of device globals. (Maybe we should rename the property to registered_names
instead.) In this case, registered_names
would accept either:
- A C++ expression for a pointer to a kernel function, or
- A C++ expression for a pointer to a device global variable.
In both cases, the expression is conceptually compiled at the bottom of the source code.
Then, the name
parameter to ext_oneapi_has_device_global
could be either:
- The name of the device global if it was declared in the global namespace (i.e. the top-level namespace), or
- The name of a device global that was registered via
registered_names
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a reasonable set of (scoped) names that we could mangle on the fly, but I didn't consider aliases, inline namespaces etc.. I agree that to do this right, we should extend registered_kernel_names
to also support DG names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this would be quite a heavy-weight change, I'm proposing to limit device globals to be declared on global scope only in 83d2898. Happy to revert if you think that's too constrained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementing it in phases is OK, but I think we need to lift the restriction before we consider this extension "finished". Note that NVRTC allows global variables to be defined in namespaces, etc., and their nvrtcAddNameExpression
API allows registration of either kernel or global variable names. Therefore allowing device global names in registered_names
would be equivalent to what NVRTC has now.
If we plan to rename registered_kernel_names
to registered_names
, we should do that sooner rather than later. Better to do this before people start using the extension in ernest.
`bundle_state::ext_oneapi_source`. | ||
|
||
_Effects:_ If device memory for `name` has not been allocated at the time of | ||
this call, it will be allocated and zero-initialized synchronously. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a detail of our implementation. It would also be valid to allocate the device memory earlier (e.g. when the kernel bundle is compiled), or allocate it later (e.g. when the host first reads or writes to the address). Maybe we could instead add a Remarks: section like:
Remarks: The contents of the device global may be read or written from the host by reading from or writing to this address. If the address is read before any kernel writes to the device global, the read operation returns the device global's initial value.
this call, it will be allocated and zero-initialized synchronously. | ||
|
||
_Returns:_ Returns a USM pointer to the device global `name`'s storage on device | ||
`dev`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should say that this is a "device USM" pointer. That way application's know they cannot directly access it from the host.
The formatting is messed up here because of the "'s". I generally avoid pluralizing identifiers for stylistic reasons. I'd rewrite the sentence like:
Returns: A device USM pointer to the storage for the device global
name
on devicedev
.
_Constraints:_ This function is not available when `State` is | ||
`bundle_state::ext_oneapi_source`. | ||
|
||
_Returns:_ Returns the size in bytes of device global `name`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of a nit, but the size is the size of the device global's storage, which is not the same as sizeof(name)
. I'd say:
Returns: The size in bytes of the USM storage for device global
name
.
(BTW, notice that we do not repeat the word "Returns" in the description of the return value.)
[source,c++] | ||
---- | ||
size_t ext_oneapi_get_device_global_size(const std::string &name, | ||
const device &dev) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the dev
parameter? Is the size different on different devices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, see discussion on ext_oneapi_has_device_global
. I'll drop the parameter.
…6386) The overloads for single_task and parallel_for in the sycl_ext_oneapi_kernel_properties extension are being deprecated as mentioned in intel#14785. So I'm rewriting tests containg such overloads so that they can still run after the deprecation. --------- Signed-off-by: Hu, Peisen <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Superseded by intel#17401. |
Adds three new methods on
kernel_bundle
to interact with device globals defined in runtime-compiled code: