Skip to content

[SYCL] Fix detection of opaque pointer aspect usage #9586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

steffenlarsen
Copy link
Contributor

When opaque pointers are enabled, the aspect propagation pass may not currently detect the use of types in calls to functions where the type is passed as an opaque pointer. In these cases, the type can usually be detected through the source type in GEP instructions inside the corresponding functions. A common example of this is the constructors for classes marked as using aspects.

This commit fixes the detection of usage inside functions using opaque pointer arguments by extracting information about the source type from GEP instructions inside the functions.

When opaque pointers are enabled, the aspect propagation pass may not
currently detect the use of types in calls to functions where the type
is passed as an opaque pointer. In these cases, the type can usually be
detected through the source type in GEP instructions inside the
corresponding functions. A common example of this is the constructors
for classes marked as using aspects.

This commit fixes the detection of usage inside functions using opaque
pointer arguments by extracting information about the source type from
GEP instructions inside the functions.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen steffenlarsen requested review from a team as code owners May 24, 2023 11:56
@steffenlarsen steffenlarsen temporarily deployed to aws May 24, 2023 12:23 — with GitHub Actions Inactive
@steffenlarsen steffenlarsen temporarily deployed to aws May 24, 2023 12:59 — with GitHub Actions Inactive
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen steffenlarsen temporarily deployed to aws May 24, 2023 16:55 — with GitHub Actions Inactive
Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this implementation is 100% reliable, but the SYCL RT test change is fine.

@steffenlarsen steffenlarsen temporarily deployed to aws May 24, 2023 17:41 — with GitHub Actions Inactive
@steffenlarsen steffenlarsen temporarily deployed to aws May 25, 2023 08:38 — with GitHub Actions Inactive
@steffenlarsen steffenlarsen temporarily deployed to aws May 25, 2023 09:16 — with GitHub Actions Inactive
@steffenlarsen steffenlarsen merged commit 7ec4b49 into intel:sycl May 25, 2023
@jgstarIntel
Copy link
Contributor

Not sure if this implementation is 100% reliable, but the SYCL RT test change is fine.

@aelovikov-intel @steffenlarsen Can you please expand on the above comment?

The implementation & testing looks good to me and seems good enough to now allow opaque pointers to be used in default compiler mode for device code.
Is there a way this solution would need to generalize or are there other know relevant cases, that may need to be accounted for (perhaps down the line) ?

@aelovikov-intel
Copy link
Contributor

aelovikov-intel commented May 25, 2023

@aelovikov-intel @steffenlarsen Can you please expand on the above comment?

From the point of LangRef (AFAIK, but I haven't re-read it in a while)

%a = getelementptr inbounds %struct.StructWithAspect, ptr addrspace(4) %this1, i32 0, i32 0

is exactly the same as

%a = getelementptr inbounds i8, ptr addrspace(4) %this1, i32 %<byte offset of the field>

and can be substituted with it prior to the aspects propagation pass (unless we have some very strict restrictions on the placement of the latter in the optimization pipeline, which would be weird by itself).

@steffenlarsen
Copy link
Contributor Author

In the example @aelovikov-intel mentions, I don't believe there is a way for us to actually determine the usage. However, I don't think that has to be a problem.

The problem this PR fixes is mainly for cases where a function GEPs directly into a type that has some implicit requirements. With opaque pointers enabled, this happens for certain ctors. However, strictly speaking we don't need to know for these ctors because further up the call-chain we should know the type, e.g. through an alloca or similar. The reason why we would want to fix it here is to have as much knowledge of the call-chain causing a requirement as possible, all the way down to the ctor in this case.

@AlexeySachkov
Copy link
Contributor

@aelovikov-intel,

and can be substituted with it prior to the aspects propagation pass (unless we have some very strict restrictions on the placement of the latter in the optimization pipeline, which would be weird by itself).

The pass was added to implement functionality required by 5.7. Optional kernel features and in particular, the pass gathers list of aspects used by each kernel so that later at runtime we could check if used aspects match aspects supported by a target device.

SYCL spec does not provide any definition as to what is considered to be an aspect use, i.e. several implementations are possible:

  • one which is closer to analysis of an optimized code, i.e. if you have double variable in a kernel, but it got optimized out, then we don't list fp64 aspect as a used
  • one which is closer to original source code analysis: if you have double variable in a kernel, then we consider fp64 aspect to be used regardless of whether that variable was optimized out

The first option is a bit "unstable", because it may produce different result based on different optimization levels. It can theoretically change the result even if you simply restructure you code, making some optimizations impossible. Therefore, we decided to go with the second option, to make decisions made by our toolchain more predictable and transparent for end users.

As a result, we try to launch the pass as early as possible to achieve that, from design doc:

It is important that this IR phase runs before any other optimization phase that might eliminate a reference to a type or inline a function call because such optimizations will cause us to miss information about aspects that are used. Therefore, it is recommended that this new phase run first, before all other IR phases.

Please note that this is not a requirement of a pass itself (it would run correctly in any position within a pipeline), but rather a next-level requirement from us to perform this analysis at certain point in the pipeline to get certain results.

However, we faced user experience issues with such approach: in C++ floating-point literals have double type, unless they have f suffix. This makes any constant like constexpr auto PI = 3.14; (and something like this is defined within math.h) to have double type and even if it is only used as a float everywhere else, we still say that fp64 is used. This was proved to be very annoying for end users and for fp64 aspect we decided to "relax" our analysis, i.e. it is performed at a later stage, after optimizations, see #8681

@jgstarIntel

Is there a way this solution would need to generalize or are there other know relevant cases, that may need to be accounted for (perhaps down the line) ?

I doubt that we can generalize here. The reason is that some instructions are a bit unique compared to others and therefore there will always be a few exceptions on top of "go over each operand of the instruction to get its type". At the moment other such cases are not known or they would have been added already (in this or other PRs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants