Skip to content

Conversation

al42and
Copy link
Contributor

@al42and al42and commented Sep 21, 2023

Don't throw ICE when an unknown device is specified explicitly via -Xsycl-target-backend --offload-arch=. We don't enable macros or other niceties from sycl_ext_oneapi_device_architecture, but at least the code compiles.

Fixes #8112, #11203, #12010

Don't throw ICE when an unknown device is specified explicitly via
`-Xsycl-target-backend --offload-arch=`. We don't enable macros or other
niceties from sycl_ext_oneapi_device_architecture, but at least the code
compiles.

Fixes intel#8112, intel#11203
@al42and al42and requested a review from a team as a code owner September 21, 2023 14:59
@al42and al42and temporarily deployed to WindowsCILock September 21, 2023 19:57 — with GitHub Actions Inactive
@al42and
Copy link
Contributor Author

al42and commented Sep 22, 2023

I am not sure how the CI failures could be related to this change. Are they known to be flaky?

Triple.isNVPTX() || Triple.isAMDGCN()) {
StringRef Device = JA.getOffloadingArch();
if (!Device.empty()) {
if (!Device.empty() && !SYCL::gen::getGenDeviceMacro(Device).empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a test for this circumstance? Also, would it be reasonable to emit some kind of diagnostic that the macro isn't being generated due to an issue with the device value?

Copy link
Contributor Author

@al42and al42and Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a test for this circumstance?

Could you advise on the best place to put a test? clang/test/Driver/sycl-oneapi-gpu.cpp?

Also, would it be reasonable to emit some kind of diagnostic that the macro isn't being generated due to an issue with the device value?

I don't think macro alone warrants this. It's prone to false positives (what if the code don't use these macros at all?), and occurs at the wrong place (it's not the problem of the user that they have gfx1036, it's the problem of the developer who wrote code relying a macro __SYCL_TARGET_AMD_GPU_GFX1036__). Warning about unsupported macros in the code (__SYCL_TARGET_* will never be set) independently of the target architecture could will be better: it will be emitted if and only if there is an actual bug in the code, but this is already covered by -Wundef (since, if the macro is supported, it will be defined to 0, https://github.com/intel/llvm/blob/de92299c2c09d626dcbc633c83e259d828220d03/sycl/doc/design/DeviceIf.md).

We could, more generally, warn about "unsupported" devices, the way Clang warns about "unsupported" CUDA versions (-Wunknown-cuda-version). I don't think it fits the scope of this PR, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @al42and . If you read the DeviceIf.md, it is quite clear IMO:

"These macros are an internal implementation detail, so they should not be documented to users, and user code should not make use of them."

@al42and I think the answer to your question is that the appropriate place to place such a test is in clang/test/Driver: I guess you just want something similar to this: https://github.com/JackAKirk/llvm/blob/2b4b45af5a1ae0d23bcb632cc4588faecedc1956/clang/test/Driver/sycl-fno-libspirv-warn.cpp

except you want to check there are no warnings or errors when compiling for a device that e.g doesn't have a macro assigned but is supported by the driver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except you want to check there are no warnings or errors when compiling for a device that e.g doesn't have a macro assigned but is supported by the driver.

That's going to be an unreliable test and in need of regular updates. Hopefully, such architectures will eventually get assigned a flag :)

I think a better test is to check that when compiling for gfx0000 or sm_00 we get a reasonable error instead of an ICE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better test is to check that when compiling for gfx0000 or sm_00 we get a reasonable error instead of an ICE.

Unfortunately, the validity of the architecture is checked before the broken compiler pass is invoked, so this idea will not work.

Instead, I'm using an old architecture (e.g., sm_30), that is valid but is not and will not be supported.

Copy link
Contributor Author

@al42and al42and Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like NVIDIA device libraries are not as reliable as I thought they are. Changed the test to use an old AMD architecture (gfx600, supported by LLVM, not listed in clang/lib/Driver/ToolChains/SYCL.cpp) instead.

Copy link
Contributor Author

@al42and al42and Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are harder than I expected 😅 Hopefully I'll get "Build + LIT" to pass this time with 081d821.

The timeout of Basic/vector/int-convert.cpp in e2e / Intel GEN12 Graphics with Level Zero is unrelated: #12011

@bader bader requested review from JackAKirk and mdtoguchi November 27, 2023 19:05
@againull againull merged commit 5a51c0a into intel:sycl Nov 28, 2023
@al42and al42and deleted the fix-8112 branch November 29, 2023 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow specifying arbitrary target architecture via backend flags

4 participants