Commit f324e44

[LIBCLC][AMDGCN] Fix get_max_sub_group_size
Using preprocessor defines to determine the wavefront size is incorrect here because libclc is not built for a specific amdgcn version, so the value always defaults to `64`. Instead, use the `__oclc_wavefrontsize64` global variable provided by ROCm, which is set to a different value depending on the architecture.
1 parent 0a1e6d9 commit f324e44


libclc/amdgcn-amdhsa/libspirv/workitem/get_max_sub_group_size.cl

Lines changed: 10 additions & 15 deletions
@@ -8,21 +8,16 @@
 
 #include <spirv/spirv.h>
 
-// FIXME: Remove the following workaround once the clang change is released.
-// This is for backward compatibility with older clang which does not define
-// __AMDGCN_WAVEFRONT_SIZE. It does not consider -mwavefrontsize64.
-// See:
-// https://github.com/intel/llvm/blob/sycl/clang/lib/Basic/Targets/AMDGPU.h#L414
-// and:
-// https://github.com/intel/llvm/blob/sycl/clang/lib/Basic/Targets/AMDGPU.cpp#L421
-#ifndef __AMDGCN_WAVEFRONT_SIZE
-#if __gfx1010__ || __gfx1011__ || __gfx1012__ || __gfx1030__ || __gfx1031__
-#define __AMDGCN_WAVEFRONT_SIZE 32
-#else
-#define __AMDGCN_WAVEFRONT_SIZE 64
-#endif
-#endif
+// The clang driver will define this variable depending on the architecture and
+// compile flags by linking in ROCm bitcode defining it to true or false. If
+// it's 1 the wavefront size used is 64, if it's 0 the wavefront size used is
+// 32.
+extern constant unsigned char __oclc_wavefrontsize64;
 
 _CLC_DEF _CLC_OVERLOAD uint __spirv_SubgroupMaxSize() {
-  return __AMDGCN_WAVEFRONT_SIZE;
+  if (__oclc_wavefrontsize64 == 1) {
+    return 64;
+  } else {
+    return 32;
+  }
 }
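
To illustrate why the macro-based version always reported 64, here is a minimal host-side sketch in plain C. It is a model of the idea, not libclc code: `MODEL_WAVEFRONT_SIZE`, `old_max_sub_group_size`, and `new_max_sub_group_size` are hypothetical names, and the `wavefrontsize64` parameter stands in for the `__oclc_wavefrontsize64` variable that the ROCm bitcode provides at link time.

```c
/* Hypothetical host-side model of the change; names are illustrative only. */

/* Old approach: the size is chosen by the preprocessor when the library
 * itself is compiled. A generic build defines none of the __gfx*__ macros,
 * so the #else branch is always taken and the result is fixed at 64. */
#if defined(__gfx1010__) || defined(__gfx1011__) || defined(__gfx1012__) || \
    defined(__gfx1030__) || defined(__gfx1031__)
#define MODEL_WAVEFRONT_SIZE 32
#else
#define MODEL_WAVEFRONT_SIZE 64
#endif

unsigned old_max_sub_group_size(void) {
  return MODEL_WAVEFRONT_SIZE;
}

/* New approach: the decision depends on a value supplied after the library
 * was built. The parameter models the linked-in flag (1 means wave64,
 * 0 means wave32). */
unsigned new_max_sub_group_size(unsigned char wavefrontsize64) {
  if (wavefrontsize64 == 1) {
    return 64;
  } else {
    return 32;
  }
}
```

Under these assumptions, `old_max_sub_group_size()` cannot return anything other than 64 in a generic build, while `new_max_sub_group_size(0)` returns 32 and `new_max_sub_group_size(1)` returns 64, which mirrors the behaviour the commit message describes.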
