Closed
Labels
llvm:analysis (Includes value tracking, cost tables and constant folding)
Description
Flang can't vectorize the loop in s113 of TSVC, while Clang can vectorize the equivalent loop written in C.
! Fortran version
      do 1 nl = 1,ntimes
      do 10 i = 2,n
        a(i) = a(1) + b(i)
   10 continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
    1 continue
// C version
for (int nl = 0; nl < ntimes; nl++) {
  for (int i = 1; i < n; i++) {
    a[i] = a[0] + b[i];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
}
$ flang-new -v -Ofast s113.f -S -Rpass=licm\|vector -falias-analysis
flang-new version 18.0.0 (https://github.com/llvm/llvm-project.git 1c1227846425883a3d39ff56700660236a97152c)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Selected GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/path/to/install/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu generic -target-feature +neon -target-feature +v8a -fstack-arrays -fversion-loops-for-stride -falias-analysis -Rpass=vector -O3 -o s113.s -x f95-cpp-input s113.f
$ clang -Ofast s113.c -Rpass=licm\|vector
/path/to/s113.c:16:11: remark: hoisting load [-Rpass=licm]
16 | a[i] = a[0] + b[i];
| ^
/path/to/s113.c:15:3: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
15 | for (int i = 1; i < LEN; i++) {
| ^
The problem can be reproduced with the following C code, which is essentially the same program as the C version above.
// C version
for (int nl = 0; nl < ntimes; nl++) {
  for (int i = 2; i <= n; i++) {
    a[i-1] = a[0] + b[i-1];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
}
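In fact, Flang generates LLVM IR that corresponds to this C code.
Vectorization depends on LICM hoisting the load of a[0] out of the loop first, because LoopAccessAnalysis can't analyze the a[0] access correctly while it remains inside the loop.
LICM appears to fail here because of the linear expressions (such as i-1) in the array indices.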
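To make the transformation in question concrete, here is a minimal sketch of the same inner loop with the a[0] load hoisted by hand, next to the i-1 form from the reproducer. The function names (s113_offset, s113_hoisted, s113_versions_agree) are hypothetical and only for illustration; this is not a proposed fix, just a picture of the result LICM would be expected to produce, assuming a and b do not overlap.

```c
/* Loop shape from the reproducer: induction variable starts at 2,
   arrays are indexed with i-1. The load of a[0] stays inside the loop. */
void s113_offset(float *a, const float *b, int n) {
    for (int i = 2; i <= n; i++) {
        a[i - 1] = a[0] + b[i - 1];
    }
}

/* Same loop with the invariant load of a[0] hoisted by hand.
   This is valid because the loop only writes a[1] .. a[n-1], never a[0]
   (assuming a and b do not overlap). */
void s113_hoisted(float *a, const float *b, int n) {
    const float a0 = a[0];
    for (int i = 2; i <= n; i++) {
        a[i - 1] = a0 + b[i - 1];
    }
}

/* Hypothetical helper: returns 1 if both versions produce identical arrays
   on a small test input, 0 otherwise. */
int s113_versions_agree(void) {
    enum { N = 64 };
    float a1[N], a2[N], b[N];
    for (int i = 0; i < N; i++) {
        a1[i] = a2[i] = (float)i + 1.0f;
        b[i] = 0.5f * (float)i;
    }
    s113_offset(a1, b, N);
    s113_hoisted(a2, b, N);
    for (int i = 0; i < N; i++) {
        if (a1[i] != a2[i]) {
            return 0;
        }
    }
    return 1;
}
```

Calling s113_versions_agree() from a small main is enough to check that the two forms compute the same values. Compiling each function separately with -Rpass=licm\|vector, as in the commands above, is one way to compare how the optimizer treats them.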