Open
Labels: TBAA (Type-Based Alias Analysis / Strict Aliasing), clang:codegen (IR generation bugs: mangling, exceptions, etc.), loopoptim, vectorization
Description
void foo(int *A, int *B, int *C, int *D, int len) {
  auto func = [&]() {
    for (int i = 0; i < len; i++)
      A[i] = B[i] * C[i] + D[i];
  };
  func();
}
$ clang++ lambda.cpp -O3 -c -S -emit-llvm -fno-unroll-loops -fno-inline
The issue is that the lambda is outlined into a separate function, and because "len" is captured by reference it is loaded inside the loop on every iteration. Because of this load, the loop bounds cannot be determined (the compiler is probably assuming that the stores in the loop may alias the load of "len").
The original problem is in Geekbench 6.1, src/geekbench/ml/backend/cpu/depthwise_convolution_2d.cpp, line 402. The flags "-fno-unroll-loops -fno-inline" are added to the reproducer above because, for a function this small, the lambda would otherwise be inlined and the loop vectorizes without any problem. In Geekbench the lambda is not inlined but outlined, and hence the loop fails to vectorize. A good performance gain is expected in Geekbench if this loop is vectorized.