Open
Labels: compiler:simd (instruction-level vectorization), performance (Must go faster)
Description
From a discussion on Slack with @gbaraldi and @haampie, we think that LLVM is not aware that we align arrays to 16 bytes. This results in less efficient code in SIMDable loops over such arrays: in the case I observed, the generated code used lots of unaligned loads and stores, even for properly SIMD-aligned views into such arrays.
I don't have a small MWE at hand, sadly, but I'll try to get one. @gbaraldi mentioned that we should be able to teach LLVM about our alignment, hence this issue. What spurred the discussion was me trying to figure out why there were lots of vinsertps instructions in the code, which suggested lots of misaligned accesses.
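To illustrate what "teaching the compiler about alignment" means in general, here is a minimal C sketch. It is not the missing MWE and not how Julia's codegen works. It only shows the kind of promise a frontend can make to the optimizer. The function name `add_aligned16` is hypothetical. `__builtin_assume_aligned` is a GCC/Clang builtin, and whether it changes the emitted instructions depends on the target and optimization flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: adds src into dst element-wise, promising the
   compiler that both pointers are 16-byte aligned. */
void add_aligned16(float *restrict dst, const float *restrict src, size_t n)
{
    /* Debug-only sanity check: the promise below is undefined behaviour
       if the pointers are not actually 16-byte aligned. */
    assert(((uintptr_t)dst % 16) == 0);
    assert(((uintptr_t)src % 16) == 0);

    /* Hand the alignment assumption to the optimizer. */
    float *d = __builtin_assume_aligned(dst, 16);
    const float *s = __builtin_assume_aligned(src, 16);

    /* A simple SIMDable loop; with the assumption in place the vectorizer
       is allowed to use aligned loads and stores for d and s. */
    for (size_t i = 0; i < n; i++)
        d[i] += s[i];
}
```

Comparing the assembly with and without the two `__builtin_assume_aligned` lines (for example with `clang -O3 -S`) is one way to see whether the alignment information reaches the vectorizer.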