Description
| | |
| --- | --- |
| Bugzilla Link | 35520 |
| Resolution | FIXED |
| Resolved on | Jan 20, 2018 08:10 |
| Version | trunk |
| OS | Linux |
| Blocks | #30613 |
| CC | @efriedma-quic,@hfinkel,@bogner,@RKSimon,@arsenm,@JonPsson,@rotateright |
Extended Description
The following simple test case generates completely broken code on SystemZ (run with -mcpu=z13 to enable the vector ISA):

    define i16 @test(<16 x i1> %src)
    {
      %res = bitcast <16 x i1> %src to i16
      ret i16 %res
    }

What happens is that in the ABI, the <16 x i1> argument is passed as <16 x i8>, so the code needs to truncate each element to form a real 16-bit vector, and then reinterpret the result as i16. For some reason, the code generator attempts to implement this as a truncating store of the <16 x i8> in a register to a <16 x i1> in memory, followed by a load of that memory location as i16.
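To make the expected result concrete, here is a small Python model of what the truncate-and-reinterpret step should compute. It is only a sketch, not LLVM code: `pack_i1_vector` and its `big_endian` parameter are hypothetical names, and the bit order assumes the LangRef rule that element 0 lands in the most significant bit on big-endian targets such as SystemZ (and in the least significant bit on little-endian ones).

```python
def pack_i1_vector(lanes, big_endian=True):
    """Model `bitcast <N x i1> to iN` for lanes that arrive as wider integers."""
    n = len(lanes)
    result = 0
    for index, lane in enumerate(lanes):
        bit = lane & 1  # truncating an i8 lane to i1 keeps only the low bit
        # Assumption (per LangRef): element 0 is the MSB on big-endian targets.
        shift = (n - 1 - index) if big_endian else index
        result |= bit << shift
    return result


# 16 lanes as they would arrive in a <16 x i8> register.
lanes = [1, 0, 1, 1] + [0] * 12
print(hex(pack_i1_vector(lanes)))                    # 0xb000
print(hex(pack_i1_vector(lanes, big_endian=False)))  # 0xd
```

Every one of the 16 lanes contributes exactly one bit to the result, which is the property the generated code has to preserve.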
So far still OK, but the truncating store generates completely broken code: it simply repeatedly stores all 16 bytes of the source vector to the same byte in memory. Looking for the code that does this, I found this in VectorLegalizer::ExpandStore (added by arsenm):

    // FIXME: This is completely broken and inconsistent with ExpandLoad
    // handling.
    // For sub-byte element sizes, this ends up with 0 stride between elements,
    // so the same element just gets re-written to the same location. There seem
    // to be tests explicitly testing for this broken behavior though. tests
    // for this broken behavior.

which in fact matches exactly what I'm seeing.
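The effect described in that comment can be reproduced with a simplified Python model. This is a sketch under assumptions, not the actual `ExpandStore` code: the function name is hypothetical, the stride is assumed to be the element bit width integer-divided by 8 (as the comment implies), each scalar store is assumed to write one byte, and the model only covers element widths up to 8 bits.

```python
def expand_truncating_store(lanes, elem_bits, memory):
    """Store each lane separately, the way an element-wise expansion would."""
    assert 1 <= elem_bits <= 8, "this model only covers sub-byte and byte lanes"
    stride = elem_bits // 8          # assumption: byte stride; 0 for i1 lanes
    mask = (1 << elem_bits) - 1
    for index, lane in enumerate(lanes):
        # With stride 0 every lane is written to memory[0].
        memory[index * stride] = lane & mask
    return stride


lanes = [1, 0, 1, 1] + [0] * 11 + [1]

# Truncating <16 x i8> to <16 x i1>: only the last lane survives, in byte 0.
slot = bytearray(2)                  # the 16-bit slot that is later loaded as i16
stride = expand_truncating_store(lanes, 1, slot)
print(stride, bytes(slot))           # 0 b'\x01\x00'

# For comparison, byte-sized lanes get a stride of 1 and distinct addresses.
wide = bytearray(16)
expand_truncating_store(lanes, 8, wide)
print(list(wide) == lanes)           # True
```

In the sub-byte case the second byte of the slot is never written at all, so the subsequent i16 load cannot return the packed vector.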
What's going on here? Does this not happen on other platforms? Should I be doing something different in the back-end, or should we try to fix this common code issue after all? Any suggestions welcome ...