Commit ef90c3c

[AArch64][GlobalISel] Legalize more CTPOP vector types.
Similar to other operations, vectors with s8, s16, s32 and s64 elements are clamped to legal vector sizes, odd numbers of elements are widened to the next power of 2, and s128 is scalarized. This helps legalize cttz as well as ctpop.
1 parent 0e6ea09 commit ef90c3c

4 files changed, +866 -427 lines changed

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp

Lines changed: 1 addition & 0 deletions
@@ -6140,6 +6140,7 @@ LegalizerHelper::moreElementsVector(MachineInstr &MI, unsigned TypeIdx,
   case TargetOpcode::G_SEXT_INREG:
   case TargetOpcode::G_ABS:
   case TargetOpcode::G_CTLZ:
+  case TargetOpcode::G_CTPOP:
     if (TypeIdx != 0)
       return UnableToLegalize;
     Observer.changingInstr(MI);
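
Adding G_CTPOP to this case group is sound because population count works on each lane independently, so extra lanes cannot affect the original ones. The following is a minimal standalone sketch of that idea, not LLVM code: `ctpopLanes`, `ctpopWidened` and the `Pad` value are hypothetical names introduced here, and the padding value stands in for the undefined lanes a widened vector would presumably carry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Population count of a single 32-bit lane (clears the lowest set bit per step).
static uint32_t popcount32(uint32_t V) {
  uint32_t Count = 0;
  while (V) {
    V &= V - 1;
    ++Count;
  }
  return Count;
}

// Lanewise ctpop on a vector of 32-bit elements.
std::vector<uint32_t> ctpopLanes(const std::vector<uint32_t> &In) {
  std::vector<uint32_t> Out;
  Out.reserve(In.size());
  for (uint32_t Lane : In)
    Out.push_back(popcount32(Lane));
  return Out;
}

// Hypothetical model of "more elements": pad the input to WideLanes lanes,
// run ctpop on the wide vector, then keep only the original lanes.
// Pad is an arbitrary stand-in for undefined padding lanes.
std::vector<uint32_t> ctpopWidened(const std::vector<uint32_t> &In,
                                   std::size_t WideLanes, uint32_t Pad) {
  assert(WideLanes >= In.size() && "widening must not drop lanes");
  std::vector<uint32_t> Wide(In);
  Wide.resize(WideLanes, Pad);
  std::vector<uint32_t> Out = ctpopLanes(Wide);
  Out.resize(In.size()); // discard the results of the padding lanes
  return Out;
}
```

Whatever value the padding lanes hold, `ctpopWidened` returns the same result as `ctpopLanes` on the unpadded input, which is the property the widening relies on.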

llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp

Lines changed: 7 additions & 1 deletion
@@ -323,7 +323,13 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
       .clampScalar(0, s32, s128)
       .widenScalarToNextPow2(0)
       .minScalarEltSameAsIf(always, 1, 0)
-      .maxScalarEltSameAsIf(always, 1, 0);
+      .maxScalarEltSameAsIf(always, 1, 0)
+      .clampNumElements(0, v8s8, v16s8)
+      .clampNumElements(0, v4s16, v8s16)
+      .clampNumElements(0, v2s32, v4s32)
+      .clampNumElements(0, v2s64, v2s64)
+      .moreElementsToNextPow2(0)
+      .scalarizeIf(scalarOrEltWiderThan(0, 64), 0);
 
   getActionDefinitionsBuilder(G_CTLZ)
       .legalFor({{s32, s32},
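
To make the effect of the six added rules easier to follow, here is a rough standalone model of which rule would fire first for a given vector type on type index 0. It is a sketch under assumptions, not the legalizer itself: `VecTy`, `Action`, `Step` and `firstMatchingRule` are hypothetical names, the rules earlier in the chain (including whatever is already legal) are ignored, and the leftover pieces produced when a vector is split are not tracked.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type <NumElts x sEltBits>.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
};

enum class Action { None, MoreElements, FewerElements, Scalarize };

struct Step {
  Action Act;
  VecTy NewTy; // type the matching rule asks for on type index 0
};

static unsigned nextPow2(unsigned N) {
  unsigned P = 1;
  while (P < N)
    P *= 2;
  return P;
}

// Returns the first of the added rules that would apply to Ty, in the
// order they appear in the rule chain.
Step firstMatchingRule(VecTy Ty) {
  assert(Ty.NumElts >= 2 && "this model covers vector types only");

  // clampNumElements(0, vMin, vMax): only applies to the matching element size.
  struct Clamp {
    unsigned EltBits, Min, Max;
  };
  const Clamp Clamps[] = {{8, 8, 16}, {16, 4, 8}, {32, 2, 4}, {64, 2, 2}};
  for (const Clamp &C : Clamps) {
    if (Ty.EltBits != C.EltBits)
      continue;
    if (Ty.NumElts < C.Min)
      return {Action::MoreElements, {C.Min, Ty.EltBits}};
    if (Ty.NumElts > C.Max)
      return {Action::FewerElements, {C.Max, Ty.EltBits}};
  }

  // moreElementsToNextPow2(0): round a non-power-of-2 lane count up.
  unsigned Pow2 = nextPow2(Ty.NumElts);
  if (Pow2 != Ty.NumElts)
    return {Action::MoreElements, {Pow2, Ty.EltBits}};

  // scalarizeIf(scalarOrEltWiderThan(0, 64), 0): elements wider than 64 bits.
  if (Ty.EltBits > 64)
    return {Action::Scalarize, {1, Ty.EltBits}};

  return {Action::None, Ty};
}
```

Under this model a 3 x s8 vector is widened to 8 x s8, a 32 x s8 vector is narrowed to 16 x s8, a 3 x s32 vector is widened to 4 x s32, and a 2 x s128 vector is scalarized. Since legalization repeats until nothing changes, a type such as 3 x s128 would first be widened to 4 x s128 and only scalarized on a later step.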
