[IR] Add new CreateVectorInterleave interface #150931
Conversation
This PR adds a new interface to IRBuilder called CreateVectorInterleave, which can be used to create vector.interleave intrinsics of factors 2-8. I've also added a new interface to the Folder called FoldVectorInterleave, which can spot when every operand is the same splat and instead return a new splat of the appropriate size. For convenience I have also moved getInterleaveIntrinsicID and getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp, where they can be used by IRBuilder.
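For illustration, here is a minimal sketch of how the new interface could be used. The helper and value names are invented for this example; only the CreateVectorInterleave call and the splat-folding behaviour described above come from the patch itself.

#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Sketch only: A, B and C are assumed to be vector values of the same type.
static Value *emitInterleave3(IRBuilderBase &Builder, Value *A, Value *B,
                              Value *C) {
  // Emits llvm.vector.interleave3 with a result vector three times as long.
  // With the FoldVectorInterleave hook described above, the case where every
  // operand is the same constant splat folds to a single wider splat instead.
  return Builder.CreateVectorInterleave({A, B, C}, "interleaved");
}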
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-llvm-transforms

Author: David Sherwood (david-arm)

Changes

This PR adds a new interface to IRBuilder called CreateVectorInterleave, which can be used to create vector.interleave intrinsics of factors 2-8. I've also added a new interface to the Folder called FoldVectorInterleave, which can spot when every operand is the same splat and instead return a new splat of the appropriate size. For convenience I have also moved getInterleaveIntrinsicID and getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp, where they can be used by IRBuilder.

Patch is 30.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150931.diff

17 Files Affected:
diff --git a/llvm/include/llvm/Analysis/InstSimplifyFolder.h b/llvm/include/llvm/Analysis/InstSimplifyFolder.h
index 58793ed977f68..b7e3decb91d61 100644
--- a/llvm/include/llvm/Analysis/InstSimplifyFolder.h
+++ b/llvm/include/llvm/Analysis/InstSimplifyFolder.h
@@ -125,6 +125,10 @@ class LLVM_ABI InstSimplifyFolder final : public IRBuilderFolder {
dyn_cast_if_present<CallBase>(FMFSource));
}
+ Value *FoldVectorInterleave(ArrayRef<Value *> Ops) const override {
+ return ConstFolder.FoldVectorInterleave(Ops);
+ }
+
//===--------------------------------------------------------------------===//
// Cast/Conversion Operators
//===--------------------------------------------------------------------===//
diff --git a/llvm/include/llvm/Analysis/TargetFolder.h b/llvm/include/llvm/Analysis/TargetFolder.h
index d27455cf3505d..151ff93975ae6 100644
--- a/llvm/include/llvm/Analysis/TargetFolder.h
+++ b/llvm/include/llvm/Analysis/TargetFolder.h
@@ -189,6 +189,24 @@ class LLVM_ABI TargetFolder final : public IRBuilderFolder {
return nullptr;
}
+ Value *FoldVectorInterleave(ArrayRef<Value *> Ops) const override {
+ // Check to see if all operands are the same.
+ for (unsigned I = 1; I < Ops.size(); I++) {
+ if (Ops[I] != Ops[0])
+ return nullptr;
+ }
+
+ // Is this just a large splat?
+ if (auto *C = dyn_cast<Constant>(Ops[0])) {
+ if (auto *V = C->getSplatValue()) {
+ auto *SubvecTy = cast<VectorType>(Ops[0]->getType());
+ return ConstantVector::getSplat(
+ SubvecTy->getElementCount() * Ops.size(), V);
+ }
+ }
+ return nullptr;
+ }
+
Value *FoldBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, Type *Ty,
Instruction *FMFSource) const override {
auto *C1 = dyn_cast<Constant>(LHS);
diff --git a/llvm/include/llvm/Analysis/VectorUtils.h b/llvm/include/llvm/Analysis/VectorUtils.h
index 9a2773c06bae6..b55c4e0a6bf76 100644
--- a/llvm/include/llvm/Analysis/VectorUtils.h
+++ b/llvm/include/llvm/Analysis/VectorUtils.h
@@ -177,12 +177,6 @@ LLVM_ABI bool isVectorIntrinsicWithStructReturnOverloadAtField(
LLVM_ABI Intrinsic::ID
getVectorIntrinsicIDForCall(const CallInst *CI, const TargetLibraryInfo *TLI);
-/// Returns the corresponding llvm.vector.interleaveN intrinsic for factor N.
-LLVM_ABI Intrinsic::ID getInterleaveIntrinsicID(unsigned Factor);
-
-/// Returns the corresponding llvm.vector.deinterleaveN intrinsic for factor N.
-LLVM_ABI Intrinsic::ID getDeinterleaveIntrinsicID(unsigned Factor);
-
/// Returns the corresponding factor of llvm.vector.interleaveN intrinsics.
LLVM_ABI unsigned getInterleaveIntrinsicFactor(Intrinsic::ID ID);
diff --git a/llvm/include/llvm/IR/ConstantFolder.h b/llvm/include/llvm/IR/ConstantFolder.h
index 26b7242abc4d9..578f44aabf0e9 100644
--- a/llvm/include/llvm/IR/ConstantFolder.h
+++ b/llvm/include/llvm/IR/ConstantFolder.h
@@ -181,6 +181,24 @@ class LLVM_ABI ConstantFolder final : public IRBuilderFolder {
return nullptr;
}
+ Value *FoldVectorInterleave(ArrayRef<Value *> Ops) const override {
+ // Check to see if all operands are the same.
+ for (unsigned I = 1; I < Ops.size(); I++) {
+ if (Ops[I] != Ops[0])
+ return nullptr;
+ }
+
+ // Is this just a large splat?
+ if (auto *C = dyn_cast<Constant>(Ops[0])) {
+ if (auto *V = C->getSplatValue()) {
+ auto *SubvecTy = cast<VectorType>(Ops[0]->getType());
+ return ConstantVector::getSplat(
+ SubvecTy->getElementCount() * Ops.size(), V);
+ }
+ }
+ return nullptr;
+ }
+
Value *FoldBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, Type *Ty,
Instruction *FMFSource) const override {
// Use TargetFolder or InstSimplifyFolder instead.
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
index 7c600e762a451..6d3d864b46559 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -2614,6 +2614,8 @@ class IRBuilderBase {
return CreateShuffleVector(V, PoisonValue::get(V->getType()), Mask, Name);
}
+ Value *CreateVectorInterleave(ArrayRef<Value *> Ops, const Twine &Name = "");
+
Value *CreateExtractValue(Value *Agg, ArrayRef<unsigned> Idxs,
const Twine &Name = "") {
if (auto *V = Folder.FoldExtractValue(Agg, Idxs))
diff --git a/llvm/include/llvm/IR/IRBuilderFolder.h b/llvm/include/llvm/IR/IRBuilderFolder.h
index db4ab5af2433a..24416712c3d82 100644
--- a/llvm/include/llvm/IR/IRBuilderFolder.h
+++ b/llvm/include/llvm/IR/IRBuilderFolder.h
@@ -75,6 +75,8 @@ class LLVM_ABI IRBuilderFolder {
virtual Value *FoldCast(Instruction::CastOps Op, Value *V,
Type *DestTy) const = 0;
+ virtual Value *FoldVectorInterleave(ArrayRef<Value *> Ops) const = 0;
+
virtual Value *
FoldBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, Type *Ty,
Instruction *FMFSource = nullptr) const = 0;
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index 156805293367b..48735b06d3f53 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -283,8 +283,15 @@ namespace Intrinsic {
// or of the wrong kind will be renamed by adding ".renamed" to the name.
LLVM_ABI std::optional<Function *> remangleIntrinsicFunction(Function *F);
-} // End Intrinsic namespace
+ /// Returns the corresponding llvm.vector.interleaveN intrinsic for factor N.
+ LLVM_ABI Intrinsic::ID getInterleaveIntrinsicID(unsigned Factor);
-} // End llvm namespace
+ /// Returns the corresponding llvm.vector.deinterleaveN intrinsic for factor
+ /// N.
+ LLVM_ABI Intrinsic::ID getDeinterleaveIntrinsicID(unsigned Factor);
+
+ } // namespace Intrinsic
+
+ } // namespace llvm
#endif
diff --git a/llvm/include/llvm/IR/NoFolder.h b/llvm/include/llvm/IR/NoFolder.h
index 9f16c6983313a..cb843c8f7a19e 100644
--- a/llvm/include/llvm/IR/NoFolder.h
+++ b/llvm/include/llvm/IR/NoFolder.h
@@ -118,6 +118,10 @@ class LLVM_ABI NoFolder final : public IRBuilderFolder {
return nullptr;
}
+ Value *FoldVectorInterleave(ArrayRef<Value *> Ops) const override {
+ return nullptr;
+ }
+
//===--------------------------------------------------------------------===//
// Cast/Conversion Operators
//===--------------------------------------------------------------------===//
diff --git a/llvm/lib/Analysis/ConstantFolding.cpp b/llvm/lib/Analysis/ConstantFolding.cpp
index 759c553111d06..ea4c49db92818 100644
--- a/llvm/lib/Analysis/ConstantFolding.cpp
+++ b/llvm/lib/Analysis/ConstantFolding.cpp
@@ -1641,7 +1641,19 @@ bool llvm::canConstantFoldCallTo(const CallBase *Call, const Function *F) {
case Intrinsic::vector_extract:
case Intrinsic::vector_insert:
case Intrinsic::vector_interleave2:
+ case Intrinsic::vector_interleave3:
+ case Intrinsic::vector_interleave4:
+ case Intrinsic::vector_interleave5:
+ case Intrinsic::vector_interleave6:
+ case Intrinsic::vector_interleave7:
+ case Intrinsic::vector_interleave8:
case Intrinsic::vector_deinterleave2:
+ case Intrinsic::vector_deinterleave3:
+ case Intrinsic::vector_deinterleave4:
+ case Intrinsic::vector_deinterleave5:
+ case Intrinsic::vector_deinterleave6:
+ case Intrinsic::vector_deinterleave7:
+ case Intrinsic::vector_deinterleave8:
// Target intrinsics
case Intrinsic::amdgcn_perm:
case Intrinsic::amdgcn_wave_reduce_umin:
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 1b3da590cff7f..150ddced03b15 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -240,30 +240,6 @@ Intrinsic::ID llvm::getVectorIntrinsicIDForCall(const CallInst *CI,
return Intrinsic::not_intrinsic;
}
-struct InterleaveIntrinsic {
- Intrinsic::ID Interleave, Deinterleave;
-};
-
-static InterleaveIntrinsic InterleaveIntrinsics[] = {
- {Intrinsic::vector_interleave2, Intrinsic::vector_deinterleave2},
- {Intrinsic::vector_interleave3, Intrinsic::vector_deinterleave3},
- {Intrinsic::vector_interleave4, Intrinsic::vector_deinterleave4},
- {Intrinsic::vector_interleave5, Intrinsic::vector_deinterleave5},
- {Intrinsic::vector_interleave6, Intrinsic::vector_deinterleave6},
- {Intrinsic::vector_interleave7, Intrinsic::vector_deinterleave7},
- {Intrinsic::vector_interleave8, Intrinsic::vector_deinterleave8},
-};
-
-Intrinsic::ID llvm::getInterleaveIntrinsicID(unsigned Factor) {
- assert(Factor >= 2 && Factor <= 8 && "Unexpected factor");
- return InterleaveIntrinsics[Factor - 2].Interleave;
-}
-
-Intrinsic::ID llvm::getDeinterleaveIntrinsicID(unsigned Factor) {
- assert(Factor >= 2 && Factor <= 8 && "Unexpected factor");
- return InterleaveIntrinsics[Factor - 2].Deinterleave;
-}
-
unsigned llvm::getInterleaveIntrinsicFactor(Intrinsic::ID ID) {
switch (ID) {
case Intrinsic::vector_interleave2:
diff --git a/llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp b/llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp
index 8855740f0cc8f..0cf1a5b390ae4 100644
--- a/llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp
+++ b/llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp
@@ -2194,11 +2194,10 @@ Value *ComplexDeinterleavingGraph::replaceNode(IRBuilderBase &Builder,
// Splats that are not constant are interleaved where they are located
Instruction *InsertPoint = (I->comesBefore(R) ? R : I)->getNextNode();
IRBuilder<> IRB(InsertPoint);
- ReplacementNode = IRB.CreateIntrinsic(Intrinsic::vector_interleave2,
- NewTy, {Node->Real, Node->Imag});
+ ReplacementNode = IRB.CreateVectorInterleave({Node->Real, Node->Imag});
} else {
- ReplacementNode = Builder.CreateIntrinsic(
- Intrinsic::vector_interleave2, NewTy, {Node->Real, Node->Imag});
+ ReplacementNode =
+ Builder.CreateVectorInterleave({Node->Real, Node->Imag});
}
break;
}
@@ -2228,8 +2227,7 @@ Value *ComplexDeinterleavingGraph::replaceNode(IRBuilderBase &Builder,
auto *B = replaceNode(Builder, Node->Operands[1]);
auto *NewMaskTy = VectorType::getDoubleElementsVectorType(
cast<VectorType>(MaskReal->getType()));
- auto *NewMask = Builder.CreateIntrinsic(Intrinsic::vector_interleave2,
- NewMaskTy, {MaskReal, MaskImag});
+ auto *NewMask = Builder.CreateVectorInterleave({MaskReal, MaskImag});
ReplacementNode = Builder.CreateSelect(NewMask, A, B);
break;
}
@@ -2260,8 +2258,8 @@ void ComplexDeinterleavingGraph::processReductionSingle(
}
if (!NewInit)
- NewInit = Builder.CreateIntrinsic(Intrinsic::vector_interleave2, NewVTy,
- {Init, Constant::getNullValue(VTy)});
+ NewInit =
+ Builder.CreateVectorInterleave({Init, Constant::getNullValue(VTy)});
NewPHI->addIncoming(NewInit, Incoming);
NewPHI->addIncoming(OperationReplacement, BackEdge);
@@ -2289,8 +2287,7 @@ void ComplexDeinterleavingGraph::processReductionOperation(
Value *InitImag = OldPHIImag->getIncomingValueForBlock(Incoming);
IRBuilder<> Builder(Incoming->getTerminator());
- auto *NewInit = Builder.CreateIntrinsic(Intrinsic::vector_interleave2, NewVTy,
- {InitReal, InitImag});
+ auto *NewInit = Builder.CreateVectorInterleave({InitReal, InitImag});
NewPHI->addIncoming(NewInit, Incoming);
NewPHI->addIncoming(OperationReplacement, BackEdge);
diff --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp
index 28037d7ec5616..35f47f23196fa 100644
--- a/llvm/lib/IR/IRBuilder.cpp
+++ b/llvm/lib/IR/IRBuilder.cpp
@@ -13,6 +13,7 @@
#include "llvm/IR/IRBuilder.h"
#include "llvm/ADT/ArrayRef.h"
+#include "llvm/Analysis/VectorUtils.h"
#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
@@ -1144,9 +1145,35 @@ Value *IRBuilderBase::CreateVectorSplat(ElementCount EC, Value *V,
return CreateShuffleVector(V, Zeros, Name + ".splat");
}
-Value *IRBuilderBase::CreatePreserveArrayAccessIndex(
- Type *ElTy, Value *Base, unsigned Dimension, unsigned LastIndex,
- MDNode *DbgInfo) {
+Value *IRBuilderBase::CreateVectorInterleave(ArrayRef<Value *> Ops,
+ const Twine &Name) {
+ assert(Ops.size() >= 2 && Ops.size() <= 8 &&
+ "Unexpected number of operands to interleave");
+
+ // Make sure all operands are the same type.
+ assert(isa<VectorType>(Ops[0]->getType()) && "Unexpected type");
+
+#ifndef NDEBUG
+ for (unsigned I = 1; I < Ops.size(); I++) {
+ assert(Ops[I]->getType() == Ops[0]->getType() &&
+ "Vector interleave expects matching operand types!");
+ }
+#endif
+
+ if (auto *V = Folder.FoldVectorInterleave(Ops))
+ return V;
+
+ unsigned IID = Intrinsic::getInterleaveIntrinsicID(Ops.size());
+ auto *SubvecTy = cast<VectorType>(Ops[0]->getType());
+ Type *DestTy = VectorType::get(SubvecTy->getElementType(),
+ SubvecTy->getElementCount() * Ops.size());
+ return CreateIntrinsic(IID, {DestTy}, Ops, {}, Name);
+}
+
+Value *IRBuilderBase::CreatePreserveArrayAccessIndex(Type *ElTy, Value *Base,
+ unsigned Dimension,
+ unsigned LastIndex,
+ MDNode *DbgInfo) {
auto *BaseType = Base->getType();
assert(isa<PointerType>(BaseType) &&
"Invalid Base ptr type for preserve.array.access.index.");
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index 6c35ade3e57c5..58a1f745a7122 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -1133,3 +1133,27 @@ std::optional<Function *> Intrinsic::remangleIntrinsicFunction(Function *F) {
"Shouldn't change the signature");
return NewDecl;
}
+
+struct InterleaveIntrinsic {
+ Intrinsic::ID Interleave, Deinterleave;
+};
+
+static InterleaveIntrinsic InterleaveIntrinsics[] = {
+ {Intrinsic::vector_interleave2, Intrinsic::vector_deinterleave2},
+ {Intrinsic::vector_interleave3, Intrinsic::vector_deinterleave3},
+ {Intrinsic::vector_interleave4, Intrinsic::vector_deinterleave4},
+ {Intrinsic::vector_interleave5, Intrinsic::vector_deinterleave5},
+ {Intrinsic::vector_interleave6, Intrinsic::vector_deinterleave6},
+ {Intrinsic::vector_interleave7, Intrinsic::vector_deinterleave7},
+ {Intrinsic::vector_interleave8, Intrinsic::vector_deinterleave8},
+};
+
+Intrinsic::ID Intrinsic::getInterleaveIntrinsicID(unsigned Factor) {
+ assert(Factor >= 2 && Factor <= 8 && "Unexpected factor");
+ return InterleaveIntrinsics[Factor - 2].Interleave;
+}
+
+Intrinsic::ID Intrinsic::getDeinterleaveIntrinsicID(unsigned Factor) {
+ assert(Factor >= 2 && Factor <= 8 && "Unexpected factor");
+ return InterleaveIntrinsics[Factor - 2].Deinterleave;
+}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 225658b76827e..68e7c20a070f4 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -3391,12 +3391,7 @@ static Value *interleaveVectors(IRBuilderBase &Builder, ArrayRef<Value *> Vals,
// must use intrinsics to interleave.
if (VecTy->isScalableTy()) {
assert(Factor <= 8 && "Unsupported interleave factor for scalable vectors");
- VectorType *InterleaveTy =
- VectorType::get(VecTy->getElementType(),
- VecTy->getElementCount().multiplyCoefficientBy(Factor));
- return Builder.CreateIntrinsic(InterleaveTy,
- getInterleaveIntrinsicID(Factor), Vals,
- /*FMFSource=*/nullptr, Name);
+ return Builder.CreateVectorInterleave(Vals, Name);
}
// Fixed length. Start by concatenating all vectors into a wide vector.
@@ -3503,8 +3498,8 @@ void VPInterleaveRecipe::execute(VPTransformState &State) {
assert(InterleaveFactor <= 8 &&
"Unsupported deinterleave factor for scalable vectors");
NewLoad = State.Builder.CreateIntrinsic(
- getDeinterleaveIntrinsicID(InterleaveFactor), NewLoad->getType(),
- NewLoad,
+ Intrinsic::getDeinterleaveIntrinsicID(InterleaveFactor),
+ NewLoad->getType(), NewLoad,
/*FMFSource=*/nullptr, "strided.vec");
}
diff --git a/llvm/test/CodeGen/AArch64/complex-deinterleaving-opt-crash.ll b/llvm/test/CodeGen/AArch64/complex-deinterleaving-opt-crash.ll
index b62dbc0ee8ea3..b3939268832f7 100644
--- a/llvm/test/CodeGen/AArch64/complex-deinterleaving-opt-crash.ll
+++ b/llvm/test/CodeGen/AArch64/complex-deinterleaving-opt-crash.ll
@@ -9,10 +9,9 @@ define void @reprocessing_crash() #0 {
; CHECK-LABEL: define void @reprocessing_crash(
; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
; CHECK-NEXT: [[ENTRY:.*]]:
-; CHECK-NEXT: [[TMP0:%.*]] = call <vscale x 4 x double> @llvm.vector.interleave2.nxv4f64(<vscale x 2 x double> zeroinitializer, <vscale x 2 x double> zeroinitializer)
; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
; CHECK: [[VECTOR_BODY]]:
-; CHECK-NEXT: [[TMP1:%.*]] = phi <vscale x 4 x double> [ [[TMP0]], %[[ENTRY]] ], [ [[TMP2:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[TMP1:%.*]] = phi <vscale x 4 x double> [ zeroinitializer, %[[ENTRY]] ], [ [[TMP2:%.*]], %[[VECTOR_BODY]] ]
; CHECK-NEXT: [[TMP2]] = fsub <vscale x 4 x double> [[TMP1]], zeroinitializer
; CHECK-NEXT: br i1 false, label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]]
; CHECK: [[MIDDLE_BLOCK]]:
@@ -48,11 +47,10 @@ define double @test_fp_single_reduction(i1 %c) #2 {
; CHECK-LABEL: define double @test_fp_single_reduction(
; CHECK-SAME: i1 [[C:%.*]]) #[[ATTR1:[0-9]+]] {
; CHECK-NEXT: [[ENTRY:.*]]:
-; CHECK-NEXT: [[TMP0:%.*]] = call <8 x double> @llvm.vector.interleave2.v8f64(<4 x double> zeroinitializer, <4 x double> zeroinitializer)
; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
; CHECK: [[VECTOR_BODY]]:
; CHECK-NEXT: [[VEC_PHI218:%.*]] = phi <4 x double> [ zeroinitializer, %[[ENTRY]] ], [ [[TMP2:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT: [[TMP1:%.*]] = phi <8 x double> [ [[TMP0]], %[[ENTRY]] ], [ [[TMP3:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[TMP1:%.*]] = phi <8 x double> [ zeroinitializer, %[[ENTRY]] ], [ [[TMP3:%.*]], %[[VECTOR_BODY]] ]
; CHECK-NEXT: [[STRIDED_VEC:%.*]] = shufflevector <8 x double> zeroinitializer, <8 x double> zeroinitializer, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
; CHECK-NEXT: [[TMP2]] = fadd <4 x double> [[VEC_PHI218]], [[STRIDED_VEC]]
; CHECK-NEXT: [[TMP3]] = fadd <8 x double> [[TMP1]], zeroinitializer
diff --git a/llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll b/llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll
index 880bd2904154c..d67aa08125f74 100644
--- a/llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll
+++ b/llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll
@@ -14,20 +14,19 @@ target triple = "aarch64"
define %"class.std::complex" @complex_mul_v2f64(ptr %a, ptr %b) {
; CHECK-LABEL: complex_mul_v2f64:
; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: movi v0.2d, #0000000000000000
; CHECK-NEXT: movi v1.2d, #0000000000000000
; CHECK-NEXT: mov w8, #100 // =0x64
-; CHECK-NEXT: cntd x9
; CHECK-NEXT: whilelo p1.d, xzr, x8
+; CHECK-NEXT: cntd x9
; CHECK-NEXT: rdvl x10, #2
-; CHECK-NEXT: mov x11, x9
; CHECK-NEXT: ptrue p0.d
-; CHECK-NEXT: zip2 z0.d, z1.d, z1.d
-; CHECK-NEXT: zip1 z1.d, z1.d, z1.d
+; CHECK-NEXT: mov x11, x9
; CHECK-NEXT: .LBB0_1: // %vector.body
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
; CHECK-NEXT: zip2 p2.d, p1.d, p1.d
-; CHECK-NEXT: mov z6.d, z1.d
-; CHECK-NEXT: mov z7.d, z0.d
+; CHECK-NEXT: mov z6.d, z0.d
+; CHECK-NEXT: mov z7.d, z1.d
; CHECK-NEXT: zip1 p1.d, p1.d, p1.d
; CHECK-NEXT: ld1d { z2.d }, p2/z, [x0, #1, mul vl]
; CHECK-NEXT: ld1d { z4.d }, p2/z, [x1, #1, mul vl]
@@ -39,14 +38,14 @@ define %"class.std::complex" @complex_mul_v2f64(ptr %a, ptr %b) {
; CHECK-NEXT: fcmla z6.d, p0/m, z5.d, z3.d, #0
; CHECK-NEXT: fcmla z7.d, p0/m, z4.d, z2....
[truncated]
I had to add
    return nullptr;
  }

  Value *FoldVectorInterleave(ArrayRef<Value *> Ops) const override {
This code should not be in IRBuilder folders, but rather part of the generic intrinsic ConstantFolding support.
I'm not sure I understand. Do you mean that I shouldn't be changing any of the folders whatsoever, or that I should be doing something like this in TargetFolder:
Value *FoldBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, Type *Ty,
                           Instruction *FMFSource) const override {
  auto *C1 = dyn_cast<Constant>(LHS);
  auto *C2 = dyn_cast<Constant>(RHS);
  if (C1 && C2)
    return ConstantFoldBinaryIntrinsic(ID, C1, C2, Ty, FMFSource);
  return nullptr;
}
I thought the standard practice in IRBuilder was to apply folds using the folder, i.e. like we do here:
Value *IRBuilderBase::CreateBinaryIntrinsic(Intrinsic::ID ID, Value *LHS,
                                            Value *RHS, FMFSource FMFSource,
                                            const Twine &Name) {
  Module *M = BB->getModule();
  Function *Fn = Intrinsic::getOrInsertDeclaration(M, ID, {LHS->getType()});
  if (Value *V = Folder.FoldBinaryIntrinsic(ID, LHS, RHS, Fn->getReturnType(),
                                            /*FMFSource=*/nullptr))
I can't use the pre-existing FoldBinaryIntrinsic or FoldUnaryIntrinsic functions either because the number of operands is variable, although I'm happy to add a FoldUnaryIntrinsic.
You probably want to replace FoldBinaryIntrinsic with FoldIntrinsic -- we shouldn't specialize to binary. This is going to need a small bit of refactoring in ConstantFolding to expose a ConstantFoldIntrinsic function instead of ConstantFoldBinaryIntrinsic.
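A hedged sketch of the kind of interface being suggested here (hypothetical, not implemented in this PR; the ConstantFoldIntrinsic helper it calls is the one proposed above, not an existing function):

Value *FoldIntrinsic(Intrinsic::ID ID, ArrayRef<Value *> Ops, Type *Ty,
                     Instruction *FMFSource) const override {
  // Generalises FoldBinaryIntrinsic to any operand count: bail out unless
  // every operand is a constant.
  SmallVector<Constant *, 4> ConstOps;
  for (Value *Op : Ops) {
    auto *C = dyn_cast<Constant>(Op);
    if (!C)
      return nullptr;
    ConstOps.push_back(C);
  }
  // Hypothetical helper exposed from ConstantFolding, per the suggestion above.
  return ConstantFoldIntrinsic(ID, ConstOps, Ty, FMFSource);
}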
OK thanks for the pointer, I'll take a look!
The places where I'd like to see these constant folds happening are where we're using a default version of IRBuilder, which uses ConstantFolder. However, the ConstantFolder.h version of FoldBinaryIntrinsic says the caller should be using TargetFolder or InstSimplifyFolder instead. So I could change FoldBinaryIntrinsic in llvm/Analysis/ConstantFolding.h (used by TargetFolder.h), but then I wouldn't see any improvements in generated code for IRBuilder instances using the default version.
To be honest, I'm not really sure the best way to proceed now. Would it be acceptable to add a variant of FoldIntrinsic to ConstantFolder.h as well, or is the preferred solution to just change all the IRBuilder instances I care about to use the TargetFolder instead?
Using TargetFolder is generally preferred. (The whole TargetFolder vs ConstantFolder distinction is an annoying legacy split that's unfortunately not entirely straightforward to remove.)
I guess we need a llvm::ConstantFoldBinaryIntrinsic in ConstantFolding.cpp, and call that from both ConstantFolder.h and InstructionSimplify.cpp's ConstantFoldInstOperandsImpl? That would bring it in line with the other instruction types.
And we probably need to rename that to just ConstantFoldIntrinsic?
I don't think we can do that because llvm/include/llvm/IR/ConstantFolder.h only includes llvm/IR/ConstantFold.h, whereas I think what you're talking about is trying to reuse an intrinsic folding function in the Analysis directory.
I'll abandon the constant folds for now, because if I only add support in the TargetFolder the only way I could test it is via unit tests, since both the loop vectoriser and the complex interleaving pass use the default IRBuilder, which uses ConstantFolder. I care more about adding the CreateVectorInterleave interface - I just saw an opportunity to tidy up the IR but it turns out to be far too complicated to do all in one patch.
nikic left a comment:
LGTM
lukel97 left a comment:
LGTM. Are there plans to create a method for deinterleave intrinsics too?
Sure, I can do that. I originally thought there was more value in the CreateVectorInterleave interface due to requiring the extra checks and wider type creation, along with the potential benefit of constant folding. For a CreateVectorDeinterleave interface there is less work required; however, I agree it makes sense to have it for consistency.
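A rough, hypothetical sketch of what such a deinterleave helper could look like (not part of this patch; it only reuses the Intrinsic::getDeinterleaveIntrinsicID helper that this PR moves into Intrinsics.cpp):

// Hypothetical sketch only, mirroring CreateVectorInterleave.
static Value *createVectorDeinterleave(IRBuilderBase &Builder, Value *WideVec,
                                       unsigned Factor) {
  // The deinterleave intrinsics are overloaded on the wide source vector type
  // and return a struct of Factor narrower result vectors.
  Intrinsic::ID IID = Intrinsic::getDeinterleaveIntrinsicID(Factor);
  return Builder.CreateIntrinsic(IID, {WideVec->getType()}, {WideVec});
}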
Rebased the PR as there were a couple of RISCV loop vectorise tests failing, which don't fail for me downstream.
LLVM Buildbot has detected a new failure on builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/51/builds/20564. Here is the relevant piece of the build log for the reference.
Looks like I forgot to delete some unused variables causing sanitiser failures. I'll fix asap.
LLVM Buildbot has detected a new failure on builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/24264. Here is the relevant piece of the build log for the reference.
LLVM Buildbot has detected a new failure on builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/14681. Here is the relevant piece of the build log for the reference.
LLVM Buildbot has detected a new failure on builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/24398. Here is the relevant piece of the build log for the reference.
      Instruction *InsertPoint = (I->comesBefore(R) ? R : I)->getNextNode();
      IRBuilder<> IRB(InsertPoint);
      ReplacementNode = IRB.CreateIntrinsic(Intrinsic::vector_interleave2,
                                            NewTy, {Node->Real, Node->Imag});
Looks like NewTy & NewMaskTy are unused now.
Yep. #151100
This PR adds a new interface to IRBuilder called CreateVectorInterleave, which can be used to create vector.interleave intrinsics of factors 2-8.
For convenience I have also moved getInterleaveIntrinsicID and getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp, where they can be used by IRBuilder.