Closed
After 6b19a54, running opt -mtriple riscv64 -mattr=+v,+zvqdotq -p loop-vectorize on the IR below crashes with Assertion failed: (getType() == V->getType() && "All operands to PHI node must be the same type as the PHI node!"), function setIncomingValue, file Instructions.h, line 2720:
define i32 @print_partial_reduction_predication(ptr %a, ptr %b, i64 %N) {
entry:
  br label %for.body

for.body: ; preds = %for.body, %entry
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
  %accum = phi i32 [ 0, %entry ], [ %add, %for.body ]
  %gep.a = getelementptr i8, ptr %a, i64 %iv
  %load.a = load i8, ptr %gep.a, align 1
  %ext.a = zext i8 %load.a to i32
  %gep.b = getelementptr i8, ptr %b, i64 %iv
  %load.b = load i8, ptr %gep.b, align 1
  %ext.b = zext i8 %load.b to i32
  %mul = mul i32 %ext.b, %ext.a
  %add = add i32 %mul, %accum
  %iv.next = add i64 %iv, 1
  %exitcond.not = icmp eq i64 %iv.next, %N
  br i1 %exitcond.not, label %exit, label %for.body

exit:
  ret i32 %add
}
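The reproduction can be scripted roughly as follows. This is a minimal sketch, not part of the original report: the file name `repro.ll` is hypothetical (save the IR above under it), and `-debug-only=loop-vectorize` is assumed to be what prints the VPlan dump, which requires an `opt` built with assertions enabled at or after 6b19a54.

```shell
#!/bin/sh
# Hypothetical file name: save the IR from this report under it first.
REPRO="${REPRO:-repro.ll}"
# Flags copied verbatim from the report.
OPT_FLAGS="-mtriple riscv64 -mattr=+v,+zvqdotq -p loop-vectorize"

if ! command -v opt >/dev/null 2>&1; then
  echo "opt not found in PATH; skipping"
elif [ ! -f "$REPRO" ]; then
  echo "$REPRO not found; save the IR from the report there first"
else
  # -debug-only is only available in assertion-enabled builds; it is
  # assumed here to print the "Executing best plan" VPlan dump.
  # -disable-output discards the transformed module.
  # OPT_FLAGS is left unquoted on purpose so it splits into separate flags.
  opt $OPT_FLAGS -debug-only=loop-vectorize -disable-output "$REPRO"
fi
```

On an affected build the `opt` invocation is expected to abort with the assertion quoted above, so a non-zero exit status from the script indicates the crash reproduced.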
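The final VPlan has a widened reduction PHI whose VF is scaled by 1/4, but the reduction that feeds it is a plain REDUCE recipe rather than a VPPartialReductionRecipe. That combination seems suspect, and would be consistent with the PHI operand type mismatch reported by the assertion: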
Executing best plan with VF=vscale x 4, UF=1
VPlan 'Final VPlan for VF={vscale x 4},UF={1}' {
Live-in ir<%N> = original trip-count

ir-bb<entry>:
Successor(s): vector.ph

vector.ph:
  EMIT vp<%2> = reduction-start-vector ir<0>, ir<0>, ir<4>
Successor(s): vector.body

vector.body:
  EMIT-SCALAR vp<%evl.based.iv> = phi [ ir<0>, vector.ph ], [ vp<%index.evl.next>, vector.body ]
  WIDEN-REDUCTION-PHI ir<%accum> = phi vp<%2>, ir<%add> (VF scaled by 1/4)
  EMIT-SCALAR vp<%avl> = phi [ ir<%N>, vector.ph ], [ vp<%avl.next>, vector.body ]
  EMIT-SCALAR vp<%3> = EXPLICIT-VECTOR-LENGTH vp<%avl>
  CLONE ir<%gep.a> = getelementptr ir<%a>, vp<%evl.based.iv>
  WIDEN ir<%load.a> = vp.load ir<%gep.a>, vp<%3>
  WIDEN-CAST ir<%ext.a> = zext ir<%load.a> to i32
  CLONE ir<%gep.b> = getelementptr ir<%b>, vp<%evl.based.iv>
  WIDEN ir<%load.b> = vp.load ir<%gep.b>, vp<%3>
  WIDEN-CAST ir<%ext.b> = zext ir<%load.b> to i32
  WIDEN ir<%mul> = mul ir<%ext.b>, ir<%ext.a>
  WIDEN-INTRINSIC vp<%4> = call llvm.vp.merge(ir<true>, ir<%mul>, ir<0>, vp<%3>)
  REDUCE ir<%add> = ir<%accum> + vp.reduce.add (vp<%4>, vp<%3>)
  EMIT-SCALAR vp<%5> = zext vp<%3> to i64
  EMIT vp<%index.evl.next> = add vp<%5>, vp<%evl.based.iv>
  EMIT vp<%avl.next> = sub nuw vp<%avl>, vp<%5>
  EMIT vp<%6> = icmp eq vp<%avl.next>, ir<0>
  EMIT branch-on-cond vp<%6>
Successor(s): middle.block, vector.body

middle.block:
  EMIT vp<%8> = compute-reduction-result ir<%accum>, ir<%add>
Successor(s): ir-bb<exit>

ir-bb<exit>:
  IR %add.lcssa = phi i32 [ %add, %for.body ] (extra operand: vp<%8> from middle.block)
No successors
}