LoopVectorizer reduction crash with RISC-V Zvqdotq #167861

@lukel97

After 6b19a54, running opt -mtriple riscv64 -mattr=+v,+zvqdotq -p loop-vectorize on the following IR crashes with Assertion failed: (getType() == V->getType() && "All operands to PHI node must be the same type as the PHI node!"), function setIncomingValue, file Instructions.h, line 2720:

define i32 @print_partial_reduction_predication(ptr %a, ptr %b, i64 %N) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
  %accum = phi i32 [ 0, %entry ], [ %add, %for.body ]
  %gep.a = getelementptr i8, ptr %a, i64 %iv
  %load.a = load i8, ptr %gep.a, align 1
  %ext.a = zext i8 %load.a to i32
  %gep.b = getelementptr i8, ptr %b, i64 %iv
  %load.b = load i8, ptr %gep.b, align 1
  %ext.b = zext i8 %load.b to i32
  %mul = mul i32 %ext.b, %ext.a
  %add = add i32 %mul, %accum
  %iv.next = add i64 %iv, 1
  %exitcond.not = icmp eq i64 %iv.next, %N
  br i1 %exitcond.not, label %exit, label %for.body

exit:
  ret i32 %add
}
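
For orientation, the reproducer is an unsigned 8-bit dot product accumulated into a 32-bit integer. A rough C equivalent might look like the sketch below; the function name `dot_u8` and the parameter names are made up for illustration and do not come from the original test. The `do`/`while` shape mirrors the IR, which runs the body once before checking the exit condition, so it assumes `n` is nonzero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of the IR reproducer: zero-extend two i8 loads
 * to i32, multiply them, and add the product to a running i32 accumulator. */
uint32_t dot_u8(const uint8_t *a, const uint8_t *b, uint64_t n) {
    /* The IR loop is bottom-tested, so it is only well defined for n != 0. */
    assert(n != 0);

    uint32_t accum = 0;
    uint64_t i = 0;
    do {
        /* %mul = mul i32 %ext.b, %ext.a ; %add = add i32 %mul, %accum */
        accum += (uint32_t)b[i] * (uint32_t)a[i];
        i++;
    } while (i != n); /* %exitcond.not = icmp eq i64 %iv.next, %N */
    return accum;
}
```

This is the add(mul(zext, zext)) shape that makes the loop a candidate for a partial reduction, which is presumably why the reduction PHI in the plan ends up scaled.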

The final VPlan has a scaled widened reduction PHI (VF scaled by 1/4), but the reduction recipe is a plain REDUCE rather than a VPPartialReductionRecipe, which seems suspect:

Executing best plan with VF=vscale x 4, UF=1
VPlan 'Final VPlan for VF={vscale x 4},UF={1}' {
Live-in ir<%N> = original trip-count

ir-bb<entry>:
Successor(s): vector.ph

vector.ph:
  EMIT vp<%2> = reduction-start-vector ir<0>, ir<0>, ir<4>
Successor(s): vector.body

vector.body:
  EMIT-SCALAR vp<%evl.based.iv> = phi [ ir<0>, vector.ph ], [ vp<%index.evl.next>, vector.body ]
  WIDEN-REDUCTION-PHI ir<%accum> = phi vp<%2>, ir<%add> (VF scaled by 1/4)
  EMIT-SCALAR vp<%avl> = phi [ ir<%N>, vector.ph ], [ vp<%avl.next>, vector.body ]
  EMIT-SCALAR vp<%3> = EXPLICIT-VECTOR-LENGTH vp<%avl>
  CLONE ir<%gep.a> = getelementptr ir<%a>, vp<%evl.based.iv>
  WIDEN ir<%load.a> = vp.load ir<%gep.a>, vp<%3>
  WIDEN-CAST ir<%ext.a> = zext ir<%load.a> to i32
  CLONE ir<%gep.b> = getelementptr ir<%b>, vp<%evl.based.iv>
  WIDEN ir<%load.b> = vp.load ir<%gep.b>, vp<%3>
  WIDEN-CAST ir<%ext.b> = zext ir<%load.b> to i32
  WIDEN ir<%mul> = mul ir<%ext.b>, ir<%ext.a>
  WIDEN-INTRINSIC vp<%4> = call llvm.vp.merge(ir<true>, ir<%mul>, ir<0>, vp<%3>)
  REDUCE ir<%add> = ir<%accum> +  vp.reduce.add (vp<%4>, vp<%3>)
  EMIT-SCALAR vp<%5> = zext vp<%3> to i64
  EMIT vp<%index.evl.next> = add vp<%5>, vp<%evl.based.iv>
  EMIT vp<%avl.next> = sub nuw vp<%avl>, vp<%5>
  EMIT vp<%6> = icmp eq vp<%avl.next>, ir<0>
  EMIT branch-on-cond vp<%6>
Successor(s): middle.block, vector.body

middle.block:
  EMIT vp<%8> = compute-reduction-result ir<%accum>, ir<%add>
Successor(s): ir-bb<exit>

ir-bb<exit>:
  IR   %add.lcssa = phi i32 [ %add, %for.body ] (extra operand: vp<%8> from middle.block)
No successors
}
