Commit 5eb19bf

[X86CmovConversion] Make heuristic for optimized cmov depth more conservative (PR44539)
Fix/workaround for https://bugs.llvm.org/show_bug.cgi?id=44539. As discussed there, this pass makes some overly optimistic assumptions, as it does not have access to actual branch weights. This patch makes the computation of the depth of the optimized cmov more conservative, by assuming a distribution of 75/25 rather than 50/50 and placing the weights to get the more conservative result (larger depth). The fully conservative choice would be std::max(TrueOpDepth, FalseOpDepth), but that would break at least one existing test (which may or may not be an issue in practice).

Differential Revision: https://reviews.llvm.org/D74155
1 parent 37f4665 commit 5eb19bf

1 file changed, with 7 additions and 6 deletions

llvm/lib/Target/X86/X86CmovConversion.cpp

Lines changed: 7 additions & 6 deletions
@@ -364,12 +364,13 @@ bool X86CmovConverterPass::collectCmovCandidates(
 /// \param TrueOpDepth depth cost of CMOV true value operand.
 /// \param FalseOpDepth depth cost of CMOV false value operand.
 static unsigned getDepthOfOptCmov(unsigned TrueOpDepth, unsigned FalseOpDepth) {
-  //===--------------------------------------------------------------------===//
-  // With no info about branch weight, we assume 50% for each value operand.
-  // Thus, depth of optimized CMOV instruction is the rounded up average of
-  // its True-Operand-Value-Depth and False-Operand-Value-Depth.
-  //===--------------------------------------------------------------------===//
-  return (TrueOpDepth + FalseOpDepth + 1) / 2;
+  // The depth of the result after branch conversion is
+  // TrueOpDepth * TrueOpProbability + FalseOpDepth * FalseOpProbability.
+  // As we have no info about branch weight, we assume 75% for one and 25% for
+  // the other, and pick the result with the largest resulting depth.
+  return std::max(
+    divideCeil(TrueOpDepth * 3 + FalseOpDepth, 4),
+    divideCeil(FalseOpDepth * 3 + TrueOpDepth, 4));
 }
 
 bool X86CmovConverterPass::checkForProfitableCmovCandidates(
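
To make the effect of the change concrete, the following standalone sketch puts the old 50/50 heuristic, the new 75/25 heuristic, and the fully conservative std::max variant from the commit message side by side. It is an illustration only, not code from the LLVM tree: the function names depthOld, depthNew, depthFullyConservative and the helper divideCeilSketch are hypothetical, and divideCeilSketch is assumed to behave like the divideCeil call in the patch (integer division rounded up).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the divideCeil helper used in the patch:
// integer division of Numerator by Denominator, rounded up.
static unsigned divideCeilSketch(unsigned Numerator, unsigned Denominator) {
  assert(Denominator != 0 && "division by zero");
  return (Numerator + Denominator - 1) / Denominator;
}

// Old heuristic: assume 50% for each operand, i.e. the rounded-up average.
static unsigned depthOld(unsigned TrueOpDepth, unsigned FalseOpDepth) {
  return (TrueOpDepth + FalseOpDepth + 1) / 2;
}

// New heuristic: assume a 75/25 split, try both placements of the weights,
// and keep the larger (more conservative) depth.
static unsigned depthNew(unsigned TrueOpDepth, unsigned FalseOpDepth) {
  return std::max(divideCeilSketch(TrueOpDepth * 3 + FalseOpDepth, 4),
                  divideCeilSketch(FalseOpDepth * 3 + TrueOpDepth, 4));
}

// Fully conservative variant mentioned in the commit message (not adopted).
static unsigned depthFullyConservative(unsigned TrueOpDepth,
                                       unsigned FalseOpDepth) {
  return std::max(TrueOpDepth, FalseOpDepth);
}
```

For operand depths 1 and 9, the old heuristic gives 5, the new one gives 7, and the fully conservative variant gives 9. When both operands have the same depth, all three agree, so the change only matters when the two depths differ.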
