Skip to content

[LLVM][ConstantFold] Undefined values are not constant #130713

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 12, 2025

Conversation

kees
Copy link
Contributor

@kees kees commented Mar 11, 2025

llvm.is.constant (and therefore Clang's __builtin_constant_p()) need to report undefined values as non-constant or future DCE choices end up making no sense. This was encountered while building the Linux kernel which uses __builtin_constant_p() while trying to evaluate if it is safe to use a compile-time constant resolution for string lengths or if it must kick over to a full runtime call to strlen(). Obviously an undefined variable cannot be known at compile-time, so __builtin_constant_p() needs to return false. This change will also mean that Clang will match GCC's behavior under the same conditions.

Fixes #130649

llvm.is.constant (and therefore Clang's __builtin_constant_p()) need to
report undefined values as non-constant or future DCE choices end up
making no sense. This was encountered while building the Linux kernel
which uses __builtin_constant_p() while trying to evaluate if it is safe
to use a compile-time constant resolution for string lengths or if it
must kick over to a full runtime call to strlen(). Obviously an undefined
variable cannot be known at compile-time, so __builtin_constant_p()
needs to return false. This change will also mean that Clang will match
GCC's behavior under the same conditions.

Fixes llvm#130649
@llvmbot
Copy link
Member

llvmbot commented Mar 11, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Kees Cook (kees)

Changes

llvm.is.constant (and therefore Clang's __builtin_constant_p()) need to report undefined values as non-constant or future DCE choices end up making no sense. This was encountered while building the Linux kernel which uses __builtin_constant_p() while trying to evaluate if it is safe to use a compile-time constant resolution for string lengths or if it must kick over to a full runtime call to strlen(). Obviously an undefined variable cannot be known at compile-time, so __builtin_constant_p() needs to return false. This change will also mean that Clang will match GCC's behavior under the same conditions.

Fixes #130649


Full diff: https://github.com/llvm/llvm-project/pull/130713.diff

2 Files Affected:

  • (modified) llvm/lib/IR/Constants.cpp (+2)
  • (modified) llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll (+18)
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index 9e3e739fae3dc..36f4136457617 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -841,6 +841,8 @@ Constant *Constant::mergeUndefsWith(Constant *C, Constant *Other) {
 }
 
 bool Constant::isManifestConstant() const {
+  if (isa<UndefValue>(this))
+    return false;
   if (isa<ConstantData>(this))
     return true;
   if (isa<ConstantAggregate>(this) || isa<ConstantExpr>(this)) {
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll b/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
index 32bd2fff2732e..2801c3d2a9770 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
@@ -120,3 +120,21 @@ define i1 @global_array() {
   %1 = call i1 @llvm.is.constant.i64(i64 ptrtoint (ptr @real_mode_blob_end to i64))
   ret i1 %1
 }
+
+;; Ensure that is.constant of undef gets lowered reasonably to "false" in
+;; optimized codegen: specifically that the "true" branch is eliminated.
+;; CHECK-NOT: tail call i32 @subfun_1()
+;; CHECK:     tail call i32 @subfun_2()
+;; CHECK-NOT: tail call i32 @subfun_1()
+define i32 @test_undef_branch() nounwind {
+  %v = call i1 @llvm.is.constant.i32(i32 undef)
+  br i1 %v, label %True, label %False
+
+True:
+  %call1 = tail call i32 @subfun_1()
+  ret i32 %call1
+
+False:
+  %call2 = tail call i32 @subfun_2()
+  ret i32 %call2
+}

@llvmbot
Copy link
Member

llvmbot commented Mar 11, 2025

@llvm/pr-subscribers-llvm-ir

Author: Kees Cook (kees)

Changes

llvm.is.constant (and therefore Clang's __builtin_constant_p()) need to report undefined values as non-constant or future DCE choices end up making no sense. This was encountered while building the Linux kernel which uses __builtin_constant_p() while trying to evaluate if it is safe to use a compile-time constant resolution for string lengths or if it must kick over to a full runtime call to strlen(). Obviously an undefined variable cannot be known at compile-time, so __builtin_constant_p() needs to return false. This change will also mean that Clang will match GCC's behavior under the same conditions.

Fixes #130649


Full diff: https://github.com/llvm/llvm-project/pull/130713.diff

2 Files Affected:

  • (modified) llvm/lib/IR/Constants.cpp (+2)
  • (modified) llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll (+18)
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index 9e3e739fae3dc..36f4136457617 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -841,6 +841,8 @@ Constant *Constant::mergeUndefsWith(Constant *C, Constant *Other) {
 }
 
 bool Constant::isManifestConstant() const {
+  if (isa<UndefValue>(this))
+    return false;
   if (isa<ConstantData>(this))
     return true;
   if (isa<ConstantAggregate>(this) || isa<ConstantExpr>(this)) {
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll b/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
index 32bd2fff2732e..2801c3d2a9770 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll
@@ -120,3 +120,21 @@ define i1 @global_array() {
   %1 = call i1 @llvm.is.constant.i64(i64 ptrtoint (ptr @real_mode_blob_end to i64))
   ret i1 %1
 }
+
+;; Ensure that is.constant of undef gets lowered reasonably to "false" in
+;; optimized codegen: specifically that the "true" branch is eliminated.
+;; CHECK-NOT: tail call i32 @subfun_1()
+;; CHECK:     tail call i32 @subfun_2()
+;; CHECK-NOT: tail call i32 @subfun_1()
+define i32 @test_undef_branch() nounwind {
+  %v = call i1 @llvm.is.constant.i32(i32 undef)
+  br i1 %v, label %True, label %False
+
+True:
+  %call1 = tail call i32 @subfun_1()
+  ret i32 %call1
+
+False:
+  %call2 = tail call i32 @subfun_2()
+  ret i32 %call2
+}

Copy link

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' a5a33d82931def5a2f6f4dcfca834574190b4b50 3dea51b6a139cab729fbcb1027e13c13fd80b65e llvm/lib/IR/Constants.cpp llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll

The following files introduce new uses of undef:

  • llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@kees
Copy link
Contributor Author

kees commented Mar 11, 2025

Undef is now deprecated and should only be used in the rare cases where no replacement is possible.

AIUI, this test really does want to use undef and not poison.

@dtcxzyw dtcxzyw requested a review from nikic March 11, 2025 04:03
@kees kees requested review from nickdesaulniers, nathanchance and efriedma-quic and removed request for nikic March 11, 2025 04:03
@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 11, 2025

Obviously an undefined variable cannot be known at compile-time, so __builtin_constant_p() needs to return false.

IMO undef/poison can be refined to any constant value.

@efriedma-quic
Copy link
Collaborator

IMO undef/poison can be refined to any constant value.

Agreed.

To explain to anyone else who isn't following... suppose you have __builtin_constant_p(cond ? 0 : uninit_variable). Existing optimizations want to fold cond ? 0 : uninit_variable to zero... but then __builtin_constant_p is returning true for a value that's supposed to be undef. So undefined values may or may not be constant, depending on the context.

Of course, that's true with or without this patch, so I guess you could argue that this patch is an improvement anyway. But really, we need to fix Linux code so it doesn't depend on this behavior, or we'll trip over it again later.

@kees kees requested a review from nikic March 11, 2025 17:52
@kees
Copy link
Contributor Author

kees commented Mar 11, 2025

(ah, my reviewers addition collided with the bot and @nikic got removed -- I've fix that now.)

Okay, it sounds like there isn't really an issue for this commit, just generally that undef is no good (which I agree with). So since it solves the specific corner case that Linux ran into and aligns Clang behavior with GCC, it seems like a net benefit. Okay to commit?

intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Mar 12, 2025
It seems that Clang thinks __builtin_constant_p() of undefined variables
should return true[1]. This is being fixed separately[2], but in the
meantime, expand the fortify tests to help track this kind of thing down
faster in the future.

Link: ClangBuiltLinux#2073 [1]
Link: llvm/llvm-project#130713 [2]
Signed-off-by: Kees Cook <[email protected]>
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with this patch. But it is recommended to fix the uninitialized variable problem in Linux code.

;; optimized codegen: specifically that the "true" branch is eliminated.
;; CHECK-NOT: tail call i32 @subfun_1()
;; CHECK: tail call i32 @subfun_2()
;; CHECK-NOT: tail call i32 @subfun_1()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
;; CHECK-NOT: tail call i32 @subfun_1()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding of this CHECK pattern at the top of the file (where I copied this test style from) the second CHECK-NOT is to make sure there is no re-ordering happening: i.e. it must be fully missing, not just present later.

@kees kees merged commit ea2e66a into llvm:main Mar 12, 2025
13 of 14 checks passed
frederik-h pushed a commit to frederik-h/llvm-project that referenced this pull request Mar 18, 2025
llvm.is.constant (and therefore Clang's __builtin_constant_p()) need to
report undefined values as non-constant or future DCE choices end up
making no sense. This was encountered while building the Linux kernel
which uses __builtin_constant_p() while trying to evaluate if it is safe
to use a compile-time constant resolution for string lengths or if it
must kick over to a full runtime call to strlen(). Obviously an
undefined variable cannot be known at compile-time, so
__builtin_constant_p() needs to return false. This change will also mean
that Clang will match GCC's behavior under the same conditions.

Fixes llvm#130649
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

__builtin_constant_p incorrect with undefined variables
4 participants