-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[GlobalOpt] Check if global gets compared to pointer of different obj. #153789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesAdd an extra check to the compare handling to make sure the other For example, consider the IR below, which may write Full diff: https://github.com/llvm/llvm-project/pull/153789.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/GlobalStatus.cpp b/llvm/lib/Transforms/Utils/GlobalStatus.cpp
index 0b3016a86e287..2452737e97278 100644
--- a/llvm/lib/Transforms/Utils/GlobalStatus.cpp
+++ b/llvm/lib/Transforms/Utils/GlobalStatus.cpp
@@ -61,6 +61,42 @@ bool llvm::isSafeToDestroyConstant(const Constant *C) {
return true;
}
+static bool maybePointerToDifferentObjectRecursive(Value *V, SmallPtrSetImpl<Value *> &Visited) {
+ if (!Visited.insert(V).second)
+ return false;
+
+ if (Visited.size() > 32)
+ return true;
+
+ // PtrToInt may be used to construct a pointer to a different object. Loads and calls may return a pointer for a different object.
+ if (isa<PtrToIntInst, LoadInst, CallInst>(V))
+ return true;
+
+ if (auto *CE = dyn_cast<ConstantExpr>(V)) {
+ if (CE->getOpcode() == Instruction::PtrToInt)
+ return true;
+
+ for (auto &Op : CE->operands()) {
+ if (maybePointerToDifferentObjectRecursive(Op.get(), Visited))
+ return true;
+ }
+ return false;
+ }
+
+ if (auto *U = dyn_cast<User>(V)) {
+ for (auto &Op : U->operands()) {
+ if (maybePointerToDifferentObjectRecursive(Op.get(), Visited))
+ return true;
+ }
+ }
+ return false;
+}
+
+bool maybePointerToDifferentObject(Value *V) {
+ SmallPtrSet<Value *, 32> Visited;
+ return maybePointerToDifferentObjectRecursive(V, Visited);
+}
+
static bool analyzeGlobalAux(const Value *V, GlobalStatus &GS,
SmallPtrSetImpl<const Value *> &VisitedUsers) {
if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(V))
@@ -158,6 +194,12 @@ static bool analyzeGlobalAux(const Value *V, GlobalStatus &GS,
return true;
} else if (isa<CmpInst>(I)) {
GS.IsCompared = true;
+
+ if (VisitedUsers.insert(I).second) {
+ for (Value *Op : I->operands())
+ if ( Op != V && maybePointerToDifferentObject(Op))
+ return true;
+ }
} else if (const MemTransferInst *MTI = dyn_cast<MemTransferInst>(I)) {
if (MTI->isVolatile())
return true;
diff --git a/llvm/test/Transforms/GlobalOpt/globals-compares-with-ptrtoint.ll b/llvm/test/Transforms/GlobalOpt/globals-compares-with-ptrtoint.ll
new file mode 100644
index 0000000000000..25495aa6de559
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/globals-compares-with-ptrtoint.ll
@@ -0,0 +1,148 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5
+; RUN: opt -p globalopt -S %s | FileCheck %s
+
+@A = internal global i64 zeroinitializer
+@B = internal global i64 zeroinitializer
+
+@C = internal global i64 zeroinitializer
+@D = internal global i64 zeroinitializer
+
+@E = internal global i64 zeroinitializer
+@F = internal global i64 zeroinitializer
+
+@G = internal global i64 zeroinitializer
+@H = internal global i64 zeroinitializer
+
+
+
+;.
+; CHECK: @A = internal global i64 0
+; CHECK: @B = internal global i64 0
+; CHECK: @C = internal global i64 0
+; CHECK: @D = internal global i64 0
+; CHECK: @G = internal global i64 0
+; CHECK: @H = internal global i64 0
+;.
+define i64 @A_and_B_cmp_ptrtoint_constant_expr() {
+; CHECK-LABEL: define i64 @A_and_B_cmp_ptrtoint_constant_expr() local_unnamed_addr {
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr), @B
+; CHECK-NEXT: br i1 [[CMP]], label %[[THEN:.*]], label %[[ELSE:.*]]
+; CHECK: [[THEN]]:
+; CHECK-NEXT: store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr), align 4
+; CHECK-NEXT: br label %[[EXIT:.*]]
+; CHECK: [[ELSE]]:
+; CHECK-NEXT: br label %[[EXIT]]
+; CHECK: [[EXIT]]:
+; CHECK-NEXT: [[L:%.*]] = load i64, ptr @B, align 4
+; CHECK-NEXT: ret i64 [[L]]
+;
+ %cmp = icmp eq ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr) , @B
+ br i1 %cmp, label %then, label %else
+
+then:
+ store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr)
+ br label %exit
+
+else:
+ br label %exit
+
+exit:
+ %l = load i64, ptr @B
+ ret i64 %l
+}
+
+define i64 @G_and_H_cmp_ptrtoint_constant_expr_cmp_ops_swapped() {
+; CHECK-LABEL: define i64 @G_and_H_cmp_ptrtoint_constant_expr_cmp_ops_swapped() local_unnamed_addr {
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq ptr inttoptr (i64 add (i64 ptrtoint (ptr @G to i64), i64 8) to ptr), @H
+; CHECK-NEXT: br i1 [[CMP]], label %[[THEN:.*]], label %[[ELSE:.*]]
+; CHECK: [[THEN]]:
+; CHECK-NEXT: store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @G to i64), i64 8) to ptr), align 4
+; CHECK-NEXT: br label %[[EXIT:.*]]
+; CHECK: [[ELSE]]:
+; CHECK-NEXT: br label %[[EXIT]]
+; CHECK: [[EXIT]]:
+; CHECK-NEXT: [[L:%.*]] = load i64, ptr @H, align 4
+; CHECK-NEXT: ret i64 [[L]]
+;
+ %cmp = icmp eq ptr inttoptr (i64 add (i64 ptrtoint (ptr @G to i64), i64 8) to ptr) , @H
+ br i1 %cmp, label %then, label %else
+
+then:
+ store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @G to i64), i64 8) to ptr)
+ br label %exit
+
+else:
+ br label %exit
+
+exit:
+ %l = load i64, ptr @H
+ ret i64 %l
+}
+
+define i64 @C_and_D_cmp_ptr_load() {
+; CHECK-LABEL: define i64 @C_and_D_cmp_ptr_load() local_unnamed_addr {
+; CHECK-NEXT: [[P:%.*]] = alloca ptr, align 8
+; CHECK-NEXT: store ptr inttoptr (i64 add (i64 ptrtoint (ptr @C to i64), i64 8) to ptr), ptr [[P]], align 8
+; CHECK-NEXT: [[L_P:%.*]] = load ptr, ptr [[P]], align 8
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq ptr [[L_P]], @D
+; CHECK-NEXT: br i1 [[CMP]], label %[[THEN:.*]], label %[[ELSE:.*]]
+; CHECK: [[THEN]]:
+; CHECK-NEXT: store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @C to i64), i64 8) to ptr), align 4
+; CHECK-NEXT: br label %[[EXIT:.*]]
+; CHECK: [[ELSE]]:
+; CHECK-NEXT: br label %[[EXIT]]
+; CHECK: [[EXIT]]:
+; CHECK-NEXT: [[L:%.*]] = load i64, ptr @D, align 4
+; CHECK-NEXT: ret i64 [[L]]
+;
+ %p = alloca ptr
+ store ptr inttoptr (i64 add (i64 ptrtoint (ptr @C to i64), i64 8) to ptr), ptr %p
+ %l.p = load ptr, ptr %p
+ %cmp = icmp eq ptr %l.p, @D
+ br i1 %cmp, label %then, label %else
+
+then:
+ store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @C to i64), i64 8) to ptr)
+ br label %exit
+
+else:
+ br label %exit
+
+exit:
+ %l = load i64, ptr @D
+ ret i64 %l
+}
+
+define i64 @D_and_E_cmp_ptrtoint_constant_expr() {
+; CHECK-LABEL: define i64 @D_and_E_cmp_ptrtoint_constant_expr() local_unnamed_addr {
+; CHECK-NEXT: [[PTR2INT:%.*]] = ptrtoint ptr @A to i64
+; CHECK-NEXT: [[ADD:%.*]] = add i64 [[PTR2INT]], 8
+; CHECK-NEXT: [[INT2PTR:%.*]] = inttoptr i64 [[ADD]] to ptr
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq ptr [[INT2PTR]], @B
+; CHECK-NEXT: br i1 [[CMP]], label %[[THEN:.*]], label %[[ELSE:.*]]
+; CHECK: [[THEN]]:
+; CHECK-NEXT: store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr), align 4
+; CHECK-NEXT: br label %[[EXIT:.*]]
+; CHECK: [[ELSE]]:
+; CHECK-NEXT: br label %[[EXIT]]
+; CHECK: [[EXIT]]:
+; CHECK-NEXT: [[L:%.*]] = load i64, ptr @B, align 4
+; CHECK-NEXT: ret i64 [[L]]
+;
+ %ptr2int = ptrtoint ptr @A to i64
+ %add = add i64 %ptr2int, 8
+ %int2ptr = inttoptr i64 %add to ptr
+ %cmp = icmp eq ptr %int2ptr , @B
+ br i1 %cmp, label %then, label %else
+
+then:
+ store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr)
+ br label %exit
+
+else:
+ br label %exit
+
+exit:
+ %l = load i64, ptr @B
+ ret i64 %l
+}
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
fceb7b1 to
ce3c5d8
Compare
efriedma-quic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems like it's working the wrong way: in general, you have to assume the pointer points to a different object unless you can prove it points to the same object.
If you want to check if a pointer points to a certain object, you can just use getUnderlyingObject().
Add an extra check to the compare handling to make sure the other compared pointer cannot be based on a different object. For example, consider the IR below, which may write @b via a pointer constructed from @A using PtrToInt and a compare against @b. ``` %cmp = icmp eq ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr) , @b br i1 %cmp, label %then, label %else then: store i64 10, ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr) br label %exit else: br label %exit exit: %l = load i64, ptr @b ret i64 %l } ```
ce3c5d8 to
05c75e6
Compare
fhahn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems like it's working the wrong way: in general, you have to assume the pointer points to a different object unless you can prove it points to the same object.
If you want to check if a pointer points to a certain object, you can just use getUnderlyingObject().
Hm, originally I was worried that restricting to same UOs may be too restrictive and cause a large number of regressions.
But I checked this against a large corpus of IR, and there are only changes in 4 files, so it's probably not too bad.
They all seem to come from compares with a constant and a function argument, like in @compare_arg
dtcxzyw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
|
||
| const Value *UO = nullptr; | ||
| for (Value *Op : I->operands()) { | ||
| if (match(Op, m_Zero())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't work when the null pointer is defined. If the function attribute is not suitable, check if AS == 0.
nikic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused by this change. Of course this depends on specific provenance semantics, but assuming a typical provenance exposure model, you cannot in fact use ptr inttoptr (i64 add (i64 ptrtoint (ptr @A to i64), i64 8) to ptr) to write to ptr @b simply because you compared the two values. Comparisons do not expose provenance. This requires the presence of ptrtoint @b.
Of course this breaks down because we optimize away the ptrtoint. But this patch is not the right way to address it. I'd be willing to accept it as a temporary workaround if this is addressing a miscompile observed in real-world code -- is it?
|
Have we explicitly said somewhere that icmp never exposes provenance? I mean, we can make that call I guess, but it has some weird implications. Like, we currently combine I think this came out of the discussion on #125205; I don't know of any real-world testcase. |
Not quite in these terms, but we do specify now that icmp does not capture the provenance (in the "Pointer Capture" section). By the way, there has been some recent interest in actually specifying out provenance semantics, so I'm hoping we'll get some clarity around these issues soon(TM). |
|
Yeah the main motivation was @efriedma-quic 's comments here #125205 |
|
Sounds good, I'll close this PR for now, as my motivating case also disappeared |
Add an extra check to the compare handling to make sure the other
compared pointer cannot be based on a different object.
For example, consider the IR below, which may write
@Bvia a pointer constructedfrom
@Ausing PtrToInt and a compare against@B.