Thread Safety Analysis: Basic capability alias-analysis #142955
@aaronpuchert - RFC regarding basic capability alias analysis. For the bare minimum this would work, and likely covers 90% of the cases I worry about. I believe later enhancements could be built on top. There might be something I'm missing though. Kindly take a look.
Force-pushed c2dcde6 to 608c4f6.
Cleaned it up some more.
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-analysis
Author: Marco Elver (melver)
Changes: Add a simple form of alias analysis for capabilities by substituting local pointer variables with their initializers if they are explicitly const or never reassigned. For example, the analysis will no longer generate false positives for cases such as:
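(The example below is taken from the tests added by this PR.)

  struct Foo {
    Mutex mu;
    int data GUARDED_BY(mu);
  };

  void testBasicPointerAlias(Foo *f) {
    Foo *ptr = f;
    ptr->mu.Lock();   // lock through alias
    f->data = 42;     // access through original
    ptr->mu.Unlock(); // unlock through alias
  }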
This implementation would satisfy the basic needs of addressing the concerns for Linux kernel application [1]. Current limitations: aliases created by reassigning non-const pointers are not tracked, and neither are aliases established through more complex control flow (see the FIXME test cases below).
Link: https://lore.kernel.org/all/CANpmjNPquO=W1JAh1FNQb8pMQjgeZAKCPQUAd7qUg=5pjJ6x=Q@mail.gmail.com/ [1]
Full diff: https://github.com/llvm/llvm-project/pull/142955.diff
5 Files Affected:
diff --git a/clang/include/clang/Analysis/Analyses/ThreadSafetyCommon.h b/clang/include/clang/Analysis/Analyses/ThreadSafetyCommon.h
index 6c97905a2d7f9..cd2c278c2278f 100644
--- a/clang/include/clang/Analysis/Analyses/ThreadSafetyCommon.h
+++ b/clang/include/clang/Analysis/Analyses/ThreadSafetyCommon.h
@@ -395,7 +395,7 @@ class SExprBuilder {
CapabilityExpr translateAttrExpr(const Expr *AttrExp, CallingContext *Ctx);
// Translate a variable reference.
- til::LiteralPtr *createVariable(const VarDecl *VD);
+ til::SExpr *createVariable(const VarDecl *VD, CallingContext *Ctx = nullptr);
// Translate a clang statement or expression to a TIL expression.
// Also performs substitution of variables; Ctx provides the context.
@@ -501,6 +501,9 @@ class SExprBuilder {
void mergeEntryMapBackEdge();
void mergePhiNodesBackEdge(const CFGBlock *Blk);
+ // Returns true if a variable is assumed to be reassigned.
+ bool isVariableReassigned(const VarDecl *VD);
+
private:
// Set to true when parsing capability expressions, which get translated
// inaccurately in order to hack around smart pointers etc.
@@ -531,6 +534,9 @@ class SExprBuilder {
std::vector<til::Phi *> IncompleteArgs;
til::BasicBlock *CurrentBB = nullptr;
BlockInfo *CurrentBlockInfo = nullptr;
+
+ // Map caching if a local variable is reassigned.
+ llvm::DenseMap<const VarDecl *, bool> LocalVariableReassigned;
};
#ifndef NDEBUG
diff --git a/clang/lib/Analysis/ThreadSafety.cpp b/clang/lib/Analysis/ThreadSafety.cpp
index 80e7c8eff671a..a0ab241125d1c 100644
--- a/clang/lib/Analysis/ThreadSafety.cpp
+++ b/clang/lib/Analysis/ThreadSafety.cpp
@@ -1141,11 +1141,10 @@ class ThreadSafetyAnalyzer {
void warnIfMutexNotHeld(const FactSet &FSet, const NamedDecl *D,
const Expr *Exp, AccessKind AK, Expr *MutexExp,
- ProtectedOperationKind POK, til::LiteralPtr *Self,
+ ProtectedOperationKind POK, til::SExpr *Self,
SourceLocation Loc);
void warnIfMutexHeld(const FactSet &FSet, const NamedDecl *D, const Expr *Exp,
- Expr *MutexExp, til::LiteralPtr *Self,
- SourceLocation Loc);
+ Expr *MutexExp, til::SExpr *Self, SourceLocation Loc);
void checkAccess(const FactSet &FSet, const Expr *Exp, AccessKind AK,
ProtectedOperationKind POK);
@@ -1599,7 +1598,7 @@ class BuildLockset : public ConstStmtVisitor<BuildLockset> {
}
void handleCall(const Expr *Exp, const NamedDecl *D,
- til::LiteralPtr *Self = nullptr,
+ til::SExpr *Self = nullptr,
SourceLocation Loc = SourceLocation());
void examineArguments(const FunctionDecl *FD,
CallExpr::const_arg_iterator ArgBegin,
@@ -1629,7 +1628,7 @@ class BuildLockset : public ConstStmtVisitor<BuildLockset> {
/// of at least the passed in AccessKind.
void ThreadSafetyAnalyzer::warnIfMutexNotHeld(
const FactSet &FSet, const NamedDecl *D, const Expr *Exp, AccessKind AK,
- Expr *MutexExp, ProtectedOperationKind POK, til::LiteralPtr *Self,
+ Expr *MutexExp, ProtectedOperationKind POK, til::SExpr *Self,
SourceLocation Loc) {
LockKind LK = getLockKindFromAccessKind(AK);
CapabilityExpr Cp = SxBuilder.translateAttrExpr(MutexExp, D, Exp, Self);
@@ -1688,8 +1687,7 @@ void ThreadSafetyAnalyzer::warnIfMutexNotHeld(
/// Warn if the LSet contains the given lock.
void ThreadSafetyAnalyzer::warnIfMutexHeld(const FactSet &FSet,
const NamedDecl *D, const Expr *Exp,
- Expr *MutexExp,
- til::LiteralPtr *Self,
+ Expr *MutexExp, til::SExpr *Self,
SourceLocation Loc) {
CapabilityExpr Cp = SxBuilder.translateAttrExpr(MutexExp, D, Exp, Self);
if (Cp.isInvalid()) {
@@ -1857,7 +1855,7 @@ void ThreadSafetyAnalyzer::checkPtAccess(const FactSet &FSet, const Expr *Exp,
/// of an implicitly called cleanup function.
/// \param Loc If \p Exp = nullptr, the location.
void BuildLockset::handleCall(const Expr *Exp, const NamedDecl *D,
- til::LiteralPtr *Self, SourceLocation Loc) {
+ til::SExpr *Self, SourceLocation Loc) {
CapExprSet ExclusiveLocksToAdd, SharedLocksToAdd;
CapExprSet ExclusiveLocksToRemove, SharedLocksToRemove, GenericLocksToRemove;
CapExprSet ScopedReqsAndExcludes;
@@ -1869,7 +1867,7 @@ void BuildLockset::handleCall(const Expr *Exp, const NamedDecl *D,
const auto *TagT = Exp->getType()->getAs<TagType>();
if (D->hasAttrs() && TagT && Exp->isPRValue()) {
til::LiteralPtr *Placeholder =
- Analyzer->SxBuilder.createVariable(nullptr);
+ cast<til::LiteralPtr>(Analyzer->SxBuilder.createVariable(nullptr));
[[maybe_unused]] auto inserted =
Analyzer->ConstructedObjects.insert({Exp, Placeholder});
assert(inserted.second && "Are we visiting the same expression again?");
diff --git a/clang/lib/Analysis/ThreadSafetyCommon.cpp b/clang/lib/Analysis/ThreadSafetyCommon.cpp
index ddbd0a9ca904b..d95dfa087fdf0 100644
--- a/clang/lib/Analysis/ThreadSafetyCommon.cpp
+++ b/clang/lib/Analysis/ThreadSafetyCommon.cpp
@@ -19,6 +19,7 @@
#include "clang/AST/Expr.h"
#include "clang/AST/ExprCXX.h"
#include "clang/AST/OperationKinds.h"
+#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/AST/Stmt.h"
#include "clang/AST/Type.h"
#include "clang/Analysis/Analyses/ThreadSafetyTIL.h"
@@ -241,7 +242,22 @@ CapabilityExpr SExprBuilder::translateAttrExpr(const Expr *AttrExp,
return CapabilityExpr(E, AttrExp->getType(), Neg);
}
-til::LiteralPtr *SExprBuilder::createVariable(const VarDecl *VD) {
+til::SExpr *SExprBuilder::createVariable(const VarDecl *VD,
+ CallingContext *Ctx) {
+ if (VD) {
+ // Substitute local pointer variables with their initializers if they are
+ // explicitly const or never reassigned.
+ QualType Ty = VD->getType();
+ if (Ty->isPointerType() && VD->hasInit() && !isVariableReassigned(VD)) {
+ const Expr *Init = VD->getInit()->IgnoreParenImpCasts();
+ // Check for self-initialization to prevent infinite recursion.
+ if (const auto *InitDRE = dyn_cast<DeclRefExpr>(Init)) {
+ if (InitDRE->getDecl()->getCanonicalDecl() == VD->getCanonicalDecl())
+ return new (Arena) til::LiteralPtr(VD);
+ }
+ return translate(Init, Ctx);
+ }
+ }
return new (Arena) til::LiteralPtr(VD);
}
@@ -353,6 +369,9 @@ til::SExpr *SExprBuilder::translateDeclRefExpr(const DeclRefExpr *DRE,
: cast<ObjCMethodDecl>(D)->getCanonicalDecl()->getParamDecl(I);
}
+ if (const auto *VarD = dyn_cast<VarDecl>(VD))
+ return createVariable(VarD, Ctx);
+
// For non-local variables, treat it as a reference to a named object.
return new (Arena) til::LiteralPtr(VD);
}
@@ -1012,6 +1031,63 @@ void SExprBuilder::exitCFG(const CFGBlock *Last) {
IncompleteArgs.clear();
}
+bool SExprBuilder::isVariableReassigned(const VarDecl *VD) {
+ // Note: The search is performed lazily per-variable and result is cached. An
+ // alternative would have been to eagerly create a set of all reassigned
+ // variables, but that would consume significantly more memory. The number of
+ // variables needing the reassignment check in a single function is expected
+ // to be small. Also see createVariable().
+ struct ReassignmentFinder : RecursiveASTVisitor<ReassignmentFinder> {
+ explicit ReassignmentFinder(const VarDecl *VD) : QueryVD(VD) {
+ assert(QueryVD->getCanonicalDecl() == QueryVD);
+ }
+
+ bool VisitBinaryOperator(BinaryOperator *BO) {
+ if (!BO->isAssignmentOp())
+ return true;
+ auto *DRE = dyn_cast<DeclRefExpr>(BO->getLHS()->IgnoreParenImpCasts());
+ if (!DRE)
+ return true;
+ auto *AssignedVD = dyn_cast<VarDecl>(DRE->getDecl());
+ if (!AssignedVD)
+ return true;
+ // Don't count the initialization itself as a reassignment.
+ if (AssignedVD->hasInit() &&
+ AssignedVD->getInit()->getBeginLoc() == BO->getBeginLoc())
+ return true;
+ // If query variable appears as LHS of assignment.
+ if (QueryVD == AssignedVD->getCanonicalDecl()) {
+ FoundReassignment = true;
+ return false; // stop
+ }
+ return true;
+ }
+
+ const VarDecl *QueryVD;
+ bool FoundReassignment = false;
+ };
+
+ if (VD->getType().isConstQualified())
+ return false; // Assume UB-freedom.
+ if (!VD->isLocalVarDecl())
+ return true; // Not a local variable (assume reassigned).
+ auto *FD = dyn_cast<FunctionDecl>(VD->getDeclContext());
+ if (!FD)
+ return true; // Assume reassigned.
+
+ // Try to look up in cache; use the canonical declaration to ensure consistent
+ // lookup in the cache.
+ VD = VD->getCanonicalDecl();
+ auto It = LocalVariableReassigned.find(VD);
+ if (It != LocalVariableReassigned.end())
+ return It->second;
+
+ ReassignmentFinder Visitor(VD);
+ // const_cast ok: FunctionDecl not modified.
+ Visitor.TraverseDecl(const_cast<FunctionDecl *>(FD));
+ return LocalVariableReassigned[VD] = Visitor.FoundReassignment;
+}
+
#ifndef NDEBUG
namespace {
diff --git a/clang/test/Sema/warn-thread-safety-analysis.c b/clang/test/Sema/warn-thread-safety-analysis.c
index b43f97af9162e..549cb1231baa6 100644
--- a/clang/test/Sema/warn-thread-safety-analysis.c
+++ b/clang/test/Sema/warn-thread-safety-analysis.c
@@ -184,9 +184,13 @@ int main(void) {
/// Cleanup functions
{
struct Mutex* const __attribute__((cleanup(unlock_scope))) scope = &mu1;
- mutex_exclusive_lock(scope); // Note that we have to lock through scope, because no alias analysis!
+ mutex_exclusive_lock(scope); // Lock through scope works.
// Cleanup happens automatically -> no warning.
}
+ {
+ struct Mutex* const __attribute__((unused, cleanup(unlock_scope))) scope = &mu1;
+ mutex_exclusive_lock(&mu1); // With basic alias analysis lock through mu1 also works.
+ }
foo_.a_value = 0; // expected-warning {{writing variable 'a_value' requires holding mutex 'mu_' exclusively}}
*foo_.a_ptr = 1; // expected-warning {{writing the value pointed to by 'a_ptr' requires holding mutex 'bar.other_mu' exclusively}}
diff --git a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
index d64ed1e5f260a..da5fd350daf74 100644
--- a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+++ b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
@@ -1556,9 +1556,9 @@ void main() {
Child *c;
Base *b = c;
- b->func1(); // expected-warning {{calling function 'func1' requires holding mutex 'b->mu_' exclusively}}
+ b->func1(); // expected-warning {{calling function 'func1' requires holding mutex 'c->mu_' exclusively}}
b->mu_.Lock();
- b->func2(); // expected-warning {{cannot call function 'func2' while mutex 'b->mu_' is held}}
+ b->func2(); // expected-warning {{cannot call function 'func2' while mutex 'c->mu_' is held}}
b->mu_.Unlock();
c->func1(); // expected-warning {{calling function 'func1' requires holding mutex 'c->mu_' exclusively}}
@@ -7224,3 +7224,101 @@ class TestNegativeWithReentrantMutex {
};
} // namespace Reentrancy
+
+// Tests for tracking aliases of capabilities.
+namespace CapabilityAliases {
+struct Foo {
+ Mutex mu;
+ int data GUARDED_BY(mu);
+};
+
+void testBasicPointerAlias(Foo *f) {
+ Foo *ptr = f;
+ ptr->mu.Lock(); // lock through alias
+ f->data = 42; // access through original
+ ptr->mu.Unlock(); // unlock through alias
+}
+
+// FIXME: No alias tracking for non-const reassigned pointers.
+void testReassignment() {
+ Foo f1, f2;
+ Foo *ptr = &f1;
+ ptr->mu.Lock();
+ f1.data = 42; // expected-warning{{writing variable 'data' requires holding mutex 'f1.mu' exclusively}} \
+ // expected-note{{found near match 'ptr->mu'}}
+ ptr->mu.Unlock();
+
+ ptr = &f2; // reassign
+ ptr->mu.Lock();
+ f2.data = 42; // expected-warning{{writing variable 'data' requires holding mutex 'f2.mu' exclusively}} \
+ // expected-note{{found near match 'ptr->mu'}}
+ f1.data = 42; // expected-warning{{writing variable 'data' requires holding mutex 'f1.mu'}} \
+ // expected-note{{found near match 'ptr->mu'}}
+ ptr->mu.Unlock();
+}
+
+// Nested field access through pointer
+struct Container {
+ Foo foo;
+};
+
+void testNestedAccess(Container *c) {
+ Foo *ptr = &c->foo; // pointer to nested field
+ ptr->mu.Lock();
+ c->foo.data = 42;
+ ptr->mu.Unlock();
+}
+
+void testNestedAcquire(Container *c) EXCLUSIVE_LOCK_FUNCTION(&c->foo.mu) {
+ Foo *buf = &c->foo;
+ buf->mu.Lock();
+}
+
+struct ContainerOfPtr {
+ Foo *foo_ptr;
+};
+
+void testIndirectAccess(ContainerOfPtr *fc) {
+ Foo *ptr = fc->foo_ptr; // get pointer
+ ptr->mu.Lock();
+ fc->foo_ptr->data = 42; // access via original
+ ptr->mu.Unlock();
+}
+
+// FIXME: No alias tracking through complex control flow.
+void testSimpleControlFlow(Foo *f1, Foo *f2, bool cond) {
+ Foo *ptr;
+ if (cond) {
+ ptr = f1;
+ } else {
+ ptr = f2;
+ }
+ ptr->mu.Lock();
+ if (cond) {
+ f1->data = 42; // expected-warning{{writing variable 'data' requires holding mutex 'f1->mu' exclusively}} \
+ // expected-note{{found near match 'ptr->mu'}}
+ } else {
+ f2->data = 42; // expected-warning{{writing variable 'data' requires holding mutex 'f2->mu' exclusively}} \
+ // expected-note{{found near match 'ptr->mu'}}
+ }
+ ptr->mu.Unlock();
+}
+
+void testLockFunction(Foo *f) EXCLUSIVE_LOCK_FUNCTION(&f->mu) {
+ Mutex *mu = &f->mu;
+ mu->Lock();
+}
+
+void testUnlockFunction(Foo *f) UNLOCK_FUNCTION(&f->mu) {
+ Mutex *mu = &f->mu;
+ mu->Unlock();
+}
+
+// Semantically UB, but let's not crash the compiler with this (should be
+// handled by -Wuninitialized).
+void testSelfInit() {
+ Mutex *mu = mu; // don't do this at home
+ mu->Lock();
+ mu->Unlock();
+}
+} // namespace CapabilityAliases
Gentle ping.
Force-pushed 0964bd4 to ff8f6e2.
Some more self-review, and fixes.
Sorry for the delay, but I still need to wrap my head around this. For now just some very high-level comments.
Force-pushed 277dc49 to 8185269.
Thanks for the comments - tried to address most of them. PTAL.
Force-pushed b6fd3cf to b4b995b.
Force-pushed e5e2f35 to 07bc50c.
Just FYI - I rebased the kernel patches, and attempted to apply -Wthread-safety to kernel/sched/, which previously was impossible. With this PR, it does work with modest changes (most are annotations, only a few small code changes): https://git.kernel.org/pub/scm/linux/kernel/git/melver/linux.git/log/?h=cap-analysis/dev
I also discovered that with this form of alias analysis, we can take care of a function acquiring a capability inside a returned object with a hack like this:
... which was necessary for taking care of kernel/sched.
Force-pushed 8a99864 to 471d932.
Alright, I think this is as far as I can push it. This PR now contains 3 commits just to keep the story clear (happy to squash them if that's better). Besides the Linux kernel prototype integration, I took this version and tested it on our internal codebase that has used -Wthread-safety. But because there's a real chance this alias analysis might confuse existing users with new warnings, we should hide it behind -Wthread-safety-beta.
PTAL.
Force-pushed 860025e to c801def.
Squashed the loop-invariant alias fix, so this should address most comments - initially thought I'd preserve the history of how we got here, but this seems cleaner. Also moved the isStaticLocal() fix to the base commit, and the 2nd commit is now only about translateCXXNewExpr(). PTAL.
Force-pushed 87ff340 to 0bc6707.
This looks good to me now, but you'll need the current maintainers to sign off on it. :-)
Force-pushed 0bc6707 to dc3e720.
Dropped the translateCXXNewExpr() patch for now.
Thank you for the review!
Force-pushed dc3e720 to 309bc52.
Gentle ping - I want to submit this by end of next week. |
Force-pushed 7bcd142 to 11f1c5b.
Force-pushed 11f1c5b to 241dc5b.
I will merge this later - if this causes problems, please revert, but since it's hidden behind -Wthread-safety-beta the risk to existing users should be low.
Add basic alias analysis for capabilities by reusing LocalVariableMap,
which tracks currently valid definitions of variables. Aliases created
through complex control flow are not tracked. This implementation would
satisfy the basic needs of addressing the concerns for Linux kernel
application [1].
For example, the analysis will no longer generate false positives for
cases such as (and many others):
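  void testNestedAccess(Container *c) {
    Foo *ptr = &c->foo;
    ptr->mu.Lock();
    c->foo.data = 42;  // OK - no false positive
    ptr->mu.Unlock();
  }

  void testNestedAcquire(Container *c) EXCLUSIVE_LOCK_FUNCTION(&c->foo.mu) {
    Foo *buf = &c->foo;
    buf->mu.Lock();  // OK - no false positive
  }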
Given the analysis is now able to identify potentially unsafe patterns
it was not able to identify previously (see added FIXME test case for an
example), mark alias resolution as a "beta" feature behind the flag
-Wthread-safety-beta.

Fixing LocalVariableMap: It was found that LocalVariableMap was not
properly tracking loop-invariant aliases: the old implementation failed
because the merge logic compared raw VarDefinition IDs. The algorithm
for handling back-edges (in createReferenceContext()) generates new
'reference' definitions for loop-scoped variables. Later ID comparison
caused alias invalidation at back-edge merges (in intersectBackEdge())
and at subsequent forward-merges with non-loop paths (in intersectContexts()).
Fix LocalVariableMap by adding the getCanonicalDefinitionID() helper
that resolves any definition ID down to its non-reference base. As a
result, a variable's definition is preserved across control-flow merges
as long as its underlying canonical definition remains the same.
Link: https://lore.kernel.org/all/CANpmjNPquO=W1JAh1FNQb8pMQjgeZAKCPQUAd7qUg=5pjJ6x=Q@mail.gmail.com/ [1]
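To illustrate the loop-invariant case that the LocalVariableMap fix addresses, here is a minimal, hypothetical sketch (not part of the patch's test suite), reusing the Foo/Container types from the tests above; with definition IDs resolved to their canonical non-reference base, the alias is expected to survive the back-edge merge, so no false positive is reported inside the loop:

  void testLoopInvariantAlias(Container *c) {
    Foo *ptr = &c->foo;    // loop-invariant alias
    for (int i = 0; i < 10; ++i) {
      ptr->mu.Lock();      // still resolves to c->foo.mu after the back-edge merge
      c->foo.data = i;
      ptr->mu.Unlock();
    }
  }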