Skip to content

Conversation

@jdoerfert
Copy link

DO NOT FILE A PULL REQUEST

This repository does not accept pull requests. Please follow http://llvm.org/docs/Contributing.html#how-to-submit-a-patch for contribution to LLVM.

DO NOT FILE A PULL REQUEST

@shiltian shiltian merged commit c19aace into shiltian:openmp-reoganizing-parallelism Oct 21, 2022
@jdoerfert jdoerfert deleted the patch-1 branch October 26, 2022 17:58
shiltian pushed a commit that referenced this pull request Oct 31, 2022
Found by msan -fsanitize-memory-use-after-dtor.

==8259==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55dbec54d2b8 in dtorRecord(clang::interp::Block*, char*, clang::interp::Descriptor*) clang/lib/AST/Interp/Descriptor.cpp:150:22
    #1 0x55dbec54bfcf in dtorArrayDesc(clang::interp::Block*, char*, clang::interp::Descriptor*) clang/lib/AST/Interp/Descriptor.cpp:97:7
    #2 0x55dbec508578 in invokeDtor clang/lib/AST/Interp/InterpBlock.h:79:7
    #3 0x55dbec508578 in clang::interp::Program::~Program() clang/lib/AST/Interp/Program.h:55:19
    #4 0x55dbec50657a in operator() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:55:5
    #5 0x55dbec50657a in std::__msan::unique_ptr<clang::interp::Program, std::__msan::default_delete<clang::interp::Program>>::~unique_ptr() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:261:7
    #6 0x55dbec5035a1 in clang::interp::Context::~Context() clang/lib/AST/Interp/Context.cpp:27:22
    #7 0x55dbebec1daa in operator() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:55:5
    #8 0x55dbebec1daa in std::__msan::unique_ptr<clang::interp::Context, std::__msan::default_delete<clang::interp::Context>>::~unique_ptr() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:261:7
    llvm#9 0x55dbebe285f9 in clang::ASTContext::~ASTContext() clang/lib/AST/ASTContext.cpp:1038:40
    llvm#10 0x55dbe941ff13 in llvm::RefCountedBase<clang::ASTContext>::Release() const llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:101:7
    llvm#11 0x55dbe94353ef in release llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:159:38
    llvm#12 0x55dbe94353ef in release llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:224:7
    llvm#13 0x55dbe94353ef in ~IntrusiveRefCntPtr llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:191:27
    llvm#14 0x55dbe94353ef in clang::CompilerInstance::setASTContext(clang::ASTContext*) clang/lib/Frontend/CompilerInstance.cpp:178:3
    llvm#15 0x55dbe95ad0ad in clang::FrontendAction::EndSourceFile() clang/lib/Frontend/FrontendAction.cpp:1100:8
    llvm#16 0x55dbe9445fcf in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) clang/lib/Frontend/CompilerInstance.cpp:1047:11
    llvm#17 0x55dbe6b3afef in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:266:25
    llvm#18 0x55dbe6b13288 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) clang/tools/driver/cc1_main.cpp:250:15
    llvm#19 0x55dbe6b0095f in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) clang/tools/driver/driver.cpp:319:12
    llvm#20 0x55dbe6aff41c in clang_main(int, char**) clang/tools/driver/driver.cpp:395:12
    llvm#21 0x7f9be07fa632 in __libc_start_main
    llvm#22 0x55dbe6a702e9 in _start

  Member fields were destroyed
    #0 0x55dbe6a7da5d in __sanitizer_dtor_callback_fields compiler-rt/lib/msan/msan_interceptors.cpp:949:5
    #1 0x55dbec5094ac in ~SmallVectorImpl llvm/include/llvm/ADT/SmallVector.h:479:7
    #2 0x55dbec5094ac in ~SmallVectorImpl llvm/include/llvm/ADT/SmallVector.h:612:3
    #3 0x55dbec5094ac in llvm::SmallVector<clang::interp::Record::Base, 8u>::~SmallVector() llvm/include/llvm/ADT/SmallVector.h:1207:3
    #4 0x55dbec508e79 in clang::interp::Record::~Record() clang/lib/AST/Interp/Record.h:24:7
    #5 0x55dbec508612 in clang::interp::Program::~Program() clang/lib/AST/Interp/Program.h:49:26
    #6 0x55dbec50657a in operator() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:55:5
    #7 0x55dbec50657a in std::__msan::unique_ptr<clang::interp::Program, std::__msan::default_delete<clang::interp::Program>>::~unique_ptr() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:261:7
    #8 0x55dbec5035a1 in clang::interp::Context::~Context() clang/lib/AST/Interp/Context.cpp:27:22
    llvm#9 0x55dbebec1daa in operator() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:55:5
    llvm#10 0x55dbebec1daa in std::__msan::unique_ptr<clang::interp::Context, std::__msan::default_delete<clang::interp::Context>>::~unique_ptr() third_party/crosstool/v18/stable/toolchain/bin/../include/c++/v1/__memory/unique_ptr.h:261:7
    llvm#11 0x55dbebe285f9 in clang::ASTContext::~ASTContext() clang/lib/AST/ASTContext.cpp:1038:40
    llvm#12 0x55dbe941ff13 in llvm::RefCountedBase<clang::ASTContext>::Release() const llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:101:7
    llvm#13 0x55dbe94353ef in release llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:159:38
    llvm#14 0x55dbe94353ef in release llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:224:7
    llvm#15 0x55dbe94353ef in ~IntrusiveRefCntPtr llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:191:27
    llvm#16 0x55dbe94353ef in clang::CompilerInstance::setASTContext(clang::ASTContext*) clang/lib/Frontend/CompilerInstance.cpp:178:3
    llvm#17 0x55dbe95ad0ad in clang::FrontendAction::EndSourceFile() clang/lib/Frontend/FrontendAction.cpp:1100:8
    llvm#18 0x55dbe9445fcf in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) clang/lib/Frontend/CompilerInstance.cpp:1047:11
    llvm#19 0x55dbe6b3afef in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:266:25
    llvm#20 0x55dbe6b13288 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) clang/tools/driver/cc1_main.cpp:250:15
    llvm#21 0x55dbe6b0095f in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) clang/tools/driver/driver.cpp:319:12
    llvm#22 0x55dbe6aff41c in clang_main(int, char**) clang/tools/driver/driver.cpp:395:12
    llvm#23 0x7f9be07fa632 in __libc_start_main
    llvm#24 0x55dbe6a702e9 in _start
shiltian pushed a commit that referenced this pull request Nov 3, 2022
…ction selection.

Before this patch:
- For `r = or op0, op1`, `tryBitfieldInsertOpFromOr` combines it to BFI when
  1) one of the two operands is bit-field-positioning or bit-field-extraction op; and
  2) bits from the two operands don't overlap

After this patch:
- Right before OR is combined to BFI, evaluates if ORR with left-shifted operand is better.

A motivating example (https://godbolt.org/z/rnMrzs5vn, which is added as a test case in `test_orr_not_bfi` in `CodeGen/AArch64/bitfield-insert.ll`)

For IR:
```
define i64 @test_orr_not_bfxil(i64 %0) {
  %2 = and i64 %0, 127
  %3 = lshr i64 %0, 1
  %4 = and i64 %3, 16256
  %5 = or i64 %4, %2
  ret i64 %5
}
```

Before:
```
   lsr     x8, x0, #1
   and     x8, x8, #0x3f80
   bfxil   x8, x0, #0, #7
```

After:
```
   ubfx x8, x0, #8, #7
   and x9, x0, #0x7f
   orr x0, x9, x8, lsl #7
```

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D135102
shiltian pushed a commit that referenced this pull request Nov 15, 2022
AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate generating operation that identically sets flags (implictly).

When the PTEST and the predicate-generating operation use the same mask
the PTEST is currently removed. This is incorrect since it doesn't
consider element size. PTEST operates on 8-bit predicates, but for
instructions like compare that also support 16/32/64-bit predicates, the
implicit PTEST performed by the instruction will consider fewer lanes
for these element sizes and could set different first or last active
flags.

For example, consider the following instruction sequence

  ptrue p0.b			; P0=1111-1111-1111-1111
  index z0.s, #0, #1		; Z0=<0,1,2,3>
  index z1.s, #1, #1		; Z1=<1,2,3,4>
  cmphi p1.s, p0/z, z1.s, z0.s  ; P1=0001-0001-0001-0001
				;       ^ last active
  ptest p0, p1.b		; P1=0001-0001-0001-0001
				;     ^ last active

where the compare generates a canonical all active 32-bit predicate (equivalent
to 'ptrue p1.s, all'). The implicit PTEST sets the last active flag, whereas
the PTEST instruction with the same mask doesn't.

This patch restricts the optimization to instructions operating on 8-bit
predicates. One caveat is the optimization is safe regardless of element
size for any active, this will be addressed in a later patch.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137716
shiltian pushed a commit that referenced this pull request Nov 17, 2022
Verify three cases of G_UNMERGE_VALUES separately:

1. Splitting a vector into subvectors (the converse of
   G_CONCAT_VECTORS).
2. Splitting a vector into its elements (the converse of
   G_BUILD_VECTOR).
3. Splitting a scalar into smaller scalars (the converse of
   G_MERGE_VALUES).

Previously #1 allowed strange combinations like this:
  %1:_(<2 x s16>),%2:_(<2 x s16>) = G_UNMERGE_VALUES %0(<2 x s32>)
This has been tightened up to check that the source and destination
element types match, and some MIR test cases updated accordingly.

Differential Revision: https://reviews.llvm.org/D111132
shiltian pushed a commit that referenced this pull request Nov 25, 2022
…-seh.mm (NFC)"

This reverts commit 01023bf. The extended test now triggers undefined behavior:
```
/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Transforms/ObjCARC/ObjCARCOpts.cpp:577:41: runtime error: load of value 180, which is not a valid value for type 'bool'
    #0 0xaaaae3333a30 in hasCFGChanged /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Transforms/ObjCARC/ObjCARCOpts.cpp:577:41
    #1 0xaaaae3333a30 in llvm::ObjCARCOptPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Transforms/ObjCARC/ObjCARCOpts.cpp:2494:26
    ...
```
shiltian pushed a commit that referenced this pull request Nov 25, 2022
Casting a pointer to a suitably large integral type by reinterpret-cast
should result in the same value as by using the `__builtin_bit_cast()`.
The compiler exploits this: https://godbolt.org/z/zMP3sG683

However, the analyzer does not bind the same symbolic value to these
expressions, resulting in weird situations, such as failing equality
checks and even results in crashes: https://godbolt.org/z/oeMP7cj8q

Previously, in the `RegionStoreManager::getBinding()` even if `T` was
non-null, we replaced it with `TVR->getValueType()` in case the `MR` was
`TypedValueRegion`.
It doesn't make much sense to auto-detect the type if the type is
already given. By not doing the auto-detection, we would just do the
right thing and perform the load by that type.
This means that we will cast the value to that type.

So, in this patch, I'm proposing to do auto-detection only if the type
was null.

Here is a snippet of code, annotated by the previous and new dump values.
`LocAsInteger` should wrap the `SymRegion`, since we want to load the
address as if it was an integer.
In none of the following cases should type auto-detection be triggered,
hence we should eventually reach an `evalCast()` to lazily cast the loaded
value into that type.

```lang=C++
void LValueToRValueBitCast_dumps(void *p, char (*array)[8]) {
  clang_analyzer_dump(p);     // remained: &SymRegion{reg_$0<void * p>}
  clang_analyzer_dump(array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>}
  clang_analyzer_dump((unsigned long)p);
  // remained: {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
  clang_analyzer_dump(__builtin_bit_cast(unsigned long, p));     <--------- change #1
  // previously: {{&SymRegion{reg_$0<void * p>}}}
  // now:        {{&SymRegion{reg_$0<void * p>} [as 64 bit integer]}}
  clang_analyzer_dump((unsigned long)array); // remained: {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
  clang_analyzer_dump(__builtin_bit_cast(unsigned long, array)); <--------- change #2
  // previously: {{&SymRegion{reg_$1<char (*)[8] array>}}}
  // now:        {{&SymRegion{reg_$1<char (*)[8] array>} [as 64 bit integer]}}
}
```

Reviewed By: xazax.hun

Differential Revision: https://reviews.llvm.org/D136603
shiltian pushed a commit that referenced this pull request Dec 9, 2022
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Add initial revision of assignment tracking analysis pass
---------------------------------------------------------
This patch squashes five individually reviewed patches into one:

    #1 https://reviews.llvm.org/D136320
    #2 https://reviews.llvm.org/D136321
    #3 https://reviews.llvm.org/D136325
    #4 https://reviews.llvm.org/D136331
    #5 https://reviews.llvm.org/D136335

Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The
two subsequent patches modify those files only. Patch #4 plumbs the analysis
into SelectionDAG, and patch #5 is a collection of tests for the analysis as
a whole.

The analysis was broken up into smaller chunks for review purposes but for the
most part the tests were written using the whole analysis. It would be possible
to break up the tests for patches #1 through #3 for the purpose of landing the
patches seperately. However, most them would require an update for each
patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is
required by all of the tests.

If there is build-bot trouble, we might try a different landing sequence.

Analysis problem and goal
-------------------------

Variables values can be stored in memory, or available as SSA values, or both.
Using the Assignment Tracking metadata, it's not possible to determine a
variable location just by looking at a debug intrinsic in
isolation. Instructions without any metadata can change the location of a
variable. The meaning of dbg.assign intrinsics changes depending on whether
there are linked instructions, and where they are relative to those
instructions. So we need to analyse the IR and convert the embedded information
into a form that SelectionDAG can consume to produce debug variable locations
in MIR.

The solution is a dataflow analysis which, aiming to maximise the memory
location coverage for variables, outputs a mapping of instruction positions to
variable location definitions.

API usage
---------

The analysis is named `AssignmentTrackingAnalysis`. It is added as a required
pass for SelectionDAGISel when assignment tracking is enabled.

The results of the analysis are exposed via `getResults` using the returned
`const FunctionVarLocs *`'s const methods:

    const VarLocInfo *single_locs_begin() const;
    const VarLocInfo *single_locs_end() const;
    const VarLocInfo *locs_begin(const Instruction *Before) const;
    const VarLocInfo *locs_end(const Instruction *Before) const;
    void print(raw_ostream &OS, const Function &Fn) const;

Debug intrinsics can be ignored after running the analysis. Instead, variable
location definitions that occur between an instruction `Inst` and its
predecessor (or block start) can be found by looping over the range:

    locs_begin(Inst), locs_end(Inst)

Similarly, variables with a memory location that is valid for their lifetime
can be iterated over using the range:

    single_locs_begin(), single_locs_end()

Further detail
--------------

For an explanation of the dataflow implementation and the integration with
SelectionDAG, please see the reviews linked at the top of this commit message.

Reviewed By: jmorse
shiltian pushed a commit that referenced this pull request Jan 14, 2023
In RegisterInfos_loongarch64.h, r22 is defined twice. Having an extra array
member causes problems reading and writing registers defined after r22. So,
for r22, keep the alias fp, delete the s9 alias.

The PC register is incorrectly accessed when the step command is executed.
The step command behavior is incorrect.

This test reflects this problem:

```
loongson@linux:~$ cat test.c

 #include <stdio.h>

int func(int a) {
  return a + 1;
}

int main(int argc, char const *argv[]) {
  func(10);
  return 0;
}

loongson@linux:~$ clang -g test.c  -o test

```

Without this patch:
```
loongson@linux:~$ llvm-project/llvm/build/bin/lldb test
(lldb) target create "test"
Current executable set to '/home/loongson/test' (loongarch64).
(lldb) b main
Breakpoint 1: where = test`main + 40 at test.c:8:3, address = 0x0000000120000668
(lldb) r
Process 278049 launched: '/home/loongson/test' (loongarch64)
Process 278049 stopped
* thread #1, name = 'test', stop reason = breakpoint 1.1
    frame #0: 0x0000000120000668 test`main(argc=1, argv=0x00007fffffff72a8) at test.c:8:3
   5   	}
   6
   7   	int main(int argc, char const *argv[]) {
-> 8   	  func(10);
   9   	  return 0;
   10  	}
   11
(lldb) s
Process 278049 stopped
* thread #1, name = 'test', stop reason = step in
    frame #0: 0x0000000120000670 test`main(argc=1, argv=0x00007fffffff72a8) at test.c:9:3
   6
   7   	int main(int argc, char const *argv[]) {
   8   	  func(10);
-> 9   	  return 0;
   10  	}

```

With this patch:

```
loongson@linux:~$ llvm-project/llvm/build/bin/lldb test
(lldb) target create "test"
Current executable set to '/home/loongson/test' (loongarch64).
(lldb) b main
Breakpoint 1: where = test`main + 40 at test.c:8:3, address = 0x0000000120000668
(lldb) r
Process 278632 launched: '/home/loongson/test' (loongarch64)
Process 278632 stopped
* thread #1, name = 'test', stop reason = breakpoint 1.1
    frame #0: 0x0000000120000668 test`main(argc=1, argv=0x00007fffffff72a8) at test.c:8:3
   5   	}
   6
   7   	int main(int argc, char const *argv[]) {
-> 8   	  func(10);
   9   	  return 0;
   10  	}
   11
(lldb) s
Process 278632 stopped
* thread #1, name = 'test', stop reason = step in
    frame #0: 0x0000000120000624 test`func(a=10) at test.c:4:10
   1   	 #include <stdio.h>
   2
   3   	int func(int a) {
-> 4   	  return a + 1;
   5   	}

```

Reviewed By: SixWeining, DavidSpickett

Differential Revision: https://reviews.llvm.org/D140615
shiltian pushed a commit that referenced this pull request Jan 17, 2023
When building/testing ASan inside the GCC tree on Solaris while using GNU
`ld` instead of Solaris `ld`, a large number of tests SEGVs on both sparc
and x86 like this:

  Thread 2 received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 1 (LWP 1)]
  0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t>
(a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at
sanitizer_common/sanitizer_atomic_clang_x86.h:46
  46	      v = a->val_dont_use;
  1: x/i $pc
  => 0xfe014cfc
<_ZN11__sanitizer11atomic_loadINS_16atomic_uintptr_tEEENT_4TypeEPVKS2_NS_12memory_orderE+62>:
mov (%eax),%eax
  (gdb) bt
  #0 0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t>
(a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at
sanitizer_common/sanitizer_atomic_clang_x86.h:46
  #1 0xfe0bd1d7 in __sanitizer::DTLS_NextBlock (cur=0xfc602a58) at
sanitizer_common/sanitizer_tls_get_addr.cpp:53
  #2 0xfe0bd319 in __sanitizer::DTLS_Find (id=1) at
sanitizer_common/sanitizer_tls_get_addr.cpp:77
  #3 0xfe0bd466 in __sanitizer::DTLS_on_tls_get_addr (arg_void=0xfeffd068,
res=0xfe602a18, static_tls_begin=0, static_tls_end=0) at
sanitizer_common/sanitizer_tls_get_addr.cpp:116
  #4 0xfe063f81 in __interceptor___tls_get_addr (arg=0xfeffd068) at
sanitizer_common/sanitizer_common_interceptors.inc:5501
  #5 0xfe0a3054 in __sanitizer::CollectStaticTlsBlocks (info=0xfeffd108,
size=40, data=0xfeffd16c) at
sanitizer_common/sanitizer_linux_libcdep.cpp:366
  #6  0xfe6ba9fa in dl_iterate_phdr () from /usr/lib/ld.so.1
  #7 0xfe0a3132 in __sanitizer::GetStaticTlsBoundary (addr=0xfe608020,
size=0xfeffd244, align=0xfeffd1b0) at
sanitizer_common/sanitizer_linux_libcdep.cpp:382
  #8 0xfe0a33f7 in __sanitizer::GetTls (addr=0xfe608020, size=0xfeffd244)
at sanitizer_common/sanitizer_linux_libcdep.cpp:482
  llvm#9 0xfe0a34b1 in __sanitizer::GetThreadStackAndTls (main=true,
stk_addr=0xfe608010, stk_size=0xfeffd240, tls_addr=0xfe608020,
tls_size=0xfeffd244) at sanitizer_common/sanitizer_linux_libcdep.cpp:565

The address being accessed is unmapped.  However, even when the tests
`PASS` with Solaris `ld`, `ASAN_OPTIONS=verbosity=2` shows

  ==6582==__tls_get_addr: Can't guess glibc version

Given that that the code is stricly `glibc`-specific according to
`sanitizer_tls_get_addr.h`, there seems little point in using the
interceptor on non-`glibc` targets.

That's what this patch does.  Tested on `i386-pc-solaris2.11` and
`sparc-sun-solaris2.11` inside the GCC tree.

Differential Revision: https://reviews.llvm.org/D141385
shiltian pushed a commit that referenced this pull request Jan 26, 2023
Change https://reviews.llvm.org/D140059 exposed the following crash in
Z3Solver, where bit widths were not checked consistently with that
change. This change makes the check consistent, and fixes the crash.

```
clang: <root>/llvm/include/llvm/ADT/APSInt.h:99:
  int64_t llvm::APSInt::getExtValue() const: Assertion
  `isRepresentableByInt64() && "Too many bits for int64_t"' failed.
...
Stack dump:
0. Program arguments: clang -cc1 -internal-isystem <root>/lib/clang/16/include
  -nostdsysteminc -analyze -analyzer-checker=core,unix.Malloc,debug.ExprInspection
  -analyzer-config crosscheck-with-z3=true -verify reproducer.c

 #0 0x00000000045b3476 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int)
  <root>/llvm/lib/Support/Unix/Signals.inc:567:22
 #1 0x00000000045b3862 PrintStackTraceSignalHandler(void*)
  <root>/llvm/lib/Support/Unix/Signals.inc:641:1
 #2 0x00000000045b14a5 llvm::sys::RunSignalHandlers()
  <root>/llvm/lib/Support/Signals.cpp:104:20
 #3 0x00000000045b2eb4 SignalHandler(int)
  <root>/llvm/lib/Support/Unix/Signals.inc:412:1
 ...
 llvm#9 0x0000000004be2eb3 llvm::APSInt::getExtValue() const
  <root>/llvm/include/llvm/ADT/APSInt.h:99:5
  <root>/llvm/lib/Support/Z3Solver.cpp:740:53
  clang::ASTContext&, clang::ento::SymExpr const*, llvm::APSInt const&, llvm::APSInt const&, bool)
  <root>/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SMTConv.h:552:61
```

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D142627
shiltian pushed a commit that referenced this pull request Jan 27, 2023
…ak ordering

`std::sort` requires a comparison operator that obides by strict weak
ordering. `operator<=` on pointer does not and leads to undefined
behaviour. Specifically, when we grow the `scratch_type_systems` vector
slightly larger (and thus take `std::sort` down a slightly different
codepath), we segfault. This happened while working on a patch that
would in fact grow this vector. In such a case ASAN reports:

```
$ ./bin/lldb ./lldb-test-build.noindex/lang/cpp/complete-type-check/TestCppIsTypeComplete.test_builtin_types/a.out -o "script -- lldb.target.FindFirstType(\"void\")"
(lldb) script -- lldb.target.FindFirstType("void")
=================================================================
==59975==ERROR: AddressSanitizer: container-overflow on address 0x000108f6b510 at pc 0x000280177b4c bp 0x00016b7d7430 sp 0x00016b7d7428
READ of size 8 at 0x000108f6b510 thread T0
    #0 0x280177b48 in std::__1::shared_ptr<lldb_private::TypeSystem>::shared_ptr[abi:v15006](std::__1::shared_ptr<lldb_private::TypeSystem> const&)+0xb4 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x177b48)
(BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #1 0x280dcc008 in void std::__1::__introsort<std::__1::_ClassicAlgPolicy, lldb_private::Target::GetScratchTypeSystems(bool)::$_3&, std::__1::shared_ptr<lldb_private::TypeSystem>*>(std::__1::shared_ptr<lldb_private::TypeSystem>*, std::__1::shared_
ptr<lldb_private::TypeSystem>*, lldb_private::Target::GetScratchTypeSystems(bool)::$_3&, std::__1::iterator_traits<std::__1::shared_ptr<lldb_private::TypeSystem>*>::difference_type)+0x1050 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblld
b.17.0.0git.dylib:arm64+0xdcc008) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #2 0x280d88788 in lldb_private::Target::GetScratchTypeSystems(bool)+0x5a4 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xd88788) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #3 0x28021f0b4 in lldb::SBTarget::FindFirstType(char const*)+0x624 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x21f0b4) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #4 0x2804e9590 in _wrap_SBTarget_FindFirstType(_object*, _object*)+0x26c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x4e9590) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #5 0x1062d3ad4 in cfunction_call+0x5c (/opt/homebrew/Cellar/[email protected]/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0xcfad4) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)

<--- snipped --->

0x000108f6b510 is located 400 bytes inside of 512-byte region [0x000108f6b380,0x000108f6b580)
allocated by thread T0 here:
    #0 0x105209414 in wrap__Znwm+0x74 (/Applications/Xcode2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/14.0.3/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:arm64e+0x51414) (BuildId: 0a44828ceb64337bbfff60b22cd838f0320000
00200000000100000000000b00)
    #1 0x280dca3b4 in std::__1::__split_buffer<std::__1::shared_ptr<lldb_private::TypeSystem>, std::__1::allocator<std::__1::shared_ptr<lldb_private::TypeSystem>>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<std::__1::shared_pt
r<lldb_private::TypeSystem>>&)+0x11c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xdca3b4) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #2 0x280dc978c in void std::__1::vector<std::__1::shared_ptr<lldb_private::TypeSystem>, std::__1::allocator<std::__1::shared_ptr<lldb_private::TypeSystem>>>::__push_back_slow_path<std::__1::shared_ptr<lldb_private::TypeSystem> const&>(std::__1::s
hared_ptr<lldb_private::TypeSystem> const&)+0x13c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xdc978c) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #3 0x280d88dec in std::__1::vector<std::__1::shared_ptr<lldb_private::TypeSystem>, std::__1::allocator<std::__1::shared_ptr<lldb_private::TypeSystem>>>::push_back[abi:v15006](std::__1::shared_ptr<lldb_private::TypeSystem> const&)+0x80 (/Users/mic
haelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xd88dec) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #4 0x280d8857c in lldb_private::Target::GetScratchTypeSystems(bool)+0x398 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xd8857c) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #5 0x28021f0b4 in lldb::SBTarget::FindFirstType(char const*)+0x624 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x21f0b4) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #6 0x2804e9590 in _wrap_SBTarget_FindFirstType(_object*, _object*)+0x26c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x4e9590) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #7 0x1062d3ad4 in cfunction_call+0x5c (/opt/homebrew/Cellar/[email protected]/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0xcfad4) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)
    #8 0x10627fff0 in _PyObject_MakeTpCall+0x7c (/opt/homebrew/Cellar/[email protected]/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0x7bff0) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)
    llvm#9 0x106378a98 in _PyEval_EvalFrameDefault+0xbcf8 (/opt/homebrew/Cellar/[email protected]/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0x174a98) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)
```

Differential Revision: https://reviews.llvm.org/D142709
shiltian pushed a commit that referenced this pull request Feb 1, 2023
… -analyzer-config

I am working on another patch that changes StringMap's hash function,
which changes the iteration order here, and breaks some tests,
specifically:

    clang/test/Analysis/NSString.m
    clang/test/Analysis/shallow-mode.m

with errors like:

    generated arguments do not match in round-trip
    generated arguments #1 in round-trip: <...> "-analyzer-config" "ipa=inlining" "-analyzer-config" "max-nodes=75000" <...>
    generated arguments #2 in round-trip: <...> "-analyzer-config" "max-nodes=75000" "-analyzer-config" "ipa=inlining" <...>

To avoid this, sort the options by key, instead of using the default map
iteration order.

Reviewed By: jansvoboda11, MaskRay

Differential Revision: https://reviews.llvm.org/D142861
shiltian pushed a commit that referenced this pull request Feb 8, 2023
This reverts commit d768b97.

Causes sanitizer failure: https://lab.llvm.org/buildbot/#/builders/238/builds/1114

```
/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Support/xxhash.cpp:107:12: runtime error: applying non-zero offset 8 to null pointer
    #0 0xaaaab28ec6c8 in llvm::xxHash64(llvm::StringRef) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Support/xxhash.cpp:107:12
    #1 0xaaaab28cbd38 in llvm::StringMapImpl::LookupBucketFor(llvm::StringRef) /b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Support/StringMap.cpp:87:28
```

Probably causes test failure in `warn-unsafe-buffer-usage-fixits-local-var-span.cpp`: https://lab.llvm.org/buildbot/#/builders/60/builds/10619

Probably causes reverse-iteration test failure in `test-output-format.ll`: https://lab.llvm.org/buildbot/#/builders/54/builds/3545
shiltian pushed a commit that referenced this pull request Mar 6, 2023
For example, if you have a chain of inlined funtions like this:

   1 #include <stdlib.h>
   2 int g1 = 4, g2 = 6;
   3
   4 static inline void bar(int q) {
   5   if (q > 5)
   6     abort();
   7 }
   8
   9 static inline void foo(int q) {
  10   bar(q);
  11 }
  12
  13 int main() {
  14   foo(g1);
  15   foo(g2);
  16   return 0;
  17 }

with optimizations you could end up with a single abort call for the two
inlined instances of foo(). When merging the locations for those inlined
instances you would previously end up with a 0:0 location in main().
Leaving out that inlined chain from the location for the abort call
could make troubleshooting difficult in some cases.

This patch changes DILocation::getMergedLocation() to try to handle such
cases. The function is rewritten to first find a common starting point
for the two locations (same subprogram and inlined-at location), and
then in reverse traverses the inlined-at chain looking for matches in
each subprogram. For each subprogram, the merge function will find the
nearest common scope for the two locations, and matching line and
column (or set them to 0 if not matching).

In the example above, you will for the abort call get a location in
bar() at 6:5, inlined in foo() at 10:3, inlined in main() at 0:0 (since
the two inlined functions are on different lines, but in the same
scope).

I have not seen anything in the DWARF standard that would disallow
inlining a non-zero location at 0:0 in the inlined-at function, and both
LLDB and GDB seem to accept these locations (with D142552 needed for
LLDB to handle cases where the file, line and column number are all 0).
One incompatibility with GDB is that it seems to ignore 0-line locations
in some cases, but I am not aware of any specific issue that this patch
produces related to that.

With x86-64 LLDB (trunk) you previously got:

  frame #0: 0x00007ffff7a44930 libc.so.6`abort
  frame #1: 0x00005555555546ec a.out`main at merge.c:0

and will now get:

  frame #0: 0x[...] libc.so.6`abort
  frame #1: 0x[...] a.out`main [inlined] bar(q=<unavailable>) at merge.c:6:5
  frame #2: 0x[...] a.out`main [inlined] foo(q=<unavailable>) at merge.c:10:3
  frame #3: 0x[...] a.out`main at merge.c:0

and with x86-64 GDB (11.1) you will get:

  (gdb) bt
  #0  0x00007ffff7a44930 in abort () from /lib64/libc.so.6
  #1  0x00005555555546ec in bar (q=<optimized out>) at merge.c:6
  #2  foo (q=<optimized out>) at merge.c:10
  #3  0x00005555555546ec in main ()

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D142556
shiltian pushed a commit that referenced this pull request Mar 21, 2023
Previously we only looked at the si_signo field, so you got:
```
(lldb) bt
* thread #1, name = 'a.out.mte', stop reason = signal SIGSEGV
  * frame #0: 0x00000000004007f4
```
This patch adds si_code so we can show:
```
(lldb) bt
* thread #1, name = 'a.out.mte', stop reason = signal SIGSEGV: sync tag check fault
  * frame #0: 0x00000000004007f4
```

The order of errno and code was incorrect in ElfLinuxSigInfo::Parse.
It was the order that a "swapped" siginfo arch would use, which for Linux,
is only MIPS. We removed MIPS Linux support some time ago.

See:
https://github.com/torvalds/linux/blob/fe15c26ee26efa11741a7b632e9f23b01aca4cc6/include/uapi/asm-generic/siginfo.h#L121

A test is added using memory tagging faults. Which were the original
motivation for the changes.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D146045
shiltian pushed a commit that referenced this pull request Mar 22, 2023
This change prevents rare deadlocks observed for specific macOS/iOS GUI
applications which issue many `dlopen()` calls from multiple different
threads at startup and where TSan finds and reports a race during
startup.  Providing a reliable test for this has been deemed infeasible.

Although I've only observed this deadlock on Apple platforms,
conceptually the cause is not confined to Apple code so the fix lives in
platform-independent code.

Deadlock scenario:
```
Thread 2                    | Thread 4
ReportRace()                |
Lock internal TSan mutexes  |
  &ctx->slot_mtx            |
                            | dlopen() interceptor
                            | OnLibraryLoaded()
                            | MemoryMappingLayout::DumpListOfModules()
                            | calls dyld API, which takes internal lock
                            | lock() interceptor
                            | TSan tries to take internal mutexes again
                            |   &ctx->slot_mtx
call into symbolizer        |
MemoryMappingLayout::DumpListOfModules()
calls dyld API, which hangs on trying to take lock
```
Resulting in:
* Thread 2 has internal TSan mutex, blocked on dyld lock
* Thread 4 has dyld lock, blocked on internal TSan mutex

The fix prevents this situation by not intercepting any of the calls
originating from `MemoryMappingLayout::DumpListOfModules()`.

Stack traces for deadlock between ReportRace() and dlopen() interceptor:
```
thread #2, queue = 'com.apple.root.default-qos'
  frame #0: libsystem_kernel.dylib
  frame #1: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=<unavailable>, options=<unavailable>) at tsan_interceptors_mac.cpp:306:3
  frame #2: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814523c0) block_pointer) at DyldRuntimeState.cpp:227:28
  frame #3: dyld`dyld4::APIs::_dyld_get_image_header(this=0x0000000101012a20, imageIndex=614) at DyldAPIs.cpp:240:11
  frame #4: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::CurrentImageHeader(this=<unavailable>) at sanitizer_procmaps_mac.cpp:391:35
  frame #5: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f2a2800, segment=0x000000016f2a2738) at sanitizer_procmaps_mac.cpp:397:51
  frame #6: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f2a2800, modules=0x00000001011000a0) at sanitizer_procmaps_mac.cpp:460:10
  frame #7: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x00000001011000a0) at sanitizer_mac.cpp:610:18
  frame #8: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(unsigned long) [inlined] __sanitizer::Symbolizer::RefreshModules(this=0x0000000101100078) at sanitizer_symbolizer_libcdep.cpp:185:12
  frame llvm#9: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(this=0x0000000101100078, address=6465454512) at sanitizer_symbolizer_libcdep.cpp:204:5
  frame llvm#10: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::SymbolizePC(this=0x0000000101100078, addr=6465454512) at sanitizer_symbolizer_libcdep.cpp:88:15
  frame llvm#11: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeCode(addr=6465454512) at tsan_symbolize.cpp:106:35
  frame llvm#12: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeStack(trace=StackTrace @ 0x0000600002d66d00) at tsan_rtl_report.cpp:112:28
  frame llvm#13: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedReportBase::AddMemoryAccess(this=0x000000016f2a2a90, addr=4381057136, external_tag=<unavailable>, s=<unavailable>, tid=<unavailable>, stack=<unavailable>, mset=0x00000001012fc310) at tsan_rtl_report.cpp:190:16
  frame llvm#14: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=<unavailable>, old=<unavailable>, typ0=1) at tsan_rtl_report.cpp:795:9
  frame llvm#15: libclang_rt.tsan_osx_dynamic.dylib`__tsan::DoReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=Shadow @ x22, old=Shadow @ 0x0000600002d6b4f0, typ=1) at tsan_rtl_access.cpp:166:3
  frame llvm#16: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) at tsan_rtl_access.cpp:220:5
  frame llvm#17: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) [inlined] __tsan::MemoryAccess(thr=0x00000001012fc000, pc=<unavailable>, addr=<unavailable>, size=8, typ=1) at tsan_rtl_access.cpp:442:3
  frame llvm#18: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(addr=<unavailable>) at tsan_interface.inc:34:3
  <call into TSan from from instrumented code>

thread #4, queue = 'com.apple.dock.fullscreen'
  frame #0:  libsystem_kernel.dylib
  frame #1:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::FutexWait(p=<unavailable>, cmp=<unavailable>) at sanitizer_mac.cpp:540:3
  frame #2:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Semaphore::Wait(this=<unavailable>) at sanitizer_mutex.cpp:35:7
  frame #3:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Mutex::Lock(this=0x0000000102992a80) at sanitizer_mutex.h:196:18
  frame #4:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:383:10
  frame #5:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:382:77
  frame #6:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() at tsan_rtl.h:708:10
  frame #7:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::TryTraceFunc(thr=0x000000010f084000, pc=0) at tsan_rtl.h:751:7
  frame #8:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::FuncExit(thr=0x000000010f084000) at tsan_rtl.h:798:7
  frame llvm#9:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=0x000000016f3ba280) at tsan_interceptors_posix.cpp:300:5
  frame llvm#10: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=<unavailable>) at tsan_interceptors_posix.cpp:293:41
  frame llvm#11: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=0x000000016f21b1e8, options=OS_UNFAIR_LOCK_NONE) at tsan_interceptors_mac.cpp:310:1
  frame llvm#12: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814525d4) block_pointer) at DyldRuntimeState.cpp:227:28
  frame llvm#13: dyld`dyld4::APIs::_dyld_get_image_vmaddr_slide(this=0x0000000101012a20, imageIndex=412) at DyldAPIs.cpp:273:11
  frame llvm#14: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(__sanitizer::MemoryMappedSegment*) at sanitizer_procmaps_mac.cpp:286:17
  frame llvm#15: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f3ba560, segment=0x000000016f3ba498) at sanitizer_procmaps_mac.cpp:432:15
  frame llvm#16: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f3ba560, modules=0x000000016f3ba618) at sanitizer_procmaps_mac.cpp:460:10
  frame llvm#17: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x000000016f3ba618) at sanitizer_mac.cpp:610:18
  frame llvm#18: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::LibIgnore::OnLibraryLoaded(this=0x0000000101f3aa40, name="<some library>") at sanitizer_libignore.cpp:54:11
  frame llvm#19: libclang_rt.tsan_osx_dynamic.dylib`::wrap_dlopen(filename="<some library>", flag=<unavailable>) at sanitizer_common_interceptors.inc:6466:3
  <library code>
```

rdar://106766395

Differential Revision: https://reviews.llvm.org/D146593
shiltian pushed a commit that referenced this pull request May 5, 2023
…callback

The `TypeSystemMap::m_mutex` guards against concurrent modifications
of members of `TypeSystemMap`. In particular, `m_map`.

`TypeSystemMap::ForEach` iterates through the entire `m_map` calling
a user-specified callback for each entry. This is all done while
`m_mutex` is locked. However, there's nothing that guarantees that
the callback itself won't call back into `TypeSystemMap` APIs on the
same thread. This lead to double-locking `m_mutex`, which is undefined
behaviour. We've seen this cause a deadlock in the swift plugin with
following backtrace:

```

int main() {
    std::unique_ptr<int> up = std::make_unique<int>(5);

    volatile int val = *up;
    return val;
}

clang++ -std=c++2a -g -O1 main.cpp

./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b
```

```
frame #4: std::lock_guard<std::mutex>::lock_guard
frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame llvm#26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame llvm#27: lldb_private::SwiftASTContext::LoadModule
frame llvm#30: swift::ModuleDecl::collectLinkLibraries
frame llvm#31: lldb_private::SwiftASTContext::LoadModule
frame llvm#34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame llvm#35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame llvm#36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame llvm#37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame llvm#38: lldb_private::Target::GetPersistentSymbol
frame llvm#41: lldb_private::TypeSystemMap::ForEach                 <<<< Lock #1
frame llvm#42: lldb_private::Target::GetPersistentSymbol
frame llvm#43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame llvm#44: lldb_private::IRExecutionUnit::FindSymbol
frame llvm#45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame llvm#46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame llvm#47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame llvm#48: llvm::LinkingSymbolResolver::findSymbol
frame llvm#49: llvm::LegacyJITSymbolResolver::lookup
frame llvm#50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame llvm#51: llvm::RuntimeDyldImpl::resolveRelocations
frame llvm#52: llvm::MCJIT::finalizeLoadedModules
frame llvm#53: llvm::MCJIT::finalizeObject
frame llvm#54: lldb_private::IRExecutionUnit::ReportAllocations
frame llvm#55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame llvm#56: lldb_private::ClangExpressionParser::PrepareForExecution
frame llvm#57: lldb_private::ClangUserExpression::TryParse
frame llvm#58: lldb_private::ClangUserExpression::Parse
```

Our solution is to simply iterate over a local copy of `m_map`.

**Testing**

* Confirmed on manual reproducer (would reproduce 100% of the time
  before the patch)

Differential Revision: https://reviews.llvm.org/D149949
shiltian pushed a commit that referenced this pull request May 25, 2023
…est unittest

Need to finalize the DIBuilder to avoid leak sanitizer errors
like this:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x55c99ea1761d in operator new(unsigned long)
    #1 0x55c9a518ae49 in operator new
    #2 0x55c9a518ae49 in llvm::MDTuple::getImpl(...)
    #3 0x55c9a4f1b1ec in getTemporary
    #4 0x55c9a4f1b1ec in llvm::DIBuilder::createFunction(...)
shiltian pushed a commit that referenced this pull request May 26, 2023
The motivation for this change is a workload generated by the XLA compiler
targeting nvidia GPUs.

This kernel has a few hundred i8 loads and stores.  Merging is critical for
performance.

The current LSV doesn't merge these well because it only considers instructions
within a block of 64 loads+stores.  This limit is necessary to contain the
O(n^2) behavior of the pass.  I'm hesitant to increase the limit, because this
pass is already one of the slowest parts of compiling an XLA program.

So we rewrite basically the whole thing to use a new algorithm.  Before, we
compared every load/store to every other to see if they're consecutive.  The
insight (from tra@) is that this is redundant.  If we know the offset from PtrA
to PtrB, then we don't need to compare PtrC to both of them in order to tell
whether C may be adjacent to A or B.

So that's what we do.  When scanning a basic block, we maintain a list of
chains, where we know the offset from every element in the chain to the first
element in the chain.  Each instruction gets compared only to the leaders of
all the chains.

In the worst case, this is still O(n^2), because all chains might be of length
1.  To prevent compile time blowup, we only consider the 64 most recently used
chains.  Thus we do no more comparisons than before, but we have the potential
to make much longer chains.

This rewrite affects many tests.  The changes to tests fall into two
categories.

1. The old code had what appears to be a bug when deciding whether a misaligned
   vectorized load is fast.  Suppose TTI reports that load <i32 x 4> align 4
   has relative speed 1, and suppose that load i32 align 4 has relative speed
   32.

   The intent of the code seems to be that we prefer the scalar load, because
   it's faster.  But the old code would choose the vectorized load.
   accessIsMisaligned would set RelativeSpeed to 0 for the scalar load (and not
   even call into TTI to get the relative speed), because the scalar load is
   aligned.

   After this patch, we will prefer the scalar load if it's faster.

2. This patch changes the logic for how we vectorize.  Usually this results in
   vectorizing more.

Explanation of changes to tests:

 - AMDGPU/adjust-alloca-alignment.ll: #1
 - AMDGPU/flat_atomic.ll: #2, we vectorize more.
 - AMDGPU/int_sideeffect.ll: #2, there are two possible locations for the call to @foo, and the pass is brittle to this.  Before, we'd vectorize in case 1 and not case 2.  Now we vectorize in case 2 and not case 1.  So we just move the call.
 - AMDGPU/adjust-alloca-alignment.ll: #2, we vectorize more
 - AMDGPU/insertion-point.ll: #2 we vectorize more
 - AMDGPU/merge-stores-private.ll: #1 (undoes changes from git rev 86f9117, which appear to have hit the bug from #1)
 - AMDGPU/multiple_tails.ll: #1
 - AMDGPU/vect-ptr-ptr-size-mismatch.ll: Fix alignment (I think related to #1 above).
 - AMDGPU CodeGen: I have difficulty commenting on these changes, but many of them look like #2, we vectorize more.
 - NVPTX/4x2xhalf.ll: Fix alignment (I think related to #1 above).
 - NVPTX/vectorize_i8.ll: We don't generate <3 x i8> vectors on NVPTX because they're not legal (and eventually get split)
 - X86/correct-order.ll: #2, we vectorize more, probably because of changes to the chain-splitting logic.
 - X86/subchain-interleaved.ll: #2, we vectorize more
 - X86/vector-scalar.ll: #2, we can now vectorize scalar float + <1 x float>
 - X86/vectorize-i8-nested-add-inseltpoison.ll: Deleted the nuw test because it was nonsensical.  It was doing `add nuw %v0, -1`, but this is equivalent to `add nuw %v0, 0xffff'ffff`, which is equivalent to asserting that %v0 == 0.
 - X86/vectorize-i8-nested-add.ll: Same as nested-add-inseltpoison.ll

Differential Revision: https://reviews.llvm.org/D149893
shiltian pushed a commit that referenced this pull request Jun 23, 2023
Use hlfir::loadTrivialScalars to dereference pointer, allocatables, and
load numerical and logical scalars.

This has a small fallout on tests:

- load is done on the HLFIR entity (#0 of hlfir.declare) and not the FIR one (#1). This makes no difference at the FIR level (#1 and #0 only differs to account for assumed and explicit shape lower bounds).

- loadTrivialScalars get rids of allocatable fir.box for monomoprhic scalars
  (it is not needed). This exposed a bug in lowering of MERGE with
  a polymorphic and a monomorphic argument: when the monomorphic is not
  a fir.box, the polymorphic fir.class should not be reboxed but its
  address should be read.

Reviewed By: tblah

Differential Revision: https://reviews.llvm.org/D153252
shiltian pushed a commit that referenced this pull request Jun 26, 2023
Allow specifying 'nomerge' attribute for function pointers,
e.g. like in the following C code:

    extern void (*foo)(void) __attribute__((nomerge));
    void bar(long i) {
      if (i)
        foo();
      else
        foo();
    }

With the goal to attach 'nomerge' to both calls done through 'foo':

    @foo = external local_unnamed_addr global ptr, align 8
    define dso_local void @bar(i64 noundef %i) local_unnamed_addr #0 {
      ; ...
      %0 = load ptr, ptr @foo, align 8, !tbaa !5
      ; ...
    if.then:
      tail call void %0() #1
      br label %if.end
    if.else:
      tail call void %0() #1
      br label %if.end
    if.end:
      ret void
    }
    ; ...
    attributes #1 = { nomerge ... }

Report a warning in case if 'nomerge' is specified for a variable that
is not a function pointer, e.g.:

    t.c:2:22: warning: 'nomerge' attribute is ignored because 'j' is not a function pointer [-Wignored-attributes]
        2 | int j __attribute__((nomerge));
          |                      ^

The intended use-case is for BPF backend.

BPF provides a sort of "standard library" functions that are called
helpers. BPF also verifies usage of these helpers before program
execution. Because of limitations of verification / runtime model it
is important to keep calls to some of such helpers from merging.

An example could be found by the link [1], there input C code:

     if (data_end - data > 1024) {
         bpf_for_each_map_elem(&map1, cb, &cb_data, 0);
     } else {
         bpf_for_each_map_elem(&map2, cb, &cb_data, 0);
     }

Is converted to bytecode equivalent to:

     if (data_end - data > 1024)
       tmp = &map1;
     else
       tmp = &map2;
     bpf_for_each_map_elem(tmp, cb, &cb_data, 0);

However, BPF verification/runtime requires to use the same map address
for each particular `bpf_for_each_map_elem()` call.

The 'nomerge' attribute is a perfect match for this situation, but
unfortunately BPF helpers are declared as pointers to functions:

    static long (*bpf_for_each_map_elem)(void *map, ...) = (void *) 164;

Hence, this commit, allowing to use 'nomerge' for function pointers.

[1] https://lore.kernel.org/bpf/[email protected]/

Differential Revision: https://reviews.llvm.org/D152986
shiltian pushed a commit that referenced this pull request Jul 4, 2023
Running this on Amazon Ubuntu the final backtrace is:
```
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
  * frame #0: 0x0000aaaaaaaa07d0 a.out`func_c at main.c:10:3
    frame #1: 0x0000aaaaaaaa07c4 a.out`func_b at main.c:14:3
    frame #2: 0x0000aaaaaaaa07b4 a.out`func_a at main.c:18:3
    frame #3: 0x0000aaaaaaaa07a4 a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:22:3
    frame #4: 0x0000fffff7b373fc libc.so.6`___lldb_unnamed_symbol2962 + 108
    frame #5: 0x0000fffff7b374cc libc.so.6`__libc_start_main + 152
    frame #6: 0x0000aaaaaaaa06b0 a.out`_start + 48
```
This causes the test to fail because of the extra ___lldb_unnamed_symbol2962 frame
(an inlined function?).

To fix this, strictly check all the frames in main.c then for the rest
just check we find __libc_start_main and _start in that order regardless
of other frames in between.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D154204
shiltian pushed a commit that referenced this pull request Jul 11, 2023
The original MFS work D85368 shows good performance improvement with
Instrumented FDO. However, AutoFDO or Flow-Sensitive AutoFDO (FSAFDO)
does not show performance gain. This is mainly caused by a less
accurate profile compared to the iFDO profile.

For the past few months, we have been working to improve FSAFDO
quality, like in D145171. Taking advantage of this improvement, MFS
now shows performance improvements over FSAFDO profiles.

That being said, 2 minor changes need to be made, 1) An FS-AutoFDO
profile generation pass needs to be added right before MFS pass and an
FSAFDO profile load pass is needed when FS-AutoFDO is enabled and the
MFS flag is present. 2) MFS only applies to hot functions, because we
believe (and experiment also shows) FS-AutoFDO is more accurate about
functions that have plenty of samples than those with no or very few
samples.

With this improvement, we see a 1.2% performance improvement in clang
benchmark, 0.9% QPS improvement in our internal search benchmark, and
3%-5% improvement in internal storage benchmark.

This is #1 of the two patches that enables the improvement.

Reviewed By: wenlei, snehasish, xur

Differential Revision: https://reviews.llvm.org/D152399
shiltian pushed a commit that referenced this pull request Jul 12, 2023
…tput

The crash happens in clang::driver::tools::SplitDebugName when Output is
InputInfo::Nothing. It doesn't happen with standalone clang driver because
output is created in Driver::BuildJobsForActionNoCache.

Example backtrace:
```
* thread #1, name = 'clangd', stop reason = hit program assert
  * frame #0: 0x00007ffff5c4eacf libc.so.6`raise + 271
    frame #1: 0x00007ffff5c21ea5 libc.so.6`abort + 295
    frame #2: 0x00007ffff5c21d79 libc.so.6`__assert_fail_base.cold.0 + 15
    frame #3: 0x00007ffff5c47426 libc.so.6`__assert_fail + 70
    frame #4: 0x000055555dc0923c clangd`clang::driver::InputInfo::getFilename(this=0x00007fffffff9398) const at InputInfo.h:84:5
    frame #5: 0x000055555dcd0d8d clangd`clang::driver::tools::SplitDebugName(JA=0x000055555f6c6a50, Args=0x000055555f6d0b80, Input=0x00007fffffff9678, Output=0x00007fffffff9398) at CommonArgs.cpp:1275:40
    frame #6: 0x000055555dc955a5 clangd`clang::driver::tools::Clang::ConstructJob(this=0x000055555f6c69d0, C=0x000055555f6c64a0, JA=0x000055555f6c6a50, Output=0x00007fffffff9398, Inputs=0x00007fffffff9668, Args=0x000055555f6d0b80, LinkingOutput=0x0000000000000000) const at Clang.cpp:5690:33
    frame #7: 0x000055555dbf6b54 clangd`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5618:10
    frame #8: 0x000055555dbf4ef0 clangd`clang::driver::Driver::BuildJobsForAction(this=0x00007fffffffb5e0, C=0x000055555f6c64a0, A=0x000055555f6c6a50, TC=0x000055555f6c4be0, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=1, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:5306:26
    frame llvm#9: 0x000055555dbeb590 clangd`clang::driver::Driver::BuildJobs(this=0x00007fffffffb5e0, C=0x000055555f6c64a0) const at Driver.cpp:4844:5
    frame llvm#10: 0x000055555dbe6b0f clangd`clang::driver::Driver::BuildCompilation(this=0x00007fffffffb5e0, ArgList=ArrayRef<const char *> @ 0x00007fffffffb268) at Driver.cpp:1496:3
    frame llvm#11: 0x000055555b0cc0d9 clangd`clang::createInvocation(ArgList=ArrayRef<const char *> @ 0x00007fffffffbb38, Opts=CreateInvocationOptions @ 0x00007fffffffbb90) at CreateInvocationFromCommandLine.cpp:53:52
    frame llvm#12: 0x000055555b378e7b clangd`clang::clangd::buildCompilerInvocation(Inputs=0x00007fffffffca58, D=0x00007fffffffc158, CC1Args=size=0) at Compiler.cpp:116:44
    frame llvm#13: 0x000055555895a6c8 clangd`clang::clangd::(anonymous namespace)::Checker::buildInvocation(this=0x00007fffffffc760, TFS=0x00007fffffffe570, Contents= Has Value=false ) at Check.cpp:212:9
    frame llvm#14: 0x0000555558959cec clangd`clang::clangd::check(File=(Data = "build/test.cpp", Length = 64), TFS=0x00007fffffffe570, Opts=0x00007fffffffe600) at Check.cpp:486:34
    frame llvm#15: 0x000055555892164a clangd`main(argc=4, argv=0x00007fffffffecd8) at ClangdMain.cpp:993:12
    frame llvm#16: 0x00007ffff5c3ad85 libc.so.6`__libc_start_main + 229
    frame llvm#17: 0x00005555585bbe9e clangd`_start + 46
```

Test Plan: ninja ClangDriverTests && tools/clang/unittests/Driver/ClangDriverTests

Differential Revision: https://reviews.llvm.org/D154602
shiltian pushed a commit that referenced this pull request Jul 24, 2023
BlockDecl should be invalidated because of its invalid ParmVarDecl.

Fixes #1 of llvm#64005

Differential Revision: https://reviews.llvm.org/D155984
shiltian pushed a commit that referenced this pull request Aug 8, 2023
TSan reports the following data race:

  Write of size 4 at 0x000109e0b160 by thread T2 (mutexes: write M0, write M1):
    #0 NativeFile::Close() File.cpp:329
    #1 ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:232
    #2 Communication::Disconnect(lldb_private::Status*) Communication.cpp:61
    #3 process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1164
    #4 Process::SetExitStatus(int, char const*) Process.cpp:1097
    #5 process_gdb_remote::ProcessGDBRemote::MonitorDebugserverProcess(...) ProcessGDBRemote.cpp:3387

  Previous read of size 4 at 0x000109e0b160 by main thread (mutexes: write M2):
    #0 NativeFile::IsValid() const File.h:393
    #1 ConnectionFileDescriptor::IsConnected() const ConnectionFileDescriptorPosix.cpp:121
    #2 Communication::IsConnected() const Communication.cpp:79
    #3 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:256
    #4 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...l) GDBRemoteCommunication.cpp:244
    #5 process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponseNoLock(llvm::StringRef, StringExtractorGDBRemote&) GDBRemoteClientBase.cpp:246

The problem is that in WaitForPacketNoLock's run loop, it checks that
the connection is still connected. This races with the
ConnectionFileDescriptor disconnecting. Most (but not all) access to the
IOObject in ConnectionFileDescriptorPosix is already gated by the mutex.
This patch just protects IsConnected in the same way.

Differential revision: https://reviews.llvm.org/D157347
shiltian pushed a commit that referenced this pull request Aug 9, 2023
TSan reports the following race:

  Write of size 8 at 0x000107707ee8 by main thread:
    #0 lldb_private::ThreadedCommunication::StartReadThread(...) ThreadedCommunication.cpp:175
    #1 lldb_private::Process::SetSTDIOFileDescriptor(...) Process.cpp:4533
    #2 lldb_private::Platform::DebugProcess(...) Platform.cpp:1121
    #3 lldb_private::PlatformDarwin::DebugProcess(...) PlatformDarwin.cpp:711
    #4 lldb_private::Target::Launch(...) Target.cpp:3235
    #5 CommandObjectProcessLaunch::DoExecute(...) CommandObjectProcess.cpp:256
    #6 lldb_private::CommandObjectParsed::Execute(...) CommandObject.cpp:751
    #7 lldb_private::CommandInterpreter::HandleCommand(...) CommandInterpreter.cpp:2054

  Previous read of size 8 at 0x000107707ee8 by thread T5:
    #0 lldb_private::HostThread::IsJoinable(...) const HostThread.cpp:30
    #1 lldb_private::ThreadedCommunication::StopReadThread(...) ThreadedCommunication.cpp:192
    #2 lldb_private::Process::ShouldBroadcastEvent(...) Process.cpp:3420
    #3 lldb_private::Process::HandlePrivateEvent(...) Process.cpp:3728
    #4 lldb_private::Process::RunPrivateStateThread(...) Process.cpp:3914
    #5 std::__1::__function::__func<lldb_private::Process::StartPrivateStateThread(...) function.h:356
    #6 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(...) HostNativeThreadBase.cpp:62
    #7 lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(...) HostThreadMacOSX.mm:18

The problem is the lack of synchronization between starting and stopping
the read thread. This patch fixes that by protecting those operations
with a mutex.

Differential revision: https://reviews.llvm.org/D157361
shiltian pushed a commit that referenced this pull request Mar 14, 2024
…p canonicalization (llvm#84225)

The current canonicalization of `memref.dim` operating on the result of
`memref.reshape` into `memref.load` is incorrect as it doesn't check
whether the `index` operand of `memref.dim` dominates the source
`memref.reshape` op. It always introduces `memref.load` right after
`memref.reshape` to ensure the `memref` is not mutated before the
`memref.load` call. As a result, the following error is observed:

```
$> mlir-opt --canonicalize input.mlir

func.func @reshape_dim(%arg0: memref<*xf32>, %arg1: memref<?xindex>, %arg2: index) -> index {
    %c4 = arith.constant 4 : index
    %reshape = memref.reshape %arg0(%arg1) : (memref<*xf32>, memref<?xindex>) -> memref<*xf32>
    %0 = arith.muli %arg2, %c4 : index
    %dim = memref.dim %reshape, %0 : memref<*xf32>
    return %dim : index
  }
```

results in:

```
dominator.mlir:22:12: error: operand #1 does not dominate this use
    %dim = memref.dim %reshape, %0 : memref<*xf32>
           ^
dominator.mlir:22:12: note: see current operation: %1 = "memref.load"(%arg1, %2) <{nontemporal = false}> : (memref<?xindex>, index) -> index
dominator.mlir:21:10: note: operand defined here (op in the same block)
    %0 = arith.muli %arg2, %c4 : index
```

Properly fixing this issue requires a dominator analysis which is
expensive to run within a canonicalization pattern. So, this patch fixes
the canonicalization pattern by being more strict/conservative about the
legality condition in which we perform this canonicalization.
The more general pattern is also added to `tensor.dim`. Since tensors are
immutable we don't need to worry about where to introduce the
`tensor.extract` call after canonicalization.
shiltian pushed a commit that referenced this pull request Mar 20, 2024
…lvm#85653)

This reverts commit daebe5c.

This commit causes the following asan issue:

```
<snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir | <snip>/llvm-project/build/bin/FileCheck <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# executed command: <snip>/llvm-project/build/bin/mlir-opt <snip>/llvm-project/mlir/test/Dialect/XeGPU/XeGPUOps.mlir
# .---command stderr------------
# | =================================================================
# | ==2772558==ERROR: AddressSanitizer: stack-use-after-return on address 0x7fd2c2c42b90 at pc 0x55e406d54614 bp 0x7ffc810e4070 sp 0x7ffc810e4068
# | READ of size 8 at 0x7fd2c2c42b90 thread T0
# |     #0 0x55e406d54613 in operator()<long int const*> /usr/include/c++/13/bits/predefined_ops.h:318
# |     #1 0x55e406d54613 in __count_if<long int const*, __gnu_cxx::__ops::_Iter_pred<mlir::verifyListOfOperandsOrIntegers(Operation*, llvm::StringRef, unsigned int, llvm::ArrayRef<long int>, ValueRange)::<lambda(int64_t)> > > /usr/include/c++/13/bits/stl_algobase.h:2125
# |     #2 0x55e406d54613 in count_if<long int const*, mlir::verifyListOfOperandsOrIntegers(Operation*, 
...
```
shiltian pushed a commit that referenced this pull request Mar 20, 2024
…oint. (llvm#83821)"

This reverts commit c2c1e6e. It creates
a use after free.

==8342==ERROR: AddressSanitizer: heap-use-after-free on address 0x50f000001760 at pc 0x55b9fb84a8fb bp 0x7ffc18468a10 sp 0x7ffc18468a08
READ of size 1 at 0x50f000001760 thread T0
 #0 0x55b9fb84a8fa in dropPoisonGeneratingFlags llvm/lib/Transforms/Vectorize/VPlan.h:1040:13
 #1 0x55b9fb84a8fa in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>)::$_0::operator()(llvm::VPRecipeBase*) const llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1236:23
 #2 0x55b9fb84a196 in llvm::VPlanTransforms::dropPoisonGeneratingRecipes(llvm::VPlan&, llvm::function_ref<bool (llvm::BasicBlock*)>) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Can be reproduced with asan on
Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll
Transforms/LoopVectorize/X86/pr81872.ll
Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
shiltian pushed a commit that referenced this pull request Apr 29, 2024
Builder alerted me to the failing test, attempt #1 in the blind.
shiltian pushed a commit that referenced this pull request May 10, 2024
…e exception specification of a function (llvm#90760)

[temp.deduct.general] p6 states:
> At certain points in the template argument deduction process it is
necessary to take a function type that makes use of template parameters
and replace those template parameters with the corresponding template
arguments.
This is done at the beginning of template argument deduction when any
explicitly specified template arguments are substituted into the
function type, and again at the end of template argument deduction when
any template arguments that were deduced or obtained from default
arguments are substituted.

[temp.deduct.general] p7 goes on to say:
> The _deduction substitution loci_ are
> - the function type outside of the _noexcept-specifier_,
> - the explicit-specifier,
> - the template parameter declarations, and
> - the template argument list of a partial specialization
>
> The substitution occurs in all types and expressions that are used in
the deduction substitution loci. [...]

Consider the following:
```cpp
struct A
{
    static constexpr bool x = true;
};

template<typename T, typename U>
void f(T, U) noexcept(T::x); // #1

template<typename T, typename U>
void f(T, U*) noexcept(T::y); // #2

template<>
void f<A>(A, int*) noexcept; // clang currently accepts, GCC and EDG reject
```

Currently, `Sema::SubstituteExplicitTemplateArguments` will substitute
into the _noexcept-specifier_ when deducing template arguments from a
function declaration or when deducing template arguments for taking the
address of a function template (and the substitution is treated as a
SFINAE context). In the above example, `#1` is selected as the primary
template because substitution of the explicit template arguments into
the _noexcept-specifier_ of `#2` failed, which resulted in the candidate
being ignored.

This behavior is incorrect ([temp.deduct.general] note 4 says as much), and
this patch corrects it by deferring all substitution into the
_noexcept-specifier_ until it is instantiated.

As part of the necessary changes to make this patch work, the
instantiation of the exception specification of a function template
specialization when taking the address of a function template is changed
to only occur for the function selected by overload resolution per
[except.spec] p13.1 (as opposed to being instantiated for every candidate).
shiltian pushed a commit that referenced this pull request May 10, 2024
…ined member functions & member function templates (llvm#88963)

Consider the following snippet from the discussion of CWG2847 on the core reflector:
```
template<typename T>
concept C = sizeof(T) <= sizeof(long);

template<typename T>
struct A 
{
    template<typename U>
    void f(U) requires C<U>; // #1, declares a function template 

    void g() requires C<T>; // #2, declares a function

    template<>
    void f(char);  // #3, an explicit specialization of a function template that declares a function
};

template<>
template<typename U>
void A<short>::f(U) requires C<U>; // #4, an explicit specialization of a function template that declares a function template

template<>
template<>
void A<int>::f(int); // #5, an explicit specialization of a function template that declares a function

template<>
void A<long>::g(); // #6, an explicit specialization of a function that declares a function
```

A number of problems exist:
- Clang rejects `#4` because the trailing _requires-clause_ has `U`
substituted with the wrong template parameter depth when
`Sema::AreConstraintExpressionsEqual` is called to determine whether it
matches the trailing _requires-clause_ of the implicitly instantiated
function template.
- Clang rejects `#5` because the function template specialization
instantiated from `A<int>::f` has a trailing _requires-clause_, but `#5`
does not (nor can it have one as it isn't a templated function).
- Clang rejects `#6` for the same reasons it rejects `#5`.

This patch resolves these issues by making the following changes:
- To fix `#4`, `Sema::AreConstraintExpressionsEqual` is passed
`FunctionTemplateDecl`s when comparing the trailing _requires-clauses_
of `#4` and the function template instantiated from `#1`.
- To fix `#5` and `#6`, the trailing _requires-clauses_ are not compared
for explicit specializations that declare functions.

In addition to these changes, `CheckMemberSpecialization` now considers
constraint satisfaction/constraint partial ordering when determining
which member function is specialized by an explicit specialization of a
member function for an implicit instantiation of a class template (we
previously would select the first function that has the same type as the
explicit specialization). With constraints taken under consideration, we
match EDG's behavior for these declarations.
shiltian pushed a commit that referenced this pull request May 18, 2024
...which caused issues like

> ==42==ERROR: AddressSanitizer failed to deallocate 0x32 (50) bytes at
address 0x117e0000 (error code: 28)
> ==42==Cannot dump memory map on emscriptenAddressSanitizer: CHECK
failed: sanitizer_common.cpp:81 "((0 && "unable to unmmap")) != (0)"
(0x0, 0x0) (tid=288045824)
> #0 0x14f73b0c in __asan::CheckUnwind()+0x14f73b0c
(this.program+0x14f73b0c)
> #1 0x14f8a3c2 in __sanitizer::CheckFailed(char const*, int, char
const*, unsigned long long, unsigned long long)+0x14f8a3c2
(this.program+0x14f8a3c2)
> #2 0x14f7d6e1 in __sanitizer::ReportMunmapFailureAndDie(void*,
unsigned long, int, bool)+0x14f7d6e1 (this.program+0x14f7d6e1)
> #3 0x14f81fbd in __sanitizer::UnmapOrDie(void*, unsigned
long)+0x14f81fbd (this.program+0x14f81fbd)
> #4 0x14f875df in __sanitizer::SuppressionContext::ParseFromFile(char
const*)+0x14f875df (this.program+0x14f875df)
> #5 0x14f74eab in __asan::InitializeSuppressions()+0x14f74eab
(this.program+0x14f74eab)
> #6 0x14f73a1a in __asan::AsanInitInternal()+0x14f73a1a
(this.program+0x14f73a1a)

when trying to use an ASan suppressions file under Emscripten: Even
though it would be considered OK by SUSv4, the Emscripten runtime states
"We don't support partial munmapping" (see

<emscripten-core/emscripten@f4115eb>
"Implement MAP_ANONYMOUS on top of malloc in STANDALONE_WASM mode
(llvm#16289)").

Co-authored-by: Stephan Bergmann <[email protected]>
shiltian pushed a commit that referenced this pull request May 18, 2024
…ication as used during partial ordering (llvm#91534)

We do not deduce template arguments from the exception specification
when determining the primary template of a function template
specialization or when taking the address of a function template.
Therefore, this patch changes `isAtLeastAsSpecializedAs` such that we do
not mark template parameters in the exception specification as 'used'
during partial ordering (per [temp.deduct.partial]
p12) to prevent the following from being ambiguous:

```
template<typename T, typename U>
void f(U) noexcept(noexcept(T())); // #1

template<typename T>
void f(T*) noexcept; // #2

template<>
void f<int>(int*) noexcept; // currently ambiguous, selects #2 with this patch applied 
```

Although there is no corresponding wording in the standard (see core issue filed here
cplusplus/CWG#537), this seems
to be the intended behavior given the definition of _deduction
substitution loci_ in [temp.deduct.general] p7 (and EDG does the same thing).
shiltian pushed a commit that referenced this pull request May 18, 2024
…erSize (llvm#67657)"

This reverts commit f0b3654.

This commit triggers UB by reading an uninitialized variable.

`UP.PartialThreshold` is used uninitialized in `getUnrollingPreferences()` when
it is called from `LoopVectorizationPlanner::executePlan()`. In this case the
`UP` variable is created on the stack and its fields are not initialized.

```
==8802==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x557c0b081b99 in llvm::BasicTTIImplBase<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h
    #1 0x557c0b07a40c in llvm::TargetTransformInfo::Model<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:2277:17
    #2 0x557c0f5d69ee in llvm::TargetTransformInfo::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) const llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:387:19
    #3 0x557c0e6b96a0 in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7624:7
    #4 0x557c0e6e4b63 in llvm::LoopVectorizePass::processLoop(llvm::Loop*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10253:13
    #5 0x557c0e6f2429 in llvm::LoopVectorizePass::runImpl(llvm::Function&, llvm::ScalarEvolution&, llvm::LoopInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::BlockFrequencyInfo*, llvm::TargetLibraryInfo*, llvm::DemandedBits&, llvm::AssumptionCache&, llvm::LoopAccessInfoManager&, llvm::OptimizationRemarkEmitter&, llvm::ProfileSummaryInfo*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10344:30
    #6 0x557c0e6f2f97 in llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10383:9

[...]

  Uninitialized value was created by an allocation of 'UP' in the stack frame
    #0 0x557c0e6b961e in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7623:3
```
shiltian pushed a commit that referenced this pull request May 18, 2024
…vm#90820)

This solves some ambuguity introduced in P0522 regarding how
template template parameters are partially ordered, and should reduce
the negative impact of enabling `-frelaxed-template-template-args`
by default.

When performing template argument deduction, a template template
parameter
containing no packs should be more specialized than one that does.

Given the following example:
```C++
template<class T2> struct A;
template<template<class ...T3s> class TT1, class T4> struct A<TT1<T4>>; // #1
template<template<class    T5 > class TT2, class T6> struct A<TT2<T6>>; // #2

template<class T1> struct B;
template struct A<B<char>>;
```

Prior to P0522, candidate `#2` would be more specialized.
After P0522, neither is more specialized, so this becomes ambiguous.
With this change, `#2` becomes more specialized again,
maintaining compatibility with pre-P0522 implementations.

The problem is that in P0522, candidates are at least as specialized
when matching packs to fixed-size lists both ways, whereas before,
a fixed-size list is more specialized.

This patch keeps the original behavior when checking template arguments
outside deduction, but restores this aspect of pre-P0522 matching
during deduction.

---

Since this changes provisional implementation of CWG2398 which has
not been released yet, and already contains a changelog entry,
we don't provide a changelog entry here.
shiltian pushed a commit that referenced this pull request May 22, 2024
…llvm#92855)

This solves some ambuguity introduced in P0522 regarding how template
template parameters are partially ordered, and should reduce the
negative impact of enabling `-frelaxed-template-template-args` by
default.

When performing template argument deduction, we extend the provisional
wording introduced in llvm#89807 so
it also covers deduction of class templates.

Given the following example:
```C++
template <class T1, class T2 = float> struct A;
template <class T3> struct B;

template <template <class T4> class TT1, class T5> struct B<TT1<T5>>;   // #1
template <class T6, class T7>                      struct B<A<T6, T7>>; // #2

template struct B<A<int>>;
```
Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This
patch restores the pre-P0522 behavior, `#2` is picked again.

This has the beneficial side effect of making the following code valid:
```C++
template<class T, class U> struct A {};
A<int, float> v;
template<template<class> class TT> void f(TT<int>);

// OK: TT picks 'float' as the default argument for the second parameter.
void g() { f(v); }
```

---

Since this changes provisional implementation of CWG2398 which has not
been released yet, and already contains a changelog entry, we don't
provide a changelog entry here.
shiltian pushed a commit that referenced this pull request May 24, 2024
The problematic program is as follows:

```shell
#define pre_a 0
#define PRE(x) pre_##x

void f(void) {
    PRE(a) && 0;
}

int main(void) { return 0; }
```

in which after token concatenation (`##`), there's another nested macro
`pre_a`.

Currently only the outer expansion region will be produced. ([compiler
explorer
link](https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:___c,selection:(endColumn:29,endLineNumber:8,positionColumn:29,positionLineNumber:8,selectionStartColumn:29,selectionStartLineNumber:8,startColumn:29,startLineNumber:8),source:'%23define+pre_a+0%0A%23define+PRE(x)+pre_%23%23x%0A%0Avoid+f(void)+%7B%0A++++PRE(a)+%26%26+0%3B%0A%7D%0A%0Aint+main(void)+%7B+return+0%3B+%7D'),l:'5',n:'0',o:'C+source+%231',t:'0')),k:51.69491525423727,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((g:!((h:compiler,i:(compiler:cclang_assertions_trunk,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'0',intel:'0',libraryCode:'1',trim:'1',verboseDemangling:'0'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:2,lang:___c,libs:!(),options:'-fprofile-instr-generate+-fcoverage-mapping+-fcoverage-mcdc+-Xclang+-dump-coverage-mapping+',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:1),l:'5',n:'0',o:'+x86-64+clang+(assertions+trunk)+(Editor+%231)',t:'0')),k:34.5741843594503,l:'4',m:28.903654485049834,n:'0',o:'',s:0,t:'0'),(g:!((h:output,i:(compilerName:'x86-64+clang+(trunk)',editorid:1,fontScale:14,fontUsePx:'0',j:2,wrap:'1'),l:'5',n:'0',o:'Output+of+x86-64+clang+(assertions+trunk)+(Compiler+%232)',t:'0')),header:(),l:'4',m:71.09634551495017,n:'0',o:'',s:0,t:'0')),k:48.30508474576271,l:'3',n:'0',o:'',t:'0')),l:'2',m:100,n:'0',o:'',t:'0')),version:4))

```text
f:
  File 0, 4:14 -> 6:2 = #0
  Decision,File 0, 5:5 -> 5:16 = M:0, C:2
  Expansion,File 0, 5:5 -> 5:8 = #0 (Expanded file = 1)
  File 0, 5:15 -> 5:16 = #1
  Branch,File 0, 5:15 -> 5:16 = 0, 0 [2,0,0] 
  File 1, 2:16 -> 2:23 = #0
  File 2, 1:15 -> 1:16 = #0
  File 2, 1:15 -> 1:16 = #0
  Branch,File 2, 1:15 -> 1:16 = 0, 0 [1,2,0] 
```

The inner expansion region isn't produced because:

1. In the range-based for loop quoted below, each sloc is processed and
possibly emit a corresponding expansion region.
2. For our sloc in question, its direct parent returned by
`getIncludeOrExpansionLoc()` is a `<scratch space>`, because that's how
`##` is processed.


https://github.com/llvm/llvm-project/blob/88b6186af3908c55b357858eb348b5143f21c289/clang/lib/CodeGen/CoverageMappingGen.cpp#L518-L520

3. This `<scratch space>` cannot be found in the FileID mapping so
`ParentFileID` will be assigned an `std::nullopt`


https://github.com/llvm/llvm-project/blob/88b6186af3908c55b357858eb348b5143f21c289/clang/lib/CodeGen/CoverageMappingGen.cpp#L521-L526

4. As a result this iteration of for loop finishes early and no
expansion region is added for the sloc.

This problem gets worse with MC/DC: as the example shows, there's a
branch from File 2 but File 2 itself is missing. This will trigger
assertion failures.

The fix is more or less a workaround and takes a similar approach as
llvm#89573.

~~Depends on llvm#89573.~~ This includes llvm#89573. Kudos to @chapuni!
This and llvm#89573 together fix llvm#87000: I tested locally, both the reduced
program and my original use case (fwiw, Linux kernel) can run
successfully.

---------

Co-authored-by: NAKAMURA Takumi <[email protected]>
shiltian pushed a commit that referenced this pull request Jun 12, 2024
…des (llvm#94453)

LSR will generate chains of related instructions with a known increment
between them. With SVE, in the case of the test case, this can include
increments like 'vscale * 16 + 8'. The idea of this patch is if we have
a '+8' increment already calculated in the chain, we can generate a
(legal) '+ vscale*16' addressing mode from it, allowing us to use the
'[x16, #1, mul vl]' addressing mode instructions.

In order to do this we keep track of the known 'bases' when generating
chains in GenerateIVChain, checking for each if the accumulated
increment expression from the base neatly folds into a legal addressing
mode. If they do not we fall back to the existing LeftOverExpr, whether
it is legal or not.

This is mostly orthogonal to llvm#88124, dealing with the generation of
chains as opposed to rest of LSR. The existing vscale addressing mode
work has greatly helped compared to the last time I looked at this,
allowing us to check that the addressing modes are indeed legal.
shiltian pushed a commit that referenced this pull request Jun 19, 2024
…on (llvm#94752)

Fixes llvm#62925.

The following code:
```cpp
#include <map>

int main() {
   std::map m1 = {std::pair{"foo", 2}, {"bar", 3}}; // guide #2
   std::map m2(m1.begin(), m1.end()); // guide #1
}
```
Is rejected by clang, but accepted by both gcc and msvc:
https://godbolt.org/z/6v4fvabb5 .

So basically CTAD with copy-list-initialization is rejected.

Note that this exact code is also used in a cppreference article:
https://en.cppreference.com/w/cpp/container/map/deduction_guides

I checked the C++11 and C++20 standard drafts to see whether suppressing
user conversion is the correct thing to do for user conversions. Based
on the standard I don't think that it is correct.

```
13.3.1.4 Copy-initialization of class by user-defined conversion [over.match.copy]
Under the conditions specified in 8.5, as part of a copy-initialization of an object of class type, a user-defined
conversion can be invoked to convert an initializer expression to the type of the object being initialized.
Overload resolution is used to select the user-defined conversion to be invoked
```
So we could use user defined conversions according to the standard.

```
If a narrowing conversion is required to initialize any of the elements, the
program is ill-formed.
```
We should not do narrowing.

```
In copy-list-initialization, if an explicit constructor is chosen, the initialization is ill-formed.
```
We should not use explicit constructors.
shiltian pushed a commit that referenced this pull request Jun 19, 2024
`rethrow` instruction is a terminator, but when when its DAG is built in
`SelectionDAGBuilder` in a custom routine, it was NOT treated as such.

```ll
rethrow:                                          ; preds = %catch.start
  invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
          to label %unreachable unwind label %ehcleanup

ehcleanup:                                        ; preds = %rethrow, %catch.dispatch
  %tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ]
  ...
```

In this bitcode, because of the `phi`, a `CONST_I32` will be created in
the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks
like this:
```
  t0: ch,glue = EntryToken
      t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
      t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161>
    t6: ch = TokenFactor t3, t5
  t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50>
```
Note that `CopyToReg` and `llvm.wasm.rethrow` don't have dependence so
either can come first in the selected code, which can result in the code
like
```mir
bb.3.rethrow:
  RETHROW 0, implicit-def dead $arguments
  %9:i32 = CONST_I32 20, implicit-def dead $arguments
  BR %bb.6, implicit-def dead $arguments
```

After this patch, `llvm.wasm.rethrow` is treated as a terminator, and
the DAG will look like
```
        t0: ch,glue = EntryToken
      t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
    t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161>
  t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70>
```
Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has
to come after `CopyToReg`. And the resulting code will be
```mir
bb.3.rethrow:
  %9:i32 = CONST_I32 20, implicit-def dead $arguments
  RETHROW 0, implicit-def dead $arguments
  BR %bb.6, implicit-def dead $arguments
```

I'm not very familiar with the internals of `getRoot` vs.
`getControlRoot`, but other terminator instructions seem to use the
latter, and using it for `rethrow` too worked.
shiltian pushed a commit that referenced this pull request Aug 7, 2024
```
  UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the
test isn't Linux-specific at all). With
`UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a
duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
    #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.

This turns out to be another fallout from fixing
`__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In
`sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the
`pc` arg is the return address, while `pc1` from the stack frame
(`fr_savpc`) is the address of the `call` insn, leading to a double
entry for the innermost frame in `trace_buffer[]`.

This patch fixes this by moving the adjustment before all uses.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11`
(with the `ubsan/TestCases/Misc/Linux` tests enabled).
shiltian pushed a commit that referenced this pull request Aug 26, 2024
…lvm#104148)

`hasOperands` does not always execute matchers in the order they are
written. This can cause issue in code using bindings when one operand
matcher is relying on a binding set by the other. With this change, the
first matcher present in the code is always executed first and any
binding it sets are available to the second matcher.

Simple example with current version (1 match) and new version (2
matches):
```bash
> cat tmp.cpp
int a = 13;
int b = ((int) a) - a;
int c = a - ((int) a);

> clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
int a = 13;
^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
int b = ((int)a) - a;
        ^~~~~~~~~~~~
1 match.

> ./build/bin/clang-query tmp.cpp
clang-query> set traversal IgnoreUnlessSpelledInSource
clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d"))))))

Match #1:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:2:9: note: "root" binds here
    2 | int b = ((int)a) - a;
      |         ^~~~~~~~~~~~

Match #2:

tmp.cpp:1:1: note: "d" binds here
    1 | int a = 13;
      | ^~~~~~~~~~
tmp.cpp:3:9: note: "root" binds here
    3 | int c = a - ((int)a);
      |         ^~~~~~~~~~~~
2 matches.
```

If this should be documented or regression tested anywhere please let me
know where.
shiltian pushed a commit that referenced this pull request Aug 26, 2024
…104523)

Compilers and language runtimes often use helper functions that are
fundamentally uninteresting when debugging anything but the
compiler/runtime itself. This patch introduces a user-extensible
mechanism that allows for these frames to be hidden from backtraces and
automatically skipped over when navigating the stack with `up` and
`down`.

This does not affect the numbering of frames, so `f <N>` will still
provide access to the hidden frames. The `bt` output will also print a
hint that frames have been hidden.

My primary motivation for this feature is to hide thunks in the Swift
programming language, but I'm including an example recognizer for
`std::function::operator()` that I wished for myself many times while
debugging LLDB.

rdar://126629381


Example output. (Yes, my proof-of-concept recognizer could hide even
more frames if we had a method that returned the function name without
the return type or I used something that isn't based off regex, but it's
really only meant as an example).

before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
(lldb) 
```

after

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
nicolasmarie1 pushed a commit to nicolasmarie1/llvm-project that referenced this pull request Aug 30, 2024
`JITDylibSearchOrderResolver` local variable can be destroyed before
completion of all callbacks. Capture it together with `Deps` in
`OnEmitted` callback.

Original error:

```
==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978
READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm)
    #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58
    shiltian#1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25
    shiltian#2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5
    shiltian#3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12
    shiltian#4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr,
```
shiltian pushed a commit that referenced this pull request Sep 12, 2024
…llvm#94981)

This extends default argument deduction to cover class templates as
well, applying only to partial ordering, adding to the provisional
wording introduced in llvm#89807.

This solves some ambuguity introduced in P0522 regarding how template
template parameters are partially ordered, and should reduce the
negative impact of enabling `-frelaxed-template-template-args` by
default.

Given the following example:
```C++
template <class T1, class T2 = float> struct A;
template <class T3> struct B;

template <template <class T4> class TT1, class T5> struct B<TT1<T5>>;   // #1
template <class T6, class T7>                      struct B<A<T6, T7>>; // #2

template struct B<A<int>>;
```
Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This
patch restores the pre-P0522 behavior, `#2` is picked again.
shiltian pushed a commit that referenced this pull request Jan 22, 2025
This will be sent by Arm's Guarded Control Stack extension when an
invalid return is executed.

The signal does have an address we could show, but it's the PC at which
the fault occured. The debugger has plenty of ways to show you that
already, so I've left it out.

```
(lldb) c
Process 460 resuming
Process 460 stopped
* thread #1, name = 'test', stop reason = signal SIGSEGV: control protection fault
    frame #0: 0x0000000000400784 test`main at main.c:57:1
   54  	  afunc();
   55  	  printf("return from main\n");
   56  	  return 0;
-> 57  	}
(lldb) dis
<...>
->  0x400784 <+100>: ret
```

The new test case generates the signal by corrupting the link register
then attempting to return. This will work whether we manually enable GCS
or the C library does it for us.

(in the former case you could just return from main and it would fault)
shiltian pushed a commit that referenced this pull request Jan 22, 2025
llvm#123877)

Reverts llvm#122811 due to buildbot breakage e.g.,
https://lab.llvm.org/buildbot/#/builders/52/builds/5421/steps/11/logs/stdio

ASan output from local re-run:
```
==2780289==ERROR: AddressSanitizer: use-after-poison on address 0x7e0b87e28d28 at pc 0x55a979a99e7e bp 0x7ffe4b18f0b0 sp 0x7ffe4b18f0a8
READ of size 1 at 0x7e0b87e28d28 thread T0
    #0 0x55a979a99e7d in getStorageClass /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:344
    #1 0x55a979a99e7d in isSectionDefinition /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:429:9
    #2 0x55a979a99e7d in getSymbols /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:54:42
    #3 0x55a979a99e7d in lld::coff::writeLLDMapFile(lld::coff::COFFLinkerContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:103:40
    #4 0x55a979a16879 in (anonymous namespace)::Writer::run() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:810:3
    #5 0x55a979a00aac in lld::coff::writeResult(lld::coff::COFFLinkerContext&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:354:15
    #6 0x55a97985f7ed in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:2826:3
    #7 0x55a97984cdd3 in lld::coff::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:97:15
    #8 0x55a9797f9793 in lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:163:12
    llvm#9 0x55a9797fa3b6 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:188:15
    llvm#10 0x55a9797fa3b6 in void llvm::function_ref<void ()>::callback_fn<lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>)::$_0>(long) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
    llvm#11 0x55a97966cb93 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:12
    llvm#12 0x55a97966cb93 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
    llvm#13 0x55a9797f9dc3 in lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:187:14
    llvm#14 0x55a979627512 in lld_main(int, char**, llvm::ToolContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/tools/lld/lld.cpp:103:14
    llvm#15 0x55a979628731 in main /usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/tools/lld/tools/lld/lld-driver.cpp:17:10
    llvm#16 0x7ffb8b202c89 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    llvm#17 0x7ffb8b202d44 in __libc_start_main csu/../csu/libc-start.c:360:3
    llvm#18 0x55a97953ef60 in _start (/usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/bin/lld+0x8fd1f60)
```
shiltian pushed a commit that referenced this pull request May 15, 2025
The mcmodel=tiny memory model is only valid on ARM targets. While trying
this on X86 compiler throws an internal error along with stack dump.
llvm#125641
This patch resolves the issue.
Reduced test case:
```
#include <stdio.h>
int main( void )
{
printf( "Hello, World!\n" ); 
return 0; 
}
```
```
0.	Program arguments: /opt/compiler-explorer/clang-trunk/bin/clang++ -gdwarf-4 -g -o /app/output.s -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics -mcmodel=tiny <source>
1.	<eof> parser at end of file
 #0 0x0000000003b10218 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b10218)
 #1 0x0000000003b0e35c llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b0e35c)
 #2 0x0000000003a5dbc3 llvm::CrashRecoveryContext::HandleExit(int) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dbc3)
 #3 0x0000000003b05cfe llvm::sys::Process::Exit(int, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3b05cfe)
 #4 0x0000000000d4e3eb LLVMErrorHandler(void*, char const*, bool) cc1_main.cpp:0:0
 #5 0x0000000003a67c93 llvm::report_fatal_error(llvm::Twine const&, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67c93)
 #6 0x0000000003a67df8 (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a67df8)
 #7 0x0000000002549148 llvm::X86TargetMachine::X86TargetMachine(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x2549148)
 #8 0x00000000025491fc llvm::RegisterTargetMachine<llvm::X86TargetMachine>::Allocator(llvm::Target const&, llvm::Triple const&, llvm::StringRef, llvm::StringRef, llvm::TargetOptions const&, std::optional<llvm::Reloc::Model>, std::optional<llvm::CodeModel::Model>, llvm::CodeGenOptLevel, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x25491fc)
 llvm#9 0x0000000003db74cc clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3db74cc)
llvm#10 0x0000000004460d95 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4460d95)
llvm#11 0x00000000060005ec clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x60005ec)
llvm#12 0x00000000044614b5 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44614b5)
llvm#13 0x0000000004737121 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4737121)
llvm#14 0x00000000046b777b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x46b777b)
llvm#15 0x00000000048229e3 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x48229e3)
llvm#16 0x0000000000d50621 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd50621)
llvm#17 0x0000000000d48e2d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
llvm#18 0x00000000044acc99 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0
llvm#19 0x0000000003a5dac3 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x3a5dac3)
llvm#20 0x00000000044aceb9 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (.part.0) Job.cpp:0:0
llvm#21 0x00000000044710dd clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x44710dd)
llvm#22 0x0000000004472071 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x4472071)
llvm#23 0x000000000447c3fc clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0x447c3fc)
llvm#24 0x0000000000d4d2b1 clang_main(int, char**, llvm::ToolContext const&) (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd4d2b1)
llvm#25 0x0000000000c12464 main (/opt/compiler-explorer/clang-trunk/bin/clang+++0xc12464)
llvm#26 0x00007ae43b029d90 (/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
llvm#27 0x00007ae43b029e40 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e40)
llvm#28 0x0000000000d488c5 _start (/opt/compiler-explorer/clang-trunk/bin/clang+++0xd488c5)
```

---------

Co-authored-by: Shashwathi N <[email protected]>
shiltian pushed a commit that referenced this pull request May 15, 2025
… `getForwardSlice` matchers (llvm#115670)

Improve mlir-query tool by implementing `getBackwardSlice` and
`getForwardSlice` matchers. As an addition `SetQuery` also needed to be
added to enable custom configuration for each query. e.g: `inclusive`,
`omitUsesFromAbove`, `omitBlockArguments`.

Note: backwardSlice and forwardSlice algoritms are the same as the ones
in `mlir/lib/Analysis/SliceAnalysis.cpp`
Example of current matcher. The query was made to the file:
`mlir/test/mlir-query/complex-test.mlir`

```mlir
./mlir-query /home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir -c "match getDefinitions(hasOpName(\"arith.add
f\"),2)"

Match #1:

/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:5:8:
  %0 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<5x5xf32>) outs(%arg1 : tensor<5x5xf32>) {
       ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:7:10: note: "root" binds here
    %2 = arith.addf %in, %in : f32
         ^
Match #2:

/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:10:16:
  %collapsed = tensor.collapse_shape %0 [[0, 1]] : tensor<5x5xf32> into tensor<25xf32>
               ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:13:11:
    %c2 = arith.constant 2 : index
          ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:14:18:
    %extracted = tensor.extract %collapsed[%c2] : tensor<25xf32>
                 ^
/home/dbudii/personal/llvm-project/mlir/test/mlir-query/complex-test.mlir:15:10: note: "root" binds here
    %2 = arith.addf %extracted, %extracted : f32
         ^
2 matches.
```
shiltian pushed a commit that referenced this pull request Oct 23, 2025
**Mitigation for:** google/sanitizers#749

**Disclosure:** I'm not an ASan compiler expert yet (I'm trying to
learn!), I primarily work in the runtime. Some of this PR was developed
with the help of AI tools (primarily as a "fuzzy `grep` engine"), but
I've manually refined and tested the output, and can speak for every
line. In general, I used it only to orient myself and for
"rubberducking".

**Context:**

The msvc ASan team (👋 ) has received an internal request to improve
clang's exception handling under ASan for Windows. Namely, we're
interested in **mitigating** this bug:
google/sanitizers#749

To summarize, today, clang + ASan produces a false-positive error for
this program:

```C++
#include <cstdio>
#include <exception>
int main()
{
	try	{
		throw std::exception("test");
	}catch (const std::exception& ex){
		puts(ex.what());
	}
	return 0;
}
```

The error reads as such:


```
C:\Users\dajusto\source\repros\upstream>type main.cpp
#include <cstdio>
#include <exception>
int main()
{
        try     {
                throw std::exception("test");
        }catch (const std::exception& ex){
                puts(ex.what());
        }
        return 0;
}
C:\Users\dajusto\source\repros\upstream>"C:\Users\dajusto\source\repos\llvm-project\build.runtimes\bin\clang.exe" -fsanitize=address -g -O0 main.cpp

C:\Users\dajusto\source\repros\upstream>a.exe
=================================================================
==19112==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x7ff72c7c11d9 bp 0x0080000ff960 sp 0x0080000fcf50 T0)
==19112==The signal is caused by a READ memory access.
==19112==Hint: address points to the zero page.
    #0 0x7ff72c7c11d8 in main C:\Users\dajusto\source\repros\upstream\main.cpp:8
    #1 0x7ff72c7d479f in _CallSettingFrame C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm:49
    #2 0x7ff72c7c8944 in __FrameHandler3::CxxCallCatchBlock(struct _EXCEPTION_RECORD *) C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\frame.cpp:1567
    #3 0x7ffb4a90e3e5  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18012e3e5)
    #4 0x7ff72c7c1128 in main C:\Users\dajusto\source\repros\upstream\main.cpp:6
    #5 0x7ff72c7c33db in invoke_main C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
    #6 0x7ff72c7c33db in __scrt_common_main_seh C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #7 0x7ffb49b05c06  (C:\WINDOWS\System32\KERNEL32.DLL+0x180035c06)
    #8 0x7ffb4a8455ef  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800655ef)

==19112==Register values:
rax = 0  rbx = 80000ff8e0  rcx = 27d76d00000  rdx = 80000ff8e0
rdi = 80000fdd50  rsi = 80000ff6a0  rbp = 80000ff960  rsp = 80000fcf50
r8  = 100  r9  = 19930520  r10 = 8000503a90  r11 = 80000fd540
r12 = 80000fd020  r13 = 0  r14 = 80000fdeb8  r15 = 0
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: access-violation C:\Users\dajusto\source\repros\upstream\main.cpp:8 in main
==19112==ABORTING
```

The root of the issue _appears to be_ that ASan's instrumentation is
incompatible with Window's assumptions for instantiating `catch`-block's
parameters (`ex` in the snippet above).

The nitty gritty details are lost on me, but I understand that to make
this work without loss of ASan coverage, a "serious" refactoring is
needed. In the meantime, users risk false positive errors when pairing
ASan + catch-block parameters on Windows.

**To mitigate this** I think we should avoid instrumenting catch-block
parameters on Windows. It appears to me this is as "simple" as marking
catch block parameters as "uninteresting" in
`AddressSanitizer::isInterestingAlloca`. My manual tests seem to confirm
this.

I believe this is strictly better than today's status quo, where the
runtime generates false positives. Although we're now explicitly
choosing to instrument less, the benefit is that now more programs can
run with ASan without _funky_ macros that disable ASan on exception
blocks.

**This PR:** implements the mitigation above, and creates a simple new
test for it.

_Thanks!_

---------

Co-authored-by: Antonio Frighetto <[email protected]>
shiltian pushed a commit that referenced this pull request Oct 23, 2025
…nteger registers (llvm#163646)

Fix the `RegisterValue::SetValueFromData` method so that it works also
for 128-bit registers that contain integers.

Without this change, the `RegisterValue::SetValueFromData` method does
not work correctly
for 128-bit registers that contain (signed or unsigned) integers.

---

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to a 128-bit registers
`xmm0`. E.g.:
```
#include <stdint.h>

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}
```

(2)

Compile this program with LLVM compiler:
```
$ $YOUR/clang -g -o main main.c
```

(3)

Modify LLDB so that when it will be reading value from the `xmm0`
register, instead of assuming that it is vector register, it will treat
it as if it contain an integer. This can be achieved e.g. this way:
```
diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&src_offset, src_len));
     else if (reg_info.byte_size <= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size <= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)
```

(4)

Rebuild the LLDB.

(5)

Observe what happens how LLDB will print the content of this register
after it was initialized with 128-bit value.
```
$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1   	#include <stdint.h>
   2   	
   3   	int main() {
   4   	  __asm__ volatile (
   5   	      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6   	      "pinsrq $1, %[hi], %%xmm0"    // insert high 64 bits
   7   	      :
   8   	      : [lo]"r"(0x7766554433221100),
   9   	        [hi]"r"(0xffeeddccbbaa9988)
   10  	  );
-> 11  	  return 0;
   12  	}
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988
```

You can see that the upper and lower 64-bit wide halves are swapped.

---------

Co-authored-by: Matej Košík <[email protected]>
shiltian pushed a commit that referenced this pull request Oct 23, 2025
…lvm#162993)

Early if conversion can create instruction sequences such as
```
mov  x1, #1
csel x0, x1, x2, eq
```
which could be simplified into the following instead
```
csinc x0, x2, xzr, ne
```

One notable example that generates code like this is `cmpxchg weak`.

This is fixed by handling an immediate value of 1 as `add(wzr, 1)` so
that the addition can be folded into CSEL by using CSINC instead.
shiltian pushed a commit that referenced this pull request Nov 11, 2025
In `Driver.cpp` `std::atomic<uint64_t>` is used which may need
libatomic.

Build failure (if that is of interest):
```
[127/135] Linking CXX shared library lib/liblldMachO.so.20.1
ninja: job failed: : && /usr/lib/ccache/bin/clang++-20 -fPIC -Os -fstack-clash-protection -Wformat -Werror=format-security -D_GLIBCXX_ASSERTIONS=1 -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS=1 -D_LIBCPP_ENABLE_HARDENED_MODE=1 -g -O2 -DNDEBUG -g1 -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections  -Wl,--as-needed,-O1,--sort-common -Wl,-z,defs -Wl,-z,nodelete   -Wl,-rpath-link,/home/user/aports/main/lld20/src/lld-20.1.5.src/build/./lib  -Wl,--gc-sections -shared -Wl,-soname,liblldMachO.so.20.1 -o lib/liblldMachO.so.20.1 MachO/CMakeFiles/lldMachO.dir/Arch/ARM64.cpp.o MachO/CMakeFiles/lldMachO.dir/Arch/ARM64Common.cpp.o MachO/CMakeFiles/lldMachO.dir/Arch/ARM64_32.cpp.o MachO/CMakeFiles/lldMachO.dir/Arch/X86_64.cpp.o MachO/CMakeFiles/lldMachO.dir/ConcatOutputSection.cpp.o MachO/CMakeFiles/lldMachO.dir/Driver.cpp.o MachO/CMakeFiles/lldMachO.dir/DriverUtils.cpp.o MachO/CMakeFiles/lldMachO.dir/Dwarf.cpp.o MachO/CMakeFiles/lldMachO.dir/EhFrame.cpp.o MachO/CMakeFiles/lldMachO.dir/ExportTrie.cpp.o MachO/CMakeFiles/lldMachO.dir/ICF.cpp.o MachO/CMakeFiles/lldMachO.dir/InputFiles.cpp.o MachO/CMakeFiles/lldMachO.dir/InputSection.cpp.o MachO/CMakeFiles/lldMachO.dir/LTO.cpp.o MachO/CMakeFiles/lldMachO.dir/MapFile.cpp.o MachO/CMakeFiles/lldMachO.dir/MarkLive.cpp.o MachO/CMakeFiles/lldMachO.dir/ObjC.cpp.o MachO/CMakeFiles/lldMachO.dir/OutputSection.cpp.o MachO/CMakeFiles/lldMachO.dir/OutputSegment.cpp.o MachO/CMakeFiles/lldMachO.dir/Relocations.cpp.o MachO/CMakeFiles/lldMachO.dir/BPSectionOrderer.cpp.o MachO/CMakeFiles/lldMachO.dir/SectionPriorities.cpp.o MachO/CMakeFiles/lldMachO.dir/Sections.cpp.o MachO/CMakeFiles/lldMachO.dir/SymbolTable.cpp.o MachO/CMakeFiles/lldMachO.dir/Symbols.cpp.o MachO/CMakeFiles/lldMachO.dir/SyntheticSections.cpp.o MachO/CMakeFiles/lldMachO.dir/Target.cpp.o MachO/CMakeFiles/lldMachO.dir/UnwindInfoSection.cpp.o MachO/CMakeFiles/lldMachO.dir/Writer.cpp.o -L/usr/lib/llvm20/lib -Wl,-rpath,"\$ORIGIN/../lib:/usr/lib/llvm20/lib:/home/user/aports/main/lld20/src/lld-20.1.5.src/build/lib:"  lib/liblldCommon.so.20.1  /usr/lib/llvm20/lib/libLLVM.so.20.1 && :
/usr/lib/gcc/powerpc-alpine-linux-musl/14.3.0/../../../../powerpc-alpine-linux-musl/bin/ld: MachO/CMakeFiles/lldMachO.dir/Driver.cpp.o: in function `handleExplicitExports()':
/usr/lib/gcc/powerpc-alpine-linux-musl/14.3.0/../../../../include/c++/14.3.0/bits/atomic_base.h:501:(.text._ZL21handleExplicitExportsv+0xb8): undefined reference to `__atomic_load_8'
/usr/lib/gcc/powerpc-alpine-linux-musl/14.3.0/../../../../powerpc-alpine-linux-musl/bin/ld: /usr/lib/gcc/powerpc-alpine-linux-musl/14.3.0/../../../../include/c++/14.3.0/bits/atomic_base.h:501:(.text._ZL21handleExplicitExportsv+0x180): undefined reference to `__atomic_load_8'
/usr/lib/gcc/powerpc-alpine-linux-musl/14.3.0/../../../../powerpc-alpine-linux-musl/bin/ld: MachO/CMakeFiles/lldMachO.dir/Driver.cpp.o: in function `void llvm::function_ref<void (unsigned int)>::callback_fn<llvm::parallelForEach<lld::macho::Symbol* const*, handleExplicitExports()::$_0>(lld::macho::Symbol* const*, lld::macho::Symbol* const*, handleExplicitExports()::$_0)::{lambda(unsigned int)#1}>(int, unsigned int)':
/usr/lib/gcc/powerpc-alpine-linux-musl/14.3.0/../../../../include/c++/14.3.0/bits/atomic_base.h:631:(.text._ZN4llvm12function_refIFvjEE11callback_fnIZNS_15parallelForEachIPKPN3lld5macho6SymbolEZL21handleExplicitExportsvE3$_0EEvT_SC_T0_EUljE_EEvij+0xd4): undefined reference to `__atomic_fetch_add_8'
clang++-20: error: linker command failed with exit code 1 (use -v to see invocation)
```

CC @int3 @gkmhub @smeenai

Similar to
llvm@f0b451c
shiltian pushed a commit that referenced this pull request Nov 11, 2025
llvm#164955 has a use-after-scope
(https://lab.llvm.org/buildbot/#/builders/169/builds/16454):

```
==mlir-opt==3940651==ERROR: AddressSanitizer: stack-use-after-scope on address 0x6e1f6ba5c878 at pc 0x6336b214912a bp 0x7ffe607f1670 sp 0x7ffe607f1668
READ of size 4 at 0x6e1f6ba5c878 thread T0
    #0 0x6336b2149129 in size /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:80:32
    #1 0x6336b2149129 in operator[] /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:299:5
    #2 0x6336b2149129 in populateBoundsForShapedValueDim /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/mlir/lib/Dialect/MemRef/IR/ValueBoundsOpInterfaceImpl.cpp:113:43
...
```

This patch attempts to fix-forward by stack-allocating reassocIndices,
instead of taking a reference to a return value.
shiltian pushed a commit that referenced this pull request Nov 11, 2025
## Summary

Fix `FindProcesses` to respect Android's `hidepid=2` security model and
enable name matching for Android apps.

## Problem

1. Called `adb shell pidof` or `adb shell ps` directly, bypassing
Android's process visibility restrictions
2. Name matching failed for Android apps - searched for
`com.example.myapp` but GDB Remote Protocol reports `app_process64`

Android apps fork from Zygote, so `/proc/PID/exe` points to
`app_process64` for all apps. The actual package name is only in
`/proc/PID/cmdline`. The previous implementation applied name filters
without supplementing with cmdline, so searches failed.

## Fix

- Delegate to lldb-server via GDB Remote Protocol (respects `hidepid=2`)
- Get all visible processes, supplement zygote/app_process entries with
cmdline, then apply name matching
- Only fetch cmdline for zygote apps (performance), parallelize with
`xargs -P 8`
- Remove redundant code (GDB Remote Protocol already provides GID/arch)

## Test Results

### Before this fix:

```
(lldb) platform process list
error: no processes were found on the "remote-android" platform

(lldb) platform process list -n com.example.hellojni
1 matching process was found on "remote-android"
PID    PARENT USER       TRIPLE                         NAME
====== ====== ========== ============================== ============================
5276   359    u0_a192                                   com.example.hellojni
                         ^^^^^^^^ Missing triple!
```

### After this fix:

```
(lldb) platform process list
PID    PARENT USER       TRIPLE                         NAME
====== ====== ========== ============================== ============================
1      0      root       aarch64-unknown-linux-android  init
2      0      root                                      [kthreadd]
359    1      system     aarch64-unknown-linux-android  app_process64
5276   359    u0_a192    aarch64-unknown-linux-android  com.example.hellojni
5357   5355   u0_a192    aarch64-unknown-linux-android  sh
5377   5370   u0_a192    aarch64-unknown-linux-android  lldb-server
                          ^^^^^^^^ User-space processes now have triples!

(lldb) platform process list -n com.example.hellojni
1 matching process was found on "remote-android"
PID    PARENT USER       TRIPLE                         NAME
====== ====== ========== ============================== ============================
5276   359    u0_a192    aarch64-unknown-linux-android  com.example.hellojni


(lldb) process attach -n com.example.hellojni
Process 5276 stopped
* thread #1, name = 'example.hellojni', stop reason = signal SIGSTOP
```

## Test Plan

With an Android device/emulator connected:

1. Start lldb-server on device:
```bash
adb push lldb-server /data/local/tmp/
adb shell chmod +x /data/local/tmp/lldb-server
adb shell /data/local/tmp/lldb-server platform  --listen 127.0.0.1:9500 --server
```

2. Connect from LLDB:
```
(lldb) platform select remote-android
(lldb) platform connect connect://127.0.0.1:9500
(lldb) platform process list
```

3. Verify:
- `platform process list` returns all processes with triple information
- `platform process list -n com.example.app` finds Android apps by
package name
- `process attach -n com.example.app` successfully attaches to Android
apps

## Impact

Restores `platform process list` on Android with architecture
information and package name lookup. All name matching modes now work
correctly.

Fixes llvm#164192
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants