[llvm-profgen] Add --sample-period to estimate absolute counts #1

tcreech-intel · 2024-07-16T22:08:54Z

Without --sample-period, no assumptions are made about perf profile sample frequencies. This is useful for comparing relative hotness of different program locations within the same profile.

With --sample-period, LBR- and IP-based profile hit counts are adjusted to estimate the absolute total event count for each program location. This makes it reasonable to compare hit counts between different profiles, e.g., between LBR-based execution frequency profiles and IP-based branch mispredict profiles.
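As a rough illustration of the adjustment described above, the sketch below assumes the simplest model, in which every recorded hit stands for `samplePeriod` events. The names `HitCounts` and `estimateAbsoluteCounts` are hypothetical and not taken from llvm-profgen; the adjustment actually applied by the patch may differ, particularly for LBR samples.

```cpp
#include <cstdint>
#include <map>

// Hypothetical types and names for illustration only (not llvm-profgen code).
using HitCounts = std::map<uint64_t, uint64_t>; // address -> sampled hits

// Scale raw hit counts into estimated absolute event counts.
// A period of 0 stands in for "--sample-period not given": counts stay relative.
inline HitCounts estimateAbsoluteCounts(const HitCounts &hits,
                                        uint64_t samplePeriod) {
  if (samplePeriod == 0)
    return hits;
  HitCounts scaled;
  for (const auto &[addr, count] : hits)
    scaled[addr] = count * samplePeriod; // assumed model: one hit ~ period events
  return scaled;
}
```

Under this model, two profiles taken with different periods become comparable because both are expressed in estimated events rather than in raw samples.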

- Add implementation for x86_64 and linux
- Add test

The output is like:

==XXYYZZ==Register values: rax = 0x... rbx = 0x... rcx = 0x... rdx = 0x... rdi = 0x... rsi = 0x... rbp = 0x... rsp = 0x... r8 = 0x... r9 = 0x... r10 = 0x... r11 = 0x... r12 = 0x... r13 = 0x... r14 = 0x... r15 = 0x...

goo.gl is going away: https://developers.googleblog.com/en/google-url-shortener-links-will-no-longer-be-available/ Fix goo.gl link from: - http://goo.gl/QKbem + https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html and reflow the comment a bit to make it look a bit better after the URL change, although it's not perfect now. Committed as obvious. Bug: llvm#99586

…e() overload (llvm#98403) This PR adds `SBSaveCoreOptions`, which is a container class for options when LLDB is taking coredumps. For this first iteration this container just keeps parity with the extant API of `file, style, plugin`. In the future this options object can be extended to allow users to take a subset of their core dumps.

…ions' paper goo.gl is going away: https://developers.googleblog.com/en/google-url-shortener-links-will-no-longer-be-available/ Fix goo.gl link from: - https://goo.gl/4Rb9As + https://docs.google.com/document/d/1momWzKFf4D6h8H3YlfgKQ3qeZy5ayvMRh6yR-Xn2hUE Committed as obvious. Bug: llvm#99586

Previously these targets were disabled, but with a relatively new rules_python we can build these pointing at a hermetic python, which allows us to build these safely. Users can still access the files directly if they need to customize how these are built.

This patch adds a couple of unit tests:

- SetSubtractSmallPtrSet exercises the code path involving remove_if, added in d772cdd. Note that SmallPtrSet supports remove_if.
- SetSubtractSmallVector exercises the code path involving S1.erase(*SI) and ensures that set_subtract continues to accept S2 being a vector, which does not have contains.

The current git_repository usage points to tags, which leads to warnings that the build may not be reproducible because it does not use a git sha. The docs for [git_repository](https://bazel.build/rules/lib/repo/git#git_repository) recommend using `http_archive`, so switch to that instead. Also bump to newer versions for these two repos.

…96025) Based on the discussion at https://discourse.llvm.org/t/fp-can-we-add-pure-attribute-for-math-library-functions-default/79459, math libcalls set errno, so "int" TBAA metadata should be emitted on FP libcalls to solve the alias issue. Note: this PR only adds support for expf. Fixes llvm#86635

…action (llvm#73618)

Differential Revision: https://reviews.llvm.org/D150525

Implements:

- https://wg21.link/P1132R8 - `out_ptr` - a scalable output pointer abstraction
- https://eel.is/c++draft/smartptr.adapt - 20.3.4 Smart pointer adaptors
- https://wg21.link/LWG3734 - Inconsistency in `inout_ptr` and `out_ptr` for empty case
- https://wg21.link/LWG3897 - `inout_ptr` will not update raw pointer to 0

---------

Co-authored-by: Hristo Hristov <[email protected]>

Detect and support fixed PIC indirect jumps of the following form:
```
movslq En(%rip), %r1
leaq PIC_JUMP_TABLE(%rip), %r2
addq %r2, %r1
jmpq *%r1
```
with PIC_JUMP_TABLE that looks like the following:
```
JT:  ----------
 E1:| L1 - JT  |
    |----------|
 E2:| L2 - JT  |
    |----------|
    |          |
       ......
 En:| Ln - JT  |
     ----------
```
The code could be produced by compilers, see llvm#91648.

Test Plan: updated jump-table-fixed-ref-pic.test

Reviewers: maksfb, ayermolo, dcci, rafaelauler

Reviewed By: rafaelauler

Pull Request: llvm#91667
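To make the table layout concrete, here is a minimal sketch of how the branch target follows from one entry: each entry holds the signed 32-bit difference `Ln - JT`, so the target is the table address plus the sign-extended entry. The function name `resolvePicJumpTable` is hypothetical, and this is not BOLT's implementation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch (not BOLT code): recover the targets L1..Ln from a PIC
// jump table located at tableAddr whose entries store (Ln - JT).
inline std::vector<uint64_t>
resolvePicJumpTable(uint64_t tableAddr, const std::vector<int32_t> &entries) {
  std::vector<uint64_t> targets;
  targets.reserve(entries.size());
  for (int32_t entry : entries) {
    // movslq sign-extends the 32-bit entry; addq then adds the table base.
    const int64_t offset = entry;
    targets.push_back(tableAddr + static_cast<uint64_t>(offset));
  }
  return targets;
}
```

Negative entries are expected whenever a target label precedes the table in the address space, which is why the sign extension matters.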

Add a BinaryFunction field for pseudo probe function GUID. Populate it during pseudo probe section parsing, and emit it in YAML profile (both regular and BAT), along with function checksum. To be used for stale function matching. Test Plan: update pseudoprobe-decoding-inline.test

AddressProbesMap is keyed by binary addresses, and it makes sense to treat them as ordered. This also enables slicing by binary function/ binary basic block, to be used in BOLT (llvm#99554). Test Plan: NFC Reviewers: wlei-llvm Reviewed By: wlei-llvm Pull Request: llvm#99553
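The benefit of an ordered, address-keyed container can be sketched with `std::map`: all entries belonging to an address range, such as a function or basic block, form one contiguous sub-range found with two `lower_bound` lookups. `ProbeMap` and `sliceByRange` are placeholder names, and the string value type stands in for the real probe data.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Placeholder for an address-keyed probe map; not the real AddressProbesMap.
using ProbeMap = std::map<uint64_t, std::string>;
using ProbeSlice = std::vector<std::pair<uint64_t, std::string>>;

// Collect the probes whose address lies in the half-open range [begin, end).
inline ProbeSlice sliceByRange(const ProbeMap &probes, uint64_t begin,
                               uint64_t end) {
  if (begin >= end)
    return ProbeSlice();
  auto first = probes.lower_bound(begin); // first key >= begin
  auto last = probes.lower_bound(end);    // first key >= end
  return ProbeSlice(first, last);
}
```

With an unordered container the same query would need a scan over every entry.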

Read pseudo probes in regular and BAT YAML profile generation, and attach them to YAML profile basic blocks. This exposes GUID, probe id, and probe type in profile for future use in stale profile matching. Test Plan: updated pseudoprobe-decoding-inline.test Reviewers: dcci, rafaelauler, ayermolo, maksfb Reviewed By: rafaelauler Pull Request: llvm#99554

…lueType (llvm#99537) The original `CheckValueTypeMatcher` stores a StringRef as the member variable type. It is more efficient to use MVT::SimpleValueType, since this avoids a string comparison in isEqualImpl and also reduces the memory consumption of each object.

In bolt/lib/Passes/AsmDump.cpp, the MCInstPrinter is created with false AsmVerbose. The AsmVerbose argument to createAsmStreamer is unused. Deprecate the legacy Target::createAsmStreamer overload, which might be used by downstream.

…eamer Similar to commit 28fcafb (2011) for MachObjectWriter. MCWinCOFFStreamer can now access WinCOFFObjectWriter directly without holding object file format specific information in MCAssembler (e.g. IncrementalLinkerCompatible).

We need to distinguish ShapedTypes with and without value semantics. This is needed for downstream users to define their custom vector and tensor types that can work with the arith/math dialect. RFC https://discourse.llvm.org/t/rfc-mlir-types-with-encoding/80189

This patch adds a frame recognizer for Clang's `__builtin_verbose_trap`, which behaves like a `__builtin_trap`, but emits a failure-reason string into debug-info in order for debuggers to display it to a user.

The frame recognizer triggers when we encounter a frame with a function name that begins with `__clang_trap_msg`, which is the magic prefix Clang emits into debug-info for verbose traps. Once such a frame is encountered, we display the frame function name as the `Stop Reason` and display that frame to the user.

Example output:
```
(lldb) run
warning: a.out was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 35942 launched: 'a.out' (arm64)
Process 35942 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented
    frame #1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt]
   1    struct Dummy {
   2      void func() {
-> 3        __builtin_verbose_trap("Misc.", "Function is not implemented");
   4      }
   5    };
   6
   7    int main() {
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented
    frame #0: 0x0000000100003fa4 a.out`main [inlined] __clang_trap_msg$Misc.$Function is not implemented$ at verbose_trap.cpp:0 [opt]
  * frame #1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt]
    frame #2: 0x0000000100003fa4 a.out`main at verbose_trap.cpp:8:13 [opt]
    frame #3: 0x0000000189d518b4 dyld`start + 1988
```
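The mapping from the synthetic frame name to the displayed stop reason can be sketched as follows. The `$`-separated layout `__clang_trap_msg$<category>$<message>$` is inferred from the example output above, and `verboseTrapStopReason` is a hypothetical helper, not the recognizer added by the patch.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not LLDB code): turn a function name of the form
// "__clang_trap_msg$<category>$<message>$" into "<category>: <message>".
// Returns std::nullopt for names that do not match that assumed layout.
inline std::optional<std::string>
verboseTrapStopReason(std::string_view funcName) {
  constexpr std::string_view prefix = "__clang_trap_msg$";
  if (funcName.substr(0, prefix.size()) != prefix)
    return std::nullopt;
  funcName.remove_prefix(prefix.size());

  const auto sep = funcName.find('$'); // separator between category and message
  if (sep == std::string_view::npos)
    return std::nullopt;
  std::string_view category = funcName.substr(0, sep);
  std::string_view message = funcName.substr(sep + 1);
  if (!message.empty() && message.back() == '$')
    message.remove_suffix(1); // drop the trailing '$'

  return std::string(category) + ": " + std::string(message);
}
```

For the frame name in the example, this yields `Misc.: Function is not implemented`, matching the stop reason shown by `(lldb) run`.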

…linux (llvm#99613)

Examples of the output:

ARM:
```
# ./a.out
AddressSanitizer:DEADLYSIGNAL
=================================================================
==122==ERROR: AddressSanitizer: SEGV on unknown address 0x0000007a (pc 0x76e13ac0 bp 0x7eb7fd00 sp 0x7eb7fcc8 T0)
==122==The signal is caused by a READ memory access.
==122==Hint: address points to the zero page.
    #0 0x76e13ac0 (/lib/libc.so.6+0x7cac0)
    #1 0x76dce680 in gsignal (/lib/libc.so.6+0x37680)
    #2 0x005c2250 (/root/a.out+0x145250)
    #3 0x76db982c (/lib/libc.so.6+0x2282c)
    #4 0x76db9918 in __libc_start_main (/lib/libc.so.6+0x22918)

==122==Register values:
 r0 = 0x00000000   r1 = 0x0000007a   r2 = 0x0000000b   r3 = 0x76d95020
 r4 = 0x0000007a   r5 = 0x00000001   r6 = 0x005dcc5c   r7 = 0x0000010c
 r8 = 0x0000000b   r9 = 0x76f9ece0  r10 = 0x00000000  r11 = 0x7eb7fd00
r12 = 0x76dce670   sp = 0x7eb7fcc8   lr = 0x76e13ab4   pc = 0x76e13ac0
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/libc.so.6+0x7cac0)
==122==ABORTING
```

AArch64:
```
# ./a.out
UndefinedBehaviorSanitizer:DEADLYSIGNAL
==99==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000063 (pc 0x007fbbbc5860 bp 0x007fcfdcb700 sp 0x007fcfdcb700 T99)
==99==The signal is caused by a UNKNOWN memory access.
==99==Hint: address points to the zero page.
    #0 0x007fbbbc5860 (/lib64/libc.so.6+0x82860)
    #1 0x007fbbb81578 (/lib64/libc.so.6+0x3e578)
    #2 0x00556051152c (/root/a.out+0x3152c)
    #3 0x007fbbb6e268 (/lib64/libc.so.6+0x2b268)
    #4 0x007fbbb6e344 (/lib64/libc.so.6+0x2b344)
    #5 0x0055604e45ec (/root/a.out+0x45ec)

==99==Register values:
 x0 = 0x0000000000000000   x1 = 0x0000000000000063   x2 = 0x000000000000000b   x3 = 0x0000007fbbb41440
 x4 = 0x0000007fbbb41580   x5 = 0x3669288942d44cce   x6 = 0x0000000000000000   x7 = 0x00000055605110b0
 x8 = 0x0000000000000083   x9 = 0x0000000000000000  x10 = 0x0000000000000000  x11 = 0x0000000000000000
x12 = 0x0000007fbbdb3360  x13 = 0x0000000000010000  x14 = 0x0000000000000039  x15 = 0x00000000004113a0
x16 = 0x0000007fbbb81560  x17 = 0x0000005560540138  x18 = 0x000000006474e552  x19 = 0x0000000000000063
x20 = 0x0000000000000001  x21 = 0x000000000000000b  x22 = 0x0000005560511510  x23 = 0x0000007fcfdcb918
x24 = 0x0000007fbbdb1b50  x25 = 0x0000000000000000  x26 = 0x0000007fbbdb2000  x27 = 0x000000556053f858
x28 = 0x0000000000000000   fp = 0x0000007fcfdcb700   lr = 0x0000007fbbbc584c   sp = 0x0000007fcfdcb700
UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV (/lib64/libc.so.6+0x82860)
==99==ABORTING
```

```
UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp
```
`FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the test isn't Linux-specific at all). With `UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a duplicate innermost frame:
```
compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value
    #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35
    #2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38
```
which isn't seen with `fast_unwind_on_fatal=0`.

This turns out to be another fallout from fixing `__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In `sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the `pc` arg is the return address, while `pc1` from the stack frame (`fr_savpc`) is the address of the `call` insn, leading to a double entry for the innermost frame in `trace_buffer[]`.

This patch fixes this by moving the adjustment before all uses.

Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11` (with the `ubsan/TestCases/Misc/Linux` tests enabled).

Otherwise debug-info is stripped, which influences the language of the current frame.

Also, set explicit breakpoint because Windows seems to not obey the debugtrap.

Log from failing test on Windows:
```
(lldb) command source -s 0 'lit-lldb-init-quiet'
Executing commands in 'D:\test\lit-lldb-init-quiet'.
(lldb) command source -C --silent-run true lit-lldb-init
(lldb) target create "main.out"
Current executable set to 'D:\test\main.out' (x86_64).
(lldb) settings set interpreter.stop-command-source-on-error false
(lldb) command source -s 0 'with-target.input'
Executing commands in 'D:\test\with-target.input'.
(lldb) expr blah
            ^
error: use of undeclared identifier 'blah'
note: Falling back to default language. Ran expression as 'Objective C++'.
(lldb) run
Process 29404 launched: 'D:\test\main.out' (x86_64)
Process 29404 stopped
* thread #1, stop reason = Exception 0x80000003 encountered at address 0x7ff7b3df7189
    frame #0: 0x00007ff7b3df718a main.out
->  0x7ff7b3df718a: xorl %eax, %eax
    0x7ff7b3df718c: popq %rcx
    0x7ff7b3df718d: retq
    0x7ff7b3df718e: int3
(lldb) expr blah
            ^
error: use of undeclared identifier 'blah'
note: Falling back to default language. Ran expression as 'Objective C++'.
(lldb) expr -l objc -- blah
                       ^
error: use of undeclared identifier 'blah'
note: Expression evaluation in pure Objective-C not supported. Ran expression as 'Objective C++'.
(lldb) expr -l c -- blah
                    ^
error: use of undeclared identifier 'blah'
note: Expression evaluation in pure C not supported. Ran expression as 'ISO C++'.
```

The Tkinter module was renamed to tkinter in Python 3.0.
https://docs.python.org/2/library/tkinter.html
https://docs.python.org/3/library/tkinter.html

Rest of it appears to work when imported inside of LLDB:
```
$ ./bin/lldb /tmp/test.o
(lldb) target create "/tmp/test.o"
Current executable set to '/tmp/test.o' (x86_64).
(lldb) b main
Breakpoint 1: where = test.o`main + 8 at test.c:1:18, address = 0x0000000000001131
(lldb) run
Process 121572 launched: '/tmp/test.o' (x86_64)
Process 121572 stopped
* thread #1, name = 'test.o', stop reason = breakpoint 1.1
    frame #0: 0x0000555555555131 test.o`main at test.c:1:18
-> 1    int main() { int a = 1; char b = '?'; return 0; }
(lldb) command script import <...>/llvm-project/lldb/examples/python/lldbtk.py
(lldb) tk-
Available completions:
        tk-process   -- For more information run 'help tk-process'
        tk-target    -- For more information run 'help tk-target'
        tk-variables -- For more information run 'help tk-variables'
(lldb) tk-process
(lldb) tk-target
(lldb) tk-variables
```

…ypes (llvm#162278)

When we take the following C program:
```
int main() {
  return 0;
}
```
and create a statically-linked executable from it:
```
clang -static -g -o main main.c
```
then we can observe the following `lldb` behavior:
```
$ lldb
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --name main
Breakpoint 1: where = main`main + 11 at main.c:2:3, address = 0x000000000022aa7b
(lldb) process launch
Process 3773637 launched: '/home/me/tmp/built-in/main' (x86_64)
Process 3773637 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000000000022aa7b main`main at main.c:2:3
   1    int main() {
-> 2      return 0;
   3    }
(lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("__int128").size
0
(lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("unsigned __int128").size
0
(lldb) quit
```
The value returned by the `SBTarget::FindFirstType` method is wrong for the `__int128` and `unsigned __int128` basic types.

The proposed changes make the `TypeSystemClang::GetBasicTypeEnumeration` method consistent with the `gcc` and `clang` C [language extension](https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html) related to 128-bit integer types, as well as with the `BuiltinType::getName` method in the LLVM codebase itself.

When the above change is applied, the behavior of `lldb` changes in the following (desired) way:
```
$ lldb
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --name main
Breakpoint 1: where = main`main + 11 at main.c:2:3, address = 0x000000000022aa7b
(lldb) process launch
Process 3773637 launched: '/home/me/tmp/built-in/main' (x86_64)
Process 3773637 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000000000022aa7b main`main at main.c:2:3
   1    int main() {
-> 2      return 0;
   3    }
(lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("__int128").size
16
(lldb) script lldb.debugger.GetSelectedTarget().FindFirstType("unsigned __int128").size
16
(lldb) quit
```

---------

Co-authored-by: Matej Košík <[email protected]>

**Mitigation for:** google/sanitizers#749

**Disclosure:** I'm not an ASan compiler expert yet (I'm trying to learn!), I primarily work in the runtime. Some of this PR was developed with the help of AI tools (primarily as a "fuzzy `grep` engine"), but I've manually refined and tested the output, and can speak for every line. In general, I used it only to orient myself and for "rubberducking".

**Context:**

The msvc ASan team (👋 ) has received an internal request to improve clang's exception handling under ASan for Windows. Namely, we're interested in **mitigating** this bug: google/sanitizers#749

To summarize, today, clang + ASan produces a false-positive error for this program:

```C++
#include <cstdio>
#include <exception>
int main()
{
	try	{
		throw std::exception("test");
	}catch (const std::exception& ex){
		puts(ex.what());
	}
	return 0;
}
```

The error reads as such:

```
C:\Users\dajusto\source\repros\upstream>type main.cpp
#include <cstdio>
#include <exception>
int main()
{
        try     {
                throw std::exception("test");
        }catch (const std::exception& ex){
                puts(ex.what());
        }
        return 0;
}
C:\Users\dajusto\source\repros\upstream>"C:\Users\dajusto\source\repos\llvm-project\build.runtimes\bin\clang.exe" -fsanitize=address -g -O0 main.cpp

C:\Users\dajusto\source\repros\upstream>a.exe
=================================================================
==19112==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x7ff72c7c11d9 bp 0x0080000ff960 sp 0x0080000fcf50 T0)
==19112==The signal is caused by a READ memory access.
==19112==Hint: address points to the zero page.
    #0 0x7ff72c7c11d8 in main C:\Users\dajusto\source\repros\upstream\main.cpp:8
    #1 0x7ff72c7d479f in _CallSettingFrame C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm:49
    #2 0x7ff72c7c8944 in __FrameHandler3::CxxCallCatchBlock(struct _EXCEPTION_RECORD *) C:\repos\msvc\src\vctools\crt\vcruntime\src\eh\frame.cpp:1567
    #3 0x7ffb4a90e3e5 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18012e3e5)
    #4 0x7ff72c7c1128 in main C:\Users\dajusto\source\repros\upstream\main.cpp:6
    #5 0x7ff72c7c33db in invoke_main C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
    #6 0x7ff72c7c33db in __scrt_common_main_seh C:\repos\msvc\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #7 0x7ffb49b05c06 (C:\WINDOWS\System32\KERNEL32.DLL+0x180035c06)
    #8 0x7ffb4a8455ef (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800655ef)

==19112==Register values:
rax = 0  rbx = 80000ff8e0  rcx = 27d76d00000  rdx = 80000ff8e0
rdi = 80000fdd50  rsi = 80000ff6a0  rbp = 80000ff960  rsp = 80000fcf50
r8  = 100  r9  = 19930520  r10 = 8000503a90  r11 = 80000fd540
r12 = 80000fd020  r13 = 0  r14 = 80000fdeb8  r15 = 0
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: access-violation C:\Users\dajusto\source\repros\upstream\main.cpp:8 in main
==19112==ABORTING
```

The root of the issue _appears to be_ that ASan's instrumentation is incompatible with Windows' assumptions for instantiating `catch`-block parameters (`ex` in the snippet above).

The nitty gritty details are lost on me, but I understand that to make this work without loss of ASan coverage, a "serious" refactoring is needed. In the meantime, users risk false positive errors when pairing ASan + catch-block parameters on Windows.

**To mitigate this** I think we should avoid instrumenting catch-block parameters on Windows. It appears to me this is as "simple" as marking catch block parameters as "uninteresting" in `AddressSanitizer::isInterestingAlloca`. My manual tests seem to confirm this.

I believe this is strictly better than today's status quo, where the runtime generates false positives. Although we're now explicitly choosing to instrument less, the benefit is that now more programs can run with ASan without _funky_ macros that disable ASan on exception blocks.

**This PR:** implements the mitigation above, and creates a simple new test for it.

_Thanks!_

---------

Co-authored-by: Antonio Frighetto <[email protected]>

…nteger registers (llvm#163646)

Fix the `RegisterValue::SetValueFromData` method so that it also works for 128-bit registers that contain integers.

Without this change, the `RegisterValue::SetValueFromData` method does not work correctly for 128-bit registers that contain (signed or unsigned) integers.

---

Steps to reproduce the problem:

(1)

Create a program that writes a 128-bit number to the 128-bit register `xmm0`. E.g.:
```
#include <stdint.h>

int main() {
  __asm__ volatile (
      "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
      "pinsrq $1, %[hi], %%xmm0"      // insert high 64 bits
      :
      : [lo]"r"(0x7766554433221100),
        [hi]"r"(0xffeeddccbbaa9988)
  );
  return 0;
}
```

(2)

Compile this program with the LLVM compiler:
```
$ $YOUR/clang -g -o main main.c
```

(3)

Modify LLDB so that when it reads the value of the `xmm0` register, it treats the register as if it contains an integer instead of assuming that it is a vector register. This can be achieved e.g. this way:
```
diff --git a/lldb/source/Utility/RegisterValue.cpp b/lldb/source/Utility/RegisterValue.cpp
index 0e99451..a4b51db3e56d 100644
--- a/lldb/source/Utility/RegisterValue.cpp
+++ b/lldb/source/Utility/RegisterValue.cpp
@@ -188,6 +188,7 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     break;
   case eEncodingUint:
   case eEncodingSint:
+  case eEncodingVector:
     if (reg_info.byte_size == 1)
       SetUInt8(src.GetMaxU32(&src_offset, src_len));
     else if (reg_info.byte_size <= 2)
@@ -217,23 +218,6 @@ Status RegisterValue::SetValueFromData(const RegisterInfo &reg_info,
     else if (reg_info.byte_size == sizeof(long double))
       SetLongDouble(src.GetLongDouble(&src_offset));
     break;
-  case eEncodingVector: {
-    m_type = eTypeBytes;
-    assert(reg_info.byte_size <= kMaxRegisterByteSize);
-    buffer.bytes.resize(reg_info.byte_size);
-    buffer.byte_order = src.GetByteOrder();
-    if (src.CopyByteOrderedData(
-            src_offset,          // offset within "src" to start extracting data
-            src_len,             // src length
-            buffer.bytes.data(), // dst buffer
-            buffer.bytes.size(), // dst length
-            buffer.byte_order) == 0) // dst byte order
-    {
-      error = Status::FromErrorStringWithFormat(
-          "failed to copy data for register write of %s", reg_info.name);
-      return error;
-    }
-  }
   }
 
   if (m_type == eTypeInvalid)
```

(4)

Rebuild LLDB.

(5)

Observe how LLDB prints the content of this register after it was initialized with a 128-bit value.
```
$YOUR/lldb --source ./main
(lldb) target create main
Current executable set to '.../main' (x86_64).
(lldb) breakpoint set --file main.c --line 11
Breakpoint 1: where = main`main + 45 at main.c:11:3, address = 0x000000000000164d
(lldb) settings set stop-line-count-before 20
(lldb) process launch
Process 2568735 launched: '.../main' (x86_64)
Process 2568735 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555564d main`main at main.c:11:3
   1    #include <stdint.h>
   2
   3    int main() {
   4      __asm__ volatile (
   5          "pinsrq $0, %[lo], %%xmm0\n\t"  // insert low 64 bits
   6          "pinsrq $1, %[hi], %%xmm0"      // insert high 64 bits
   7          :
   8          : [lo]"r"(0x7766554433221100),
   9            [hi]"r"(0xffeeddccbbaa9988)
   10     );
-> 11     return 0;
   12   }
(lldb) register read --format hex xmm0
    xmm0 = 0x7766554433221100ffeeddccbbaa9988
```

You can see that the upper and lower 64-bit wide halves are swapped.

---------

Co-authored-by: Matej Košík <[email protected]>
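For reference, the value one would expect to see can be worked out independently of LLDB. The sketch below assumes the little-endian byte layout of x86-64 and prints the 16 bytes as a single integer, most significant byte first; `toLittleEndian` and `toHex` are hypothetical helpers and have nothing to do with the `RegisterValue` implementation.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>
#include <string>

// Lay out a 128-bit value as 16 little-endian bytes: the low 64 bits occupy
// bytes 0..7 and the high 64 bits occupy bytes 8..15.
inline std::array<uint8_t, 16> toLittleEndian(uint64_t lo, uint64_t hi) {
  std::array<uint8_t, 16> bytes{};
  for (int i = 0; i < 8; ++i) {
    bytes[i] = static_cast<uint8_t>(lo >> (8 * i));
    bytes[8 + i] = static_cast<uint8_t>(hi >> (8 * i));
  }
  return bytes;
}

// Format the bytes as one 128-bit hex integer, most significant byte first.
inline std::string toHex(const std::array<uint8_t, 16> &bytes) {
  std::string out = "0x";
  char buf[3];
  for (int i = 15; i >= 0; --i) {
    std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(bytes[i]));
    out += buf;
  }
  return out;
}
```

With `lo = 0x7766554433221100` and `hi = 0xffeeddccbbaa9988`, this produces `0xffeeddccbbaa99887766554433221100`, i.e. the two 64-bit halves in the opposite order from the output shown in step (5).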

…lvm#162993)

Early if conversion can create instruction sequences such as
```
mov  x1, #1
csel x0, x1, x2, eq
```
which could be simplified into the following instead
```
csinc x0, x2, xzr, ne
```
One notable example that generates code like this is `cmpxchg weak`.

This is fixed by handling an immediate value of 1 as `add(wzr, 1)` so that the addition can be folded into CSEL by using CSINC instead.
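The equivalence of the two sequences can be checked with a small model of the instruction semantics. The sketch relies on the architectural definition `CSINC Rd, Rn, Rm, cond` meaning `Rd = cond ? Rn : Rm + 1`, with `xzr` reading as zero; the function names are made up for illustration.

```cpp
#include <cstdint>

// Model of:  mov x1, #1 ; csel x0, x1, x2, eq
inline uint64_t movThenCsel(bool eq, uint64_t x2) {
  const uint64_t x1 = 1;
  return eq ? x1 : x2;
}

// Model of:  csinc x0, x2, xzr, ne
// CSINC yields the first source when the condition holds, else second + 1.
inline uint64_t csinc(bool eq, uint64_t x2) {
  const bool ne = !eq;
  const uint64_t xzr = 0; // the zero register always reads as 0
  return ne ? x2 : xzr + 1;
}
```

Both functions return 1 when `eq` holds and `x2` otherwise, so the single CSINC replaces the MOV/CSEL pair and frees the register that held the constant.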

klausler and others added 30 commits July 18, 2024 16:32

[flang] A nested STRUCTURE must declare entities (llvm#99379)

0684db3

When a DEC legacy STRUCTURE definition appears within another, its STRUCTURE statement must also declare some components of the enclosing structure. Fixes llvm#99288.

[gn build] Port 4120570

9e4c236

[sanitizer] Use strict-whitespace in tests

59441f2

[bazel] Port llvm#98403 (llvm#99592)

f304b88

[sanitizer_common] Use %p to print addresses (llvm#98578)

bf4347b

Pointers print more leading zeroes for better alignment.

[compiler-rt] Fix a warning

467f969

This patch fixes: compiler-rt/lib/sanitizer_common/sanitizer_tls_get_addr.cpp:126:72: error: format specifies type 'void *' but the argument has type 'uptr *' (aka 'unsigned long *') [-Werror,-Wformat-pedantic]

[compiler-rt] Cleanup use of COMPILER_RT_INCLUDE_TESTS (llvm#98246)

d4b28fb

1. Move checks into parent test/CMakeLists.txt 2. COMPILER_RT_INCLUDE_TESTS disable both lit and gtests. Before it was very inconsistent between sanitizers.

[NFC][sanitizer] Fix unused variable 'RegName' warning

98ebdd0

[SandboxIR][Tracker] Track Instruction::moveBefore() (llvm#99568)

cbbd153

This implements tracking of moving instrs with `moveBefore()`.

[clang] [hexagon] handle --unwindlib arg (llvm#99552)

962d018

Signed-off-by: Brian Cain <[email protected]>

[gn build] Port e475bb7

401d7bc

[CodeGen] Remove checks for vectors in unsigned division prior to computing leading zeros (llvm#99524)

8717407

It turns out we can safely use DAG.computeKnownBits(N0).countMinLeadingZeros() with constant legal vectors, so remove the check for it.

[GlobalIsel] import G_SCMP and G_UCMP (llvm#99518)

f554dd7

See llvm#98894

[ADT] Use UnorderedElementsAre in SetOperationsTest.cpp (NFC) (llvm#99596)

687fc08

[X86][Driver] Enable feature zu for -mapxf

88e9bd8

This is follow-up for llvm#78901 after validation. Drop the comments for stability since zu is the last feature for cpuid APX_F.

[BOLT] Add MC dependency for Profile

79a0b66

banach-space and others added 18 commits July 21, 2024 17:44

[mlir][test] Add comments in a test (nfc) (llvm#99810)

14a543e

Documents which patterns are tested in: * vector-transfer-collapse-inner-most-dims.mlir.

[ARM,Hexagon] Ignore IsVerboseAsm parameter in favor of MCStreamer::isVerboseAsm()

233cca1

... to improve consistency. Most targets don't use VerboseAsm. When they do (X86, SystemZ), they use MCStreamer::isVerboseAsm().

[MC] Remove unnecessary isVerboseAsm from Target::AsmTargetStreamerCtorTy

8f14e39

The parameter is confusing as it duplicates MCStreamer::isVerboseAsm (initialized from MCTargetOptions::AsmVerbose). After 233cca1, no in-tree target uses the parameter.

[MC] Remove unnecessary isVerboseAsm from createAsmTargetStreamer

e9c8514

[MC] Drop unnecessary MCSymbol::setExternal calls for ELF

7f017f0

Similar to e4c360a (2020).

[MC] Move isPrivateExtern to MCSymbolMachO

e299b16

*AsmBackend.cpp: Include StringSwitch.h

6717dc5

They currently get the header from MCLinkerOptimizationHint.h, which will be removed from MCAssembler.h.

[MC] Move LOHContainer to MachObjectwriter

a2af375

Reapply "Add source file name for template instantiations in -ftime-trace" (llvm#99757)

ecaacd1

Reverts llvm#99731. Remove accidentally added temporary file. Also, fix the uninitialized read of line number.

[Flang][Runtime] Fix implicit conversion warning when targeting 32bit architecture (llvm#99465)

6db2465

On 32 bit systems, TypeParameterValue is 64bit wide while CFI_index_t is 32bit wide.

[MC] Export llvm::SPIRVObjectTargetWriter and drop reliance on Mach-o specific VersionInfo

ffcd7e9

[libc][math][c23] Add entrypoints and tests for dsqrt{l,f128} (llvm#99815)

c156237

[MC] Move VersionInfo to MachObjectWriter

09a399a

[Transforms] Use range-based for loops (NFC) (llvm#99607)

5c83498

[clang-format] Fix a bug in annotating StartOfName (llvm#99791)

dcebe29

Fixes llvm#99758.

tcreech-intel force-pushed the sample_period branch from ec30f56 to 3c857dd Compare July 21, 2024 21:28

tcreech-intel force-pushed the sample_period branch from 3c857dd to 5684a43 Compare July 21, 2024 21:32
