Upload VPUX LLVM into OpenSource #2
Closed
The `IsSPMD` global can only be read by threads other than the main thread *after* initialization is complete. To allow usage of `mapping::getBlockSize` before initialization is done, we can pass the `IsSPMD` state explicitly. This is similar to other APIs that take `IsSPMD` explicitly to avoid such a race, e.g., `mapping::isInitialThreadInLevel0(IsSPMD)` Fixes llvm/llvm-project#53857 (cherry picked from commit 57b4c52)
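The difference between reading the mode from a global and passing it explicitly can be sketched as follows. This is a minimal illustration, not the device runtime's code: `gIsSPMD`, `kHardwareThreads`, `kWarpSize`, and both functions are hypothetical names, and the rule that generic mode reserves one warp for the main thread is an assumption made for the example.

```cpp
#include <cstdint>

// Hypothetical stand-ins for hardware queries (assumed values).
constexpr std::uint32_t kHardwareThreads = 128;
constexpr std::uint32_t kWarpSize = 32;

// Hypothetical global: written by the main thread during initialization.
// Other threads may only read it once initialization has finished.
bool gIsSPMD = false;

// Racy if called by a non-main thread before initialization completes,
// because it reads the global.
std::uint32_t getBlockSizeFromGlobal() {
  return kHardwareThreads - (gIsSPMD ? 0u : kWarpSize);
}

// Safe at any time: the caller supplies the mode, so no global is read.
std::uint32_t getBlockSizeExplicit(bool isSPMD) {
  return kHardwareThreads - (isSPMD ? 0u : kWarpSize);
}
```

A caller that already knows its execution mode can use the explicit variant before initialization is done.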
When we move an allocation from the heap to the stack we need to allocate it in the alloca AS and then cast the result. This also prevents us from inserting the alloca after the allocation call but rather right before. Fixes llvm/llvm-project#53858 (cherry picked from commit 8ad39fb)
Instead of doing an inbounds strip first and another non-inbounds strip afterward for equality comparisons, directly do a single inbounds or non-inbounds strip based on whether we have an equality predicate or not. This is NFC-ish in that the alloca equality codepath is the only part that sees additional non-inbounds offsets now, and for that codepath it doesn't matter whether or not the GEP is inbounds, as it does a stronger check itself. InstCombine would infer inbounds for such GEPs. (cherry picked from commit f35af77)
While testing LLVM 14.0.0 rc1 on Solaris, compilation `FAIL`ed with
/var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/Analysis/Presburger/Utils.cpp: In lambda function:
/var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/Analysis/Presburger/Utils.cpp:48:58: error: call of overloaded ‘floor(int64_t)’ is ambiguous
48 | [gcd](int64_t &n) { return floor(n / gcd); });
| ^
...
/usr/gcc/10/lib/gcc/sparcv9-sun-solaris2.11/10.3.0/include-fixed/iso/math_iso.h:201:21:
note: candidate: ‘long double std::floor(long double)’
201 | inline long double floor(long double __X) { return __floorl(__X); }
| ^~~~~
/usr/gcc/10/lib/gcc/sparcv9-sun-solaris2.11/10.3.0/include-fixed/iso/math_iso.h:165:15:
note: candidate: ‘float std::floor(float)’
165 | inline float floor(float __X) { return __floorf(__X); }
| ^~~~~
/usr/gcc/10/lib/gcc/sparcv9-sun-solaris2.11/10.3.0/include-fixed/iso/math_iso.h:78:15:
note: candidate: ‘double std::floor(double)’
78 | extern double floor __P((double));
| ^~~~~
The same issue had already occurred in the past, cf. D108750
<https://reviews.llvm.org/D108750>, and the solution is the same: cast the
`floor` arg to `double`.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D119324
(cherry picked from commit 9159675)
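A minimal sketch of the workaround, assuming the cast is applied to the whole `floor` argument so that the integer division is unchanged. `divideByGCD` is a hypothetical helper written for this example, not the code in `Utils.cpp`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Divide every element by gcd. The explicit cast to double means exactly one
// std::floor overload is viable, so the call is unambiguous even when the
// system headers provide no integral overload.
void divideByGCD(std::vector<std::int64_t> &values, std::int64_t gcd) {
  assert(gcd != 0 && "gcd must be non-zero");
  for (std::int64_t &n : values)
    n = static_cast<std::int64_t>(std::floor(static_cast<double>(n / gcd)));
}
```

Without the cast, `floor(n / gcd)` passes an `int64_t`, and the three floating-point overloads listed in the error are equally good candidates.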
Large COFF section names are moved into the string table, and the section header field holds the offset into the string table, encoded in ASCII decimal for offsets of up to 7 digits and in base64 for larger offsets. Encoding the string table offset is done in a few places in the codebase, so it is helpful to move this operation into `BinaryFormat` so that it can be shared everywhere it's done. This patch therefore takes the implementation of this operation from `llvm/lib/MC/WinCOFFObjectWriter.cpp` and moves it into `BinaryFormat`. Reviewed By: jhenderson, rnk Differential Revision: https://reviews.llvm.org/D118793 (cherry picked from commit 85f4023)
The section name encoding for `llvm-objcopy` had two main issues. First, the size passed to `snprintf` in the original code was incorrect: because `snprintf` adds a null byte, the code could only encode offsets of 6 digits (`/`, `\0`, and 6 digits of the offset) rather than the 7 digits it should support. Second, it didn't support the base64 encoding for offsets larger than 7 digits. This issue specifically showed up when using the `clang-offload-bundler` with a binary containing a lot of symbols/sections, since it uses `llvm-objcopy` to add the sections containing the offload code. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D118692 (cherry picked from commit ddf528b)
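The encoding described in these two commits can be sketched roughly as below. This is an illustration under stated assumptions, not the `BinaryFormat` implementation: `encodeSectionNameSketch`, the constants, and the `//` prefix plus six-character, most-significant-first base64 layout are assumptions of this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

constexpr std::size_t kNameSize = 8;                       // header name field
constexpr std::uint64_t kMax7DecimalOffset = 9999999;      // 7 decimal digits
constexpr std::uint64_t kMaxBase64Offset = 0xFFFFFFFFFULL; // 64^6 - 1

// Writes the encoded offset into `out`; returns false if it does not fit.
bool encodeSectionNameSketch(std::array<char, kNameSize> &out,
                             std::uint64_t offset) {
  out.fill('\0');
  if (offset <= kMax7DecimalOffset) {
    // Format into a buffer with room for the terminating null byte, then copy
    // only the visible characters. Formatting straight into the 8-byte field
    // would leave room for just '/' and 6 digits.
    char buf[kNameSize + 1];
    std::snprintf(buf, sizeof(buf), "/%llu",
                  static_cast<unsigned long long>(offset));
    std::memcpy(out.data(), buf, std::strlen(buf));
    return true;
  }
  if (offset <= kMaxBase64Offset) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    out[0] = '/';
    out[1] = '/';
    // Fill the six remaining characters from least to most significant.
    for (int i = 7; i >= 2; --i) {
      out[i] = alphabet[offset % 64];
      offset /= 64;
    }
    return true;
  }
  return false; // offset too large for either form
}
```

With this sketch, offset `1234567` fills the whole field as `/1234567`, while `10000000` no longer fits in decimal and takes the base64 path.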
Currently, loading from or storing to a stack location with a structured load or store crashes in isAArch64FrameOffsetLegal as the opcodes are not handled by getMemOpInfo. This patch adds the opcodes for structured load/store instructions with an immediate index to getMemOpInfo & getLoadStoreImmIdx, setting appropriate values for the scale, width & min/max offsets. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D119338 (cherry picked from commit fc1b212)
…on-Neon types Fixes: #53679 Differential Revision: https://reviews.llvm.org/D119428 (cherry picked from commit c53ad72)
…::ReconstructShuffle Previously the code in AArch64TargetLowering::ReconstructShuffle assumed the input vectors were always fixed-width, however this is not always the case since you can extract elements from scalable vectors and insert into fixed-width ones. We were hitting crashes here for two different cases: 1. When lowering a fixed-length vector extract from a scalable vector with i1 element types. This happens due to the fact the i1 elements get promoted to larger integer types for fixed-width vectors and leads to sequences of INSERT_VECTOR_ELT and EXTRACT_VECTOR_ELT nodes. In this case AArch64TargetLowering::ReconstructShuffle will still fail to make a transformation, but at least it no longer crashes. 2. When lowering a sequence of extractelement/insertelement operations on mixed fixed-width/scalable vectors. For now, I've just changed AArch64TargetLowering::ReconstructShuffle to bail out if it finds a scalable vector. Tests for both instances described above have been added here: (1) CodeGen/AArch64/sve-extract-fixed-vector.ll (2) CodeGen/AArch64/sve-fixed-length-reshuffle.ll Differential Revision: https://reviews.llvm.org/D116602 (cherry picked from commit a57a7f3)
…tionPHI The code was relying upon the implicit conversion of TypeSize to uint64_t and assuming the type in question was always fixed. However, I discovered an issue when running the canon-freeze pass with some IR loops that contains scalable vector types. I've changed the code to bail out if the size is unknown at compile time, since we cannot compute whether the step is a multiple of the type size or not. I added a test here: Transforms/CanonicalizeFreezeInLoops/phis.ll Differential Revision: https://reviews.llvm.org/D118696 (cherry picked from commit 1badfbb)
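The bail-out can be illustrated with a toy size type. `ToyTypeSize` and `stepIsMultipleOfSize` are hypothetical and much simpler than LLVM's `TypeSize`; the sketch only shows the shape of the check, not the pass itself.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a size that may be scalable.
struct ToyTypeSize {
  std::uint64_t minValue; // known minimum size
  bool scalable;          // true: real size is minValue times a runtime factor
};

// Returns false (bail out) when the size is not known at compile time.
// Otherwise stores the answer in `isMultiple` and returns true.
bool stepIsMultipleOfSize(std::uint64_t step, ToyTypeSize size,
                          bool &isMultiple) {
  if (size.scalable || size.minValue == 0)
    return false; // cannot reason about a multiple of an unknown size
  isMultiple = (step % size.minValue) == 0;
  return true;
}
```

Treating a scalable size as its minimum value would silently give an answer that holds for only one runtime vector length, which is why the sketch refuses to answer instead.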
The lowering code for shuffle_vector has a code path that looks through extract_subvector. This code path did not properly account for the potential presence of larger-than-Neon vector types and could produce unselectable DAG nodes. Differential Revision: https://reviews.llvm.org/D119252 (cherry picked from commit 98936ae)
While testing LLVM 14.0.0 rc1 on Solaris, I ran into a compile failure:
from /var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/ExecutionEngine/SparseTensorUtils.cpp:22:
/usr/include/sys/types.h:103:16: error: conflicting declaration ‘typedef short int index_t’
103 | typedef short index_t;
| ^~~~~~~
In file included from
/var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/lib/ExecutionEngine/SparseTensorUtils.cpp:17:
/var/llvm/llvm-14.0.0-rc1/rc1/llvm-project/mlir/include/mlir/ExecutionEngine/SparseTensorUtils.h:26:7:
note: previous declaration as ‘using index_t = uint64_t’
26 | using index_t = uint64_t;
| ^~~~~~~
The same issue had already occurred in the past and was fixed in D72619
<https://reviews.llvm.org/D72619>. A more detailed explanation can also be
found there.
Tested on `amd64-pc-solaris2.11` and `sparcv9-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D119323
(cherry picked from commit d2215e7)
Even after D86621 <https://reviews.llvm.org/D86621>, `clang -m32` on Solaris/sparcv9 doesn't inline atomics with 8-byte operands, unlike `gcc`. This leads to many link failures in the testsuite (undefined references to `__atomic_load_8` and `__sync_val_compare_and_swap_8`). Until a proper codegen fix can be implemented, this patch works around the first of those by linking with `-latomic`. Tested on `sparcv9-sun-solaris2.11`. Differential Revision: https://reviews.llvm.org/D118021 (cherry picked from commit a6afa9e)
(cherry picked from commit cedc23b)
…N_LINUX=on (cherry picked from commit da04744)
…IE_ON_LINUX=on (cherry picked from commit deee339)
CLANG_DEFAULT_PIE_ON_LINUX=on will soon become the default. The purpose of these tests has gone. (cherry picked from commit 0ac6be6)
prelink (will be removed by glibc 2.37) does not support PIE. (cherry picked from commit 6111228)
LLDB windows on ARM64 14.0.0 release will include LLDB binary. This patch adds a release note about it.
…uments without name-like identifier As originally reported by @steakhal in http://github.com/llvm/llvm-project/issues/54074, the name extraction logic of `readability-suspicious-call-argument` crashes if the argument passed to a function was a function call to a non-trivially named entity (e.g. an operator). Fixed this crash case by ignoring such constructs and considering them as having no name. Reviewed By: aaron.ballman, steakhal Differential Revision: http://reviews.llvm.org/D120555 (Cherry-picked from commit 416e689)
ctfconvert seems to use REL-format `.rel.SUNW_dof` for 32-bit architectures. ``` Binary file usr/ports/lang/perl5.32/work/perl-5.32.1/dtrace_mini.o matches [alfredo.junior@dell-a ~/tmp/llvm-bug]$ readelf -r dtrace_mini.o Relocation section (.rel.SUNW_dof): r_offset r_info r_type st_value st_name 00000184 0000281a R_PPC_REL32 00000000 $dtrace1772974259.Perl_dtrace_probe_load ``` Support R_PPC_REL32 to fix `ld.lld: error: drti.c:(.SUNW_dof+0x4E4): internal linker error: cannot read addend for relocation R_PPC_REL32`. While here, add some common relocation types for AArch64, PPC, and PPC64. We perform minimum tests. Reviewed By: adalava, arichardson Differential Revision: https://reviews.llvm.org/D120535 (cherry picked from commit 767e64f)
Skip the hip-fpie-option.hip Driver test if default-pie-on-linux is used. This test currently relies on default-no-pie, and it has been changed to require default-pie in main. Differential Revision: https://reviews.llvm.org/D120577
The no-pic large code model style `movabsq $callback, %rsi` does not work with -pie. (cherry picked from commit 84647ff)
ClangBuiltLinux/linux#1606 When GNU_PROPERTY_X86_FEATURE_1_IBT is enabled, ld.lld will create .plt output section even if there is no PLT entry. Fix this by implementing IBTPltSection::isNeeded instead of using the default code path (which always returns true). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120600 (cherry picked from commit 9d7001e)
According to Linux documentation (see e.g. https://linux.die.net/man/3/closedir): > A successful call to `closedir()` also closes the underlying file > descriptor associated with `dirp`. Thus, calling `close()` after a successful call to `closedir()` is at best redundant. Worse, should a different thread open a file in-between the calls to `closedir()` and `close()` and get the same file descriptor, the call to `close()` might actually close a different file than was intended. rdar://89251874 Differential Revision: https://reviews.llvm.org/D120453 (cherry picked from commit 3906ebf)
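A minimal POSIX-only sketch of the corrected pattern. `countDirEntries` is a hypothetical helper, not the patched code; the point is that `closedir()` is the only close call.

```cpp
#include <dirent.h>

// Counts the entries of a directory; returns -1 if it cannot be opened.
long countDirEntries(const char *path) {
  DIR *dir = opendir(path);
  if (dir == nullptr)
    return -1;
  long count = 0;
  while (readdir(dir) != nullptr)
    ++count;
  // closedir() also closes the underlying file descriptor. Calling close()
  // on that descriptor afterwards could close an unrelated file that another
  // thread opened in the meantime and that reused the same number.
  closedir(dir);
  return count;
}
```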
Differential Revision: https://reviews.llvm.org/D120065 (cherry picked from commit c79c13c)
Allocate on ASTContext, rather than just on heap, so that template parameter lists are freed up. Differential Revision: https://reviews.llvm.org/D120081 (cherry picked from commit 977b1f5)
(cherry picked from commit bcbb037)
A call to getInsertIndex() in getTreeCost() is returning None, which causes an assert because a non-constant index value for insertelement was not expected. This case occurs when the insertelement index value is defined with a PHI. Differential Revision: https://reviews.llvm.org/D120223 (cherry picked from commit 3cc15e2)
…ypes This patch fixes an invalid TypeSize->uint64_t implicit conversion in FoldReinterpretLoadFromConst. If the size of the constant is scalable we bail out of the optimisation for now. Tests added here: Transforms/InstCombine/load-store-forward.ll Differential Revision: https://reviews.llvm.org/D120240 (cherry picked from commit 47eff64)
This patch implements avr-gcc's calling convention: https://gcc.gnu.org/wiki/avr-gcc#Calling_Convention Reviewed By: aykevl Differential Revision: https://reviews.llvm.org/D120720 (cherry picked from commit 86c1d07)
* Remove `std::forward` call for `iterator_range` iterator de-reference. * Use exact type `T` in `has_StreamOperator` instead of constant reference. It fixes formatting usage for some tricky cases, like special ranges, which de-reference to value type. It fixes formatting usage with `mlir::Operation` type, which is always passed by non-const reference. Differential Revision: https://reviews.llvm.org/D94769
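Why the exact type matters for the detection trait can be sketched with a type whose stream operator takes a non-const reference. `HasStreamOperator`, `VoidT`, and `MutableOnlyOp` are hypothetical names for this example, and the trait is a simplified stand-in for `has_StreamOperator`, not its actual definition.

```cpp
#include <ostream>
#include <type_traits>
#include <utility>

// A type that can only be streamed through a non-const reference.
struct MutableOnlyOp {};
inline std::ostream &operator<<(std::ostream &os, MutableOnlyOp &) {
  return os << "op";
}

template <typename... Ts> struct MakeVoid { using type = void; };
template <typename... Ts> using VoidT = typename MakeVoid<Ts...>::type;

// Primary template: T is not streamable.
template <typename T, typename = void>
struct HasStreamOperator : std::false_type {};

// Chosen when `stream << value` is well-formed for a value of exactly type T.
template <typename T>
struct HasStreamOperator<
    T, VoidT<decltype(std::declval<std::ostream &>() << std::declval<T>())>>
    : std::true_type {};
```

Here `HasStreamOperator<MutableOnlyOp &>::value` is true, while probing with `const MutableOnlyOp &` yields false. A trait that always tests a constant reference therefore rejects such types.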
Remove `std::forward` call for `iterator_range` iterator de-reference. It fixes formatting usage for some tricky cases, like special ranges, which de-reference to value type. Differential Revision: https://reviews.llvm.org/D94769
It is added to the callbacks for MLIR -> outer format conversion. It can be used for translations that can't work with `raw_ostream`.
Based on the following discussion: https://llvm.discourse.group/t/declarative-assembly-format-requirement-for-type-presence/4399 Relax checks in `OperationParser` - it allows to skip value type specification, if the value was already defined in the same block. Differential Revision: https://reviews.llvm.org/D111650
* Fixed uninitialized array declaration in `mlir-tblgen/OpFormatGen.cpp`
* Fixed creating `void` methods returning value in `mlir-tblgen/OpInterfacesGen.cpp`
* Apply comment on OpInterfacesGen.cpp by nikita-kud
…orce Do not set CMAKE_CXX_FLAGS_<config> with FORCE
Dialect flags
Maxim-Doronin
pushed a commit
that referenced
this pull request
Feb 16, 2023
We experienced some deadlocks when logging multi-threaded builds, e.g. `make -j16`, with `scan-build`'s intercept-build tool.
```
(gdb) bt
#0 0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f2bb3d152e4 in ?? ()
#3 0x00007ffcc5f0cc80 in ?? ()
#4 0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
#5 0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
#7 0x00007f2bb3d144ee in ?? ()
#8 0x746e692f706d742f in ?? ()
#9 0x692d747065637265 in ?? ()
#10 0x2f653631326b3034 in ?? ()
#11 0x646d632e35353532 in ?? ()
#12 0x0000000000000000 in ?? ()
```
I think gcc's `exit` call caused the injected `libear.so` to be unloaded by `ld`, which in turn called `void on_unload() __attribute__((destructor))`. That tried to acquire a mutex that was still locked: `bear_report_call()` had probably encountered an error and returned early without unlocking it. All of this is speculation, since from the backtrace I could not verify that frames 2 and 3 in fact belong to the `libear.so` module. But I think it's a fairly safe bet. So, hereby I'm releasing the held mutex on *all paths*, even if some failure happens. PS: I would use lock_guards, but it's C.
Reviewed-by: NoQ
Differential Revision: https://reviews.llvm.org/D118439
(cherry picked from commit d919d02)
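The fix pattern, and the lock-guard alternative the author mentions, can be sketched in C++ as follows. The actual patch is C; `reportMutex`, `reportCallManual`, `reportCallGuarded`, and `mutexIsFree` are hypothetical names for this illustration only.

```cpp
#include <mutex>

static std::mutex reportMutex;

// C-style discipline: every exit path must unlock explicitly.
bool reportCallManual(bool simulateFailure) {
  reportMutex.lock();
  if (simulateFailure) {
    reportMutex.unlock(); // the step an early return must not skip
    return false;
  }
  reportMutex.unlock();
  return true;
}

// C++ alternative: the guard's destructor unlocks on every path.
bool reportCallGuarded(bool simulateFailure) {
  std::lock_guard<std::mutex> guard(reportMutex);
  return !simulateFailure;
}

// Helper for checking that no call left the mutex locked.
bool mutexIsFree() {
  if (!reportMutex.try_lock())
    return false;
  reportMutex.unlock();
  return true;
}
```

If the early-return branch skipped the unlock, a later lock attempt (such as the one in an unload handler) would block forever, which matches the hang in the backtrace.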
nikita-kud
pushed a commit
that referenced
this pull request
May 9, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault, instead of reporting a clean error ``` (base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda #0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc) #1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc) #2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628) #3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4) #4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8) #7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8) #8 0x000000019ae8c274 Segmentation fault: 11 ``` The underlying issue was that the `DeviceCompilerInstance` (used 
for device-side CUDA compilation) was never initialized with a `Sema`, which is required before constructing the `IncrementalCUDADeviceParser`. https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32 https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31 Unlike the host-side `CompilerInstance` which runs `ExecuteAction` inside the Interpreter constructor (thereby setting up Sema), the device-side CI was passed into the parser uninitialized, leading to an assertion or crash when accessing its internals. To fix this, I refactored the `Interpreter::create` method to include an optional `DeviceCI` parameter. If provided, we know we need to take care of this instance too. Only then do we construct the `IncrementalCUDADeviceParser`. (cherry picked from commit 21fb19f)
nikita-kud
pushed a commit
that referenced
this pull request
Jun 30, 2025
…e (#138091) Check this error for more context (https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531) This fails with ``` * thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3) * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99 frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830 frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58 frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838 frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13 frame #5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98 frame #6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102 frame #7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13 frame #8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919 ``` Problem : 1) The destructor currently handles no clearance for the DeviceParser and the DeviceAct. We currently only have this https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419 2) The ownership for DeviceCI currently is present in IncrementalCudaDeviceParser. But this should be similar to how the combination for hostCI, hostAction and hostParser are managed by the Interpreter. As on master the DeviceAct and DeviceParser are managed by the Interpreter but not DeviceCI. This is problematic because : IncrementalParser holds a Sema& which points into the DeviceCI. 
On master, DeviceCI is destroyed before the base class ~IncrementalParser() runs, causing Parser::reset() to access a dangling Sema (and as Sema holds a reference to Preprocessor which owns PragmaNamespace) we see this ``` * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99 frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830 ``` (cherry picked from commit 529b6fc)
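The ownership problem comes down to destruction order: an object holding a reference must be destroyed before the object it refers to. A minimal sketch with hypothetical stand-in types (`FakeCompilerInstance`, `FakeParser`, `FakeInterpreter`), not the Clang classes:

```cpp
#include <memory>
#include <string>
#include <vector>

// Records the order in which the toy objects are destroyed.
std::vector<std::string> &destructionLog() {
  static std::vector<std::string> log;
  return log;
}

struct FakeCompilerInstance {
  std::string name = "device";
  ~FakeCompilerInstance() { destructionLog().push_back("CI"); }
};

// Holds a reference into the compiler instance, as a parser holds Sema&.
struct FakeParser {
  explicit FakeParser(FakeCompilerInstance &ci) : CI(ci) {}
  // Safe only while the referenced instance is still alive.
  ~FakeParser() { destructionLog().push_back("Parser saw " + CI.name); }
  FakeCompilerInstance &CI;
};

// Members are destroyed in reverse declaration order, so the owner of the
// compiler instance is declared first and therefore destroyed last.
struct FakeInterpreter {
  std::unique_ptr<FakeCompilerInstance> DeviceCI =
      std::make_unique<FakeCompilerInstance>();
  std::unique_ptr<FakeParser> DeviceParser =
      std::make_unique<FakeParser>(*DeviceCI);
};
```

If the parser itself owned the compiler instance as a derived-class member, that member would be destroyed before the base-class destructor ran, which is the dangling-reference situation described above.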
No description provided.