LLVM and SPIRV-LLVM-Translator #1372
Merged
vladimirlaz merged 330 commits into intel:sycl from vladimirlaz:private/vlazarev/llvmspirv_pulldown on Mar 23, 2020
This reverts commit 8b409ea. Reverting this patch for now because it breaks some buildbots.
Summary: Add lowering support for inserting pointers or scalars into scalars, vectors or pointers Reviewers: arsenm, dsanders Reviewed By: arsenm Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, volkan, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75994
…g statement The nullptr check here was removed in 4ef50a3 when I replaced (nearly) all log->Print to LLDB_LOG calls (which automatically check for this stuff). But it seems this one call escaped my sed call. Currently working on a test that can cover this code path but we can revert this until I have found one.
I used the implementation for floor instead of round. It also turns out the OpenCL builtin library wasn't using the round builtin, but implemented the expanded form.
The GDB replay server sanity-checks that every packet it receives matches what it expects from the serialized packet log. This mechanism tripped for TestReproducerAttach.py on Linux, because one of the packets (jModulesInfo) uses run-length encoding. The replay server was comparing the expanded incoming packet with the unexpanded packet in the log. As a result, it claimed to have received an unexpected packet, which caused the test to fail. This patch addresses that issue by expanding the run-length encoding before comparing the packets. Differential revision: https://reviews.llvm.org/D76163
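The expansion step described above can be sketched as follows. This is a minimal illustration and not the code from the patch: the function names `expandRunLength` and `packetMatchesLog` are hypothetical, and the sketch assumes the GDB remote protocol convention that a `*` is followed by one character whose value minus 29 is the number of extra repeats of the preceding character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: expand GDB-remote-style run-length encoding.
// "X*N" means: X, followed by (N - 29) further copies of X.
std::string expandRunLength(const std::string &payload) {
  std::string out;
  for (std::size_t i = 0; i < payload.size(); ++i) {
    char c = payload[i];
    if (c == '*' && !out.empty() && i + 1 < payload.size()) {
      int repeat = static_cast<unsigned char>(payload[i + 1]) - 29;
      if (repeat > 0)
        out.append(static_cast<std::size_t>(repeat), out.back());
      ++i; // Skip the repeat-count character.
    } else {
      out.push_back(c);
    }
  }
  return out;
}

// Hypothetical check: compare an already-expanded incoming packet against a
// logged packet that may still be run-length encoded.
bool packetMatchesLog(const std::string &received, const std::string &logged) {
  return received == expandRunLength(logged);
}
```

For example, `expandRunLength("0* ")` yields `"0000"`, because the space character (ASCII 32) encodes 32 - 29 = 3 extra repeats.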
To group the code in one place, simplify it and make it easier to add the containsErrors bit and find existing bugs.
This reverts commit ddd20ed. The patch was landed by accident.
Summary: Previously, the range for "->" CXXOperatorCallExpr is the range of the class object (not including the operator!), e.g. "[[vector_ptr]]->size()". This patch includes the range of the operator, which fixes the issue where clangd doesn't go to the overloaded operator "->" definition. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, jkorous, arphaman, kadircet, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76128
The result is better if ftrunc is emitted and separately legalized when unavailable.
Summary: It's possible for an instance of the Visual Studio debugger to return a NoneType line number location when stepping during a debugging session. This patch teaches DexTer how to handle this particular case without crashing out. Reviewers: Orlando Differential revision: https://reviews.llvm.org/D75992
...to properly silence clang deprecation warnings in `test/std/utilities/meta/meta.trans/meta.trans.other/result_of11.pass.cpp`.
Fixes integers that don't evenly divide to i32 pieces. We should probably extract some of the code in the legalizer to start handling argument breakdowns. I'm dissatisfied with the argument lowering's handling of vectors for example, and we should not be producing the weird G_EXTRACTs we do now.
We were letting G_ANYEXT with a vcc register bank through, which was incorrect and would select to an invalid copy. Fix this up like G_ZEXT and G_SEXT. Also drop old code to fixup the non-boolean case in RegBankSelect. We now have to perform that expansion during selection, so there's no benefit to doing it during RegBankSelect.
This test case fails due to different handling of weak items between LLD and LD on PPC. The issue only occurs when the default linker is LLD and the test case is run on a system where ASLR is enabled.
This would hit an assertion from trying to use the wrong bitwidth for the constants.
Add optional support for opt-in partial reduction cases by providing an optional partial mask to indicate which elements have been extracted for the scalar reduction.
Summary: Copy of https://reviews.llvm.org/D72089 with Ilya's permission. See https://reviews.llvm.org/D72089 for the first batch of comments. Reviewers: gribozavr2 Reviewed By: gribozavr2 Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76220
This was creating natural aligned loads and stores, which may not be the case. The target could request a wider type load with less alignment.
Summary: Skip folds that rely on DataLayout::getTypeAllocSize(). For scalable vector, only minimal type alloc size is known at compile-time. Reviewers: sdesmalen, efriedma, spatel, apazos Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75892
Summary: Intrinsics and related codegen have been implemented for the following SVE instructions:

1. PRF<T> <prfop>, <Pg>, [<Xn|SP>, <Zm>.S, <mod>] -> 32-bit scaled offset
2. PRF<T> <prfop>, <Pg>, [<Xn|SP>, <Zm>.D, <mod>] -> 32-bit unpacked scaled offset
3. PRF<T> <prfop>, <Pg>, [<Xn|SP>, <Zm>.D] -> 64-bit scaled offset
4. PRF<T> <prfop>, <Pg>, [<Zn>.S{, #<imm>}] -> 32-bit element
5. PRF<T> <prfop>, <Pg>, [<Zn>.D{, #<imm>}] -> 64-bit element

The instructions are associated with the following intrinsics, respectively:

1. void @llvm.aarch64.sve.gather.prf<T>.scaled.<mod>.nx4vi32( i8* %base, <vscale x 4 x i32> %offset, <vscale x 4 x i1> %Pg, i32 %prfop)
2. void @llvm.aarch64.sve.gather.prf<T>.scaled.<mod>.nx2vi32( i8* %base, <vscale x 2 x i32> %offset, <vscale x 2 x i1> %Pg, i32 %prfop)
3. void @llvm.aarch64.sve.gather.prf<T>.scaled.nx2vi64( i8* %base, <vscale x 2 x i64> %offset, <vscale x 2 x i1> %Pg, i32 %prfop)
4. void @llvm.aarch64.sve.gather.prf<T>.nx4vi32( <vscale x 4 x i32> %bases, i64 %imm, <vscale x 4 x i1> %Pg, i32 %prfop)
5. void @llvm.aarch64.sve.gather.prf<T>.nx2vi64( <vscale x 2 x i64> %bases, i64 %imm, <vscale x 2 x i1> %Pg, i32 %prfop)

The intrinsics are the IR counterpart of the following SVE ACLE functions:

* void svprf<T>(svbool_t pg, const void *base, svprfop op)
* void svprf<T>_vnum(svbool_t pg, const void *base, int64_t vnum, svprfop op)
* void svprf<T>_gather[_u32base](svbool_t pg, svuint32_t bases, svprfop op)
* void svprf<T>_gather[_u64base](svbool_t pg, svuint64_t bases, svprfop op)
* void svprf<T>_gather_[s32]offset(svbool_t pg, const void *base, svint32_t offsets, svprfop op)
* void svprf<T>_gather_[u32]offset(svbool_t pg, const void *base, svint32_t offsets, svprfop op)
* void svprf<T>_gather_[s64]offset(svbool_t pg, const void *base, svint64_t offsets, svprfop op)
* void svprf<T>_gather_[u64]offset(svbool_t pg, const void *base, svint64_t offsets, svprfop op)
* void svprf<T>_gather[_u32base]_offset(svbool_t pg, svuint32_t bases, int64_t offset, svprfop op)
* void svprf<T>_gather[_u64base]_offset(svbool_t pg, svuint64_t bases, int64_t offset, svprfop op)

Reviewers: andwar, sdesmalen, efriedma, rengolin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75580
This adds the Arm Optimized Routines (see https://github.com/ARM-software/optimized-routines) source code under the LLVM license. The version of the code provided in this patch is v20.02 of the Arm Optimized Routines project. This entire contribution is being committed as is even though it does not currently fit the LLVM libc model and does not follow the LLVM coding style. In the near future, implementations from this patch will be moved over to their right place in the LLVM-libc tree. This will be done over many small patches, all of which will go through the normal LLVM code review process. See this libc-dev post for the plan: http://lists.llvm.org/pipermail/libc-dev/2020-March/000044.html Differential revision of the original upload: https://reviews.llvm.org/D75355
This reverts commit 45555c3. Causes clang crashes in some cases, see comments on https://reviews.llvm.org/D75815 for details (including repro steps).
Also, add config.mk file which will help test the implementations in the "math" directory for x86_64 with a simple "make check".
MSVC is unable to deduce template types when the type involves auto.
Fixes deprecation warning in EXPENSIVE_CHECKS builds.
…nstants As pointed out in https://bugs.llvm.org/show_bug.cgi?id=45232 this code can end up shifting a 64-bit unsigned value left by 64 bits. Although this works as expected on some platforms, it is definitely UB. This patch removes the UB and adds the associated test case. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45232
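To illustrate the class of bug (not the actual code touched by the patch), here is a minimal sketch of a mask builder that avoids shifting a 64-bit value by its full width. The function name `lowBitsMask` is hypothetical. In C++, shifting a 64-bit unsigned value by 64 or more is undefined behaviour, so the full-width case has to be handled separately.

```cpp
#include <cstdint>

// Hypothetical helper: returns a value with the low `bits` bits set.
// A naive `(1ULL << bits) - 1` is undefined behaviour when bits == 64,
// so that case is special-cased instead of being shifted.
std::uint64_t lowBitsMask(unsigned bits) {
  if (bits >= 64)
    return ~std::uint64_t{0};
  return (std::uint64_t{1} << bits) - 1;
}
```

With this guard, `lowBitsMask(64)` returns an all-ones value on every platform rather than depending on how the hardware shift instruction treats an out-of-range count.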
arm64e adds support for pointer authentication, which was adopted by libplatform to harden setjmp/longjmp and friends. We need to teach the TSan interceptors for those functions about this. Reviewed By: kubamracek Differential Revision: https://reviews.llvm.org/D76257
HIPToolChain::TranslateArgs calls TranslateArgs of the host toolchain with the input args to get a list of derived args called DAL, then goes through the input args itself and appends them to DAL. This assumes that the host toolchain does not append any unchanged args to DAL; otherwise there will be duplicates, since HIPToolChain will append them again. This works for the GNU toolchain since it returns an empty list for DAL. However, the MSVC toolchain appends unchanged args to DAL, which causes duplicate args. This patch makes the MSVC toolchain not append unchanged args for the HIP offloading kind, which fixes this issue. Differential Revision: https://reviews.llvm.org/D76032
…LEMENT/OR/BSWAP/BITREVERSE instructions (PR36319) These are all covered by the bswap/bitreverse vector tests.
Added parsing/sema/serialization support for extended device clause in executable target directives.
This is fixing up various places that use the implicit TypeSize->uint64_t conversion. The new overloads in MemoryLocation.h are already used in various places that construct a MemoryLocation from a TypeSize, including MemorySSA. (They were using the implicit conversion before.) Differential Revision: https://reviews.llvm.org/D76249
Submitted as obvious.
…rrectness. This patch rewrites the RegisterBankEmitter class to derive RegisterClassHierarchy from CodeGenTarget::getRegBank() rather than constructing our own copy. All are now accessed through a const reference. Differential Revision: https://reviews.llvm.org/D76006
This patch generates TableGen descriptions for the specified register banks which contain a list of register sizes corresponding to the available HwModes. The appropriate size is used during codegen according to the current HwMode. As this HwMode was not available on generation, it is set upon construction of the RegisterBankInfo class. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Differential Revision: https://reviews.llvm.org/D76007
Summary: This is somewhat complex (annoying) as it involves directly tracking the uses within each of the callgraph nodes, and updating them as needed during inlining. The benefit of this is that we can have a more exact cost model, enable inlining some otherwise non-inlinable cases, and also ensure that newly dead callables are properly disposed of. Differential Revision: https://reviews.llvm.org/D75476
Summary: The memory history plugin for Asan creates a HistoryThread with the recorded PC values provided by the Asan runtime. In other cases, those PCs are gathered by LLDB directly. The PCs returned by the Asan runtime are the PCs of the calls in the backtrace, not the return addresses you would normally get when unwinding the stack (look for a call to GetPreviousInstructionPc in AsanGetStack). When the above addresses are passed to the unwinder, it will subtract 1 from each address of the non-zero frames because it treats them as return addresses. This can lead to the final report referencing the wrong line. This patch fixes this issue by threading a flag through HistoryThread and HistoryUnwinder that tells them to treat every frame like the first one. The Asan MemoryHistory plugin can then use this flag. This fixes running TestMemoryHistory on arm64 devices, although it's hard to guarantee that the test will continue to exhibit the boundary condition that triggers this bug. Reviewers: jasonmolenda, kubamracek Subscribers: kristof.beyls, danielkiss, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D76341
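The effect of the flag can be sketched as follows. This is an illustrative model only, not LLDB's HistoryThread or HistoryUnwinder code: the function `lookupAddresses` and the parameter `pcsAreCallAddresses` are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of symbolication address selection.
// Frames above frame 0 normally hold return addresses, so the lookup address
// is pc - 1 to land inside the calling instruction. When the PCs already
// point at the call (as described for the Asan runtime), every frame is
// treated like frame 0 and no adjustment is made.
std::vector<std::uint64_t>
lookupAddresses(const std::vector<std::uint64_t> &pcs,
                bool pcsAreCallAddresses) {
  std::vector<std::uint64_t> result;
  result.reserve(pcs.size());
  for (std::size_t i = 0; i < pcs.size(); ++i) {
    bool isReturnAddress = i > 0 && !pcsAreCallAddresses;
    result.push_back(isReturnAddress && pcs[i] > 0 ? pcs[i] - 1 : pcs[i]);
  }
  return result;
}
```

Without the flag, a call address that happens to be the first instruction of a source line would be moved back by one byte into the previous line's code, which is the kind of boundary condition the summary says is hard to trigger reliably.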
Summary: These intrinsics will be used to lower vector transfer read/write. Reviewers: aartbik, tetuante, jsetoain Reviewed By: aartbik Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75986
…ructions (PR36319)
Summary: The usage story for NDEBUG isn't fleshed out yet, so this revision ensures that none of the diagnostic code exists in the binary. Differential Revision: https://reviews.llvm.org/D76372
…s CSE when setting the nofpexcept flag for constrained intrinsics SelectionDAG CSEs nodes based on their result type and operands, but not their flags. The flags are expected to be intersected when they are CSEd. In SelectionDAGBuilder, for FP nodes we manage both the fast math flags and the nofpexcept flag after the nodes have already been CSEd when they were created with getNode. The management of the fastmath flags before the constrained nodes prevents the nofpexcept management from working correctly. This commit moves the FMF handling for constrained intrinsics into their visitor and disables the common FMF handling for these nodes. Differential Revision: https://reviews.llvm.org/D75224
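The interaction between CSE and flags can be illustrated with a toy node pool. This is a simplified sketch and not SelectionDAG's implementation: the class `NodePool`, its methods, and the flag constants are all hypothetical. It only shows the general idea stated above, that nodes are uniqued on opcode and operands, and that a node reused through CSE keeps only the flags common to every request.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical flag bits for this sketch.
enum : std::uint8_t { NoNaNs = 1, NoInfs = 2, NoFPExcept = 4 };

// Toy node pool: nodes are uniqued by (opcode, operand ids), not by flags.
class NodePool {
  using Key = std::tuple<int, std::vector<int>>;
  std::map<Key, int> index;        // (opcode, operands) -> node id
  std::vector<std::uint8_t> flags; // per-node flags

public:
  int getNode(int opcode, const std::vector<int> &operands,
              std::uint8_t nodeFlags) {
    Key key(opcode, operands);
    auto it = index.find(key);
    if (it != index.end()) {
      // CSE hit: keep only the flags that both requests agree on.
      flags[it->second] &= nodeFlags;
      return it->second;
    }
    int id = static_cast<int>(flags.size());
    flags.push_back(nodeFlags);
    index.emplace(std::move(key), id);
    return id;
  }

  std::uint8_t flagsOf(int id) const { return flags[id]; }
};
```

In this model, a flag set on a node after it has already been merged with an earlier node applies to both users, which is why the order in which flags are managed relative to node creation matters.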
This tool is used for generating and manipulating GSYM files. Differential Revision: https://reviews.llvm.org/D76204
The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660
Extension is published as intel#1290
Co-Authored-By: Nikita Rudenko
Co-Authored-By: Anton Sidorenko
Co-Authored-By: Alexey Sachkov
This reverts commit f45e370. Signed-off-by: Vladimir Lazarev
LLVM: e24e95f
LLVM-SPIRV-Translator: 7a0767f2