Bump to fe2119a7b08b with conflict resolution (1) #258

mgehre-amd · 2024-08-15T12:47:20Z

No description provided.

…lvm#84962) This PR includes an initial scheduler model that shows improvement on multiple workloads over NoSchedModel and SiFive7Model for sifive-p670. We plan on making significant changes to this model in the future so that it is more accurate. This patch would close llvm#80612.

CI checks were passing in llvm#84962 (c48d818) but that commit caused failures once merged due to ships passing since the PR was not rebased on llvm#85131. This commit fixes this problem by adding sched resources for integer min max instructions from Zbb in P600 model.

This adds an API call ompx_dump_mapping_tables. This allows users to debug the mapping tables and can be especially useful for unified shared memory applications to check if the code behaves in the way it should. The implementation reuses code already present to dump mapping tables (in a debug setting). --------- Co-authored-by: Joseph Huber <[email protected]>

…vm#85458) Setting GCC_INSTALL_PREFIX leads to a warning (llvm#77537). Link: https://discourse.llvm.org/t/add-gcc-install-dir-deprecate-gcc-toolchain-and-remove-gcc-install-prefix/65091 Link: https://discourse.llvm.org/t/correct-cmake-parameters-for-building-clang-and-lld-for-riscv/72833

… with `unreachable` Since some of the users of `CodeExtractor` like `HotColdSplitting` run late in the pipeline, returns are not cleaned to `unreachable`. So, just emit `unreachable` directly if the function is `noreturn`. Closes llvm#84682

The folding of the SCALE() intrinsic function is implemented via multiplication by a power of two; this simplifies handling of exceptional cases. But sometimes scaling by a power of two requires an exponent larger or smaller than a floating-point format can represent, and two multiplications are required.
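The need for a second multiplication can be seen in a minimal C++ sketch. This is not flang's implementation: the helper name `scaleByPowerOfTwo` and the greedy two-factor split are assumptions made for illustration, the code assumes IEEE-754 binary64 `double`, and it does not address the double rounding that can occur when the intermediate product is subnormal.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: computes x * 2^k using only multiplications by
// powers of two that are themselves representable as normal doubles.
double scaleByPowerOfTwo(double x, int k) {
  constexpr int kMinExp = -1022; // 2^-1022 is the smallest normal double
  constexpr int kMaxExp = 1023;  // 2^1023 is the largest power of two

  // Take as much of the exponent as one representable factor allows.
  const int first = std::clamp(k, kMinExp, kMaxExp);
  const int rest = k - first;
  assert(rest >= kMinExp && rest <= kMaxExp &&
         "this sketch handles at most two factors");

  double result = x * std::ldexp(1.0, first);
  if (rest != 0) {
    // 2^k itself is out of range, so a second multiplication is needed.
    result *= std::ldexp(1.0, rest);
  }
  return result;
}
```

For example, scaling 2^-1000 by 2^2000 has the representable result 2^1000, but the factor 2^2000 is not representable, so the sketch multiplies by 2^1023 and then by 2^977.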

Reland of llvm#84991. A downstream overlay mode user ran into issues with the isnan macro not working in our sources with a specific libc configuration. This patch replaces the last direct includes of math.h with our internal math_macros.h, along with the necessary build system changes.

- Instead of lowering float/double ISD::ATOMIC_LOAD / ISD::ATOMIC_STORE nodes to regular LOAD/STORE nodes, make them legal and select those nodes properly instead. This avoids exposing them to the DAGCombiner.
- AtomicExpand pass no longer casts float/double atomic load/stores to integer (FP128 is still cast).

The code that prints ValueObjects is duplicated across two different cases of the dwim-print command, and a subsequent commit will add a third case. As such, this commit factors out the common code into a lambda. A free function was considered, but there is too much function-local context required in that. We also reword some of the comments so that they stop counting cases, making it easier to add other cases later.

…_AT_specification (llvm#85485) According to the DWARF spec a DIE that has DW_AT_specification or DW_AT_abstract_origin can be part of .debug_name if a DIE those attribute points to has DW_AT_name or DW_AT_linkage_name.

## Abstract

This pull request removes the `__workaround_52970` concept. This concept is a workaround for a bug described in llvm#52970, which causes the compiler to trigger ADL on a pointer to an incomplete type in an SFINAE context. This bug is fixed in Clang 14.

## Reference

- [[clang] Don't typo-fix an expression in a SFINAE context](https://reviews.llvm.org/D117603)
- [[libc++] [ranges] ADL-proof the [range.access] CPOs.](https://reviews.llvm.org/D116239)

…lvm#84405) We would like the resolver to be generated eagerly, even if the versioned function is not called from the current translation unit. Fixes llvm#81494. It further allows Multi Versioning to work even if the default target version attribute is omitted from function declarations.

…m#84545) This fixes the following failure when doing a clean build (in particular no .ninja* lying around) of lib/libMLIRLinalgToStandard.a only:

```
In file included from mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h:12,
                 from mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h:21,
                 from mlir/lib/Conversion/LinalgToStandard/LinalgToStandard.cpp:15:
mlir/include/mlir/Dialect/Vector/Transforms/VectorRewritePatterns.h:20:10: fatal error: mlir/Dialect/Vector/Transforms/VectorTransformsEnums.h.inc: No such file or directory
```

Most FIR passes only look for FIR operations inside of functions (either because they run only on func.func or they run on the module but iterate over functions internally). But there can also be FIR operations inside of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they only contained very simple FIR code which doesn't need most passes to be lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure that every pass runs on all of these container operations. I will write an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also run on omp reduction operations. This is similar to how the AbstractResult pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6
Previous PR: llvm#84952
Next PR: llvm#84954

---------

Co-authored-by: Mats Petersson <[email protected]>

…84954) OpenMP reduction declare operations can contain FIR code which needs to be lowered to LLVM. With array reductions, these regions can contain more complicated operations which need PreCGRewriting. A similar extra case was already needed for fir::GlobalOp. OpenMP array reductions 3/6 Previous PR: llvm#84953 Next PR: llvm#84955

As part of the migration to ptradd (https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699), we need to change the representation of the `inrange` attribute, which is used for vtable splitting.

Currently, inrange is specified as follows:

```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```

The `inrange` is placed on a GEP index, and all accesses must be "in range" of that index.

The new representation is as follows:

```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```

This specifies which offsets are "in range" of the GEP result. The new representation will continue working when canonicalizing to ptradd representation:

```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```

The inrange offsets are relative to the return value of the GEP. An alternative design could make them relative to the source pointer instead. The result-relative format was chosen on the off-chance that we want to extend support to non-constant GEPs in the future, in which case this variant is more expressive.

This implementation "upgrades" the old inrange representation in bitcode by simply dropping it. This is a very niche feature, and I don't think trying to upgrade it is worthwhile. Let me know if you disagree.
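The numbers in the example (`i64 48` and `inrange(-16, 16)`) can be reproduced with a little arithmetic. The following C++ sketch is only a worked calculation, not LLVM code: it assumes 8-byte pointers, hard-codes the `{ [4 x ptr], [4 x ptr] }` layout from the example, and uses hypothetical helper names (`gepOffset`, `resultRelativeRange`).

```cpp
// Assumption: pointers are 8 bytes, so each [4 x ptr] field is 32 bytes.
constexpr long kPtrSize = 8;
constexpr long kArrayLen = 4;
constexpr long kArrayBytes = kArrayLen * kPtrSize;

struct Range {
  long begin; // inclusive, relative to the GEP result
  long end;   // exclusive, relative to the GEP result
};

// Byte offset of element `elem` of array field `field` from the start of
// the struct, i.e. the offset produced by indices (0, field, elem).
constexpr long gepOffset(long field, long elem) {
  return field * kArrayBytes + elem * kPtrSize;
}

// Bounds of the array field selected by `field`, expressed relative to the
// GEP result rather than to the source pointer.
constexpr Range resultRelativeRange(long field, long elem) {
  return {field * kArrayBytes - gepOffset(field, elem),
          (field + 1) * kArrayBytes - gepOffset(field, elem)};
}

// Indices (0, 1, 2) from the example: the result is 48 bytes into the
// struct, and the second array spans [-16, 16) around it.
static_assert(gepOffset(1, 2) == 48, "matches the ptradd form's i64 48");
static_assert(resultRelativeRange(1, 2).begin == -16, "lower bound");
static_assert(resultRelativeRange(1, 2).end == 16, "upper bound");
```

Under these assumptions, the second array occupies bytes 32 to 64 of the struct, and the GEP result sits at byte 48, which gives the result-relative range from -16 to 16.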

It looks like the mappings for call instructions were forgotten here. This fixes a bug in OpenMP when in-lining a region containing call operations multiple times. OpenMP array reductions 4/6 Previous PR: llvm#84954 Next PR: llvm#84957

…m#85819)

- Give the tablegen record for the Real the same name as the tablegen record for the pseudo. This removes all cases where the same instruction name has to be mentioned more than once on the definition line.
- Use multiclasses for all Real definitions, to allow suffixes to be added bit by bit, e.g. first _SADDR and then _gfx11. This is a similar approach to the one used in BUFInstructions.td.

This fixes the following failure when doing a clean build (in particular no .ninja* lying around) of lib/libMLIRAffineAnalysis.a only: In file included from mlir/lib/Dialect/Affine/Analysis/AffineAnalysis.cpp:20: mlir/include/mlir/Dialect/Func/IR/FuncOps.h:29:10: fatal error: mlir/Dialect/Func/IR/FuncOps.h.inc: No such file or directory

This fixes the following failure when doing a clean build (in particular no .ninja* lying around) of lib/libMLIRSPIRVDialect.a only:

```
mlir/lib/Dialect/SPIRV/IR/SPIRVDialect.cpp:17:
mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h:120:10: fatal error: mlir/Dialect/GPU/IR/CompilationAttrInterfaces.h.inc: No such file or directory
```

…tion." (llvm#85914) Reverts llvm#84405. Between passing the precommit tests on GitHub and being merged, some change (perhaps in the AArch64 backend?) landed that altered the generated resolver. I will regenerate the tests, perhaps using a RUN line that is less sensitive to such changes.

…ate (llvm#84173) Adds a second parameter (default to 0) to isLegalAddImmediate, to represent a scalable immediate. Extends the AArch64 implementation to match immediates based on what addvl and inc[h|w|d] support.

This has been tested with arrays with compile-time constant bounds. Allocatable arrays and arrays with non-constant bounds are not yet supported. User-defined reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too without a lot of extra work (mostly there are bugs in OpenMPIRBuilder I haven't fixed yet).

We need some way to get these runtime bounds into the reduction init and combiner regions. To keep things simple for now I opted to always box the array arguments so the box can be passed as one argument and the lower bounds and extents read from the box. This has the disadvantage of resulting in fir.box_dim operations inside of the critical section. If these prove to be a performance issue, we could follow OpenACC reading box lower bounds and extents before the reduction and passing them as block arguments to the reduction init and combiner regions. I would prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I don't think it is worth supporting FIR-only lowering because the plan is for that to be removed soon.

OpenMP array reductions 6/6
Previous PR: llvm#84957

michaelmaitland and others added 30 commits March 18, 2024 13:44

[flang][NFC] Fix include style (llvm#85655)

85d7fef

[cmake] Disable FatLTO in clang build for Fuchsia (llvm#85677)

457f762

We're seeing an issue on Macs, which shouldn't be using this config, so we will temporarily disable this while we investigate.

[LICM] Drop nsw/nuw flags on affected instructions in hoistMulAddAsso…

1261c02

…ciation. (llvm#85486) Since we are introducing new multiplies earlier in the arithmetic, the nsw/nuw flags on later instructions are no longer accurate. Fixes llvm#85457.
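The general hazard can be shown with a small C++ sketch. It is not the exact LICM transform: it only illustrates, for 8-bit signed arithmetic, that distributing a multiply over an add can introduce overflow where the original expression had none, so no-wrap flags cannot simply be carried over. The helper names are hypothetical.

```cpp
#include <cstdint>

// True if v is representable as a signed 8-bit integer, i.e. an i8
// operation producing v would not wrap in the signed sense.
constexpr bool fitsI8(int v) { return v >= INT8_MIN && v <= INT8_MAX; }

// Original form: (a + b) * c. Every intermediate value must fit in i8.
constexpr bool originalStaysInRange(int a, int b, int c) {
  return fitsI8(a + b) && fitsI8((a + b) * c);
}

// Reassociated form: a * c + b * c. The new multiplies happen earlier and
// can overflow even when the original expression did not.
constexpr bool distributedStaysInRange(int a, int b, int c) {
  return fitsI8(a * c) && fitsI8(b * c) && fitsI8(a * c + b * c);
}

// With a = 100, b = -100, c = 2 the original computes 0 * 2 = 0, while the
// distributed form computes 100 * 2 = 200, which does not fit in i8.
static_assert(originalStaysInRange(100, -100, 2), "original has no overflow");
static_assert(!distributedStaysInRange(100, -100, 2), "new multiply overflows");
```

The intermediate values are computed in `int` so that the out-of-range case can be detected without invoking undefined behavior.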

[mlir][nvgpu] Support strided memref when creating TMA descriptor (ll…

7d55b91

…vm#85652)

[lld] Fix warnings

6800f42

This patch fixes:

lld/MachO/ObjC.cpp:633:12: error: unused variable 'expectedListSize' [-Werror,-Wunused-variable]
lld/MachO/ObjC.cpp:1034:12: error: unused variable 'newCatDef' [-Werror,-Wunused-variable]

[VPlan] Store VPlan directly in VPRecipeBuilder (NFCI).

8578b6e

Instead of passing VPlan in a number of places, just store it directly in VPRecipeBuilder. A single instance is only used for a single VPlan. This simplifies the code and was suggested by @nikolaypanchenko in llvm#84464.

Revert "[RemoveDIs] Enable direct-to-bitcode writing by default"

9a5c0d6

This reverts commit e419084. Likely cause of buildbot failure: https://lab.llvm.org/buildbot/#/builders/179/builds/9629

[libc] Remove fileno from GPU entrypoints

cf835b9

[InstSimply] Add tests for simplify (fmul -x, +/-0); NFC

6984ba7

[InstSimply] Simplify (fmul -x, +/-0) -> -/+0

5265be1

We already handle the `+x` case, and noticed it was missing in the bug affecting llvm#82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes llvm#85345
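The sign rule behind this fold can be checked directly in C++. The sketch below assumes IEEE-754 arithmetic without fast-math flags, reads `-x` as an operand known to be negative and finite (an interpretation of the commit title, not something the message states), and uses a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Returns true if x * zero has its sign bit set, i.e. the product is -0.0.
// `zero` must be +0.0 or -0.0, and x must be finite, because
// infinity * 0 is NaN rather than a zero.
bool productHasSignBit(double x, double zero) {
  assert(zero == 0.0 && "second operand must be +0.0 or -0.0");
  assert(std::isfinite(x) && "x must be finite");
  return std::signbit(x * zero);
}
```

For a negative `x`, `productHasSignBit(x, 0.0)` is true and `productHasSignBit(x, -0.0)` is false, which matches the `-/+0` result in the commit title.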

[CodeGen] More uses of LocationSize::beforeOrAfterPointer().

18da51b

As an extension to llvm#84751, this adds some extra uses of beforeOrAfterPointer() instead of UnknownSize.

[ELF] Change getSymbolIndex to use const reference. NFC

ea72c08

[flang] Reduce recursion in common::visit (llvm#85483)

0007d7e

This patch yields small speed-ups in compiler build and execution times, but more importantly, reduces the stack depth needed in a build environment where tail call optimization does not appear to occur.

[flang][runtime] Round hex REAL input correctly with excess digits (l…

7eb5d4f

…lvm#85587) Excess hexadecimal digits were too significant for rounding purposes, leading to inappropriate rounding away from zero for some modes.

[flang] Catch inappropriate attributes for PARAMETERs (llvm#85588)

53e8d50

There are several symbol attributes that cannot be applied to named constants, but that weren't being checked.

[flang] Fix crash on erroneous program (llvm#85615) (llvm#85659)

d0d9839

Replace a pointer that should never be null with a reference argument so that it's always defined. Fixes llvm#85615.

Add missing includes (to fix the modules build)

7459f72

[ELF] Improve --pack-dyn-relocs tests for Android and RELR

228757f

[MLIR] Remove unused implicit capture in the lambda (NFC)

e0b19e9

This lambda does not capture anything, the `&` is just misleading.

remove incorrect DCHECK

7ef1a59

This DCHECK was not valid for hwasan_load1 and was not necessary. The function is written without any assumptions about the alignment of the pointer.

xiaoyang-sde and others added 25 commits March 20, 2024 09:49

[AMDGPU] Add test for fpext & fptrunc with bf16. (llvm#85909)

070d1e8

Authored-by: Pravin Jagtap <[email protected]>

[PowerPC] Fix operand regclass of XSTSTDCSP

d032638

Revert "Enable exp10 libcall on linux (llvm#68736)"

eefef90

This reverts commit 9848fa4. Causes buildbot failures.

[flang][NFC] move extractSequenceType helper out of OpenACC to share …

3b0a426

…code (llvm#84957) Moving extractSequenceType to FIRType.h so that this can also be used from OpenMP. OpenMP array reductions 5/6 Previous PR: llvm#84955 Next PR: llvm#84958

[OpenMP] Disable workshare_chunk.c test case on SystemZ

fe13412

The test added by llvm#83261 has been consistently failing. Mark as UNSUPPORTED just like on x86_64 and aarch64.

[AArch64] Support scalable offsets with isLegalAddressingMode (llvm#8…

cd768ec

…3255) Allows us to indicate that an addressing mode featuring a vscale-relative immediate offset is supported.

[bazel] Add a missing dependency for aa95aa6

d93cfd8

[libc++][NFC] Remove uses of add_{const,cv,volatile} (llvm#85635)

7227ec9

These traits can be expressed without having to instantiate any classes, reducing compile times slightly.
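As a rough illustration of the idea (not the libc++ patch itself), the qualifiers can be written directly on the type instead of going through the trait class templates. The alias names below are hypothetical; the `static_assert`s check that the results agree with the standard traits.

```cpp
#include <type_traits>

// Hypothetical aliases: apply the qualifier directly instead of
// instantiating std::add_const, std::add_volatile, or std::add_cv.
template <class T> using AddConst = const T;
template <class T> using AddVolatile = volatile T;
template <class T> using AddCV = const volatile T;

static_assert(std::is_same_v<AddConst<int>, std::add_const_t<int>>);
static_assert(std::is_same_v<AddVolatile<int>, std::add_volatile_t<int>>);
static_assert(std::is_same_v<AddCV<int*>, std::add_cv_t<int*>>);
// Qualifiers applied to a reference type through an alias are ignored,
// which is also what the standard traits produce.
static_assert(std::is_same_v<AddConst<int&>, std::add_const_t<int&>>);
```

Writing `const T` names the type directly, so the compiler has no trait class specialization to instantiate.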

[libc++][CI] Reenables the module tests. (llvm#85799)

22f2056

These were disabled due to ODR violations with mixed versions of clang-tidy and the clang libraries. This issue was fixed in e19e860. This reverts commit 0bbada9.

[gn build] Port 36a3f8f

6086937

[VectorCombine] foldBitcastShuffle - include the cost of bitcasts in …

fe2119a

…the comparison This makes no real difference currently as we only fold unary shuffles, but I'm hoping to handle binary shuffles in a future patch.

Merge commit 'fe2119a7b08b' into matthias.bump_to_fe2119a7b08b

4990e5c

mgehre-amd requested a review from cferry-AMD August 15, 2024 12:47

cferry-AMD approved these changes Aug 15, 2024

mgehre-amd enabled auto-merge August 15, 2024 13:27

mgehre-amd merged commit de8cc8f into feature/fused-ops Aug 15, 2024

mgehre-amd deleted the matthias.bump_to_fe2119a7b08b branch August 15, 2024 14:10

126 participants
