Skip to content

Conversation

…" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test"

This reverts commits 677cc15 and 78bc1b6.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots.
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU llvm:globalisel llvm:transforms labels Jul 15, 2024
@dyung dyung merged commit adaff46 into llvm:main Jul 15, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-clang

Author: None (dyung)

Changes

This reverts commits 677cc15 and 78bc1b6.

The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following:

These bots have been broken for a day, so reverting to get everything back to green.


Patch is 16.75 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/98851.diff

562 Files Affected:

  • (modified) clang/test/CodeGenHIP/default-attributes.hip (+16-35)
  • (modified) llvm/docs/ReleaseNotes.rst (-4)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+5-8)
  • (modified) llvm/lib/Target/AMDGPU/SIFrameLowering.cpp (-6)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll (+36-116)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmin.ll (+36-116)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll (+259-278)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll (+278-297)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/bool-legalization.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/cvt_f32_ubyte.ll (+38-38)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/divergent-control-flow.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll (+15-15)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll (+138-138)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch-init.ll (+3-5)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll (+88-104)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fp-atomics-gfx940.ll (+14-66)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fp64-atomics-gfx90a.ll (+362-483)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/frem.ll (+56-56)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/implicit-kernarg-backend-usage-global-isel.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/inline-asm-mismatched-size.ll (-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.large.ll (+3-5)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel-system-sgprs.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-amdgpu_kernel.ll (+236-236)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-fence.ll (-120)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-sibling-call.ll (+148-121)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/lds-global-value.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.scale.ll (+233-295)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.end.cf.i32.ll (+5-5)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.end.cf.i64.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.csub.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.fadd.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.if.break.i32.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.if.break.i64.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll (+55-58)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.is.private.ll (+11-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.is.shared.ll (+11-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.kernarg.segment.ptr.ll (+3-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.mfma.gfx90a.ll (+17-17)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.mov.dpp.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.queue.ptr.ll (+2-13)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sbfe.ll (+49-49)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.set.inactive.ll (+43-43)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.trig.preop.ll (+16-16)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.ubfe.ll (+63-63)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.update.dpp.ll (+38-45)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workgroup.id.ll (+4-5)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.workitem.id.ll (+9-11)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/localizer.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll (+40-42)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/mul-known-bits.i64.ll (+48-66)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll (+23-23)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll (+369-369)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll (+17-18)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/store-local.128.ll (+86-86)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/store-local.96.ll (+86-86)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll (+243-243)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/vni8-across-blocks.ll (+98-98)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/widen-i8-i16-scalar-loads.ll (+30-30)
  • (modified) llvm/test/CodeGen/AMDGPU/add.ll (+159-168)
  • (modified) llvm/test/CodeGen/AMDGPU/add.v2i16.ll (+102-122)
  • (modified) llvm/test/CodeGen/AMDGPU/addrspacecast.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll (+64-65)
  • (modified) llvm/test/CodeGen/AMDGPU/agpr-register-count.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/always-uniform.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/amd.endpgm.ll (+17-17)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll (+1435-1442)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-mul24-knownbits.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-sincos.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu.private-memory.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu.work-item-intrinsics.deprecated.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/amdpal-elf.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/anyext.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll (+702-808)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll (+592-701)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll (+1059-1160)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_raw_buffer.ll (+554-652)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_struct_buffer.ll (+626-722)
  • (modified) llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/attr-amdgpu-waves-per-eu.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/attributor-noopt.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/bf16.ll (+288-288)
  • (modified) llvm/test/CodeGen/AMDGPU/bfe-combine.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/bfe-patterns.ll (+28-28)
  • (modified) llvm/test/CodeGen/AMDGPU/bfi_int.ll (+121-115)
  • (modified) llvm/test/CodeGen/AMDGPU/bfi_nested.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/bfm.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/bitreverse.ll (+73-89)
  • (modified) llvm/test/CodeGen/AMDGPU/br_cc.f16.ll (+32-32)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-relaxation.ll (+46-46)
  • (modified) llvm/test/CodeGen/AMDGPU/bswap.ll (+21-21)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fadd.ll (+9434-5415)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fmax.ll (+2787-2371)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fmin.ll (+2787-2371)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-rsrc-ptr-ops.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/build_vector.ll (+37-37)
  • (modified) llvm/test/CodeGen/AMDGPU/call-constexpr.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/call-graph-register-usage.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/call-reqd-group-size.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-sgprs-fixed-abi.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs-packed.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll (+1-3)
  • (modified) llvm/test/CodeGen/AMDGPU/calling-conventions.ll (+77-135)
  • (modified) llvm/test/CodeGen/AMDGPU/carryout-selection.ll (+382-394)
  • (modified) llvm/test/CodeGen/AMDGPU/cc-update.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/cf-loop-on-constant.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-gfx1030.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/cgp-addressing-modes-gfx908.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/cgp-bitfield-extract.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/clamp-modifier.ll (+101-133)
  • (modified) llvm/test/CodeGen/AMDGPU/clamp.ll (+272-455)
  • (modified) llvm/test/CodeGen/AMDGPU/cluster_stores.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/coalesce-vgpr-alignment.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/code-object-v3.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/codegen-internal-only-func.ll (+24-3)
  • (modified) llvm/test/CodeGen/AMDGPU/collapse-endcf.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/combine-cond-add-sub.ll (+113-113)
  • (modified) llvm/test/CodeGen/AMDGPU/combine-reg-or-const.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/combine-vload-extract.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+76-74)
  • (modified) llvm/test/CodeGen/AMDGPU/copy-to-reg-scc-clobber.ll (+16-16)
  • (modified) llvm/test/CodeGen/AMDGPU/copy_to_scc.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/ctlz.ll (+126-144)
  • (modified) llvm/test/CodeGen/AMDGPU/ctlz_zero_undef.ll (+136-136)
  • (modified) llvm/test/CodeGen/AMDGPU/ctpop16.ll (+44-44)
  • (modified) llvm/test/CodeGen/AMDGPU/ctpop64.ll (+62-62)
  • (modified) llvm/test/CodeGen/AMDGPU/cttz.ll (+84-84)
  • (modified) llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll (+83-83)
  • (modified) llvm/test/CodeGen/AMDGPU/cvt_f32_ubyte.ll (+110-141)
  • (modified) llvm/test/CodeGen/AMDGPU/dag-divergence-atomic.ll (+99-102)
  • (modified) llvm/test/CodeGen/AMDGPU/dagcomb-extract-vec-elt-different-sizes.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/dagcombine-setcc-select.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/divergence-driven-buildvector.ll (+76-76)
  • (modified) llvm/test/CodeGen/AMDGPU/divergence-driven-sext-inreg.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/divergence-driven-trunc-to-i1.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/ds-alignment.ll (+45-45)
  • (modified) llvm/test/CodeGen/AMDGPU/ds-combine-large-stride.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/ds-combine-with-dependence.ll (+4-6)
  • (modified) llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll (+34-46)
  • (modified) llvm/test/CodeGen/AMDGPU/ds_read2.ll (+126-117)
  • (modified) llvm/test/CodeGen/AMDGPU/ds_write2.ll (+75-75)
  • (modified) llvm/test/CodeGen/AMDGPU/early-inline.ll (-1)
  • (modified) llvm/test/CodeGen/AMDGPU/elf-notes.ll (+1-3)
  • (modified) llvm/test/CodeGen/AMDGPU/exec-mask-opt-cannot-create-empty-or-backward-segment.ll (+5-5)
  • (modified) llvm/test/CodeGen/AMDGPU/expand-scalar-carry-out-select-user.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/extract_vector_dynelt.ll (+5-5)
  • (modified) llvm/test/CodeGen/AMDGPU/extract_vector_elt-f16.ll (+101-112)
  • (modified) llvm/test/CodeGen/AMDGPU/extract_vector_elt-i16.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/extract_vector_elt-i8.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/extractelt-to-trunc.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/fabs.f16.ll (+79-88)
  • (modified) llvm/test/CodeGen/AMDGPU/fabs.ll (+56-56)
  • (modified) llvm/test/CodeGen/AMDGPU/fadd.f16.ll (+54-78)
  • (modified) llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll (+245-224)
  • (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.ll (+238-274)
  • (modified) llvm/test/CodeGen/AMDGPU/fcmp.f16.ll (+466-466)
  • (modified) llvm/test/CodeGen/AMDGPU/fcopysign.f16.ll (+290-311)
  • (modified) llvm/test/CodeGen/AMDGPU/fcopysign.f32.ll (+168-169)
  • (modified) llvm/test/CodeGen/AMDGPU/fcopysign.f64.ll (+219-218)
  • (modified) llvm/test/CodeGen/AMDGPU/fdiv.f16.ll (+127-155)
  • (modified) llvm/test/CodeGen/AMDGPU/fdiv.ll (+142-148)
  • (modified) llvm/test/CodeGen/AMDGPU/fdiv32-to-rcp-folding.ll (+46-46)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch-init.ll (+26-26)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch-svs.ll (+182-242)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch.ll (+343-371)
  • (modified) llvm/test/CodeGen/AMDGPU/flat_atomics.ll (+1653-1653)
  • (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i32_system.ll (+129-129)
  • (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll (+552-552)
  • (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system.ll (+49-49)
  • (modified) llvm/test/CodeGen/AMDGPU/fma-combine.ll (+377-411)
  • (modified) llvm/test/CodeGen/AMDGPU/fma.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/fmax3.ll (+16-16)
  • (modified) llvm/test/CodeGen/AMDGPU/fmax_legacy.f64.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/fmaximum.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/fmed3.ll (+348-470)
  • (modified) llvm/test/CodeGen/AMDGPU/fmin3.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/fmin_legacy.f64.ll (+16-16)
  • (modified) llvm/test/CodeGen/AMDGPU/fminimum.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/fmul-2-combine-multi-use.ll (+162-162)
  • (modified) llvm/test/CodeGen/AMDGPU/fmul.f16.ll (+122-122)
  • (modified) llvm/test/CodeGen/AMDGPU/fmuladd.f16.ll (+136-220)
  • (modified) llvm/test/CodeGen/AMDGPU/fnearbyint.ll (+61-62)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll (+36-36)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg-fabs.f16.ll (+89-89)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg-fabs.f64.ll (+29-29)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg-fabs.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg-modifier-casting.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg.f16.ll (+62-66)
  • (modified) llvm/test/CodeGen/AMDGPU/fneg.ll (+108-110)
  • (modified) llvm/test/CodeGen/AMDGPU/fp-atomics-gfx1200.ll (+6-116)
  • (modified) llvm/test/CodeGen/AMDGPU/fp-atomics-gfx940.ll (+318-24)
  • (modified) llvm/test/CodeGen/AMDGPU/fp-classify.ll (+161-161)
  • (modified) llvm/test/CodeGen/AMDGPU/fp-min-max-buffer-atomics.ll (+127-120)
  • (modified) llvm/test/CodeGen/AMDGPU/fp-min-max-buffer-ptr-atomics.ll (+120-113)
  • (modified) llvm/test/CodeGen/AMDGPU/fp16_to_fp32.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/fp16_to_fp64.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/fp32_to_fp16.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll (+630-372)
  • (modified) llvm/test/CodeGen/AMDGPU/fp64-min-max-buffer-atomics.ll (+64-64)
  • (modified) llvm/test/CodeGen/AMDGPU/fp64-min-max-buffer-ptr-atomics.ll (+64-64)
  • (modified) llvm/test/CodeGen/AMDGPU/fp_to_sint.ll (+59-59)
  • (modified) llvm/test/CodeGen/AMDGPU/fp_to_uint.ll (+52-52)
  • (modified) llvm/test/CodeGen/AMDGPU/fpext.f16.ll (+68-60)
  • (modified) llvm/test/CodeGen/AMDGPU/fptosi.f16.ll (+25-25)
  • (modified) llvm/test/CodeGen/AMDGPU/fptoui.f16.ll (+27-28)
  • (modified) llvm/test/CodeGen/AMDGPU/fptrunc.f16.ll (+80-80)
  • (modified) llvm/test/CodeGen/AMDGPU/fptrunc.ll (+94-96)
  • (modified) llvm/test/CodeGen/AMDGPU/frem.ll (+224-224)
  • (modified) llvm/test/CodeGen/AMDGPU/fshl.ll (+173-171)
  • (modified) llvm/test/CodeGen/AMDGPU/fshr.ll (+105-103)
  • (modified) llvm/test/CodeGen/AMDGPU/fsqrt.f32.ll (+89-92)
  • (modified) llvm/test/CodeGen/AMDGPU/fsub.f16.ll (+78-78)
  • (modified) llvm/test/CodeGen/AMDGPU/function-args-inreg.ll (+664-685)
  • (modified) llvm/test/CodeGen/AMDGPU/fused-bitlogic.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/gds-allocation.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/gep-const-address-space.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx11-user-sgpr-init16-bug.ll (+14-17)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd-wrong-subtarget.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll (+14212-7373)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll (+2624-1958)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll (+2624-1958)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomics-fp-wrong-subtarget.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/global-constant.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/global-i16-load-store.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/global-load-saddr-to-vaddr.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics.ll (+1351-1351)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+122-122)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64.ll (+812-812)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+53-53)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fadd.ll (+2239-3826)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmax.ll (+1266-2974)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmin.ll (+1266-2974)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fsub.ll (+2039-3626)
  • (modified) llvm/test/CodeGen/AMDGPU/global_smrd.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/half.ll (+183-183)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-agpr-register-count.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-heap-v5.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-hostcall-v4.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-hostcall-v5.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-kernel-code-props.ll (+24-34)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-multigrid-sync-arg-v5.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-queue-ptr-v5.ll (+7-9)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-queueptr-v5.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa-metadata-resource-usage-function-ordering.ll (+2-3)
  • (modified) llvm/test/CodeGen/AMDGPU/hsa.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/idiv-licm.ll (+214-220)
  • (modified) llvm/test/CodeGen/AMDGPU/idot2.ll (+328-347)
  • (modified) llvm/test/CodeGen/AMDGPU/idot4s.ll (+340-394)
  • (modified) llvm/test/CodeGen/AMDGPU/idot4u.ll (+638-736)
  • (modified) llvm/test/CodeGen/AMDGPU/idot8s.ll (+355-367)
  • (modified) llvm/test/CodeGen/AMDGPU/idot8u.ll (+459-466)
  • (modified) llvm/test/CodeGen/AMDGPU/imm.ll (+234-266)
  • (modified) llvm/test/CodeGen/AMDGPU/imm16.ll (+258-272)
  • (modified) llvm/test/CodeGen/AMDGPU/immv216.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/implicit-kernarg-backend-usage.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/implicitarg-attributes.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/indirect-call-known-callees.ll (+39-43)
  • (modified) llvm/test/CodeGen/AMDGPU/infinite-loop.ll (+10-6)
  • (modified) llvm/test/CodeGen/AMDGPU/inline-asm.i128.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/inline-attr.ll (+7-10)
  • (modified) llvm/test/CodeGen/AMDGPU/inlineasm-packed.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll (+366-368)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.ll (+426-426)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2bf16.ll (+285-282)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll (+283-344)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_precise_memory.ll (+130-130)
  • (modified) llvm/test/CodeGen/AMDGPU/ipra.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/kernarg-size.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/kernel-args.ll (+465-468)
  • (modified) llvm/test/CodeGen/AMDGPU/kernel-argument-dag-lowering.ll (+48-50)
  • (modified) llvm/test/CodeGen/AMDGPU/kill-infinite-loop.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/large-alloca-compute.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/lds-frame-extern.ll (+192-240)
  • (modified) llvm/test/CodeGen/AMDGPU/lds-zero-initializer.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/llc-pipeline.ll (+12)
diff --git a/clang/test/CodeGenHIP/default-attributes.hip b/clang/test/CodeGenHIP/default-attributes.hip
index ee16ecd134bfe..63572bfd242b9 100644
--- a/clang/test/CodeGenHIP/default-attributes.hip
+++ b/clang/test/CodeGenHIP/default-attributes.hip
@@ -8,68 +8,49 @@
 #define __device__ __attribute__((device))
 #define __global__ __attribute__((global))
 
-//.
-// OPTNONE: @__hip_cuid_ = addrspace(1) global i8 0
-// OPTNONE: @llvm.compiler.used = appending addrspace(1) global [1 x ptr] [ptr addrspacecast (ptr addrspace(1) @__hip_cuid_ to ptr)], section "llvm.metadata"
-// OPTNONE: @__oclc_ABI_version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 500
-//.
-// OPT: @__hip_cuid_ = addrspace(1) global i8 0
-// OPT: @__oclc_ABI_version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 500
-// OPT: @llvm.compiler.used = appending addrspace(1) global [1 x ptr] [ptr addrspacecast (ptr addrspace(1) @__hip_cuid_ to ptr)], section "llvm.metadata"
-//.
-__device__ void extern_func();
-
 // OPTNONE: Function Attrs: convergent mustprogress noinline nounwind optnone
 // OPTNONE-LABEL: define {{[^@]+}}@_Z4funcv
 // OPTNONE-SAME: () #[[ATTR0:[0-9]+]] {
 // OPTNONE-NEXT:  entry:
-// OPTNONE-NEXT:    call void @_Z11extern_funcv() #[[ATTR3:[0-9]+]]
 // OPTNONE-NEXT:    ret void
 //
-// OPT: Function Attrs: convergent mustprogress nounwind
+// OPT: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
 // OPT-LABEL: define {{[^@]+}}@_Z4funcv
 // OPT-SAME: () local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // OPT-NEXT:  entry:
-// OPT-NEXT:    tail call void @_Z11extern_funcv() #[[ATTR3:[0-9]+]]
 // OPT-NEXT:    ret void
 //
 __device__ void func() {
- extern_func();
+
 }
 
 // OPTNONE: Function Attrs: convergent mustprogress noinline norecurse nounwind optnone
 // OPTNONE-LABEL: define {{[^@]+}}@_Z6kernelv
-// OPTNONE-SAME: () #[[ATTR2:[0-9]+]] {
+// OPTNONE-SAME: () #[[ATTR1:[0-9]+]] {
 // OPTNONE-NEXT:  entry:
-// OPTNONE-NEXT:    call void @_Z11extern_funcv() #[[ATTR3]]
 // OPTNONE-NEXT:    ret void
 //
-// OPT: Function Attrs: convergent mustprogress norecurse nounwind
+// OPT: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
 // OPT-LABEL: define {{[^@]+}}@_Z6kernelv
-// OPT-SAME: () local_unnamed_addr #[[ATTR2:[0-9]+]] {
+// OPT-SAME: () local_unnamed_addr #[[ATTR1:[0-9]+]] {
 // OPT-NEXT:  entry:
-// OPT-NEXT:    tail call void @_Z11extern_funcv() #[[ATTR3]]
 // OPT-NEXT:    ret void
 //
 __global__ void kernel() {
- extern_func();
+
 }
 //.
-// OPTNONE: attributes #[[ATTR0]] = { convergent mustprogress noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
-// OPTNONE: attributes #[[ATTR1:[0-9]+]] = { convergent nounwind "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
-// OPTNONE: attributes #[[ATTR2]] = { convergent mustprogress noinline norecurse nounwind optnone "amdgpu-flat-work-group-size"="1,1024" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="true" }
-// OPTNONE: attributes #[[ATTR3]] = { convergent nounwind }
+// OPTNONE: attributes #0 = { convergent mustprogress noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+// OPTNONE: attributes #1 = { convergent mustprogress noinline norecurse nounwind optnone "amdgpu-flat-work-group-size"="1,1024" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="true" }
 //.
-// OPT: attributes #[[ATTR0]] = { convergent mustprogress nounwind "amdgpu-waves-per-eu"="4,10" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="false" }
-// OPT: attributes #[[ATTR1:[0-9]+]] = { convergent nounwind "amdgpu-waves-per-eu"="4,10" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="false" }
-// OPT: attributes #[[ATTR2]] = { convergent mustprogress norecurse nounwind "amdgpu-flat-work-group-size"="1,1024" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="true" }
-// OPT: attributes #[[ATTR3]] = { convergent nounwind }
+// OPT: attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
+// OPT: attributes #1 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "amdgpu-flat-work-group-size"="1,1024" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="true" }
 //.
-// OPTNONE: [[META0:![0-9]+]] = !{i32 1, !"amdhsa_code_object_version", i32 500}
-// OPTNONE: [[META1:![0-9]+]] = !{i32 1, !"amdgpu_printf_kind", !"hostcall"}
-// OPTNONE: [[META2:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
+// OPTNONE: !0 = !{i32 1, !"amdhsa_code_object_version", i32 500}
+// OPTNONE: !1 = !{i32 1, !"amdgpu_printf_kind", !"hostcall"}
+// OPTNONE: !2 = !{i32 1, !"wchar_size", i32 4}
 //.
-// OPT: [[META0:![0-9]+]] = !{i32 1, !"amdhsa_code_object_version", i32 500}
-// OPT: [[META1:![0-9]+]] = !{i32 1, !"amdgpu_printf_kind", !"hostcall"}
-// OPT: [[META2:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
+// OPT: !0 = !{i32 1, !"amdhsa_code_object_version", i32 500}
+// OPT: !1 = !{i32 1, !"amdgpu_printf_kind", !"hostcall"}
+// OPT: !2 = !{i32 1, !"wchar_size", i32 4}
 //.
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 55b3b486d705d..1dd7fce2334c9 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -139,10 +139,6 @@ Changes to the AMDGPU Backend
   :ref:`atomicrmw <i_atomicrmw>` instruction with `fadd`, `fmin` and
   `fmax` with addrspace(3) instead.
 
-* AMDGPUAttributor is no longer run as part of the codegen pass
-  pipeline. It is expected to run as part of the middle end
-  optimizations.
-
 Changes to the ARM Backend
 --------------------------
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 9ddf0a310ed06..f50a18ccc2188 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -731,14 +731,6 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
         PM.addPass(createCGSCCToFunctionPassAdaptor(std::move(FPM)));
       });
 
-  // FIXME: Why is AMDGPUAttributor not in CGSCC?
-  PB.registerOptimizerLastEPCallback(
-      [this](ModulePassManager &MPM, OptimizationLevel Level) {
-        if (Level != OptimizationLevel::O0) {
-          MPM.addPass(AMDGPUAttributorPass(*this));
-        }
-      });
-
   PB.registerFullLinkTimeOptimizationLastEPCallback(
       [this](ModulePassManager &PM, OptimizationLevel Level) {
         // We want to support the -lto-partitions=N option as "best effort".
@@ -1045,6 +1037,11 @@ void AMDGPUPassConfig::addIRPasses() {
     addPass(createAMDGPULowerModuleLDSLegacyPass(&TM));
   }
 
+  // AMDGPUAttributor infers lack of llvm.amdgcn.lds.kernel.id calls, so run
+  // after their introduction
+  if (TM.getOptLevel() > CodeGenOptLevel::None)
+    addPass(createAMDGPUAttributorLegacyPass());
+
   if (TM.getOptLevel() > CodeGenOptLevel::None)
     addPass(createInferAddressSpacesPass());
 
diff --git a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
index 8c951105101d9..97a8ff4486609 100644
--- a/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
@@ -679,12 +679,6 @@ void SIFrameLowering::emitEntryFunctionPrologue(MachineFunction &MF,
         break;
       }
     }
-
-    // FIXME: We can spill incoming arguments and restore at the end of the
-    // prolog.
-    if (!ScratchWaveOffsetReg)
-      report_fatal_error(
-          "could not find temporary scratch offset register in prolog");
   } else {
     ScratchWaveOffsetReg = PreloadedScratchWaveOffsetReg;
   }
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
index 359c1e53de99e..a38b6e3263882 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/addsubu64.ll
@@ -6,8 +6,8 @@ define amdgpu_kernel void @s_add_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
 ; GFX11-LABEL: s_add_u64:
 ; GFX11:       ; %bb.0: ; %entry
 ; GFX11-NEXT:    s_clause 0x1
-; GFX11-NEXT:    s_load_b128 s[4:7], s[2:3], 0x24
-; GFX11-NEXT:    s_load_b64 s[0:1], s[2:3], 0x34
+; GFX11-NEXT:    s_load_b128 s[4:7], s[0:1], 0x24
+; GFX11-NEXT:    s_load_b64 s[0:1], s[0:1], 0x34
 ; GFX11-NEXT:    v_mov_b32_e32 v2, 0
 ; GFX11-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX11-NEXT:    s_add_u32 s0, s6, s0
@@ -22,8 +22,8 @@ define amdgpu_kernel void @s_add_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
 ; GFX12-LABEL: s_add_u64:
 ; GFX12:       ; %bb.0: ; %entry
 ; GFX12-NEXT:    s_clause 0x1
-; GFX12-NEXT:    s_load_b128 s[4:7], s[2:3], 0x24
-; GFX12-NEXT:    s_load_b64 s[0:1], s[2:3], 0x34
+; GFX12-NEXT:    s_load_b128 s[4:7], s[0:1], 0x24
+; GFX12-NEXT:    s_load_b64 s[0:1], s[0:1], 0x34
 ; GFX12-NEXT:    v_mov_b32_e32 v2, 0
 ; GFX12-NEXT:    s_wait_kmcnt 0x0
 ; GFX12-NEXT:    s_add_nc_u64 s[0:1], s[6:7], s[0:1]
@@ -58,8 +58,8 @@ define amdgpu_kernel void @s_sub_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
 ; GFX11-LABEL: s_sub_u64:
 ; GFX11:       ; %bb.0: ; %entry
 ; GFX11-NEXT:    s_clause 0x1
-; GFX11-NEXT:    s_load_b128 s[4:7], s[2:3], 0x24
-; GFX11-NEXT:    s_load_b64 s[0:1], s[2:3], 0x34
+; GFX11-NEXT:    s_load_b128 s[4:7], s[0:1], 0x24
+; GFX11-NEXT:    s_load_b64 s[0:1], s[0:1], 0x34
 ; GFX11-NEXT:    v_mov_b32_e32 v2, 0
 ; GFX11-NEXT:    s_waitcnt lgkmcnt(0)
 ; GFX11-NEXT:    s_sub_u32 s0, s6, s0
@@ -74,8 +74,8 @@ define amdgpu_kernel void @s_sub_u64(ptr addrspace(1) %out, i64 %a, i64 %b) {
 ; GFX12-LABEL: s_sub_u64:
 ; GFX12:       ; %bb.0: ; %entry
 ; GFX12-NEXT:    s_clause 0x1
-; GFX12-NEXT:    s_load_b128 s[4:7], s[2:3], 0x24
-; GFX12-NEXT:    s_load_b64 s[0:1], s[2:3], 0x34
+; GFX12-NEXT:    s_load_b128 s[4:7], s[0:1], 0x24
+; GFX12-NEXT:    s_load_b64 s[0:1], s[0:1], 0x34
 ; GFX12-NEXT:    v_mov_b32_e32 v2, 0
 ; GFX12-NEXT:    s_wait_kmcnt 0x0
 ; GFX12-NEXT:    s_sub_nc_u64 s[0:1], s[6:7], s[0:1]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll
index 0a8e805027c77..9be8620b024eb 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll
@@ -2026,7 +2026,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX12-NEXT:    s_wait_samplecnt 0x0
 ; GFX12-NEXT:    s_wait_bvhcnt 0x0
 ; GFX12-NEXT:    s_wait_kmcnt 0x0
-; GFX12-NEXT:    v_dual_mov_b32 v1, v0 :: v_dual_mov_b32 v2, s6
+; GFX12-NEXT:    v_dual_mov_b32 v1, v0 :: v_dual_mov_b32 v2, s4
 ; GFX12-NEXT:    s_mov_b32 s4, 0
 ; GFX12-NEXT:    s_delay_alu instid0(VALU_DEP_1)
 ; GFX12-NEXT:    v_max_num_f32_e32 v3, v1, v1
@@ -2056,7 +2056,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX940-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX940:       ; %bb.0:
 ; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v2, s6
+; GFX940-NEXT:    v_mov_b32_e32 v2, s4
 ; GFX940-NEXT:    v_mov_b32_e32 v1, v0
 ; GFX940-NEXT:    buffer_load_dword v0, v2, s[0:3], 0 offen
 ; GFX940-NEXT:    s_mov_b64 s[4:5], 0
@@ -2083,7 +2083,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX11-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
 ; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT:    v_dual_mov_b32 v1, v0 :: v_dual_mov_b32 v2, s6
+; GFX11-NEXT:    v_dual_mov_b32 v1, v0 :: v_dual_mov_b32 v2, s4
 ; GFX11-NEXT:    s_mov_b32 s4, 0
 ; GFX11-NEXT:    s_delay_alu instid0(VALU_DEP_1)
 ; GFX11-NEXT:    v_max_f32_e32 v3, v1, v1
@@ -2114,14 +2114,10 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX10-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX10:       ; %bb.0:
 ; GFX10-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10-NEXT:    v_mov_b32_e32 v2, s18
-; GFX10-NEXT:    s_mov_b32 s4, s6
-; GFX10-NEXT:    s_mov_b32 s5, s7
-; GFX10-NEXT:    s_mov_b32 s6, s16
-; GFX10-NEXT:    s_mov_b32 s7, s17
+; GFX10-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX10-NEXT:    v_mov_b32_e32 v1, v0
-; GFX10-NEXT:    buffer_load_dword v0, v2, s[4:7], 0 offen
 ; GFX10-NEXT:    s_mov_b32 s8, 0
+; GFX10-NEXT:    buffer_load_dword v0, v2, s[4:7], 0 offen
 ; GFX10-NEXT:    v_max_f32_e32 v3, v1, v1
 ; GFX10-NEXT:  .LBB12_1: ; %atomicrmw.start
 ; GFX10-NEXT:    ; =>This Inner Loop Header: Depth=1
@@ -2147,11 +2143,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX90A-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX90A:       ; %bb.0:
 ; GFX90A-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX90A-NEXT:    s_mov_b32 s4, s6
-; GFX90A-NEXT:    s_mov_b32 s5, s7
-; GFX90A-NEXT:    s_mov_b32 s6, s16
-; GFX90A-NEXT:    s_mov_b32 s7, s17
-; GFX90A-NEXT:    v_mov_b32_e32 v2, s18
+; GFX90A-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX90A-NEXT:    v_mov_b32_e32 v1, v0
 ; GFX90A-NEXT:    buffer_load_dword v0, v2, s[4:7], 0 offen
 ; GFX90A-NEXT:    s_mov_b64 s[8:9], 0
@@ -2177,11 +2169,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX908-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX908:       ; %bb.0:
 ; GFX908-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX908-NEXT:    s_mov_b32 s4, s6
-; GFX908-NEXT:    s_mov_b32 s5, s7
-; GFX908-NEXT:    s_mov_b32 s6, s16
-; GFX908-NEXT:    s_mov_b32 s7, s17
-; GFX908-NEXT:    v_mov_b32_e32 v2, s18
+; GFX908-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX908-NEXT:    v_mov_b32_e32 v1, v0
 ; GFX908-NEXT:    buffer_load_dword v0, v2, s[4:7], 0 offen
 ; GFX908-NEXT:    s_mov_b64 s[8:9], 0
@@ -2208,11 +2196,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX8-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX8:       ; %bb.0:
 ; GFX8-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT:    s_mov_b32 s4, s6
-; GFX8-NEXT:    s_mov_b32 s5, s7
-; GFX8-NEXT:    s_mov_b32 s6, s16
-; GFX8-NEXT:    s_mov_b32 s7, s17
-; GFX8-NEXT:    v_mov_b32_e32 v2, s18
+; GFX8-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX8-NEXT:    v_mov_b32_e32 v1, v0
 ; GFX8-NEXT:    buffer_load_dword v0, v2, s[4:7], 0 offen
 ; GFX8-NEXT:    s_mov_b64 s[8:9], 0
@@ -2239,11 +2223,7 @@ define float @buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_m
 ; GFX7-LABEL: buffer_fat_ptr_agent_atomic_fmax_ret_f32__amdgpu_no_fine_grained_memory:
 ; GFX7:       ; %bb.0:
 ; GFX7-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX7-NEXT:    s_mov_b32 s4, s6
-; GFX7-NEXT:    s_mov_b32 s5, s7
-; GFX7-NEXT:    s_mov_b32 s6, s16
-; GFX7-NEXT:    s_mov_b32 s7, s17
-; GFX7-NEXT:    v_mov_b32_e32 v2, s18
+; GFX7-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX7-NEXT:    v_mov_b32_e32 v1, v0
 ; GFX7-NEXT:    buffer_load_dword v0, v2, s[4:7], 0 offen
 ; GFX7-NEXT:    s_mov_b64 s[8:9], 0
@@ -2278,7 +2258,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX12-NEXT:    s_wait_samplecnt 0x0
 ; GFX12-NEXT:    s_wait_bvhcnt 0x0
 ; GFX12-NEXT:    s_wait_kmcnt 0x0
-; GFX12-NEXT:    v_dual_mov_b32 v2, s6 :: v_dual_max_num_f32 v3, v0, v0
+; GFX12-NEXT:    v_dual_mov_b32 v2, s4 :: v_dual_max_num_f32 v3, v0, v0
 ; GFX12-NEXT:    s_mov_b32 s4, 0
 ; GFX12-NEXT:    buffer_load_b32 v1, v2, s[0:3], null offen
 ; GFX12-NEXT:  .LBB13_1: ; %atomicrmw.start
@@ -2305,7 +2285,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX940-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX940:       ; %bb.0:
 ; GFX940-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX940-NEXT:    v_mov_b32_e32 v2, s6
+; GFX940-NEXT:    v_mov_b32_e32 v2, s4
 ; GFX940-NEXT:    buffer_load_dword v1, v2, s[0:3], 0 offen
 ; GFX940-NEXT:    s_mov_b64 s[4:5], 0
 ; GFX940-NEXT:    v_max_f32_e32 v3, v0, v0
@@ -2331,7 +2311,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX11-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX11:       ; %bb.0:
 ; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT:    v_dual_mov_b32 v2, s6 :: v_dual_max_f32 v3, v0, v0
+; GFX11-NEXT:    v_dual_mov_b32 v2, s4 :: v_dual_max_f32 v3, v0, v0
 ; GFX11-NEXT:    s_mov_b32 s4, 0
 ; GFX11-NEXT:    buffer_load_b32 v1, v2, s[0:3], 0 offen
 ; GFX11-NEXT:  .LBB13_1: ; %atomicrmw.start
@@ -2359,14 +2339,10 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX10-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX10:       ; %bb.0:
 ; GFX10-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10-NEXT:    v_mov_b32_e32 v2, s18
-; GFX10-NEXT:    s_mov_b32 s4, s6
-; GFX10-NEXT:    s_mov_b32 s5, s7
-; GFX10-NEXT:    s_mov_b32 s6, s16
-; GFX10-NEXT:    s_mov_b32 s7, s17
+; GFX10-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX10-NEXT:    v_max_f32_e32 v3, v0, v0
-; GFX10-NEXT:    buffer_load_dword v1, v2, s[4:7], 0 offen
 ; GFX10-NEXT:    s_mov_b32 s8, 0
+; GFX10-NEXT:    buffer_load_dword v1, v2, s[4:7], 0 offen
 ; GFX10-NEXT:  .LBB13_1: ; %atomicrmw.start
 ; GFX10-NEXT:    ; =>This Inner Loop Header: Depth=1
 ; GFX10-NEXT:    s_waitcnt vmcnt(0)
@@ -2391,11 +2367,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX90A-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX90A:       ; %bb.0:
 ; GFX90A-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX90A-NEXT:    s_mov_b32 s4, s6
-; GFX90A-NEXT:    s_mov_b32 s5, s7
-; GFX90A-NEXT:    s_mov_b32 s6, s16
-; GFX90A-NEXT:    s_mov_b32 s7, s17
-; GFX90A-NEXT:    v_mov_b32_e32 v2, s18
+; GFX90A-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX90A-NEXT:    buffer_load_dword v1, v2, s[4:7], 0 offen
 ; GFX90A-NEXT:    s_mov_b64 s[8:9], 0
 ; GFX90A-NEXT:    v_max_f32_e32 v3, v0, v0
@@ -2420,11 +2392,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX908-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX908:       ; %bb.0:
 ; GFX908-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX908-NEXT:    s_mov_b32 s4, s6
-; GFX908-NEXT:    s_mov_b32 s5, s7
-; GFX908-NEXT:    s_mov_b32 s6, s16
-; GFX908-NEXT:    s_mov_b32 s7, s17
-; GFX908-NEXT:    v_mov_b32_e32 v2, s18
+; GFX908-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX908-NEXT:    buffer_load_dword v1, v2, s[4:7], 0 offen
 ; GFX908-NEXT:    s_mov_b64 s[8:9], 0
 ; GFX908-NEXT:    v_max_f32_e32 v3, v0, v0
@@ -2450,11 +2418,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX8-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX8:       ; %bb.0:
 ; GFX8-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT:    s_mov_b32 s4, s6
-; GFX8-NEXT:    s_mov_b32 s5, s7
-; GFX8-NEXT:    s_mov_b32 s6, s16
-; GFX8-NEXT:    s_mov_b32 s7, s17
-; GFX8-NEXT:    v_mov_b32_e32 v2, s18
+; GFX8-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX8-NEXT:    buffer_load_dword v1, v2, s[4:7], 0 offen
 ; GFX8-NEXT:    s_mov_b64 s[8:9], 0
 ; GFX8-NEXT:    v_mul_f32_e32 v3, 1.0, v0
@@ -2480,11 +2444,7 @@ define void @buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_
 ; GFX7-LABEL: buffer_fat_ptr_agent_atomic_fmax_noret_f32__amdgpu_no_fine_grained_memory:
 ; GFX7:       ; %bb.0:
 ; GFX7-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX7-NEXT:    s_mov_b32 s4, s6
-; GFX7-NEXT:    s_mov_b32 s5, s7
-; GFX7-NEXT:    s_mov_b32 s6, s16
-; GFX7-NEXT:    s_mov_b32 s7, s17
-; GFX7-NEXT:    v_mov_b32_e32 v2, s18
+; GFX7-NEXT:    v_mov_b32_e32 v2, s8
 ; GFX7-NEXT:    buffer_load_dword v1, v2, s[4:7], 0 offen
 ; GFX7-NEXT:    s_mov_b64 s[8:9], 0
 ; GFX7-NEXT:    v_mul_f32_e32 v3, 1.0, v0
@@ -2518,7 +2478,7 @@ define double @buffer_fat_ptr_agent_atomic_fmax_ret_f64__amdgpu_no_fine_grained_
 ; GFX12-NEXT:    s_wait_samplecnt 0x0
 ; GFX12-NEXT:    s_wait_...
[truncated]

arsenm added a commit that referenced this pull request Jul 15, 2024
… and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851)

This reverts commit adaff46.

Drop the -O3 checks from default-attributes.hip. I don't know why they
are different on some bots but reverting this is far too disruptive.
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 15, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building clang,llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/1975

Here is the relevant piece of the build log for the reference:

Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: api/omp_dynamic_shared_memory_mixed_amdgpu.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_dynamic_shared_memory_mixed_amdgpu.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_dynamic_shared_memory_mixed_amdgpu.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -O1 -mllvm -openmp-opt-inline-device -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api/omp_dynamic_shared_memory_mixed_amdgpu.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_dynamic_shared_memory_mixed_amdgpu.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -O1 -mllvm -openmp-opt-inline-device -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/api
# .---command stderr------------
# | clang-linker-wrapper: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:158: virtual bool llvm::AMDGPUResourceUsageAnalysis::runOnModule(llvm::Module&): Assertion `MF && "function must have been generated already"' failed.
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
# | Stack dump:
# | 0.	Program arguments: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper --opt-level=O1 --host-triple=x86_64-unknown-linux-gnu -mllvm -openmp-opt-inline-device --linker-path=/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/ld.lld -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -pie -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/api/Output/omp_dynamic_shared_memory_mixed_amdgpu.c.tmp /lib/x86_64-linux-gnu/Scrt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/lib/clang/19/lib/x86_64-unknown-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/lib -L/usr/lib -rpath /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -rpath /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -rpath /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib /tmp/lit-tmp-hgm8nh9i/omp_dynamic_shared_memory_mixed_amdgpu-820d3e.o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -lomp -lomptarget -L/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/lib -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /lib/x86_64-linux-gnu/crtn.o
# | 1.	Running pass 'Function register usage analysis' on module 'ld-temp.o'.
# |  #0 0x000056324f08b8cf llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x1b268cf)
# |  #1 0x000056324f088e14 SignalHandler(int) Signals.cpp:0:0
# |  #2 0x00007f48870a9420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
# |  #3 0x00007f4886b7600b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
# |  #4 0x00007f4886b55859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
# |  #5 0x00007f4886b55729 (/lib/x86_64-linux-gnu/libc.so.6+0x22729)
# |  #6 0x00007f4886b66fd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
# |  #7 0x000056324e076106 llvm::AMDGPUResourceUsageAnalysis::runOnModule(llvm::Module&) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0xb11106)
# |  #8 0x000056324e7c1855 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x125c855)
# |  #9 0x000056324f743c3d codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
# | #10 0x000056324f7455ed llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21e05ed)
# | #11 0x000056324f739cda llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d4cda)
# | #12 0x000056324f73a21c llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d521c)
# | #13 0x000056324dc0e2c0 llvm::Error (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int)::'lambda'(auto&)::operator()<llvm::SmallVector<llvm::object::OffloadFile, 3u>>(auto&) const ClangLinkerWrapper.cpp:0:0
# | #14 0x000056324dc12fe6 (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int) ClangLinkerWrapper.cpp:0:0
# | #15 0x000056324db3438f main (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x5cf38f)
# | #16 0x00007f4886b57083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
# | #17 0x000056324dbf2b5e _start (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x68db5e)
# |  #0 0x000056324f08b8cf llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x1b268cf)
# |  #1 0x000056324f088e14 SignalHandler(int) Signals.cpp:0:0
# |  #2 0x00007f48870a9420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
# |  #3 0x00007f4886b7600b raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300b)
# |  #4 0x00007f4886b55859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22859)
# |  #5 0x00007f4886b55729 (/lib/x86_64-linux-gnu/libc.so.6+0x22729)
# |  #6 0x00007f4886b66fd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
# |  #7 0x000056324e076106 llvm::AMDGPUResourceUsageAnalysis::runOnModule(llvm::Module&) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0xb11106)
# |  #8 0x000056324e7c1855 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x125c855)
# |  #9 0x000056324f743c3d codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
# | #10 0x000056324f7455ed llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21e05ed)
# | #11 0x000056324f739cda llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d4cda)
# | #12 0x000056324f73a21c llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d521c)
# | #13 0x000056324dc0e2c0 llvm::Error (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int)::'lambda'(auto&)::operator()<llvm::SmallVector<llvm::object::OffloadFile, 3u>>(auto&) const ClangLinkerWrapper.cpp:0:0
# | #14 0x000056324dc12fe6 (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int) ClangLinkerWrapper.cpp:0:0
# | #15 0x000056324db3438f main (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x5cf38f)
# | #16 0x00007f4886b57083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
# | #17 0x000056324dbf2b5e _start (/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x68db5e)
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 15, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-libc-amdgpu-runtime running on omp-vega20-1 while building clang,llvm at step 10 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/1842

Here is the relevant piece of the build log for the reference:

Step 10 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/bug51982.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/bug51982.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/bug51982.c.tmp /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libcgpu-amdgpu.a /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -O1 && /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/bug51982.c.tmp
# executed command: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/offload/test/offloading/bug51982.c -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/bug51982.c.tmp /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libcgpu-amdgpu.a /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -O1
# .---command stderr------------
# | clang-linker-wrapper: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.src/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:158: virtual bool llvm::AMDGPUResourceUsageAnalysis::runOnModule(llvm::Module&): Assertion `MF && "function must have been generated already"' failed.
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
# | Stack dump:
# | 0.	Program arguments: /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper --opt-level=O1 --host-triple=x86_64-unknown-linux-gnu --linker-path=/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/ld.lld -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -pie -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/bug51982.c.tmp /lib/x86_64-linux-gnu/Scrt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib -L/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/lib/clang/19/lib/x86_64-unknown-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/lib -L/usr/lib -rpath /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -rpath /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -rpath /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib /tmp/lit-tmp-mxeqx9rl/bug51982-837906.o /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libcgpu-amdgpu.a /home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -lomp -lomptarget -L/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/lib -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /lib/x86_64-linux-gnu/crtn.o
# | 1.	Running pass 'Function register usage analysis' on module 'ld-temp.o'.
# |  #0 0x000055607b0768cf llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x1b268cf)
# |  #1 0x000055607b073e14 SignalHandler(int) Signals.cpp:0:0
# |  #2 0x00007f37a7712420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
# |  #3 0x00007f37a71df00b raise /build/glibc-LcI20x/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
# |  #4 0x00007f37a71be859 abort /build/glibc-LcI20x/glibc-2.31/stdlib/abort.c:81:7
# |  #5 0x00007f37a71be729 get_sysdep_segment_value /build/glibc-LcI20x/glibc-2.31/intl/loadmsgcat.c:509:8
# |  #6 0x00007f37a71be729 _nl_load_domain /build/glibc-LcI20x/glibc-2.31/intl/loadmsgcat.c:970:34
# |  #7 0x00007f37a71cffd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
# |  #8 0x000055607a061106 llvm::AMDGPUResourceUsageAnalysis::runOnModule(llvm::Module&) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0xb11106)
# |  #9 0x000055607a7ac855 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x125c855)
# | #10 0x000055607b72ed4d codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
# | #11 0x000055607b7306fd llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21e06fd)
# | #12 0x000055607b724dea llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d4dea)
# | #13 0x000055607b72532c llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d532c)
# | #14 0x0000556079bf92c0 llvm::Error (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int)::'lambda'(auto&)::operator()<llvm::SmallVector<llvm::object::OffloadFile, 3u>>(auto&) const ClangLinkerWrapper.cpp:0:0
# | #15 0x0000556079bfdfe6 (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int) ClangLinkerWrapper.cpp:0:0
# | #16 0x0000556079b1f38f main (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x5cf38f)
# | #17 0x00007f37a71c0083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3
# | #18 0x0000556079bddb5e _start (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x68db5e)
# |  #0 0x000055607b0768cf llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x1b268cf)
# |  #1 0x000055607b073e14 SignalHandler(int) Signals.cpp:0:0
# |  #2 0x00007f37a7712420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
# |  #3 0x00007f37a71df00b raise /build/glibc-LcI20x/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
# |  #4 0x00007f37a71be859 abort /build/glibc-LcI20x/glibc-2.31/stdlib/abort.c:81:7
# |  #5 0x00007f37a71be729 get_sysdep_segment_value /build/glibc-LcI20x/glibc-2.31/intl/loadmsgcat.c:509:8
# |  #6 0x00007f37a71be729 _nl_load_domain /build/glibc-LcI20x/glibc-2.31/intl/loadmsgcat.c:970:34
# |  #7 0x00007f37a71cffd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
# |  #8 0x000055607a061106 llvm::AMDGPUResourceUsageAnalysis::runOnModule(llvm::Module&) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0xb11106)
# |  #9 0x000055607a7ac855 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x125c855)
# | #10 0x000055607b72ed4d codegen(llvm::lto::Config const&, llvm::TargetMachine*, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex const&) LTOBackend.cpp:0:0
# | #11 0x000055607b7306fd llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21e06fd)
# | #12 0x000055607b724dea llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d4dea)
# | #13 0x000055607b72532c llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream>>> (unsigned int, llvm::Twine const&)>> (unsigned int, llvm::StringRef, llvm::Twine const&)>) (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x21d532c)
# | #14 0x0000556079bf92c0 llvm::Error (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int)::'lambda'(auto&)::operator()<llvm::SmallVector<llvm::object::OffloadFile, 3u>>(auto&) const ClangLinkerWrapper.cpp:0:0
# | #15 0x0000556079bfdfe6 (anonymous namespace)::linkAndWrapDeviceFiles(llvm::SmallVectorImpl<llvm::SmallVector<llvm::object::OffloadFile, 3u>>&, llvm::opt::InputArgList const&, char**, int) ClangLinkerWrapper.cpp:0:0
# | #16 0x0000556079b1f38f main (/home/ompworker/bbot/openmp-offload-libc-amdgpu-runtime/llvm.build/bin/clang-linker-wrapper+0x5cf38f)
...

@dyung
Copy link
Collaborator Author

dyung commented Jul 15, 2024

@arsenm the previous two bot failure notifications are from your reapplication of your changes, not from my original revert. Not sure why it is mentioning it here.

@dyung dyung deleted the dyung/main/83131-revert branch July 15, 2024 08:32
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Nov 22, 2024
…lvm#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (llvm#98851)"

This reverts commit b1bcb7c.

Change-Id: Ia262230003989ed152f82ea475364b42d2592090
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU clang Clang issues not falling into any other category llvm:globalisel llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants