@TopRichard
Collaborator

@TopRichard TopRichard commented Apr 15, 2025

  •  x86_64_generic
  •  cascadelake
  •  haswell
  •  icelake
  •  sapphirerapids
  •  skylake
  •  zen2
  •  zen3
  •  zen4
  •  aarch64_generic
  •  neoverse_n1
  •  neoverse_v1
  •  nvidia/grace Built on gpu-node

First attempt, using the rebuild procedure:

  • For builds with accelerator:nvidia/cc70, building CUDA-Samples-12.1-GCC-12.3.0 fails on both GPU and non-GPU nodes with the following error:
make[1]: Entering directory '/tmp/.../easybuild/build/CUDASamples/12.1/GCC-12.3.0-CUDA-12.1.1/cuda-samples-12.1/Samples/3_CUDA_Features/immaTensorCoreGemm'
/../software/CUDA/12.1.1/bin/nvcc -ccbin g++ -I../../../Common -m64 -maxrregcount=255 --threads 0 --std=c++11 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o immaTensorCoreGemm.o -c immaTensorCoreGemm.cu
immaTensorCoreGemm.cu(260): error: incomplete type is not allowed
      wmma::fragment<wmma::accumulator, 16, 16, 16, int> c[2]
                                                         ^

immaTensorCoreGemm.cu(271): error: no instance of overloaded function "nvcuda::wmma::load_matrix_sync" matches the argument list
            argument types are: (<error-type>, const int *, int, nvcuda::wmma::layout_t)
          wmma::load_matrix_sync(c[i][j], tile_ptr, (16 * (4 * 2)), wmma::mem_row_major);
          ^

immaTensorCoreGemm.cu(336): error: incomplete type is not allowed
              a[2];

16 errors detected in the compilation of "immaTensorCoreGemm.cu".

Skipping CUDA-Samples-12.1-GCC-12.3.0 results in a successful build.
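
A possible explanation for the errors above, offered as an assumption rather than something the log states: the integer `wmma::fragment` types used by immaTensorCoreGemm are only available from compute capability 7.2 onwards, so a build restricted to `sm_70` sees them as incomplete types. Under that assumption, a pre-build filter could be sketched roughly as follows. `SAMPLE_MIN_CC`, `parse_accel` and `buildable_samples` are hypothetical names and are not part of EasyBuild or the CUDA samples.

```python
# Hypothetical helper: decide which CUDA samples can be compiled for an
# accelerator target such as "nvidia/cc70".
# Assumption (not confirmed by the log): integer WMMA, as used by
# immaTensorCoreGemm, needs compute capability 7.2 or newer.
SAMPLE_MIN_CC = {
    "immaTensorCoreGemm": (7, 2),
}


def parse_accel(target):
    """Turn 'nvidia/cc70' into the tuple (7, 0)."""
    vendor, _, cc = target.partition("/")
    digits = cc[2:]
    if vendor != "nvidia" or not cc.startswith("cc") or not digits.isdigit() or len(digits) < 2:
        raise ValueError(f"unexpected accelerator target: {target!r}")
    # The last digit is the minor version, everything before it the major one.
    return int(digits[:-1]), int(digits[-1])


def buildable_samples(samples, target):
    """Keep only the samples whose assumed minimum capability fits the target."""
    cc = parse_accel(target)
    return [s for s in samples if SAMPLE_MIN_CC.get(s, (0, 0)) <= cc]
```

With this table, `buildable_samples(["immaTensorCoreGemm", "vectorAdd"], "nvidia/cc70")` would drop immaTensorCoreGemm and keep the rest, which mirrors the manual skip described above.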

Second attempt: setting the cc70 yml file in accel/nvidia, so no rebuild is required.

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/sapphirerapids, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-software, eessi.io-2023.06-compat

@eessi-bot-deucalion

Instance eessi-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi.io-2023.06-software

@eessi-bot-surf

Instance eessi-bot-surf is configured to build for:

  • architectures: x86_64/amd/zen4, x86_64/amd/zen2
  • repositories: eessi-hpc.org-2023.06-software, eessi.io-2023.06-software, eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat

@eessi-bot-trz42

Instance trz42-GH200-jr is configured to build for:

  • architectures: aarch64/nvidia/grace
  • repositories: eessi.io-2023.06-software

@eessi-bot-toprichard

Instance rt-Grace-jr is configured to build for:

  • architectures: aarch64/nvidia/grace
  • repositories: eessi.io-2023.06-software

@TopRichard TopRichard marked this pull request as draft April 15, 2025 11:32
@TopRichard TopRichard added the 2023.06-software.eessi.io (2023.06 version of software.eessi.io) and accel:nvidia labels Apr 15, 2025
@TopRichard
Collaborator Author

bot: build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software accelerator:nvidia/cc70

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted
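
The "expanded format" line shows the bot rewriting short argument keys into long ones. A minimal sketch of that expansion, inferred only from the short and long forms that appear in this thread (`ALIASES` and `expand_command` are hypothetical names, not the bot's own code):

```python
# Short argument keys and their long forms, as they appear in this thread.
ALIASES = {
    "inst": "instance",
    "arch": "architecture",
    "repo": "repository",
    "accel": "accelerator",
}


def expand_command(line):
    """Expand 'bot: build inst:x arch:y' into 'build instance:x architecture:y'."""
    prefix = "bot:"
    if not line.startswith(prefix):
        raise ValueError("not a bot command")
    words = line[len(prefix):].split()
    if not words:
        raise ValueError("missing command name")
    command, args = words[0], words[1:]
    expanded = []
    for arg in args:
        # Split on the first colon only, so values may contain further colons.
        key, sep, value = arg.partition(":")
        expanded.append(f"{ALIASES.get(key, key)}{sep}{value}")
    return " ".join([command] + expanded)
```

Keys that are already in long form, such as `accelerator:`, pass through unchanged.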

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-deucalion

eessi-bot-deucalion bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-deucalion (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-surf

eessi-bot-surf bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-surf (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted
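
The replies above suggest that every bot instance receives the command, but only an instance matching the `instance:` and `architecture:` filters submits a job. A rough sketch of that matching, inferred from this thread and not taken from the bot's source (`jobs_to_submit` and its data layout are hypothetical):

```python
def jobs_to_submit(instance_name, configured_archs, filters):
    """Return the architectures this instance would build for.

    filters is a dict such as
    {"instance": "rt-Grace-jr", "architecture": "aarch64/nvidia/grace"};
    a missing key is treated as "matches anything" (an assumption).
    """
    wanted_instance = filters.get("instance")
    if wanted_instance is not None and wanted_instance != instance_name:
        return []  # corresponds to "no jobs were submitted"
    wanted_arch = filters.get("architecture")
    if wanted_arch is None:
        return list(configured_archs)
    return [arch for arch in configured_archs if arch == wanted_arch]
```

For example, an instance named `eessi-bot-surf` configured for `x86_64/amd/zen4` and `x86_64/amd/zen2` would return an empty list for a command addressed to `rt-Grace-jr`, and both architectures for a command that names `eessi-bot-surf` without an architecture filter.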

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Apr 15, 2025

Updates by the bot instance rt-Grace-jr (click for details)

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Apr 15, 2025

New job on instance rt-Grace-jr for CPU micro-architecture aarch64-nvidia-grace and accelerator nvidia/cc70 for repository eessi.io-2023.06-software in job dir /p/project1/ceasybuilders/bot-rt/jobs/2025.04/pr_1030/13613238

date job status comment
Apr 15 12:29:47 UTC 2025 submitted job id 13613238 awaits release by job manager
Apr 15 12:30:04 UTC 2025 released job awaits launch by Slurm scheduler
Apr 15 12:31:07 UTC 2025 running job 13613238 is running
Apr 15 13:04:05 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13613238.out
❌ found message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-nvidia-grace-1744721408.tar.gz
size: 2010 MiB (2107824492 bytes)
entries: 3777
modules under 2023.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc70/modules/all
CUDA/12.1.1.lua
software under 2023.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc70/software
CUDA/12.1.1
other under 2023.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc70
no other files in tarball
Apr 15 13:04:05 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-13613238.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
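
The build verdict in these reports appears to come from searching the Slurm output for a fixed set of messages. A minimal sketch of such a check, using only the strings listed in the report (the function `check_build_log` is hypothetical and is not the bot's actual implementation):

```python
# Messages copied from the check list in the bot report.
ERROR_PATTERNS = ["FATAL:", "ERROR:", "FAILED:", "required modules missing:"]
REQUIRED_PATTERNS = ["No missing installations", ".tar.gz created!"]


def check_build_log(text):
    """Return (ok, details); details maps each message to whether it was found."""
    details = {pat: (pat in text) for pat in ERROR_PATTERNS + REQUIRED_PATTERNS}
    # Success means: no error message present and every required message present.
    ok = (not any(details[p] for p in ERROR_PATTERNS)
          and all(details[p] for p in REQUIRED_PATTERNS))
    return ok, details
```

A log can therefore produce a tarball (".tar.gz created!" found) and still be reported as a failure, as in the job above.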

@TopRichard
Collaborator Author

bot: build inst:eessi-bot-surf repo:eessi.io-2023.06-software accelerator:nvidia/cc70

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build inst:eessi-bot-surf repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build inst:eessi-bot-surf repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-deucalion

eessi-bot-deucalion bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-deucalion (click for details)
  • received bot command build inst:eessi-bot-surf repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-surf

eessi-bot-surf bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-surf (click for details)

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Apr 15, 2025

Updates by the bot instance rt-Grace-jr (click for details)
  • received bot command build inst:eessi-bot-surf repo:eessi.io-2023.06-software accelerator:nvidia/cc70 from TopRichard

    • expanded format: build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70
  • handling command build instance:eessi-bot-surf repository:eessi.io-2023.06-software accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-surf

eessi-bot-surf bot commented Apr 15, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen4 and accelerator nvidia/cc70 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.04/pr_1030/11191697

date job status comment
Apr 15 13:01:37 UTC 2025 submitted job id 11191697 will be eligible to start in about 20 seconds
Apr 15 13:01:43 UTC 2025 received job awaits launch by Slurm scheduler
Apr 15 13:02:00 UTC 2025 running job 11191697 is running
Apr 15 13:12:33 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-11191697.out
❌ found message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1744722429.tar.gz
size: 2067 MiB (2167673447 bytes)
entries: 5518
modules under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc70/modules/all
CUDA/12.1.1.lua
software under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc70/software
CUDA/12.1.1
other under 2023.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc70
no other files in tarball
Apr 15 13:12:33 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-11191697.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
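
The artefact names in these reports follow a visible pattern: EESSI version, `software`, operating system, CPU target, and a trailing number. A small sketch of splitting such a name into its parts; the regular expression is inferred from the names in this thread, and reading the trailing number as Unix epoch seconds is an assumption (`parse_artefact_name` and `as_utc` are hypothetical helpers):

```python
import re
from datetime import datetime, timezone

# Pattern inferred from names such as
# eessi-2023.06-software-linux-x86_64-amd-zen4-1744722429.tar.gz
ARTEFACT_RE = re.compile(
    r"^eessi-(?P<version>\d{4}\.\d{2})-software-(?P<os>[a-z]+)-"
    r"(?P<arch>.+)-(?P<stamp>\d+)\.tar\.gz$"
)


def parse_artefact_name(name):
    """Split an artefact file name into version, os, arch and trailing number."""
    match = ARTEFACT_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected artefact name: {name!r}")
    parts = match.groupdict()
    parts["stamp"] = int(parts["stamp"])
    return parts


def as_utc(stamp):
    """Interpret the trailing number as Unix epoch seconds (an assumption)."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Under that reading, the number in the zen4 tarball above falls inside the window in which the job was running, which is consistent with the assumption but does not prove it.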

@eessi-bot-surf

eessi-bot-surf bot commented Apr 15, 2025

New job on instance eessi-bot-surf for CPU micro-architecture x86_64-amd-zen2 and accelerator nvidia/cc70 for repository eessi.io-2023.06-software in job dir /projects/eessibot/eessi-bot-surf/jobs/2025.04/pr_1030/11191721

date job status comment
Apr 15 13:01:40 UTC 2025 submitted job id 11191721 will be eligible to start in about 20 seconds
Apr 15 13:01:46 UTC 2025 received job awaits launch by Slurm scheduler
Apr 15 13:02:15 UTC 2025 running job 11191721 is running
Apr 15 13:15:54 UTC 2025 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-11191721.out
❌ found message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1744722600.tar.gz
size: 2067 MiB (2167671608 bytes)
entries: 5518
modules under 2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc70/modules/all
CUDA/12.1.1.lua
software under 2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc70/software
CUDA/12.1.1
other under 2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc70
no other files in tarball
Apr 15 13:15:54 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-11191721.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
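
The paths in the artefact listings combine the EESSI version, the CPU target and the accelerator target in a fixed order. A short sketch of composing them, based only on the paths shown in these reports (`accel_prefix` and `job_arch_label` are hypothetical helper names):

```python
def accel_prefix(version, cpu_arch, accel, os_name="linux"):
    """Build e.g. '2023.06/software/linux/x86_64/amd/zen2/accel/nvidia/cc70'."""
    return "/".join([version, "software", os_name, cpu_arch, "accel", accel])


def job_arch_label(cpu_arch):
    """'x86_64/amd/zen2' -> 'x86_64-amd-zen2', the form used in job titles."""
    return cpu_arch.replace("/", "-")


# Example: where the modules for the zen2 / cc70 build end up.
modules_dir = accel_prefix("2023.06", "x86_64/amd/zen2", "nvidia/cc70") + "/modules/all"
```

The slash-separated form is what the `arch:` argument uses, while the dash-separated form shows up in job titles and tarball names.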

@laraPPr
Collaborator

laraPPr commented Apr 15, 2025

bot: help instance:eessi-bot-vsc-ugent

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command help instance:eessi-bot-vsc-ugent from laraPPr

    • expanded format: help instance:eessi-bot-vsc-ugent
  • handling command help instance:eessi-bot-vsc-ugent resulted in:
    How to send commands to bot instances

    • Commands must be sent with a new comment (edits of existing comments are ignored).
    • A comment may contain multiple commands, one per line.
    • Every command begins at the start of a line and has the syntax bot: COMMAND [ARGUMENTS]*
    • Currently supported COMMANDs are: help, build, show_config, status

    For more information, see https://www.eessi.io/docs/bot
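
The rules in the help text (one command per line, each starting at the beginning of a line with `bot:`) can be illustrated with a short sketch. This is an illustration of the stated syntax only, not the bot's parser, and `extract_commands` is a hypothetical name:

```python
# Command names taken from the help text.
SUPPORTED = {"help", "build", "show_config", "status"}


def extract_commands(comment):
    """Return (command, arguments) pairs for every command line in a comment."""
    commands = []
    for line in comment.splitlines():
        if not line.startswith("bot:"):
            continue  # commands must begin at the start of a line
        words = line[len("bot:"):].split()
        if words and words[0] in SUPPORTED:
            commands.append((words[0], words[1:]))
    return commands
```

A comment that mixes prose with several `bot: build ...` lines would yield one entry per command line, and an indented or quoted `bot:` line would be ignored.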

@eessi-bot

eessi-bot bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command help instance:eessi-bot-vsc-ugent from laraPPr

    • expanded format: help instance:eessi-bot-vsc-ugent
  • handling command help instance:eessi-bot-vsc-ugent resulted in:
    How to send commands to bot instances

    • Commands must be sent with a new comment (edits of existing comments are ignored).
    • A comment may contain multiple commands, one per line.
    • Every command begins at the start of a line and has the syntax bot: COMMAND [ARGUMENTS]*
    • Currently supported COMMANDs are: help, build, show_config, status

    For more information, see https://www.eessi.io/docs/bot

@eessi-bot-surf

eessi-bot-surf bot commented Apr 15, 2025

Updates by the bot instance eessi-bot-surf (click for details)
  • received bot command help instance:eessi-bot-vsc-ugent from laraPPr

    • expanded format: help instance:eessi-bot-vsc-ugent
  • handling command help instance:eessi-bot-vsc-ugent resulted in:
    How to send commands to bot instances

    • Commands must be sent with a new comment (edits of existing comments are ignored).
    • A comment may contain multiple commands, one per line.
    • Every command begins at the start of a line and has the syntax bot: COMMAND [ARGUMENTS]*
    • Currently supported COMMANDs are: help, build, show_config, status

    For more information, see https://www.eessi.io/docs/bot

@eessi-bot-toprichard

Updates by the bot instance rt-Grace-jr (click for details)
  • account laraPPr has NO permission to send commands to the bot

@eessi-bot

eessi-bot bot commented May 8, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc70 from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc70
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc70 from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc70
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-surf

eessi-bot-surf bot commented May 8, 2025

Updates by the bot instance eessi-bot-surf (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc70 from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc70
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc70 from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc70
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot-toprichard

eessi-bot-toprichard bot commented May 8, 2025

Updates by the bot instance rt-Grace-jr (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc70 from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc70
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc70 from TopRichard

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc70
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc70 resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot bot commented May 8, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-cascadelake and accelerator nvidia/cc70 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.05/pr_1030/61709

date job status comment
May 08 18:03:33 UTC 2025 submitted job id 61709 awaits release by job manager
May 08 18:03:58 UTC 2025 released job awaits launch by Slurm scheduler
May 08 18:09:32 UTC 2025 running job 61709 is running
May 08 19:18:53 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-61709.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-cascadelake-17467297290.tar.gz
size: 4495 MiB (4713417378 bytes)
entries: 12167
modules under 2023.06/software/linux/x86_64/intel/cascadelake/accel/nvidia/cc70/modules/all
CUDA/12.1.1.lua
CUDA/12.4.0.lua
GDRCopy/2.4-GCCcore-13.2.0.lua
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1.lua
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0.lua
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1.lua
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0.lua
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1.lua
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0.lua
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1.lua
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0.lua
software under 2023.06/software/linux/x86_64/intel/cascadelake/accel/nvidia/cc70/software
CUDA/12.1.1
CUDA/12.4.0
GDRCopy/2.4-GCCcore-13.2.0
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0
other under 2023.06/software/linux/x86_64/intel/cascadelake/accel/nvidia/cc70
no other files in tarball
May 08 19:18:53 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-61709.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@eessi-bot

eessi-bot bot commented May 8, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-icelake and accelerator nvidia/cc70 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.05/pr_1030/61710

date job status comment
May 08 18:03:38 UTC 2025 submitted job id 61710 awaits release by job manager
May 08 18:04:03 UTC 2025 released job awaits launch by Slurm scheduler
May 08 18:09:38 UTC 2025 running job 61710 is running
May 08 19:05:17 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-61710.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-icelake-17467293450.tar.gz
size: 4495 MiB (4713433171 bytes)
entries: 12167
modules under 2023.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc70/modules/all
CUDA/12.1.1.lua
CUDA/12.4.0.lua
GDRCopy/2.4-GCCcore-13.2.0.lua
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1.lua
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0.lua
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1.lua
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0.lua
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1.lua
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0.lua
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1.lua
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0.lua
software under 2023.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc70/software
CUDA/12.1.1
CUDA/12.4.0
GDRCopy/2.4-GCCcore-13.2.0
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0
other under 2023.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc70
no other files in tarball
May 08 19:05:17 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-61710.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
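
The status rows in these reports carry enough information to work out how long a job actually ran. A minimal sketch, assuming each row starts with a timestamp of the form `May 08 18:09:38 UTC 2025` followed by the status word (`parse_status_row` and `runtime` are hypothetical helpers, and `%b` assumes English month names):

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%b %d %H:%M:%S UTC %Y"
STAMP_WORDS = 5  # e.g. 'May 08 18:09:38 UTC 2025'


def parse_status_row(row):
    """Return (timestamp, status) for one row of the job status table."""
    words = row.split()
    stamp = datetime.strptime(" ".join(words[:STAMP_WORDS]), STAMP_FORMAT)
    status = words[STAMP_WORDS] if len(words) > STAMP_WORDS else ""
    return stamp, status


def runtime(rows):
    """Time between the 'running' and 'finished' rows."""
    times = {status: stamp for stamp, status in map(parse_status_row, rows)}
    return times["finished"] - times["running"]


rows = [
    "May 08 18:03:38 UTC 2025 submitted job id 61710 awaits release by job manager",
    "May 08 18:04:03 UTC 2025 released job awaits launch by Slurm scheduler",
    "May 08 18:09:38 UTC 2025 running job 61710 is running",
    "May 08 19:05:17 UTC 2025 finished",
]
elapsed = runtime(rows)
```

Comparing `elapsed` across targets gives a quick view of which builds take longest without opening the Slurm logs.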

@casparvl
Collaborator

Hm, this version of GDRCopy wasn't in there yet; it is now, so rebuilding again.

bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc80
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc80

@eessi-bot

eessi-bot bot commented May 12, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)

@eessi-bot-surf

eessi-bot-surf bot commented May 12, 2025

Updates by the bot instance eessi-bot-surf (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc80 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc80
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc80 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc80
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc80 resulted in:

    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc80 resulted in:

    • no jobs were submitted

@eessi-bot-toprichard

Updates by the bot instance rt-Grace-jr (click for details)
  • account casparvl has NO permission to send commands to the bot

@eessi-bot-deucalion

eessi-bot-deucalion bot commented May 12, 2025

Updates by the bot instance eessi-bot-deucalion (click for details)
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc80 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc80
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc80 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc80
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc80 resulted in:

    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc80 resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot bot commented May 12, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-cascadelake and accelerator nvidia/cc80 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.05/pr_1030/62582

date job status comment
May 12 08:33:50 UTC 2025 submitted job id 62582 awaits release by job manager
May 12 08:34:44 UTC 2025 released job awaits launch by Slurm scheduler
May 12 08:40:49 UTC 2025 running job 62582 is running
May 12 09:50:11 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-62582.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-cascadelake-17470411870.tar.gz
size: 4497 MiB (4716216371 bytes)
entries: 12144
modules under 2023.06/software/linux/x86_64/intel/cascadelake/accel/nvidia/cc80/modules/all
CUDA/12.1.1.lua
CUDA/12.4.0.lua
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1.lua
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0.lua
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1.lua
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0.lua
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1.lua
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0.lua
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1.lua
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0.lua
software under 2023.06/software/linux/x86_64/intel/cascadelake/accel/nvidia/cc80/software
CUDA/12.1.1
CUDA/12.4.0
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0
other under 2023.06/software/linux/x86_64/intel/cascadelake/accel/nvidia/cc80
no other files in tarball
May 12 09:50:11 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-62582.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
May 12 10:34:47 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-cascadelake-17470411870.tar.gz to S3 bucket succeeded
May 12 14:12:29 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-cascadelake-17470411870.tar.gz to S3 bucket succeeded

@eessi-bot

eessi-bot bot commented May 12, 2025

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-intel-icelake and accelerator nvidia/cc80 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2025.05/pr_1030/62583

date job status comment
May 12 08:33:55 UTC 2025 submitted job id 62583 awaits release by job manager
May 12 08:34:48 UTC 2025 released job awaits launch by Slurm scheduler
May 12 08:40:59 UTC 2025 running job 62583 is running
May 12 09:35:55 UTC 2025 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-62583.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-icelake-17470407390.tar.gz
size: 4497 MiB (4716169066 bytes)
entries: 12144
modules under 2023.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80/modules/all
CUDA/12.1.1.lua
CUDA/12.4.0.lua
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1.lua
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0.lua
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1.lua
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0.lua
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1.lua
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0.lua
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1.lua
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0.lua
software under 2023.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80/software
CUDA/12.1.1
CUDA/12.4.0
NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1
NCCL/2.20.5-GCCcore-13.2.0-CUDA-12.4.0
OSU-Micro-Benchmarks/7.2-gompi-2023a-CUDA-12.1.1
OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0
UCC-CUDA/1.2.0-GCCcore-12.3.0-CUDA-12.1.1
UCC-CUDA/1.2.0-GCCcore-13.2.0-CUDA-12.4.0
UCX-CUDA/1.14.1-GCCcore-12.3.0-CUDA-12.1.1
UCX-CUDA/1.15.0-GCCcore-13.2.0-CUDA-12.4.0
other under 2023.06/software/linux/x86_64/intel/icelake/accel/nvidia/cc80
no other files in tarball
May 12 09:35:55 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-62583.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
May 12 10:36:09 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-icelake-17470407390.tar.gz to S3 bucket succeeded
May 12 14:13:59 UTC 2025 uploaded transfer of eessi-2023.06-software-linux-x86_64-intel-icelake-17470407390.tar.gz to S3 bucket succeeded

@casparvl casparvl added the bot:deploy label (Ask bot to deploy missing software installations to EESSI) and removed the ready-to-review label May 12, 2025
@eessi-bot-toprichard

Label bot:deploy has been set by user casparvl, but this person does not have permission to trigger deployments

@TopRichard added and removed the bot:deploy label (Ask bot to deploy missing software installations to EESSI) on May 12, 2025
@eessi-bot

eessi-bot bot commented May 12, 2025

Label bot:deploy has been set by user TopRichard, which has no permission to trigger the action

@eessi-bot-deucalion

Label bot:deploy has been set by user TopRichard, but this person does not have permission to trigger deployments

@eessi-bot

eessi-bot bot commented May 12, 2025

Updates by the bot instance eessi-bot-mc-azure
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/cascadelake accel:nvidia/cc80 from casparvl
    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc80
  • received bot command build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/intel/icelake accel:nvidia/cc80 from casparvl
    • expanded format: build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc80
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/cascadelake accelerator:nvidia/cc80 resulted in:
    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software instance:eessi-bot-mc-aws architecture:x86_64/intel/icelake accelerator:nvidia/cc80 resulted in:
    • no jobs were submitted
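
The log above shows each short-form command followed by its "expanded format", in which `repo`, `arch` and `accel` become `repository`, `architecture` and `accelerator` while `instance` is left unchanged. A minimal sketch of that expansion, assuming only the aliases visible in this log (the `ALIASES` table and `expand_command` function are hypothetical names, not the bot's actual implementation):

```python
# Short-to-long argument names as observed in the log; the real bot may
# accept further aliases (assumption).
ALIASES = {
    "repo": "repository",
    "arch": "architecture",
    "accel": "accelerator",
}


def expand_command(command: str) -> str:
    """Expand abbreviated key:value arguments of a bot command."""
    action, *args = command.split()
    expanded = [action]
    for arg in args:
        # Split on the first colon only, so values containing ':' survive.
        key, sep, value = arg.partition(":")
        expanded.append(f"{ALIASES.get(key, key)}{sep}{value}")
    return " ".join(expanded)


short = (
    "build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws "
    "arch:x86_64/intel/icelake accel:nvidia/cc80"
)
print(expand_command(short))
```

For the icelake command this reproduces the expanded line shown in the log. Note that both commands target `instance:eessi-bot-mc-aws`, while the reporting instance is eessi-bot-mc-azure, which is consistent with "no jobs were submitted".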

@eessi-bot

eessi-bot bot commented May 12, 2025

Label bot:deploy has been set by user TopRichard, which has no permission to trigger the action

@casparvl added and removed the bot:deploy label (Ask bot to deploy missing software installations to EESSI) on May 12, 2025
@eessi-bot-toprichard

Label bot:deploy has been set by user casparvl, but this person does not have permission to trigger deployments

@casparvl merged commit 39bffa2 into EESSI:2023.06-software.eessi.io on May 12, 2025
59 checks passed
@eessi-bot

eessi-bot bot commented May 12, 2025

PR merged! Moved ['/project/def-users/SHARED/jobs/2025.04/pr_1030/56936', '/project/def-users/SHARED/jobs/2025.04/pr_1030/56937', '/project/def-users/SHARED/jobs/2025.04/pr_1030/56939', '/project/def-users/SHARED/jobs/2025.04/pr_1030/56940', '/project/def-users/SHARED/jobs/2025.04/pr_1030/57149', '/project/def-users/SHARED/jobs/2025.04/pr_1030/57150', '/project/def-users/SHARED/jobs/2025.04/pr_1030/58375', '/project/def-users/SHARED/jobs/2025.04/pr_1030/58376', '/project/def-users/SHARED/jobs/2025.04/pr_1030/58377', '/project/def-users/SHARED/jobs/2025.04/pr_1030/58378', '/project/def-users/SHARED/jobs/2025.04/pr_1030/58379', '/project/def-users/SHARED/jobs/2025.04/pr_1030/58386', '/project/def-users/SHARED/jobs/2025.05/pr_1030/61252', '/project/def-users/SHARED/jobs/2025.05/pr_1030/61253', '/project/def-users/SHARED/jobs/2025.05/pr_1030/61709', '/project/def-users/SHARED/jobs/2025.05/pr_1030/61710', '/project/def-users/SHARED/jobs/2025.05/pr_1030/62582', '/project/def-users/SHARED/jobs/2025.05/pr_1030/62583'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.05.12

@eessi-bot

eessi-bot bot commented May 12, 2025

PR merged! Moved ['/project/def-users/SHARED/jobs/2025.04/pr_1030/2424'] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.05.12

@eessi-bot-surf

PR merged! Moved ['/projects/eessibot/eessi-bot-surf/jobs/2025.04/pr_1030/11191721', '/projects/eessibot/eessi-bot-surf/jobs/2025.04/pr_1030/11191697'] to /projects/eessibot/eessi-bot-surf/trash_bin/EESSI/software-layer/2025.05.12

@eessi-bot-toprichard

PR merged! Moved ['/p/project1/ceasybuilders/bot-rt/jobs/2025.04/pr_1030/13615161', '/p/project1/ceasybuilders/bot-rt/jobs/2025.04/pr_1030/13613825', '/p/project1/ceasybuilders/bot-rt/jobs/2025.04/pr_1030/13613238'] to /p/project1/ceasybuilders/bot-rt/trash_bin/EESSI/software-layer/2025.05.12

Labels

2023.06-software.eessi.io (2023.06 version of software.eessi.io)
accel:nvidia
bot:deploy (Ask bot to deploy missing software installations to EESSI)

4 participants