@TopRichard
Collaborator

@TopRichard TopRichard commented Mar 28, 2025

The Boost sources could not be fetched, which is why the build was failing in #985; from-commit has now been added for Boost.

Packages added:

ALL/0.9.2-foss-2023a.lua
Boost/1.82.0-GCC-12.3.0.lua
BWA/0.7.17-20220923-GCCcore-12.3.0.lua
CDO/2.2.2-gompi-2023a.lua
ecCodes/2.31.0-gompi-2023a.lua
FFmpeg/6.0-GCCcore-12.3.0.lua
ffnvcodec/12.0.16.0.lua
googletest/1.13.0-GCCcore-12.3.0.lua
LAME/3.100-GCCcore-12.3.0.lua
libaec/1.0.6-GCCcore-12.3.0.lua
netCDF/4.9.2-gompi-2023a.lua
nlohmann_json/3.11.2-GCCcore-12.3.0.lua
PROJ/9.2.0-GCCcore-12.3.0.lua
SDL2/2.28.2-GCCcore-12.3.0.lua
UDUNITS/2.2.28-GCCcore-12.3.0.lua
VTK/9.3.0-foss-2023a.lua
x264/20230226-GCCcore-12.3.0.lua
x265/3.5-GCCcore-12.3.0.lua

@TopRichard TopRichard added the labels 2023.06-software.eessi.io (2023.06 version of software.eessi.io) and grace (NVIDIA Grace CPU) Mar 28, 2025
@eessi-bot

eessi-bot bot commented Mar 28, 2025

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/sapphirerapids, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-software, eessi.io-2023.06-compat

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi.io-2023.06-software

@eessi-bot-toprichard

Instance rt-Grace-jr is configured to build for:

  • architectures: aarch64/nvidia/grace
  • repositories: eessi.io-2023.06-software

@eessi-bot-trz42

Instance trz42-GH200-jr is configured to build for:

  • architectures: aarch64/nvidia/grace
  • repositories: eessi.io-2023.06-software

@TopRichard
Collaborator Author

bot: build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

    • no jobs were submitted
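
To illustrate how the short command form relates to the expanded form shown above, here is a minimal Python sketch. It is not the EESSI bot's actual code: the function names and the abbreviation table are assumptions made for illustration, and the instance check only reflects one plausible reading of why instances other than rt-Grace-jr report that no jobs were submitted.

```python
# Hypothetical sketch; names and logic are illustrative, not the bot's real code.
ABBREVIATIONS = {"inst": "instance", "arch": "architecture", "repo": "repository"}


def expand_command(command: str) -> str:
    """Expand short filter keys (inst/arch/repo) to their long forms."""
    action, *filters = command.split()
    expanded = []
    for item in filters:
        key, _, value = item.partition(":")
        expanded.append(f"{ABBREVIATIONS.get(key, key)}:{value}")
    return " ".join([action, *expanded])


def targets_instance(command: str, instance_name: str) -> bool:
    """Return True if the command has no instance filter or names this instance."""
    for item in expand_command(command).split()[1:]:
        key, _, value = item.partition(":")
        if key == "instance" and value != instance_name:
            return False
    return True


cmd = "build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software"
print(expand_command(cmd))
print(targets_instance(cmd, "eessi-bot-mc-aws"))  # filter names another instance
print(targets_instance(cmd, "rt-Grace-jr"))       # filter names this instance
```

Under these assumptions, only the instance named in the filter would go on to submit a job, which matches the pattern of replies in this thread.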

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

    • no jobs were submitted

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Mar 28, 2025

Updates by the bot instance rt-Grace-jr (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

@eessi-bot-trz42

Updates by the bot instance trz42-GH200-jr (click for details)
  • account TopRichard has NO permission to send commands to the bot

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Mar 28, 2025

New job on instance rt-Grace-jr for CPU micro-architecture aarch64-nvidia-grace for repository eessi.io-2023.06-software in job dir /p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/13544877
Failure:

Checksum verification for /p/project1/ceasybuilders/bot-rt/shared_fs_path/easybuild/sources/b/Boost/boost_1_82_0.tar.gz using 66a469b6e608a51f8347236f4912e27dc5c60c60d7d53ae9bfe4683316c6f04c failed.

The Boost sources downloaded during the previous PR were removed; I will re-run the build.
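
A cached source file can be checked by hand before re-running a build. The sketch below assumes the checksum in the failure message is a SHA-256 digest (its 64 hex characters are consistent with that, but the thread does not say so explicitly). The helper names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: Path, expected: str) -> bool:
    """Compare a file's digest with the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected.strip().lower()


# Demo on a throwaway file; a real check would point at the cached tarball.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.tar.gz"  # hypothetical file name
    sample.write_bytes(b"abc")
    good = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
    print(checksum_matches(sample, good))      # True
    print(checksum_matches(sample, "0" * 64))  # False
```

A mismatch on a cached file usually points to a truncated or stale download, which is consistent with removing the cached sources and rebuilding.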

date | job status | comment
Mar 28 17:24:35 UTC 2025 | submitted | job id 13544877 awaits release by job manager
Mar 28 17:25:28 UTC 2025 | released | job awaits launch by Slurm scheduler
Mar 28 17:26:31 UTC 2025 | running | job 13544877 is running
Mar 28 17:33:43 UTC 2025 | finished | 😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-13544877.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Mar 28 17:33:43 UTC 2025 | test result | 😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 8/8 test case(s) from 8 check(s) (8 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-13544877.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

bot: build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

    • no jobs were submitted

@eessi-bot-trz42

Updates by the bot instance trz42-GH200-jr (click for details)
  • account TopRichard has NO permission to send commands to the bot

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Mar 28, 2025

Updates by the bot instance rt-Grace-jr (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Mar 28, 2025

New job on instance rt-Grace-jr for CPU micro-architecture aarch64-nvidia-grace for repository eessi.io-2023.06-software in job dir /p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/13544880

date | job status | comment
Mar 28 17:49:22 UTC 2025 | submitted | job id 13544880 awaits release by job manager
Mar 28 17:49:49 UTC 2025 | released | job awaits launch by Slurm scheduler
Mar 28 17:50:52 UTC 2025 | running | job 13544880 is running
Mar 28 18:45:32 UTC 2025 | finished | 😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-13544880.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-nvidia-grace-1743187081.tar.gz
size: 240 MiB (251969692 bytes)
entries: 60192
modules under 2023.06/software/linux/aarch64/nvidia/grace/modules/all
ALL/0.9.2-foss-2023a.lua
Boost/1.82.0-GCC-12.3.0.lua
Boost.Python/1.82.0-GCC-12.3.0.lua
BWA/0.7.17-20220923-GCCcore-12.3.0.lua
CDO/2.2.2-gompi-2023a.lua
ecCodes/2.31.0-gompi-2023a.lua
FFmpeg/6.0-GCCcore-12.3.0.lua
ffnvcodec/12.0.16.0.lua
googletest/1.13.0-GCCcore-12.3.0.lua
LAME/3.100-GCCcore-12.3.0.lua
libaec/1.0.6-GCCcore-12.3.0.lua
netCDF/4.9.2-gompi-2023a.lua
nlohmann_json/3.11.2-GCCcore-12.3.0.lua
PROJ/9.2.0-GCCcore-12.3.0.lua
SDL2/2.28.2-GCCcore-12.3.0.lua
UDUNITS/2.2.28-GCCcore-12.3.0.lua
VTK/9.3.0-foss-2023a.lua
x264/20230226-GCCcore-12.3.0.lua
x265/3.5-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/nvidia/grace/software
ALL/0.9.2-foss-2023a
Boost/1.82.0-GCC-12.3.0
Boost.Python/1.82.0-GCC-12.3.0
BWA/0.7.17-20220923-GCCcore-12.3.0
CDO/2.2.2-gompi-2023a
ecCodes/2.31.0-gompi-2023a
FFmpeg/6.0-GCCcore-12.3.0
ffnvcodec/12.0.16.0
googletest/1.13.0-GCCcore-12.3.0
LAME/3.100-GCCcore-12.3.0
libaec/1.0.6-GCCcore-12.3.0
netCDF/4.9.2-gompi-2023a
nlohmann_json/3.11.2-GCCcore-12.3.0
PROJ/9.2.0-GCCcore-12.3.0
SDL2/2.28.2-GCCcore-12.3.0
UDUNITS/2.2.28-GCCcore-12.3.0
VTK/9.3.0-foss-2023a
x264/20230226-GCCcore-12.3.0
x265/3.5-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/nvidia/grace
no other files in tarball
Mar 28 18:45:32 UTC 2025 | test result | 😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 8/8 test case(s) from 8 check(s) (8 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-13544880.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

Following up ReFrame test failures...

Collaborator

@trz42 trz42 left a comment

Boost and Boost.Python should be dependencies of a package other than AOFlagger.

@TopRichard
Collaborator Author

Boost.Python should be removed from the PR, as it is not a dependency of Boost.

@TopRichard
Collaborator Author

bot: build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

    • no jobs were submitted

@eessi-bot

eessi-bot bot commented Mar 28, 2025

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

    • no jobs were submitted

@eessi-bot-trz42

Updates by the bot instance trz42-GH200-jr (click for details)
  • account TopRichard has NO permission to send commands to the bot

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Mar 28, 2025

Updates by the bot instance rt-Grace-jr (click for details)
  • received bot command build inst:rt-Grace-jr arch:aarch64/nvidia/grace repo:eessi.io-2023.06-software from TopRichard

    • expanded format: build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software
  • handling command build instance:rt-Grace-jr architecture:aarch64/nvidia/grace repository:eessi.io-2023.06-software resulted in:

@eessi-bot-toprichard

eessi-bot-toprichard bot commented Mar 28, 2025

New job on instance rt-Grace-jr for CPU micro-architecture aarch64-nvidia-grace for repository eessi.io-2023.06-software in job dir /p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/13544939

date | job status | comment
Mar 28 19:15:34 UTC 2025 | submitted | job id 13544939 awaits release by job manager
Mar 28 19:15:40 UTC 2025 | released | job awaits launch by Slurm scheduler
Mar 28 19:16:44 UTC 2025 | running | job 13544939 is running
Mar 28 20:05:07 UTC 2025 | finished | 😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-13544939.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-nvidia-grace-1743191829.tar.gz
size: 223 MiB (234321709 bytes)
entries: 43189
modules under 2023.06/software/linux/aarch64/nvidia/grace/modules/all
ALL/0.9.2-foss-2023a.lua
Boost/1.82.0-GCC-12.3.0.lua
BWA/0.7.17-20220923-GCCcore-12.3.0.lua
CDO/2.2.2-gompi-2023a.lua
ecCodes/2.31.0-gompi-2023a.lua
FFmpeg/6.0-GCCcore-12.3.0.lua
ffnvcodec/12.0.16.0.lua
googletest/1.13.0-GCCcore-12.3.0.lua
LAME/3.100-GCCcore-12.3.0.lua
libaec/1.0.6-GCCcore-12.3.0.lua
netCDF/4.9.2-gompi-2023a.lua
nlohmann_json/3.11.2-GCCcore-12.3.0.lua
PROJ/9.2.0-GCCcore-12.3.0.lua
SDL2/2.28.2-GCCcore-12.3.0.lua
UDUNITS/2.2.28-GCCcore-12.3.0.lua
VTK/9.3.0-foss-2023a.lua
x264/20230226-GCCcore-12.3.0.lua
x265/3.5-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/nvidia/grace/software
ALL/0.9.2-foss-2023a
Boost/1.82.0-GCC-12.3.0
BWA/0.7.17-20220923-GCCcore-12.3.0
CDO/2.2.2-gompi-2023a
ecCodes/2.31.0-gompi-2023a
FFmpeg/6.0-GCCcore-12.3.0
ffnvcodec/12.0.16.0
googletest/1.13.0-GCCcore-12.3.0
LAME/3.100-GCCcore-12.3.0
libaec/1.0.6-GCCcore-12.3.0
netCDF/4.9.2-gompi-2023a
nlohmann_json/3.11.2-GCCcore-12.3.0
PROJ/9.2.0-GCCcore-12.3.0
SDL2/2.28.2-GCCcore-12.3.0
UDUNITS/2.2.28-GCCcore-12.3.0
VTK/9.3.0-foss-2023a
x264/20230226-GCCcore-12.3.0
x265/3.5-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/nvidia/grace
no other files in tarball
Mar 28 20:05:07 UTC 2025 | test result | 😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 8/8 test case(s) from 8 check(s) (8 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-13544939.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case
Mar 28 21:06:08 UTC 2025 | uploaded | transfer of eessi-2023.06-software-linux-aarch64-nvidia-grace-1743191829.tar.gz to S3 bucket succeeded
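
The artefact summary above (entry count and the module files under modules/all) can be approximated with Python's standard tarfile module. This is a sketch under assumptions: the function names are hypothetical, the demo tarball is built in memory with made-up contents, and the bot may compute its summary differently.

```python
import io
import tarfile

PREFIX = "2023.06/software/linux/aarch64/nvidia/grace"


def summarize_tarball(tar: tarfile.TarFile, prefix: str):
    """Return (number of entries, sorted module files under <prefix>/modules/all)."""
    modules_root = prefix.rstrip("/") + "/modules/all/"
    names = tar.getnames()
    modules = sorted(
        name[len(modules_root):]
        for name in names
        if name.startswith(modules_root) and name.endswith(".lua")
    )
    return len(names), modules


def build_demo_tarball(paths):
    """Create a small gzip-compressed tarball in memory with empty files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for path in paths:
            info = tarfile.TarInfo(name=path)
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))
    buffer.seek(0)
    return buffer


demo_paths = [
    PREFIX + "/modules/all/x265/3.5-GCCcore-12.3.0.lua",
    PREFIX + "/modules/all/Boost/1.82.0-GCC-12.3.0.lua",
    PREFIX + "/software/Boost/1.82.0-GCC-12.3.0/placeholder.txt",  # made-up file
]

with tarfile.open(fileobj=build_demo_tarball(demo_paths), mode="r:gz") as demo:
    entries, modules = summarize_tarball(demo, PREFIX)

print(entries)  # 3
print(modules)  # ['Boost/1.82.0-GCC-12.3.0.lua', 'x265/3.5-GCCcore-12.3.0.lua']
```

Running the same summary on two artefacts and comparing the module lists is a quick way to confirm that a package such as Boost.Python was dropped between builds.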

Collaborator

@trz42 trz42 left a comment

lgtm

One CI workflow is failing. Looks like a newer apptainer version is being installed and that requires some shared library that is not available. Can be tackled in a separate PR.

Nice if you could add a little comment about the failing ReFrame tests.

@trz42 trz42 added the ready-to-deploy label (Mark a PR as ready to deploy) and removed the ready-to-review label Mar 28, 2025
@TopRichard
Collaborator Author

TopRichard commented Mar 28, 2025

> lgtm
>
> One CI workflow is failing. Looks like a newer apptainer version is being installed and that requires some shared library that is not available. Can be tackled in a separate PR.
>
> Nice if you could add a little comment about the failing ReFrame tests.

The ReFrame tests fail with:

/p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/event_023e9460-0c09-11f0-9cf2-1f5e50809cb5/run_000/linux_aarch64_nvidia_grace/eessi.io-2023.06-software/reframe_runs/stage/BotBuildTests/aarch64_nvidia_grace/default/EESSI_OSU_coll_775175bf/rfm_job.sh: line 7: mpirun: command not found

For reference, a sample rfm_job.sh:

#!/bin/bash
module load OSU-Micro-Benchmarks/7.2-gompi-2023b
export OMP_NUM_THREADS=1
export I_MPI_PIN_CELL=core
export I_MPI_PIN_DOMAIN=1:compact
export OMPI_MCA_rmaps_base_mapping_policy=slot:PE=1
mpirun -np 72 osu_allreduce -m 8 -x 5 -i 10 -c
echo "EESSI_CVMFS_REPO: $EESSI_CVMFS_REPO"
echo "EESSI_SOFTWARE_SUBDIR: $EESSI_SOFTWARE_SUBDIR"
echo "FULL_MODULEPATH: $(module --location show OSU-Micro-Benchmarks/7.2-gompi-2023b)"

Checking why it is failing to find mpirun.

Looking at rfm_job.out, I can see that FULL_MODULEPATH is empty:

EESSI_CVMFS_REPO: /cvmfs/software.eessi.io
EESSI_SOFTWARE_SUBDIR: aarch64/nvidia/grace
FULL_MODULEPATH: 
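
An empty FULL_MODULEPATH would be consistent with the module load in rfm_job.sh not taking effect, which would also explain mpirun being absent from PATH; the thread does not confirm this, so treat it as one plausible reading. The Python sketch below shows how the diagnostic lines could be scanned for empty values. The function names are hypothetical.

```python
import shutil


def parse_fields(output: str) -> dict:
    """Parse 'KEY: value' lines into a dict, keeping empty values."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def empty_fields(output: str) -> list:
    """Return the keys whose value is empty."""
    return [key for key, value in parse_fields(output).items() if not value]


# Lines copied from rfm_job.out as quoted above.
rfm_job_out = "\n".join([
    "EESSI_CVMFS_REPO: /cvmfs/software.eessi.io",
    "EESSI_SOFTWARE_SUBDIR: aarch64/nvidia/grace",
    "FULL_MODULEPATH: ",
])

print(empty_fields(rfm_job_out))  # ['FULL_MODULEPATH']

# On the affected node, this would show whether mpirun is on PATH at all.
print("mpirun on PATH:", shutil.which("mpirun") is not None)
```

Checking for empty diagnostic values first narrows the problem to the module environment before looking at the MPI launcher itself.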

@TopRichard TopRichard added the bot:deploy label (Ask bot to deploy missing software installations to EESSI) and removed the ready-to-deploy label (Mark a PR as ready to deploy) Mar 28, 2025
@eessi-bot

eessi-bot bot commented Mar 28, 2025

Label bot:deploy has been set by user TopRichard, which has no permission to trigger the action

@eessi-bot-trz42

Label bot:deploy has been set by user TopRichard, but this person does not have permission to trigger deployments

@trz42
Collaborator

trz42 commented Mar 28, 2025

Tarball ingested and software packages available via /cvmfs.

@trz42 trz42 merged commit 0ea60c1 into EESSI:2023.06-software.eessi.io Mar 28, 2025
52 of 59 checks passed
@eessi-bot

eessi-bot bot commented Mar 28, 2025

PR merged! Moved [] to /project/def-users/SHARED/trash_bin/EESSI/software-layer/2025.03.28

@eessi-bot-toprichard

PR merged! Moved ['/p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/13544880', '/p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/13544877', '/p/project1/ceasybuilders/bot-rt/jobs/2025.03/pr_986/13544939'] to /p/project1/ceasybuilders/bot-rt/trash_bin/EESSI/software-layer/2025.03.28
