@casparvl
Collaborator

@casparvl casparvl commented Nov 4, 2025

This PR separates out the build of cuDNN-9.10.1.4 for CC90, since this is the only build for which we want to disable the sanity check. See https://gitlab.com/eessi/support/-/issues/210#note_2865904820

NOTE: this PR should only be built for CC90 targets. The other targets will be provided by #1287.

Caspar van Leeuwen and others added 2 commits November 4, 2025 16:58
@casparvl
Collaborator Author

casparvl commented Nov 4, 2025

This should work, with the sanity check disabled...

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace,accel=nvidia/cc90
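For readers unfamiliar with the build command above, here is a minimal sketch of how such a line could be split into its arguments. It is not the bot's actual parser; the function name `parse_build_command` and the returned dictionary layout are assumptions made for illustration.

```python
def parse_build_command(line: str) -> dict:
    """Split a 'bot: build ...' comment line into its key:value arguments.

    Hypothetical helper, not part of the bot itself.
    """
    prefix = "bot: build"
    if not line.startswith(prefix):
        raise ValueError("not a bot build command")

    args = {}
    for token in line[len(prefix):].split():
        # Each token looks like 'key:value'; split on the first colon only.
        key, _, value = token.partition(":")
        args[key] = value

    # The 'for' argument holds comma-separated key=value pairs (arch, accel).
    pairs = [p for p in args.get("for", "").split(",") if p]
    args["for"] = dict(p.split("=", 1) for p in pairs)
    return args


cmd = (
    "bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc "
    "for:arch=aarch64/nvidia/grace,accel=nvidia/cc90"
)
parsed = parse_build_command(cmd)
print(parsed["instance"], parsed["for"]["arch"], parsed["for"]["accel"])
```

The same helper would apply to the later command targeting `x86_64/amd/zen4`, since only the values differ.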

@eessi-bot-jsc

eessi-bot-jsc bot commented Nov 4, 2025

New job on instance eessi-bot-jsc for repository eessi.io-2025.06-software
Building on: nvidia-grace and accelerator nvidia/cc90
Building for: aarch64/nvidia/grace and accelerator nvidia/cc90
Job dir: /p/project1/ceasybuilders/eessibot/jobs/2025.11/pr_1286/14181156

date | job status | comment
Nov 04 16:12:55 UTC 2025 | submitted | job id 14181156 awaits release by job manager
Nov 04 16:13:12 UTC 2025 | released | job awaits launch by Slurm scheduler
Nov 04 16:28:22 UTC 2025 | running | job 14181156 is running
Nov 04 16:31:36 UTC 2025 | finished | 😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-14181156.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-aarch64-nvidia-grace-accel-nvidia-cc90-17622737690.tar.gz
size: 0 MiB (45 bytes)
entries: 0
modules under 2025.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc90/modules/all
no module files in tarball
software under 2025.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc90/software
no software packages in tarball
reprod directories under 2025.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc90/reprod
no reprod directories in tarball
other under 2025.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc90
no other files in tarball
Nov 04 16:31:36 UTC 2025 | test result | 😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 0/0 test case(s) from 0 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-14181156.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
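The build checklist in the comment above reads like a series of "is this message present in the job output?" tests. Below is a minimal sketch of that idea, assuming plain substring matching. The real bot's implementation is not shown in this conversation, so the `CHECKS` table, the function name `evaluate_build_log`, and the sample log text are all hypothetical.

```python
# (pattern, must_be_present) pairs, taken from the build checklist above.
CHECKS = [
    ("FATAL:", False),
    ("ERROR:", False),
    ("FAILED:", False),
    ("required modules missing:", False),
    ("No missing installations", True),
    (".tar.gz created!", True),
]


def evaluate_build_log(log_text: str) -> tuple[bool, list[str]]:
    """Return (overall_ok, report_lines) for a job output file's text."""
    report = []
    overall_ok = True
    for pattern, must_be_present in CHECKS:
        found = pattern in log_text
        passed = found == must_be_present
        overall_ok = overall_ok and passed
        mark = "✅" if passed else "❌"
        verb = "found" if found else "no"
        report.append(f"{mark} {verb} message matching {pattern}")
    return overall_ok, report


# Hypothetical log text, made up only to exercise the checks.
sample_log = "\n".join([
    "ERROR: build failed",
    "FAILED: installation",
    "required modules missing: CUDA/12.8.0",
    "archive.tar.gz created!",
])
ok, lines = evaluate_build_log(sample_log)
print("SUCCESS" if ok else "FAILURE")
print("\n".join(lines))
```

With this sample input the report reproduces the pattern of marks in the checklist: a tarball was created, but the error and missing-module messages make the overall result a failure.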

@casparvl
Collaborator Author

casparvl commented Nov 4, 2025

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/amd/zen4,accel=nvidia/cc90

@eessi-bot-surf

eessi-bot-surf bot commented Nov 4, 2025

New job on instance eessi-bot-surf for repository eessi.io-2025.06-software
Building on: amd-zen4 and accelerator nvidia/cc90
Building for: x86_64/amd/zen4 and accelerator nvidia/cc90
Job dir: /projects/eessibot/eessi-bot-surf/jobs/2025.11/pr_1286/15756026

date | job status | comment
Nov 04 16:31:22 UTC 2025 | submitted | job id 15756026 will be eligible to start in about 20 seconds
Nov 04 16:31:33 UTC 2025 | received | job awaits launch by Slurm scheduler
Nov 04 16:32:03 UTC 2025 | running | job 15756026 is running
Nov 04 16:33:46 UTC 2025 | finished | 😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-15756026.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen4-accel-nvidia-cc90-17622739720.tar.gz
size: 0 MiB (45 bytes)
entries: 0
modules under 2025.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/modules/all
no module files in tarball
software under 2025.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/software
no software packages in tarball
reprod directories under 2025.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90/reprod
no reprod directories in tarball
other under 2025.06/software/linux/x86_64/amd/zen4/accel/nvidia/cc90
no other files in tarball
Nov 04 16:33:46 UTC 2025 | test result | 😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 0/0 test case(s) from 0 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-15756026.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
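A note on the ReFrame pattern shown in the test details: as displayed, `[\s*FAILED\s*]` would be a regex character class rather than a literal bracketed `FAILED` tag. The sketch below compares the displayed form with a bracket-escaped form. That the escaped form is the intended one is an assumption (the backslashes may simply have been lost in rendering), and the `failed_line` sample is made up for illustration.

```python
import re

# Pattern exactly as displayed in the comment above.
displayed = r"[\s*FAILED\s*].*Ran .* test case"
# Assumed intent: literal square brackets around FAILED.
escaped = r"\[\s*FAILED\s*\].*Ran .* test case"

# Summary line as it appears in the ReFrame Summary above.
passed_line = (
    "[ PASSED ] Ran 0/0 test case(s) from 0 check(s) "
    "(0 failure(s), 0 skipped, 0 aborted)"
)
# Hypothetical failing summary line, for comparison only.
failed_line = (
    "[ FAILED ] Ran 1/1 test case(s) from 1 check(s) "
    "(1 failure(s), 0 skipped, 0 aborted)"
)

# The unescaped form is a character class, so it also matches the PASSED line.
displayed_matches_passed = re.search(displayed, passed_line) is not None
# The escaped form only matches a literal '[ FAILED ]' tag.
escaped_matches_passed = re.search(escaped, passed_line) is not None
escaped_matches_failed = re.search(escaped, failed_line) is not None

print(displayed_matches_passed, escaped_matches_passed, escaped_matches_failed)
```

Since the bot reports "no message matching" for a passing run, the escaped form is consistent with the observed behaviour, while the displayed form would not be.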

@casparvl
Collaborator Author

casparvl commented Nov 4, 2025

Ah...

== Summary:

  • [FAILED] CUDA/12.8.0
  • [SKIPPED] cuDNN/9.10.1.4-CUDA-12.8.0

We need #1278 before we can proceed here, since CUDA now gets built as a dependency.
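To spot this kind of dependency failure quickly, the summary lines can be grouped by status. This is a minimal sketch, assuming each entry has the form `[STATUS] module/version` as in the excerpt above; the helper name `parse_summary` and the regular expression are illustrative, not part of any tool used in this PR.

```python
import re

# Matches entries such as '[FAILED] CUDA/12.8.0' anywhere in a line.
SUMMARY_LINE = re.compile(r"\[(?P<status>[A-Z]+)\]\s+(?P<module>\S+)")


def parse_summary(text: str) -> dict:
    """Group module names by their reported status (hypothetical helper)."""
    by_status = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.search(line)
        if match:
            by_status.setdefault(match["status"], []).append(match["module"])
    return by_status


summary = """== Summary:

  • [FAILED] CUDA/12.8.0
  • [SKIPPED] cuDNN/9.10.1.4-CUDA-12.8.0
"""
result = parse_summary(summary)
print("failed:", result.get("FAILED", []))
print("skipped:", result.get("SKIPPED", []))
```

Here the skipped cuDNN entry follows from the failed CUDA build, which is why the CUDA build has to land first.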
