{2025.05}[SYSTEM] Cuda 12.6.0, 12.8.0, cuDNN 9.5.0.50 #1278
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
The cuDNN host_injections installation failed because it doesn't contain PTX code (fixed in EESSI/software-layer-scripts@e25b625 and bf2fc9c). Also, another failure: Not sure what's wrong here. We may be missing a |
|
Added some extra verbosity (EESSI/software-layer-scripts@54bd9ad), let's see.
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
Making it verbose seems to have solved the issue. That is, of course, impossible, but... things are working now: So maybe this was just one more of |
|
Let's get all of those host-injections installed...
All bots that run native builds (one architecture per bot is sufficient):
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/intel/cascadelake,accel=nvidia/cc70
x86_64 and arm archs on AWS bot:
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=nvidia/cc70 |
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
Edit: not sure why the previous build failed. The installations in the host_injections failed with a message that the lock file was already present. That's very strange, there should not be a lock file in the host_injections...
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=nvidia/cc70 |
|
New job on instance
|
|
Oh crap, I see the issue, the other build was
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws for:arch=aarch64/generic |
|
New job on instance
|
|
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws for:arch=aarch64/generic |
|
New job on instance
|
|
Wrong version...
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=aarch64/generic |
|
New job on instance
|
|
bot: help |
Updates by the bot instance
|
Updates by the bot instance
|
Updates by the bot instance
|
Updates by the bot instance
|
Updates by the bot instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/intel/cascadelake,accel=nvidia/cc70 |
|
bot: help |
Updates by the bot instance
|
Updates by the bot instance
|
Updates by the bot instance
|
Updates by the bot instance
|
Updates by the bot instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/intel/cascadelake,accel=nvidia/cc70 |
|
bot: show_config |
|
Instance
|
|
Instance
|
|
Instance
|
|
Instance
|
|
Instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/intel/cascadelake,accel=nvidia/cc70 |
|
New job on instance
|
|
New job on instance
|
|
Hm, both tried to install in the same host_injections. The second one gave another of those strange, random permissions errors. Let me retry... |
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/intel/cascadelake,accel=nvidia/cc70 |
|
New job on instance
|
|
New job on instance
|
|
What causes this? |
|
It seems like this environment variable is not found? |
I think we should deploy the script from EESSI/software-layer-scripts#120 through this current PR, then change `build.sh` back to its original form. The issue is that EESSI/software-layer-scripts#120 can't be deployed there, because no software is built, and thus no "no missing installations" message is printed. This causes the bot to consider the build step a 'failure'.
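To illustrate why a run that builds no software gets flagged, here is a minimal sketch of a marker-based success check. It is not the bot's actual code. The function name, the case-insensitive match, and both sample logs are assumptions made up for this example; only the "no missing installations" wording comes from the comment above.

```python
# Hypothetical sketch: the build step counts as successful only if the
# log contains the marker message. Names and sample logs are illustrative.
SUCCESS_MARKER = "no missing installations"  # wording taken from the comment above


def build_step_succeeded(log_text: str, marker: str = SUCCESS_MARKER) -> bool:
    """Return True only when the marker appears somewhere in the job log."""
    return marker.lower() in log_text.lower()


# Made-up log of a run that built software and printed the marker.
log_with_software_build = "building software...\nNo missing installations\n"

# Made-up log of a run that only deployed a script: nothing was built,
# so the marker is never printed.
log_script_deploy_only = "deploying script...\ndone\n"

print(build_step_succeeded(log_with_software_build))
print(build_step_succeeded(log_script_deploy_only))
```

Under this kind of check, a job that completes without error but never prints the marker is indistinguishable from a failed one. That matches the situation described above, where a script-only deployment is reported as a 'failure'.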