Rebuild GROMACS for cuda sanity check #1166
|
Retrying after having implemented this suggestion to have the bot set
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
@casparvl how did you add the suggestion? |
|
I added a SitePackage.lua to the |
|
14349790 failed the CUDA sanity check, so it is no longer getting stuck with the changes made to the .SitePackage.lua file. |
So it's only the PTX code that's missing; the device code is there. I'll add the option to ignore this and retry. |
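The distinction made here (device code present, PTX missing) can be illustrated with a small check. This is a minimal sketch, not the EasyBuild sanity check itself: it assumes the embedded file names follow the `*.sm_80.cubin` / `*.sm_80.ptx` pattern that `cuobjdump --list-elf` and `cuobjdump --list-ptx` print, and the function name `check_cuda_targets` and the sample file name are hypothetical.

```python
import re

# Matches the architecture suffix in embedded file names such as
# "libexample.1.sm_80.cubin" (device code) or "libexample.1.sm_80.ptx" (PTX).
# The naming scheme is an assumption based on typical cuobjdump listings.
ARCH_RE = re.compile(r"\.sm_(\d+[a-z]?)\.(cubin|ptx)$")


def check_cuda_targets(embedded_files, target_cc):
    """Report whether device code and PTX exist for one compute capability.

    embedded_files: file names (or full listing lines) for one binary.
    target_cc: compute capability without the dot, e.g. "80" for cc80.
    """
    device_ccs, ptx_ccs = set(), set()
    for name in embedded_files:
        match = ARCH_RE.search(name.strip())
        if not match:
            continue  # skip anything that does not follow the assumed pattern
        cc, kind = match.groups()
        (device_ccs if kind == "cubin" else ptx_ccs).add(cc)
    return {
        "device_code": target_cc in device_ccs,
        "ptx": target_cc in ptx_ccs,
    }


# Hypothetical listing: device code for cc80 is embedded, PTX is not.
result = check_cuda_targets(["libexample.1.sm_80.cubin"], "80")
print(result)
```

A result with `device_code` true and `ptx` false corresponds to the situation described in this comment: the binary runs on the targeted GPU, but carries no PTX for just-in-time compilation on newer architectures.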
|
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf for:arch=x86_64/intel/icelake,accel=nvidia/cc80 |
|
New job on instance
|
|
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-surf for:arch=x86_64/amd/zen4,accel=nvidia/cc90 |
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
I have not updated the SitePackage.lua, so I'm going to cancel the jobs running at UGent. |
|
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-vsc-ugent for:arch=x86_64/amd/zen3,accel=nvidia/cc80 |
|
New job on instance
|
|
New job on instance
|
|
Doing the remainder of the builds (cross-compilations):
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=nvidia/cc70 |
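To make the structure of these commands easier to read, here is a minimal sketch that splits a `bot: build` comment into its arguments. The parser `parse_bot_build` is hypothetical and is not the bot's own code. The reading that `on:` names the build node and `for:` names the target is inferred from the "cross-compilations" wording in this thread, not taken from the bot documentation.

```python
def parse_bot_build(command):
    """Split a 'bot: build ...' comment into its key:value arguments."""
    prefix = "bot: build"
    text = command.strip()
    if not text.startswith(prefix):
        raise ValueError("not a bot build command")
    args = {}
    for token in text[len(prefix):].split():
        key, _, value = token.partition(":")
        # 'on' and 'for' carry comma-separated sub-settings such as
        # arch=... and accel=...; other keys hold a plain string.
        if key in ("on", "for"):
            args[key] = dict(item.split("=", 1) for item in value.split(","))
        else:
            args[key] = value
    return args


cmd = (
    "bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws "
    "on:arch=zen2 for:arch=x86_64/amd/zen2,accel=nvidia/cc70"
)
parsed = parse_bot_build(cmd)
print(parsed["on"])   # settings given with on:
print(parsed["for"])  # settings given with for:
```

In the command above, `on:` carries only a CPU architecture while `for:` adds an `accel=` setting, which matches the idea of building GPU software on a node without that GPU.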
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
New job on instance
|
|
The UGent bot turns out not to be configured with this yet, so cross-compiling:
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws on:arch=zen2 for:arch=x86_64/amd/zen2,accel=nvidia/cc80 |
|
New job on instance
|
|
We should look into why these three builds are failing. The tests on NVIDIA Grace + cc90 are also failing. Since this is a native build, this should also be resolved (#1166 (comment)). |
|
Check whether these two pass now that NCCL is rebuilt and deployed: |
|
New job on instance
|
|
New job on instance
|
|
89454 failed with: |
Found this in the EasyBuild log:
That happened before, e.g. in #709 (comment), and it often works when you try again.
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws on:arch=icelake for:arch=x86_64/intel/icelake,accel=nvidia/cc90 |
|
New job on instance
|
|
New job on instance
|
|
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws on:arch=cascadelake for:arch=x86_64/intel/cascadelake,accel=nvidia/cc80 |
|
New job on instance
|
Looks like its config file needs to be updated? |
|
Ah, the bot config was recently updated. Let's try again:
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace,accel=nvidia/cc90 |
|
New job on instance
|
Checklist for supported CPU-GPU combos (by @boegel):
- generic: aarch64, x86_64
- aarch64: neoverse_n1, neoverse_v1, nvidia/grace
- x86_64/amd: zen2, zen3, zen4
- x86_64/intel: haswell, skylake_avx512, cascadelake, icelake, sapphirerapids
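
A checklist like this can be tracked with a few lines of code. The sketch below is only an illustration. The CPU targets come from the checklist and the requested pairs come from the `bot: build` commands visible in this thread, but the GPU list (`cc70`, `cc80`, `cc90`) is an assumption limited to the compute capabilities those commands mention. A requested build is not necessarily a successful one, and not every pair in the full cross product is necessarily a supported combination.

```python
from itertools import product

# CPU targets from the checklist, written in the form used by for:arch=...
CPU_TARGETS = [
    "aarch64/generic", "x86_64/generic",
    "aarch64/neoverse_n1", "aarch64/neoverse_v1", "aarch64/nvidia/grace",
    "x86_64/amd/zen2", "x86_64/amd/zen3", "x86_64/amd/zen4",
    "x86_64/intel/haswell", "x86_64/intel/skylake_avx512",
    "x86_64/intel/cascadelake", "x86_64/intel/icelake",
    "x86_64/intel/sapphirerapids",
]

# Assumption: only the compute capabilities seen in this thread's commands.
GPU_CCS = ["cc70", "cc80", "cc90"]

# (CPU target, GPU cc) pairs for which a build command appears in this thread.
requested = {
    ("x86_64/intel/icelake", "cc80"),
    ("x86_64/amd/zen4", "cc90"),
    ("x86_64/amd/zen3", "cc80"),
    ("x86_64/amd/zen2", "cc70"),
    ("x86_64/amd/zen2", "cc80"),
    ("x86_64/intel/icelake", "cc90"),
    ("x86_64/intel/cascadelake", "cc80"),
    ("aarch64/nvidia/grace", "cc90"),
}

# Pairs with no visible build command yet, in checklist order.
remaining = [
    pair for pair in product(CPU_TARGETS, GPU_CCS) if pair not in requested
]

for cpu, cc in remaining:
    print(f"for:arch={cpu},accel=nvidia/{cc}")
```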