Make EESSI-extend support accelerator installations
#27
| Strict installation path checking is enforced by EESSI for EESSI and site | ||
| installations involving accelerators. In these cases, if you wish to create an | ||
| accelerator installation you must set the environment variable | ||
| EESSI_ACCELERATOR_INSTALL (and load/reload this module). |
This new environment variable has an impact on the build scripts: it needs to be set whenever we expect to do an accelerator installation.
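As a rough illustration of the help text above, a user-side session might look like the sketch below. It assumes that an unversioned `module load EESSI-extend` resolves to a suitable default (the CI in this PR loads a versioned `-easybuild` module instead), and the `command -v module` guard is only there so the snippet also runs on systems without a module tool.

```shell
#!/usr/bin/env bash
# Sketch only: request an accelerator installation, then (re)load EESSI-extend
# so that it picks up the new setting.
export EESSI_ACCELERATOR_INSTALL=1

# Assumption: an unversioned module name resolves to a default version.
if command -v module >/dev/null 2>&1; then
    module unload EESSI-extend
    module load EESSI-extend
fi

echo "EESSI_ACCELERATOR_INSTALL=${EESSI_ACCELERATOR_INSTALL}"
```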
| fi | ||
| if [ ! -z ${EESSI_ACCELERATOR_TARGET} ]; then | ||
| INPUT="export EESSI_ACCELERATOR_TARGET=${EESSI_ACCELERATOR_TARGET}; ${INPUT}" | ||
| if [ ! -z ${EESSI_ACCELERATOR_TARGET_OVERRIDE} ]; then |
@trz42 This is why I was asking where these environment variables get set: this should be using the override mechanism.
| module load EESSI-extend/${{matrix.EESSI_VERSION}}-easybuild | ||
| check_env_var "EASYBUILD_INSTALLPATH" "$EESSI_SOFTWARE_PATH" # installation path should be the same unless we ask for an explicit GPU installation | ||
| check_env_var "EASYBUILD_CUDA_COMPUTE_CAPABILITIES" "$STORED_CUDA_CC" | ||
| export EESSI_ACCELERATOR_INSTALL=1 |
It is this variable (EESSI_ACCELERATOR_INSTALL) and EESSI_ACCELERATOR_TARGET_OVERRIDE that need to be set by the bot in order to configure EESSI-extend correctly for a GPU installation
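The requirement described in this comment could be sketched as a small shell helper. The function name `configure_accel_build` is hypothetical and is not part of the bot or of EESSI-extend; the `accel/` prefix follows the override value used in bot/build.sh in this PR, and the input is assumed to be a string such as `nvidia/cc80`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the EESSI bot): export the two variables
# that EESSI-extend needs in order to configure a GPU installation.
configure_accel_build() {
    local accel="$1"   # e.g. "nvidia/cc80"; an empty value means a CPU-only build

    if [[ -z "$accel" ]]; then
        # No accelerator requested: make sure neither variable leaks through.
        unset EESSI_ACCELERATOR_TARGET_OVERRIDE EESSI_ACCELERATOR_INSTALL
        return 0
    fi

    # The override uses the archdetect-style form with the "accel/" prefix.
    export EESSI_ACCELERATOR_TARGET_OVERRIDE="accel/${accel}"
    export EESSI_ACCELERATOR_INSTALL=1
}

configure_accel_build "nvidia/cc80"
echo "EESSI_ACCELERATOR_TARGET_OVERRIDE=${EESSI_ACCELERATOR_TARGET_OVERRIDE}"
echo "EESSI_ACCELERATOR_INSTALL=${EESSI_ACCELERATOR_INSTALL}"
```

Unsetting both variables in the CPU-only branch is a design choice of this sketch, so that a stale value from an earlier job cannot trigger an accelerator installation by accident.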
| if (eessi_accelerator_target ~= nil) then | ||
| cuda_compute_capability = string.match(eessi_accelerator_target, "^nvidia/cc([0-9][0-9])$") | ||
| if (cuda_compute_capability ~= nil) then | ||
| easybuild_installpath = pathJoin(easybuild_installpath, 'accel', eessi_accelerator_target) |
This was actually wrong: archdetect returns paths like accel/nvidia/cc80 (see https://github.com/EESSI/software-layer-scripts/blob/main/tests/archdetect/nvidia-smi/1xa100.output). We were at least consistent in our error: the build script incorrectly set EESSI_ACCELERATOR_TARGET rather than EESSI_ACCELERATOR_TARGET_OVERRIDE, which would have affected the behaviour of archdetect.
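To make the difference concrete, here is a minimal bash sketch of matching the prefixed form. It is not the module's Lua code: the function name `cuda_cc_from_accel_path` is hypothetical, and the two-digit pattern simply mirrors the one in the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the CUDA compute capability digits from an
# archdetect-style accelerator path such as "accel/nvidia/cc80".
cuda_cc_from_accel_path() {
    local target="$1"
    if [[ "$target" =~ ^accel/nvidia/cc([0-9]{2})$ ]]; then
        echo "${BASH_REMATCH[1]}"
    else
        return 1   # not a recognised NVIDIA accelerator path
    fi
}

cuda_cc_from_accel_path "accel/nvidia/cc80"   # prints 80
cuda_cc_from_accel_path "nvidia/cc80" || echo "no match without the accel/ prefix"
```

With the prefixed form, a caller presumably no longer needs to add an extra `accel` path component when building the installation path, since the target already carries it.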
| export EESSI_ACCELERATOR_TARGET=$(cfg_get_value "architecture" "accelerator") | ||
| echo "bot/build.sh: EESSI_ACCELERATOR_TARGET='${EESSI_ACCELERATOR_TARGET}'" | ||
| export EESSI_ACCELERATOR_TARGET_OVERRIDE="accel/$(cfg_get_value architecture accelerator)" | ||
| echo "bot/build.sh: EESSI_ACCELERATOR_TARGET_OVERRIDE='${EESSI_ACCELERATOR_TARGET_OVERRIDE}'" |
| if [[ -n "$EESSI_ACCELERATOR_TARGET_OVERRIDE" && -z "$EESSI_ACCELERATOR_TARGET" ]]; then | ||
| fatal_error "EESSI module should've set EESSI_ACCELERATOR_TARGET when EESSI_ACCELERATOR_TARGET_OVERRIDE exported." >&2 | ||
| elif [[ -n "$EESSI_ACCELERATOR_TARGET_OVERRIDE" ]]; then | ||
| export EESSI_ACCELERATOR_INSTALL=1 |
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2
New job on instance
…scripts into eessi-extend-cuda
Co-authored-by: Bob Dröge <[email protected]>
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen3
Tarballs ingested. I only now noticed that the a64fx build was not done (bot is dead?), but we can fix that later. Let's merge this PR.
…SI_ACCELERATOR_TARGET, as it should be. The changes in this commit were simply forgotten, which made this inconsistent.