Add NVHPC/21.11 and CUDA/11.5.1 #1392

pramodk · 2021-12-10T00:41:32Z

Add NVIDIA HPC-SDK 21.11
Pass `-mno-abm` to the 21.11 NVIDIA compilers to avoid an issue with the `__ABM__` macro and Random123 (https://forums.developer.nvidia.com/t/21-11-tp-behaviour/198115)
Add + install CUDA 11.5.1, which matches the version bundled with the HPC-SDK 21.11 (🤞 Alternate CUDA provider spack/spack#19365 progresses soon)
Patch localrc generation for NVIDIA compilers to run in a cleaner environment. Previously this ran after compiler configuration for LLVM, and `module load llvm ... module unload llvm` left LLVM's autoloaded dependencies (python and cuda) in the environment, so their paths made their way into the localrc file.
Don't set CMAKE_C_FLAGS for CoreNEURON + NMODL builds that contain no C code.
Add (but don't use anymore) a helper script deploy/set-compiler-flags.py -- cc: @matz-e who was interested in tweaking it.
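
The `-mno-abm` item above boils down to a version-gated compiler flag. A minimal sketch of that gating, assuming a hypothetical helper `extra_compiler_flags` that takes plain strings; it is not a Spack API and not the code in this PR, where a recipe would normally test its spec instead:

```python
def extra_compiler_flags(compiler_name, compiler_version):
    """Return extra flags for a compiler (hypothetical helper, not a Spack API)."""
    flags = []
    # Assumption: only the 21.11 NVIDIA compilers need the workaround.
    if compiler_name == "nvhpc" and compiler_version == "21.11":
        flags.append("-mno-abm")
    return flags


if __name__ == "__main__":
    print(extra_compiler_flags("nvhpc", "21.11"))
    print(extra_compiler_flags("gcc", "9.3.0"))
```

Keeping the condition tied to one exact version means the flag disappears on its own once a newer compiler release is selected.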

matz-e

Can you file this against merge-upstream, too?

olupton · 2021-12-10T15:14:38Z

Retest this please Jenkins.

olupton · 2021-12-13T08:40:24Z

Retest this please Jenkins.

olupton · 2021-12-13T16:17:50Z

Retest this please Jenkins.

olupton · 2021-12-13T17:11:57Z

I believe the failures are because the new NVC++ with its default target architecture (-tp host) defines __ABM__.

I'm not quite sure what's going on, as

$ nvc++ -V
nvc++ 21.11-0 64-bit target on x86-64 Linux -tp skylake-avx512

suggests it is auto-detecting -tp skylake-avx512, but

$ nvc++ -tp skylake-avx512 test.cpp
"test.cpp", line 2: error: identifier "__ABM__" is undefined
    return __ABM__;
           ^

while

$ nvc++ test.cpp

compiles successfully.

I guess we need to report this to the NVIDIA forums.
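
When scripting a comparison like the one above, it may help to pull the auto-detected target out of the `nvc++ -V` banner. A minimal sketch, assuming the banner has the shape quoted above; `detected_target` is a hypothetical helper name, not part of any NVIDIA tooling:

```python
import re


def detected_target(version_banner):
    """Return the value following -tp in an `nvc++ -V` banner, or None."""
    match = re.search(r"-tp\s+(\S+)", version_banner)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Banner copied from the output quoted above.
    banner = "nvc++ 21.11-0 64-bit target on x86-64 Linux -tp skylake-avx512"
    print(detected_target(banner))  # skylake-avx512
```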

I also see in the generated localrc file:

set GPPDIR= /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/externals/2021-01-06/linux-rhel7-x86_64/gcc-9.3.0/python-3.8.3-suxrst/include/python3.8 /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/externals/2021-01-06/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.0.2-kb4wci/include /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/externals/2021-01-06/linux-rhel7-x86_64/gcc-9.3.0/python-3.8.3-suxrst/include /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2021-01-06/linux-rhel7-x86_64/gcc-4.8.5/gcc-9.3.0-45gzrp/include/c++/9.3.0 /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2021-01-06/linux-rhel7-x86_64/gcc-4.8.5/gcc-9.3.0-45gzrp/include/c++/9.3.0/x86_64-pc-linux-gnu /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2021-01-06/linux-rhel7-x86_64/gcc-4.8.5/gcc-9.3.0-45gzrp/include/c++/9.3.0/backward /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2021-01-06/linux-rhel7-x86_64/gcc-4.8.5/gcc-9.3.0-45gzrp/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include /usr/local/include /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2021-01-06/linux-rhel7-x86_64/gcc-4.8.5/gcc-9.3.0-45gzrp/include /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2021-01-06/linux-rhel7-x86_64/gcc-4.8.5/gcc-9.3.0-45gzrp/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include-fixed /usr/include;

which looks like it has too much irrelevant stuff in it.
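
A check for this kind of leakage could be automated. A minimal sketch, assuming the localrc line has the `set GPPDIR= ...;` shape quoted above; the helper name `suspicious_gppdir_entries`, the default markers, and the short example paths are all hypothetical:

```python
def suspicious_gppdir_entries(localrc_line, markers=("/python-", "/cuda-")):
    """List GPPDIR entries whose path contains one of the given markers."""
    # Everything after the first "=" is the space-separated directory list.
    _, _, value = localrc_line.partition("=")
    entries = value.strip().rstrip(";").split()
    return [entry for entry in entries if any(m in entry for m in markers)]


if __name__ == "__main__":
    # Shortened, made-up paths; the real line is far longer.
    line = "set GPPDIR= /opt/python-3.8.3/include /opt/cuda-11.0.2/include /usr/include;"
    for entry in suspicious_gppdir_entries(line):
        print(entry)
```

An empty result would suggest the localrc file was generated in an environment without the python and cuda modules loaded.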

olupton · 2021-12-15T15:20:12Z

Note that the release notes say 11.5.1 is included, so I think we should install that one. Hopefully Spack will learn how to express that nvhpc already contains CUDA "soon"...

var/spack/repos/builtin/packages/coreneuron/package.py

olupton · 2021-12-17T09:08:09Z

@pramodk I can't request a review from you because you're the original author 😅

pramodk

Overall LGTM if everything is working in production :)

var/spack/repos/builtin/packages/coreneuron/package.py

* Add + deploy NVIDIA HPC SDK 21.11.
* Add + deploy CUDA 11.5.1 to match it.
* Do not define CMAKE_C_FLAGS in `coreneuron+nmodl`.
* Run `makelocalrc` for `nvhpc` in a cleaner environment.
* Add `-mno-abm` to [Core]NEURON recipes for `nvhpc@21.11`.

Co-authored-by: Olli Lupton <[email protected]>

Presumably this was working before because our nvhpc localrc files accidentally included CUDA include directories before BlueBrain/spack#1392.

* Cherry-pick of #1392.
* Add + deploy NVIDIA HPC SDK 21.11.
* Add + deploy CUDA 11.5.1 to match it.
* Do not define CMAKE_C_FLAGS in `coreneuron+nmodl`.
* Run `makelocalrc` for `nvhpc` in a cleaner environment.
* Use GCC 11.2.0 for NVHPC instead of GCC 9.4.0.
* Add `-mno-abm` to [Core]NEURON recipes for `nvhpc@21.11`.

Co-authored-by: Pramod Kumbhar <[email protected]>

pramodk marked this pull request as draft December 10, 2021 00:41

matz-e reviewed Dec 10, 2021

pramodk and others added 9 commits December 15, 2021 15:09

Add NVHPC 21.11 release (only x86_64 linux)

98b0dcc

Enable nvhpc@21.11

a05de0d

Add other platforms

ade11ed

enable cuda 11.5.0

8853d9c

Fix syntax.

329024c

Remove CMAKE_C_FLAGS from coreneuron recipe.

6002e2a

CUDA: add v11.5.1 (spack#27689)

ee8f024

Prefer CUDA 11.5.1 to match HPC-SDK 21.11.

30798b4

Run module purge just before makelocalrc.

b71ea24

olupton force-pushed the pramodk/nvhpc-21.11 branch from 75606e6 to b71ea24 December 15, 2021 14:34

fixup

8a2bddf

olupton added 6 commits December 15, 2021 19:14

Maybe it will work?

218e425

appease flake8

d560eab

Add -mno-abm for nvhpc@21.11.

953ba1a

Drop the -tp skylake part.

66318a3

-mno-abm for nvhpc@21.11 for neuron too.

d6145b5

fixup

d824a7c

olupton changed the title ~~Add NVHPC 21.11 release (only x86_64 linux)~~ Add NVHPC/21.11 and CUDA/11.5.1 Dec 16, 2021

olupton requested a review from matz-e December 16, 2021 11:18

matz-e reviewed Dec 16, 2021

var/spack/repos/builtin/packages/coreneuron/package.py

matz-e approved these changes Dec 16, 2021

pramodk commented Dec 21, 2021

var/spack/repos/builtin/packages/coreneuron/package.py

olupton marked this pull request as ready for review December 21, 2021 10:53

olupton mentioned this pull request Dec 21, 2021

Random123 does not cope with ABM feature macros with PGI/NVIDIA compilers. BlueBrain/CoreNeuron#724

Closed

olupton merged commit 596e000 into develop Dec 21, 2021

olupton deleted the pramodk/nvhpc-21.11 branch December 21, 2021 12:09

olupton mentioned this pull request Dec 21, 2021

Add NVHPC/21.11 and CUDA/11.5.1 #1407

Merged

olupton added a commit to BlueBrain/CoreNeuron that referenced this pull request Dec 21, 2021

Add CUDA toolkit includes.

531c4fe

Presumably this was working before because our nvhpc localrc files accidentally included CUDA include directories before BlueBrain/spack#1392.

ohm314 pushed a commit to BlueBrain/CoreNeuron that referenced this pull request Dec 23, 2021

Add CUDA toolkit includes.

063c57b

Presumably this was working before because our nvhpc localrc files accidentally included CUDA include directories before BlueBrain/spack#1392.
