Integrate changes from NERSC GPU hackathon. #783
* Use `nrn_pragma_acc(...)` in place of `#pragma acc ...`.
* Add OpenMP target offload directives with `nrn_pragma_omp(...)`.
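For readers unfamiliar with the pattern, function-like pragma macros of this kind are usually built on the standard `_Pragma` operator. The following is only a minimal sketch under that assumption: the `sketch_` and `SKETCH_` names are hypothetical, and the actual `nrn_pragma_acc` / `nrn_pragma_omp` definitions in CoreNEURON may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names for illustration; not the real CoreNEURON macros.
#define SKETCH_STRINGIFY(x) #x

#if defined(SKETCH_ENABLE_GPU) && defined(_OPENMP)
// OpenMP offload build: emit "#pragma omp ...", drop OpenACC directives.
#define sketch_pragma_omp(x) _Pragma(SKETCH_STRINGIFY(omp x))
#define sketch_pragma_acc(x)
#elif defined(SKETCH_ENABLE_GPU) && defined(_OPENACC)
// OpenACC build: emit "#pragma acc ...", drop OpenMP directives.
#define sketch_pragma_acc(x) _Pragma(SKETCH_STRINGIFY(acc x))
#define sketch_pragma_omp(x)
#else
// CPU-only build: both macros expand to nothing.
#define sketch_pragma_acc(x)
#define sketch_pragma_omp(x)
#endif

// Scales every element in place. Both directives annotate the same loop;
// at most one of them survives preprocessing.
inline void scale(std::vector<double>& v, double factor) {
    double* data = v.data();
    const int n = static_cast<int>(v.size());
    sketch_pragma_acc(parallel loop copy(data[0:n]))
    sketch_pragma_omp(target teams distribute parallel for map(tofrom: data[0:n]))
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

Writing both directives above the same loop keeps a single source that can be compiled for either programming model, or for the CPU with neither.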
Logfiles from GitLab pipeline #28911 (:white_check_mark:) have been uploaded here!
Logfiles from GitLab pipeline #28927 (:white_check_mark:) have been uploaded here!
Changes towards XLC/LLVM compiler support for GPU offload.
* Do not include cuda.h and openacc.h.
* Use cnrn_target_ wrappers instead of acc_ API.
* OpenMP offload: simd cannot be nested within target parallel loop.
Co-authored-by: Pramod Kumbhar <[email protected]>
Logfiles from GitLab pipeline #29141 (:white_check_mark:) have been uploaded here!
LGTM
* Use nrn_pragma_{acc,omp} and CORENEURON_ENABLE_GPU.
* Add EIGEN_DEVICE_FUNC to header to fix a compilation warning.
* Fudge partialPivLu<N> for NVHPC + OpenMP without OpenACC.
Logfiles from GitLab pipeline #29749 (:white_check_mark:) have been uploaded here!
Logfiles from GitLab pipeline #29798 (:white_check_mark:) have been uploaded here!
Codecov Report
@@            Coverage Diff             @@
##           master     #783      +/-   ##
==========================================
+ Coverage   61.42%   61.44%   +0.01%
==========================================
  Files         205      205
  Lines       29849    29895      +46
==========================================
+ Hits        18335    18368      +33
- Misses      11514    11527      +13
Logfiles from GitLab pipeline #30124 (:white_check_mark:) have been uploaded here!
* Update Eigen submodule commit with eigen#2.
* Cherry-pick setuptools fix #786.
Co-authored-by: Nicolas Cornu <[email protected]>
Logfiles from GitLab pipeline #30232 (:white_check_mark:) have been uploaded here!
LGTM 👌
  printer->add_line("ThreadDatum* device_thread = cnrn_target_deviceptr(thread);");
  printer->add_line(
-     " acc_memcpy_to_device(&(device_thread[{}]._pvoid), &device_ns, sizeof(void*));"_format(
+     "cnrn_target_memcpy_to_device(&(device_thread[{}]._pvoid), &device_ns);"_format(
Just curious: was the auto-deduced type different here?
I think for this one it was important to have device_ns be void* rather than NewtonSpace*.
This is because now we have

    template <typename T>
    void cnrn_target_memcpy_to_device(T*, T*) { ... }

and if you pass void** and NewtonSpace** it will fail to deduce T. We have to pass two pointers of the same type (up to const qualification).
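The deduction failure described here can be reproduced on the host with a small sketch. Everything in it is a stand-in chosen for illustration: `memcpy_to_device_sketch` is a hypothetical host-only substitute for the real wrapper (its `const` on the source parameter is an assumption), `NewtonSpace` is a dummy struct, and `deduces` is a helper invented to observe deduction at compile time.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the real NewtonSpace type.
struct NewtonSpace {
    int n;
};

// Host-only stand-in for the wrapper: both parameters use the same T,
// so the two arguments must agree on T (up to const qualification).
template <typename T>
void memcpy_to_device_sketch(T* dest, const T* src) {
    std::memcpy(dest, src, sizeof(T));
}

// Compile-time probe: true_type if the call deduces a single T, false_type otherwise.
template <typename A, typename B>
auto deduces(int)
    -> decltype(memcpy_to_device_sketch(std::declval<A>(), std::declval<B>()), std::true_type{});
template <typename A, typename B>
std::false_type deduces(...);

// void** with void**: T is deduced as void* for both arguments.
static_assert(decltype(deduces<void**, void**>(0))::value, "same pointee type deduces");
// void** with NewtonSpace**: T would be void* and NewtonSpace* at once, so deduction fails.
static_assert(!decltype(deduces<void**, NewtonSpace**>(0))::value, "mixed pointee types fail");

// Copies a pointer value into a void* slot, as the generated code does with device_ns.
inline bool copy_pointer_slot() {
    NewtonSpace ns{3};
    void* device_ns = &ns;  // declared as void* so both arguments are void**
    void* slot = nullptr;
    memcpy_to_device_sketch(&slot, &device_ns);
    return slot == &ns;
}
```

Declaring `device_ns` as `void*` is what makes `&device_ns` a `void**`, matching the type of `&(device_thread[i]._pvoid)`.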
printer->start_block("if(nt->compute_gpu)");
printer->add_line("double* device_vec = cnrn_target_copyin(vec, vec_size / sizeof(double));");
printer->add_line("void* device_ns = cnrn_target_deviceptr(*ns);");
printer->add_line("ThreadDatum* device_thread = cnrn_target_deviceptr(thread);");
Just curious: was the auto-deduced type different here?
I think auto or auto* would also be fine.
I believe I just left the type in so it was still spelled out once on the line.
* Tweak NVHPC warning suppressions.
* Emit nrn_pragma_{acc,omp}(...) macros. (BlueBrain/nmodl#780)
* Use cnrn_target_ wrappers instead of acc_ API.
* GPU code generation improvements (BlueBrain/nmodl#782)
* Fix NVHPC + OpenMP ~ OpenACC compilation (BlueBrain/nmodl#784)
* Add EIGEN_DEVICE_FUNC to header to fix a compilation warning.
* Fudge partialPivLu<N> for NVHPC + OpenMP without OpenACC.
* Transfer ml only if cell is not artificial. (BlueBrain/nmodl#785)
* Update Eigen to include OpenMP fixes. (BlueBrain/nmodl#787, BlueBrain/nmodl#789)
Co-authored-by: Nicolas Cornu <[email protected]>
Co-authored-by: Pramod Kumbhar <[email protected]>
NMODL Repo SHA: BlueBrain/nmodl@46f8baf
See BlueBrain/CoreNeuron#713 for context.