GPU data management using OpenACC as well as OpenMP API #704
| Logfiles from GitLab pipeline #28284 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28369 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28405 (:no_entry:) have been uploaded here! Status and direct links: | 
a6dcbf3 to 97466ea
| Logfiles from GitLab pipeline #28422 (:no_entry:) have been uploaded here! Status and direct links: |
…ORENEURON_PREFER_OPENMP_OFFLOAD
| Logfiles from GitLab pipeline #28474 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28511 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28535 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28569 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28578 (:no_entry:) have been uploaded here! Status and direct links: | 
| Logfiles from GitLab pipeline #28601 (:white_check_mark:) have been uploaded here! Status and direct links: | 
* IvocVect members t_ and y_ were copied twice
* only discon_indices_ is a pointer, and hence only that needs to be copied
| Logfiles from GitLab pipeline #28639 (:white_check_mark:) have been uploaded here! Status and direct links: | 
Summary of changes:
- Support OpenMP target offload when NMODL and GPU support are enabled. (#693, #704, #705, #707, #708, #716, #719)
- Use sensible defaults for the --nwarp parameter, improving the performance of the Hines solver with --cell-permute=2 on GPU. (#700, #710, #718)
- Use a Boost memory pool, if Boost is available, to reduce the number of independent CUDA unified memory allocations used for Random123 stream objects. This speeds up initialisation of models using Random123, and also makes it feasible to use NSight Compute on models using Random123 and for NSight Systems to profile initialisation. (#702, #703)
- Use -cuda when compiling with NVHPC and OpenACC or OpenMP, as recommended on the NVIDIA forums. (#721)
- Do not compile for compute capability 6.0 by default, as this is not supported by NVHPC with OpenMP target offload.
- Add new GitLab CI tests so we test CoreNEURON + NMODL with both OpenACC and OpenMP. (#698, #717)
- Add CUDA runtime header search path explicitly, so we don't rely on it being implicit in our NVHPC localrc.
- Cleanup unused code. (#711)

Co-authored-by: Pramod Kumbhar <[email protected]>
Co-authored-by: Ioannis Magkanaris <[email protected]>
Co-authored-by: Christos Kotsalos <[email protected]>
Co-authored-by: Nicolas Cornu <[email protected]>
Summary of changes:
- Support OpenMP target offload when NMODL and GPU support are enabled. (BlueBrain/CoreNeuron#693, BlueBrain/CoreNeuron#704, BlueBrain/CoreNeuron#705, BlueBrain/CoreNeuron#707, BlueBrain/CoreNeuron#708, BlueBrain/CoreNeuron#716, BlueBrain/CoreNeuron#719)
- Use sensible defaults for the --nwarp parameter, improving the performance of the Hines solver with --cell-permute=2 on GPU. (BlueBrain/CoreNeuron#700, BlueBrain/CoreNeuron#710, BlueBrain/CoreNeuron#718)
- Use a Boost memory pool, if Boost is available, to reduce the number of independent CUDA unified memory allocations used for Random123 stream objects. This speeds up initialisation of models using Random123, and also makes it feasible to use NSight Compute on models using Random123 and for NSight Systems to profile initialisation. (BlueBrain/CoreNeuron#702, BlueBrain/CoreNeuron#703)
- Use -cuda when compiling with NVHPC and OpenACC or OpenMP, as recommended on the NVIDIA forums. (BlueBrain/CoreNeuron#721)
- Do not compile for compute capability 6.0 by default, as this is not supported by NVHPC with OpenMP target offload.
- Add new GitLab CI tests so we test CoreNEURON + NMODL with both OpenACC and OpenMP. (BlueBrain/CoreNeuron#698, BlueBrain/CoreNeuron#717)
- Add CUDA runtime header search path explicitly, so we don't rely on it being implicit in our NVHPC localrc.
- Cleanup unused code. (BlueBrain/CoreNeuron#711)

Co-authored-by: Pramod Kumbhar <[email protected]>
Co-authored-by: Ioannis Magkanaris <[email protected]>
Co-authored-by: Christos Kotsalos <[email protected]>
Co-authored-by: Nicolas Cornu <[email protected]>

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@423ae6c
Description
- `acc_update_device` by pragmas
- `omp update device` for each `update device`
- `acc_update_self` by pragmas
- `omp update host` for each `update self`
- `acc_update_self` and `acc_update_device`

**Use certain branches for the GitLab/SimulationStack CI**
CI_BRANCHES:NMODL_BRANCH=hackathon_main,NEURON_BRANCH=master,
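
For readers unfamiliar with the two programming models, here is a minimal sketch of what pairing each OpenACC update with an OpenMP counterpart can look like. It is not the code from this PR. The `SKETCH_*` macros and the two function names are hypothetical, invented for this illustration. The sketch also assumes that the `omp update device` / `omp update host` shorthand in the description corresponds to the standard OpenMP directives `target update to(...)` and `target update from(...)`. Offloading is only active when the file is built with an OpenACC compiler, or with OpenMP plus the hypothetical `SKETCH_USE_OPENMP_OFFLOAD` define. In a plain host build both functions are no-ops.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper macros for this sketch only (not taken from the PR).
// They turn their argument into a pragma for the selected programming model
// and expand to nothing otherwise.
#define SKETCH_STRINGIFY(...) #__VA_ARGS__
#if defined(SKETCH_USE_OPENMP_OFFLOAD) && defined(_OPENMP)
#define SKETCH_PRAGMA_ACC(...)
#define SKETCH_PRAGMA_OMP(...) _Pragma(SKETCH_STRINGIFY(omp __VA_ARGS__))
#elif defined(_OPENACC)
#define SKETCH_PRAGMA_ACC(...) _Pragma(SKETCH_STRINGIFY(acc __VA_ARGS__))
#define SKETCH_PRAGMA_OMP(...)
#else
#define SKETCH_PRAGMA_ACC(...)
#define SKETCH_PRAGMA_OMP(...)
#endif

// Host -> device refresh of an array that is already present on the device.
// With the OpenACC runtime API this would be written as
//     acc_update_device(data, n * sizeof(double));
// The directive form has a direct OpenMP counterpart, which the API call lacks.
inline void sketch_update_device(double* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    SKETCH_PRAGMA_ACC(update device(data[0:n]))
    SKETCH_PRAGMA_OMP(target update to(data[0:n]))
    (void) data;  // silence unused-parameter warnings in a host-only build
    (void) n;
}

// Device -> host refresh; the runtime API equivalent would be
//     acc_update_self(data, n * sizeof(double));
inline void sketch_update_host(double* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    SKETCH_PRAGMA_ACC(update self(data[0:n]))
    SKETCH_PRAGMA_OMP(target update from(data[0:n]))
    (void) data;
    (void) n;
}
```

Because only one of the two macros expands to a pragma in any given build, each call site can carry both spellings side by side. Whether the PR structures its call sites this way is an assumption of this sketch.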