solve_interleaved2_launcher (CUDA interface) : fixing size of blocksPerGrid & threadsPerBlock #710

kotsaloscv · 2021-12-10T17:57:33Z

solve_interleaved2_launcher (CUDA interface) : fixing size of blocksPerGrid & threadsPerBlock
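For readers unfamiliar with the two launch parameters named in the title, here is a minimal host-side sketch of how `blocksPerGrid` is commonly derived from a chosen `threadsPerBlock`. It is an illustration only and is not the code changed in this PR: `LaunchConfig`, `ceil_div`, `make_launch_config`, the one-thread-per-work-item mapping, and the default block size of 128 are all assumptions.

```cpp
#include <cassert>

// Hypothetical helper types and names; not taken from coreneuron/permute/cellorder.cu.
struct LaunchConfig {
    int blocks_per_grid;
    int threads_per_block;
};

// Integer ceiling division: smallest q such that q * denominator >= numerator.
inline int ceil_div(int numerator, int denominator) {
    assert(numerator >= 0 && denominator > 0);
    return (numerator + denominator - 1) / denominator;
}

// Assumption: one CUDA thread per work item, and a block size that is a
// multiple of the warp size (32 threads on current NVIDIA GPUs).
inline LaunchConfig make_launch_config(int n_work_items, int threads_per_block = 128) {
    assert(threads_per_block > 0 && threads_per_block % 32 == 0);
    LaunchConfig cfg;
    cfg.threads_per_block = threads_per_block;
    // Round up so that a partially filled last block is still launched.
    cfg.blocks_per_grid = ceil_div(n_work_items, threads_per_block);
    return cfg;
}
```

In a `.cu` file the result would be passed to a launch such as `kernel<<<cfg.blocks_per_grid, cfg.threads_per_block, 0, stream>>>(...)`, with the kernel checking its global index against `n_work_items` because the last block may be only partly used. A grid size of zero is not a valid launch configuration, so a caller would typically skip the launch when there is no work.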

bbpbuildbot · 2021-12-10T19:17:32Z

Logfiles from GitLab pipeline #28858 (:no_entry:) have been uploaded here!

Status and direct links:

coreneuron/permute/cellorder.cu

bbpbuildbot · 2021-12-13T11:20:42Z

Logfiles from GitLab pipeline #28934 (:white_check_mark:) have been uploaded here!

Status and direct links:

bbpbuildbot · 2021-12-13T13:15:34Z

Logfiles from GitLab pipeline #28981 (:white_check_mark:) have been uploaded here!

Status and direct links:

Summary of changes:
- Support OpenMP target offload when NMODL and GPU support are enabled. (#693, #704, #705, #707, #708, #716, #719)
- Use sensible defaults for the --nwarp parameter, improving the performance of the Hines solver with --cell-permute=2 on GPU. (#700, #710, #718)
- Use a Boost memory pool, if Boost is available, to reduce the number of independent CUDA unified memory allocations used for Random123 stream objects. This speeds up initialisation of models using Random123, and also makes it feasible to use NSight Compute on models using Random123 and for NSight Systems to profile initialisation. (#702, #703)
- Use -cuda when compiling with NVHPC and OpenACC or OpenMP, as recommended on the NVIDIA forums. (#721)
- Do not compile for compute capability 6.0 by default, as this is not supported by NVHPC with OpenMP target offload.
- Add new GitLab CI tests so we test CoreNEURON + NMODL with both OpenACC and OpenMP. (#698, #717)
- Add CUDA runtime header search path explicitly, so we don't rely on it being implicit in our NVHPC localrc.
- Cleanup unused code. (#711)

Co-authored-by: Pramod Kumbhar <[email protected]>
Co-authored-by: Ioannis Magkanaris <[email protected]>
Co-authored-by: Christos Kotsalos <[email protected]>
Co-authored-by: Nicolas Cornu <[email protected]>

Summary of changes:
- Support OpenMP target offload when NMODL and GPU support are enabled. (BlueBrain/CoreNeuron#693, BlueBrain/CoreNeuron#704, BlueBrain/CoreNeuron#705, BlueBrain/CoreNeuron#707, BlueBrain/CoreNeuron#708, BlueBrain/CoreNeuron#716, BlueBrain/CoreNeuron#719)
- Use sensible defaults for the --nwarp parameter, improving the performance of the Hines solver with --cell-permute=2 on GPU. (BlueBrain/CoreNeuron#700, BlueBrain/CoreNeuron#710, BlueBrain/CoreNeuron#718)
- Use a Boost memory pool, if Boost is available, to reduce the number of independent CUDA unified memory allocations used for Random123 stream objects. This speeds up initialisation of models using Random123, and also makes it feasible to use NSight Compute on models using Random123 and for NSight Systems to profile initialisation. (BlueBrain/CoreNeuron#702, BlueBrain/CoreNeuron#703)
- Use -cuda when compiling with NVHPC and OpenACC or OpenMP, as recommended on the NVIDIA forums. (BlueBrain/CoreNeuron#721)
- Do not compile for compute capability 6.0 by default, as this is not supported by NVHPC with OpenMP target offload.
- Add new GitLab CI tests so we test CoreNEURON + NMODL with both OpenACC and OpenMP. (BlueBrain/CoreNeuron#698, BlueBrain/CoreNeuron#717)
- Add CUDA runtime header search path explicitly, so we don't rely on it being implicit in our NVHPC localrc.
- Cleanup unused code. (BlueBrain/CoreNeuron#711)

Co-authored-by: Pramod Kumbhar <[email protected]>
Co-authored-by: Ioannis Magkanaris <[email protected]>
Co-authored-by: Christos Kotsalos <[email protected]>
Co-authored-by: Nicolas Cornu <[email protected]>

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@423ae6c

kotsaloscv requested a review from iomaganaris December 10, 2021 17:57

solve_interleaved2_launcher (CUDA interface) : fixing size of blocksPerGrid & threadsPerBlock (d0da173)

iomaganaris force-pushed the kotsalos/cuda_interleaved_launcher branch from ef04047 to d0da173 Compare December 13, 2021 10:40

olupton reviewed Dec 13, 2021

coreneuron/permute/cellorder.cu

Christos Kotsalos added 3 commits December 13, 2021 13:21

solve_interleaved2_launcher (CUDA interface) : fixing size of blocksPerGrid & threadsPerBlock (beb6841)

Revert "solve_interleaved2_launcher (CUDA interface) : fixing size of blocksPerGrid & threadsPerBlock" (0bb45cf). This reverts commit beb6841.

solve_interleaved2_launcher (CUDA interface) : fixing size of blocksPerGrid & threadsPerBlock (9ba9b4d)

kotsaloscv requested a review from olupton December 13, 2021 12:26

olupton approved these changes Dec 13, 2021

kotsaloscv merged commit 01a39d7 into hackathon_main Dec 13, 2021

kotsaloscv deleted the kotsalos/cuda_interleaved_launcher branch December 13, 2021 12:46
