This project is a set of libraries designed to work with LDC to enable native execution of D on GPUs (and other, more exotic targets of OpenCL such as FPGAs and DSPs, hereafter just 'GPUs') on the OpenCL and CUDA runtimes. As DCompute depends on developments in LDC for the code generation, a relatively recent LDC is required: use 1.8.0 or newer.
There are four main parts:
- std: A library containing standard functionality for targeting GPUs and abstractions over the intrinsics of OpenCL and CUDA.
- driver: Handles all the compute API interactions and provides a friendly, easy-to-use, consistent interface. Of course, you can always get down to a lower level of interaction if you need to. You can also use this to execute non-D kernels (e.g. OpenCL or CUDA).
- kernels: A set of standard kernels and primitives to cover a large number of use cases and serve as documentation on how (and how not) to use this library.
- tests: A framework for testing kernels. The suite is runnable with dub test (see dub.json for the configuration used).
Kernel:
@kernel void saxpy(GlobalPointer!(float) res,
                   float alpha,
                   GlobalPointer!(float) x,
                   GlobalPointer!(float) y,
                   size_t N)
{
    auto i = GlobalIndex.x;
    if (i >= N) return;
    res[i] = alpha*x[i] + y[i];
}
Invoke with (CUDA):
q.enqueue!(saxpy)
([N,1,1],[1,1,1]) // Grid & block & optional shared memory
(b_res,alpha,b_x,b_y, N); // kernel arguments
equivalent to the CUDA code
saxpy<<<N,1,0,q>>>(b_res,alpha,b_x,b_y, N);
For more examples and the full code see source/dcompute/tests.
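To check a kernel's output, it can help to run the same arithmetic on the host. The following is a minimal sketch in plain D that does not use DCompute at all. The names saxpyThread and launch1D are hypothetical, and it assumes that GlobalIndex.x behaves like block index * block size + thread index for a one-dimensional launch.

```d
import std.stdio : writeln;

/// Host-side emulation of the saxpy kernel for one logical thread.
/// `i` plays the role of GlobalIndex.x (assumed, not taken from DCompute).
void saxpyThread(size_t i, float[] res, float alpha,
                 const(float)[] x, const(float)[] y, size_t N)
{
    if (i >= N) return;            // same bounds guard as the kernel
    res[i] = alpha * x[i] + y[i];
}

/// Runs every (block, thread) pair of a 1-D launch sequentially.
/// Hypothetical helper, for illustration only.
void launch1D(size_t grid, size_t block, float[] res, float alpha,
              const(float)[] x, const(float)[] y, size_t N)
{
    foreach (b; 0 .. grid)
        foreach (t; 0 .. block)
            saxpyThread(b * block + t, res, alpha, x, y, N);
}

void main()
{
    float[] x = [1.0f, 2.0f, 3.0f];
    float[] y = [10.0f, 20.0f, 30.0f];
    auto res = new float[](x.length);

    // 2 blocks of 2 threads cover 4 indices; the guard skips index 3.
    launch1D(2, 2, res, 2.0f, x, y, x.length);
    writeln(res); // [12, 24, 36]
}
```

The guard matters whenever grid * block exceeds N: without it, the extra logical thread would index past the end of the buffers.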
To build DCompute you will need:
- ldc as the D compiler.
- a SPIR-V capable LLVM (available here) to build ldc with, to support SPIR-V (required for OpenCL),
- or LDC built with any LLVM 3.9.1 or greater that has the NVPTX backend enabled, to support CUDA.
- dub.
Then just run
$ dub build
Alternatively, you can include dcompute as a dependency, as shown below:
- Add
"dcompute": { "version": "~>0.1.1", "dflags": [ "-mdcompute-targets=cuda-800", "-mdcompute-targets=ocl-300", "-oq" ] }
to your dub.json under dependencies. The dflags will be passed to LDC to generate code for the specified targets. You can run ldc2 --help to look for that flag. Use ocl-xy0 for OpenCL x.y and cuda-xy0 for CUDA Compute Capability x.y. So the above flags are for OpenCL 3.0 and CUDA CC 8.0. The two flags must be included separately, as shown in the dub.json.
  - If you get an error saying "Need to use a DCompute enabled compiler", you likely forgot the -mdcompute-targets flags.
  - Check NVIDIA's website for your CUDA Compute Capability.
- Alternatively, add the equivalent,
dependency "dcompute" version="~>0.1.1"
to your dub.sdl and include the dflags.
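The target naming scheme (ocl-xy0 for OpenCL x.y, cuda-xy0 for CUDA Compute Capability x.y) can be illustrated with a short D sketch. The helper dcomputeTargetFlag is hypothetical and is not part of DCompute or LDC; it only formats the flag string from a major and minor version.

```d
import std.format : format;
import std.stdio : writeln;

/// Formats an -mdcompute-targets flag for the given API and version.
/// `api` is "ocl" or "cuda"; the result follows the <api>-xy0 scheme.
/// Hypothetical helper, for illustration only.
string dcomputeTargetFlag(string api, uint major, uint minor)
{
    return format("-mdcompute-targets=%s-%d%d0", api, major, minor);
}

void main()
{
    writeln(dcomputeTargetFlag("ocl", 3, 0));  // -mdcompute-targets=ocl-300
    writeln(dcomputeTargetFlag("cuda", 8, 0)); // -mdcompute-targets=cuda-800
}
```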
If you get an error like "Error: unrecognized switch '-mdcompute-targets=cuda-210'", make sure you are using LDC and not DMD: passing --compiler=/path/to/ldc2 to dub will force it to use /path/to/ldc2 as the D compiler.
If you need to build ldc yourself, you will also need cmake and a dmd-compatible D compiler: dmd, ldmd or gdmd (the latter two are available as part of ldc and gdc respectively).
Please see the documentation.
Generate OpenCL builtins from here