|
4 | 4 | Overview |
5 | 5 | ======== |
6 | 6 | |
7 | | -Data Parallel Extension for Numba* (`numba-dpex`_) is an extension to |
8 | | -the `Numba*`_ Python JIT compiler adding an architecture-agnostic kernel |
9 | | -programming API, and a new front-end to compile the Data Parallel Extension |
10 | | -for Numpy* (`dpnp`_) library. The ``dpnp`` Python library is a data-parallel |
11 | | -implementation of `NumPy*`_'s API using the `SYCL*`_ language. |
12 | | - |
13 | | -.. ``numba-dpex``'s support for ``dpnp`` compilation is a new way for Numba* users |
14 | | -.. to write code in a NumPy-like API that is already supported by Numba*, while at |
15 | | -.. the same time automatically running such code parallelly on various types of |
16 | | -.. architecture. |
| 7 | +Data Parallel Extension for Numba* (`numba-dpex`_) is a free and open-source |
| 8 | +LLVM-based code generator for portable accelerator programming in Python. |
| 9 | +numba-dpex defines a new kernel programming domain-specific language (DSL) |
| 10 | +in pure Python called `KAPI` that is modeled after the C++ embedded DSL |
| 11 | +`SYCL*`_. |
| 12 | + |
| 13 | +The following example shows a simple pairwise distance matrix computation |
| 14 | +written in KAPI. |
| 15 | + |
| 16 | +.. code-block:: python |
| 17 | + |
| 18 | +    from numba_dpex import kernel_api as kapi |
| 19 | +    import math |
| 20 | +    import numpy as np |
| 21 | + |
| 22 | + |
| 23 | +    def pairwise_distance_kernel(item: kapi.Item, data, distance): |
| 24 | +        i = item.get_id(0) |
| 25 | +        j = item.get_id(1) |
| 26 | + |
| 27 | +        data_dims = data.shape[1] |
| 28 | + |
| 29 | +        d = data.dtype.type(0.0) |
| 30 | +        for k in range(data_dims): |
| 31 | +            tmp = data[i, k] - data[j, k] |
| 32 | +            d += tmp * tmp |
| 33 | + |
| 34 | +        distance[j, i] = math.sqrt(d) |
| 35 | + |
| 36 | + |
| 37 | +    data = np.random.ranf((10000, 3)).astype(np.float32) |
| 38 | +    distance = np.empty(shape=(data.shape[0], data.shape[0]), dtype=np.float32) |
| 39 | +    exec_range = kapi.Range(data.shape[0], data.shape[0]) |
| 40 | +    kapi.call_kernel(pairwise_distance_kernel, exec_range, data, distance) |
| 41 | + |
| 42 | +Skipping over most of the language details, at a high level the |
| 43 | +``pairwise_distance_kernel`` can be viewed as a "data-parallel" function that |
| 44 | +gets executed individually by a set of "work items". That is, each work item |
| 45 | +runs the same function for a subset of the elements of the input ``data`` and |
| 46 | +``distance`` arrays. For programmers familiar with the CUDA or OpenCL languages, |
| 47 | +it is the same programming model referred to as Single Program Multiple Data |
| 48 | +(SPMD). As Python has no concept of a work item, the KAPI function runs |
| 49 | +sequentially, resulting in a very slow execution time. Experienced Python |
| 50 | +programmers will most likely write a much faster version of the function using |
| 51 | +NumPy*. |
| 52 | + |
| 53 | +However, using a JIT compiler, numba-dpex can compile a function written in the |
| 54 | +KAPI language to a CPython native extension function that executes according to |
| 55 | +the SPMD programming model, speeding up the execution time by orders of |
| 56 | +magnitude. Currently, compilation of KAPI is possible for x86 CPU devices, |
| 57 | +Intel Gen9 integrated GPUs, Intel UHD integrated GPUs, and Intel discrete GPUs. |
| 58 | + |
17 | 59 | |
18 | 60 | ``numba-dpex`` is an open-source project and can be installed as part of `Intel |
19 | 61 | AI Analytics Toolkit`_ or the `Intel Distribution for Python*`_. The package is |
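
Note on the added overview text: it says that a KAPI function run as plain Python executes sequentially, and that an experienced Python programmer would likely write a faster NumPy version. The sketch below illustrates both points without numba-dpex installed. It is only an illustration under stated assumptions: `FakeItem` and `run_sequentially` are hypothetical helpers written for this note and are not part of the numba-dpex API, and the array size is reduced to 50 points so the Python loop finishes quickly.

```python
import math

import numpy as np


class FakeItem:
    """Hypothetical stand-in for a work item; not part of numba-dpex."""

    def __init__(self, *ids):
        self._ids = ids

    def get_id(self, dim):
        return self._ids[dim]


def pairwise_distance_kernel(item, data, distance):
    # Same body as the kernel in the diff, minus the kapi type annotation.
    i = item.get_id(0)
    j = item.get_id(1)
    d = data.dtype.type(0.0)
    for k in range(data.shape[1]):
        tmp = data[i, k] - data[j, k]
        d += tmp * tmp
    distance[j, i] = math.sqrt(d)


def run_sequentially(kernel, shape, *args):
    # Plain Python has no work items, so each index of the 2-D range
    # is visited one after another.
    for i in range(shape[0]):
        for j in range(shape[1]):
            kernel(FakeItem(i, j), *args)


def pairwise_distance_numpy(data):
    # Broadcasting forms every (i, j) difference at once.
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff * diff).sum(axis=-1))


rng = np.random.default_rng(0)
data = rng.random((50, 3)).astype(np.float32)
distance = np.empty((50, 50), dtype=np.float32)

run_sequentially(pairwise_distance_kernel, distance.shape, data, distance)
vectorized = pairwise_distance_numpy(data)
```

Both paths should produce the same matrix up to float32 rounding. The sequential driver makes one Python-level call per element of the range, which is the slow path the text describes; the broadcasting version trades that for a temporary array of shape `(n, n, 3)`.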