Commit d6a4c94

Merge pull request #1097 from IntelPython/docs/release-0.20-rev01
More revisions and additions to the documentation b3d6cdc
1 parent ab55c18 commit d6a4c94

68 files changed

+1322
-974
lines changed

dev/_sources/api_reference/index.rst.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,3 +4,5 @@
44

55
API Reference
66
=============
7+
8+
Coming soon

dev/_sources/getting_started.rst.txt

Lines changed: 42 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -9,26 +9,40 @@ Getting Started
99
===============
1010

1111

12-
Installing pre-built packages
13-
-----------------------------
12+
Installing pre-built conda packages
13+
-----------------------------------
1414

1515
``numba-dpex`` along with its dependencies can be installed using ``conda``.
1616
It is recommended to use conda packages from the ``anaconda.org/intel`` channel
17-
to get the latest production releases. Nighly builds of ``numba-dpex`` are
18-
available on the ``dppy/label/dev`` conda channel.
17+
to get the latest production releases.
1918

2019
.. code-block:: bash
2120
22-
conda create -n numba-dpex-env numba-dpex dpnp dpctl dpcpp-llvm-spirv spirv-tools -c intel -c conda-forge
21+
conda create -n numba-dpex-env \
22+
numba-dpex dpnp dpctl dpcpp-llvm-spirv spirv-tools \
23+
-c intel -c conda-forge
24+
25+
To try out the bleeding edge, the latest packages built from the tip of the main
26+
source trunk can be installed from the ``dppy/label/dev`` conda channel.
27+
28+
.. code-block:: bash
29+
30+
conda create -n numba-dpex-env \
31+
numba-dpex dpnp dpctl dpcpp-llvm-spirv spirv-tools \
32+
-c dppy/label/dev -c intel -c conda-forge
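Once either environment has been created and activated, it can be useful to confirm that the packages are visible to the interpreter. The following is a minimal sketch that uses only the Python standard library. The helper name ``check_packages`` is hypothetical, and the import names ``numba_dpex``, ``dpnp``, and ``dpctl`` are assumed to be the conda package names with hyphens replaced by underscores.

```python
import importlib.util


def check_packages(names=("numba_dpex", "dpnp", "dpctl")):
    """Return a dict mapping each module name to True if it can be located.

    The default names are an assumption based on the conda package names.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

The check only locates the modules and does not import them, so it does not verify that device drivers or runtimes are working.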
33+
34+
2335
2436
Building from source
2537
--------------------
2638

27-
``numba-dpex`` can be built from source using either ``conda-build`` or ``setuptools``.
39+
``numba-dpex`` can be built from source using either ``conda-build`` or
40+
``setuptools``.
2841

2942
Steps to build using ``conda-build``:
3043

31-
1. Create a conda environment
44+
1. Ensure ``conda-build`` is installed in the ``base`` conda environment or
45+
create a new conda environment with ``conda-build`` installed.
3246

3347
.. code-block:: bash
3448
@@ -45,22 +59,34 @@ Steps to build using ``conda-build``:
4559

4660
.. code-block:: bash
4761
48-
conda install numba-dpex
62+
conda install -c local numba-dpex
4963
5064
Steps to build using ``setup.py``:
5165

66+
As before, a conda environment with all necessary dependencies is the suggested
67+
first step.
68+
5269
.. code-block:: bash
5370
54-
conda create -n numba-dpex-env dpctl dpnp numba spirv-tools dpcpp-llvm-spirv llvmdev pytest -c intel -c conda-forge
71+
# Create a conda environment that has the needed dependencies installed
72+
conda create -n numba-dpex-env \
73+
dpctl dpnp numba spirv-tools dpcpp-llvm-spirv llvmdev pytest \
74+
-c intel -c conda-forge
75+
# Activate the environment
5576
conda activate numba-dpex-env
77+
# Clone the numba-dpex repository
78+
git clone https://github.com/IntelPython/numba-dpex.git
79+
cd numba-dpex
80+
python setup.py develop
5681
5782
Building inside Docker
5883
----------------------
5984

60-
A Dockerfile is provided on the GitHub repository to easily build ``numba-dpex``
85+
A Dockerfile is provided on the GitHub repository to build ``numba-dpex``
6186
as well as its direct dependencies: ``dpctl`` and ``dpnp``. Users can either use
6287
one of the pre-built images on the ``numba-dpex`` GitHub page or use the
63-
bundled Dockerfile to build ``numba-dpex`` from source.
88+
bundled Dockerfile to build ``numba-dpex`` from source. Using the Dockerfile
89+
also ensures that all device drivers and runtime libraries are pre-installed.
6490

6591
Building
6692
~~~~~~~~
@@ -69,10 +95,10 @@ Numba dpex ships with multistage Dockerfile, which means there are
6995
different `targets <https://docs.docker.com/build/building/multi-stage/#stop-at-a-specific-build-stage>`_
7096
available for build. The most useful ones:
7197

72-
- runtime
73-
- runtime-gpu
74-
- numba-dpex-builder-runtime
75-
- numba-dpex-builder-runtime-gpu
98+
- ``runtime``
99+
- ``runtime-gpu``
100+
- ``numba-dpex-builder-runtime``
101+
- ``numba-dpex-builder-runtime-gpu``
76102

77103
To build docker image
78104

@@ -96,7 +122,7 @@ To run docker image
96122
``GITHUB_USER`` and ``GITHUB_PASSWORD``
97123
`build args <https://docs.docker.com/engine/reference/commandline/build/#build-arg>`_
98124
to increase the call limit. A GitHub
99-
`access token <https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token>`
125+
`access token <https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token>`_
100126
can also be used instead of the password.
101127

102128
.. note::

dev/_sources/overview.rst.txt

Lines changed: 27 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -15,23 +15,23 @@ implementation of `NumPy*`_'s API using the `SYCL*`_ language.
1515
.. the same time automatically running such code parallelly on various types of
1616
.. architecture.
1717
18-
``numba-dpex`` is developed as part of `Intel AI Analytics Toolkit`_ and
19-
is distributed with the `Intel Distribution for Python*`_. The extension is
20-
available on Anaconda cloud and as a Docker image on GitHub. Please refer the
21-
:doc:`getting_started` page to learn more.
18+
``numba-dpex`` is an open-source project and can be installed as part of `Intel
19+
AI Analytics Toolkit`_ or the `Intel Distribution for Python*`_. The package is
20+
also available on Anaconda cloud and as a Docker image on GitHub. Please refer to
21+
the :doc:`getting_started` page to learn more.
2222

2323
Main Features
2424
-------------
2525

2626
Portable Kernel Programming
2727
~~~~~~~~~~~~~~~~~~~~~~~~~~~
2828

29-
The ``numba-dpex`` kernel API has a design and API similar to Numba's
29+
The ``numba-dpex`` kernel programming API has a design similar to Numba's
3030
``cuda.jit`` sub-module. The API is modeled after the `SYCL*`_ language and uses
3131
the `DPC++`_ SYCL runtime. Currently, compilation of kernels is supported for
3232
SPIR-V-based OpenCL and `oneAPI Level Zero`_ CPU and GPU devices. In the
33-
future, the API can be extended to other architectures that are supported by
34-
DPC++.
33+
future, compilation support for other types of hardware that are supported by
34+
DPC++ will be added.
3535

3636
The following example illustrates a vector addition kernel written with
3737
``numba-dpex`` kernel API.
@@ -56,31 +56,33 @@ The following example illustrates a vector addition kernel written with
5656
print(c)
5757
5858
In the above example, three arrays are allocated on a default ``gpu`` device
59-
using the ``dpnp`` library. These arrays are then passed as input arguments to
60-
the kernel function. The compilation target and the subsequent execution of the
61-
kernel is determined completely by the input arguments and follow the
59+
using the ``dpnp`` library. The arrays are then passed as input arguments to the
60+
kernel function. The compilation target and the subsequent execution of the
61+
kernel are determined by the input arguments and follow the
6262
"compute-follows-data" programming model as specified in the `Python* Array API
6363
Standard`_. To change the execution target to a CPU, the device keyword needs to
6464
be changed to ``cpu`` when allocating the ``dpnp`` arrays. It is also possible
6565
to leave the ``device`` keyword undefined and let the ``dpnp`` library select a
6666
default device based on environment flag settings. Refer to the
6767
:doc:`user_guide/kernel_programming/index` for further details.
6868

69-
``dpnp`` compilation support
70-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71-
72-
``numba-dpex`` extends Numba's type system and compilation pipeline to compile
73-
``dpnp`` functions and expressions in the same way as NumPy. Unlike Numba's
74-
NumPy compilation that is serial by default, ``numba-dpex`` always compiles
75-
``dpnp`` expressions into data-parallel kernels and executes them in parallel.
76-
The ``dpnp`` compilation feature is provided using a decorator ``dpjit`` that
77-
behaves identically to ``numba.njit(parallel=True)`` with the addition of
78-
``dpnp`` compilation and kernel offloading. Offloading by ``numba-dpex`` is not
79-
just restricted to CPUs and supports all devices that are presently supported by
80-
the kernel API. ``dpjit`` allows using NumPy and ``dpnp`` expressions in the
81-
same function. All NumPy compilation and parallelization is done via the default
82-
Numba code-generation pipeline, whereas ``dpnp`` expressions are compiled using
83-
the ``numba-dpex`` pipeline.
69+
``dpjit`` decorator
70+
~~~~~~~~~~~~~~~~~~~
71+
72+
The ``numba-dpex`` package provides a new decorator ``dpjit`` that extends
73+
Numba's ``njit`` decorator. The new decorator is equivalent to
74+
``numba.njit(parallel=True)``, but additionally supports compiling ``dpnp``
75+
functions, ``prange`` loops, and array expressions that use ``dpnp.ndarray``
76+
objects.
77+
78+
Unlike Numba's NumPy parallelization that only supports CPUs, ``dpnp``
79+
expressions are first converted to data-parallel kernels and can then be
80+
`offloaded` to different types of devices. As ``dpnp`` implements the same API
81+
as NumPy*, an existing ``numba.njit`` decorated function that uses
82+
``numpy.ndarray`` may be refactored to use ``dpnp.ndarray`` and decorated with
83+
``dpjit``. Such a refactoring can allow the parallel regions to be offloaded
84+
to a supported GPU device, providing users an additional option to execute their
85+
code in parallel.
8486

8587
The vector addition example depicted using the kernel API can also be
8688
expressed in several different ways using ``dpjit``.
Lines changed: 34 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -1,48 +1,49 @@
1-
.. _caching:
21
.. include:: ./../ext_links.txt
32

4-
5-
Caching Mechanism in Numba-dpex
6-
================================
3+
Caching Mechanism in ``numba-dpex``
4+
===================================
75

86
Caching is done by saving the compiled kernel code, the ELF object of the
97
executable code. By using the kernel code, cached kernels have minimal overhead
108
because no compilation is needed.
119

12-
Unlike Numba, we do not perform file-based caching, instead we use a
13-
Least Recently Used (LRU) caching mechanism. However when a kernel needs to be
14-
evicted, we utilize numba's file-based caching mechanism described
15-
`here <https://numba.pydata.org/numba-doc/latest/developer/caching.html>`_.
10+
Unlike Numba*, ``numba-dpex`` does not rely exclusively on file-based caching;
11+
instead, a Least Recently Used (LRU) caching mechanism is used. However, when a
12+
kernel needs to be evicted, Numba's file-based caching mechanism is invoked as
13+
described `here
14+
<https://numba.pydata.org/numba-doc/latest/developer/caching.html>`_.
1615

1716
Algorithm
18-
---------
19-
20-
The caching mechanism for Numba-dpex works as follows: The cache is an LRU cache
21-
backed by an ordered dictionary mapped onto a doubly linked list. The tail of
22-
the list contains the most recently used (MRU) kernel and the head of the list
23-
contains the least recently used (LRU) kernel. The list has a fixed size. If a
24-
new kernel arrives to be cached and if the size is already on the maximum limit,
25-
the algorithm evicts the LRU kernel to make room for the MRU kernel. The evicted
26-
item will be serialized and pickled into a file using Numba's caching mechanism.
27-
28-
Everytime whenever a kernel needs to be retrieved from the cache, the mechanism
29-
will look for the kernel in the cache and will be loaded if it's already present.
30-
However, if the program is seeking for a kernel that has been evicted, the
31-
algorithm will load it from the file and enqueue in the cache.
17+
----------
18+
19+
The caching mechanism for ``numba-dpex`` works as follows: The cache is an LRU
20+
cache backed by an ordered dictionary mapped onto a doubly linked list. The tail
21+
of the list contains the most recently used (MRU) kernel and the head of the
22+
list contains the least recently used (LRU) kernel. The list has a fixed size.
23+
If a new kernel arrives to be cached and the cache is already at the maximum
24+
limit, the algorithm evicts the LRU kernel to make room for the MRU kernel. The
25+
evicted item will be serialized and pickled into a file using Numba's caching
26+
mechanism.
27+
28+
Every time a kernel needs to be retrieved, the mechanism first looks for it in
29+
the in-memory cache and loads it from there if it is present. However, if the
30+
program requests a kernel that has been evicted, the algorithm loads it from
31+
the file and enqueues it in the cache. As a result, the number of file
32+
operations is significantly lower than that of Numba.
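The eviction and reload behavior described in this section can be sketched with a small, self-contained example. This is an illustration only and not the ``numba-dpex`` implementation: the class name ``SpillingLRUCache``, the use of plain ``pickle`` files, and the string keys used as file names are all assumptions made for the sketch.

```python
import pickle
import tempfile
from collections import OrderedDict
from pathlib import Path


class SpillingLRUCache:
    """Toy LRU cache that pickles evicted entries to disk (illustrative only)."""

    def __init__(self, capacity, cache_dir):
        self.capacity = capacity
        self.cache_dir = Path(cache_dir)
        self.entries = OrderedDict()  # first item = LRU, last item = MRU

    def _path(self, key):
        # Assumes keys are short strings that are safe to use as file names.
        return self.cache_dir / f"{key}.pkl"

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry and save it to a file.
            old_key, old_value = self.entries.popitem(last=False)
            self._path(old_key).write_bytes(pickle.dumps(old_value))

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        path = self._path(key)
        if path.exists():
            # Previously evicted: reload from disk and re-enqueue.
            value = pickle.loads(path.read_bytes())
            self.put(key, value)
            return value
        return None  # never cached: a caller would compile the kernel here


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = SpillingLRUCache(capacity=2, cache_dir=tmp)
        cache.put("k1", "binary-1")
        cache.put("k2", "binary-2")
        cache.put("k3", "binary-3")   # evicts k1 to disk
        print(list(cache.entries))    # ['k2', 'k3']
        print(cache.get("k1"))        # binary-1 (reloaded; evicts k2)
        print(list(cache.entries))    # ['k3', 'k1']
```

In this sketch the disk is touched only on eviction or when an evicted entry is requested again, which is the property the section attributes to the LRU design.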
3233

3334
Settings
34-
--------
35+
---------
3536

36-
Therefore, we employ similar environment variables as used in Numba,
37-
i.e. ``NUMBA_CACHE_DIR`` etc. However we add three more environment variables to
38-
control the caching mechanism.
37+
``numba-dpex`` therefore uses environment variables similar to those of Numba,
38+
e.g. ``NUMBA_CACHE_DIR``. In addition, three more environment variables
39+
control the caching mechanism.
3940

40-
- In order to specify cache capacity, one can use ``NUMBA_DPEX_CACHE_SIZE``.
41-
By default, it's set to 10.
41+
- In order to specify cache capacity, ``NUMBA_DPEX_CACHE_SIZE`` can be used. By
42+
default, it's set to 10.
4243

43-
- ``NUMBA_DPEX_ENABLE_CACHE`` can be used to enable/disable the caching mechanism.
44-
By default it's enabled, i.e. set to 1.
44+
- ``NUMBA_DPEX_ENABLE_CACHE`` can be used to enable/disable the caching
45+
mechanism. By default it's enabled, i.e. set to 1.
4546

46-
- In order to enable the debugging messages related to caching, one can set
47-
``NUMBA_DPEX_DEBUG_CACHE`` to 1. All environment variables are defined in
48-
:file:`numba_dpex/config.py`.
47+
- In order to enable the debugging messages related to caching, the variable
48+
``NUMBA_DPEX_DEBUG_CACHE`` can be set to 1. All environment variables are
49+
defined in :file:`numba_dpex/config.py`.
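As a rough illustration of how the three variables and their defaults fit together, the sketch below reads them from a mapping such as ``os.environ``. The helper ``read_cache_settings`` is hypothetical and is not part of ``numba-dpex``; the defaults for the size (10) and the enable flag (1) are taken from the text above, while the default of ``0`` for ``NUMBA_DPEX_DEBUG_CACHE`` is an assumption.

```python
import os


def read_cache_settings(env=None):
    """Collect the cache-related variables into a dict (illustrative only)."""
    env = os.environ if env is None else env
    return {
        # Enabled by default, i.e. "1".
        "enabled": env.get("NUMBA_DPEX_ENABLE_CACHE", "1") == "1",
        # Capacity of the LRU cache; 10 by default per this page.
        "size": int(env.get("NUMBA_DPEX_CACHE_SIZE", "10")),
        # Debug messages; a default of "0" is assumed here.
        "debug": env.get("NUMBA_DPEX_DEBUG_CACHE", "0") == "1",
    }


if __name__ == "__main__":
    print(read_cache_settings({"NUMBA_DPEX_CACHE_SIZE": "32"}))
```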
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
.. include:: ./../ext_links.txt
2+
3+
Configuration Options for ``numba-dpex``
4+
========================================
5+
6+
``numba-dpex`` provides a set of environment variables and flags for configuring different aspects of the compilation, debugging and execution of programs. The configuration flags of ``numba-dpex`` are mostly inherited from those of Numba*. They are defined in :file:`numba_dpex/core/config.py`.
7+
8+
.. note::
9+
To set any of the configuration flags, the ``NUMBA_DPEX_`` prefix needs to be
10+
prepended to the flag name. For example, to turn the ``SAVE_IR_FILES`` flag
11+
on, set the environment variable ``NUMBA_DPEX_SAVE_IR_FILES=1``.
12+
13+
For example:
14+
15+
.. code-block:: bash
16+
17+
NUMBA_DPEX_SAVE_IR_FILES=1 python numba_dpex_program.py
18+
19+
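The same prefixing rule can be applied when a program is launched from Python rather than from a shell. The sketch below is one possible approach under stated assumptions: the helper ``prefixed_env`` is hypothetical, and the child process is a stand-in that only echoes the variable instead of running a real ``numba-dpex`` program.

```python
import os
import subprocess
import sys


def prefixed_env(flags, base=None):
    """Return a copy of the environment with NUMBA_DPEX_-prefixed flags added."""
    env = dict(os.environ if base is None else base)
    for name, value in flags.items():
        env[f"NUMBA_DPEX_{name}"] = str(value)
    return env


if __name__ == "__main__":
    env = prefixed_env({"SAVE_IR_FILES": 1})
    # Stand-in child process that only echoes the variable; a real run would
    # invoke the numba-dpex program instead.
    child = "import os; print(os.environ['NUMBA_DPEX_SAVE_IR_FILES'])"
    subprocess.run([sys.executable, "-c", child], env=env, check=True)
```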
20+
The available configuration flags are as follows:
21+
22+
``SAVE_IR_FILES``:
23+
A flag to save the Numba* intermediate representation (IR) files generated by the compiler. Set to ``0`` by default.
24+
25+
``SPIRV_VAL``:
26+
A flag to turn on Numba*'s ``SPIRV-VALIDATION`` switch. Set to ``0`` by default.
27+
28+
``OFFLOAD_DIAGNOSTICS``:
29+
A flag to dump the offload diagnostics. Set to ``0`` by default.
30+
31+
``NATIVE_FP_ATOMICS``:
32+
A flag to activate native floating point (FP) atomics support on supported devices. Requires an ``llvm-spirv`` that supports the FP atomics extension. Set to ``0`` by default.
33+
34+
``DEBUG``:
35+
A flag to emit the debug info, inherited from |numba.core.config.DEBUG|_.
36+
37+
``DEBUGINFO_DEFAULT``:
38+
The default value for the `debug` flag. Inherited from |numba.core.config.DEBUGINFO_DEFAULT|_.
39+
40+
``DUMP_KERNEL_LLVM``:
41+
A flag to emit LLVM assembly language format (``.ll``). Inherited from |numba.core.config.DUMP_OPTIMIZED|_.
42+
43+
``ENABLE_CACHE``:
44+
A flag to enable caching. Set ``NUMBA_DPEX_ENABLE_CACHE=0`` to turn it off. Set to ``1`` by default.
45+
46+
``CACHE_SIZE``:
47+
A flag to specify the default cache size. Set to ``20`` by default.
48+
49+
``DEBUG_CACHE``:
50+
A flag to enable debugging of the caching mechanism. Set to ``1`` to turn it on.
51+
52+
``STATIC_LOCAL_MEM_PASS``:
53+
A flag to turn on the ``ConstantSizeStaticLocalMemoryPass`` in the kernel pipeline. The pass is turned off by default.
54+
55+
56+
.. |numba.core.config.DEBUG| replace:: ``numba.core.config.DEBUG``
57+
.. |numba.core.config.DEBUGINFO_DEFAULT| replace:: ``numba.core.config.DEBUGINFO_DEFAULT``
58+
.. |numba.core.config.DUMP_OPTIMIZED| replace:: ``numba.core.config.DUMP_OPTIMIZED``
59+
60+
.. _`numba.core.config.DEBUG`: https://github.com/numba/numba/blob/main/numba/core/config.py#L202
61+
.. _`numba.core.config.DEBUGINFO_DEFAULT`: https://github.com/numba/numba/blob/main/numba/core/config.py#L488
62+
.. _`numba.core.config.DUMP_OPTIMIZED`: https://github.com/numba/numba/blob/main/numba/core/config.py#L301

dev/_sources/user_guide/debugging/altering.rst.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
.. _altering:
1+
.. include:: ./../../ext_links.txt
22

33
Altering Execution
44
==================

dev/_sources/user_guide/debugging/backtrace.rst.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
.. _backtrace:
1+
.. include:: ./../../ext_links.txt
22

33
Backtrace
44
==========

dev/_sources/user_guide/debugging/breakpoints.rst.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
.. include:: ./../../ext_links.txt
2+
13
Breakpoints
24
===========
35

dev/_sources/user_guide/debugging/common_issues.rst.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
.. include:: ./../../ext_links.txt
2+
13
Common issues and tips
24
======================
35

dev/_sources/user_guide/debugging/data.rst.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
.. include:: ./../../ext_links.txt
2+
13
Examining Data
24
==============
35
