[SYCL][NATIVECPU][DOCS] add Native CPU info to GettingStartedGuide #19768

Open · wants to merge 3 commits into `sycl`

16 changes: 16 additions & 0 deletions sycl/doc/GetStartedGuide.md
@@ -12,6 +12,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
* [Build DPC++ toolchain with support for NVIDIA CUDA](#build-dpc-toolchain-with-support-for-nvidia-cuda)
* [Build DPC++ toolchain with support for HIP AMD](#build-dpc-toolchain-with-support-for-hip-amd)
* [Build DPC++ toolchain with support for HIP NVIDIA](#build-dpc-toolchain-with-support-for-hip-nvidia)
* [Build DPC++ toolchain with support for Native CPU](#build-dpc-toolchain-with-support-for-native-cpu)
* [Build DPC++ toolchain with support for ARM processors](#build-dpc-toolchain-with-support-for-arm-processors)
* [Build DPC++ toolchain with additional features enabled that require runtime/JIT compilation](#build-dpc-toolchain-with-additional-features-enabled-that-require-runtimejit-compilation)
* [Build DPC++ toolchain with a custom Unified Runtime](#build-dpc-toolchain-with-a-custom-unified-runtime)
@@ -124,6 +125,7 @@ flags can be found by launching the script with `--help`):
* `--hip-platform` -> select the platform used by the hip backend, `AMD` or
`NVIDIA` (see [HIP AMD](#build-dpc-toolchain-with-support-for-hip-amd) or see
[HIP NVIDIA](#build-dpc-toolchain-with-support-for-hip-nvidia))
* `--native_cpu` -> use the Native CPU backend (see [Native CPU](#build-dpc-toolchain-with-support-for-native-cpu))
* `--enable-all-llvm-targets` -> build compiler (but not a runtime) with all
supported targets
* `--shared-libs` -> Build shared libraries
@@ -298,6 +300,13 @@ as well as the CUDA Runtime API to be installed, see [NVIDIA CUDA Installation
Guide for
Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).

### Build DPC++ toolchain with support for Native CPU

Native CPU is a CPU device that, by default, has no dependencies other than DPC++ itself. It works with all CPU targets supported by the DPC++ runtime.
Supported targets include x86, AArch64 and riscv_64.

To enable Native CPU in a DPC++ build, add `--native_cpu` to the set of flags passed to `configure.py`.
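
For example, assuming the `$DPCPP_HOME/llvm` checkout layout used earlier in this guide, the configure step might look like this:

```bash
# Configure a DPC++ build with the Native CPU backend enabled
# (the path assumes the $DPCPP_HOME/llvm layout from the earlier build instructions)
python $DPCPP_HOME/llvm/buildbot/configure.py --native_cpu
```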

### Build DPC++ toolchain with support for ARM processors

There is no continuous integration for this, and there are no guarantees for supported platforms or configurations.
@@ -727,6 +736,13 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
simple-sycl-app.cpp -o simple-sycl-app-cuda.exe
```

When building for Native CPU, use the SYCL target `native_cpu`:

```bash
clang++ -fsycl -fsycl-targets=native_cpu simple-sycl-app.cpp -o simple-sycl-app.exe
```

More Native CPU build options can be found in [SYCLNativeCPU.md](design/SYCLNativeCPU.md).
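
To run the resulting binary on the Native CPU device, one option is to select it through the `ONEAPI_DEVICE_SELECTOR` environment variable; the selector string below is an assumption, so check [SYCLNativeCPU.md](design/SYCLNativeCPU.md) for the authoritative run instructions.

```bash
# Select the Native CPU device when running the app
# (assumed selector value; consult SYCLNativeCPU.md if it differs)
ONEAPI_DEVICE_SELECTOR=native_cpu:cpu ./simple-sycl-app.exe
```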

**Linux & Windows (64-bit)**:

```bash
2 changes: 2 additions & 0 deletions sycl/doc/design/SYCLNativeCPU.md
@@ -91,6 +91,8 @@ Whole Function Vectorization is enabled by default, and can be controlled through
* `-mllvm -sycl-native-cpu-no-vecz`: disable Whole Function Vectorization.
* `-mllvm -sycl-native-cpu-vecz-width`: sets the vector width to the specified value, defaults to 8.

The `-march=` option can be used to select a specific target CPU, which may improve the performance of the vectorized code.
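
For example, a compile command combining these options might look like the following sketch (the source file name and vector width are illustrative, and `-march=native` simply tunes for the host CPU):

```bash
# Build for Native CPU, tuning for the host CPU and requesting a vector width of 16
clang++ -fsycl -fsycl-targets=native_cpu -march=native \
  -mllvm -sycl-native-cpu-vecz-width=16 \
  simple-sycl-app.cpp -o simple-sycl-app.exe
```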

For more details on how the Whole Function Vectorizer is integrated for SYCL Native CPU, refer to the [Technical details](#technical-details) section.

# Code coverage