Commit c1f63b2

author
iclsrc
committed
Merge from 'sycl' to 'sycl-web' (4 commits)
2 parents 290c27c + fcc330e commit c1f63b2

19 files changed

+1847
-1598
lines changed
Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
# Considerations for working on KHR extensions
2+
3+
The SYCL specification evolves by incorporating extensions developed by various
4+
vendors, including the Khronos Group itself (`khr` extensions).
5+
6+
In order for a KHR extension to be accepted, there must be CTS tests for it and
7+
at least one implementation which passes them.
8+
9+
Considering that KHR extensions are being developed in public, we can start
10+
prototyping them as soon as corresponding PR for an extension is published at
11+
KhronosGroup/SYCL-Docs.
12+
13+
However, we shouldn't be exposing those extensions to end users until the
14+
extension is finalised, ratified and published by Khronos - due to risk of an
15+
extension changing during that process and lack of the officially published
16+
version of it.
17+
18+
So, we can have a PR but can't merge it. Keeping PRs open for a long time is a
19+
bad practice, because they tend to get stale: there are merge conflicts,
20+
potential functional issues due to the codebase changes, etc.
21+
22+
In order for us to avoid stale PRs, all functionality which is a public
23+
interface of an "in-progress" KHR extension must be hidden under the
24+
`__DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS` macro. That way we can merge a PR to
25+
avoid constantly maintaining it in a good shape, start automatically testing it
26+
but at the same time avoid exposing an incomplete and/or undocumented feature to
27+
end users just yet.
28+
29+
The term "in-progress" KHR extension used above means either of the following:
30+
- PR proposing a KHR extension has not been merged/cherry-picked to `sycl-2020`
31+
branch of KhronosGroup/SYCL-Docs.
32+
33+
That only happens after all formal processes on Khronos Group side are
34+
completed so an extension can be considered good and stable to be released by
35+
us.
36+
37+
Note: merge of an extension proposal PR into `main` branch of
38+
KhronosGroup/SYCL-Docs repo is **not** enough.
39+
- Published (i.e. the above bullet complete) KHR extension, which hasn't been
40+
fully implemented by us
41+
42+
The macro is **not** intended to be used by end users and its purpose is to
43+
simplify our development process by allowing us to merge implementation (full
44+
or partial) of the aforementioned extensions earlier to simplify maintenance and
45+
enable automated testing.
46+
47+
For this reason, we are not providing a separate macro for each "in-progress"
48+
KHR extension we may (partially) support, but just a single guard.
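
The guarding scheme described in this file can be sketched roughly as follows. This is only an illustration: the namespace `khr_demo` and the functions `stable_feature`, `unfinished_feature` and `unfinished_khr_visible` are hypothetical names that do not come from the commit; only the macro name `__DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS` is taken from the text.

```cpp
#include <cassert>

// Hypothetical namespace and function names, for illustration only.
namespace khr_demo {

// Always available: stands in for an already published interface.
inline int stable_feature() { return 1; }

#ifdef __DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS
// Stands in for the public interface of an "in-progress" KHR extension.
// It is compiled only when the single guard macro is defined.
inline int unfinished_feature() { return 2; }
#endif

// Reports whether the guarded interface is visible in this translation unit.
constexpr bool unfinished_khr_visible() {
#ifdef __DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS
  return true;
#else
  return false;
#endif
}

} // namespace khr_demo
```

Under this sketch, automated tests would presumably define the macro on the compiler command line (for example with `-D__DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS`), while a default end-user build never sees `unfinished_feature`.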

sycl/doc/extensions/experimental/sycl_ext_oneapi_group_load_store.asciidoc

Lines changed: 47 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -101,11 +101,13 @@ in the group.
101101
and default constructible.
102102
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`
103103

104+
_Mandates_: If `Properties` contains the `alignment` property, `InputIteratorT` must be a pointer.
105+
104106
_Effects_: Loads single element from `in_iter` to `out` by using the `g` group
105107
object to identify memory location as `in_iter` + `g.get_local_linear_id()`.
106108

107-
Properties may provide xref:optimization_properties[assertions] which can
108-
enable better optimizations.
109+
Properties may provide xref:optimization_properties[assertions] or the `alignment` property
110+
which can enable better optimizations.
109111

110112
==== `sycl::vec` Overload
111113

@@ -132,6 +134,8 @@ in the group.
132134
and default constructible.
133135
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`
134136

137+
_Mandates_: If `Properties` contains the `alignment` property, `InputIteratorT` must be a pointer.
138+
135139
_Effects_: Loads `N` elements from `in_iter` to `out`
136140
using the `g` group object.
137141
Properties may specify xref:data_placement[data placement].
@@ -140,8 +144,9 @@ Default data placement is a blocked one:
140144
in striped case:
141145
`out[i]` = `in_iter[g.get_local_linear_id() + g.get_local_linear_range() * i];`
142146
for `i` between `0` and `N`.
143-
Properties may also provide xref:optimization_properties[assertions] which can
144-
enable better optimizations.
147+
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
148+
which can enable better optimizations.
149+
145150

146151
==== Fixed-size Array Overload
147152

@@ -169,6 +174,8 @@ work-group or sub-group.
169174
and default constructible.
170175
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`
171176

177+
_Mandates_: If `Properties` contains the `alignment` property, `InputIteratorT` must be a pointer.
178+
172179
_Effects_: Loads `ElementsPerWorkItem` elements from `in_iter` to `out`
173180
using the `g` group object.
174181
Properties may specify xref:data_placement[data placement].
@@ -177,8 +184,9 @@ Default placement is a blocked one:
177184
in striped case:
178185
`out[i]` = `in_iter[g.get_local_linear_id() + g.get_local_linear_range() * i];`
179186
for `i` between `0` and `ElementsPerWorkItem`.
180-
Properties may also provide xref:optimization_properties[assertions] which can
181-
enable better optimizations.
187+
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
188+
which can enable better optimizations.
189+
182190

183191

184192
=== Store API
@@ -209,11 +217,13 @@ in the group.
209217
and default constructible.
210218
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`
211219

220+
_Mandates_: If `Properties` contains the `alignment` property, `OutputIteratorT` must be a pointer.
221+
212222
_Effects_: Stores single element `in` to `out_iter` by using the `g` group
213223
object to identify memory location as `out_iter` + `g.get_local_linear_id()`
214224

215-
Properties may provide xref:optimization_properties[assertions] which can
216-
enable better optimizations.
225+
Properties may provide xref:optimization_properties[assertions] or the `alignment` property
226+
which can enable better optimizations.
217227

218228

219229
==== `sycl::vec` Overload
@@ -241,6 +251,8 @@ in the group.
241251
and default constructible.
242252
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`
243253

254+
_Mandates_: If `Properties` contains the `alignment` property, `OutputIteratorT` must be a pointer.
255+
244256
_Effects_: Stores `N` elements from `in` vec to `out_iter`
245257
using the `g` group object.
246258
Properties may specify xref:data_placement[data placement].
@@ -249,8 +261,8 @@ Default placement is a blocked one:
249261
in striped case:
250262
`out_iter[g.get_local_linear_id() + g.get_local_linear_range() * i]` = `in[i];`
251263
for `i` between `0` and `N`.
252-
Properties may also provide xref:optimization_properties[assertions] which can
253-
enable better optimizations.
264+
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
265+
which can enable better optimizations.
254266

255267

256268
==== Fixed-size Array Overload
@@ -280,6 +292,8 @@ work-group or sub-group.
280292
and default constructible.
281293
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`
282294

295+
_Mandates_: If `Properties` contains the `alignment` property, `OutputIteratorT` must be a pointer.
296+
283297
_Effects_: Stores `ElementsPerWorkItem` elements from `in` span to `out_iter`
284298
using the `g` group object.
285299

@@ -289,8 +303,9 @@ Default placement is a blocked one:
289303
in striped case:
290304
`out_iter[g.get_local_linear_id() + g.get_local_linear_range() * i]` = `in[i];`
291305
for `i` between `0` and `ElementsPerWorkItem`.
292-
Properties may also provide xref:optimization_properties[assertions] which can
293-
enable better optimizations.
306+
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
307+
which can enable better optimizations.
308+
294309

295310
=== Data Placement
296311

@@ -442,6 +457,23 @@ so the implementation can rely on `get_max_local_range()` range size:
442457

443458
If partition is uneven the behavior is undefined.
444459

460+
== Alignment
461+
462+
If `InputIteratorT`/`OutputIteratorT` is a pointer then the following property can be used
463+
to specify the alignment of the pointer. This can allow the implementation to avoid a dynamic alignment check.
464+
465+
```c++
466+
namespace sycl::ext::oneapi::experimental {
467+
struct alignment_key {
468+
template <int K>
469+
using value_t = property_value<alignment_key, std::integral_constant<int, K>>;
470+
};
471+
472+
template<int K>
473+
inline constexpr alignment_key::value_t<K> alignment;
474+
} // namespace sycl::ext::oneapi::experimental
475+
```
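
To illustrate why a compile-time alignment value can remove a run-time check, here is a minimal host-only sketch. It does not use the DPC++ property types: `demo::alignment_value`, `demo::alignment`, `demo::is_aligned` and `demo::use_wide_path` are hypothetical stand-ins, and the 16-byte threshold is an assumption borrowed from the usage example in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins; these are NOT the extension's real types.
namespace demo {

// Compile-time alignment value, modelled on std::integral_constant<int, K>.
template <int K>
struct alignment_value : std::integral_constant<int, K> {
  static_assert(K > 0 && (K & (K - 1)) == 0,
                "alignment must be a positive power of two");
};

template <int K>
inline constexpr alignment_value<K> alignment{};

// The dynamic check that a caller-supplied alignment makes unnecessary.
inline bool is_aligned(const void* p, std::size_t bytes) {
  return reinterpret_cast<std::uintptr_t>(p) % bytes == 0;
}

// No property given: the decision needs a run-time check of the pointer.
template <typename T>
bool use_wide_path(const T* p) {
  return is_aligned(p, 16);
}

// Alignment given: the decision depends only on K, known at compile time.
// Taking a raw pointer mirrors the rule that the iterator must be a pointer.
template <typename T, int K>
constexpr bool use_wide_path(const T*, alignment_value<K>) {
  return K >= 16;
}

} // namespace demo
```

The second overload never inspects the pointer value, which is the kind of saving the property is meant to enable; the caller remains responsible for the pointer actually having the promised alignment.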
476+
445477
== Usage Example
446478

447479
Example shows the simplest case without local memory usage of blocked load
@@ -458,8 +490,8 @@ constexpr std::size_t block_count = 2;
458490
constexpr std::size_t size = block_count * block_size * items_per_thread;
459491
460492
sycl::queue q;
461-
T* input = sycl::malloc_device<T>(size, q);
462-
T* output = sycl::malloc_device<T>(size, q);
493+
T* input = sycl::aligned_alloc_device<T>(16, size, q);
494+
T* output = sycl::aligned_alloc_device<T>(16, size, q);
463495
464496
q.submit([&](sycl::handler& cgh) {
465497
cgh.parallel_for(
@@ -472,7 +504,7 @@ q.submit([&](sycl::handler& cgh) {
472504
auto offset = g.get_group_id(0) * g.get_local_range(0) *
473505
items_per_thread;
474506
475-
auto props = sycl_exp::properties{sycl_exp::contiguous_memory};
507+
auto props = sycl_exp::properties{sycl_exp::contiguous_memory, sycl_exp::alignment<16>};
476508
477509
sycl_exp::group_load(g, input + offset, sycl::span{ data }, props);
478510

sycl/doc/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -66,3 +66,4 @@ Developer Documentation
6666
developer/DockerBKMs
6767
developer/ABIPolicyGuide
6868
developer/ContributeToDPCPP
69+
developer/KHRExtensions

sycl/include/sycl/builtins_utils_scalar.hpp

Lines changed: 0 additions & 73 deletions
This file was deleted.

sycl/include/sycl/builtins_utils_vec.hpp

Lines changed: 0 additions & 79 deletions
This file was deleted.
