Closed
Description
Two-dimensional indexing of dpctl.tensor.usm_ndarray arguments inside a kernel does not work: the kernel produces an incorrect result.
Reproducer:
import dpctl
import dpctl.tensor as dpt
import numba_dpex as dpex
@dpex.kernel
def data_parallel_sum(a, b, c):
    """
    A two-dimensional vector addition example using the ``kernel`` decorator.
    """
    i = dpex.get_global_id(0)
    j = dpex.get_global_id(1)
    c[i, j] = a[i, j] + b[i, j]
# Array dimensions
X = 8
Y = 8
global_size = X, Y
device = dpctl.select_default_device()
a_dpt = dpt.arange(X * Y, dtype=dpt.float32, device=device)
a_dpt = dpt.reshape(a_dpt, (X, Y))
b_dpt = dpt.arange(X * Y, dtype=dpt.float32, device=device)
b_dpt = dpt.reshape(b_dpt, (X, Y))
c_dpt = dpt.empty_like(a_dpt)
c_dpt = dpt.reshape(c_dpt, (X, Y))
# Pass the usm_ndarray objects created above (a, b, c were undefined names)
data_parallel_sum[global_size](a_dpt, b_dpt, c_dpt)
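
For comparison, the result the kernel should produce can be sketched on the host in plain Python. This is only an illustrative reference, not part of the original report: `reference_sum` is a hypothetical helper name, and it assumes the inputs are laid out row-major as `arange(X * Y)` reshaped to `(X, Y)`, so that element `[i][j]` of the output equals `2 * (i * Y + j)`.

```python
from itertools import product


def reference_sum(x, y):
    """Host-side reference for the 2D addition kernel (hypothetical helper)."""
    # Same inputs as the reproducer: arange(x * y) reshaped to (x, y), row-major.
    a = [[float(i * y + j) for j in range(y)] for i in range(x)]
    b = [[float(i * y + j) for j in range(y)] for i in range(x)]
    c = [[0.0] * y for _ in range(x)]
    # One iteration per work item (i, j) of the two-dimensional global range.
    for i, j in product(range(x), range(y)):
        c[i][j] = a[i][j] + b[i][j]
    return c


expected = reference_sum(8, 8)
```

Copying `c_dpt` back to the host (for example with `dpt.asnumpy(c_dpt)`, assuming that function is available in the installed dpctl version) and comparing it element by element against `expected` would show which indices are computed incorrectly.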