Refactor/kernel interfaces #804
@mingjie-intel @chudur-budur All existing tests except caching work with the new API. I have updated the description to capture pending TODOs. Any early feedback will be very helpful.
    extra_compile_flags=extra_compile_flags,
)

self._target_context = cres.target_context
@diptorupd Why are we not doing the same here as is done in _compile()?
np.copyto(obj._orig_val, obj._packed_val)

def __init__(
    self, kernel_name, arg_list, argty_list, access_specifiers_list, queue
I think we need to use consistent variable naming, for example pyfunc_name instead of kernel_name.
    compile_flags=None,
    array_access_specifiers=None,
):
    self.typingctx = dpex_target.typing_context
dpex_target is a global object coming from an external module. Why do we always keep it in a local variable? Is there a specific reason for this?
)
func = cres.library.get_function(cres.fndesc.llvm_func_name)
cres.target_context.mark_ocl_device(func)
devfn = DpexFunction(cres)
Do we need to cache the compiled function?
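To make the question concrete, here is a minimal sketch of what caching a compiled device function could look like. It is not the numba_dpex implementation: `CompileCache`, `fake_compile`, and the choice of cache key (function identity plus argument types, with no backend or device type) are all assumptions made for illustration.

```python
class CompileCache:
    """Hypothetical in-memory cache keyed on the Python function and its argument types."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn  # callable that performs the real compilation
        self._entries = {}
        self.misses = 0

    def get(self, pyfunc, argtypes):
        # Assumed key: module, qualified name and argument types only.
        key = (pyfunc.__module__, pyfunc.__qualname__, tuple(argtypes))
        try:
            return self._entries[key]
        except KeyError:
            # Compile once per distinct signature, then reuse the result.
            self.misses += 1
            compiled = self._compile_fn(pyfunc, argtypes)
            self._entries[key] = compiled
            return compiled


def fake_compile(pyfunc, argtypes):
    """Stand-in for a compiler call; returns a tuple describing the 'binary'."""
    return ("compiled", pyfunc.__qualname__, tuple(argtypes))


def axpy(a, x, y):
    return a * x + y


cache = CompileCache(fake_compile)
first = cache.get(axpy, ["float32", "float32", "float32"])
second = cache.get(axpy, ["float32", "float32", "float32"])  # served from the cache
third = cache.get(axpy, ["int64", "int64", "int64"])         # new signature, recompiled
```

With this layout the second lookup does not trigger a compilation, while a different argument-type signature does.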
- The compute-follows-data check is now based on queue equality.
- USMNdArray no longer requires usm_type and device during construction. This allows us to specialize a usm_ndarray only on ndims, layout and dtype.
- No compute-follows-data check for eager compilation.
- Change caching to not require backend and device type.
- Fixes to test cases.
- DEFAULT_LOCAL_SIZE is deprecated, and users are warned to provide a valid local range for nd_range kernels.
- Removed the global_range and local_range keyword arguments from JitKernel.__call__().
- Undeprecate the JitKernel.__getitem__ call.
- Fix and improve how arguments to JitKernel.__call__() are parsed to extract the global_range and local_range.
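The two ideas in the list above, a queue-equality check and extracting the global and local ranges from an index expression, can be sketched in plain Python. This is an illustration only and does not reproduce the parsing rules in JitKernel. `FakeQueue`, `FakeArray`, `common_queue`, `normalize_range` and `parse_ranges` are hypothetical names, and the accepted index forms (a bare range, or a pair of tuple/list ranges) are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeQueue:
    """Hypothetical stand-in for an execution queue; equal when the device matches."""
    device: str


@dataclass
class FakeArray:
    """Hypothetical stand-in for an array that remembers its allocation queue."""
    queue: FakeQueue


def common_queue(arrays):
    """Return the queue shared by all arrays, or raise if the queues differ."""
    queues = [a.queue for a in arrays]
    if not queues:
        raise ValueError("no array arguments to infer a queue from")
    first = queues[0]
    if any(q != first for q in queues[1:]):
        raise ValueError("array arguments were allocated on different queues")
    return first


def normalize_range(r):
    """Turn an int or a sequence of ints into a tuple of 1 to 3 positive ints."""
    dims = (r,) if isinstance(r, int) else tuple(r)
    if not 1 <= len(dims) <= 3 or not all(isinstance(d, int) and d > 0 for d in dims):
        raise ValueError(f"invalid range: {r!r}")
    return dims


def parse_ranges(index):
    """Split the value given to kernel[...] into (global_range, local_range)."""
    is_pair = (
        isinstance(index, tuple)
        and len(index) == 2
        and all(isinstance(part, (tuple, list)) for part in index)
    )
    if is_pair:
        # Assumed form: kernel[(g0, g1), (l0, l1)]
        global_range, local_range = (normalize_range(p) for p in index)
        if len(global_range) != len(local_range):
            raise ValueError("global and local ranges must have the same rank")
        if any(g % l for g, l in zip(global_range, local_range)):
            raise ValueError("each local size must evenly divide the global size")
    else:
        # Assumed form: kernel[n] or kernel[g0, g1], global range only
        global_range, local_range = normalize_range(index), None
    return global_range, local_range
```

Under these assumptions, `parse_ranges(((8, 8), (4, 4)))` yields a 2-D global and local range, while `parse_ranges(16)` yields a 1-D global range with no local range.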
Merging as TeamCity CI is all green.
Refactor/kernel interfaces 187782d
I don't have a minimal reproducer yet, but I can say that this fixes the issues I've had after IntelPython#804.
Have you provided a meaningful PR description?
numba_dpex\compiler.py mixes both things, making it hard to separate compute-follows-data based kernel launch from legacy dpctl.device_context based behavior.
- dpctl.device_context for kernels.
- __getitem__ to provide global and local ranges for a kernel launch. (to be reevaluated: Deprecate __getitem__ support in numba_dpex.kernel #790)
- numba_dpex.core.kernel_interface.DpexFunc to new API
- numba_dpex.compiler.py
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
If this PR is a work in progress, are you filing the PR as a draft?
Fixes #814, #816, #780, #810