@NeoZhangJianyu
Collaborator

  1. Use the values cached in ggml_sycl_device_info() instead of calling get_work_group_size(), which performs poorly.
  2. Remove the function get_work_group_size().
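The idea behind item 1 can be sketched in plain C++ without a SYCL toolchain. This is only an illustration under stated assumptions, not the code in this PR: `device_info_sketch`, `query_max_work_group_size_slow()`, `cached_device_info()`, `cached_work_group_size()`, the device count, and the returned sizes are all hypothetical stand-ins. In real SYCL code the slow query would be something like `device.get_info<sycl::info::device::max_work_group_size>()`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; these are NOT the actual ggml-sycl definitions.
constexpr int kMaxDevices = 4;

struct device_info_sketch {
    int device_count = 0;
    std::array<std::size_t, kMaxDevices> max_work_group_sizes{};
};

// Counts how often the expensive query runs (for demonstration only).
static int g_slow_query_calls = 0;

// Placeholder for a per-call runtime query. The values are made up.
static std::size_t query_max_work_group_size_slow(int device_id) {
    ++g_slow_query_calls;
    return 256u * static_cast<std::size_t>(device_id + 1);
}

// Builds the table once on first use; later calls return the cached copy.
static const device_info_sketch & cached_device_info() {
    static const device_info_sketch info = [] {
        device_info_sketch d;
        d.device_count = 2;  // assumed device count for this sketch
        for (int i = 0; i < d.device_count; ++i) {
            d.max_work_group_sizes[i] = query_max_work_group_size_slow(i);
        }
        return d;
    }();
    return info;
}

// Cheap lookup that kernel-launch code could call on every invocation.
static std::size_t cached_work_group_size(int device_id) {
    const device_info_sketch & info = cached_device_info();
    assert(device_id >= 0 && device_id < info.device_count);
    return info.max_work_group_sizes[device_id];
}
```

With this shape, callers such as the norm and softmax launch paths would read `cached_work_group_size(device_id)` on each launch, and the expensive query would run only once per device during the first lookup.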

@NeoZhangJianyu NeoZhangJianyu requested a review from joeatodd July 4, 2024 01:51
@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels Jul 4, 2024
@NeoZhangJianyu NeoZhangJianyu requested a review from airMeng July 4, 2024 01:56
@airMeng
Contributor

airMeng commented Jul 4, 2024

Have you run the UTs for norm and softmax on MTL/DG2?

@NeoZhangJianyu
Collaborator Author

> Have you run the UTs for norm and softmax on MTL/DG2?

The norm UTs pass on Arc770.
softmax is not tested because its UT is broken by MUL_MAT with B16.

Not tested on MTL.
