Conversation

@noemotiovon
Collaborator

Optimize the MUL_MAT operator for the CANN backend. When the underlying aclnnWeightQuantBatchMatmulV2 operator is called and k <= QK8_0 (the q8_0 block size, 32), use the per_channel algorithm instead of the per_group algorithm.
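The dispatch described above can be sketched as follows. This is a hypothetical illustration, not the actual CANN backend code: the enum and the `select_quant_algo` helper are made up for clarity; only `QK8_0 == 32` (the q8_0 block size) comes from ggml. The idea is that when the reduction dimension k fits in a single q8_0 block, each channel carries exactly one quantization scale, so per_channel is sufficient and per_group's extra group bookkeeping can be skipped.

```cpp
#include <cassert>
#include <cstdint>

// QK8_0 is the q8_0 block size in ggml: 32 elements per block.
constexpr int64_t QK8_0 = 32;

// Hypothetical names for the two quantization algorithms selectable
// when calling aclnnWeightQuantBatchMatmulV2.
enum class QuantAlgo { PER_GROUP, PER_CHANNEL };

// When k <= QK8_0 the whole reduction dimension fits in one q8_0
// block, so there is one scale per output channel and the cheaper
// per_channel path applies; otherwise fall back to per_group.
QuantAlgo select_quant_algo(int64_t k) {
    return (k <= QK8_0) ? QuantAlgo::PER_CHANNEL : QuantAlgo::PER_GROUP;
}
```

With the test shapes below, k=32 takes the per_channel path and k=256 keeps per_group.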

Test Cases

  MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=32,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK
  MUL_MAT(type_a=q8_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[1,1],per=[0,1,2,3]): OK

Signed-off-by: noemotiovon <[email protected]>
@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Mar 14, 2025
@hipudding self-requested a review on Mar 14, 2025 08:04
@hipudding added the Ascend NPU label (issues specific to Ascend NPUs) on Mar 14, 2025
Signed-off-by: noemotiovon <[email protected]>
@hipudding hipudding merged commit 92a3913 into ggml-org:master Mar 15, 2025
47 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025