Revert "CUDA: add expert reduce kernel (#16857)" #17100

am17an · 2025-11-08T10:32:18Z

This reverts commit 4146d6a.

This kernel causes some weird perplexity behavior which I can't explain. I will need to fix it before adding it back.

CISC · 2025-11-08T11:32:28Z

ggml/src/ggml-cuda/ggml-cuda.cu

-                        }
-
-                        while (current_node < cgraph->n_nodes && cgraph->nodes[current_node]->op == GGML_OP_ADD &&
-                                num_adds < num_views - 1) {


CISC: This logic doesn't add up, what was the intention here?

am17an: What do you mean? It matches n_expert_used views followed by n_expert_used - 1 adds.

CISC: Well, it's a little awkward when n_expert_used is 1 (i.e., no add, but a cont instead), and I'm not sure you're catering for that?

CISC: I think this also means it will kick in on just mul+view.

am17an: Good point, I'll fix that once I figure out the other problems in this kernel (namely the lifetime of the weights buffer).
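
For reference, the edge case raised in this thread can be sketched in isolation. The snippet below is a minimal illustration, not the reverted code: the `Op` enum and `match_expert_reduce` are hypothetical stand-ins rather than ggml types or functions, and the assumed pattern (one mul, then n views, then n - 1 adds) is inferred only from the comments above. It rejects the match when fewer than two views are seen, so a lone mul+view (the n_expert_used == 1 case, where a cont follows instead of an add) is not treated as fusable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for graph ops; these are NOT the real ggml enums.
enum class Op { MUL, VIEW, ADD, CONT, OTHER };

// Tries to match  MUL, VIEW x n, ADD x (n - 1)  with n >= 2, starting at ops[start].
// Returns the number of nodes covered by the pattern, or 0 if it does not match.
std::size_t match_expert_reduce(const std::vector<Op> & ops, std::size_t start) {
    assert(start <= ops.size());

    std::size_t i = start;
    if (i >= ops.size() || ops[i] != Op::MUL) {
        return 0;
    }
    ++i;

    // Count the consecutive views that follow the mul.
    std::size_t num_views = 0;
    while (i < ops.size() && ops[i] == Op::VIEW) {
        ++num_views;
        ++i;
    }

    // With a single view there is no add chain to fuse, so bail out here.
    // This also keeps num_views - 1 below from wrapping around when num_views is 0.
    if (num_views < 2) {
        return 0;
    }

    // Count at most num_views - 1 consecutive adds.
    std::size_t num_adds = 0;
    while (i < ops.size() && ops[i] == Op::ADD && num_adds < num_views - 1) {
        ++num_adds;
        ++i;
    }

    // Require the complete add chain before reporting a match.
    if (num_adds != num_views - 1) {
        return 0;
    }
    return i - start;
}
```

With this guard, a sequence of mul, view, cont yields no match, while mul followed by three views and two adds matches all six nodes. Whether the real kernel should handle the single-expert case separately is a design question this sketch does not answer.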

Revert "CUDA: add expert reduce kernel (ggml-org#16857)"

4d1daaf

This reverts commit 4146d6a.

am17an requested a review from slaren as a code owner November 8, 2025 10:32

am17an requested review from JohannesGaessler and slaren and removed request for slaren November 8, 2025 10:32

DajanaV mentioned this pull request Nov 8, 2025

UPSTREAM PR #17100: Revert "CUDA: add expert reduce kernel (#16857)" auroralabs-loci/llama.cpp#131

Open

github-actions bot added the testing, Nvidia GPU, and ggml labels Nov 8, 2025

JohannesGaessler approved these changes Nov 8, 2025

CISC reviewed Nov 8, 2025

am17an merged commit 64fe17f into ggml-org:master Nov 8, 2025
117 of 126 checks passed

am17an deleted the cuda-revert-moe-expert branch November 9, 2025 03:38
