@am17an am17an commented Nov 8, 2025

This reverts commit 4146d6a.

This kernel causes some weird perplexity behavior that I can't explain yet. I will need to fix it before adding it back.

@am17an am17an requested a review from slaren as a code owner November 8, 2025 10:32
@am17an am17an requested review from JohannesGaessler and slaren and removed request for slaren November 8, 2025 10:32
@github-actions github-actions bot added testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs ggml changes relating to the ggml tensor library for machine learning labels Nov 8, 2025
}

while (current_node < cgraph->n_nodes && cgraph->nodes[current_node]->op == GGML_OP_ADD &&
num_adds < num_views - 1) {
Collaborator

This logic doesn't add up, what was the intention here?

Collaborator Author

What do you mean? n_expert_used views followed by n_expert_used - 1 adds

Collaborator

Well, it's a little awkward when n_expert_used is 1 (i.e., no add, but a cont instead), and I'm not sure you're catering for that?

Collaborator

I think this also means it will kick in on just mul+view.
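
To make the edge case concrete, here is a minimal standalone sketch of a stricter match. It is not the ggml code: `Op` and `match_moe_sum` are hypothetical stand-ins, and the assumed pattern (one mul, then n_expert_used views, then n_expert_used - 1 adds) is inferred from this thread only. The point is that the add loop alone accepts zero adds, so the match needs an explicit lower bound on the number of views.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the graph op enum; not the real ggml types.
enum class Op { MUL, VIEW, ADD, CONT, OTHER };

// Returns the number of nodes covered by the pattern starting at `start`,
// or 0 if the pattern does not match.
// Assumed pattern: MUL, then N views, then exactly N - 1 adds, with N >= 2.
static int match_moe_sum(const std::vector<Op> & nodes, std::size_t start) {
    std::size_t i = start;
    if (i >= nodes.size() || nodes[i] != Op::MUL) {
        return 0;
    }
    i++;

    int num_views = 0;
    while (i < nodes.size() && nodes[i] == Op::VIEW) {
        num_views++;
        i++;
    }
    // With a single expert there is no add (a cont follows instead),
    // so there is nothing to fuse: reject instead of matching mul+view.
    if (num_views < 2) {
        return 0;
    }

    int num_adds = 0;
    while (i < nodes.size() && nodes[i] == Op::ADD && num_adds < num_views - 1) {
        num_adds++;
        i++;
    }
    // The loop bound only caps the add count; require the exact number too.
    if (num_adds != num_views - 1) {
        return 0;
    }
    return static_cast<int>(i - start);
}
```

With this shape, mul+view+cont and a bare mul+view both return 0, while mul followed by two views and one add matches four nodes.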

Collaborator Author

@am17an am17an Nov 8, 2025

Good point, I'll fix that once I figure out the other problems in this kernel (namely those related to the lifetime of the weights buffer).

@am17an am17an merged commit 64fe17f into ggml-org:master Nov 8, 2025
117 of 126 checks passed
@am17an am17an deleted the cuda-revert-moe-expert branch November 9, 2025 03:38