ggerganov
Member

Unify the RMS_NORM and NORM implementations and extend support for more shapes.
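For readers unfamiliar with the two ops, here is a minimal CPU-side sketch of why they can share one implementation. It is not the Metal kernel from this PR: the function names, the `subtract_mean` flag, and the `eps` default are assumptions made for illustration. The only difference between the two paths is whether the mean is subtracted before scaling.

```python
import math

def norm_generic(x, eps=1e-5, subtract_mean=True):
    """Normalize one row of floats.

    subtract_mean=True  -> NORM     (zero mean, unit variance)
    subtract_mean=False -> RMS_NORM (unit root-mean-square)
    Hypothetical helper for illustration, not the ggml API.
    """
    n = len(x)
    mean = sum(x) / n if subtract_mean else 0.0
    centered = [v - mean for v in x]
    # Mean of squares of the (optionally centered) values.
    mean_sq = sum(c * c for c in centered) / n
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [c * scale for c in centered]

def norm(x, eps=1e-5):
    return norm_generic(x, eps, subtract_mean=True)

def rms_norm(x, eps=1e-5):
    return norm_generic(x, eps, subtract_mean=False)
```

The row length is not constrained here, so a row of 5 elements works the same as a row of 8.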

ggerganov requested a review from slaren as a code owner September 24, 2025 10:36
github-actions bot added the testing (Everything test related), ggml (changes relating to the ggml tensor library for machine learning), and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) labels Sep 24, 2025
ggerganov merged commit dfcd53f into master Sep 25, 2025
1 check passed
ggerganov deleted the gg/metal-norm-generic branch September 25, 2025 08:30
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Sep 25, 2025
…6220)

* metal : fuse NORM + MUL + ADD

* metal : support norms of non-multiple of 4

* cont : fix comment [no ci]
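
As a rough illustration of the first two commit bullets, the sketch below computes NORM, MUL, and ADD in one pass over a row and handles row lengths that are not a multiple of 4 with a scalar tail. It assumes the fused result is `norm(x) * w + b`; the function names, the chunk width, and the tail strategy are hypothetical and are not taken from the Metal kernel.

```python
import math

def chunked_sum(values, width=4):
    """Sum in `width`-sized chunks, then finish the leftover tail element by element."""
    n_vec = len(values) - len(values) % width
    total = 0.0
    for i in range(0, n_vec, width):
        total += sum(values[i:i + width])  # stands in for one 4-wide vector load
    for v in values[n_vec:]:
        total += v  # scalar tail for lengths that are not a multiple of `width`
    return total

def fused_norm_mul_add(x, w, b, eps=1e-5):
    """Compute norm(x) * w + b without materializing the intermediate tensors."""
    n = len(x)
    mean = chunked_sum(x) / n
    var = chunked_sum([(v - mean) ** 2 for v in x]) / n
    scale = 1.0 / math.sqrt(var + eps)
    return [(xi - mean) * scale * wi + bi for xi, wi, bi in zip(x, w, b)]
```

Running the three ops separately would write and re-read two intermediate rows; the fused form touches the output only once.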

joseph777111 commented Sep 26, 2025

Superior inference quality exhibited on METAL after updating to the current (at the time) version of llama.cpp (835b2b9). The difference is literally night and day - this is incredibly noticeable when running quantized versions of gpt-oss-20B. And, the improvements seem to enhance all quantized models run with METAL. Thank you, @ggerganov! This is the smartest gpt-oss-20B has ever been on my M1 MacBook Pro. I appreciate all that you and the llama.cpp team do. You guys are the best! 😋

struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025