
Conversation

@ggerganov (Member)

fix #12860

Embedding models require the entire prompt to fit in a single micro-batch, so we simply force:

```
n_ubatch = n_batch = n_ctx
```
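For illustration, here is a minimal C++ sketch of that constraint, assuming the clamp is applied when building the context parameters through the public `llama.h` API (the helper `make_embedding_cparams` is hypothetical, not the actual patch):

```cpp
// Hedged sketch, not the actual change in this PR: construct context
// parameters for an embedding model so that n_ubatch = n_batch = n_ctx.
// The field names (n_ctx, n_batch, n_ubatch, embeddings) come from the
// public llama.h API.
#include "llama.h"

static llama_context_params make_embedding_cparams(uint32_t n_ctx) {
    llama_context_params cparams = llama_context_default_params();

    cparams.embeddings = true;  // pooled-embedding mode
    cparams.n_ctx      = n_ctx;

    // An embedding model must see the whole prompt in one micro-batch,
    // so force the batch and micro-batch sizes up to the context size.
    cparams.n_batch    = n_ctx;
    cparams.n_ubatch   = n_ctx;

    return cparams;
}
```

With `n_batch` forced equal to `n_ctx`, the failing assertion `GGML_ASSERT(params.n_batch >= params.n_ctx)` reported in #12860 holds by construction.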

@ggerganov merged commit 226251e into master on Apr 24, 2025
56 checks passed
@ggerganov deleted the gg/embeddings-fix-batch branch on April 24, 2025 at 19:29
pockers21 pushed a commit to pockers21/llama.cpp that referenced this pull request Apr 28, 2025

Development

Successfully merging this pull request may close these issues.

Misc. bug: llama-embedding asserts: GGML_ASSERT(params.n_batch >= params.n_ctx);
