
update to llama.cpp b5688 #115


Open · wants to merge 4 commits into base: master

Conversation

arnej27959
Contributor

  • extract updated glue code to server.hpp and utils.cpp
  • adapt native code in jllama.cpp to track API changes
  • update tags and adapt CMakeLists.txt

Note: All the code in utils.hpp and server.hpp is lifted directly from llama.cpp, so I don't have a deep understanding of all the changes that happened there. But I've looked closely at the changes that were needed in jllama.cpp to track the API changes, and done a bit of testing of the resulting binary.

Longer-term it would be better to get the server code in llama.cpp split out so that these pieces can be used as a library directly; we are considering writing a different server wrapper with protobuf/RPC replacing JSON/HTTP, which could use the same library.

@kherud
Owner

kherud commented Jun 20, 2025

Awesome work 👍 though it'll take some time for me to review.

longer-term it would be better to get the server code in llama.cpp split so that these pieces can be used as a library directly;

Yeah, if you can get the llama.cpp team to do this, it would make things much easier!

we are considering writing a different server wrapper with protobuf/rpc replacing JSON/http, which could use the same library.

I'm a big fan of protobuf/rpc, so I'll look forward to that!

The comment says:
// if the assistant message appears at the end of list, we do not add end-of-turn token

so it seems the changed output is expected.