tool-call: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme #11539

Merged: 9 commits merged into ggml-org:master from the tool-call-fix branch on Jan 31, 2025

Conversation

@ochafik (Collaborator) commented Jan 31, 2025

Makes more models play nice w/ the pydantic_ai agent (follow-up to #9639):

  • Always return a tool call id (default: empty string) even if the model / template doesn't support it
  • Allow a missing tool call name when backfilling templates that don't support tool calls (syncs google/minja#36, "Don't require tool name in backfill behaviour")
  • Fix parsing of non-tool-calling outputs for Functionary v3.2
  • Make Llama 3.x a bit more compliant by adding more grammar triggers (tested on 1B & 8B, still not perfect)
  • Updated examples/server/README
  • Log the Chat format: line even w/o --verbose
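The first two bullets can be sketched as a normalization step over an OpenAI-style tool call (illustrative shapes only, not the actual server code):

```python
def normalize_tool_call(call: dict) -> dict:
    """Fill in defaults so strict OpenAI-style clients (e.g. pydantic_ai)
    always see an id and a name on every tool call."""
    fn = call.get("function", {})
    return {
        "id": call.get("id", ""),  # empty string rather than a missing field
        "type": "function",
        "function": {
            "name": fn.get("name", ""),  # name may be absent when backfilling
            "arguments": fn.get("arguments", "{}"),
        },
    }

raw = {"function": {"name": "roll_die", "arguments": "{}"}}
print(normalize_tool_call(raw)["id"] == "")  # id is present, even if empty
```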
# Was already working (tool call id was generated):
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q6_K_L

# Newly working:

## Native format
llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q6_K
llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Hermes-3-Llama-3.1-8B-GGUF:Q4_K_M \
  --chat-template-file <( python scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use )
llama-server --jinja -fa -hf bartowski/firefunction-v2-GGUF -hff firefunction-v2-IQ1_M.gguf \
  --chat-template-file <( python scripts/get_chat_template.py fireworks-ai/firellama-3-firefunction-v2 )
## Generic format:
llama-server --jinja -fa -hf bartowski/Phi-3.5-mini-instruct-GGUF:Q4_K_M
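With one of the servers above running, tool calling goes through the OpenAI-compatible chat completions endpoint. A minimal request-body sketch (the tool, port, and model string are illustrative; llama-server serves whatever model it loaded regardless of the "model" field):

```python
import json

# Hypothetical body for POST http://127.0.0.1:8080/v1/chat/completions
payload = {
    "model": "any",  # ignored by llama-server
    "messages": [{"role": "user", "content": "Roll a die for me"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "roll_die",
            "description": "Roll a six-sided die and return the result.",
            "parameters": {"type": "object", "properties": {}},
        },
    }],
}
body = json.dumps(payload)
print("roll_die" in body)  # True
```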

Example test code from @brucepro (ref):

import random
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    'llama3.3-70B',                    # name is arbitrary: llama-server serves whatever model it loaded
    base_url='http://127.0.0.1:8082',  # llama-server's OpenAI-compatible endpoint
    api_key='123',                     # any non-empty string; llama-server doesn't check it unless --api-key is set
)

agent = Agent(
    model,
    deps_type=str,
    system_prompt=(
        "You're a dice game, you should roll the die and see if the number "
        "you get back matches the user's guess. If so, tell them they're a winner. "
        "Use the player's name in the response."
    ),
)

@agent.tool_plain
def roll_die() -> str:
    """Roll a six-sided die and return the result."""
    return str(random.randint(1, 6))

@agent.tool
def get_player_name(ctx: RunContext[str]) -> str:
    """Get the player's name."""
    return ctx.deps

dice_result = agent.run_sync('My guess is 4', deps='Anne')
print(dice_result.data)
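The empty-but-present tool call id matters because OpenAI-style clients like pydantic_ai echo it back in the follow-up "tool" message. A sketch of that round trip (message shapes follow the OpenAI chat API; values are illustrative):

```python
# Assistant turn as emitted by llama-server after this PR: the "id"
# field is always present, defaulting to "" when the template has none.
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "",
        "type": "function",
        "function": {"name": "roll_die", "arguments": "{}"},
    }],
}

# The client runs the tool and must echo the id back; a missing (rather
# than merely empty) id is what used to break strict clients.
tool_result = {
    "role": "tool",
    "tool_call_id": assistant_msg["tool_calls"][0]["id"],
    "content": "4",
}
print(tool_result["tool_call_id"] == "")  # True
```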

@ochafik ochafik marked this pull request as ready for review January 31, 2025 12:00
@ochafik ochafik requested a review from ngxson as a code owner January 31, 2025 12:00
@ochafik ochafik changed the title tool-call: small fixes to play nice w/ pydantic_ai package tool-call: small fixes to play nice w/ pydantic_ai package (+ update readme) Jan 31, 2025
@ochafik ochafik changed the title tool-call: small fixes to play nice w/ pydantic_ai package (+ update readme) tool-call: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme Jan 31, 2025
@ochafik ochafik added bugfix fixes an issue or bug documentation Improvements or additions to documentation labels Jan 31, 2025
@ochafik ochafik requested a review from ggerganov January 31, 2025 13:53
@ochafik ochafik merged commit a83f528 into ggml-org:master Jan 31, 2025
43 of 44 checks passed
@ochafik ochafik deleted the tool-call-fix branch January 31, 2025 14:15
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
…_ai package, update readme (ggml-org#11539)

* An empty tool_call_id is better than none!

* sync: minja (tool call name optional google/minja#36)

* Force-disable parallel_tool_calls if template doesn't support it

* More debug logs

* Llama 3.x tools: accept / trigger on more varied spaced outputs

* Fix empty content for functionary v3.2 tool call

* Add proper tool call docs to server README

* readme: function calling *is* supported now

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
@brucepro (Contributor)

I have opened a PR to start the discussion of how MCP should be added to the client. Please share your thoughts if you don't mind: #11853

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
Labels: bugfix (fixes an issue or bug), documentation (improvements or additions to documentation), examples, server