tool-call: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme #11539

Merged: 9 commits merged into ggml-org:master from the tool-call-fix branch on Jan 31, 2025

Conversation

@ochafik (Collaborator) commented Jan 31, 2025

Makes more models play nice w/ the pydantic_ai agent (follow-up to #9639):

  • Always return a tool call id (default: empty string) even if the model / template doesn't support it
  • Allow a missing tool call name when backfilling templates that don't support tool calls (syncs google/minja#36, "Don't require tool name in backfill behaviour")
  • Fix parsing of non-tool-calling outputs for Functionary v3.2
  • Make Llama 3.x a bit more compliant by adding more grammar triggers (tested on 1B & 8B, still not perfect)
  • Updated examples/server/README
  • Log the Chat format: line even w/o --verbose
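The first two bullets can be sketched as a normalization step over an OpenAI-style tool call (illustrative shapes only, not the actual server code):

```python
def normalize_tool_call(call: dict) -> dict:
    """Fill in defaults so strict OpenAI-style clients (e.g. pydantic_ai)
    always see an id and a name on every tool call."""
    fn = call.get("function", {})
    return {
        "id": call.get("id", ""),  # empty string rather than a missing field
        "type": "function",
        "function": {
            "name": fn.get("name", ""),  # name may be absent when backfilling
            "arguments": fn.get("arguments", "{}"),
        },
    }

raw = {"function": {"name": "roll_die", "arguments": "{}"}}
print(normalize_tool_call(raw)["id"] == "")  # id is present, even if empty
```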
# Was already working (tool call id was generated):
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q6_K_L

# Newly working:

## Native format
llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q6_K
llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M
llama-server --jinja -fa -hf bartowski/Hermes-3-Llama-3.1-8B-GGUF:Q4_K_M \
  --chat-template-file <( python scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use )
llama-server --jinja -fa -hf bartowski/firefunction-v2-GGUF -hff firefunction-v2-IQ1_M.gguf \
  --chat-template-file <( python scripts/get_chat_template.py fireworks-ai/firellama-3-firefunction-v2 )
## Generic format:
llama-server --jinja -fa -hf bartowski/Phi-3.5-mini-instruct-GGUF:Q4_K_M
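With one of the servers above running, tool calling goes through the OpenAI-compatible chat completions endpoint. A minimal request-body sketch (the tool, port, and model string are illustrative; llama-server serves whatever model it loaded regardless of the "model" field):

```python
import json

# Hypothetical body for POST http://127.0.0.1:8080/v1/chat/completions
payload = {
    "model": "any",  # ignored by llama-server
    "messages": [{"role": "user", "content": "Roll a die for me"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "roll_die",
            "description": "Roll a six-sided die and return the result.",
            "parameters": {"type": "object", "properties": {}},
        },
    }],
}
body = json.dumps(payload)
print("roll_die" in body)  # True
```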

Example test code from @brucepro (ref):

import random
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    'llama3.3-70B',                    # name is arbitrary: llama-server serves whatever model it loaded
    base_url='http://127.0.0.1:8082',  # llama-server's OpenAI-compatible endpoint
    api_key='123',                     # any non-empty string; llama-server doesn't check it unless --api-key is set
)

agent = Agent(
    model,
    deps_type=str,
    system_prompt=(
        "You're a dice game, you should roll the die and see if the number "
        "you get back matches the user's guess. If so, tell them they're a winner. "
        "Use the player's name in the response."
    ),
)

@agent.tool_plain
def roll_die() -> str:
    """Roll a six-sided die and return the result."""
    return str(random.randint(1, 6))

@agent.tool
def get_player_name(ctx: RunContext[str]) -> str:
    """Get the player's name."""
    return ctx.deps

dice_result = agent.run_sync('My guess is 4', deps='Anne')
print(dice_result.data)
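The empty-but-present tool call id matters because OpenAI-style clients like pydantic_ai echo it back in the follow-up "tool" message. A sketch of that round trip (message shapes follow the OpenAI chat API; values are illustrative):

```python
# Assistant turn as emitted by llama-server after this PR: the "id"
# field is always present, defaulting to "" when the template has none.
assistant_msg = {
    "role": "assistant",
    "tool_calls": [{
        "id": "",
        "type": "function",
        "function": {"name": "roll_die", "arguments": "{}"},
    }],
}

# The client runs the tool and must echo the id back; a missing (rather
# than merely empty) id is what used to break strict clients.
tool_result = {
    "role": "tool",
    "tool_call_id": assistant_msg["tool_calls"][0]["id"],
    "content": "4",
}
print(tool_result["tool_call_id"] == "")  # True
```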

@ochafik ochafik marked this pull request as ready for review January 31, 2025 12:00
@ochafik ochafik requested a review from ngxson as a code owner January 31, 2025 12:00
@ochafik ochafik changed the title tool-call: small fixes to play nice w/ pydantic_ai package tool-call: small fixes to play nice w/ pydantic_ai package (+ update readme) Jan 31, 2025
@ochafik ochafik changed the title tool-call: small fixes to play nice w/ pydantic_ai package (+ update readme) tool-call: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme Jan 31, 2025
@ochafik ochafik added bugfix fixes an issue or bug documentation Improvements or additions to documentation labels Jan 31, 2025
@ochafik ochafik requested a review from ggerganov January 31, 2025 13:53
@ochafik ochafik merged commit a83f528 into ggml-org:master Jan 31, 2025
43 of 44 checks passed
@ochafik ochafik deleted the tool-call-fix branch January 31, 2025 14:15
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
…_ai package, update readme (ggml-org#11539)

* An empty tool_call_id is better than none!

* sync: minja (tool call name optional google/minja#36)

* Force-disable parallel_tool_calls if template doesn't support it

* More debug logs

* Llama 3.x tools: accept / trigger on more varied spaced outputs

* Fix empty content for functionary v3.2 tool call

* Add proper tool call docs to server README

* readme: function calling *is* supported now

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>
@brucepro (Contributor)

I have opened a PR to start the discussion of how MCP should be added to the client. Please share your thoughts if you don't mind: #11853

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
Labels: bugfix (fixes an issue or bug), documentation (improvements or additions to documentation), examples, server