-
Notifications
You must be signed in to change notification settings - Fork 13.6k
Model: Minimax M2 - chat support #16946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
+483
−6
Closed
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
e21f87e
Minimax M2 chat template support
pwilkin 4e58382
No newline after <think>
pwilkin de67255
On the other hand, this is probably safer
pwilkin 1a351a0
Use Unsloth template, add extra test parameters for ignoring addition…
pwilkin 9481289
Whitespace.
pwilkin 23d4bb7
Add proper handling of optional parameters with test
pwilkin File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,171 @@ | ||
| {# Unsloth template fixes #} | ||
| {# ----------‑‑‑ special token variables ‑‑‑---------- #} | ||
| {%- set toolcall_begin_token = '<minimax:tool_call>' -%} | ||
| {%- set toolcall_end_token = '</minimax:tool_call>' -%} | ||
| {#- Tool Rendering Functions ============================================== -#} | ||
| {%- macro render_tool_namespace(namespace_name, tool_list) -%} | ||
| {%- for tool in tool_list -%} | ||
| <tool>{{ tool.function | tojson | string }}</tool> | ||
| {% endfor -%} | ||
| {%- endmacro -%} | ||
| {%- macro visible_text(content) -%} | ||
| {%- if content is string -%} | ||
| {{ content }} | ||
| {%- elif content is iterable and content is not mapping -%} | ||
| {%- for item in content -%} | ||
| {%- if item is mapping and item.type == 'text' -%} | ||
| {{- item.text }} | ||
| {%- elif item is string -%} | ||
| {{- item }} | ||
| {%- endif -%} | ||
| {%- endfor -%} | ||
| {%- else -%} | ||
| {{- content }} | ||
| {%- endif -%} | ||
| {%- endmacro -%} | ||
| {#- System Message Construction ============================================ -#} | ||
| {%- macro build_system_message(system_message) -%} | ||
| {%- if system_message and system_message.content -%} | ||
| {{- visible_text(system_message.content) }} | ||
| {%- else -%} | ||
| {%- if model_identity is not defined -%} | ||
| {%- set model_identity = "You are a helpful assistant." -%} | ||
| {%- endif -%} | ||
| {{- model_identity }} | ||
| {%- endif -%} | ||
|
|
||
| {#- Handle current_date -#} | ||
| {%- if system_message and system_message.current_date -%} | ||
| {{- '\n' ~ 'Current date: ' + system_message.current_date }} | ||
| {%- endif -%} | ||
| {#- Handle current_location -#} | ||
| {%- if system_message and system_message.current_location -%} | ||
| {{- '\n' ~ 'Current location: ' + system_message.current_location }} | ||
| {%- endif -%} | ||
| {%- endmacro -%} | ||
| {#- Main Template Logic ================================================= -#} | ||
| {#- Extract system message (only first message if it's system) -#} | ||
| {%- set system_message = none -%} | ||
| {%- set conversation_messages = messages -%} | ||
| {%- if messages and messages[0].role == "system" -%} | ||
| {%- set system_message = messages[0] -%} | ||
| {%- set conversation_messages = messages[1:] -%} | ||
| {%- endif -%} | ||
| {#- Get the last user message turn, for interleved thinking -#} | ||
| {%- set ns = namespace(last_user_index=-1) %} | ||
| {% for m in conversation_messages %} | ||
| {%- if m.role == 'user' %} | ||
| {% set ns.last_user_index = loop.index0 -%} | ||
| {%- endif %} | ||
| {%- endfor %} | ||
| {#- Render system message -#} | ||
| {{- ']~!b[' ~ ']~b]system' ~ '\n' }} | ||
| {{- build_system_message(system_message) }} | ||
| {#- Render tools if available -#} | ||
| {%- if tools -%} | ||
| {{- '\n\n' ~ '# Tools' ~ '\n' ~ 'You may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema format:' ~ '\n' }} | ||
| {{- '\n' ~ '<tools>' ~ '\n' }} | ||
| {{- render_tool_namespace("functions", tools) }} | ||
| {{- '</tools>' ~ '\n\n' }} | ||
| {{- 'When making tool calls, use XML format to invoke tools and pass parameters:' ~ '\n' }} | ||
| {{- '\n' ~ toolcall_begin_token }} | ||
| <invoke name="tool-name-1"> | ||
| <parameter name="param-key-1">param-value-1</parameter> | ||
| <parameter name="param-key-2">param-value-2</parameter> | ||
| ... | ||
| </invoke> | ||
| {{- '\n' ~ toolcall_end_token }} | ||
| {%- endif -%} | ||
| {{- '[e~[\n' }} | ||
|
|
||
| {#- Render messages -#} | ||
| {%- set last_tool_call = namespace(name=none) -%} | ||
| {%- for message in conversation_messages -%} | ||
| {%- if message.role == 'assistant' -%} | ||
| {#- Only render reasoning_content if no user message follows -#} | ||
| {{- ']~b]ai' ~ '\n' }} | ||
|
|
||
| {%- set reasoning_content = '' %} | ||
| {%- set content = visible_text(message.content) %} | ||
| {%- if message.reasoning_content is string %} | ||
| {%- set reasoning_content = message.reasoning_content %} | ||
| {%- else %} | ||
| {%- if '</think>' in content %} | ||
| {# Unsloth template fixes - must change to for loop since llama.cpp will error out if not #} | ||
| {%- set parts = content.split('</think>') %} | ||
| {%- for part in parts %} | ||
| {%- if loop.index0 == 0 -%} | ||
| {%- set reasoning_content = part.strip('\n') %} | ||
| {%- set reasoning_content = (reasoning_content.split('<think>')|last) %} | ||
| {%- set reasoning_content = reasoning_content.strip('\n') -%} | ||
| {%- else -%} | ||
| {%- set content = part.strip('\n') %} | ||
| {%- endif %} | ||
| {%- endfor %} | ||
| {%- endif %} | ||
| {%- endif %} | ||
| {%- if reasoning_content and loop.index0 > ns.last_user_index -%} | ||
| {{- '<think>' ~ '\n' ~ reasoning_content ~ '\n' ~ '</think>' ~ '\n\n' }} | ||
| {%- endif -%} | ||
| {%- if content -%} | ||
| {{- content }} | ||
| {%- endif -%} | ||
| {%- if message.tool_calls -%} | ||
| {{- '\n' ~ toolcall_begin_token ~ '\n' }} | ||
|
|
||
| {%- for tool_call in message.tool_calls -%} | ||
| {%- if tool_call.function %} | ||
| {%- set tool_call = tool_call.function %} | ||
| {%- endif %} | ||
| {{- '<invoke name="' + tool_call.name + '">\n' }} | ||
| {%- if tool_call.arguments is defined and tool_call.arguments is mapping -%} | ||
| {% set _args = tool_call.arguments %} | ||
| {%- for k, v in _args|items %} | ||
| {{- '<parameter name="' + k + '">' }} | ||
| {{- v | tojson | string if v is not string else v }} | ||
| {{- '</parameter>' }} | ||
| {% endfor %}{%- endif -%} | ||
| {{- '</invoke>' ~ '\n' }} | ||
| {%- endfor -%} | ||
|
|
||
| {{- toolcall_end_token}} | ||
| {%- set last_tool_call.name = message.tool_calls[-1].name -%} | ||
| {%- else -%} | ||
| {%- set last_tool_call.name = none -%} | ||
| {%- endif -%} | ||
| {{- '[e~[' ~ '\n' }} | ||
|
|
||
| {%- elif message.role == 'tool' -%} | ||
| {%- if last_tool_call.name is none -%} | ||
| {{- raise_exception("Message has tool role, but there was no previous assistant message with a tool call!") }} | ||
| {%- endif -%} | ||
| {%- if loop.first or (conversation_messages[loop.index0 - 1].role != 'tool') -%} | ||
| {{- ']~b]tool' }} | ||
| {%- endif -%} | ||
| {%- if message.content is string -%} | ||
| {{- '\n<response>' }} | ||
| {{- message.content }} | ||
| {{- '</response>' }} | ||
| {%- else -%} | ||
| {%- for tr in message.content -%} | ||
| {{- '\n<response>' }} | ||
| {{- tr.output if tr.output is defined else (tr.text if tr.type == 'text' and tr.text is defined else tr) }} | ||
| {{- '\n</response>' }} | ||
| {%- endfor -%} | ||
| {%- endif -%} | ||
| {%- if loop.last or (conversation_messages[loop.index0 + 1].role != 'tool') -%} | ||
| {{- '[e~[\n' -}} | ||
| {%- endif -%} | ||
|
|
||
| {%- elif message.role == 'user' -%} | ||
| {{- ']~b]user' ~ '\n' }} | ||
| {{- visible_text(message.content) }} | ||
| {{- '[e~[' ~ '\n' }} | ||
| {%- endif -%} | ||
| {%- endfor -%} | ||
|
|
||
| {#- Generation prompt -#} | ||
| {%- if add_generation_prompt -%} | ||
| {{- ']~b]ai' ~ '\n' ~ '<think>' ~ '\n' }} | ||
| {%- endif -%} | ||
| {# Copyright 2025-present Unsloth. Apache 2.0 License. #} |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure we're using the correct definition of interleaved thinking here? I don't think it means the CoT is interleaved with the content during generation, but rather it is interleaved in the entire prompt during multi-turn tool calling sessions. It seems to behave very similarly to gpt-oss. None of my testing, granted at Q2_XL, seems to indicate that the CoT is interleaved during generation. It's also only applied if the last message is a
toolresponse.Using the proposed fix for tool response support by @ochafik, it works as is if I pass
reasoning_contentwith the assistant messages. Without this fix, thetoolmessages are transformed touserby the polyfill.Template Example
It does place the burden of returning
reasoning_contenton the clients.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aldehir That's actually a good clarification - I was somehow convinced that interlaving reasoning actually meant content blocks with multiple reasoning / content chunks intertwined (I think that the Anthropic protocol allows something like that). We shouldn't have a problem with it if it's just tool calls intertwined with reasoning blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hksdpc255 please take a look at this discussion, since I feel you're repeating the same error (with using reasoning-format none + literally outputting the opening
<think>tag).Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pwilkin Thanks for pointing that out. I actually had the same misunderstanding about
interleaved thinkingat first.Because of that, I initially implemented full support for reasoning and normal content being interleaved during generation. Later I realized that this wasn’t really required in our current setup. But since I already had a custom test harness for it, I verified that my implementation can indeed handle such interleaved reasoning/content streams. It might still be useful in the future if models start emitting that pattern more often.
As for
--reasoning-format none, my understanding was that it means not to treat reasoning specially, but to include it directly in the normal assistant message. This interpretation seemed consistent with how some chat templates (like GLM 4.5 / 4.6 and MiniMax M2) automatically detect<think>blocks in the main content, extract them intoreasoning_content, and remove them from the visible answer. That behavior is quite helpful for clients that don’t support returningreasoning_contentback to the server — which I believe is the case for most code agents.I’m currently using
--reasoning-format noneto serve the Zed editor, and in that setup, MiniMax M2 performs impressively well on fairly complex tasks.However, I might have misunderstood the actual purpose of
--reasoning-format none. If so, I’d really appreciate clarification. And if it’s not meant for this kind of use case, I think introducing a new--reasoning-formatmode to explicitly support it would make a lot of sense.