
Commit 27317bf

Merge pull request modelcontextprotocol#31 from modelcontextprotocol/justin/model-reqs-during-sampling
Optional model preferences during sampling
2 parents: 89be0b7 + c86fff4

File tree: 3 files changed, +159 -15 lines

docs/spec/sampling.md (42 additions, 5 deletions)

````diff
@@ -35,13 +35,17 @@ A Sampling Request in the Model Context Protocol (MCP) represents a request from
 
 Message content can be either text or images, allowing for multimodal interactions where supported by the model. Text content is provided directly as strings, while image content must be base64 encoded with an appropriate MIME type.
 
+### Model Preferences
+
+Servers can express preferences for model selection using the `ModelPreferences` object. This allows servers to indicate priorities for factors like cost, speed, and intelligence, as well as provide hints for specific models.
+
 ## Use Cases
 
 Common use cases for sampling include generating responses in chat interfaces, code completion, and content generation. Here are some example sampling scenarios:
 
 ### Chat Response
 
-A server requesting a chat response:
+A server requesting a chat response with model preferences:
 
 ```json
 {
@@ -55,7 +59,16 @@ A server requesting a chat response:
     }
   ],
   "maxTokens": 100,
-  "temperature": 0.7
+  "temperature": 0.7,
+  "modelPreferences": {
+    "hints": [
+      {
+        "name": "claude-3-sonnet"
+      }
+    ],
+    "intelligencePriority": 0.8,
+    "speedPriority": 0.5
+  }
 }
 ```
 
@@ -82,7 +95,18 @@ A server requesting analysis of an image:
      }
    }
  ],
-  "maxTokens": 200
+  "maxTokens": 200,
+  "modelPreferences": {
+    "hints": [
+      {
+        "name": "claude-3-opus"
+      },
+      {
+        "name": "claude-3-sonnet"
+      }
+    ],
+    "intelligencePriority": 1.0
+  }
 }
 ```
 
@@ -126,6 +150,7 @@ To request sampling from an LLM via the client, the server MUST send a `sampling
 Method: `sampling/createMessage`
 Params:
 - `messages`: Array of `SamplingMessage` objects representing the conversation history
+- `modelPreferences`: Optional `ModelPreferences` object to guide model selection
 - `systemPrompt`: Optional system prompt to use
 - `includeContext`: Optional request to include context from MCP servers
 - `temperature`: Optional sampling temperature
@@ -152,7 +177,19 @@ Example:
    "systemPrompt": "You are a helpful assistant.",
    "maxTokens": 100,
    "temperature": 0.7,
-    "includeContext": "none"
+    "includeContext": "none",
+    "modelPreferences": {
+      "hints": [
+        {
+          "name": "claude-3-sonnet"
+        },
+        {
+          "name": "claude-3-opus"
+        }
+      ],
+      "intelligencePriority": 0.9,
+      "speedPriority": 0.6
+    }
  }
 }
 ```
@@ -177,7 +214,7 @@ Example:
      "type": "text",
      "text": "The capital of France is Paris."
    },
-    "model": "gpt-4",
+    "model": "claude-3-sonnet-20240307",
    "stopReason": "endTurn"
  }
 }
````
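Taken together, the request shape this diff introduces can be exercised as follows. This is a sketch only: the interfaces are abridged from the `schema.ts` changes in this commit, and `buildChatRequest` is a hypothetical helper, not part of the spec.

```typescript
// Abridged shapes from schema.ts: a sampling request with optional
// model preferences. `includeContext` is typed loosely here since this
// diff does not enumerate its values.
interface ModelHint {
  name?: string;
  [key: string]: string | undefined;
}

interface ModelPreferences {
  hints?: ModelHint[];
  costPriority?: number;
  speedPriority?: number;
  intelligencePriority?: number;
}

interface SamplingMessage {
  role: "user" | "assistant";
  content:
    | { type: "text"; text: string }
    | { type: "image"; data: string; mimeType: string };
}

interface CreateMessageParams {
  messages: SamplingMessage[];
  modelPreferences?: ModelPreferences; // advisory; the client MAY ignore it
  systemPrompt?: string;
  includeContext?: string;
  temperature?: number;
  maxTokens: number;
}

// Hypothetical helper: build the params for a `sampling/createMessage`
// request carrying the server's model preferences.
function buildChatRequest(
  text: string,
  prefs?: ModelPreferences,
): CreateMessageParams {
  return {
    messages: [{ role: "user", content: { type: "text", text } }],
    maxTokens: 100,
    temperature: 0.7,
    modelPreferences: prefs,
  };
}

const params = buildChatRequest("What is the capital of France?", {
  hints: [{ name: "claude-3-sonnet" }],
  intelligencePriority: 0.8,
  speedPriority: 0.5,
});
```

Because `modelPreferences` is optional, existing servers that omit it continue to produce valid requests unchanged.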

schema/schema.json (40 additions, 8 deletions)

````diff
@@ -290,6 +290,10 @@
      "properties": {},
      "type": "object"
    },
+    "modelPreferences": {
+      "$ref": "#/definitions/ModelPreferences",
+      "description": "The server's preferences for which model to select. The client MAY ignore these preferences."
+    },
    "stopSequences": {
      "items": {
        "type": "string"
@@ -347,20 +351,14 @@
      "type": "string"
    },
    "stopReason": {
-      "description": "The reason why sampling stopped.",
-      "enum": [
-        "endTurn",
-        "maxTokens",
-        "stopSequence"
-      ],
+      "description": "The reason why sampling stopped, if known.",
      "type": "string"
    }
  },
  "required": [
    "content",
    "model",
-    "role",
-    "stopReason"
+    "role"
  ],
  "type": "object"
 },
@@ -971,6 +969,37 @@
    ],
    "type": "object"
  },
+  "ModelPreferences": {
+    "description": "The server's preferences for model selection, requested of the client during sampling.\n\nBecause LLMs can vary along multiple dimensions, choosing the \"best\" model is\nrarely straightforward. Different models excel in different areas—some are\nfaster but less capable, others are more capable but more expensive, and so\non. This interface allows servers to express their priorities across multiple\ndimensions to help clients make an appropriate selection for their use case.\n\nThese preferences are always advisory. The client MAY ignore them. It is also\nup to the client to decide how to interpret these preferences and how to\nbalance them against other considerations.",
+    "properties": {
+      "costPriority": {
+        "description": "How much to prioritize cost when selecting a model. A value of 0 means cost\nis not important, while a value of 1 means cost is the most important\nfactor.",
+        "maximum": 1,
+        "minimum": 0,
+        "type": "number"
+      },
+      "hints": {
+        "description": "Optional string hints to use for model selection. How these hints are\ninterpreted depends on the key(s) in each record:\n\n- If the record contains a `name` key:\n  - The client SHOULD treat this as a substring of a model name; for example:\n    - `claude-3-5-sonnet` should match `claude-3-5-sonnet-20241022`\n    - `sonnet` should match `claude-3-5-sonnet-20241022`, `claude-3-sonnet-20240229`, etc.\n    - `claude` should match any Claude model\n  - The client MAY also map the string to a different provider's model name or a different model family, as long as it fills a similar niche; for example:\n    - `gemini-1.5-flash` could match `claude-3-haiku-20240307`\n\nAll other keys are currently left unspecified by the spec and are up to the\nclient to interpret.\n\nIf multiple hints are specified, the client MUST evaluate them in order\n(such that the first match is taken).\n\nThe client SHOULD prioritize these hints over the numeric priorities, but\nMAY still use the priorities to select from ambiguous matches.",
+        "items": {
+          "$ref": "#/definitions/Record<string,string>"
+        },
+        "type": "array"
+      },
+      "intelligencePriority": {
+        "description": "How much to prioritize intelligence and capabilities when selecting a\nmodel. A value of 0 means intelligence is not important, while a value of 1\nmeans intelligence is the most important factor.",
+        "maximum": 1,
+        "minimum": 0,
+        "type": "number"
+      },
+      "speedPriority": {
+        "description": "How much to prioritize sampling speed (latency) when selecting a model. A\nvalue of 0 means speed is not important, while a value of 1 means speed is\nthe most important factor.",
+        "maximum": 1,
+        "minimum": 0,
+        "type": "number"
+      }
+    },
+    "type": "object"
+  },
  "Notification": {
    "properties": {
      "method": {
@@ -1238,6 +1267,9 @@
    ],
    "type": "object"
  },
+  "Record<string,string>": {
+    "type": "object"
+  },
  "Request": {
    "properties": {
      "method": {
````

schema/schema.ts (77 additions, 2 deletions)

````diff
@@ -682,6 +682,10 @@ export interface CreateMessageRequest extends Request {
  method: "sampling/createMessage";
  params: {
    messages: SamplingMessage[];
+    /**
+     * The server's preferences for which model to select. The client MAY ignore these preferences.
+     */
+    modelPreferences?: ModelPreferences;
    /**
     * An optional system prompt the server wants to use for sampling. The client MAY modify or omit this prompt.
     */
@@ -715,9 +719,9 @@ export interface CreateMessageResult extends Result, SamplingMessage {
   */
  model: string;
  /**
-   * The reason why sampling stopped.
+   * The reason why sampling stopped, if known.
   */
-  stopReason: "endTurn" | "stopSequence" | "maxTokens";
+  stopReason?: "endTurn" | "stopSequence" | "maxTokens" | string;
 }
 
 /**
@@ -756,6 +760,77 @@ export interface ImageContent {
  mimeType: string;
 }
 
+/**
+ * The server's preferences for model selection, requested of the client during sampling.
+ *
+ * Because LLMs can vary along multiple dimensions, choosing the "best" model is
+ * rarely straightforward. Different models excel in different areas—some are
+ * faster but less capable, others are more capable but more expensive, and so
+ * on. This interface allows servers to express their priorities across multiple
+ * dimensions to help clients make an appropriate selection for their use case.
+ *
+ * These preferences are always advisory. The client MAY ignore them. It is also
+ * up to the client to decide how to interpret these preferences and how to
+ * balance them against other considerations.
+ */
+export interface ModelPreferences {
+  /**
+   * Optional string hints to use for model selection. How these hints are
+   * interpreted depends on the key(s) in each record:
+   *
+   * - If the record contains a `name` key:
+   *   - The client SHOULD treat this as a substring of a model name; for example:
+   *     - `claude-3-5-sonnet` should match `claude-3-5-sonnet-20241022`
+   *     - `sonnet` should match `claude-3-5-sonnet-20241022`, `claude-3-sonnet-20240229`, etc.
+   *     - `claude` should match any Claude model
+   *   - The client MAY also map the string to a different provider's model name or a different model family, as long as it fills a similar niche; for example:
+   *     - `gemini-1.5-flash` could match `claude-3-haiku-20240307`
+   *
+   * All other keys are currently left unspecified by the spec and are up to the
+   * client to interpret.
+   *
+   * If multiple hints are specified, the client MUST evaluate them in order
+   * (such that the first match is taken).
+   *
+   * The client SHOULD prioritize these hints over the numeric priorities, but
+   * MAY still use the priorities to select from ambiguous matches.
+   */
+  hints?: Record<"name" | string, string>[];
+
+  /**
+   * How much to prioritize cost when selecting a model. A value of 0 means cost
+   * is not important, while a value of 1 means cost is the most important
+   * factor.
+   *
+   * @TJS-type number
+   * @minimum 0
+   * @maximum 1
+   */
+  costPriority?: number;
+
+  /**
+   * How much to prioritize sampling speed (latency) when selecting a model. A
+   * value of 0 means speed is not important, while a value of 1 means speed is
+   * the most important factor.
+   *
+   * @TJS-type number
+   * @minimum 0
+   * @maximum 1
+   */
+  speedPriority?: number;
+
+  /**
+   * How much to prioritize intelligence and capabilities when selecting a
+   * model. A value of 0 means intelligence is not important, while a value of 1
+   * means intelligence is the most important factor.
+   *
+   * @TJS-type number
+   * @minimum 0
+   * @maximum 1
+   */
+  intelligencePriority?: number;
+}
+
 /* Autocomplete */
 /**
  * A request from the client to the server, to ask for completion options.
````
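The hint-matching rules documented on `hints` (evaluate in order, treat `name` as a substring of a model name, take the first match, fall back to the numeric priorities) can be sketched from the client's side as follows. The model IDs and the `selectModel` helper are illustrative, not part of the spec.

```typescript
// Sketch of client-side hint matching: hints are evaluated in order,
// a `name` value is treated as a substring of a model name, and the
// first matching hint wins.
interface ModelHint {
  name?: string;
}

function selectModel(
  hints: ModelHint[],
  available: string[],
): string | undefined {
  for (const hint of hints) {
    const needle = hint.name;
    if (needle === undefined) continue; // keys other than `name` are client-defined
    const match = available.find((model) => model.includes(needle));
    if (match !== undefined) return match; // first match is taken (MUST evaluate in order)
  }
  // No hint matched: the client would fall back to the numeric
  // cost/speed/intelligence priorities (not shown here).
  return undefined;
}

const available = ["claude-3-haiku-20240307", "claude-3-sonnet-20240229"];
const chosen = selectModel([{ name: "opus" }, { name: "sonnet" }], available);
// `opus` matches neither model, so the second hint selects the sonnet model
```

Because the spec also allows mapping a hint to a different provider's model of a similar niche, a real client might layer an equivalence table on top of this substring check.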

0 commit comments