docs/spec/sampling.md
+42 -5 (42 additions & 5 deletions)
@@ -35,13 +35,17 @@ A Sampling Request in the Model Context Protocol (MCP) represents a request from
 
 Message content can be either text or images, allowing for multimodal interactions where supported by the model. Text content is provided directly as strings, while image content must be base64 encoded with an appropriate MIME type.
 
+### Model Preferences
+
+Servers can express preferences for model selection using the `ModelPreferences` object. This allows servers to indicate priorities for factors like cost, speed, and intelligence, as well as provide hints for specific models.
+
 ## Use Cases
 
 Common use cases for sampling include generating responses in chat interfaces, code completion, and content generation. Here are some example sampling scenarios:
 
 ### Chat Response
 
-A server requesting a chat response:
+A server requesting a chat response with model preferences:
 
 ```json
 {
@@ -55,7 +59,16 @@ A server requesting a chat response:
     }
   ],
   "maxTokens": 100,
-  "temperature": 0.7
+  "temperature": 0.7,
+  "modelPreferences": {
+    "hints": [
+      {
+        "name": "claude-3-sonnet"
+      }
+    ],
+    "intelligencePriority": 0.8,
+    "speedPriority": 0.5
+  }
 }
 ```
 
@@ -82,7 +95,18 @@ A server requesting analysis of an image:
       }
     }
   ],
-  "maxTokens": 200
+  "maxTokens": 200,
+  "modelPreferences": {
+    "hints": [
+      {
+        "name": "claude-3-opus"
+      },
+      {
+        "name": "claude-3-sonnet"
+      }
+    ],
+    "intelligencePriority": 1.0
+  }
 }
 ```
 
@@ -126,6 +150,7 @@ To request sampling from an LLM via the client, the server MUST send a `sampling
 Method: `sampling/createMessage`
 Params:
 - `messages`: Array of `SamplingMessage` objects representing the conversation history
+- `modelPreferences`: Optional `ModelPreferences` object to guide model selection
 - `systemPrompt`: Optional system prompt to use
 - `includeContext`: Optional request to include context from MCP servers
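
Review note: to make the new `modelPreferences` parameter concrete, here is a minimal sketch of how a server might assemble the params shown in the examples above. The local types are simplified stand-ins that mirror only the fields visible in this diff, and `validatePreferences` and `buildSamplingParams` are hypothetical helper names, not part of the spec or of any SDK. The range check follows the `minimum: 0` / `maximum: 1` bounds that the schema places on each priority.

```typescript
// Hypothetical local types mirroring the fields introduced in this PR.
interface ModelPreferences {
  hints?: Record<string, string>[];
  costPriority?: number;
  speedPriority?: number;
  intelligencePriority?: number;
}

// Simplified assumption: text-only messages (the spec also allows images).
interface SamplingMessage {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
}

interface SamplingParams {
  messages: SamplingMessage[];
  maxTokens: number;
  modelPreferences?: ModelPreferences;
}

// Returns a list of problems; an empty list means every priority is in [0, 1].
function validatePreferences(prefs: ModelPreferences): string[] {
  const problems: string[] = [];
  const priorities: [string, number | undefined][] = [
    ["costPriority", prefs.costPriority],
    ["speedPriority", prefs.speedPriority],
    ["intelligencePriority", prefs.intelligencePriority],
  ];
  for (const [name, value] of priorities) {
    if (value === undefined) continue; // every priority is optional
    if (!Number.isFinite(value) || value < 0 || value > 1) {
      problems.push(`${name} must be between 0 and 1, got ${value}`);
    }
  }
  return problems;
}

// Builds the params object, refusing preferences that fall outside the schema bounds.
function buildSamplingParams(
  messages: SamplingMessage[],
  maxTokens: number,
  prefs?: ModelPreferences,
): SamplingParams {
  if (prefs) {
    const problems = validatePreferences(prefs);
    if (problems.length > 0) throw new RangeError(problems.join("; "));
  }
  return prefs
    ? { messages, maxTokens, modelPreferences: prefs }
    : { messages, maxTokens };
}

const params = buildSamplingParams(
  [{ role: "user", content: { type: "text", text: "Hello" } }],
  100,
  { hints: [{ name: "claude-3-sonnet" }], intelligencePriority: 0.8, speedPriority: 0.5 },
);
console.log(JSON.stringify(params.modelPreferences));
```

Validating on the server side is a design choice of this sketch; since the preferences are advisory, a client may still ignore them even when they are well formed.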
schema/schema.json
+40 -8 (40 additions & 8 deletions)
@@ -290,6 +290,10 @@
                     "properties": {},
                     "type": "object"
                 },
+                "modelPreferences": {
+                    "$ref": "#/definitions/ModelPreferences",
+                    "description": "The server's preferences for which model to select. The client MAY ignore these preferences."
+                },
                 "stopSequences": {
                     "items": {
                         "type": "string"
@@ -347,20 +351,14 @@
                     "type": "string"
                 },
                 "stopReason": {
-                    "description": "The reason why sampling stopped.",
-                    "enum": [
-                        "endTurn",
-                        "maxTokens",
-                        "stopSequence"
-                    ],
+                    "description": "The reason why sampling stopped, if known.",
                     "type": "string"
                 }
             },
             "required": [
                 "content",
                 "model",
-                "role",
-                "stopReason"
+                "role"
             ],
             "type": "object"
         },
@@ -971,6 +969,37 @@
             ],
             "type": "object"
         },
+        "ModelPreferences": {
+            "description": "The server's preferences for model selection, requested of the client during sampling.\n\nBecause LLMs can vary along multiple dimensions, choosing the \"best\" model is\nrarely straightforward. Different models excel in different areas—some are\nfaster but less capable, others are more capable but more expensive, and so\non. This interface allows servers to express their priorities across multiple\ndimensions to help clients make an appropriate selection for their use case.\n\nThese preferences are always advisory. The client MAY ignore them. It is also\nup to the client to decide how to interpret these preferences and how to\nbalance them against other considerations.",
+            "properties": {
+                "costPriority": {
+                    "description": "How much to prioritize cost when selecting a model. A value of 0 means cost\nis not important, while a value of 1 means cost is the most important\nfactor.",
+                    "maximum": 1,
+                    "minimum": 0,
+                    "type": "number"
+                },
+                "hints": {
+                    "description": "Optional string hints to use for model selection. How these hints are\ninterpreted depends on the key(s) in each record:\n\n- If the record contains a `name` key:\n - The client SHOULD treat this as a substring of a model name; for example:\n - `claude-3-5-sonnet` should match `claude-3-5-sonnet-20241022`\n - `sonnet` should match `claude-3-5-sonnet-20241022`, `claude-3-sonnet-20240229`, etc.\n - `claude` should match any Claude model\n - The client MAY also map the string to a different provider's model name or a different model family, as long as it fills a similar niche; for example:\n - `gemini-1.5-flash` could match `claude-3-haiku-20240307`\n\nAll other keys are currently left unspecified by the spec and are up to the\nclient to interpret.\n\nIf multiple hints are specified, the client MUST evaluate them in order\n(such that the first match is taken).\n\nThe client SHOULD prioritize these hints over the numeric priorities, but\nMAY still use the priorities to select from ambiguous matches.",
+                    "items": {
+                        "$ref": "#/definitions/Record<string,string>"
+                    },
+                    "type": "array"
+                },
+                "intelligencePriority": {
+                    "description": "How much to prioritize intelligence and capabilities when selecting a\nmodel. A value of 0 means intelligence is not important, while a value of 1\nmeans intelligence is the most important factor.",
+                    "maximum": 1,
+                    "minimum": 0,
+                    "type": "number"
+                },
+                "speedPriority": {
+                    "description": "How much to prioritize sampling speed (latency) when selecting a model. A\nvalue of 0 means speed is not important, while a value of 1 means speed is\nthe most important factor.",

+ * The server's preferences for model selection, requested of the client during sampling.
+ *
+ * Because LLMs can vary along multiple dimensions, choosing the "best" model is
+ * rarely straightforward. Different models excel in different areas—some are
+ * faster but less capable, others are more capable but more expensive, and so
+ * on. This interface allows servers to express their priorities across multiple
+ * dimensions to help clients make an appropriate selection for their use case.
+ *
+ * These preferences are always advisory. The client MAY ignore them. It is also
+ * up to the client to decide how to interpret these preferences and how to
+ * balance them against other considerations.
+ */
+export interface ModelPreferences {
+  /**
+   * Optional string hints to use for model selection. How these hints are
+   * interpreted depends on the key(s) in each record:
+   *
+   * - If the record contains a `name` key:
+   *   - The client SHOULD treat this as a substring of a model name; for example:
+   *     - `claude-3-5-sonnet` should match `claude-3-5-sonnet-20241022`
+   *     - `sonnet` should match `claude-3-5-sonnet-20241022`, `claude-3-sonnet-20240229`, etc.
+   *     - `claude` should match any Claude model
+   *   - The client MAY also map the string to a different provider's model name or a different model family, as long as it fills a similar niche; for example:
+   *     - `gemini-1.5-flash` could match `claude-3-haiku-20240307`
+   *
+   * All other keys are currently left unspecified by the spec and are up to the
+   * client to interpret.
+   *
+   * If multiple hints are specified, the client MUST evaluate them in order
+   * (such that the first match is taken).
+   *
+   * The client SHOULD prioritize these hints over the numeric priorities, but
+   * MAY still use the priorities to select from ambiguous matches.
+   */
+  hints?: Record<"name" | string, string>[];
+
+  /**
+   * How much to prioritize cost when selecting a model. A value of 0 means cost
+   * is not important, while a value of 1 means cost is the most important
+   * factor.
+   *
+   * @TJS-type number
+   * @minimum 0
+   * @maximum 1
+   */
+  costPriority?: number;
+
+  /**
+   * How much to prioritize sampling speed (latency) when selecting a model. A
+   * value of 0 means speed is not important, while a value of 1 means speed is
+   * the most important factor.
+   *
+   * @TJS-type number
+   * @minimum 0
+   * @maximum 1
+   */
+  speedPriority?: number;
+
+  /**
+   * How much to prioritize intelligence and capabilities when selecting a
+   * model. A value of 0 means intelligence is not important, while a value of 1
+   * means intelligence is the most important factor.
+   *
+   * @TJS-type number
+   * @minimum 0
+   * @maximum 1
+   */
+  intelligencePriority?: number;
+}
+
 /* Autocomplete */
 /**
  * A request from the client to the server, to ask for completion options.
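
Review note: the `hints` doc comment describes an ordered, substring-based lookup, so a small sketch of one possible client-side reading may help. It assumes a case-sensitive substring test and a client that knows its models as a flat list of names. `ModelHint` and `resolveModelFromHints` are hypothetical names, and the cross-provider mapping that the spec permits (MAY) is deliberately left out.

```typescript
// Hypothetical shape: a hint is a string-to-string record, as in the schema.
type ModelHint = Record<string, string>;

// Walks the hints in order and returns the first available model whose name
// contains the hint's `name` value; returns undefined when nothing matches.
function resolveModelFromHints(
  hints: ModelHint[],
  availableModels: string[],
): string | undefined {
  for (const hint of hints) {
    const wanted = hint["name"];
    if (wanted === undefined) continue; // other keys are unspecified; skipped here
    const match = availableModels.find((model) => model.includes(wanted));
    if (match !== undefined) return match; // the first hint that matches wins
  }
  return undefined;
}

// Model names taken from the examples in the doc comment.
const available = [
  "claude-3-haiku-20240307",
  "claude-3-5-sonnet-20241022",
  "claude-3-sonnet-20240229",
];

// "claude-3-opus" matches nothing here, so the second hint is tried.
const chosen = resolveModelFromHints(
  [{ name: "claude-3-opus" }, { name: "sonnet" }],
  available,
);
console.log(chosen); // prints claude-3-5-sonnet-20241022
```

When a single hint matches several models, this sketch simply takes the first one in list order. A client could instead use the numeric priorities to choose among those ambiguous matches, which the spec allows.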