3 files changed, +10 -2 lines changed
lines changed Original file line number Diff line number Diff line change @@ -179,6 +179,7 @@ PUT _inference/text_embedding/my-e5-model
179179 "min_number_of_allocations": 3,
180180 "max_number_of_allocations": 10
181181 },
182+ "num_threads": 1,
182183 "model_id": ".multilingual-e5-small"
183184 }
184185}
@@ -147,7 +147,8 @@ PUT _inference/sparse_embedding/my-elser-model
       "enabled": true,
       "min_number_of_allocations": 3,
       "max_number_of_allocations": 10
-    }
+    },
+    "num_threads": 1
   }
 }
 ------------------------------------------------------------
@@ -36,7 +36,11 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
 {
   "service": "elser", <2>
   "service_settings": {
-    "num_allocations": 1,
+    "adaptive_allocations": { <3>
+      "enabled": true,
+      "min_number_of_allocations": 3,
+      "max_number_of_allocations": 10
+    },
     "num_threads": 1
   }
 }
@@ -46,6 +50,8 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
 be used and ELSER creates sparse vectors. The `inference_id` is
 `my-elser-endpoint`.
 <2> The `elser` service is used in this example.
+<3> This setting enables and configures {ml-docs}/ml-nlp-elser.html#elser-adaptive-allocations[adaptive allocations].
+Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
 
 [NOTE]
 ====
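
For reference, the request body that results from the last hunk can be assembled programmatically. The following is a minimal sketch in Python, not part of the change itself: the helper name `build_elser_endpoint_body` is hypothetical, and the `min <= max` check is an assumption made for this sketch rather than documented server behavior. The field names and values are taken from the diff above.

```python
import json


def build_elser_endpoint_body(min_allocations: int,
                              max_allocations: int,
                              num_threads: int = 1) -> dict:
    """Hypothetical helper: build the `elser` service body shown in the diff."""
    # Sanity check chosen for this sketch only (an assumption, not taken
    # from the documentation being changed).
    if min_allocations > max_allocations:
        raise ValueError("min_allocations must not exceed max_allocations")
    return {
        "service": "elser",
        "service_settings": {
            # Replaces the fixed `num_allocations` setting removed by the diff.
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            "num_threads": num_threads,
        },
    }


body = build_elser_endpoint_body(3, 10)
print("PUT _inference/sparse_embedding/my-elser-endpoint")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into a console request; the sketch only builds the body and does not send anything to a cluster.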