From 2e4f4a6d26286292f742c8a84931a501f06e5d73 Mon Sep 17 00:00:00 2001
From: virtualramblas
Date: Thu, 19 Sep 2024 11:59:12 +0100
Subject: [PATCH] Added note to the Readme file about the correct way to set the model attribute in a client request.

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 85813e0c..215877b3 100644
--- a/README.md
+++ b/README.md
@@ -79,7 +79,7 @@ print(response)
 ```
 The code above applies to both OpenAI and Azure OpenAI, just remember to populate the `OPENAI_API_KEY` env variable with the proper key.
 You can control the technique you use for optimization by prepending the slug to the model name `{slug}-model-name`. E.g. in the above code we are using `moa` or
-mixture of agents as the optimization approach. In the proxy logs you will see the following showing the `moa` is been used with the base model as `gpt-4o-mini`. 
+mixture of agents as the optimization approach. In the proxy logs you will see the following showing the `moa` is been used with the base model as `gpt-4o-mini`.
 
 ```bash
 2024-09-06 08:35:32,597 - INFO - Using approach moa, with gpt-4o-mini
@@ -89,6 +89,8 @@ mixture of agents as the optimization approach. In the proxy logs you will see t
 2024-09-06 08:35:44,797 - INFO - 127.0.0.1 - - [06/Sep/2024 08:35:44] "POST /v1/chat/completions HTTP/1.1" 200 -
 ```
 
+Please note that the naming convention described above for the `model` attribute works only when the optillm server has been started with inference approach set to `auto`. Otherwise, the `model` attribute in the client request must be set with the model name only.
+
 optillm is a transparent proxy and will work with any LLM API or provider that has an OpenAI API compatible chat completions endpoint, and in turn, optillm also exposes the same OpenAI API comptaible chat completions endpoint. This should allow you to integrate it into any existing tools or frameworks easily.
If the LLM you want to use doesn't have an OpenAI API comptaible endpoint (like Google or Anthropic) you can use [LiteLLM proxy server](https://docs.litellm.ai/docs/proxy/quick_start) that supports most LLMs.
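
The note this patch adds about the `model` attribute can be illustrated with a minimal sketch. The helper `build_model_name` and its `server_approach` parameter are hypothetical names introduced here for illustration and are not part of optillm. The sketch only assumes what the README states: the `{slug}-model-name` form applies when the server was started with the inference approach set to `auto`, and the plain model name is used otherwise.

```python
def build_model_name(base_model: str, slug: str, server_approach: str = "auto") -> str:
    """Return the value to send as `model` in the client request.

    `server_approach` is an assumed stand-in for however the optillm
    server was started; it is not read from the server itself.
    """
    if server_approach == "auto":
        # README convention: prepend the technique slug to the model name.
        return f"{slug}-{base_model}"
    # Server pinned to a fixed approach: send the model name only.
    return base_model


# Server started with approach `auto`: the slug selects the technique.
print(build_model_name("gpt-4o-mini", "moa"))  # moa-gpt-4o-mini

# Server started with a fixed approach: the slug is left out.
print(build_model_name("gpt-4o-mini", "moa", server_approach="moa"))  # gpt-4o-mini
```

The returned string would then be passed as the `model` argument of the chat completions request shown earlier in the README.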