Commit e5c7cd0

Merge pull request #6 from sammcj/docker
feat(docker,auth): Add Docker, Compose, Auth, Parameter envs
2 parents: 04ee742 + a8bc862

4 files changed: +285 additions, −69 deletions


Dockerfile

Lines changed: 47 additions & 0 deletions
```dockerfile
# Build stage
FROM python:3.12-slim AS builder

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc libc6-dev \
    && rm -rf /var/lib/apt/lists/*

# Copy only the requirements file first to leverage Docker cache
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Final stage
FROM python:3.12-slim

# Install curl for the healthcheck
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Copy installed dependencies from builder stage
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

# Copy application code
COPY . .

# Create a non-root user and switch to it
RUN useradd -m appuser
USER appuser

# Set environment variables
ENV PYTHONUNBUFFERED=1

# Expose the port the app runs on
EXPOSE 8000

# Run the application
ENTRYPOINT ["python", "optillm.py"]
```

README.md

Lines changed: 93 additions & 26 deletions
````diff
@@ -6,15 +6,16 @@ optillm is an OpenAI API compatible optimizing inference proxy which implements
 
 ### plansearch-gpt-4o-mini on LiveCodeBench (Sep 2024)
 
-| Model | pass@1 | pass@5 | pass@10 |
-|-------|--------|--------|---------|
-| plansearch-gpt-4o-mini | 44.03 | 59.31 | 63.5 |
-| gpt-4o-mini | 43.9 | 50.61 | 53.25 |
-| claude-3.5-sonnet | 51.3 | | |
-| gpt-4o-2024-05-13 | 45.2 | | |
-| gpt-4-turbo-2024-04-09 | 44.2 | | |
+| Model                  | pass@1 | pass@5 | pass@10 |
+| ---------------------- | ------ | ------ | ------- |
+| plansearch-gpt-4o-mini | 44.03  | 59.31  | 63.5    |
+| gpt-4o-mini            | 43.9   | 50.61  | 53.25   |
+| claude-3.5-sonnet      | 51.3   |        |         |
+| gpt-4o-2024-05-13      | 45.2   |        |         |
+| gpt-4-turbo-2024-04-09 | 44.2   |        |         |
 
 ### moa-gpt-4o-mini on Arena-Hard-Auto (Aug 2024)
+
 ![Results showing Mixture of Agents approach using gpt-4o-mini on Arena Hard Auto Benchmark](./moa-results.png)
 
 ## Installation
@@ -32,7 +33,7 @@ pip install -r requirements.txt
 You can then run the optillm proxy as follows.
 
 ```bash
-python optillm.py 
+python optillm.py
 2024-09-06 07:57:14,191 - INFO - Starting server with approach: auto
 2024-09-06 07:57:14,191 - INFO - Server configuration: {'approach': 'auto', 'mcts_simulations': 2, 'mcts_exploration': 0.2, 'mcts_depth': 1, 'best_of_n': 3, 'model': 'gpt-4o-mini', 'rstar_max_depth': 3, 'rstar_num_rollouts': 5, 'rstar_c': 1.4, 'base_url': ''}
 * Serving Flask app 'optillm'
@@ -44,11 +45,11 @@ python optillm.py
 2024-09-06 07:57:14,212 - INFO - Press CTRL+C to quit
 ```
 
-### Usage
+## Usage
 
-Once the proxy is running, you can just use it as a drop in replacement for an OpenAI client by setting the `base_url` as `http://localhost:8000/v1`.
+Once the proxy is running, you can use it as a drop in replacement for an OpenAI client by setting the `base_url` as `http://localhost:8000/v1`.
 
-```bash
+```python
 import os
 from openai import OpenAI
 
@@ -70,7 +71,7 @@ response = client.chat.completions.create(
 print(response)
 ```
 
-You can control the technique you use for optimization by prepending the slug to the model name `{slug}-model-name`. E.g. in the above code we are using `moa` or 
+You can control the technique you use for optimization by prepending the slug to the model name `{slug}-model-name`. E.g. in the above code we are using `moa` or
 mixture of agents as the optimization approach. In the proxy logs you will see the following showing the `moa` is been used with the base model as `gpt-4o-mini`.
 
 ```bash
@@ -83,20 +84,86 @@ mixture of agents as the optimization approach. In the proxy logs you will see t
 
 ## Implemented techniques
 
-| Technique | Slug | Description |
-|-----------|----------------|-------------|
-| Agent | `agent ` | Determines which of the below approaches to take and then combines the results |
-| Monte Carlo Tree Search | `mcts` | Uses MCTS for decision-making in chat responses |
-| Best of N Sampling | `bon` | Generates multiple responses and selects the best one |
-| Mixture of Agents | `moa` | Combines responses from multiple critiques |
-| Round Trip Optimization | `rto` | Optimizes responses through a round-trip process |
-| Z3 Solver | `z3` | Utilizes the Z3 theorem prover for logical reasoning |
-| Self-Consistency | `self_consistency` | Implements an advanced self-consistency method |
-| PV Game | `pvg` | Applies a prover-verifier game approach at inference time |
-| R* Algorithm | `rstar` | Implements the R* algorithm for problem-solving |
-| CoT with Reflection | `cot_reflection` | Implements chain-of-thought reasoning with \<thinking\>, \<reflection> and \<output\> sections |
-| PlanSearch | `plansearch` | Implements a search algorithm over candidate plans for solving a problem in natural language |
-| LEAP | `leap` | Learns task-specific principles from few shot examples |
+| Technique               | Slug               | Description                                                                                    |
+| ----------------------- | ------------------ | ---------------------------------------------------------------------------------------------- |
+| Agent                   | `agent`            | Determines which of the below approaches to take and then combines the results                 |
+| Monte Carlo Tree Search | `mcts`             | Uses MCTS for decision-making in chat responses                                                |
+| Best of N Sampling      | `bon`              | Generates multiple responses and selects the best one                                          |
+| Mixture of Agents       | `moa`              | Combines responses from multiple critiques                                                     |
+| Round Trip Optimization | `rto`              | Optimizes responses through a round-trip process                                               |
+| Z3 Solver               | `z3`               | Utilizes the Z3 theorem prover for logical reasoning                                           |
+| Self-Consistency        | `self_consistency` | Implements an advanced self-consistency method                                                 |
+| PV Game                 | `pvg`              | Applies a prover-verifier game approach at inference time                                      |
+| R* Algorithm            | `rstar`            | Implements the R* algorithm for problem-solving                                                |
+| CoT with Reflection     | `cot_reflection`   | Implements chain-of-thought reasoning with \<thinking\>, \<reflection> and \<output\> sections |
+| PlanSearch              | `plansearch`       | Implements a search algorithm over candidate plans for solving a problem in natural language   |
+| LEAP                    | `leap`             | Learns task-specific principles from few shot examples                                         |
+
+## Available Parameters
+
+optillm supports various command-line arguments and environment variables for configuration.
+
+| Parameter                | Description                                                     | Default Value   |
+|--------------------------|-----------------------------------------------------------------|-----------------|
+| `--approach`             | Inference approach to use                                       | `"auto"`        |
+| `--simulations`          | Number of MCTS simulations                                      | 2               |
+| `--exploration`          | Exploration weight for MCTS                                     | 0.2             |
+| `--depth`                | Simulation depth for MCTS                                       | 1               |
+| `--best-of-n`            | Number of samples for best_of_n approach                        | 3               |
+| `--model`                | OpenAI model to use                                             | `"gpt-4o-mini"` |
+| `--base-url`             | Base URL for OpenAI compatible endpoint                         | `""`            |
+| `--rstar-max-depth`      | Maximum depth for rStar algorithm                               | 3               |
+| `--rstar-num-rollouts`   | Number of rollouts for rStar algorithm                          | 5               |
+| `--rstar-c`              | Exploration constant for rStar algorithm                        | 1.4             |
+| `--n`                    | Number of final responses to be returned                        | 1               |
+| `--return-full-response` | Return the full response including the CoT with <thinking> tags | `False`         |
+| `--port`                 | Specify the port to run the proxy                               | 8000            |
+| `--api-key`              | Optional API key for client authentication to optillm           | `""`            |
+
+When using Docker, these can be set as environment variables prefixed with `OPTILLM_`.
+
+## Running with Docker
+
+optillm can optionally be built and run using Docker and the provided [Dockerfile](./Dockerfile).
+
+### Using Docker Compose
+
+1. Make sure you have Docker and Docker Compose installed on your system.
+
+2. Either update the environment variables in the docker-compose.yaml file or create a `.env` file in the project root directory and add any environment variables you want to set. For example, to set the OpenAI API key, add the following line to the `.env` file:
+
+   ```bash
+   OPENAI_API_KEY=your_openai_api_key_here
+   ```
+
+3. Run the following command to start optillm:
+
+   ```bash
+   docker compose up -d
+   ```
+
+   This will build the Docker image if it doesn't exist and start the optillm service.
+
+4. optillm will be available at `http://localhost:8000`.
+
+When using Docker, you can set these parameters as environment variables. For example, to set the approach and model, you would use:
+
+```bash
+OPTILLM_APPROACH=mcts
+OPTILLM_MODEL=gpt-4
+```
+
+To secure the optillm proxy with an API key, set the `OPTILLM_API_KEY` environment variable:
+
+```bash
+OPTILLM_API_KEY=your_secret_api_key
+```
+
+When the API key is set, clients must include it in their requests using the `Authorization` header:
+
+```plain
+Authorization: Bearer your_secret_api_key
+```
 
 ## References
 
````
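The README hunk above introduces client authentication: when `OPTILLM_API_KEY` is set on the proxy, clients must send a Bearer token in the `Authorization` header. As a minimal sketch of that header (the helper name below is illustrative, not part of optillm itself):

```python
# Hypothetical helper showing the header format the README documents when
# OPTILLM_API_KEY is set on the proxy; optillm does not ship this function.
def auth_headers(api_key: str) -> dict:
    """Build the Bearer-token Authorization header for the optillm proxy."""
    return {"Authorization": f"Bearer {api_key}"}

print(auth_headers("your_secret_api_key")["Authorization"])
# -> Bearer your_secret_api_key
```

Note that the official OpenAI client sends its `api_key` as a Bearer token automatically, so passing the proxy key as `api_key` would produce the same header.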

docker-compose.yaml

Lines changed: 38 additions & 0 deletions
```yaml
services:
  &name optillm:
    build:
      context: https://github.com/codelion/optillm.git#main
      # context: .
      dockerfile: Dockerfile
      tags:
        - optillm:latest
    image: optillm:latest
    container_name: *name
    hostname: *name
    ports:
      - "8000:8000"
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY:-""}
      OPTILLM_BASE_URL: ${OPENAI_BASE_URL:-"https://api.openai.com/v1"}
      # OPTILLM_API_KEY: ${OPTILLM_API_KEY:-} # optionally sets an API key for Optillm clients
      # Uncomment and set values for other arguments (prefixed with OPTILLM_) as needed, e.g.:
      # OPTILLM_APPROACH: auto
      # OPTILLM_MODEL: gpt-4o-mini
      # OPTILLM_SIMULATIONS: 2
      # OPTILLM_EXPLORATION: 0.2
      # OPTILLM_DEPTH: 1
      # OPTILLM_BEST_OF_N: 3
      # OPTILLM_RSTAR_MAX_DEPTH: 3
      # OPTILLM_RSTAR_NUM_ROLLOUTS: 5
      # OPTILLM_RSTAR_C: 1.4
      # OPTILLM_N: 1
      # OPTILLM_RETURN_FULL_RESPONSE: false
      # OPTILLM_PORT: 8000
    restart: on-failure
    stop_grace_period: 2s
    healthcheck:
      test: ["CMD", "curl", "-f", "http://127.0.0.1:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 3s
```
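The compose file configures optillm through `OPTILLM_`-prefixed environment variables that mirror the CLI flags (e.g. `--best-of-n` becomes `OPTILLM_BEST_OF_N`). A minimal sketch of that naming convention, assuming the straightforward flag-to-env mapping implied by the diff (the helper below is illustrative, not optillm's actual code):

```python
def flag_to_env(flag: str) -> str:
    """Map a CLI flag like "--rstar-max-depth" to its Docker environment
    variable name, "OPTILLM_RSTAR_MAX_DEPTH" (illustrative helper)."""
    return "OPTILLM_" + flag.lstrip("-").replace("-", "_").upper()

print(flag_to_env("--best-of-n"))  # -> OPTILLM_BEST_OF_N
```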
