Commit 967337c

Run models in parallel during benchmark (#53)
* Run models in parallel during benchmark
* Updating eval docs
* Formatting
* Unused context
* Handling race conditions
1 parent 58bfd43 commit 967337c

File tree

3 files changed: +286 -51 lines changed

docs/evals.md

Lines changed: 4 additions & 0 deletions
@@ -67,6 +67,8 @@ This installs:
 | `--azure-api-version` || Azure OpenAI API version (default: 2025-01-01-preview) |
 | `--models` || Models for benchmark mode (benchmark only) |
 | `--latency-iterations` || Latency test samples (default: 25) (benchmark only) |
+| `--max-parallel-models` || Maximum number of models to benchmark concurrently (default: max(1, min(model_count, cpu_count))) (benchmark only) |
+| `--benchmark-chunk-size` || Optional number of samples per chunk when benchmarking to limit long-running runs (benchmark only) |
 
 ## Configuration
 
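The documented default for `--max-parallel-models`, `max(1, min(model_count, cpu_count))`, can be sketched as below. This is a minimal illustration of the formula and of fanning models out to a worker pool; `run_benchmark` and the model names are hypothetical stand-ins, not the actual guardrails-evals internals.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(model: str) -> dict:
    """Hypothetical per-model benchmark; returns one result record."""
    return {"model": model, "status": "ok"}


def benchmark_models(models, max_parallel_models=None):
    # Default mirrors the documented formula: max(1, min(model_count, cpu_count)).
    # Never fewer than 1 worker; never more workers than models or CPUs.
    if max_parallel_models is None:
        cpu_count = os.cpu_count() or 1
        max_parallel_models = max(1, min(len(models), cpu_count))
    with ThreadPoolExecutor(max_workers=max_parallel_models) as pool:
        return list(pool.map(run_benchmark, models))


results = benchmark_models(["model-a", "model-b"])
```

A thread pool (rather than processes) fits here because benchmark calls are typically I/O-bound API requests, so threads can overlap waiting time without pickling overhead.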

@@ -205,6 +207,8 @@ guardrails-evals \
 - **Automatic stage detection**: Evaluates all stages found in configuration
 - **Batch processing**: Configurable parallel processing
 - **Benchmark mode**: Model performance comparison with ROC AUC, precision at recall thresholds
+- **Parallel benchmarking**: Run multiple models concurrently (defaults to CPU count)
+- **Benchmark chunking**: Process large datasets in chunks for better progress tracking
 - **Latency testing**: End-to-end guardrail performance measurement
 - **Visualization**: Automatic chart and graph generation
 - **Multi-provider support**: OpenAI, Azure OpenAI, Ollama, vLLM, and other OpenAI-compatible APIs
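The benchmark chunking described above (`--benchmark-chunk-size`) amounts to splitting the sample list into fixed-size slices so each slice can be run and reported on separately. A minimal sketch, assuming a plain list of samples; the helper name is illustrative, not the tool's API:

```python
def chunked(samples, chunk_size):
    """Yield successive chunks of at most chunk_size samples.

    The final chunk may be shorter when len(samples) is not a
    multiple of chunk_size.
    """
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


samples = list(range(10))
chunks = list(chunked(samples, 4))
# chunks -> [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Processing chunk by chunk keeps long-running benchmark runs interruptible and lets progress be reported after each chunk instead of only at the end.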
