Commit f7ad745

Removed llama-cpp-python as it still doesn't support sampling multiple responses
1 parent e703e04 commit f7ad745

File tree

1 file changed: +1 −1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ python optillm.py
 > [!WARNING]
 > Note that llama-server currently does not support sampling multiple responses from a model, which limits the available approaches to the following:
 > `cot_reflection`, `leap`, `plansearch`, `rstar`, `rto`, `self_consistency`, and `z3`.
-> In order to use other approaches, consider using an alternative compatible server such as [ollama](https://github.com/ollama/ollama) or [llama-cpp-python](https://github.com/abetlen/llama-cpp-python).
+> In order to use other approaches, consider using an alternative compatible server such as [ollama](https://github.com/ollama/ollama).

 > [!NOTE]
 > You'll later need to specify a model name in the OpenAI client configuration. Since llama-server was started with a single model, you can choose any name you want.
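For reference, a minimal sketch of the OpenAI client configuration the note above refers to. The base URL, port, and model name here are assumptions, not values taken from the README: adjust the `base_url` to wherever your optillm proxy or llama-server is listening, and pick any model name, since llama-server was started with a single model.

```python
# Hypothetical client-side configuration; base_url and model name are
# placeholders, not documented defaults.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed proxy address; adjust to your setup
    api_key="no-key-required",            # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="my-local-model",  # any name works when the server hosts a single model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```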
