Hi,
I'm playing around with the temperature property when calling a model from the /chat/completions API, but I can't figure out how to get any variance in the responses. I have the temperature set to 0.8.
Here's how I start the server:
./llamafile-server-0.4 -m models/starling-lm-7b-alpha.Q4_K_M.gguf --nobrowser
And this is the way I call the API:
curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "gpt-3.5-turbo", "temperature": 0.8,
"messages": [
{
"role": "system",
"content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
},
{
"role": "user",
"content": "Compose a poem that explains the concept of recursion in programming. A maximum of 5 lines"
}
]
}' 2>/dev/null | jq -r '.choices[].message.content'
And below is a video showing the same output 3 times.
[Video attachment: 2023-12-17_10-55-15.mp4]
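For reference, here is a small shell loop (assuming the same server on localhost:8080 and jq installed, as above) that shows the behavior without the video; each iteration prints the identical poem even though temperature is 0.8:

# Send the same request 3 times; with temperature > 0 the outputs
# would normally differ, but here they come back identical.
for i in 1 2 3; do
  curl -s http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-3.5-turbo", "temperature": 0.8,
      "messages": [
        {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming. A maximum of 5 lines"}
      ]
    }' | jq -r '.choices[].message.content'
  echo "---"
done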