
No variance in response from /chat/completions #117


Description

@mneedham

Hi,

I'm playing around with the temperature property when calling a model via the /chat/completions API, but I can't figure out how to get any variance in the responses. I have the temperature set to 0.8.

Here's how I start the server:

./llamafile-server-0.4 -m models/starling-lm-7b-alpha.Q4_K_M.gguf --nobrowser

And this is the way I call the API:

curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo", "temperature": 0.8,
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains the concept of recursion in programming. A maximum of 5 lines"
    }
  ]
}' 2>/dev/null | jq -r '.choices[].message.content'
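
A possible sanity check (a sketch, not a confirmed fix): send an explicit seed with each request. The seed field is part of the OpenAI chat completions schema, and llama.cpp-based servers seed their sampler too, but whether this llamafile build forwards seed from /v1/chat/completions is an assumption to verify. If the output varies with the seed but not with temperature alone, the server is likely running with a fixed default seed:

# Same request as above, but with a per-request random seed.
# Hypothetical pass-through: verify this llamafile build honors "seed".
curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo", "temperature": 0.8,
  "seed": '"$RANDOM"',
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains the concept of recursion in programming. A maximum of 5 lines"
    }
  ]
}' 2>/dev/null | jq -r '.choices[].message.content'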

Below is a video showing the same output three times in a row.

2023-12-17_10-55-15.mp4
