Unable to test openai/gpt-oss-120b via vllm #281

@psydok

Description

Describe the bug
guidellm fails with KeyError: 'content' when benchmarking openai/gpt-oss-120b served via vLLM. Some streamed deltas do not include a content field, and _extract_completions_delta_content indexes it directly (see the traceback below).

Expected behavior
Could guidellm skip streamed chunks whose delta does not include content, instead of raising a KeyError?
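
A minimal sketch of the behavior I have in mind, assuming the fix belongs in the delta-extraction step shown in the traceback below. The standalone helper name and the chunk shapes here are illustrative, not guidellm's actual API:

```python
from typing import Optional


def extract_delta_content(chunk: dict) -> Optional[str]:
    """Return the streamed delta's text, or None when the chunk carries none.

    Some servers (e.g. vLLM serving gpt-oss) emit chunks whose delta only
    holds a role or reasoning fields, and a trailing usage-only chunk when
    stream_options.include_usage is set, so 'content' can be absent.
    """
    choices = chunk.get("choices") or []
    if not choices:
        return None  # e.g. the trailing usage-only chunk
    delta = choices[0].get("delta") or {}
    return delta.get("content")  # None -> caller skips this chunk


# Chunks like these would be skipped instead of raising KeyError:
extract_delta_content({"choices": [{"delta": {"role": "assistant"}}]})  # None
extract_delta_content({"choices": [], "usage": {"total_tokens": 513}})  # None
extract_delta_content({"choices": [{"delta": {"content": "Hi"}}]})      # "Hi"
```

With a .get() lookup like this, the existing "if delta := self._extract_completions_delta_content(type_, data):" check at openai.py line 623 in the traceback would simply skip such chunks.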

Environment
Include all relevant environment information:

  1. OS [e.g. Ubuntu 20.04]: -
  2. Python version [e.g. 3.12.2]: 3.10

To Reproduce
Exact steps to reproduce the behavior:

  1. Run vllm on H100
vastai create instance <OFFER_ID> --image vllm/vllm-openai:gptoss --env '-p 8000:8000 --ipc=host --gpus all' --disk 8 --args --model "openai/gpt-oss-120b" --tensor-parallel-size 1
  2. Run guidellm
export GUIDELLM__PREFERRED_ROUTE="chat_completions" && export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=512 && export GUIDELLM__MAX_CONCURRENCY=233 && export GUIDELLM__REQUEST_TIMEOUT=300 && export GUIDELLM__PREFERRED_PROMPT_TOKENS_SOURCE=local && export GUIDELLM__PREFERRED_OUTPUT_TOKENS_SOURCE=local && guidellm benchmark --target http://89.25.97.3:11509 --rate-type sweep --rate 5 --model openai/gpt-oss-120b --processor openai/gpt-oss-120b --random-seed 2025 --max-requests=100 --data "prompt_tokens=4096,output_tokens=512" --backend-args '{"extra_body":{"chat_template_kwargs":{"enable_thinking":false}}}' --output-path "data/benchmarks.json"

Errors

Creating backend...
2025-08-15T09:46:20.216941+0000 | chat_completions | ERROR - OpenAIHTTPBackend request with headers: {'Content-Type': 'application/json'} and params: {} and payload: {'chat_template_kwargs': {'enable_thinking': False}, 'messages': [{'role': 'user', 'content': 'Test connection'}], 'model': 'openai/gpt-oss-120b', 'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 1, 'max_completion_tokens': 1, 'stop': None, 'ignore_eos': True} failed: 'content'
Traceback (most recent call last):
  File "/home/user/.local/bin/guidellm", line 7, in <module>
    sys.exit(cli())
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/__main__.py", line 314, in run
    asyncio.run(
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/benchmark/entrypoints.py", line 29, in benchmark_with_scenario
    return await benchmark_generative_text(**vars(scenario), **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/benchmark/entrypoints.py", line 71, in benchmark_generative_text
    await backend.validate()
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/backend.py", line 138, in validate
    async for _ in self.chat_completions(  # type: ignore[attr-defined]
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 393, in chat_completions
    raise ex
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 374, in chat_completions
    async for resp in self._iterative_completions_request(
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 623, in _iterative_completions_request
    if delta := self._extract_completions_delta_content(type_, data):
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 691, in _extract_completions_delta_content
    return data["choices"][0]["delta"]["content"]
KeyError: 'content'

