Unable to test openai/gpt-oss-120b via vllm #281

@psydok

Description

Describe the bug
guidellm fails with KeyError: 'content' when benchmarking openai/gpt-oss-120b served via vLLM. Some streamed deltas do not include a content field, and _extract_completions_delta_content indexes it directly (see the traceback below).

Expected behavior
Could guidellm skip streamed chunks whose delta does not include content, instead of raising a KeyError?
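
A minimal sketch of the behavior I have in mind, assuming the fix belongs in the delta-extraction step shown in the traceback below. The standalone helper name and the chunk shapes here are illustrative, not guidellm's actual API:

```python
from typing import Optional


def extract_delta_content(chunk: dict) -> Optional[str]:
    """Return the streamed delta's text, or None when the chunk carries none.

    Some servers (e.g. vLLM serving gpt-oss) emit chunks whose delta only
    holds a role or reasoning fields, and a trailing usage-only chunk when
    stream_options.include_usage is set, so 'content' can be absent.
    """
    choices = chunk.get("choices") or []
    if not choices:
        return None  # e.g. the trailing usage-only chunk
    delta = choices[0].get("delta") or {}
    return delta.get("content")  # None -> caller skips this chunk


# Chunks like these would be skipped instead of raising KeyError:
extract_delta_content({"choices": [{"delta": {"role": "assistant"}}]})  # None
extract_delta_content({"choices": [], "usage": {"total_tokens": 513}})  # None
extract_delta_content({"choices": [{"delta": {"content": "Hi"}}]})      # "Hi"
```

With a .get() lookup like this, the existing "if delta := self._extract_completions_delta_content(type_, data):" check at openai.py line 623 in the traceback would simply skip such chunks.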

Environment
Include all relevant environment information:

  1. OS [e.g. Ubuntu 20.04]: -
  2. Python version [e.g. 3.12.2]: 3.10

To Reproduce
Exact steps to reproduce the behavior:

  1. Run vllm on H100
vastai create instance <OFFER_ID> --image vllm/vllm-openai:gptoss --env '-p 8000:8000 --ipc=host --gpus all' --disk 8 --args --model "openai/gpt-oss-120b" --tensor-parallel-size 1
  2. Run guidellm
export GUIDELLM__PREFERRED_ROUTE="chat_completions" && export GUIDELLM__OPENAI__MAX_OUTPUT_TOKENS=512 && export GUIDELLM__MAX_CONCURRENCY=233 && export GUIDELLM__REQUEST_TIMEOUT=300 && export GUIDELLM__PREFERRED_PROMPT_TOKENS_SOURCE=local && export GUIDELLM__PREFERRED_OUTPUT_TOKENS_SOURCE=local && guidellm benchmark --target http://89.25.97.3:11509 --rate-type sweep --rate 5 --model openai/gpt-oss-120b --processor openai/gpt-oss-120b --random-seed 2025 --max-requests=100 --data "prompt_tokens=4096,output_tokens=512" --backend-args '{"extra_body":{"chat_template_kwargs":{"enable_thinking":false}}}' --output-path "data/benchmarks.json"

Errors

Creating backend...
2025-08-15T09:46:20.216941+0000 | chat_completions | ERROR - OpenAIHTTPBackend request with headers: {'Content-Type': 'application/json'} and params: {} and payload: {'chat_template_kwargs': {'enable_thinking': False}, 'messages': [{'role': 'user', 'content': 'Test connection'}], 'model': 'openai/gpt-oss-120b', 'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 1, 'max_completion_tokens': 1, 'stop': None, 'ignore_eos': True} failed: 'content'
Traceback (most recent call last):
  File "/home/user/.local/bin/guidellm", line 7, in <module>
    sys.exit(cli())
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/.local/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/__main__.py", line 314, in run
    asyncio.run(
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/benchmark/entrypoints.py", line 29, in benchmark_with_scenario
    return await benchmark_generative_text(**vars(scenario), **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/benchmark/entrypoints.py", line 71, in benchmark_generative_text
    await backend.validate()
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/backend.py", line 138, in validate
    async for _ in self.chat_completions(  # type: ignore[attr-defined]
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 393, in chat_completions
    raise ex
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 374, in chat_completions
    async for resp in self._iterative_completions_request(
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 623, in _iterative_completions_request
    if delta := self._extract_completions_delta_content(type_, data):
  File "/home/user/.local/lib/python3.10/site-packages/guidellm/backend/openai.py", line 691, in _extract_completions_delta_content
    return data["choices"][0]["delta"]["content"]
KeyError: 'content'

