92 commits
04257d5
Update README
romainhuet Aug 5, 2025
0f03367
Try fix pypi ci (#13)
zhuohan123 Aug 5, 2025
f615ce3
fix: Correct broken links in awesome-gpt-oss.md (#12)
mkusaka Aug 5, 2025
08e50b3
Python Agents SDK Example (#14)
sumitaryal Aug 5, 2025
9e5b841
readme: fix python tool ref (#10)
hewliyang Aug 5, 2025
3e3c828
Merge pull request #16 from openai/zhuohan/fix-pypi-ci
zhuohan123 Aug 5, 2025
1e47b70
docs: Fix another extra "= messages" (#7)
mmangkad Aug 5, 2025
51bfa9e
Fix typos and grammar in README (#6)
rmscode Aug 5, 2025
9074326
Update LICENSE
dkundel-openai Aug 5, 2025
89fe402
Add comprehensive test suite for Responses API (#20)
micic-mihajlo Aug 5, 2025
8fe4ee2
Fix import for metal example (#24)
jackos Aug 5, 2025
246e377
Add some additional links to awesome-gpt-oss.md (#22)
nsburbank Aug 5, 2025
0a8f5f2
Correct small grammar issues for better comprehension (#21)
cwhitelam Aug 5, 2025
3a68b4f
fix: Correct multiple documentation URLs (#17)
mkusaka Aug 5, 2025
ba7d80a
Fix chat demo (#26)
zhuohan123 Aug 5, 2025
a6d9d90
set plataform for CI porpuses (#18)
draczer01 Aug 5, 2025
d8db548
Fix TOML parsing errors in pyproject.toml for scikit-build configurat…
enochkan Aug 5, 2025
f1774c5
fix ci/pypi (#30)
scott-oai Aug 6, 2025
4931694
fix build
scott-oai Aug 6, 2025
754a56b
evals: add chat completions API sampler (#59)
volsgd Aug 6, 2025
d0a300a
evals: log reasoning and extend max_tokens for chat completions (#62)
volsgd Aug 6, 2025
4f5ca7f
chat / api_server: do not include developer messages to reduce mismat…
volsgd Aug 7, 2025
4d514dd
Fix typos 'lenght' -> 'length' (#78)
bodoque007 Aug 7, 2025
9568e6e
fix f string errors in streamlit chat (#73)
NinoRisteski Aug 7, 2025
e490130
Fixing typos and grammatical improvements. (#72)
IgnacioCorrecher Aug 7, 2025
98f62cc
fix: max_tokens handling in generate.py (#70)
Mirza-Samad-Ahmed-Baig Aug 7, 2025
7e64492
Support concurrent sampling from multiple Contexts (#83)
Maratyszcza Aug 7, 2025
ec7914d
Update README.md
dkundel-openai Aug 8, 2025
4589fbb
fix packaging (#90)
LucasWilkinson Aug 8, 2025
1a9e106
Update README.md (#29)
shoumikhin Aug 10, 2025
7ba69ff
Update README.md (#58)
hemenduroy Aug 10, 2025
220a058
Fix typos and improve grammar in README (#61)
hasanerdemak Aug 10, 2025
954f47f
Update README.md (#71)
palenciavik Aug 10, 2025
f409636
Update README.md (#87)
genmnz Aug 10, 2025
82a3bad
Update README.md (#41)
Buddhsen-tripathi Aug 10, 2025
d7f9708
fix: typos across the codebase (#69)
bigint Aug 10, 2025
0d45dfd
[MINOR] fix: correct spelling error from "wnat" to "want" (#99)
jjestrada2 Aug 10, 2025
c77966f
a few typo fixes. (#102)
fujitatomoya Aug 10, 2025
79eaf7f
Add API compatibility test (#114)
dkundel-openai Aug 11, 2025
a4f98ea
Update awesome-gpt-oss.md
dkundel-openai Aug 11, 2025
e73da24
fix: Add channel parameter to PythonTool response handling (#33)
JustinTong0323 Aug 12, 2025
4a8a22e
Fix: Corrected typos across 3 files in gpt-oss directory (#115)
CivaaBTW Aug 12, 2025
3e8be30
fix editable build (#113)
heheda12345 Aug 12, 2025
0c83ebe
docs: add table of contents to README.md (#106)
OkeyAmy Aug 12, 2025
9dd466c
fix: Markdown linting and cleanup (#107)
OkeyAmy Aug 12, 2025
750cfe9
docs: add docstrings to utility and helper functions (#97)
adarsh-crafts Aug 12, 2025
1dcd7d0
feat: Add Gradio chat interface example (#89)
harshalmore31 Aug 12, 2025
4195fb3
Feat: add command-line arguments for backend parameters (#86)
SyedaAnshrahGillani Aug 12, 2025
421dbe9
added GPTOSS_BUILD_METAL=1 for metal. (#84)
xiejw Aug 12, 2025
83e1b36
chore: remove unused WeatherParams class and import (#82)
adarsh-crafts Aug 12, 2025
1246ff8
refactor: rename search_tool for clarity (#81)
adarsh-crafts Aug 12, 2025
359b3ff
fix invalid import in build-system-prompt.py (#32)
Om-Alve Aug 12, 2025
1000112
Update simple_browser_tool.py (#40)
Shubhankar-Dixit Aug 12, 2025
fa67988
triton implementation need install triton_kernels (#45)
sBobHuang Aug 12, 2025
906a0ef
bump version
dkundel-openai Aug 12, 2025
a02c2ce
Update README.md
dkundel-openai Aug 13, 2025
f8d21ad
fix streamlit & ollama demo. Add python tool (#131)
dkundel-openai Aug 13, 2025
65b3d6b
Add some links to awesome-gpt-oss.md (#28)
hiyouga Aug 14, 2025
53efd59
fix: fix f-string unmatched '(' bug in streamlit_chat.py (#31)
liuzhiqi71 Aug 14, 2025
f018fab
Fix start_q use in upper bound calculation (#136)
peterbell10 Aug 15, 2025
cf427a6
Process tokens in Context lazily (#138)
Maratyszcza Aug 15, 2025
11c01b2
Create CODEOWNERS
dkundel-openai Aug 18, 2025
56930eb
Replace '/' with '__' in model names (#142)
simonw Aug 18, 2025
64f8a4b
Rename `with_browser` to `with_browser_tool` in README (#140)
xiaohk Aug 18, 2025
69a0b1c
Update attention kernel to use TensorDescriptor (#137)
peterbell10 Aug 18, 2025
995e148
feat(metail): Parallelize SDPA across multiple simdgroups (#144)
Maratyszcza Aug 18, 2025
352cd3c
chore: release 0.0.4 (#145)
dkundel-openai Aug 18, 2025
18fd187
Update awesome-gpt-oss.md with llama.cpp (#148)
dkundel-openai Aug 19, 2025
dbb76fa
Update README.md (#154)
dkundel-openai Aug 26, 2025
5ec1d16
Added Tensorfuse (AWS) guide (#118)
samagra14 Aug 28, 2025
a19d0bc
Add Lemonade to `awesome-gpt-oss` (#117)
danielholanda Aug 28, 2025
0c39f1d
Add uv python backend (#156)
heheda12345 Aug 28, 2025
7be9334
Update pyproject.toml
dkundel-openai Aug 28, 2025
8ee92ec
Metal: add end-to-end benchmarks (#161)
Maratyszcza Sep 2, 2025
57e45b1
Metal: simplify and optimize Reponses API adapter (#162)
Maratyszcza Sep 2, 2025
38df14a
Metal: fix KV-cache invalidation after reset+append (#163)
Maratyszcza Sep 2, 2025
24804a6
Increase max output tokens in Reponses API to 131K (#165)
Maratyszcza Sep 2, 2025
942ef44
Remove requirement on maximum Python version (#167)
Maratyszcza Sep 2, 2025
a8ce88f
Move Lemonade to AMD section of `awesome-gpt-oss` (#164)
danielholanda Sep 2, 2025
864020a
Added VLLM Offline Serve working code. (#150)
hrithiksagar-tih Sep 2, 2025
95d7716
Metal: indicate threadgroup is a multiple of simdgroup (#168)
Maratyszcza Sep 3, 2025
7f3c896
Metal: mlock model weights in memory (#170)
Maratyszcza Sep 3, 2025
a0a8427
Add You.com as tool for browser (#171)
bojanbabic Sep 3, 2025
b558ecc
Evals: correctly pass temperature/max_tokens when using Responses API…
Maratyszcza Sep 8, 2025
be0d32e
Metal: move sampling to GPU (#175)
Maratyszcza Sep 8, 2025
f2a1458
Metal: benchmark generation of 100 tokens instead of 1 (#178)
Maratyszcza Sep 8, 2025
152fc0c
Metal: support generating multiple tokens at once (#179)
Maratyszcza Sep 9, 2025
1b5b45a
Adding prefill benchmarking for metal backend (#181)
ibahmed-oai Sep 10, 2025
0b1fb06
Metal: tune threadgroup sizes (#180)
Maratyszcza Sep 10, 2025
bbc5c48
Metal: Adding optimized dense matmul kernel to optimize prefill perf …
ibahmed-oai Sep 11, 2025
35eb3cc
Metal: fused QKV projection (matmul+RoPE+KV cache init) kernel (#184)
Maratyszcza Sep 11, 2025
758e904
Create devcontainer.json
shaeenhaque Sep 13, 2025
4 changes: 4 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,4 @@
{
  "image": "mcr.microsoft.com/devcontainers/universal:2",
  "features": {}
}
5 changes: 5 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,5 @@
@openai/developer-experience
dkundel-openai
Maratyszcza
scott-oai
volsgd
367 changes: 184 additions & 183 deletions LICENSE

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
recursive-include _build *
206 changes: 161 additions & 45 deletions README.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions _build/gpt_oss_build_backend/__init__.py
@@ -0,0 +1 @@
"""In-tree PEP 517 backend package for gpt-oss."""
140 changes: 140 additions & 0 deletions _build/gpt_oss_build_backend/backend.py
@@ -0,0 +1,140 @@
"""
Build backend for gpt-oss that supports two modes:

1) Default (pure wheel for PyPI)
- Delegates to setuptools.build_meta.
- Produces a py3-none-any wheel so PyPI accepts it (no linux_x86_64 tag).

2) Optional Metal/C extension build (local only)
- If the environment variable GPTOSS_BUILD_METAL is set to a truthy value
(1/true/on/yes), delegates to scikit_build_core.build.
- Dynamically injects build requirements (scikit-build-core, cmake, ninja,
pybind11) only for this mode.

Why this is needed
- PyPI rejects Linux wheels tagged linux_x86_64; manylinux/musllinux is required
for binary wheels. We ship a pure wheel by default, but still allow developers
to build/install the native Metal backend locally when needed.

Typical usage
- Publish pure wheel: `python -m build` (do not set GPTOSS_BUILD_METAL).
- Local Metal dev: `GPTOSS_BUILD_METAL=1 pip install -e ".[metal]"`.
- CI: keep GPTOSS_BUILD_METAL unset for releases; set it in internal jobs that
exercise the extension.

Notes
- The base package remains importable without the extension. The Metal backend
is only used when `gpt_oss.metal` is explicitly imported.
- This file is discovered via `backend-path = ["_build"]` and
`build-backend = "gpt_oss_build_backend.backend"` in pyproject.toml.
"""
import os
from importlib import import_module
from typing import Any, Mapping, Sequence


TRUE_VALUES = {"1", "true", "TRUE", "on", "ON", "yes", "YES"}


def _use_metal_backend() -> bool:
    return str(os.environ.get("GPTOSS_BUILD_METAL", "")).strip() in TRUE_VALUES


def _setuptools_backend():
    from setuptools import build_meta as _bm  # type: ignore

    return _bm


def _scikit_build_backend():
    return import_module("scikit_build_core.build")


def _backend():
    return _scikit_build_backend() if _use_metal_backend() else _setuptools_backend()


# Required PEP 517 hooks

def build_wheel(
    wheel_directory: str,
    config_settings: Mapping[str, Any] | None = None,
    metadata_directory: str | None = None,
) -> str:
    return _backend().build_wheel(wheel_directory, config_settings, metadata_directory)


def build_sdist(
    sdist_directory: str, config_settings: Mapping[str, Any] | None = None
) -> str:
    return _backend().build_sdist(sdist_directory, config_settings)


def prepare_metadata_for_build_wheel(
    metadata_directory: str, config_settings: Mapping[str, Any] | None = None
) -> str:
    # Fallback if backend doesn't implement it
    be = _backend()
    fn = getattr(be, "prepare_metadata_for_build_wheel", None)
    if fn is None:
        # setuptools exposes it; scikit-build-core may not. Defer to building a wheel for metadata.
        return _setuptools_backend().prepare_metadata_for_build_wheel(
            metadata_directory, config_settings
        )
    return fn(metadata_directory, config_settings)


# Optional hooks

def build_editable(
    editable_directory: str,
    config_settings: Mapping[str, Any] | None = None,
    metadata_directory: str | None = None,
) -> str:
    be = _backend()
    fn = getattr(be, "build_editable", None)
    if fn is None:
        # setuptools implements build_editable; if not available, raise the standard error
        raise RuntimeError("Editable installs not supported by the selected backend")
    return fn(editable_directory, config_settings)


def get_requires_for_build_wheel(
    config_settings: Mapping[str, Any] | None = None,
) -> Sequence[str]:
    if _use_metal_backend():
        # Add dynamic build requirements only when building the Metal backend
        return [
            "scikit-build-core>=0.10",
            "pybind11>=2.12",
            "cmake>=3.26",
            "ninja",
        ]
    # setuptools usually returns []
    return list(_setuptools_backend().get_requires_for_build_wheel(config_settings))


def get_requires_for_build_sdist(
    config_settings: Mapping[str, Any] | None = None,
) -> Sequence[str]:
    # No special requirements for SDist
    be = _backend()
    fn = getattr(be, "get_requires_for_build_sdist", None)
    if fn is None:
        return []
    return list(fn(config_settings))


def get_requires_for_build_editable(
    config_settings: Mapping[str, Any] | None = None,
) -> Sequence[str]:
    if _use_metal_backend():
        return [
            "scikit-build-core>=0.10",
            "pybind11>=2.12",
            "cmake>=3.26",
            "ninja",
        ]
    be = _setuptools_backend()
    fn = getattr(be, "get_requires_for_build_editable", None)
    if fn is None:
        return []
    return list(fn(config_settings))
33 changes: 26 additions & 7 deletions awesome-gpt-oss.md
@@ -10,6 +10,7 @@ This is a list of guides and resources to help you get started with the gpt-oss
- [Cloud](#cloud)
- [Examples / Tutorials](#examples--tutorials)
- [Tools](#tools)
- [Training](#training)

## Inference

@@ -25,36 +26,48 @@ This is a list of guides and resources to help you get started with the gpt-oss
- [Use gpt-oss-120b with LM Studio](https://lmstudio.ai/models/openai/gpt-oss-120b)
- Hugging Face & Transformers
- [How to run gpt-oss with Transformers](https://cookbook.openai.com/articles/gpt-oss/run-transformers)
- [Hugging Face & gpt-oss launch blog](https://huggingface.co/blog/welcome-openai-gpt-oss)
- [Collection of Hugging Face examples](https://github.com/huggingface/gpt-oss-recipes)
- NVIDIA
- [gpt-oss on RTX](https://blogs.nvidia.com/blog/rtx-ai-garage-openai-oss)
- AMD
- [Running gpt-oss models on AMD Ryzen AI Processors and Radeon Graphics Cards](https://www.amd.com/en/blogs/2025/how-to-run-openai-gpt-oss-20b-120b-models-on-amd-ryzen-ai-radeon.html)
- [Running gpt-oss on STX Halo and Radeon dGPUs using Lemonade](https://lemonade-server.ai/news/gpt-oss.html)
- llama.cpp
- [Running gpt-oss with llama.cpp](https://github.com/ggml-org/llama.cpp/discussions/15396)

### Server

- vLLM
- [How to run gpt-oss with vLLM](https://cookbook.openai.com/articles/gpt-oss/run-vllm)
- [vLLM & gpt-oss recipes](https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html)
- NVIDIA
- [Optimizing gpt-oss with NVIDIA TensorRT-LLM](https://cookbook.openai.com/articles/run-nvidia)
- [Deploying gpt-oss on TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md)
- AMD
- [Running the Latest Open Models from OpenAI on AMD AI Hardware](https://rocm.blogs.amd.com/ecosystems-and-partners/openai-day-0/README.html)

### Cloud

- Groq
- [Groq & gpt-oss launch blog](https://groq.com/blog/day-zero-support-for-openai-open-models)
- [gpt-oss-120b model on the GroqCloud Playground](https://console.groq.com/playground?model=openai/gpt-oss-120b)
- [gpt-oss-20b model on the GroqCloud Playground](https://console.groq.com/playground?model=openai/gpt-oss-20b)
- [gpt-oss with built-in web search on GroqCloud](https://console.groq.com/docs/browser-search)
- [gpt-oss with built-in code execution on GroqCloud](https://console.groq.com/docs/code-execution)
- [Responses API on Groq](https://console.groq.com/docs/responses-api)
- NVIDIA
- [NVIDIA launch blog post](https://blogs.nvidia.com/blog/openai-gpt-oss/)
- [NVIDIA & gpt-oss developer launch blog post](https://developer.nvidia.com/blog/delivering-1-5-m-tps-inference-on-nvidia-gb200-nvl72-nvidia-accelerates-openai-gpt-oss-models-from-cloud-to-edge/)
- Use [gpt-oss-120b](https://build.nvidia.com/openai/gpt-oss-120b) and [gpt-oss-20b](https://build.nvidia.com/openai/gpt-oss-20b) on NVIDIA's Cloud
- Cloudflare
- [Cloudflare & gpt-oss launch blog post](https://blog.cloudflare.com/openai-gpt-oss-on-workers-ai)
- [gpt-oss-120b on Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/models/gpt-oss-120b)
- [gpt-oss-20b on Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/models/gpt-oss-20b)
- AMD
- [gpt-oss-120B on AMD MI300X](https://huggingface.co/spaces/amd/gpt-oss-120b-chatbot)
- AWS (Deploy via Tensorfuse)
- [Deploy gpt-oss for both 20b and 120b models on AWS EKS](https://tensorfuse.io/docs/guides/modality/text/openai_oss)

## Examples & Tutorials

@@ -65,6 +78,12 @@ This is a list of guides and resources to help you get started with the gpt-oss
- [Example `python` tool for gpt-oss](./gpt_oss/tools/python_docker/)
- [Example `browser` tool for gpt-oss](./gpt_oss/tools/simple_browser/)

## Training

- [Hugging Face TRL examples](https://github.com/huggingface/gpt-oss-recipes)
- [LlamaFactory examples](https://llamafactory.readthedocs.io/en/latest/advanced/best_practice/gpt-oss.html)
- [Unsloth examples](https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune)

## Contributing

Feel free to open a PR to add your own guides and resources on how to run gpt-oss. We will try to review it and add it here.
142 changes: 142 additions & 0 deletions compatibility-test/.gitignore
@@ -0,0 +1,142 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*

# Diagnostic reports (https://nodejs.org/api/report.html)
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage
*.lcov

# nyc test coverage
.nyc_output

# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# Snowpack dependency directory (https://snowpack.dev/)
web_modules/

# TypeScript cache
*.tsbuildinfo

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional stylelint cache
.stylelintcache

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variable files
.env
.env.*
!.env.example

# parcel-bundler cache (https://parceljs.org/)
.cache
.parcel-cache

# Next.js build output
.next
out

# Nuxt.js build / generate output
.nuxt
dist

# Gatsby files
.cache/
# Comment in the public line if your project uses Gatsby and not Next.js
# https://nextjs.org/blog/next-9-1#public-directory-support
# public

# vuepress build output
.vuepress/dist

# vuepress v2.x temp and cache directory
.temp
.cache

# Sveltekit cache directory
.svelte-kit/

# vitepress build output
**/.vitepress/dist

# vitepress cache directory
**/.vitepress/cache

# Docusaurus cache and generated files
.docusaurus

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# Firebase cache directory
.firebase/

# TernJS port file
.tern-port

# Stores VSCode versions used for testing VSCode extensions
.vscode-test

# yarn v3
.pnp.*
.yarn/*
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/sdks
!.yarn/versions

# Vite logs files
vite.config.js.timestamp-*
vite.config.ts.timestamp-*

rollout_*.jsonl
analysis_*.json