@enyst
Collaborator

@enyst enyst commented Nov 6, 2025

This PR updates the OpenHands LLMs documentation to match the source of truth in the Agent SDK.

Source of truth:

  • agent-sdk path: openhands-sdk/openhands/sdk/llm/utils/verified_models.py
  • list: VERIFIED_OPENHANDS_MODELS

Changes:

  • Added models: claude-haiku-4-5-20251001, gpt-5-codex, claude-opus-4-1-20250805, kimi-k2-0711-preview
  • Removed model: devstral-small-2505
  • Kept all other models and aligned order with the verified list
  • Clarified pricing table with N/A where provider pricing/limits are not documented

Why:

  • Ensure docs reflect the exact set of models that are verified to work with OpenHands via the Agent SDK
  • Avoid drift between docs and implementation

Files changed:

  • openhands/usage/llms/openhands-llms.mdx

Co-authored-by: openhands [email protected]
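
The drift check described above can be sketched roughly as follows. This is only an illustration: `diff_model_lists` is a hypothetical helper, and the two sample lists are small made-up subsets built from names mentioned in this PR, not the real contents of `VERIFIED_OPENHANDS_MODELS` or of the docs table.

```python
from typing import Dict, Sequence


def diff_model_lists(verified: Sequence[str], documented: Sequence[str]) -> Dict[str, object]:
    """Report how a documented model list differs from the verified one."""
    verified_set, documented_set = set(verified), set(documented)
    return {
        # Verified models that the docs do not mention yet.
        "missing_from_docs": [m for m in verified if m not in documented_set],
        # Documented models that are no longer in the verified list.
        "not_verified": [m for m in documented if m not in verified_set],
        # True only when both lists have the same names in the same order.
        "same_order": list(verified) == list(documented),
    }


# Illustrative subsets only (assumption), not the actual lists.
verified = ["claude-haiku-4-5-20251001", "gpt-5-codex"]
documented = ["gpt-5-codex", "devstral-small-2505"]
report = diff_model_lists(verified, documented)
print(report)
```

With these sample inputs, the report flags one model to add, one to remove, and an order mismatch, which mirrors the three kinds of change listed in the description.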

…LS

Source of truth: openhands-sdk/openhands/sdk/llm/utils/verified_models.py
- Add: claude-haiku-4-5-20251001, gpt-5-codex, claude-opus-4-1-20250805, kimi-k2-0711-preview
- Remove: devstral-small-2505
- Align order with VERIFIED_OPENHANDS_MODELS

Co-authored-by: openhands <[email protected]>
@mamoodi
Collaborator

mamoodi commented Nov 6, 2025

I've removed myself and asked Xingyao for a look. I don't know how correct the changes are.

… N/A; add source note

Source: litellm model_prices_and_context_window_backup.json; Verified list remains source-of-truth for models.

Co-authored-by: openhands <[email protected]>
@enyst
Collaborator Author

enyst commented Nov 6, 2025

Yup! I spot-checked entries all over the table. In general, the list didn't change much:

  • I added more models from the verified OpenHands list in agent-sdk
  • found prices for them in litellm's JSON
  • double-checked the Claudes and GPTs
  • and did some extra "looks the same except for x" checks.

Contributor

@xingyaoww xingyaoww left a comment

Seems good to me. IIRC there was a test case checking this table against litellm's model_price JSON; can we port that over to here as well? 🤔

…e JSON)

- Skips models not present or intentionally N/A
- Compares input/cached/output costs per 1M and token limits when available

Co-authored-by: openhands <[email protected]>
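
As a rough illustration of the "costs per 1M" comparison in the commit message above: LiteLLM's pricing JSON stores prices per token, so a validator has to scale them before comparing with a per-1M docs table. The sketch below assumes the usual LiteLLM field names (`input_cost_per_token`, `cache_read_input_token_cost`, `output_cost_per_token`, `max_input_tokens`, `max_output_tokens`); `per_million` and `expected_row` are hypothetical helpers, and the sample entry uses made-up numbers.

```python
from typing import Optional

PER_MILLION = 1_000_000


def per_million(cost_per_token: Optional[float]) -> Optional[float]:
    """Scale a per-token price to a per-1M-token price; None stays None."""
    if cost_per_token is None:
        return None
    # Round to hide floating-point noise from the multiplication.
    return round(cost_per_token * PER_MILLION, 6)


def expected_row(entry: dict) -> dict:
    """Build the values a docs row would be compared against."""
    return {
        "input_cost": per_million(entry.get("input_cost_per_token")),
        "cached_cost": per_million(entry.get("cache_read_input_token_cost")),
        "output_cost": per_million(entry.get("output_cost_per_token")),
        "max_input_tokens": entry.get("max_input_tokens"),
        "max_output_tokens": entry.get("max_output_tokens"),
    }


# Made-up entry for illustration; not real pricing data.
sample = {
    "input_cost_per_token": 3e-06,
    "output_cost_per_token": 1.5e-05,
    "max_input_tokens": 200000,
}
row = expected_row(sample)
print(row)
```

Fields that are absent from the entry come back as `None`, which is what lets a later comparison step decide whether to skip or fail.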
add_fail(f"input_cost mismatch: mdx={mdx_input_cost} vs litellm={exp_input_cost}")

# Cached input cost
if exp_cached_cost is not None or mdx_cached_cost is not None:
Collaborator Author

@enyst enyst Nov 7, 2025

GPT-5 explanation for why it cares about None on cached input, but not on input or output:

Cached input cost: We treat it as both a price and a capability signal (prompt caching support). So the validator enforces presence parity and numeric accuracy:

  • Both None → OK
  • LiteLLM None, MDX number → fail (docs claim caching where provider doesn’t report it)
  • LiteLLM number, MDX None → fail (docs missing a provider-reported caching price)
  • Both numbers → compare within tolerance

Input/output costs: These are fundamental but occasionally missing in LiteLLM for preview/edge cases. To avoid false failures due to incomplete upstream data, we only compare when both sides provide numbers; if either is None, we skip strict enforcement.
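
The two policies described above could look roughly like this. It is a minimal sketch, not the validator from this PR: `check_cached_cost` and `check_basic_cost` are hypothetical names, they return failure messages instead of calling `add_fail`, and the relative tolerance is an assumed value.

```python
import math
from typing import List, Optional


def check_cached_cost(mdx: Optional[float], expected: Optional[float],
                      rel_tol: float = 1e-6) -> List[str]:
    """Strict check: presence must match on both sides, then values must agree."""
    if mdx is None and expected is None:
        return []  # Both None: OK.
    if expected is None:
        return [f"cached_cost present in mdx ({mdx}) but missing in litellm"]
    if mdx is None:
        return [f"cached_cost missing in mdx but litellm={expected}"]
    if not math.isclose(mdx, expected, rel_tol=rel_tol):
        return [f"cached_cost mismatch: mdx={mdx} vs litellm={expected}"]
    return []


def check_basic_cost(label: str, mdx: Optional[float], expected: Optional[float],
                     rel_tol: float = 1e-6) -> List[str]:
    """Lenient check: compare only when both sides provide a number."""
    if mdx is None or expected is None:
        return []  # Skip instead of failing on incomplete data.
    if not math.isclose(mdx, expected, rel_tol=rel_tol):
        return [f"{label} mismatch: mdx={mdx} vs litellm={expected}"]
    return []


print(check_cached_cost(0.3, None))                # one failure: presence mismatch
print(check_basic_cost("input_cost", None, 3.0))   # empty list: skipped
```

The asymmetry is the whole point: a missing cached price is treated as a statement about capability, while a missing input or output price is treated as missing data.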

Collaborator Author

It makes sense to me... WDYT?

Contributor

@xingyaoww xingyaoww left a comment

LGTM

@xingyaoww xingyaoww merged commit dce8b12 into main Nov 7, 2025
3 checks passed
@xingyaoww xingyaoww deleted the sync-verified-openhands-models branch November 7, 2025 15:49
5 participants