
Conversation


ayam04 commented Oct 1, 2025

fix: #3065

Bug

  • LiteLlm.generate_content_async always used self.model, ignoring the model specified in llm_request.model
  • This prevented agent callbacks from dynamically changing the model at runtime

Fix

  • Changed line 821 in lite_llm.py from "model": self.model to "model": llm_request.model or self.model
  • Now respects llm_request.model when set, with fallback to self.model when None
  • Aligns with pattern used in other LLM implementations (anthropic_llm.py, google_llm.py)
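
In effect, the completion arguments now prefer the per-request model and fall back to the instance's configured model when none is set. A minimal, standalone illustration of that fallback semantics (function and variable names here are hypothetical, not the actual lite_llm.py code):

def resolve_model(request_model, configured_model):
    # Prefer the model carried on the LlmRequest; fall back to the
    # model the LiteLlm instance was constructed with.
    return request_model or configured_model

assert resolve_model("openai/gpt-3.5-turbo", "openai/gpt-4") == "openai/gpt-3.5-turbo"
assert resolve_model(None, "openai/gpt-4") == "openai/gpt-4"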

Changes

  • Modified: lite_llm.py (1 line change)
  • Added: 2 new test cases in test_litellm.py
    • test_generate_content_async_with_model_override - verifies override works
    • test_generate_content_async_without_model_override - verifies fallback works

Testing Plan

  1. test_generate_content_async_with_model_override

    • Validates that when llm_request.model is set to a different value, it overrides self.model
    • Creates an LLM instance with model="test_model"
    • Passes a request with model="overridden_model"
    • Asserts the actual model used is "overridden_model"
  2. test_generate_content_async_without_model_override

    • Validates that when llm_request.model is None, it falls back to self.model
    • Creates an LLM instance with model="test_model"
    • Passes a request with model=None
    • Asserts the actual model used is "test_model" (the fallback)

Test Results

# New tests - Model override functionality
$ pytest tests/unittests/models/test_litellm.py::test_generate_content_async_with_model_override tests/unittests/models/test_litellm.py::test_generate_content_async_without_model_override -v

collected 4 items

test_litellm.py::test_generate_content_async_with_model_override[GOOGLE_AI] PASSED [ 25%]
test_litellm.py::test_generate_content_async_with_model_override[VERTEX] PASSED [ 50%]
test_litellm.py::test_generate_content_async_without_model_override[GOOGLE_AI] PASSED [ 75%]
test_litellm.py::test_generate_content_async_without_model_override[VERTEX] PASSED [100%]

========================= 4 passed in 67.47s =========================
# Existing test - Backward compatibility check
$ pytest tests/unittests/models/test_litellm.py::test_generate_content_async -v

collected 2 items

test_litellm.py::test_generate_content_async[GOOGLE_AI] PASSED [ 50%]
test_litellm.py::test_generate_content_async[VERTEX] PASSED [100%]

============================== 2 passed in 30.95s ==============================

Coverage

  • Override scenario: Model specified in llm_request is used
  • Fallback scenario: self.model is used when llm_request.model is None
  • Backward compatibility: Existing tests pass without modification
  • Parameterized testing: Both GOOGLE_AI and VERTEX provider configurations tested

Manual Testing

The fix enables the following use case (tested via agent callbacks):

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

agent = LlmAgent(
    model=LiteLlm(model="openai/gpt-4"),
    before_model_callback=lambda callback_context, llm_request: setattr(
        llm_request, "model", "openai/gpt-3.5-turbo"
    ),
)  # The model is now correctly switched to gpt-3.5-turbo at runtime.

Use Case Enabled

# Before: this wouldn't work
def before_model_callback(llm_request: LlmRequest, ...):
    llm_request.model = "openai/gpt-4"  # ignored

# After: this now works correctly
def before_model_callback(llm_request: LlmRequest, ...):
    llm_request.model = "openai/gpt-4"  # respected


Summary of Changes

Hello @ayam04, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the LiteLLM integration that previously prevented the dynamic overriding of models during runtime. By adjusting the model selection logic, the system now correctly respects model specifications within llm_request.model, enabling more flexible agent behaviors. The change is supported by new, comprehensive test cases that validate both the override and fallback scenarios, ensuring robustness and backward compatibility.

Highlights

  • Bug Fix: Resolved an issue where LiteLlm.generate_content_async incorrectly ignored the model specified in llm_request.model, preventing dynamic model changes at runtime for agent callbacks.
  • Model Override Logic: Modified lite_llm.py to prioritize llm_request.model if provided, falling back to self.model otherwise, aligning with patterns in other LLM implementations.
  • New Test Cases: Added two new asynchronous test cases in test_litellm.py to explicitly verify both the model override functionality and the fallback mechanism, ensuring correct behavior.

adk-bot added the "bot triaged" [Bot] and "models" [Component] labels on Oct 1, 2025

gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes an issue where the model override in LlmRequest was being ignored by LiteLlm. The change to prioritize llm_request.model over self.model is a clean and effective solution. The addition of new tests to cover both the override and fallback scenarios is great. I've added one suggestion to improve the new test code by simplifying it, which will make it more focused and maintainable.

ayam04 (Author) commented Oct 2, 2025

Hi @seanzhou1023, would you be reviewing this PR?

Labels: bot triaged [Bot], models [Component]
Linked issue (may be closed by merging this PR): LiteLLM ignores LLMRequest.model in generate_content_async