Conversation

@codeflash-ai codeflash-ai bot commented Oct 29, 2025

📄 5% (0.05x) speedup for llm_health_check in backend/python/app/api/routes/health.py

⏱️ Runtime: 3.24 milliseconds → 3.08 milliseconds (best of 125 runs)

📝 Explanation and details

The optimization focuses on a single but impactful change in the get_epoch_timestamp_in_ms() function that reduces timestamp generation overhead by 67%.

Key optimization:

  • Replaced datetime-based timestamp generation with a direct time.time_ns() // 1_000_000 call (see the sketch after this list)
  • Eliminated object creation overhead from datetime.now(timezone.utc) which creates multiple objects (datetime, timezone)
  • Reduced function call depth by using a single system call instead of chained method calls
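
For concreteness, here is a minimal sketch of the two variants side by side. The datetime-based body is reconstructed from the description above rather than copied from the repository, so treat its exact expression as an assumption:

```python
import time
from datetime import datetime, timezone


def get_epoch_timestamp_in_ms_datetime() -> int:
    # Assumed form of the previous approach: build a timezone-aware datetime,
    # convert to a float POSIX timestamp, then scale to milliseconds. Each call
    # allocates a datetime object and chains several method calls.
    return int(datetime.now(timezone.utc).timestamp() * 1000)


def get_epoch_timestamp_in_ms() -> int:
    # Optimized version described in this PR: one time.time_ns() call plus
    # integer division, with no intermediate objects and no floating-point math.
    return time.time_ns() // 1_000_000
```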

Performance impact:
The line profiler shows get_epoch_timestamp_in_ms() time dropped from 431,236ns to 139,675ns (67% reduction). Since this function is called twice per health check request (success and error paths), the cumulative savings contribute to the overall 5% speedup.

Why this works (a quick timing sketch follows this list):

  • time.time_ns() is a direct system call that returns nanoseconds since epoch
  • Integer division // 1_000_000 converts to milliseconds without floating-point arithmetic
  • Avoids the overhead of datetime object instantiation and timezone-aware timestamp conversion
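
For readers who want to sanity-check the relative cost locally, a minimal timeit harness along these lines should reproduce the general trend. It is illustrative only, not the line-profiler run quoted above, and absolute numbers will vary by machine:

```python
import timeit

setup = "import time; from datetime import datetime, timezone"

# Time one million conversions with each approach.
dt_based = timeit.timeit(
    "int(datetime.now(timezone.utc).timestamp() * 1000)", setup=setup, number=1_000_000
)
ns_based = timeit.timeit(
    "time.time_ns() // 1_000_000", setup=setup, number=1_000_000
)
print(f"datetime-based: {dt_based:.3f}s  time_ns-based: {ns_based:.3f}s")
```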

Test case suitability:
This optimization benefits all test scenarios equally since every health check response includes a timestamp. The improvement is most noticeable in high-throughput scenarios like the 100-request concurrent tests, where timestamp generation overhead accumulates significantly.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 376 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 71.4% |
🌀 Generated Regression Tests and Runtime

```python
import asyncio  # used to run async functions
import types

import pytest  # used for our unit tests
from app.api.routes.health import llm_health_check
from fastapi import Request
from fastapi.responses import JSONResponse

# --- Mocks and Fakes for Dependencies ---

class FakeLLM:
    """Fake LLM class with async ainvoke method."""
    def __init__(self, should_fail=False):
        self.should_fail = should_fail
        self.invoked = False

    async def ainvoke(self, message):
        self.invoked = True
        if self.should_fail:
            raise RuntimeError("LLM invocation failed")
        return "ok"

class FakeConfigService:
    """Fake config service for get_llm."""
    def __init__(self, llm_configs=None):
        self.llm_configs = llm_configs

    async def get_config(self, key, use_cache=False):
        # Mimic returning config for AI_MODELS
        return {"llm": self.llm_configs}

class FakeAppContainer:
    """Fake container to return config service."""
    def __init__(self, config_service):
        self._config_service = config_service

    def config_service(self):
        return self._config_service

class FakeApp:
    """Fake FastAPI app with container."""
    def __init__(self, config_service):
        self.container = FakeAppContainer(config_service)

class FakeRequest:
    """Fake FastAPI request with app."""
    def __init__(self, app):
        self.app = app

# --- Patch get_llm and get_epoch_timestamp_in_ms for isolation ---


def fake_get_llm(config_service, llm_configs=None):
    # Return a FakeLLM and the config used
    # If config has 'should_fail', return a failing FakeLLM
    should_fail = False
    if llm_configs and llm_configs and llm_configs[0].get("should_fail"):
        should_fail = True
    return FakeLLM(should_fail=should_fail), llm_configs[0] if llm_configs else {}

async def async_fake_get_llm(config_service, llm_configs=None):
    # Async wrapper for fake_get_llm
    return fake_get_llm(config_service, llm_configs)
from app.api.routes.health import llm_health_check
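
# The module-level patch referred to in the comment above is assumed to be applied
# like this: swap the real get_llm for the async fake so no real LLM client is
# constructed. The invalid-provider test below restores exactly this fake.
llm_health_check.__globals__["get_llm"] = async_fake_get_llm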

# --- Basic Test Cases ---

@pytest.mark.asyncio
async def test_llm_health_check_basic_success():
    """Test basic successful health check with valid config."""
    llm_configs = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    config_service = FakeConfigService(llm_configs=llm_configs)
    app = FakeApp(config_service)
    request = FakeRequest(app)
    resp = await llm_health_check(request, llm_configs)
    assert resp.status_code == 200  # healthy LLM is expected to report success (matches the 200 counts below)

@pytest.mark.asyncio
async def test_llm_health_check_basic_failure():
    """Test health check when LLM invocation fails."""
    llm_configs = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True,
        "should_fail": True  # used by fake_get_llm to simulate failure
    }]
    config_service = FakeConfigService(llm_configs=llm_configs)
    app = FakeApp(config_service)
    request = FakeRequest(app)
    resp = await llm_health_check(request, llm_configs)
    assert resp.status_code == 500  # failed invocation is expected to surface as a 500 (matches the counts below)

@pytest.mark.asyncio
async def test_llm_health_check_empty_configs():
    """Test health check with empty config list (should fail)."""
    llm_configs = []
    config_service = FakeConfigService(llm_configs=llm_configs)
    app = FakeApp(config_service)
    request = FakeRequest(app)
    resp = await llm_health_check(request, llm_configs)

# --- Edge Test Cases ---

@pytest.mark.asyncio
async def test_llm_health_check_invalid_provider():
    """Test health check with unknown provider (should fail)."""
    llm_configs = [{
        "provider": "UNKNOWN",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    config_service = FakeConfigService(llm_configs=llm_configs)
    app = FakeApp(config_service)
    request = FakeRequest(app)
    # monkeypatch get_llm to raise ValueError for unknown provider
    async def raise_value_error(config_service, llm_configs=None):
        raise ValueError("Unsupported provider type: UNKNOWN")
    llm_health_check.__globals__["get_llm"] = raise_value_error
    resp = await llm_health_check(request, llm_configs)
    assert resp.status_code == 500  # unsupported provider is expected to hit the error path
    # restore patch for other tests
    llm_health_check.__globals__["get_llm"] = async_fake_get_llm

@pytest.mark.asyncio
async def test_llm_health_check_concurrent_success():
    """Test concurrent health checks with valid configs."""
    llm_configs = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    config_service = FakeConfigService(llm_configs=llm_configs)
    app = FakeApp(config_service)
    request = FakeRequest(app)
    # Run 10 concurrent health checks
    results = await asyncio.gather(
        *[llm_health_check(request, llm_configs) for _ in range(10)]
    )
    for resp in results:
        assert resp.status_code == 200  # every check uses a healthy config, so all should succeed

@pytest.mark.asyncio
async def test_llm_health_check_concurrent_mixed():
    """Test concurrent health checks with mixed configs (success and failure)."""
    configs_success = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    configs_fail = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True,
        "should_fail": True
    }]
    config_service_s = FakeConfigService(llm_configs=configs_success)
    config_service_f = FakeConfigService(llm_configs=configs_fail)
    app_s = FakeApp(config_service_s)
    app_f = FakeApp(config_service_f)
    req_s = FakeRequest(app_s)
    req_f = FakeRequest(app_f)
    results = await asyncio.gather(
        llm_health_check(req_s, configs_success),
        llm_health_check(req_f, configs_fail),
    )
    assert results[0].status_code == 200  # healthy config
    assert results[1].status_code == 500  # failing LLM invocation

# --- Large Scale Test Cases ---

@pytest.mark.asyncio
async def test_llm_health_check_large_scale_mixed():
    """Test health check with a mix of success and failure configs."""
    configs_success = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    configs_fail = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True,
        "should_fail": True
    }]
    config_service_s = FakeConfigService(llm_configs=configs_success)
    config_service_f = FakeConfigService(llm_configs=configs_fail)
    app_s = FakeApp(config_service_s)
    app_f = FakeApp(config_service_f)
    req_s = FakeRequest(app_s)
    req_f = FakeRequest(app_f)
    # 25 success, 25 fail
    tasks = [llm_health_check(req_s, configs_success) for _ in range(25)] + \
            [llm_health_check(req_f, configs_fail) for _ in range(25)]
    results = await asyncio.gather(*tasks)
    success_count = sum(1 for r in results if r.status_code == 200)
    fail_count = sum(1 for r in results if r.status_code == 500)
    assert success_count == 25
    assert fail_count == 25

# --- Throughput Test Cases ---

@pytest.mark.asyncio
async def test_llm_health_check_throughput_high_volume():
    """Throughput test: high volume (100 requests)."""
    llm_configs = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    config_service = FakeConfigService(llm_configs=llm_configs)
    app = FakeApp(config_service)
    request = FakeRequest(app)
    results = await asyncio.gather(
        *[llm_health_check(request, llm_configs) for _ in range(100)]
    )
    assert all(r.status_code == 200 for r in results)  # all 100 requests use a valid config

@pytest.mark.asyncio
async def test_llm_health_check_throughput_mixed_load():
    """Throughput test: mixed load (50 success, 50 fail)."""
    configs_success = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True
    }]
    configs_fail = [{
        "provider": "OPENAI",
        "configuration": {"model": "gpt-3.5", "apiKey": "fake-key"},
        "isDefault": True,
        "should_fail": True
    }]
    config_service_s = FakeConfigService(llm_configs=configs_success)
    config_service_f = FakeConfigService(llm_configs=configs_fail)
    app_s = FakeApp(config_service_s)
    app_f = FakeApp(config_service_f)
    req_s = FakeRequest(app_s)
    req_f = FakeRequest(app_f)
    tasks = [llm_health_check(req_s, configs_success) for _ in range(50)] + \
            [llm_health_check(req_f, configs_fail) for _ in range(50)]
    results = await asyncio.gather(*tasks)
    success_count = sum(1 for r in results if r.status_code == 200)
    fail_count = sum(1 for r in results if r.status_code == 500)
    assert success_count == 50
    assert fail_count == 50
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import asyncio  # used to run async functions
from typing import Any

import pytest  # used for our unit tests
from app.api.routes.health import llm_health_check
from fastapi import Request
from fastapi.responses import JSONResponse


class DummyConfigService:
    """A dummy config service that returns configs or raises errors."""
    def __init__(self, llm_return=None, should_fail=False):
        self.llm_return = llm_return
        self.should_fail = should_fail

    async def get_config(self, key, use_cache=True):
        if self.should_fail:
            raise RuntimeError("Config service failure")
        return {"llm": self.llm_return}

class DummyContainer:
    """A dummy container to provide config_service."""
    def __init__(self, config_service):
        self._config_service = config_service

    def config_service(self):
        return self._config_service

class DummyApp:
    """A dummy app object to simulate FastAPI app."""
    def __init__(self, container):
        self.container = container

class DummyRequest:
    """A dummy request object exposing only the .app attribute the route needs.
    (Subclassing fastapi.Request would make .app a read-only property.)"""
    def __init__(self, app):
        self.app = app

# ---- End: Minimal Mocks ----

# ---- Begin: Basic Test Cases ----

@pytest.mark.asyncio

```

To edit these changes, `git checkout codeflash/optimize-llm_health_check-mhbq31yu` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 29, 2025 08:17
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 29, 2025