Conversation

@codeflash-ai codeflash-ai bot commented Oct 28, 2025

📄 8% (0.08x) speedup for GsContentApi.get_contents in gs_quant/api/gs/content.py

⏱️ Runtime : 94.4 microseconds → 87.7 microseconds (best of 9 runs)

📝 Explanation and details

The optimized code achieves a **7% speedup** through three key optimizations that reduce Python overhead and unnecessary operations:

**1. Eliminated redundant list creations in `get_contents()`**
The original code created lists inline during method calls (`[offset] if offset else None`). The optimized version pre-creates these lists once and reuses them, reducing repeated conditional evaluations and list construction overhead.

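A minimal sketch of the before/after call pattern (the helper and parameter names here are illustrative assumptions, not the actual `gs_quant/api/gs/content.py` source):

```python
# Illustrative sketch only -- not the real gs_quant implementation.
def _build_query(**params):
    # Stand-in for the real query builder: keep only non-empty values.
    return {name: value for name, value in params.items() if value}

def get_contents_before(offset=None, limit=None):
    # Original style: each conditional list is constructed inline in the call.
    return _build_query(offset=[offset] if offset else None,
                        limit=[limit] if limit else None)

def get_contents_after(offset=None, limit=None):
    # Optimized style: evaluate each conditional once, bind it to a local,
    # and pass the pre-built list to the helper.
    offsets = [offset] if offset else None
    limits = [limit] if limit else None
    return _build_query(offset=offsets, limit=limits)
```
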
**2. Optimized sorting in `_build_parameters_dict()`**
The original code used `setdefault().extend(sorted(value))` for every parameter, which calls `sorted()` even on single-item collections. The optimized version checks collection length first - if there's only one item, it skips sorting entirely and just converts to a list, saving significant time for single-value parameters.

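A hedged sketch of the length check (the real `_build_parameters_dict` signature and behavior may differ; this assumes each parameter arrives as a set of values):

```python
from collections import OrderedDict

def build_parameters_dict(**parameters):
    params = OrderedDict()
    for name, values in parameters.items():
        if not values:
            continue
        if len(values) == 1:
            # Single value: skip sorted() and just materialise a one-item list.
            params.setdefault(name, []).extend(values)
        else:
            # Multiple values: sorting keeps the resulting query deterministic.
            params.setdefault(name, []).extend(sorted(values))
    return params

# Single-valued parameters avoid the sorting call entirely:
print(build_parameters_dict(channel={'G10'}, tag={'T2', 'T1'}))
# channel -> ['G10'], tag -> ['T1', 'T2']
```
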
**3. Replaced string concatenation with `join()` in `_build_query_string()`**
The original code built query strings through repeated concatenation (`query_string += ...`), which creates new string objects each time. The optimized version collects all parts in a list first, then uses `'&'.join()` at the end - a well-known Python performance pattern that's much faster for multiple concatenations.

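A simplified comparison of the two string-building styles (URL quoting omitted for brevity; this is not the actual helper):

```python
def build_query_string_concat(params):
    query_string = ''
    for name, values in params.items():
        for value in values:
            # Each += allocates a brand-new string object.
            query_string += ('?' if not query_string else '&') + f'{name}={value}'
    return query_string

def build_query_string_join(params):
    parts = [f'{name}={value}' for name, values in params.items() for value in values]
    # One allocation at the end instead of one per appended fragment.
    return '?' + '&'.join(parts) if parts else ''

params = {'channel': ['EM', 'G10'], 'tag': ['T1']}
assert build_query_string_concat(params) == build_query_string_join(params)
```
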
**Test case performance patterns:**

- **Edge cases with validation errors** (invalid limits/offsets): Show 0-7% improvements, demonstrating the optimizations don't add overhead to error paths
- **Large-scale scenarios**: Benefit most from the join optimization when building longer query strings with many parameters
- **Single vs. multi-parameter cases**: The conditional sorting optimization particularly helps when most parameters have single values

These optimizations are especially effective for typical API usage patterns where query strings contain multiple single-valued parameters.
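For context, a hypothetical call shape that fits this pattern (an authenticated `GsSession` is assumed to be active, and the parameter names are inferred from the tests below rather than taken from the real signature):

```python
from gs_quant.api.gs.content import GsContentApi

# Several single-valued filters in one request: each skips sorted(),
# and the final query string is assembled with a single join.
contents = GsContentApi.get_contents(
    channels={'G10'},
    tags={'T1'},
    limit=100,
)
```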

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 5 Passed |
| 🌀 Generated Regression Tests | 9 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| api/test_content.py::test_get_contents | 80.4μs | 73.7μs | 9.08% ✅ |
🌀 Generated Regression Tests and Runtime
from collections import OrderedDict
from typing import List

# imports
import pytest
from gs_quant.api.gs.content import GsContentApi


# Mocks for dependencies (minimal, no external libraries)
class OrderBy:
    DESC = 'DESC'
    ASC = 'ASC'

class ContentResponse:
    def __init__(self, id, channel, asset_id, author_id, tag, createdTime):
        self.id = id
        self.channel = channel
        self.asset_id = asset_id
        self.author_id = author_id
        self.tag = tag
        self.createdTime = createdTime

class GetManyContentsResponse:
    def __init__(self, data):
        self.data = data

class DummySession:
    def __init__(self, contents):
        self.contents = contents
    def _get(self, url, cls=None):
        # This dummy implementation just returns all contents, ignoring query string
        return GetManyContentsResponse(self.contents)

class GsSession:
    current = None
    @classmethod
    def use(cls, contents=None):
        cls.current = DummySession(contents or [])

# Test data for all scenarios
test_contents = [
    ContentResponse(id='1', channel='G10', asset_id='A1', author_id='U1', tag='T1', createdTime='2024-01-01T00:00:00'),
    ContentResponse(id='2', channel='EM', asset_id='A2', author_id='U2', tag='T2', createdTime='2024-01-02T00:00:00'),
    ContentResponse(id='3', channel='G10', asset_id='A3', author_id='U3', tag='T1', createdTime='2024-01-03T00:00:00'),
    ContentResponse(id='4', channel='EM', asset_id='A4', author_id='U1', tag='T3', createdTime='2024-01-04T00:00:00'),
]

# ----------- BASIC TEST CASES -----------

def test_edge_invalid_limit():
    # Test with limit > 1000
    with pytest.raises(ValueError, match='Limit is too large'):
        GsContentApi.get_contents(limit=1001) # 1.62μs -> 1.58μs (2.46% faster)

def test_edge_negative_offset():
    # Test with negative offset
    with pytest.raises(ValueError, match='Invalid offset'):
        GsContentApi.get_contents(offset=-1) # 1.35μs -> 1.39μs (2.88% slower)

def test_edge_offset_equal_limit():
    # Test with offset == limit (invalid)
    with pytest.raises(ValueError, match='Invalid offset'):
        GsContentApi.get_contents(limit=10, offset=10) # 1.62μs -> 1.51μs (7.37% faster)

def test_edge_offset_greater_than_limit():
    # Test with offset > limit (invalid)
    with pytest.raises(ValueError, match='Invalid offset'):
        GsContentApi.get_contents(limit=10, offset=15) # 1.37μs -> 1.40μs (2.28% slower)

def test_large_scale_invalid_limit():
    # Test limit just above maximum
    large_contents = [
        ContentResponse(id=str(i), channel='C1', asset_id='A1', author_id='U1', tag='T1',
                        createdTime=f'2024-01-{(i % 28) + 1:02d}T00:00:00')
        for i in range(1000)
    ]
    GsSession.use(contents=large_contents)
    with pytest.raises(ValueError, match='Limit is too large'):
        GsContentApi.get_contents(limit=1001) # 1.82μs -> 1.75μs (4.11% faster)

def test_large_scale_invalid_offset():
    # Test offset just above limit
    large_contents = [
        ContentResponse(id=str(i), channel='C1', asset_id='A1', author_id='U1', tag='T1',
                        createdTime=f'2024-01-{(i % 28) + 1:02d}T00:00:00')
        for i in range(1000)
    ]
    GsSession.use(contents=large_contents)
    with pytest.raises(ValueError, match='Invalid offset'):
        GsContentApi.get_contents(limit=1000, offset=1000) # 1.70μs -> 1.69μs (0.533% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from collections import OrderedDict
from typing import Dict, List, Set
from urllib.parse import quote

# imports
import pytest
from gs_quant.api.gs.content import GsContentApi

# --- Minimal stubs and mocks for dependencies ---

# Simulate OrderBy enum
class OrderBy:
    DESC = 'desc'
    ASC = 'asc'

# Simulate ContentResponse and GetManyContentsResponse
class ContentResponse:
    def __init__(self, content_id, data):
        self.content_id = content_id
        self.data = data

    def __eq__(self, other):
        return isinstance(other, ContentResponse) and self.content_id == other.content_id and self.data == other.data

class GetManyContentsResponse:
    def __init__(self, data):
        self.data = data

# Simulate GsSession
class DummySession:
    def __init__(self):
        self.last_url = None
        self.last_cls = None
        self.responses = {}

    def _get(self, url, cls=None):
        self.last_url = url
        self.last_cls = cls
        # Return a canned response based on the URL for testability
        return self.responses.get(url, GetManyContentsResponse([]))

class GsSession:
    current = None

    @classmethod
    def use(cls, session=None):
        cls.current = session or DummySession()

# --- Unit Tests ---

@pytest.fixture(autouse=True)
def setup_session():
    # Setup a dummy session before each test
    session = DummySession()
    GsSession.use(session)
    yield session

# --------------------
# 1. Basic Test Cases
# --------------------

def test_get_contents_zero_limit_raises(setup_session):
    # Arrange/Act/Assert
    with pytest.raises(ValueError, match='Limit is too large'):
        GsContentApi.get_contents(limit=1001) # 1.72μs -> 1.75μs (1.66% slower)

def test_get_contents_negative_offset_raises(setup_session):
    # Arrange/Act/Assert
    with pytest.raises(ValueError, match='Invalid offset'):
        GsContentApi.get_contents(offset=-1) # 1.36μs -> 1.53μs (10.7% slower)

def test_get_contents_offset_equal_to_limit_raises(setup_session):
    # Arrange/Act/Assert
    with pytest.raises(ValueError, match='Invalid offset'):
        GsContentApi.get_contents(offset=10, limit=10) # 1.46μs -> 1.37μs (6.51% faster)

To edit these changes, `git checkout codeflash/optimize-GsContentApi.get_contents-mhb0r2mt` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 28, 2025 20:28
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Oct 28, 2025