@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 6% (0.06x) speedup for _try_blobstore_fetch in backend/python/app/utils/fetch_full_record.py

⏱️ Runtime : 1.56 milliseconds → 1.47 milliseconds (best of 179 runs)

📝 Explanation and details

The optimized code achieves a ~6% runtime improvement (from 1.56 ms to 1.47 ms) and a 1.1% throughput improvement (118,590 to 119,930 ops/sec) through several key micro-optimizations:

Key optimizations:

  1. Guarded secret lookup: Changed secret_keys.get("scopedJwtSecret") to secret_keys.get("scopedJwtSecret") if secret_keys else None, so a missing SECRET_KEYS config yields None instead of raising AttributeError on the .get() call.

  2. Optimized endpoint lookup: Replaced nested .get() calls with a single conditional expression that checks for endpoint existence before accessing the nested dictionary, reducing dictionary traversal overhead.

  3. Streamlined signed URL handling: Removed the unnecessary if(data.get("signedUrl")) check and directly assigned signed_url = data.get("signedUrl"), then used a simple if signed_url: condition. This eliminates duplicate dictionary lookups on the same key.

  4. Cleaner variable naming: Used resp2 instead of reusing resp for the signed URL response, which improves clarity and avoids rebinding the outer response variable mid-block.
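Taken together, the lookup patterns above can be sketched as follows. This is an illustrative before/after of the dictionary-access style only; the function and variable names mirror the explanation, not the actual `fetch_full_record.py` code:

```python
def extract_config(secret_keys, endpoints, data):
    """Illustrative sketch of the optimized lookup patterns."""
    # 1. Guard the dict before .get(): a None config now yields None
    #    instead of raising AttributeError.
    scoped_jwt_secret = secret_keys.get("scopedJwtSecret") if secret_keys else None

    # 2. One conditional instead of chained .get("cm", {}).get("endpoint") calls.
    cm = endpoints.get("cm") if endpoints else None
    nodejs_endpoint = cm.get("endpoint") if cm else None

    # 3. Look up "signedUrl" once, then branch on the bound name.
    signed_url = data.get("signedUrl")
    if signed_url:
        return scoped_jwt_secret, nodejs_endpoint, signed_url
    return scoped_jwt_secret, nodejs_endpoint, None
```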

Performance impact: These optimizations primarily reduce the overhead of dictionary operations and conditional checks. The line profiler shows the most time is spent in the get_config() calls (15-17% of total time), and the dictionary operations account for 8-10% of execution time. By minimizing these lookups, the optimizations provide consistent small gains across all test scenarios.
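The per-key overhead that the profiler attributes to dictionary operations can be observed with a quick `timeit` microbenchmark. Numbers vary by machine; this is illustrative, not the profiler run cited above:

```python
import timeit

data = {"signedUrl": "http://signed-url"}

# Original pattern: the key is looked up twice.
double = timeit.timeit(
    'if data.get("signedUrl"):\n    url = data.get("signedUrl")',
    globals={"data": data},
    number=1_000_000,
)

# Optimized pattern: one lookup, then branch on the bound name.
single = timeit.timeit(
    'url = data.get("signedUrl")\nif url:\n    pass',
    globals={"data": data},
    number=1_000_000,
)

print(f"double lookup: {double:.3f}s  single lookup: {single:.3f}s")
```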

Best for: The optimizations show consistent benefits across all test cases - basic success scenarios, edge cases with missing data, and high-throughput concurrent requests (up to 200 concurrent operations). The improvements are most noticeable in high-volume scenarios where the reduced per-operation overhead compounds.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 664 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions

import pytest  # used for our unit tests
from app.utils.fetch_full_record import _try_blobstore_fetch

# --- Mocks and Test Doubles ---

class DummyLogger:
    def __init__(self):
        self.messages = []
    def info(self, msg, *args): self.messages.append(("info", msg, args))
    def error(self, msg, *args): self.messages.append(("error", msg, args))
    def exception(self, msg): self.messages.append(("exception", msg))
    def debug(self, msg, *args): self.messages.append(("debug", msg, args))
    def warning(self, msg, *args): self.messages.append(("warning", msg, args))

class DummyConfigService:
    def __init__(self, config_map=None):
        # config_map: dict of key -> value
        self.config_map = config_map or {}
    async def get_config(self, key: str, default=None, use_cache=True):
        # Simulate async config fetch
        return self.config_map.get(key, default)

class DummyArangoService:
    def __init__(self, mapping=None):
        # mapping: dict of virtual_record_id -> document_id
        self.mapping = mapping or {}
        class DummyDB:
            class DummyAQL:
                def execute(self_inner, query):
                    # Parse query to extract virtualRecordId
                    import re
                    m = re.search(r'FILTER doc\.virtualRecordId == "([^"]+)"', query)
                    if m:
                        vid = m.group(1)
                        doc_id = self.mapping.get(vid)
                        if doc_id:
                            return [doc_id]
                        else:
                            return []
                    return []
            aql = DummyAQL()
        self.db = DummyDB()

class DummyResponse:
    def __init__(self, status, json_data):
        self.status = status
        self._json_data = json_data
    async def json(self):
        return self._json_data
    async def __aenter__(self): return self
    async def __aexit__(self, exc_type, exc, tb): pass

class DummySession:
    def __init__(self, responses):
        # responses: dict of url -> DummyResponse
        self.responses = responses
        self.calls = []
    async def __aenter__(self): return self
    async def __aexit__(self, exc_type, exc, tb): pass
    def get(self, url, headers=None):
        # Return DummyResponse for url
        self.calls.append((url, headers))
        resp = self.responses.get(url)
        if not resp:
            # Simulate not found
            return DummyResponse(404, {})
        return resp

# --- Minimal BlobStorage Implementation for Testing ---

class TestBlobStorage:
    """
    Minimal testable version of BlobStorage for unit tests.
    """
    __test__ = False  # keep pytest from collecting this helper as a test class
    def __init__(self, logger, config_service, arango_service=None, session=None):
        self.logger = logger
        self.config_service = config_service
        self.arango_service = arango_service
        self._session = session  # inject dummy session for tests

    async def get_document_id_by_virtual_record_id(self, virtual_record_id: str) -> str:
        if not self.arango_service:
            self.logger.error("ArangoService not initialized, cannot get document ID by virtual record ID.")
            raise Exception("ArangoService not initialized, cannot get document ID by virtual record ID.")
        try:
            # Simulate ArangoDB query
            doc_id = self.arango_service.mapping.get(virtual_record_id)
            if doc_id:
                return doc_id
            else:
                self.logger.info("No document ID found for virtual record ID: %s", virtual_record_id)
                return None
        except Exception as e:
            self.logger.error("Error getting document ID by virtual record ID: %s", str(e))
            raise e

    async def get_record_from_storage(self, virtual_record_id: str, org_id: str) -> str:
        self.logger.info("Retrieving record from storage for virtual_record_id: %s", virtual_record_id)
        try:
            # Generate JWT token for authorization (simulate)
            payload = {
                "orgId": org_id,
                "scopes": ["storage:token"],
            }
            secret_keys = await self.config_service.get_config("SECRET_KEYS")
            scoped_jwt_secret = secret_keys.get("scopedJwtSecret")
            if not scoped_jwt_secret:
                raise ValueError("Missing scoped JWT secret")
            # Simulate JWT encoding
            jwt_token = "dummy.jwt.token"
            headers = {
                "Authorization": f"Bearer {jwt_token}"
            }
            endpoints = await self.config_service.get_config("ENDPOINTS")
            nodejs_endpoint = endpoints.get("cm", {}).get("endpoint", "http://dummy-nodejs-endpoint")
            if not nodejs_endpoint:
                raise ValueError("Missing CM endpoint configuration")
            document_id = await self.get_document_id_by_virtual_record_id(virtual_record_id)
            if not document_id:
                self.logger.info("No document ID found for virtual record ID: %s", virtual_record_id)
                return None
            # Build download URL
            download_url = f"{nodejs_endpoint}/storage/download/{document_id}"
            # Use injected dummy session for testing
            session = self._session
            async with session:
                async with session.get(download_url, headers=headers) as resp:
                    if resp.status == 200:
                        data = await resp.json()
                        if data.get("signedUrl"):
                            signed_url = data.get("signedUrl")
                            async with session.get(signed_url, headers=headers) as resp2:
                                if resp2.status == 200:
                                    data2 = await resp2.json()
                                    self.logger.info("Successfully retrieved record for virtual_record_id from blob storage: %s", virtual_record_id)
                                    return data2.get("record")
                        # If no signedUrl, return None
                        return None
                    else:
                        self.logger.error("Failed to retrieve record: status %s, virtual_record_id: %s", resp.status, virtual_record_id)
                        raise Exception("Failed to retrieve record from storage")
        except Exception as e:
            self.logger.error("Error retrieving record from storage: %s", str(e))
            self.logger.exception("Detailed error trace:")
            raise e

# --- Test Cases ---

# --- 1. Basic Test Cases ---

@pytest.mark.asyncio
async def test__try_blobstore_fetch_basic_success():
    """
    Basic: Function returns expected record when everything succeeds.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({"rec123": "doc456"})
    # Setup session responses
    record_data = {"record": {"foo": "bar"}}
    signed_url = "http://signed-url"
    session = DummySession({
        "http://dummy-nodejs-endpoint/storage/download/doc456": DummyResponse(200, {"signedUrl": signed_url}),
        signed_url: DummyResponse(200, record_data)
    })
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    result = await _try_blobstore_fetch(blob_store, "org1", "rec123")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_basic_none_when_no_document_id():
    """
    Basic: Returns None if document_id is not found for the virtual_record_id.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({})  # No mapping
    session = DummySession({})
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    result = await _try_blobstore_fetch(blob_store, "org1", "rec999")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_basic_none_when_no_signed_url():
    """
    Basic: Returns None if the download response does not contain a signedUrl.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({"rec123": "doc456"})
    session = DummySession({
        "http://dummy-nodejs-endpoint/storage/download/doc456": DummyResponse(200, {})  # No signedUrl
    })
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    result = await _try_blobstore_fetch(blob_store, "org1", "rec123")

# --- 2. Edge Test Cases ---

@pytest.mark.asyncio
async def test__try_blobstore_fetch_edge_missing_secret_key():
    """
    Edge: Raises ValueError if scopedJwtSecret is missing in config.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {},  # missing scopedJwtSecret
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({"rec123": "doc456"})
    session = DummySession({})
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    result = await _try_blobstore_fetch(blob_store, "org1", "rec123")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_edge_download_http_error():
    """
    Edge: Raises Exception if download response status is not 200.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({"rec123": "doc456"})
    session = DummySession({
        "http://dummy-nodejs-endpoint/storage/download/doc456": DummyResponse(500, {})  # HTTP error
    })
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    result = await _try_blobstore_fetch(blob_store, "org1", "rec123")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_edge_signed_url_http_error():
    """
    Edge: Raises Exception if signed URL response status is not 200.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({"rec123": "doc456"})
    signed_url = "http://signed-url"
    session = DummySession({
        "http://dummy-nodejs-endpoint/storage/download/doc456": DummyResponse(200, {"signedUrl": signed_url}),
        signed_url: DummyResponse(404, {})  # HTTP error on signed URL
    })
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    result = await _try_blobstore_fetch(blob_store, "org1", "rec123")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_edge_concurrent_execution():
    """
    Edge: Test multiple concurrent fetches for different records.
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    arango_service = DummyArangoService({"recA": "docA", "recB": "docB"})
    session = DummySession({
        "http://dummy-nodejs-endpoint/storage/download/docA": DummyResponse(200, {"signedUrl": "http://signed-url-A"}),
        "http://signed-url-A": DummyResponse(200, {"record": {"val": "A"}}),
        "http://dummy-nodejs-endpoint/storage/download/docB": DummyResponse(200, {"signedUrl": "http://signed-url-B"}),
        "http://signed-url-B": DummyResponse(200, {"record": {"val": "B"}}),
    })
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    results = await asyncio.gather(
        _try_blobstore_fetch(blob_store, "org1", "recA"),
        _try_blobstore_fetch(blob_store, "org1", "recB"),
    )

# --- 3. Large Scale Test Cases ---

@pytest.mark.asyncio
async def test__try_blobstore_fetch_large_scale_many_concurrent():
    """
    Large Scale: Test many concurrent fetches (50).
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    # Generate mappings and responses
    N = 50
    mapping = {f"rec{i}": f"doc{i}" for i in range(N)}
    responses = {}
    for i in range(N):
        signed_url = f"http://signed-url-{i}"
        responses[f"http://dummy-nodejs-endpoint/storage/download/doc{i}"] = DummyResponse(200, {"signedUrl": signed_url})
        responses[signed_url] = DummyResponse(200, {"record": {"val": i}})
    arango_service = DummyArangoService(mapping)
    session = DummySession(responses)
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    coros = [_try_blobstore_fetch(blob_store, "org1", f"rec{i}") for i in range(N)]
    results = await asyncio.gather(*coros)

# --- 4. Throughput Test Cases ---

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_small_load():
    """
    Throughput: Test function performance with small load (5 concurrent).
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    N = 5
    mapping = {f"rec{i}": f"doc{i}" for i in range(N)}
    responses = {}
    for i in range(N):
        signed_url = f"http://signed-url-{i}"
        responses[f"http://dummy-nodejs-endpoint/storage/download/doc{i}"] = DummyResponse(200, {"signedUrl": signed_url})
        responses[signed_url] = DummyResponse(200, {"record": {"val": i}})
    arango_service = DummyArangoService(mapping)
    session = DummySession(responses)
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    coros = [_try_blobstore_fetch(blob_store, "org1", f"rec{i}") for i in range(N)]
    results = await asyncio.gather(*coros)

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_medium_load():
    """
    Throughput: Test function performance with medium load (20 concurrent).
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    N = 20
    mapping = {f"rec{i}": f"doc{i}" for i in range(N)}
    responses = {}
    for i in range(N):
        signed_url = f"http://signed-url-{i}"
        responses[f"http://dummy-nodejs-endpoint/storage/download/doc{i}"] = DummyResponse(200, {"signedUrl": signed_url})
        responses[signed_url] = DummyResponse(200, {"record": {"val": i}})
    arango_service = DummyArangoService(mapping)
    session = DummySession(responses)
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    coros = [_try_blobstore_fetch(blob_store, "org1", f"rec{i}") for i in range(N)]
    results = await asyncio.gather(*coros)

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_high_volume():
    """
    Throughput: Test function performance with high volume (100 concurrent).
    """
    logger = DummyLogger()
    config_service = DummyConfigService({
        "SECRET_KEYS": {"scopedJwtSecret": "dummysecret"},
        "ENDPOINTS": {"cm": {"endpoint": "http://dummy-nodejs-endpoint"}}
    })
    N = 100
    mapping = {f"rec{i}": f"doc{i}" for i in range(N)}
    responses = {}
    for i in range(N):
        signed_url = f"http://signed-url-{i}"
        responses[f"http://dummy-nodejs-endpoint/storage/download/doc{i}"] = DummyResponse(200, {"signedUrl": signed_url})
        responses[signed_url] = DummyResponse(200, {"record": {"val": i}})
    arango_service = DummyArangoService(mapping)
    session = DummySession(responses)
    blob_store = TestBlobStorage(logger, config_service, arango_service, session)
    coros = [_try_blobstore_fetch(blob_store, "org1", f"rec{i}") for i in range(N)]
    results = await asyncio.gather(*coros)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import asyncio

import pytest
from app.modules.transformers.blob_storage import BlobStorage
from app.utils.fetch_full_record import _try_blobstore_fetch


# Mock DefaultEndpoints
class DefaultEndpoints:
    NODEJS_ENDPOINT = type("Enum", (), {"value": "http://mock-nodejs-endpoint/"})

# Mock config_node_constants
class config_node_constants:
    SECRET_KEYS = type("Enum", (), {"value": "secret_keys"})
    ENDPOINTS = type("Enum", (), {"value": "endpoints"})

# ---- Mock Services ----

class MockLogger:
    def __init__(self):
        self.infos = []
        self.errors = []
        self.exceptions = []
    def info(self, msg, *args):
        self.infos.append((msg, args))
    def error(self, msg, *args):
        self.errors.append((msg, args))
    def exception(self, msg):
        self.exceptions.append(msg)
    def debug(self, msg, *args):
        pass
    def warning(self, msg, *args):
        pass

class MockArangoService:
    def __init__(self, mapping=None):
        self.mapping = mapping or {}
        self.db = self
        self.aql = self  # lets callers reach execute() via db.aql.execute(...)
    def execute(self, query):
        # parse virtualRecordId from the AQL query string
        import re
        m = re.search(r'FILTER doc\.virtualRecordId == "([^"]+)"', query)
        if m:
            doc_id = self.mapping.get(m.group(1))
            if doc_id:
                return iter([doc_id])
        return iter([])

class MockConfigService:
    def __init__(self, secret_keys=None, endpoints=None):
        self.secret_keys = secret_keys or {"scopedJwtSecret": "mock_secret"}
        self.endpoints = endpoints or {"cm": {"endpoint": DefaultEndpoints.NODEJS_ENDPOINT.value}}
    async def get_config(self, key):
        if key == config_node_constants.SECRET_KEYS.value:
            return self.secret_keys
        elif key == config_node_constants.ENDPOINTS.value:
            return self.endpoints
        return None

# ---- Test Fixtures ----

@pytest.fixture
def logger():
    return MockLogger()

@pytest.fixture
def arango_service():
    # Maps virtual_record_id to document_id
    return MockArangoService(mapping={"vrid1": "docid1", "vrid2": "docid2"})

@pytest.fixture
def config_service():
    return MockConfigService()

@pytest.fixture
def blob_store(logger, config_service, arango_service):
    return BlobStorage(logger, config_service, arango_service)

# ---- Basic Test Cases ----

@pytest.mark.asyncio
async def test__try_blobstore_fetch_basic_success(blob_store):
    # Basic: Should return record dict when everything is correct
    result = await _try_blobstore_fetch(blob_store, org_id="org1", record_id="vrid1")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_basic_none_on_missing_document(blob_store):
    # Should return None if virtual_record_id not mapped
    result = await _try_blobstore_fetch(blob_store, org_id="org1", record_id="not_found_vrid")

@pytest.mark.asyncio
async def test__try_blobstore_fetch_edge_concurrent_requests(blob_store):
    # Test concurrent fetches for different record_ids
    tasks = [
        _try_blobstore_fetch(blob_store, org_id="org1", record_id="vrid1"),
        _try_blobstore_fetch(blob_store, org_id="org1", record_id="vrid2"),
        _try_blobstore_fetch(blob_store, org_id="org1", record_id="not_found_vrid"),
    ]
    results = await asyncio.gather(*tasks)

# ---- Large Scale Test Cases ----

@pytest.mark.asyncio
async def test__try_blobstore_fetch_large_scale_many_concurrent(blob_store, logger, config_service):
    # Simulate multiple concurrent fetches
    arango_service = MockArangoService(mapping={f"vrid{i}": f"docid{i}" for i in range(10)})
    bs = BlobStorage(logger, config_service, arango_service)
    tasks = [
        _try_blobstore_fetch(bs, org_id="org1", record_id=f"vrid{i}")
        for i in range(10)
    ]
    results = await asyncio.gather(*tasks)

@pytest.mark.asyncio
async def test__try_blobstore_fetch_large_scale_missing_and_found(blob_store, logger, config_service):
    # Mix of found and missing record_ids
    mapping = {f"vrid{i}": f"docid{i}" for i in range(5)}
    arango_service = MockArangoService(mapping=mapping)
    bs = BlobStorage(logger, config_service, arango_service)
    tasks = [
        _try_blobstore_fetch(bs, org_id="org1", record_id=f"vrid{i}") for i in range(10)
    ]
    results = await asyncio.gather(*tasks)

# ---- Throughput Test Cases ----

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_small_load(blob_store, logger, config_service):
    # Throughput: Small load (5 requests)
    arango_service = MockArangoService(mapping={f"vrid{i}": f"docid{i}" for i in range(5)})
    bs = BlobStorage(logger, config_service, arango_service)
    tasks = [
        _try_blobstore_fetch(bs, org_id="org1", record_id=f"vrid{i}") for i in range(5)
    ]
    results = await asyncio.gather(*tasks)

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_medium_load(blob_store, logger, config_service):
    # Throughput: Medium load (50 requests)
    arango_service = MockArangoService(mapping={f"vrid{i}": f"docid{i}" for i in range(50)})
    bs = BlobStorage(logger, config_service, arango_service)
    tasks = [
        _try_blobstore_fetch(bs, org_id="org1", record_id=f"vrid{i}") for i in range(50)
    ]
    results = await asyncio.gather(*tasks)

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_large_load(blob_store, logger, config_service):
    # Throughput: Large load (200 requests, under 1000 as per instructions)
    arango_service = MockArangoService(mapping={f"vrid{i}": f"docid{i}" for i in range(200)})
    bs = BlobStorage(logger, config_service, arango_service)
    tasks = [
        _try_blobstore_fetch(bs, org_id="org1", record_id=f"vrid{i}") for i in range(200)
    ]
    results = await asyncio.gather(*tasks)

@pytest.mark.asyncio
async def test__try_blobstore_fetch_throughput_mixed_load(blob_store, logger, config_service):
    # Throughput: Mixed load (half found, half missing)
    mapping = {f"vrid{i}": f"docid{i}" for i in range(100)}
    arango_service = MockArangoService(mapping=mapping)
    bs = BlobStorage(logger, config_service, arango_service)
    tasks = [
        _try_blobstore_fetch(bs, org_id="org1", record_id=f"vrid{i}") for i in range(200)
    ]
    results = await asyncio.gather(*tasks)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-_try_blobstore_fetch-mhe0j5cy` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 22:45
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Oct 30, 2025