⚡️ Speed up function `make_id` by 34% #62

codeflash-ai · 2025-10-28T21:33:59Z

📄 34% (0.34x) speedup for `make_id` in `src/bokeh/util/serialization.py`

⏱️ Runtime : 580 microseconds → 433 microseconds (best of 37 runs)
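The headline percentage is consistent with measuring the time saved relative to the optimized runtime. The sketch below assumes that definition; the helper name `speedup_percent` is hypothetical and is not part of Codeflash or Bokeh.

```python
def speedup_percent(original_us: float, optimized_us: float) -> float:
    """Time saved, expressed as a percentage of the optimized runtime.

    Assumption: this is the formula behind the reported figure; the
    report itself does not state it explicitly.
    """
    return (original_us - optimized_us) / optimized_us * 100


# Runtimes reported above: 580 microseconds -> 433 microseconds.
print(round(speedup_percent(580, 433)))
```

Under this definition, the reported 580 μs and 433 μs round to the 34% in the title.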

📝 Explanation and details

The optimization achieves a 34% speedup by eliminating per-call function-level imports and function call overhead:

**Key Changes:**

1. **Removed redundant dynamic imports**: The original code imported `ID` inside both functions on every call (`from ..core.types import ID`). The optimized version uses the `ID` already imported at module level, eliminating ~16% of execution time according to the line profiler results.
2. **Inlined `make_globally_unique_id()` logic**: Instead of calling a separate function for UUID generation, the optimized code executes `ID(str(uuid.uuid4()))` directly within `make_id()`, avoiding function call overhead and another dynamic import.
3. **Added missing global variables**: Defined `_simple_id` and `_simple_id_lock` at module level (they were missing in the original), ensuring proper initialization.
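The shape of the optimized function described above can be sketched roughly as follows. This is a simplified stand-in, not the actual Bokeh source: `ID` is assumed to be a `str`-based type, and the `use_simple_ids` parameter is a hypothetical replacement for the settings lookup the real function performs.

```python
import uuid
from threading import Lock
from typing import NewType

# Stand-in for bokeh.core.types.ID (assumed here to be a str-based type),
# bound once at module level instead of being imported on every call.
ID = NewType("ID", str)

# Module-level counter state for simple IDs.
_simple_id = 999
_simple_id_lock = Lock()


def make_id(use_simple_ids: bool = True) -> ID:
    """Return 'p<N>' when simple IDs are enabled, otherwise a UUID4 string.

    `use_simple_ids` is a hypothetical parameter standing in for the
    settings check in the real function.
    """
    global _simple_id
    if use_simple_ids:
        with _simple_id_lock:
            _simple_id += 1
            return ID(f"p{_simple_id}")
    # UUID branch inlined rather than delegated to a helper function.
    return ID(str(uuid.uuid4()))
```

Calling `make_id()` twice yields consecutive simple IDs, while `make_id(False)` yields a string that `uuid.UUID` can parse.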

**Why This Works:**

- Function-level imports are costly in a hot path: even when the module is already cached in `sys.modules`, each `from ... import ...` statement repeats the module lookup and attribute resolution on every call
- Function calls have overhead (stack frame creation, parameter passing)
- The line profiler shows the import statement (`from ..core.types import ID`) took 15.9% of total time in the original
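The per-call import cost described above can be observed with a small micro-benchmark. This is an illustrative sketch using the standard library `uuid` module in place of Bokeh's `ID` type; the function names are hypothetical, and absolute timings will vary by machine and Python version.

```python
import timeit
from uuid import uuid4  # module-level import: resolved once when the module loads


def with_local_import() -> str:
    # Function-level import: the import statement is executed on every
    # call, even though the module is already cached in sys.modules.
    from uuid import uuid4 as local_uuid4
    return str(local_uuid4())


def with_module_import() -> str:
    # Uses the name that was bound once at module level.
    return str(uuid4())


def compare(calls: int = 100_000) -> tuple[float, float]:
    """Return total seconds for the (local-import, module-import) variants."""
    local = timeit.timeit(with_local_import, number=calls)
    module = timeit.timeit(with_module_import, number=calls)
    return local, module


if __name__ == "__main__":
    local, module = compare()
    print(f"local import:  {local:.3f}s")
    print(f"module import: {module:.3f}s")
```

Both variants return the same kind of value, so any timing difference comes from the import statement rather than from UUID generation.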

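**Performance Benefits:**

The generated regression tests show improvements of 25-47% across different scenarios:

- Simple ID generation: 31-42% faster
- UUID generation: 26-36% faster
- Large-scale operations (1000 IDs): 27-35% faster

Among the existing unit tests, one (`test_simple_ids_no`) measured 3.4% slower, at sub-microsecond scale (426ns → 441ns).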

This optimization is particularly effective for high-frequency ID generation workloads where make_id() is called repeatedly, as it eliminates per-call import overhead while preserving all original functionality and thread safety.
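The thread-safety property mentioned above can be checked with a sketch like the following. It uses a stand-in counter (`next_simple_id`, a hypothetical name) guarded by a lock in the same way the description says `_simple_id` is guarded; it does not exercise Bokeh itself.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Stand-in counter state, mirroring a lock-guarded module-level counter.
_counter = 999
_counter_lock = Lock()


def next_simple_id() -> str:
    """Increment the shared counter under the lock and format the ID."""
    global _counter
    with _counter_lock:
        _counter += 1
        return f"p{_counter}"


def generate_concurrently(workers: int = 8, per_worker: int = 1000) -> list[str]:
    """Generate IDs from several threads and return them as one flat list."""
    def work(_worker_index: int) -> list[str]:
        return [next_simple_id() for _ in range(per_worker)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(work, range(workers)))
    return [id_ for batch in batches for id_ in batch]


ids = generate_concurrently()
# With the lock held around increment-and-format, no ID is issued twice.
print(len(ids) == len(set(ids)))
```

Because the increment and the formatting both happen while the lock is held, concurrent callers never observe the same counter value.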

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | ✅ 50 Passed |
| 🌀 Generated Regression Tests | ✅ 38 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_adding_next_tick_twice` | 15.4μs | 11.8μs | 31.2%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_adding_periodic_twice` | 19.0μs | 14.4μs | 31.7%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_adding_timeout_twice` | 15.6μs | 11.8μs | 32.2%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_next_tick_does_not_run_if_removed_immediately` | 13.6μs | 10.3μs | 33.1%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_next_tick_runs` | 11.5μs | 8.74μs | 31.7%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_periodic_does_not_run_if_removed_immediately` | 13.8μs | 10.0μs | 37.1%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_periodic_runs` | 10.9μs | 7.98μs | 36.4%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_remove_all_callbacks` | 27.9μs | 20.8μs | 34.4%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_removing_next_tick_twice` | 11.5μs | 8.57μs | 34.2%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_removing_periodic_twice` | 13.7μs | 10.4μs | 32.1%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_removing_timeout_twice` | 13.4μs | 10.2μs | 31.6%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_same_callback_as_all_three_types` | 23.4μs | 16.7μs | 40.2%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_timeout_does_not_run_if_removed_immediately` | 11.4μs | 8.71μs | 31.1%✅ |
| `unit/bokeh/server/test_callbacks__server.py::TestCallbackGroup.test_timeout_runs` | 11.3μs | 9.21μs | 22.2%✅ |
| `unit/bokeh/util/test_util__serialization.py::Test_make_id.test_default` | 23.0μs | 16.9μs | 36.6%✅ |
| `unit/bokeh/util/test_util__serialization.py::Test_make_id.test_simple_ids_no` | 426ns | 441ns | -3.40%⚠️ |
| `unit/bokeh/util/test_util__serialization.py::Test_make_id.test_simple_ids_yes` | 16.6μs | 11.6μs | 42.7%✅ |

🌀 Generated Regression Tests and Runtime

import os
import uuid
from threading import Lock

# imports
import pytest  # used for our unit tests
from bokeh.util.serialization import make_id


# function to test
# (Copying the relevant code for make_id with minimal dependencies for testing)
class Settings:
    """Minimal settings mock for testing."""
    def __init__(self):
        # Use env var to determine simple_ids mode
        self._simple_ids = os.environ.get("BOKEH_SIMPLE_IDS", "yes").lower() != "no"
    def simple_ids(self):
        return self._simple_ids

# Minimal ID type for testing (normally an alias for str)
class ID(str):
    pass

# Global variables as in original code
_simple_id = 999
_simple_id_lock = Lock()
settings = Settings()

# --- Basic Test Cases ---

def test_simple_id_default_mode():
    """Test that make_id returns incrementing IDs starting from p1000 by default."""
    codeflash_output = make_id(); id1 = codeflash_output # 13.6μs -> 9.88μs (37.4% faster)
    codeflash_output = make_id(); id2 = codeflash_output # 4.75μs -> 3.46μs (37.3% faster)
    codeflash_output = make_id(); id3 = codeflash_output # 3.68μs -> 2.81μs (31.0% faster)

def test_simple_id_type_and_format():
    """Test that the returned ID is a string starting with 'p' and followed by an integer."""
    codeflash_output = make_id(); id_val = codeflash_output # 8.70μs -> 6.59μs (32.0% faster)
    num = int(id_val[1:])


def test_switch_to_uuid_mode(monkeypatch):
    """Test that setting env var disables simple id mode and returns UUIDs."""
    monkeypatch.setenv("BOKEH_SIMPLE_IDS", "no")
    global settings
    settings = Settings()  # Re-instantiate to pick up env var
    codeflash_output = make_id(); id_val = codeflash_output # 22.7μs -> 17.4μs (30.1% faster)
    # Should be a valid UUID string
    try:
        uuid_obj = uuid.UUID(id_val)
    except ValueError:
        pytest.fail("Returned ID is not a valid UUID in UUID mode")


def test_env_var_case_insensitivity(monkeypatch):
    """Test that env var is case insensitive for disabling simple IDs."""
    monkeypatch.setenv("BOKEH_SIMPLE_IDS", "NO")
    global settings
    settings = Settings()
    codeflash_output = make_id(); id_val = codeflash_output # 21.7μs -> 17.2μs (26.0% faster)
    try:
        uuid.UUID(id_val)
    except ValueError:
        pytest.fail("Returned ID is not a valid UUID")

def test_env_var_yes(monkeypatch):
    """Test that setting env var to 'yes' enables simple IDs."""
    monkeypatch.setenv("BOKEH_SIMPLE_IDS", "yes")
    global settings
    settings = Settings()
    codeflash_output = make_id(); id_val = codeflash_output # 8.78μs -> 5.96μs (47.2% faster)


def test_simple_id_large_jump():
    """Test that after a manual jump, IDs continue incrementing correctly."""
    global _simple_id
    _simple_id = 2000
    codeflash_output = make_id(); id1 = codeflash_output # 13.6μs -> 9.81μs (38.6% faster)
    codeflash_output = make_id(); id2 = codeflash_output # 4.74μs -> 3.45μs (37.3% faster)

def test_id_type_is_id_subclass():
    """Test that returned type is always ID, not just str."""
    codeflash_output = make_id(); id_val = codeflash_output # 9.40μs -> 6.67μs (40.9% faster)

# --- Large Scale Test Cases ---



def test_simple_id_performance():
    """Test that generating 1000 simple IDs is reasonably fast."""
    import time
    start = time.time()
    ids = [make_id() for _ in range(1000)] # 13.5μs -> 9.98μs (35.6% faster)
    duration = time.time() - start

def test_uuid_performance(monkeypatch):
    """Test that generating 1000 UUIDs is reasonably fast."""
    monkeypatch.setenv("BOKEH_SIMPLE_IDS", "no")
    global settings
    settings = Settings()
    import time
    start = time.time()
    ids = [make_id() for _ in range(1000)] # 20.0μs -> 15.7μs (27.8% faster)
    duration = time.time() - start

# --- Determinism & Robustness ---

def test_simple_id_is_deterministic():
    """Test that simple ID mode is deterministic for sequence of calls."""
    ids1 = [make_id() for _ in range(5)] # 10.1μs -> 7.55μs (33.8% faster)
    # Reset and repeat
    global _simple_id
    _simple_id = 999
    ids2 = [make_id() for _ in range(5)] # 4.78μs -> 3.38μs (41.5% faster)

def test_uuid_is_not_deterministic(monkeypatch):
    """Test that UUID mode is not deterministic (IDs differ between runs)."""
    monkeypatch.setenv("BOKEH_SIMPLE_IDS", "no")
    global settings
    settings = Settings()
    ids1 = [make_id() for _ in range(5)] # 16.8μs -> 13.2μs (26.7% faster)
    ids2 = [make_id() for _ in range(5)] # 7.80μs -> 5.74μs (36.0% faster)

def test_id_str_behavior():
    """Test that returned ID behaves as a string."""
    codeflash_output = make_id(); id_val = codeflash_output # 9.15μs -> 6.98μs (31.1% faster)

# --- Invalid/Unusual Environment Variable Values ---

@pytest.mark.parametrize("env_val", ["", "maybe", "YES", "nO", "No", "yEs"])
def test_env_var_unusual_values(monkeypatch, env_val):
    """Test that unusual values for env var are handled as expected."""
    monkeypatch.setenv("BOKEH_SIMPLE_IDS", env_val)
    global settings
    settings = Settings()
    if env_val.lower() == "no":
        codeflash_output = make_id(); id_val = codeflash_output # 56.1μs -> 41.6μs (34.8% faster)
        try:
            uuid.UUID(id_val)
        except ValueError:
            pytest.fail("Returned ID is not a valid UUID for env var 'no'")
    else:
        codeflash_output = make_id(); id_val = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import os
import uuid
from threading import Lock

# imports
import pytest  # used for our unit tests
from bokeh.util.serialization import make_id


# function to test (standalone version for testing)
class Settings:
    """Simple mock for bokeh.settings.settings.simple_ids()."""
    def __init__(self):
        self._simple_ids = True

    def simple_ids(self):
        return self._simple_ids

    def set_simple_ids(self, value: bool):
        self._simple_ids = value

settings = Settings()

# Simulate bokeh.core.types.ID as just str for testing purposes
ID = str

_simple_id = 999
_simple_id_lock = Lock()

# unit tests

# ------------------------------
# Basic Test Cases
# ------------------------------

def test_make_id_simple_ids_default():
    """Test that make_id returns a string starting with 'p' and monotonically increasing when simple_ids is True."""
    settings.set_simple_ids(True)
    global _simple_id
    _simple_id = 999  # reset for test determinism

    codeflash_output = make_id(); id1 = codeflash_output # 11.1μs -> 8.43μs (31.8% faster)
    codeflash_output = make_id(); id2 = codeflash_output # 4.96μs -> 3.70μs (33.8% faster)
    codeflash_output = make_id(); id3 = codeflash_output # 3.78μs -> 2.83μs (33.6% faster)


def test_make_id_switching_modes():
    """Test switching between simple_ids True and False works as expected."""
    global _simple_id
    _simple_id = 999
    settings.set_simple_ids(True)
    codeflash_output = make_id(); id_simple = codeflash_output
    settings.set_simple_ids(False)
    codeflash_output = make_id(); id_uuid = codeflash_output
    settings.set_simple_ids(True)
    codeflash_output = make_id(); id_simple2 = codeflash_output
    try:
        uuid.UUID(id_uuid)
    except ValueError:
        pytest.fail(f"Returned id '{id_uuid}' is not a valid UUID")

# ------------------------------
# Edge Test Cases
# ------------------------------

def test_make_id_simple_id_wraparound():
    """Test behavior when _simple_id is set to a very large number (simulate wraparound)."""
    global _simple_id
    settings.set_simple_ids(True)
    # Set to max 32-bit int
    _simple_id = 2**31 - 2
    codeflash_output = make_id(); id1 = codeflash_output # 13.5μs -> 10.0μs (35.1% faster)
    codeflash_output = make_id(); id2 = codeflash_output # 4.70μs -> 3.50μs (34.3% faster)
    codeflash_output = make_id(); id3 = codeflash_output # 3.55μs -> 2.83μs (25.6% faster)

def test_make_id_simple_id_zero_and_negative():
    """Test behavior when _simple_id is set to zero and negative values."""
    global _simple_id
    settings.set_simple_ids(True)

    _simple_id = 0
    codeflash_output = make_id(); id1 = codeflash_output # 9.29μs -> 6.51μs (42.8% faster)

    _simple_id = -5
    codeflash_output = make_id(); id2 = codeflash_output # 4.60μs -> 3.34μs (37.7% faster)

def test_make_id_uuid_uniqueness():
    """Test that UUIDs generated are unique across many calls."""
    settings.set_simple_ids(False)
    ids = [make_id() for _ in range(100)] # 8.43μs -> 6.42μs (31.4% faster)



def test_make_id_large_scale_simple_ids():
    """Test performance and correctness for large number of simple_ids."""
    global _simple_id
    settings.set_simple_ids(True)
    _simple_id = 5000
    ids = [make_id() for _ in range(1000)] # 13.6μs -> 10.1μs (35.1% faster)
    # All should have correct prefix
    for i, id_str in enumerate(ids):
        pass


def test_make_id_alternating_modes_large_scale():
    """Test alternating modes in large scale: IDs do not overlap between modes."""
    global _simple_id
    _simple_id = 2000
    ids_simple = []
    ids_uuid = []

    # Generate 500 simple IDs
    settings.set_simple_ids(True)
    ids_simple = [make_id() for _ in range(500)]

    # Generate 500 UUIDs
    settings.set_simple_ids(False)
    ids_uuid = [make_id() for _ in range(500)]

    # Check simple ID format
    for i, id_str in enumerate(ids_simple):
        pass

    # Check UUID format
    for id_str in ids_uuid:
        try:
            uuid.UUID(id_str)
        except ValueError:
            pytest.fail(f"Returned id '{id_str}' is not a valid UUID")
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-make_id-mhb33dbo` and push.


codeflash-ai bot requested a review from mashraf-222 October 28, 2025 21:34

codeflash-ai bot added the `⚡️ codeflash` (Optimization PR opened by Codeflash AI) and `🎯 Quality: High` (Optimization Quality according to Codeflash) labels Oct 28, 2025
