@codeflash-ai codeflash-ai bot commented Oct 28, 2025

📄 19% (0.19x) speedup for _split_feature_trait in src/bokeh/plotting/_renderer.py

⏱️ Runtime : 1.55 milliseconds → 1.31 milliseconds (best of 168 runs)

📝 Explanation and details

The optimized code replaces the split() method with direct string indexing using find(). Instead of splitting the string into a list and then checking its length, the optimization:

  1. Uses str.find("_") instead of str.split("_", 1) - This returns the index of the first underscore or -1 if not found, avoiding the overhead of creating a list object.

  2. Direct string slicing - When an underscore is found, it uses ft[:idx] and ft[idx+1:] to extract the parts directly, eliminating the intermediate list creation and tuple conversion.

The performance benefit comes from not allocating the list that split() creates. In CPython both methods are implemented in C, but find() only returns an integer index, whereas split() must allocate and populate a list. When an underscore is present, the optimized version still has to build both substrings through slicing, so the saving applies mainly to inputs without an underscore.
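A minimal sketch of the change described above. The body of `_split_feature_trait` is not shown in this conversation, so both functions are reconstructions from the description rather than the actual Bokeh source. The names `split_original` and `split_optimized` are hypothetical, and the no-underscore return value `(ft, None)` is an assumption inferred from the generated tests.

```python
from __future__ import annotations


def split_original(ft: str) -> tuple[str, str | None]:
    """Split-based variant (assumed shape): build a list, then check its length."""
    parts = ft.split("_", 1)  # allocates a list with one or two items
    if len(parts) == 2:
        return parts[0], parts[1]
    return ft, None  # assumption: no underscore means there is no trait


def split_optimized(ft: str) -> tuple[str, str | None]:
    """find-based variant: locate the first underscore, then slice around it."""
    idx = ft.find("_")  # index of the first underscore, or -1 if absent
    if idx == -1:
        return ft, None  # early return, no list is created
    return ft[:idx], ft[idx + 1:]


if __name__ == "__main__":
    for sample in ("line_color", "glyph_size_large", "x", "_foo", "foo_", "_"):
        print(repr(sample), split_original(sample), split_optimized(sample))
```

Under these assumptions the two variants return the same tuple for every input, which is the property the generated regression tests check through `codeflash_output`.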

Test case performance patterns:

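  • Strings without underscores show the best improvement (up to 43% faster in the short cases) because the optimized version returns as soon as find() returns -1, without building a list. The gain is not uniform: several no-underscore cases are within a few percent of the original, and a few are slightly slower (up to about 8%).
  • Strings with underscores are generally slower in the optimized version (mostly 13-30% in the annotated cases, with a few long-string cases under 10%), likely due to the two string slicing operations and the function call overhead in the microbenchmark context.
  • Large strings without underscores benefit significantly (25-41% faster in the two 1000-character cases). Both find() and split() must scan the whole string when no underscore is present, so the saving comes from skipping the list allocation rather than from an early exit.

The overall figure (reported as 19% in the summary; the rounded runtimes of 1.55 ms and 1.31 ms work out to about 18%) is aggregated over the generated test runs, whose total is dominated by the large-scale loop test. It shows that, for this test mix, the gain from avoiding list allocation outweighs the slicing overhead; it does not by itself show which case dominates in typical usage.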
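The per-case pattern above can be checked locally with a small `timeit` comparison. This is a sketch under stated assumptions, not the Codeflash benchmark harness: `split_original` and `split_optimized` are hypothetical reconstructions of the two variants, `compare` is an illustrative helper, and the measured times include the overhead of the wrapping lambda.

```python
from __future__ import annotations

import timeit


def split_original(ft: str) -> tuple[str, str | None]:
    # Assumed shape of the split-based variant.
    parts = ft.split("_", 1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return ft, None


def split_optimized(ft: str) -> tuple[str, str | None]:
    # Assumed shape of the find-based variant.
    idx = ft.find("_")
    if idx == -1:
        return ft, None
    return ft[:idx], ft[idx + 1:]


def compare(sample: str, number: int = 50_000) -> dict[str, float]:
    """Return the best per-call time in nanoseconds for each variant."""
    results: dict[str, float] = {}
    for name, fn in (("original", split_original), ("optimized", split_optimized)):
        # Take the minimum of several repeats to reduce scheduling noise.
        best = min(timeit.repeat(lambda: fn(sample), number=number, repeat=5))
        results[name] = best / number * 1e9
    return results


if __name__ == "__main__":
    for sample in ("line_color", "yvalue", "c" * 1000):
        timings = compare(sample)
        label = sample if len(sample) <= 12 else f"{sample[:3]}... ({len(sample)} chars)"
        print(f"{label:>22}: {timings['original']:.0f}ns -> {timings['optimized']:.0f}ns")
```

Absolute numbers will differ by machine and Python version, so only the relative ordering within one run is meaningful.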

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 5067 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from bokeh.plotting._renderer import _split_feature_trait

# unit tests

# ----------------------------
# Basic Test Cases
# ----------------------------

def test_basic_split_with_underscore():
    # Standard case with one underscore
    codeflash_output = _split_feature_trait("line_color") # 678ns -> 929ns (27.0% slower)
    # Standard case with one underscore and longer second part
    codeflash_output = _split_feature_trait("fill_alpha") # 306ns -> 401ns (23.7% slower)
    # Standard case with one underscore and numeric second part
    codeflash_output = _split_feature_trait("foo_123") # 293ns -> 376ns (22.1% slower)
    # Standard case with one underscore and special chars in second part
    codeflash_output = _split_feature_trait("bar_!@#") # 200ns -> 249ns (19.7% slower)

def test_basic_split_multiple_underscores():
    # Only split at the first underscore, rest remains
    codeflash_output = _split_feature_trait("glyph_size_large") # 651ns -> 823ns (20.9% slower)
    codeflash_output = _split_feature_trait("abc_def_ghi") # 387ns -> 448ns (13.6% slower)

def test_basic_split_no_underscore():
    # No underscore: returns the whole string as the first part and None as the second
    codeflash_output = _split_feature_trait("x") # 728ns -> 653ns (11.5% faster)
    codeflash_output = _split_feature_trait("yvalue") # 320ns -> 325ns (1.54% slower)
    codeflash_output = _split_feature_trait("A") # 245ns -> 254ns (3.54% slower)
    codeflash_output = _split_feature_trait("Test") # 251ns -> 218ns (15.1% faster)

# ----------------------------
# Edge Test Cases
# ----------------------------


def test_edge_underscore_at_start():
    # Underscore at start: first part is empty string
    codeflash_output = _split_feature_trait("_foo") # 966ns -> 1.11μs (12.9% slower)

def test_edge_underscore_at_end():
    # Underscore at end: second part is empty string
    codeflash_output = _split_feature_trait("foo_") # 773ns -> 950ns (18.6% slower)

def test_edge_only_underscore():
    # Only underscore: first part is empty, second part is empty
    codeflash_output = _split_feature_trait("_") # 744ns -> 915ns (18.7% slower)

def test_edge_multiple_underscores_only():
    # Multiple underscores only: first split, rest remains
    codeflash_output = _split_feature_trait("__") # 700ns -> 955ns (26.7% slower)
    codeflash_output = _split_feature_trait("___") # 308ns -> 405ns (24.0% slower)

def test_edge_single_char():
    # Single character string: returns char, None
    codeflash_output = _split_feature_trait("z") # 754ns -> 745ns (1.21% faster)

def test_edge_unicode_and_special_chars():
    # Unicode characters in input
    codeflash_output = _split_feature_trait("π_θ") # 988ns -> 1.29μs (23.5% slower)
    codeflash_output = _split_feature_trait("λvalue") # 667ns -> 465ns (43.4% faster)
    # Special characters in input
    codeflash_output = _split_feature_trait("a_$b") # 344ns -> 463ns (25.7% slower)
    codeflash_output = _split_feature_trait("$") # 297ns -> 268ns (10.8% faster)

def test_edge_numeric_strings():
    # Numeric strings
    codeflash_output = _split_feature_trait("1_2") # 648ns -> 779ns (16.8% slower)
    codeflash_output = _split_feature_trait("123") # 472ns -> 377ns (25.2% faster)
    codeflash_output = _split_feature_trait("9_") # 335ns -> 365ns (8.22% slower)

def test_edge_whitespace_handling():
    # Whitespace in input
    codeflash_output = _split_feature_trait("foo_bar baz") # 655ns -> 856ns (23.5% slower)
    codeflash_output = _split_feature_trait("foo bar") # 486ns -> 388ns (25.3% faster)
    codeflash_output = _split_feature_trait("foo_ bar") # 271ns -> 350ns (22.6% slower)
    codeflash_output = _split_feature_trait(" foo_bar") # 280ns -> 320ns (12.5% slower)

def test_edge_long_first_part_no_underscore():
    # Long first part, no underscore
    s = "x" * 50
    codeflash_output = _split_feature_trait(s) # 711ns -> 715ns (0.559% slower)

def test_edge_long_second_part():
    # Long second part after first underscore
    s = "a_" + "b" * 100
    codeflash_output = _split_feature_trait(s) # 615ns -> 851ns (27.7% slower)

def test_edge_long_first_and_second_part():
    # Long first and second part
    s = "x" * 100 + "_" + "y" * 100
    codeflash_output = _split_feature_trait(s) # 662ns -> 871ns (24.0% slower)

# ----------------------------
# Large Scale Test Cases
# ----------------------------

def test_large_scale_many_inputs():
    # Test with many different inputs, including edge cases
    for i in range(1, 1000):
        # Case: "foo{i}_bar{i}"
        s = f"foo{i}_bar{i}"
        codeflash_output = _split_feature_trait(s) # 215μs -> 252μs (14.7% slower)
        # Case: "x" * i
        s2 = "x" * i
        codeflash_output = _split_feature_trait(s2)
        # Case: "x" * i + "_" + "y" * i
        s3 = "x" * i + "_" + "y" * i # 345μs -> 192μs (79.5% faster)
        codeflash_output = _split_feature_trait(s3)
        # Case: "_" + "z" * i
        s4 = "_" + "z" * i
        codeflash_output = _split_feature_trait(s4) # 363μs -> 279μs (30.1% faster)
        # Case: "z" * i + "_"
        s5 = "z" * i + "_"
        codeflash_output = _split_feature_trait(s5)

def test_large_scale_max_length_strings():
    # Test with long strings (about 1000 characters on each side of the underscore)
    s1 = "a" * 1000 + "_" + "b" * 1000
    codeflash_output = _split_feature_trait(s1) # 1.46μs -> 1.61μs (9.85% slower)
    s2 = "c" * 1000
    codeflash_output = _split_feature_trait(s2) # 722ns -> 511ns (41.3% faster)
    s3 = "_" + "d" * 999
    codeflash_output = _split_feature_trait(s3) # 464ns -> 529ns (12.3% slower)
    s4 = "e" * 999 + "_"
    codeflash_output = _split_feature_trait(s4) # 645ns -> 438ns (47.3% faster)

def test_large_scale_all_ascii_chars():
    # Test with all ASCII characters before and after underscore
    import string
    s = string.ascii_letters + "_" + string.digits
    codeflash_output = _split_feature_trait(s) # 663ns -> 884ns (25.0% slower)
    s2 = string.punctuation
    codeflash_output = _split_feature_trait(s2) # 366ns -> 468ns (21.8% slower)

def test_large_scale_stress_with_repeated_pattern():
    # Stress test with repeated pattern and underscore in the middle
    s = ("ab_" * 500)[:-1]  # Remove last character to avoid trailing underscore
    first_underscore = s.find("_")
    codeflash_output = _split_feature_trait(s) # 870ns -> 913ns (4.71% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from bokeh.plotting._renderer import _split_feature_trait

# unit tests

# --- Basic Test Cases ---

def test_basic_two_parts():
    # Standard case: string with one underscore
    codeflash_output = _split_feature_trait("line_color") # 654ns -> 823ns (20.5% slower)
    # Another standard case
    codeflash_output = _split_feature_trait("foo_bar") # 364ns -> 444ns (18.0% slower)
    # Numbers and letters
    codeflash_output = _split_feature_trait("x_1") # 289ns -> 361ns (19.9% slower)
    # Underscore in the second part
    codeflash_output = _split_feature_trait("abc_def_ghi") # 238ns -> 292ns (18.5% slower)
    # Case sensitivity
    codeflash_output = _split_feature_trait("Line_Color") # 212ns -> 274ns (22.6% slower)

def test_basic_no_underscore():
    # No underscore: returns the whole string as the first part and None as the second
    codeflash_output = _split_feature_trait("x") # 727ns -> 665ns (9.32% faster)
    codeflash_output = _split_feature_trait("A") # 294ns -> 297ns (1.01% slower)
    codeflash_output = _split_feature_trait("foo") # 259ns -> 281ns (7.83% slower)
    codeflash_output = _split_feature_trait("123") # 200ns -> 202ns (0.990% slower)


def test_edge_leading_underscore():
    # Leading underscore: first part is empty string
    codeflash_output = _split_feature_trait("_foo") # 949ns -> 1.14μs (16.6% slower)

def test_edge_trailing_underscore():
    # Trailing underscore: second part is empty string
    codeflash_output = _split_feature_trait("foo_") # 785ns -> 923ns (15.0% slower)

def test_edge_multiple_underscores():
    # Multiple underscores: only split on the first
    codeflash_output = _split_feature_trait("a_b_c_d") # 733ns -> 964ns (24.0% slower)
    codeflash_output = _split_feature_trait("_a_b") # 335ns -> 481ns (30.4% slower)
    codeflash_output = _split_feature_trait("__b") # 220ns -> 267ns (17.6% slower)

def test_edge_only_underscore():
    # Only underscore: splits into two empty strings
    codeflash_output = _split_feature_trait("_") # 702ns -> 846ns (17.0% slower)

def test_edge_single_char():
    # Single character input
    codeflash_output = _split_feature_trait("z") # 763ns -> 737ns (3.53% faster)

def test_edge_unicode_and_special_chars():
    # Unicode characters
    codeflash_output = _split_feature_trait("π_θ") # 998ns -> 1.28μs (21.9% slower)
    codeflash_output = _split_feature_trait("你好_世界") # 426ns -> 494ns (13.8% slower)
    # Special characters
    codeflash_output = _split_feature_trait("@_#") # 301ns -> 434ns (30.6% slower)
    codeflash_output = _split_feature_trait("!$%_^&*") # 286ns -> 355ns (19.4% slower)

def test_edge_numeric_strings():
    # Numeric strings
    codeflash_output = _split_feature_trait("123_456") # 579ns -> 818ns (29.2% slower)
    codeflash_output = _split_feature_trait("0_") # 352ns -> 432ns (18.5% slower)

def test_edge_long_first_part():
    # Very long first part, short second part
    codeflash_output = _split_feature_trait("a"*999 + "_b") # 1.21μs -> 1.34μs (9.85% slower)

def test_edge_long_second_part():
    # Short first part, very long second part
    codeflash_output = _split_feature_trait("a_" + "b"*999) # 841ns -> 1.14μs (26.0% slower)

def test_edge_input_is_underscore_only():
    # Input is just '_'
    codeflash_output = _split_feature_trait("_") # 635ns -> 839ns (24.3% slower)

# --- Large Scale Test Cases ---

def test_large_scale_long_string_no_underscore():
    # Large string with no underscores
    s = "a" * 1000
    codeflash_output = _split_feature_trait(s) # 995ns -> 793ns (25.5% faster)

def test_large_scale_long_string_with_underscore():
    # Large string with underscore at position 500
    s = "x" * 500 + "_" + "y" * 499
    codeflash_output = _split_feature_trait(s) # 924ns -> 1.21μs (23.3% slower)

def test_large_scale_multiple_underscores():
    # Large string with multiple underscores
    s = "foo_" + "_".join(str(i) for i in range(997))
    # Only the first underscore is used for splitting
    expected_first = "foo"
    expected_second = "_".join(str(i) for i in range(997))
    codeflash_output = _split_feature_trait(s) # 887ns -> 1.19μs (25.3% slower)

def test_large_scale_leading_underscores():
    # Large string with many leading underscores
    s = "_" * 999 + "end"
    codeflash_output = _split_feature_trait(s) # 888ns -> 1.18μs (24.8% slower)

def test_large_scale_trailing_underscores():
    # Large string ending with underscores
    s = "start" + "_" * 999
    codeflash_output = _split_feature_trait(s) # 825ns -> 1.08μs (23.5% slower)

def test_large_scale_all_underscores():
    # String of only underscores
    s = "_" * 1000
    # Should split into '' and '_'*999
    codeflash_output = _split_feature_trait(s) # 816ns -> 1.10μs (25.8% slower)

def test_large_scale_first_char_underscore():
    # First character is underscore, rest is long string
    s = "_" + "x" * 999
    codeflash_output = _split_feature_trait(s) # 788ns -> 1.08μs (27.0% slower)

def test_large_scale_second_char_underscore():
    # Second character is underscore, rest is long string
    s = "a_" + "x" * 998
    codeflash_output = _split_feature_trait(s) # 824ns -> 982ns (16.1% slower)

def test_large_scale_last_char_underscore():
    # Last character is underscore, should split at first
    s = "a" * 999 + "_"
    codeflash_output = _split_feature_trait(s) # 1.16μs -> 1.21μs (3.81% slower)

def test_large_scale_unicode():
    # Large string with unicode characters and underscore
    s = "你" * 500 + "_" + "好" * 499
    codeflash_output = _split_feature_trait(s) # 1.51μs -> 1.84μs (17.9% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from bokeh.plotting._renderer import _split_feature_trait

def test__split_feature_trait():
    _split_feature_trait('\x00')

def test__split_feature_trait_2():
    _split_feature_trait('_')
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_cthbg6_3/tmpi0l_qklb/test_concolic_coverage.py::test__split_feature_trait | 774ns | 751ns | 3.06%✅ |
| codeflash_concolic_cthbg6_3/tmpi0l_qklb/test_concolic_coverage.py::test__split_feature_trait_2 | 789ns | 931ns | -15.3%⚠️ |
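The percentages in this report appear to follow `original / optimized - 1`. That formula is inferred from the figures shown here and is not taken from Codeflash documentation; `speedup_pct` is a hypothetical helper used only to reproduce the arithmetic.

```python
def speedup_pct(original: float, optimized: float) -> float:
    """Signed speedup in percent: positive means faster, negative means slower."""
    return (original / optimized - 1.0) * 100.0


# Timings copied from the concolic coverage table (nanoseconds).
rows = {
    "test__split_feature_trait": (774, 751),
    "test__split_feature_trait_2": (789, 931),
}

if __name__ == "__main__":
    for name, (original, optimized) in rows.items():
        print(f"{name}: {speedup_pct(original, optimized):+.2f}%")
    # Overall runtime from the summary line (milliseconds, already rounded).
    print(f"overall: {speedup_pct(1.55, 1.31):+.1f}%")
```

With the rounded runtimes the overall value comes out near 18.3%, so the 19% in the summary line presumably reflects unrounded measurements.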

To edit these changes, run git checkout codeflash/optimize-_split_feature_trait-mhb63bm2, commit your changes, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 28, 2025 22:57
@codeflash-ai codeflash-ai bot added the labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: Medium (Optimization Quality according to Codeflash) Oct 28, 2025