@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 218% (2.18x) speedup for ScopedVisitor.visit_Global in marimo/_ast/visitor.py

⏱️ Runtime : 16.3 milliseconds → 5.13 milliseconds (best of 275 runs)
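The headline percentage appears consistent with measuring the time saved relative to the optimized runtime. The sketch below reproduces that arithmetic from the two rounded runtimes quoted above; the helper name speedup_percent is hypothetical and is not part of Codeflash or marimo.

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Time saved, expressed as a percentage of the optimized runtime."""
    return (original - optimized) / optimized * 100


# Rounded runtimes from the report, both in milliseconds.
original_ms = 16.3
optimized_ms = 5.13

overall = speedup_percent(original_ms, optimized_ms)
ratio = original_ms / optimized_ms  # how many times faster the new code runs

print(f"speedup: {overall:.0f}%")
print(f"runtime ratio: {ratio:.2f}x")
```

Under this reading, a 218% speedup means the optimized code takes a little under a third of the original time. The per-test percentages in the generated tests can differ by a point or so from this formula, presumably because the displayed runtimes are rounded.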

📝 Explanation and details

The optimized code achieves a 218% speedup through four key optimizations:

1. Replaced any() with in operator in Block.is_defined()

  • Original: return any(name == defn for defn in self.defs)
  • Optimized: return name in self.defs
  • This eliminates the generator expression overhead and leverages the highly optimized set membership test, reducing is_defined() runtime from 91.8ms to 1.1ms (~83x faster)

2. Used dict.setdefault() instead of conditional dictionary initialization

  • Original: Manual check with if name not in self._refs: self._refs[name] = []
  • Optimized: refs_name = self._refs.setdefault(name, [])
  • This reduces dictionary lookups and provides a direct reference to the list, eliminating repeated key lookups

3. Added early break in reference search loop

  • Added break after finding matching reference in _add_ref()
  • Prevents unnecessary iteration through remaining references once a match is found

4. Cached repeated attribute lookups in visit_Global()

  • Stored self.block_stack[-1] and self.block_stack[0] in local variables
  • Reduces repeated attribute access overhead in the critical loop
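Optimizations 2 and 3 can be illustrated with a minimal sketch. It does not reproduce marimo's actual _add_ref(); the Ref and RefTable classes, the block_id field, and both method names are hypothetical stand-ins chosen only to show the two patterns side by side.

```python
from dataclasses import dataclass, field


@dataclass
class Ref:
    """Hypothetical stand-in for a recorded reference (not marimo's type)."""
    block_id: int
    count: int = 0


@dataclass
class RefTable:
    """Toy container; the real ScopedVisitor keeps more state than this."""
    refs: dict[str, list[Ref]] = field(default_factory=dict)

    def add_ref_original(self, name: str, block_id: int) -> None:
        # Conditional initialisation: one lookup for the membership test,
        # one for the assignment, and one more for each self.refs[name] below.
        if name not in self.refs:
            self.refs[name] = []
        found = False
        for ref in self.refs[name]:
            if ref.block_id == block_id:
                ref.count += 1
                found = True  # no break, so the loop keeps scanning
        if not found:
            self.refs[name].append(Ref(block_id, 1))

    def add_ref_optimized(self, name: str, block_id: int) -> None:
        # setdefault returns the existing list, or inserts and returns a new one.
        refs_name = self.refs.setdefault(name, [])
        for ref in refs_name:
            if ref.block_id == block_id:
                ref.count += 1
                break  # this toy keeps at most one entry per block_id
        else:
            # The loop finished without a break, so no entry matched.
            refs_name.append(Ref(block_id, 1))


a, b = RefTable(), RefTable()
for name, block_id in [("x", 0), ("x", 1), ("x", 0), ("y", 2)]:
    a.add_ref_original(name, block_id)
    b.add_ref_optimized(name, block_id)
```

The early break is only a safe substitution when at most one entry can match, which this toy guarantees by appending only when nothing matched. Whether the same invariant holds in the real method is something a reviewer would need to confirm against the diff.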

The optimizations are particularly effective for:

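  • Large-scale scenarios: Tests with 500 to 1,000 names show roughly 26-31% improvements
  • Cases with many already-defined globals: One test showed 1744% speedup when half of its 1,000 names were pre-defined, and another showed 603% when all 100 names were pre-defined, as the faster is_defined() check dramatically reduces overhead
  • Regular usage patterns: Basic cases with 1-4 names mostly show 15-25% improvements; the only regressions are sub-microsecond differences on the empty-names and concolic tests, which run about 9-20% slower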

The is_defined() optimization provides the largest impact since it's called for every global name to check if it's already defined in the global scope.
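The difference between the two is_defined() forms can be tried in isolation. The sketch below is a toy, not marimo's Block: it models only a defs set, which is what the explanation and the generated tests suggest the real attribute is, and the two method names are hypothetical.

```python
import timeit
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy stand-in for marimo's Block; only the defs set is modelled."""
    defs: set[str] = field(default_factory=set)

    def is_defined_original(self, name: str) -> bool:
        # Linear scan: compares name against each element in turn.
        return any(name == defn for defn in self.defs)

    def is_defined_optimized(self, name: str) -> bool:
        # Hash lookup: average constant time for a set.
        return name in self.defs


block = Block(defs={f"v{i}" for i in range(500)})
probes = ["v0", "v499", "missing"]
results = [
    (block.is_defined_original(p), block.is_defined_optimized(p))
    for p in probes
]

# Rough timing of the worst case for the scan (a name that is absent).
slow = timeit.timeit(lambda: block.is_defined_original("missing"), number=2_000)
fast = timeit.timeit(lambda: block.is_defined_optimized("missing"), number=2_000)
print(f"any(): {slow:.4f}s  in: {fast:.4f}s")
```

For string names the two forms return the same answer. The gap grows with the size of defs, which would fit the pattern in the tests where pre-populating the global scope produces the largest speedups.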

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 87 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 3 Passed |
| 📊 Tests Coverage | 66.7% |
🌀 Generated Regression Tests and Runtime
import ast
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Literal, Optional, Union
from uuid import uuid4

# imports
import pytest
# Block is used by test_edge_global_in_non_global_block below
from marimo._ast.visitor import Block, ScopedVisitor

# --- End: function to test and dependencies ---

# --- Begin: Unit Tests ---

# Helper to create an ast.Global node
def make_global(names):
    return ast.Global(names=names)

# Basic Test Cases

def test_basic_single_global_name():
    # Test with a single global name
    visitor = ScopedVisitor()
    node = make_global(["foo"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 6.86μs -> 6.02μs (14.0% faster)

def test_basic_multiple_global_names():
    # Multiple names
    visitor = ScopedVisitor()
    node = make_global(["a", "b", "c"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 9.04μs -> 7.67μs (17.8% faster)
    for name in ["a", "b", "c"]:
        pass

def test_basic_global_already_defined_in_global_scope():
    # If name is already defined in global scope, no ref added
    visitor = ScopedVisitor()
    visitor.block_stack[0].defs.add("foo")
    node = make_global(["foo"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 3.30μs -> 2.54μs (29.6% faster)

def test_basic_global_with_local_names():
    # Local names should be mangled
    visitor = ScopedVisitor(mangle_prefix="prefix_")
    node = make_global(["_bar", "__", "baz"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 9.73μs -> 8.37μs (16.2% faster)
    # _bar and __ should be mangled, baz should not
    mangled_bar = "_prefix__bar"
    mangled_dunder = "_prefix___"

def test_basic_global_ignore_local_flag():
    # ignore_local disables mangling
    visitor = ScopedVisitor(mangle_prefix="prefix_", ignore_local=True)
    node = make_global(["_bar", "__", "baz"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 7.91μs -> 6.85μs (15.5% faster)
    for name in ["_bar", "__", "baz"]:
        pass

# Edge Test Cases

def test_edge_empty_names_list():
    # Empty names should not fail
    visitor = ScopedVisitor()
    node = make_global([])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 1.04μs -> 1.27μs (17.8% slower)

def test_edge_duplicate_names():
    # Duplicates in names should be handled gracefully
    visitor = ScopedVisitor()
    node = make_global(["foo", "foo", "bar"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 10.00μs -> 8.71μs (14.9% faster)

def test_edge_global_in_non_global_block():
    # Simulate visiting global in a nested block
    visitor = ScopedVisitor()
    visitor.block_stack.append(Block())  # simulate function scope
    node = make_global(["foo"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output

def test_edge_global_with_only_local_names():
    # Only local names, all should be mangled
    visitor = ScopedVisitor(mangle_prefix="prefix_")
    node = make_global(["_a", "__"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 9.94μs -> 8.58μs (15.8% faster)

def test_edge_global_with_nonstring_names():
    # Should only accept string names; ast.Global expects list of str
    visitor = ScopedVisitor()
    node = make_global([123, None])
    with pytest.raises(Exception):
        visitor.visit_Global(node) # 3.09μs -> 2.94μs (4.99% faster)

def test_edge_global_with_reserved_python_names():
    # Reserved names like 'True', 'False', 'None'
    visitor = ScopedVisitor()
    node = make_global(["True", "False", "None"])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 9.45μs -> 8.10μs (16.7% faster)
    for name in ["True", "False", "None"]:
        pass

def test_edge_global_with_long_names():
    # Very long variable names
    long_name = "a" * 512
    visitor = ScopedVisitor()
    node = make_global([long_name])
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 5.55μs -> 5.04μs (10.3% faster)

# Large Scale Test Cases

def test_large_many_names():
    # Stress test with many names
    N = 500
    names = [f"var_{i}" for i in range(N)]
    visitor = ScopedVisitor()
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 577μs -> 446μs (29.5% faster)
    for name in names:
        pass

def test_large_many_local_names():
    # Many local names, all should be mangled
    N = 500
    names = [f"_local_{i}" for i in range(N)]
    visitor = ScopedVisitor(mangle_prefix="prefix_")
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 627μs -> 495μs (26.6% faster)
    mangled_names = {f"_prefix__local_{i}" for i in range(N)}
    for name in mangled_names:
        pass

def test_large_mixed_names():
    # Mix of local and global names
    N = 250
    names = [f"var_{i}" for i in range(N)] + [f"_local_{i}" for i in range(N)]
    visitor = ScopedVisitor(mangle_prefix="prefix_")
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 600μs -> 470μs (27.6% faster)
    expected = set([f"var_{i}" for i in range(N)] + [f"_prefix__local_{i}" for i in range(N)])
    for name in expected:
        pass

def test_large_duplicate_names():
    # Many duplicate names
    N = 100
    names = ["foo"] * N + ["bar"] * N
    visitor = ScopedVisitor()
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 184μs -> 143μs (28.7% faster)

def test_large_names_already_defined():
    # All names already defined in global scope, should not add refs
    N = 100
    names = [f"var_{i}" for i in range(N)]
    visitor = ScopedVisitor()
    visitor.block_stack[0].defs.update(names)
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); returned = codeflash_output # 181μs -> 25.9μs (603% faster)
    for name in names:
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import ast

# imports
import pytest
# Block is used by test_edge_defined_in_inner_block below
from marimo._ast.visitor import Block, ScopedVisitor

# --- Unit Tests ---

# Helper to create ast.Global nodes
def make_global(names):
    return ast.Global(names=names)

# ----------- Basic Test Cases -----------

def test_basic_single_global():
    # Test single global variable
    visitor = ScopedVisitor()
    node = make_global(["x"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 5.62μs -> 4.78μs (17.5% faster)

def test_basic_multiple_globals():
    # Test multiple global variables
    visitor = ScopedVisitor()
    node = make_global(["x", "y", "z"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 8.39μs -> 7.10μs (18.1% faster)
    for name in ["x", "y", "z"]:
        pass

def test_basic_global_already_defined():
    # If name is already defined at global scope, no ref should be added
    visitor = ScopedVisitor()
    visitor.block_stack[0].defs.add("x")
    node = make_global(["x", "y"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 6.29μs -> 5.10μs (23.4% faster)

def test_basic_ignore_local_flag():
    # If ignore_local is True, no mangling should occur
    visitor = ScopedVisitor(ignore_local=True)
    node = make_global(["_foo", "__", "bar"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 8.06μs -> 6.86μs (17.6% faster)

# ----------- Edge Test Cases -----------

def test_edge_empty_names():
    # Global with no names should not add anything
    visitor = ScopedVisitor()
    node = make_global([])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 1.05μs -> 1.31μs (19.7% slower)

def test_edge_all_local_mangling():
    # All names are local, should be mangled
    visitor = ScopedVisitor(mangle_prefix="MANGLED")
    node = make_global(["_foo", "__", "_bar"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 9.85μs -> 8.47μs (16.3% faster)
    # Mangled names should be in global_names and result.names
    for name in ["_foo", "__", "_bar"]:
        mangled = f"_MANGLED{name}"

def test_edge_some_local_some_global():
    # Mix of local and global names
    visitor = ScopedVisitor(mangle_prefix="EDGE")
    node = make_global(["x", "_y", "__", "z"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 10.9μs -> 8.71μs (24.7% faster)

def test_edge_duplicate_names():
    # Duplicate names in global statement
    visitor = ScopedVisitor()
    node = make_global(["x", "x", "y"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 9.19μs -> 8.16μs (12.6% faster)

def test_edge_defined_in_inner_block():
    # If name is defined in inner block, but not global, should still add ref
    visitor = ScopedVisitor()
    visitor.block_stack.append(Block())  # simulate inner block
    node = make_global(["inner"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output

def test_edge_mangle_prefix_with_hyphens():
    # mangle_prefix with hyphens should be used as is
    visitor = ScopedVisitor(mangle_prefix="hy-phen")
    node = make_global(["_foo"])
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 7.63μs -> 6.85μs (11.5% faster)

# ----------- Large Scale Test Cases -----------

def test_large_many_globals():
    # Test with 1000 global names
    names = [f"var{i}" for i in range(1000)]
    visitor = ScopedVisitor()
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 1.17ms -> 893μs (31.0% faster)
    for name in names:
        pass

def test_large_many_local_globals():
    # Test with 1000 local names, should all be mangled
    names = [f"_local{i}" for i in range(1000)]
    visitor = ScopedVisitor(mangle_prefix="LARGE")
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 1.29ms -> 1.02ms (25.7% faster)
    for name in names:
        mangled = f"_LARGE{name}"

def test_large_some_defined_some_not():
    # Half the names are already defined at global scope
    names = [f"v{i}" for i in range(1000)]
    visitor = ScopedVisitor()
    for name in names[:500]:
        visitor.block_stack[0].defs.add(name)
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 10.3ms -> 559μs (1744% faster)
    for name in names:
        pass
    # Only names not defined at global scope should be in _refs
    for name in names[:500]:
        pass
    for name in names[500:]:
        pass

def test_large_mixed_local_and_global():
    # Mix local and non-local names
    names = [f"_l{i}" if i % 2 == 0 else f"g{i}" for i in range(1000)]
    visitor = ScopedVisitor(mangle_prefix="MIXED")
    node = make_global(names)
    codeflash_output = visitor.visit_Global(node); result = codeflash_output # 1.20ms -> 942μs (27.7% faster)
    for i, name in enumerate(names):
        if i % 2 == 0:
            mangled = f"_MIXED{name}"
        else:
            pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from ast import Global
from marimo._ast.visitor import ScopedVisitor
import pytest

def test_ScopedVisitor_visit_Global():
    with pytest.raises(AttributeError, match="'Global'\\ object\\ has\\ no\\ attribute\\ 'names'"):
        ScopedVisitor.visit_Global(ScopedVisitor(), Global())
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_o_lbxivc/tmpvun_gdqb/test_concolic_coverage.py::test_ScopedVisitor_visit_Global | 1.47μs | 1.61μs | -9.11%⚠️ |

To edit these changes, run git checkout codeflash/optimize-ScopedVisitor.visit_Global-mhcya60w, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 04:54
@codeflash-ai codeflash-ai bot added the labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) Oct 30, 2025