@codeflash-ai codeflash-ai bot commented Oct 28, 2025

📄 17% (0.17x) speedup for _has_script_tag_without_src in marimo/_output/formatters/iframe.py

⏱️ Runtime: 46.9 milliseconds → 40.2 milliseconds (best of 82 runs)

📝 Explanation and details

The optimization replaces the string containment check "<script" not in html_content with html_content.find("<script") and then passes only the substring starting from the first <script> tag to the HTML parser instead of the entire HTML content.

Key changes:

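  1. Single lookup for the early exit: find() returns the index of the first "<script" (-1 if not found), so one call serves as both the early-exit check and the slice offset. It is not faster than the in operator on its own: the no-script test cases below run roughly 20-37% slower (e.g. 422ns -> 672ns)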
  2. Substring parsing: Instead of parsing the entire HTML document, only parse from the first <script> tag onward using parser.feed(html_content[idx:])
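
The two changes above can be sketched as follows. This is a minimal reconstruction from the description, not the actual marimo source: the function names `has_inline_script_original` and `has_inline_script_optimized` are hypothetical, and `ScriptTagParser` mirrors the parser quoted in the generated tests further down.

```python
from html.parser import HTMLParser


class ScriptTagParser(HTMLParser):
    """Flags the first <script> start tag that has no src attribute."""

    def __init__(self):
        super().__init__()
        self.has_script_without_src = False

    def handle_starttag(self, tag, attrs):
        if tag.lower() == "script" and not any(
            name.lower() == "src" for name, _ in attrs
        ):
            self.has_script_without_src = True
            # Abort parsing early; the callers below catch this.
            raise StopIteration


def has_inline_script_original(html_content: str) -> bool:
    # Hypothetical "before" version: containment check, then parse everything.
    if "<script" not in html_content:
        return False
    parser = ScriptTagParser()
    try:
        parser.feed(html_content)
    except StopIteration:
        pass
    return parser.has_script_without_src


def has_inline_script_optimized(html_content: str) -> bool:
    # Hypothetical "after" version: find() yields both the early-exit signal
    # and the offset from which parsing starts.
    idx = html_content.find("<script")
    if idx == -1:
        return False
    parser = ScriptTagParser()
    try:
        parser.feed(html_content[idx:])
    except StopIteration:
        pass
    return parser.has_script_without_src
```

Both versions should agree on every input; the second only skips the text that precedes the first "<script" match.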

Why this is faster:

  • Reduced parser workload: The HTML parser (ScriptTagParser) no longer needs to process potentially large amounts of HTML content before the first <script> tag. This is especially beneficial for large documents where <script> tags appear later in the HTML.
  • Early termination advantage: Since the parser raises StopIteration when it finds a script tag without src, parsing only the relevant portion means less overall work.
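
To make the reduced parser workload concrete, the sketch below counts how many start tags the parser handles before it stops, with and without slicing. `CountingParser` and `count_start_tags` are hypothetical helpers written for this illustration only; the input mirrors the "lots of noise" test case from the generated tests.

```python
from html.parser import HTMLParser


class CountingParser(HTMLParser):
    """Counts start tags and stops at the first <script> without src."""

    def __init__(self):
        super().__init__()
        self.start_tags_seen = 0

    def handle_starttag(self, tag, attrs):
        self.start_tags_seen += 1
        if tag == "script" and not any(name == "src" for name, _ in attrs):
            raise StopIteration


def count_start_tags(html: str) -> int:
    parser = CountingParser()
    try:
        parser.feed(html)
    except StopIteration:
        pass
    return parser.start_tags_seen


# 500 <div> elements followed by one inline script.
html = "".join(f'<div id="div{i}"></div>' for i in range(500)) + "<script>var x=1;</script>"

full = count_start_tags(html)                            # parses all preceding markup
sliced = count_start_tags(html[html.find("<script"):])   # starts at the script tag
print(full, sliced)
```

With the full document the parser handles every `<div>` before reaching the script; with the slice it handles only the script tag itself.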

Test case performance patterns:

  • Dramatic speedups for large HTML documents with script tags in the middle or end (up to 13000% faster in some cases)
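  • Slowdowns on very small inputs from the additional find() and slicing overhead: about 1-5% for most short snippets, and up to about 37% (a few hundred nanoseconds in absolute terms) when there is no <script> tag at all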
  • Consistent improvements for medium to large documents, especially when script tags appear after other HTML content

The optimization is particularly effective when the HTML content has substantial non-script content before the first <script> tag, making it a worthwhile trade-off despite minor overhead on tiny inputs.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 8 Passed |
| 🌀 Generated Regression Tests | 115 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 3 Passed |
| 📊 Tests Coverage | 80.0% |
⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| _output/formatters/test_iframe.py::test_has_script_tag_without_src_inline | 11.7μs | 11.3μs | 3.90%✅ |
| _output/formatters/test_iframe.py::test_has_script_tag_without_src_multiple | 34.5μs | 34.0μs | 1.25%✅ |
| _output/formatters/test_iframe.py::test_has_script_tag_without_src_no_script | 422ns | 672ns | -37.2%⚠️ |
| _output/formatters/test_iframe.py::test_has_script_tag_without_src_with_attributes | 17.1μs | 18.0μs | -5.29%⚠️ |
| _output/formatters/test_iframe.py::test_has_script_tag_without_src_with_src | 23.8μs | 24.2μs | -1.69%⚠️ |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

from html.parser import HTMLParser

# imports
import pytest  # used for our unit tests
from marimo._output.formatters.iframe import _has_script_tag_without_src

# function to test
# Copyright 2025 Marimo. All rights reserved.


class ScriptTagParser(HTMLParser):
    """
    HTML parser to detect <script> tags without a src attribute.
    """
    def __init__(self):
        super().__init__()
        self.has_script_without_src = False

    def handle_starttag(self, tag, attrs):
        if tag.lower() == "script":
            # If 'src' is not in the attributes, mark as found
            if not any(attr[0].lower() == "src" for attr in attrs):
                self.has_script_without_src = True
                # Stop parsing further for efficiency
                raise StopIteration

# unit tests

# ------------------------
# Basic Test Cases
# ------------------------

def test_no_script_tag_returns_false():
    # No <script> tag at all
    codeflash_output = _has_script_tag_without_src("<div>Hello World</div>") # 420ns -> 635ns (33.9% slower)

def test_script_tag_with_src_returns_false():
    # <script> tag with src attribute
    html = '<script src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 28.5μs -> 29.2μs (2.33% slower)

def test_script_tag_without_src_returns_true():
    # <script> tag without src attribute
    html = '<script>console.log("hi")</script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.3μs -> 12.4μs (0.413% slower)

def test_multiple_script_tags_all_with_src_returns_false():
    # Multiple <script> tags, all with src
    html = '<script src="a.js"></script><script src="b.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 37.2μs -> 37.2μs (0.121% faster)

def test_multiple_script_tags_one_without_src_returns_true():
    # Multiple <script> tags, one without src
    html = '<script src="a.js"></script><script>var x=1;</script>'
    codeflash_output = _has_script_tag_without_src(html) # 27.6μs -> 27.1μs (1.96% faster)

# ------------------------
# Edge Test Cases
# ------------------------

def test_script_tag_with_src_and_other_attributes_returns_false():
    # <script> tag with src and other attributes
    html = '<script src="main.js" type="text/javascript"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 25.5μs -> 25.4μs (0.287% faster)

def test_script_tag_with_other_attributes_but_no_src_returns_true():
    # <script> tag with other attributes but no src
    html = '<script type="text/javascript"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 15.5μs -> 15.0μs (3.25% faster)

def test_script_tag_with_empty_src_returns_false():
    # <script> tag with empty src attribute
    html = '<script src=""></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.1μs -> 23.3μs (0.730% slower)

def test_script_tag_with_src_in_different_case_returns_false():
    # <script> tag with SRC in uppercase
    html = '<script SRC="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 24.1μs -> 23.6μs (1.89% faster)

def test_script_tag_with_mixed_case_tag_returns_true():
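    # <ScRiPt> tag without src attribute.
    # Note: the ~450ns timing matches the no-<script> early-exit cases, which suggests
    # the case-sensitive "<script" check does not match this mixed-case tag.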
    html = '<ScRiPt>console.log(1)</ScRiPt>'
    codeflash_output = _has_script_tag_without_src(html) # 451ns -> 675ns (33.2% slower)

def test_script_tag_with_src_and_spaces_returns_false():
    # <script> tag with src and extra spaces
    html = '<script   src = "main.js" ></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.5μs -> 26.0μs (1.87% faster)

def test_script_tag_with_src_and_single_quotes_returns_false():
    # <script> tag with src in single quotes
    html = "<script src='main.js'></script>"
    codeflash_output = _has_script_tag_without_src(html) # 24.7μs -> 23.9μs (3.30% faster)

def test_script_tag_with_no_attributes_returns_true():
    # <script> tag with no attributes and no content
    html = '<script></script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.1μs -> 12.0μs (0.701% faster)

def test_script_tag_with_comment_inside_returns_true():
    # <script> tag with comment inside
    html = '<script><!-- comment --></script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.8μs -> 11.6μs (2.05% faster)

def test_script_tag_with_src_and_comment_inside_returns_false():
    # <script> tag with src and comment inside
    html = '<script src="main.js"><!-- comment --></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.1μs -> 26.2μs (0.282% slower)

def test_script_tag_with_src_as_substring_of_other_attribute_returns_true():
    # <script> tag with attribute containing 'src' as substring, but not 'src'
    html = '<script srcx="foo"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 16.1μs -> 15.9μs (1.15% faster)

def test_script_tag_with_src_and_other_attribute_order_returns_false():
    # <script> tag with src attribute not first
    html = '<script type="text/javascript" src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 27.3μs -> 27.5μs (0.894% slower)

def test_script_tag_with_src_and_boolean_attribute_returns_false():
    # <script> tag with src and boolean attribute
    html = '<script src="main.js" async></script>'
    codeflash_output = _has_script_tag_without_src(html) # 25.4μs -> 25.6μs (1.11% slower)

def test_script_tag_with_src_and_uppercase_returns_false():
    # <script> tag with uppercase SRC attribute
    html = '<script SRC="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.6μs -> 23.8μs (0.753% slower)

def test_script_tag_with_src_and_mixed_case_returns_false():
    # <script> tag with mixed case src attribute
    html = '<script sRc="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.5μs -> 22.9μs (2.60% faster)

def test_script_tag_with_src_and_no_closing_tag_returns_false():
    # <script> tag with src and no closing tag
    html = '<script src="main.js">'
    codeflash_output = _has_script_tag_without_src(html) # 19.0μs -> 18.4μs (3.28% faster)

def test_script_tag_without_src_and_no_closing_tag_returns_true():
    # <script> tag without src and no closing tag
    html = '<script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.7μs -> 12.1μs (3.18% slower)

def test_script_tag_with_src_and_malformed_html_returns_false():
    # Malformed HTML with src attribute
    html = '<script src="main.js"'
    codeflash_output = _has_script_tag_without_src(html) # 9.17μs -> 9.46μs (3.00% slower)

def test_script_tag_without_src_and_malformed_html_returns_true():
    # Malformed HTML without src attribute
    html = '<script'
    codeflash_output = _has_script_tag_without_src(html) # 7.87μs -> 8.13μs (3.26% slower)

def test_script_tag_with_src_and_self_closing_returns_false():
    # Self-closing <script> tag with src
    html = '<script src="main.js"/>'
    codeflash_output = _has_script_tag_without_src(html) # 18.4μs -> 19.0μs (3.25% slower)

def test_script_tag_without_src_and_self_closing_returns_true():
    # Self-closing <script> tag without src
    html = '<script/>'
    codeflash_output = _has_script_tag_without_src(html) # 12.5μs -> 12.7μs (1.06% slower)

def test_script_tag_with_src_and_spaces_in_tag_returns_false():
    # <script> tag with spaces and src
    html = '<script   src="main.js"    ></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.5μs -> 26.4μs (0.375% faster)

def test_script_tag_with_src_and_newline_returns_false():
    # <script> tag with src and newline
    html = '<script\nsrc="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 25.3μs -> 25.1μs (0.673% faster)

def test_script_tag_without_src_and_newline_returns_true():
    # <script> tag without src and newline
    html = '<script\n></script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.0μs -> 11.9μs (0.834% faster)

def test_script_tag_with_src_and_tab_returns_false():
    # <script> tag with src and tab
    html = '<script\tsrc="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 24.9μs -> 24.9μs (0.173% slower)

def test_script_tag_without_src_and_tab_returns_true():
    # <script> tag without src and tab
    html = '<script\t></script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.6μs -> 11.8μs (1.26% slower)

def test_script_tag_with_src_and_extra_attributes_returns_false():
    # <script> tag with src and extra attributes
    html = '<script src="main.js" type="module" async></script>'
    codeflash_output = _has_script_tag_without_src(html) # 28.9μs -> 28.5μs (1.57% faster)

def test_script_tag_with_src_and_false_value_returns_false():
    # <script> tag with src set to 'false'
    html = '<script src="false"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.8μs -> 23.7μs (0.435% faster)

def test_script_tag_with_src_and_boolean_false_returns_false():
    # <script> tag with src set to boolean false
    html = '<script src=false></script>'
    codeflash_output = _has_script_tag_without_src(html) # 24.3μs -> 23.3μs (4.02% faster)

def test_script_tag_with_src_and_no_value_returns_false():
    # <script> tag with src attribute but no value
    html = '<script src></script>'
    codeflash_output = _has_script_tag_without_src(html) # 21.8μs -> 21.8μs (0.105% faster)

def test_script_tag_with_src_and_spaces_in_attribute_returns_false():
    # <script> tag with src and spaces in attribute
    html = '<script src = "main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.5μs -> 23.7μs (0.865% slower)

def test_script_tag_with_src_and_double_quotes_returns_false():
    # <script> tag with src in double quotes
    html = '<script src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.3μs -> 23.6μs (1.11% slower)

def test_script_tag_with_src_and_mixed_quotes_returns_false():
    # <script> tag with src in mixed quotes
    html = '<script src="main.js\'></script>'
    codeflash_output = _has_script_tag_without_src(html) # 9.19μs -> 9.23μs (0.422% slower)

def test_script_tag_with_src_and_missing_closing_angle_returns_false():
    # <script> tag with src and missing closing angle
    html = '<script src="main.js"'
    codeflash_output = _has_script_tag_without_src(html) # 8.96μs -> 9.23μs (2.92% slower)

def test_script_tag_with_src_and_extra_angle_returns_false():
    # <script> tag with src and extra angle bracket
    html = '<script src="main.js">></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.1μs -> 26.3μs (0.579% slower)

def test_script_tag_with_src_and_comment_in_tag_returns_false():
    # <script> tag with src and comment in tag
    html = '<script src="main.js" <!-- comment --> ></script>'
    codeflash_output = _has_script_tag_without_src(html) # 28.3μs -> 29.1μs (2.74% slower)

def test_script_tag_without_src_and_comment_in_tag_returns_true():
    # <script> tag without src and comment in tag
    html = '<script <!-- comment --> ></script>'
    codeflash_output = _has_script_tag_without_src(html) # 17.2μs -> 17.2μs (0.273% faster)

def test_script_tag_with_src_and_script_in_attribute_returns_false():
    # <script> tag with src containing the word 'script'
    html = '<script src="script.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 24.9μs -> 24.4μs (1.72% faster)

def test_script_tag_with_src_and_script_in_other_attribute_returns_false():
    # <script> tag with other attribute containing the word 'script'
    html = '<script src="main.js" data-script="true"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.7μs -> 26.5μs (0.702% faster)

def test_script_tag_with_src_and_script_in_content_returns_false():
    # <script> tag with src and script content
    html = '<script src="main.js">var script = true;</script>'
    codeflash_output = _has_script_tag_without_src(html) # 24.0μs -> 24.1μs (0.170% slower)

def test_script_tag_with_nested_script_tags_returns_true():
    # Nested <script> tags, one without src
    html = '<script><script>console.log(1)</script></script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.9μs -> 11.6μs (2.32% faster)

def test_script_tag_with_script_in_text_returns_false():
    # <script> tag with src and 'script' in text
    html = '<script src="main.js">script</script>'
    codeflash_output = _has_script_tag_without_src(html) # 25.1μs -> 25.7μs (2.27% slower)

def test_script_tag_with_src_and_script_in_comment_returns_false():
    # <script> tag with src and 'script' in comment
    html = '<script src="main.js"><!-- script --></script>'
    codeflash_output = _has_script_tag_without_src(html) # 24.6μs -> 23.9μs (3.05% faster)

# ------------------------
# Large Scale Test Cases
# ------------------------

def test_large_html_with_many_script_tags_all_with_src_returns_false():
    # Large HTML with many <script> tags, all with src
    html = "".join(f'<script src="file{i}.js"></script>' for i in range(500))
    codeflash_output = _has_script_tag_without_src(html) # 2.78ms -> 2.73ms (1.78% faster)

def test_large_html_with_many_script_tags_one_without_src_returns_true():
    # Large HTML with many <script> tags, one without src in the middle
    html = (
        "".join(f'<script src="file{i}.js"></script>' for i in range(250))
        + '<script>console.log("middle")</script>'
        + "".join(f'<script src="file{i}.js"></script>' for i in range(250, 500))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.40ms -> 1.40ms (0.072% faster)

def test_large_html_with_script_tag_without_src_at_start_returns_true():
    # Large HTML with <script> tag without src at the start
    html = (
        '<script>console.log("start")</script>'
        + "".join(f'<script src="file{i}.js"></script>' for i in range(999))
    )
    codeflash_output = _has_script_tag_without_src(html) # 12.8μs -> 13.0μs (1.71% slower)

def test_large_html_with_script_tag_without_src_at_end_returns_true():
    # Large HTML with <script> tag without src at the end
    html = (
        "".join(f'<script src="file{i}.js"></script>' for i in range(999))
        + '<script>console.log("end")</script>'
    )
    codeflash_output = _has_script_tag_without_src(html) # 5.57ms -> 5.62ms (0.940% slower)

def test_large_html_with_no_script_tags_returns_false():
    # Large HTML with no <script> tags
    html = "".join(f'<div id="div{i}"></div>' for i in range(1000))
    codeflash_output = _has_script_tag_without_src(html) # 3.28μs -> 3.44μs (4.51% slower)

def test_large_html_with_script_tags_and_noise_returns_true():
    # Large HTML with script tags and lots of noise
    html = (
        "".join(f'<div id="div{i}"></div>' for i in range(500))
        + '<script>var x=1;</script>'
        + "".join(f'<span id="span{i}"></span>' for i in range(500))
    )
    codeflash_output = _has_script_tag_without_src(html) # 2.09ms -> 15.9μs (13049% faster)

def test_large_html_with_script_tags_and_noise_returns_false():
    # Large HTML with script tags with src and lots of noise
    html = (
        "".join(f'<div id="div{i}"></div>' for i in range(500))
        + '<script src="main.js"></script>'
        + "".join(f'<span id="span{i}"></span>' for i in range(500))
    )
    codeflash_output = _has_script_tag_without_src(html) # 4.35ms -> 2.18ms (99.4% faster)

def test_large_html_with_script_tags_all_without_src_returns_true():
    # Large HTML with all <script> tags without src
    html = "".join(f'<script>console.log({i})</script>' for i in range(1000))
    codeflash_output = _has_script_tag_without_src(html) # 12.7μs -> 13.0μs (1.97% slower)

def test_large_html_with_script_tags_some_malformed_returns_true():
    # Large HTML with some malformed <script> tags without src
    html = (
        "".join(f'<script>console.log({i})</script>' for i in range(500))
        + "".join(f'<script src="file{i}.js"></script>' for i in range(499))
        + '<script'
    )
    codeflash_output = _has_script_tag_without_src(html) # 12.7μs -> 12.8μs (0.671% slower)

def test_large_html_with_script_tags_all_malformed_with_src_returns_false():
    # Large HTML with all malformed <script> tags with src
    html = "".join(f'<script src="file{i}.js"' for i in range(1000))
    codeflash_output = _has_script_tag_without_src(html) # 395μs -> 438μs (9.92% slower)

def test_large_html_with_script_tags_and_attributes_returns_true():
    # Large HTML with <script> tags with various attributes, one without src
    html = (
        "".join(f'<script src="file{i}.js" type="module"></script>' for i in range(500))
        + '<script type="text/javascript"></script>'
        + "".join(f'<script src="file{i}.js" async></script>' for i in range(499))
    )
    codeflash_output = _has_script_tag_without_src(html) # 3.33ms -> 3.29ms (1.16% faster)

def test_large_html_with_script_tags_and_attributes_returns_false():
    # Large HTML with <script> tags with various attributes, all with src
    html = "".join(f'<script src="file{i}.js" type="module"></script>' for i in range(1000))
    codeflash_output = _has_script_tag_without_src(html) # 6.56ms -> 6.61ms (0.678% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from __future__ import annotations

from html.parser import HTMLParser

# imports
import pytest  # used for our unit tests
from marimo._output.formatters.iframe import _has_script_tag_without_src

# function to test
# Copyright 2025 Marimo. All rights reserved.


class ScriptTagParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.has_script_without_src = False

    def handle_starttag(self, tag, attrs):
        if tag.lower() == "script":
            # Check if 'src' attribute is present
            if not any(attr[0].lower() == "src" for attr in attrs):
                self.has_script_without_src = True
                # Stop parsing further for efficiency
                raise StopIteration

# unit tests

# ---------- Basic Test Cases ----------

def test_no_script_tag_returns_false():
    # No <script> tag present
    html = "<html><body><h1>Hello World</h1></body></html>"
    codeflash_output = _has_script_tag_without_src(html) # 542ns -> 835ns (35.1% slower)

def test_script_tag_with_src_returns_false():
    # <script> tag with src attribute
    html = '<script src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 27.3μs -> 28.5μs (4.08% slower)

def test_script_tag_without_src_returns_true():
    # <script> tag without src attribute
    html = '<script>alert("hi");</script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.3μs -> 12.4μs (0.785% slower)

def test_multiple_script_tags_one_without_src_returns_true():
    # Multiple <script> tags, one without src
    html = '''
    <script src="a.js"></script>
    <script>alert("hi");</script>
    <script src="b.js"></script>
    '''
    codeflash_output = _has_script_tag_without_src(html) # 32.6μs -> 30.6μs (6.48% faster)

def test_multiple_script_tags_all_with_src_returns_false():
    # Multiple <script> tags, all with src
    html = '''
    <script src="a.js"></script>
    <script src="b.js"></script>
    '''
    codeflash_output = _has_script_tag_without_src(html) # 38.4μs -> 37.6μs (2.18% faster)

# ---------- Edge Test Cases ----------

def test_script_tag_with_src_and_other_attributes_returns_false():
    # <script> tag with src and other attributes
    html = '<script src="main.js" type="text/javascript"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 25.4μs -> 25.8μs (1.39% slower)

def test_script_tag_with_uppercase_src_returns_false():
    # <script> tag with uppercase SRC attribute
    html = '<script SRC="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.3μs -> 23.1μs (0.819% faster)

def test_script_tag_with_mixed_case_src_returns_false():
    # <script> tag with mixed case sRc attribute
    html = '<script sRc="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.2μs -> 22.4μs (3.80% faster)

def test_script_tag_with_empty_src_returns_false():
    # <script> tag with empty src attribute
    html = '<script src=""></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.0μs -> 23.4μs (1.79% slower)

def test_script_tag_with_other_attributes_returns_true():
    # <script> tag with type attribute, but no src
    html = '<script type="application/javascript"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 15.7μs -> 15.8μs (0.564% slower)

def test_script_tag_with_src_and_other_script_without_src_returns_true():
    # One <script> with src, one without
    html = '''
    <script src="a.js"></script>
    <script type="text/javascript"></script>
    '''
    codeflash_output = _has_script_tag_without_src(html) # 34.0μs -> 33.2μs (2.42% faster)

def test_script_tag_with_src_in_middle_of_attributes_returns_false():
    # <script> tag with src not first attribute
    html = '<script type="text/javascript" src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.2μs -> 25.4μs (3.29% faster)

def test_script_tag_with_no_attributes_and_no_content_returns_true():
    # <script> tag with no attributes and no content
    html = '<script></script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.7μs -> 11.5μs (1.95% faster)

def test_script_tag_with_whitespace_only_returns_true():
    # <script> tag with whitespace only, no src
    html = '<script>   </script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.5μs -> 11.9μs (3.37% slower)

def test_script_tag_with_comment_inside_returns_true():
    # <script> tag with comment inside, no src
    html = '<script><!-- comment --></script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.5μs -> 11.8μs (3.22% slower)

def test_script_tag_with_src_and_comment_inside_returns_false():
    # <script> tag with src and comment inside
    html = '<script src="main.js"><!-- comment --></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.4μs -> 26.7μs (0.963% slower)

def test_script_tag_with_self_closing_returns_false():
    # Self-closing <script src="main.js"/>
    html = '<script src="main.js"/>'
    codeflash_output = _has_script_tag_without_src(html) # 18.0μs -> 17.7μs (2.21% faster)

def test_script_tag_with_self_closing_without_src_returns_true():
    # Self-closing <script/>
    html = '<script/>'
    codeflash_output = _has_script_tag_without_src(html) # 12.2μs -> 12.3μs (0.416% slower)

def test_script_tag_with_extra_spaces_returns_true():
    # <script   > with extra spaces
    html = '<script   >alert("hi");</script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.0μs -> 12.4μs (2.91% slower)

def test_script_tag_with_newline_in_tag_returns_true():
    # <script\n> with newline in tag
    html = '<script\n>alert("hi");</script>'
    codeflash_output = _has_script_tag_without_src(html) # 11.9μs -> 12.1μs (1.54% slower)

def test_script_tag_with_attributes_and_newline_returns_true():
    # <script\n type="text/javascript"> with newline and attribute
    html = '<script\n type="text/javascript"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 16.6μs -> 16.5μs (0.540% faster)

def test_script_tag_with_src_and_newline_returns_false():
    # <script\n src="main.js"> with newline and src
    html = '<script\n src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.9μs -> 26.5μs (1.46% faster)

def test_script_tag_with_src_and_other_attribute_with_newline_returns_false():
    # <script\n type="text/javascript" src="main.js"> with newline and src
    html = '<script\n type="text/javascript" src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 28.0μs -> 27.5μs (1.90% faster)

def test_script_tag_with_src_and_other_attribute_with_newline_in_src_returns_false():
    # <script type="text/javascript" src="main.js"\n> with newline after src
    html = '<script type="text/javascript" src="main.js"\n></script>'
    codeflash_output = _has_script_tag_without_src(html) # 27.3μs -> 26.1μs (4.81% faster)

def test_script_tag_with_malformed_html_returns_false():
    # Malformed <script> tag, should not crash
    html = '<script src="main.js"<div></div>'
    codeflash_output = _has_script_tag_without_src(html) # 20.9μs -> 20.7μs (1.17% faster)

def test_script_tag_with_malformed_html_but_valid_script_returns_true():
    # Malformed HTML, but valid <script> without src
    html = '<script>alert("hi");</script><div><div>'
    codeflash_output = _has_script_tag_without_src(html) # 11.8μs -> 12.1μs (2.18% slower)

def test_script_tag_with_attributes_with_no_value_returns_true():
    # <script defer></script> (defer is not src)
    html = '<script defer></script>'
    codeflash_output = _has_script_tag_without_src(html) # 14.1μs -> 14.2μs (0.127% slower)

def test_script_tag_with_multiple_attributes_one_is_src_returns_false():
    # <script defer src="main.js"></script>
    html = '<script defer src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.7μs -> 27.4μs (2.42% slower)

def test_script_tag_with_src_in_single_quotes_returns_false():
    # <script src='main.js'></script>
    html = "<script src='main.js'></script>"
    codeflash_output = _has_script_tag_without_src(html) # 24.4μs -> 24.1μs (1.13% faster)

def test_script_tag_with_src_and_other_attributes_with_single_quotes_returns_false():
    # <script type='text/javascript' src='main.js'></script>
    html = "<script type='text/javascript' src='main.js'></script>"
    codeflash_output = _has_script_tag_without_src(html) # 26.7μs -> 26.7μs (0.319% faster)

def test_script_tag_with_src_and_other_attributes_with_double_quotes_returns_false():
    # <script type="text/javascript" src="main.js"></script>
    html = '<script type="text/javascript" src="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.5μs -> 26.1μs (1.90% faster)

def test_script_tag_with_src_and_other_attributes_with_mixed_quotes_returns_false():
    # <script type="text/javascript" src='main.js'></script>
    html = '<script type="text/javascript" src=\'main.js\'></script>'
    codeflash_output = _has_script_tag_without_src(html) # 26.5μs -> 25.8μs (2.65% faster)

def test_script_tag_with_src_and_other_attributes_with_spaces_returns_false():
    # <script src = "main.js"></script>
    html = '<script src = "main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.9μs -> 23.5μs (1.46% faster)

def test_script_tag_with_src_and_other_attributes_with_tabs_returns_false():
    # <script	src="main.js"></script>
    html = '<script\tsrc="main.js"></script>'
    codeflash_output = _has_script_tag_without_src(html) # 23.4μs -> 23.7μs (1.13% slower)

def test_script_tag_with_multiple_script_tags_some_malformed_returns_true():
    # Multiple <script> tags, some malformed, one valid without src
    html = '''
    <script src="a.js"></script>
    <script>alert("hi");</script
    <script src="b.js"></script>
    '''
    codeflash_output = _has_script_tag_without_src(html) # 31.1μs -> 30.5μs (1.71% faster)

def test_script_tag_with_nested_script_tags_returns_true():
    # <script> tag nested inside another tag
    html = '<div><script>alert("hi");</script></div>'
    codeflash_output = _has_script_tag_without_src(html) # 16.1μs -> 11.5μs (40.1% faster)

def test_script_tag_with_nested_script_tags_with_src_returns_false():
    # <script> tag with src nested inside another tag
    html = '<div><script src="main.js"></script></div>'
    codeflash_output = _has_script_tag_without_src(html) # 30.4μs -> 26.8μs (13.6% faster)

def test_script_tag_with_spaces_before_tag_returns_true():
    # Spaces before <script>
    html = '    <script>alert("hi");</script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.8μs -> 11.7μs (9.70% faster)

def test_script_tag_with_spaces_inside_tag_returns_true():
    # Spaces inside <script  >
    html = '<script  >alert("hi");</script>'
    codeflash_output = _has_script_tag_without_src(html) # 12.0μs -> 12.1μs (1.39% slower)

def test_script_tag_with_comment_before_tag_returns_true():
    # Comment before <script>
    html = '<!-- comment --><script>alert("hi");</script>'
    codeflash_output = _has_script_tag_without_src(html) # 16.5μs -> 12.0μs (37.1% faster)

def test_script_tag_with_comment_after_tag_returns_true():
    # Comment after <script>
    html = '<script>alert("hi");</script><!-- comment -->'
    codeflash_output = _has_script_tag_without_src(html) # 11.7μs -> 12.1μs (2.82% slower)

# ---------- Large Scale Test Cases ----------

def test_large_html_with_many_non_script_tags_returns_false():
    # Large HTML with many non-script tags
    html = "<div>" + ("<p>Hello</p>" * 500) + "</div>"
    codeflash_output = _has_script_tag_without_src(html) # 907ns -> 1.14μs (20.5% slower)

def test_large_html_with_many_script_tags_all_with_src_returns_false():
    # Large HTML with many <script src="..."></script> tags
    html = "".join(f'<script src="file{i}.js"></script>' for i in range(500))
    codeflash_output = _has_script_tag_without_src(html) # 2.76ms -> 2.73ms (0.888% faster)

def test_large_html_with_many_script_tags_one_without_src_returns_true():
    # Large HTML with many <script> tags, one without src somewhere
    html = (
        "".join(f'<script src="file{i}.js"></script>' for i in range(250)) +
        '<script>alert("hi");</script>' +
        "".join(f'<script src="file{i}.js"></script>' for i in range(250, 500))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.42ms -> 1.40ms (1.79% faster)

def test_large_html_with_script_tag_at_end_returns_true():
    # Large HTML with <script> tag without src at the very end
    html = "<div>" + ("<p>Hello</p>" * 500) + "</div><script>console.log(1);</script>"
    codeflash_output = _has_script_tag_without_src(html) # 1.64ms -> 12.6μs (12850% faster)

def test_large_html_with_script_tag_at_start_returns_true():
    # Large HTML with <script> tag without src at the very start
    html = "<script>console.log(1);</script>" + "<div>" + ("<p>Hello</p>" * 500) + "</div>"
    codeflash_output = _has_script_tag_without_src(html) # 12.0μs -> 11.8μs (1.61% faster)

def test_large_html_with_script_tag_in_middle_returns_true():
    # Large HTML with <script> tag without src in the middle
    html = "<div>" + ("<p>Hello</p>" * 250) + "</div>" + \
           "<script>console.log(1);</script>" + \
           "<div>" + ("<p>Hello</p>" * 250) + "</div>"
    codeflash_output = _has_script_tag_without_src(html) # 824μs -> 12.7μs (6417% faster)

def test_large_html_with_many_script_tags_some_malformed_returns_true():
    # Large HTML with many <script> tags, some malformed, one valid without src
    html = (
        "".join(f'<script src="file{i}.js"></script>' for i in range(200)) +
        '<script>alert("hi");</script' +  # malformed, but should still be detected
        "".join(f'<script src="file{i}.js"></script>' for i in range(200, 400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.12ms -> 1.14ms (1.85% slower)

def test_large_html_with_script_tags_with_attributes_returns_true():
    # Large HTML with <script> tags with various attributes, one without src
    html = (
        "".join(f'<script src="file{i}.js" type="text/javascript"></script>' for i in range(200)) +
        '<script type="application/javascript"></script>' +
        "".join(f'<script src="file{i}.js" defer></script>' for i in range(200, 400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.35ms -> 1.31ms (2.92% faster)

def test_large_html_with_script_tags_with_attributes_returns_false():
    # Large HTML with <script> tags with various attributes, all with src
    html = (
        "".join(f'<script src="file{i}.js" type="text/javascript"></script>' for i in range(400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 2.67ms -> 2.61ms (2.46% faster)

def test_large_html_with_script_tags_with_mixed_case_src_returns_false():
    # Large HTML with <script> tags with mixed case src attribute
    html = (
        "".join(f'<script SrC="file{i}.js"></script>' for i in range(400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 2.15ms -> 2.25ms (4.62% slower)

def test_large_html_with_script_tags_with_mixed_case_src_and_one_without_src_returns_true():
    # Large HTML with <script> tags with mixed case src attribute, one without src
    html = (
        "".join(f'<script SrC="file{i}.js"></script>' for i in range(200)) +
        '<script>alert("hi");</script>' +
        "".join(f'<script SrC="file{i}.js"></script>' for i in range(200, 400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.12ms -> 1.12ms (0.362% faster)

def test_large_html_with_script_tags_with_spaces_and_one_without_src_returns_true():
    # Large HTML with <script> tags with spaces, one without src
    html = (
        "".join(f'<script src = "file{i}.js"></script>' for i in range(200)) +
        '<script>alert("hi");</script>' +
        "".join(f'<script src = "file{i}.js"></script>' for i in range(200, 400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.15ms -> 1.10ms (4.37% faster)

def test_large_html_with_script_tags_with_tabs_and_one_without_src_returns_true():
    # Large HTML with <script> tags with tabs, one without src
    html = (
        "".join(f'<script\tsrc="file{i}.js"></script>' for i in range(200)) +
        '<script>alert("hi");</script>' +
        "".join(f'<script\tsrc="file{i}.js"></script>' for i in range(200, 400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.14ms -> 1.13ms (0.860% faster)

def test_large_html_with_script_tags_with_newlines_and_one_without_src_returns_true():
    # Large HTML with <script\n> tags with newlines, one without src
    html = (
        "".join(f'<script\nsrc="file{i}.js"></script>' for i in range(200)) +
        '<script>alert("hi");</script>' +
        "".join(f'<script\nsrc="file{i}.js"></script>' for i in range(200, 400))
    )
    codeflash_output = _has_script_tag_without_src(html) # 1.14ms -> 1.13ms (1.10% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from marimo._output.formatters.iframe import _has_script_tag_without_src

def test__has_script_tag_without_src():
    _has_script_tag_without_src('<script>')

def test__has_script_tag_without_src_2():
    _has_script_tag_without_src('<script')

def test__has_script_tag_without_src_3():
    _has_script_tag_without_src('')
🔎 Concolic Coverage Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `codeflash_concolic_hg3s6k0k/tmp3pixswt5/test_concolic_coverage.py::test__has_script_tag_without_src` | 11.5μs | 11.0μs | 4.79% ✅ |
| `codeflash_concolic_hg3s6k0k/tmp3pixswt5/test_concolic_coverage.py::test__has_script_tag_without_src_2` | 7.88μs | 7.93μs | -0.618% ⚠️ |
| `codeflash_concolic_hg3s6k0k/tmp3pixswt5/test_concolic_coverage.py::test__has_script_tag_without_src_3` | 366ns | 563ns | -35.0% ⚠️ |
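
The Speedup column appears consistent with `(original - optimized) / optimized`, expressed as a percentage. That formula is inferred from the reported numbers rather than stated anywhere in this PR, and `speedup_pct` is a hypothetical helper name. Because the displayed timings are rounded, recomputing other rows from them can differ slightly from the reported percentage.

```python
def speedup_pct(original: float, optimized: float) -> float:
    """Relative speedup in percent; negative when the optimized run is slower.

    Assumption: this is the formula behind the reported percentages.
    """
    return (original - optimized) / optimized * 100.0


# Third row of the table: 366ns -> 563ns.
print(f"{speedup_pct(366, 563):.1f}%")    # -35.0%

# Overall runtime from the PR summary: 46.9 ms -> 40.2 ms.
print(f"{speedup_pct(46.9, 40.2):.0f}%")  # 17%
```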

To edit these changes, run `git checkout codeflash/optimize-_has_script_tag_without_src-mhb3eczg` and push.


The optimization replaces the string containment check `"<script" not in html_content` with `html_content.find("<script")` and then passes only the substring starting from the first `<script>` tag to the HTML parser instead of the entire HTML content.

**Key changes:**
1. **Early exit via `find()`**: `find()` returns the index of the first `<script` match (-1 if not found), so a single scan both decides the early exit and provides the offset used for slicing
2. **Substring parsing**: Instead of parsing the entire HTML document, only parse from the first `<script>` tag onward using `parser.feed(html_content[idx:])`

**Why this is faster:**
- **Reduced parser workload**: The HTML parser (`ScriptTagParser`) no longer needs to process potentially large amounts of HTML content before the first `<script>` tag. This is especially beneficial for large documents where `<script>` tags appear later in the HTML.
- **Early termination advantage**: Since the parser raises `StopIteration` when it finds a script tag without `src`, parsing only the relevant portion means less overall work.
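
The two variants can be sketched side by side as follows. This is a minimal illustration, not the marimo source: the body of `ScriptTagParser` is an assumption reconstructed from the description (a parser that raises `StopIteration` on a script tag without `src`), and the function names `has_inline_script_full` and `has_inline_script_sliced` are hypothetical.

```python
from html.parser import HTMLParser


class ScriptTagParser(HTMLParser):
    """Assumed shape: abort parsing at the first <script> tag that has no src."""

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this hook.
        if tag == "script" and not any(name == "src" for name, _ in attrs):
            raise StopIteration


def has_inline_script_full(html_content: str) -> bool:
    """Baseline: containment check, then parse the whole document."""
    if "<script" not in html_content:
        return False
    try:
        ScriptTagParser().feed(html_content)
    except StopIteration:
        return True
    return False


def has_inline_script_sliced(html_content: str) -> bool:
    """Optimized: locate the first '<script' and parse only from there."""
    idx = html_content.find("<script")
    if idx == -1:
        return False
    try:
        ScriptTagParser().feed(html_content[idx:])
    except StopIteration:
        return True
    return False
```

Under these assumptions, both functions return `True` for `'<div><script>alert("hi");</script></div>'` and `False` for `'<div><script src="main.js"></script></div>'`; only the amount of input handed to the parser differs.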

**Test case performance patterns:**
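- **Dramatic speedups** for large HTML documents where a script tag without `src` follows a lot of non-script content (6417% faster with the script in the middle, 12850% faster with it at the end)
- **Slowdowns when there is nothing to skip**: about 1-5% for small snippets that already begin with `<script`, and 20-35% in relative terms (a few hundred nanoseconds in absolute terms) on the early-exit path where the input contains no `<script` at all
- **Moderate improvements** (roughly 10-40%) for small snippets where a comment or wrapper tag precedes the script, and results within about ±5% for documents that start with script tags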

The optimization is particularly effective when the HTML content has substantial non-script content before the first `<script>` tag, making it a worthwhile trade-off despite minor overhead on tiny inputs.
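
A rough way to reproduce the comparison locally is sketched below with `timeit`. The parser class `_InlineScriptFinder` and the `detect` helper are hypothetical stand-ins written for this sketch, not marimo code, and absolute timings will vary by machine. The input has the same shape as `test_large_html_with_script_tag_at_end_returns_true`.

```python
import timeit
from html.parser import HTMLParser


class _InlineScriptFinder(HTMLParser):
    """Hypothetical stand-in: stop at the first <script> tag without src."""

    def handle_starttag(self, tag, attrs):
        if tag == "script" and all(name != "src" for name, _ in attrs):
            raise StopIteration


def detect(html_content: str, slice_first: bool) -> bool:
    """Return True if a <script> tag without src is found."""
    start = html_content.find("<script")
    if start == -1:
        return False
    # slice_first=True mimics the optimized variant; False parses everything.
    payload = html_content[start:] if slice_first else html_content
    try:
        _InlineScriptFinder().feed(payload)
    except StopIteration:
        return True
    return False


# Large document with the inline script at the very end.
html = "<div>" + ("<p>Hello</p>" * 500) + "</div><script>console.log(1);</script>"

runs = 20
full_s = timeit.timeit(lambda: detect(html, slice_first=False), number=runs)
sliced_s = timeit.timeit(lambda: detect(html, slice_first=True), number=runs)
print(f"full:   {full_s / runs * 1e6:.1f} μs/call")
print(f"sliced: {sliced_s / runs * 1e6:.1f} μs/call")
```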
@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 28, 2025 21:42
@codeflash-ai codeflash-ai bot added the labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) Oct 28, 2025