Conversation

codeflash-ai bot commented Oct 28, 2025

📄 51% (0.51x) speedup for convert_sam2_segmentation_response_to_inference_instances_seg_response in inference/core/workflows/core_steps/models/foundation/segment_anything2/v1.py

⏱️ Runtime: 30.0 milliseconds → 19.8 milliseconds (best of 209 runs)

📝 Explanation and details

The optimized code achieves a 51% speedup through two key NumPy-based optimizations:

1. Vectorized coordinate operations: Instead of using Python list comprehensions to extract x/y coordinates ([coord[0] for coord in mask]), the code converts each mask to a NumPy array once with np.asarray(mask) and then uses vectorized slicing (mask_coords[:, 0] and mask_coords[:, 1]) and NumPy's optimized min()/max() methods. This eliminates expensive Python loops for coordinate processing.
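
A minimal before/after sketch of this change (illustrative only, not the verbatim source; the surrounding bounding-box math is assumed from context):

```python
import numpy as np

mask = [[10, 10], [20, 10], [20, 20], [10, 20]]  # polygon as [x, y] points

# Before: per-point Python loops, then four more Python-level passes for min/max
x_coords = [coord[0] for coord in mask]
y_coords = [coord[1] for coord in mask]
min_x, max_x = min(x_coords), max(x_coords)
min_y, max_y = min(y_coords), max(y_coords)

# After: one conversion to an (N, 2) array, then C-level slicing and reductions
mask_coords = np.asarray(mask)
min_x, max_x = mask_coords[:, 0].min(), mask_coords[:, 0].max()
min_y, max_y = mask_coords[:, 1].min(), mask_coords[:, 1].max()
```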

2. Early confidence filtering: The confidence threshold check (prediction.confidence < threshold) is moved outside the mask loop, so when a prediction fails the threshold test, all its masks are skipped immediately rather than processing each mask before the confidence check.
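
In sketch form, using a hypothetical `filter_masks` helper over objects with `confidence` and `masks` attributes (as in the test stubs below):

```python
def filter_masks(predictions, threshold):
    kept = []
    for prediction in predictions:
        # One comparison per prediction: a failing prediction is rejected
        # before any of its (possibly many) masks are processed.
        if prediction.confidence < threshold:
            continue
        for mask in prediction.masks:
            # Degenerate polygons are still skipped per mask, as the tests below expect.
            if len(mask) < 3:
                continue
            kept.append(mask)
    return kept
```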

Performance characteristics from tests:

  • Best gains on large-scale scenarios: 55.7% faster with many predictions/masks, 54.9% faster with many masks per prediction
  • Consistent improvements across basic cases: 15-25% faster for typical mask processing
  • Minimal overhead for edge cases with no valid masks (only 1-7% differences)

These optimizations are particularly effective for computer vision workloads where masks typically contain many coordinate points, making the vectorized NumPy operations significantly faster than Python list processing.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 34 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from typing import List, Optional

import numpy as np
# imports
import pytest
from inference.core.workflows.core_steps.models.foundation.segment_anything2.v1 import \
    convert_sam2_segmentation_response_to_inference_instances_seg_response

# --- Dummy entity classes for testing (as per function signature) ---

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x and self.y == other.y

class InstanceSegmentationPrediction:
    def __init__(self, x, y, width, height, points, confidence, class_, class_id, parent_id):
        self.x = x
        self.y = y
        self.width = width
        self.height = height
        self.points = points
        self.confidence = confidence
        self.class_ = class_
        self.class_id = class_id
        self.parent_id = parent_id

    def __eq__(self, other):
        return (
            isinstance(other, InstanceSegmentationPrediction)
            and self.x == other.x
            and self.y == other.y
            and self.width == other.width
            and self.height == other.height
            and self.points == other.points
            and self.confidence == other.confidence
            and self.class_ == other.class_
            and self.class_id == other.class_id
            and self.parent_id == other.parent_id
        )

class InferenceResponseImage:
    def __init__(self, width, height):
        self.width = width
        self.height = height

class InstanceSegmentationInferenceResponse:
    def __init__(self, predictions, image):
        self.predictions = predictions
        self.image = image

class Sam2SegmentationPrediction:
    def __init__(self, masks: List[List[List[int]]], confidence: float):
        self.masks = masks
        self.confidence = confidence

class WorkflowImageData:
    def __init__(self, numpy_image):
        self.numpy_image = numpy_image

# --- Unit tests ---

# BASIC TEST CASES

def test_basic_single_mask_above_threshold():
    # Single mask, confidence above threshold
    img = WorkflowImageData(numpy_image=np.zeros((100, 200)))
    mask = [[10, 10], [20, 10], [20, 20], [10, 20]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.8)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["cat"], ["det1"], 0.5
    ); result = codeflash_output # 65.0μs -> 55.7μs (16.6% faster)
    pred = result.predictions[0]

def test_basic_multiple_masks_mixed_threshold():
    # Two masks, one above and one below threshold
    img = WorkflowImageData(numpy_image=np.zeros((50, 50)))
    mask1 = [[0, 0], [10, 0], [10, 10]]
    mask2 = [[20, 20], [30, 20], [30, 30]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask1, mask2], confidence=0.7)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [2], ["dog"], ["det2"], 0.6
    ); result = codeflash_output # 71.4μs -> 58.3μs (22.6% faster)
    # Check first mask
    pred1 = result.predictions[0]
    # Check second mask
    pred2 = result.predictions[1]

def test_basic_no_masks():
    # No masks in prediction
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    sam2_pred = Sam2SegmentationPrediction(masks=[], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [0], ["none"], [None], 0.5
    ); result = codeflash_output # 13.5μs -> 14.2μs (4.91% slower)

def test_basic_no_class_id_provided():
    # prompt_class_ids is empty, should default to foreground/0/None
    img = WorkflowImageData(numpy_image=np.zeros((5, 5)))
    mask = [[1, 1], [2, 1], [2, 2]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [], [], [], 0.5
    ); result = codeflash_output # 56.0μs -> 48.5μs (15.5% faster)
    pred = result.predictions[0]

# EDGE TEST CASES

def test_edge_mask_with_less_than_three_points():
    # Mask with less than 3 points should be skipped
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 2], [3, 4]]  # Only 2 points
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["thing"], ["det"], 0.5
    ); result = codeflash_output # 13.0μs -> 12.7μs (1.77% faster)

def test_edge_confidence_exactly_at_threshold():
    # Mask with confidence exactly equal to threshold should be included
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.5)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["thing"], ["det"], 0.5
    ); result = codeflash_output # 54.3μs -> 47.5μs (14.3% faster)

def test_edge_confidence_just_below_threshold():
    # Mask with confidence just below threshold should be skipped
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.4999)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["thing"], ["det"], 0.5
    ); result = codeflash_output # 13.1μs -> 12.8μs (2.47% faster)

def test_edge_empty_masks_and_empty_class_ids():
    # No masks and no class ids: should return empty predictions
    img = WorkflowImageData(numpy_image=np.zeros((5, 5)))
    sam2_pred = Sam2SegmentationPrediction(masks=[], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [], [], [], 0.5
    ); result = codeflash_output # 14.4μs -> 14.5μs (0.345% slower)

def test_edge_multiple_predictions_with_mismatched_class_ids():
    # More predictions than class ids: zip truncates to shortest
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam2_pred1 = Sam2SegmentationPrediction(masks=[mask], confidence=0.6)
    sam2_pred2 = Sam2SegmentationPrediction(masks=[mask], confidence=0.7)
    # Only one class id/class name/detection id provided
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred1, sam2_pred2], img, [5], ["apple"], ["id"], 0.5
    ); result = codeflash_output # 58.5μs -> 50.4μs (15.9% faster)

def test_edge_mask_with_negative_coordinates():
    # Mask with negative coordinates should be processed correctly
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[-5, -5], [0, 0], [5, 5]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["neg"], ["det"], 0.5
    ); result = codeflash_output # 52.6μs -> 44.6μs (18.1% faster)
    pred = result.predictions[0]

def test_edge_mask_with_duplicate_points():
    # Mask with duplicate points should be processed as is
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [1, 1], [2, 2]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [2], ["dup"], ["det"], 0.5
    ); result = codeflash_output # 50.9μs -> 45.1μs (12.9% faster)
    pred = result.predictions[0]

# LARGE SCALE TEST CASES

def test_large_scale_many_predictions_and_masks():
    # Many predictions and masks, check performance and correctness
    img = WorkflowImageData(numpy_image=np.zeros((100, 100)))
    num_preds = 100
    masks_per_pred = 5
    sam2_preds = []
    class_ids = []
    class_names = []
    detection_ids = []
    for i in range(num_preds):
        masks = []
        for j in range(masks_per_pred):
            # Each mask is a triangle
            masks.append([[j, j], [j+1, j], [j, j+1]])
        sam2_preds.append(Sam2SegmentationPrediction(masks=masks, confidence=0.8))
        class_ids.append(i)
        class_names.append(f"class_{i}")
        detection_ids.append(f"id_{i}")
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, class_ids, class_names, detection_ids, 0.5
    ); result = codeflash_output # 8.76ms -> 5.63ms (55.7% faster)
    # Check a few random predictions for correctness
    for idx in [0, 50, 499]:
        pred = result.predictions[idx]

def test_large_scale_masks_with_varying_sizes():
    # Masks with varying number of points
    img = WorkflowImageData(numpy_image=np.zeros((20, 20)))
    masks = []
    # Add a triangle
    masks.append([[0, 0], [1, 0], [0, 1]])
    # Add a quadrilateral
    masks.append([[2, 2], [4, 2], [4, 4], [2, 4]])
    # Add a pentagon
    masks.append([[5, 5], [6, 5], [7, 6], [6, 7], [5, 6]])
    sam2_pred = Sam2SegmentationPrediction(masks=masks, confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [0], ["shape"], ["det"], 0.5
    ); result = codeflash_output # 105μs -> 85.9μs (23.1% faster)
    # Triangle
    pred0 = result.predictions[0]
    # Quadrilateral
    pred1 = result.predictions[1]
    # Pentagon
    pred2 = result.predictions[2]

def test_large_scale_all_masks_below_threshold():
    # All masks have confidence below threshold
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    num_preds = 50
    sam2_preds = [Sam2SegmentationPrediction(masks=[[[0, 0], [1, 1], [2, 2]]], confidence=0.1) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, [0]*num_preds, ["low"]*num_preds, [None]*num_preds, 0.5
    ); result = codeflash_output # 19.8μs -> 16.2μs (22.0% faster)

def test_large_scale_empty_masks_in_large_batch():
    # Large batch, but all masks are empty
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    num_preds = 100
    sam2_preds = [Sam2SegmentationPrediction(masks=[], confidence=1.0) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, [], [], [], 0.5
    ); result = codeflash_output # 24.5μs -> 26.3μs (6.91% slower)

def test_large_scale_mixed_valid_and_invalid_masks():
    # Large batch, some masks valid, some invalid (too small or low confidence)
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    num_preds = 50
    sam2_preds = []
    for i in range(num_preds):
        if i % 2 == 0:
            # Valid mask, high confidence
            sam2_preds.append(Sam2SegmentationPrediction(masks=[[[0, 0], [1, 1], [2, 2]]], confidence=0.9))
        else:
            # Invalid mask, low confidence
            sam2_preds.append(Sam2SegmentationPrediction(masks=[[[0, 0], [1, 1], [2, 2]]], confidence=0.2))
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, [1]*num_preds, ["mixed"]*num_preds, [None]*num_preds, 0.5
    ); result = codeflash_output # 484μs -> 328μs (47.5% faster)
    for pred in result.predictions:
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import List, Optional

import numpy as np
# imports
import pytest
from inference.core.workflows.core_steps.models.foundation.segment_anything2.v1 import \
    convert_sam2_segmentation_response_to_inference_instances_seg_response


# Minimal stubs for required classes
class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y
    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x and self.y == other.y
    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"

class InferenceResponseImage:
    def __init__(self, width: int, height: int):
        self.width = width
        self.height = height
    def __eq__(self, other):
        return isinstance(other, InferenceResponseImage) and self.width == other.width and self.height == other.height

class InstanceSegmentationPrediction:
    def __init__(self, x, y, width, height, points, confidence, class_, class_id, parent_id):
        self.x = x
        self.y = y
        self.width = width
        self.height = height
        self.points = points
        self.confidence = confidence
        self.class_ = class_
        self.class_id = class_id
        self.parent_id = parent_id
    def __eq__(self, other):
        return (
            isinstance(other, InstanceSegmentationPrediction)
            and self.x == other.x
            and self.y == other.y
            and self.width == other.width
            and self.height == other.height
            and self.points == other.points
            and self.confidence == other.confidence
            and self.class_ == other.class_
            and self.class_id == other.class_id
            and self.parent_id == other.parent_id
        )
    def __repr__(self):
        return (f"InstanceSegmentationPrediction(x={self.x}, y={self.y}, width={self.width}, height={self.height}, "
                f"points={self.points}, confidence={self.confidence}, class_={self.class_}, class_id={self.class_id}, parent_id={self.parent_id})")

class InstanceSegmentationInferenceResponse:
    def __init__(self, predictions: List[InstanceSegmentationPrediction], image: InferenceResponseImage):
        self.predictions = predictions
        self.image = image
    def __eq__(self, other):
        return (
            isinstance(other, InstanceSegmentationInferenceResponse)
            and self.predictions == other.predictions
            and self.image == other.image
        )
    def __repr__(self):
        return f"InstanceSegmentationInferenceResponse(predictions={self.predictions}, image={self.image})"

class Sam2SegmentationPrediction:
    def __init__(self, masks: List[List[List[float]]], confidence: float):
        self.masks = masks
        self.confidence = confidence

class WorkflowImageData:
    def __init__(self, numpy_image: np.ndarray):
        self.numpy_image = numpy_image

# unit tests

# Basic Test Cases

def test_basic_single_mask_above_threshold():
    # Single prediction, single mask, above threshold
    image = WorkflowImageData(np.zeros((100, 200, 3)))
    mask = [[10, 20], [30, 40], [50, 60]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [1], ["cat"], ["det1"], 0.5
    ); resp = codeflash_output # 58.9μs -> 49.7μs (18.5% faster)
    pred = resp.predictions[0]

def test_basic_multiple_masks_and_predictions():
    # Multiple predictions, each with multiple masks
    image = WorkflowImageData(np.zeros((50, 50, 3)))
    mask1 = [[0, 0], [10, 0], [10, 10]]
    mask2 = [[5, 5], [15, 5], [15, 15]]
    sam_pred1 = Sam2SegmentationPrediction(masks=[mask1], confidence=0.7)
    sam_pred2 = Sam2SegmentationPrediction(masks=[mask2], confidence=0.8)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred1, sam_pred2], image, [1, 2], ["dog", "cat"], ["id1", "id2"], 0.6
    ); resp = codeflash_output # 71.1μs -> 58.4μs (21.7% faster)
    # First prediction
    pred1 = resp.predictions[0]
    # Second prediction
    pred2 = resp.predictions[1]

def test_basic_empty_masks_are_skipped():
    # Masks with <3 points are skipped
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask1 = [[1, 2], [3, 4]]  # only 2 points
    mask2 = [[0, 0], [1, 0], [1, 1]]  # valid
    sam_pred = Sam2SegmentationPrediction(masks=[mask1, mask2], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [7], ["test"], ["id"], 0.1
    ); resp = codeflash_output # 48.3μs -> 39.1μs (23.7% faster)
    pred = resp.predictions[0]

def test_basic_confidence_threshold():
    # Masks below threshold are skipped
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[0, 0], [1, 0], [1, 1]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.2)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [1], ["low"], ["pid"], 0.5
    ); resp = codeflash_output # 13.5μs -> 12.8μs (5.15% faster)

def test_basic_default_class_ids_names_detection_ids():
    # If prompt_class_ids is empty, defaults are used
    image = WorkflowImageData(np.zeros((5, 5, 3)))
    mask = [[0, 0], [1, 1], [2, 2]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [], [], [], 0.5
    ); resp = codeflash_output # 56.3μs -> 46.9μs (20.1% faster)
    pred = resp.predictions[0]

# Edge Test Cases

def test_edge_no_predictions():
    # No predictions input
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [], image, [], [], [], 0.5
    ); resp = codeflash_output # 14.5μs -> 14.3μs (0.935% faster)

def test_edge_mask_with_exactly_three_points():
    # Mask with exactly three points is valid
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.6)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [5], ["triangle"], ["pid"], 0.5
    ); resp = codeflash_output # 56.5μs -> 48.0μs (17.6% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_non_integer_coordinates():
    # Mask with float coordinates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1.5, 2.5], [3.5, 4.5], [5.5, 6.5]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.8)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [2], ["float"], ["pid"], 0.1
    ); resp = codeflash_output # 55.3μs -> 45.9μs (20.3% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_negative_coordinates():
    # Mask with negative coordinates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[-5, -5], [0, 0], [5, 5]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [3], ["neg"], ["pid"], 0.1
    ); resp = codeflash_output # 51.5μs -> 44.6μs (15.5% faster)
    pred = resp.predictions[0]

def test_edge_mismatched_lengths_of_prompts_and_predictions():
    # If prompts lists are shorter than predictions, zip truncates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask1 = [[0, 0], [1, 1], [2, 2]]
    mask2 = [[3, 3], [4, 4], [5, 5]]
    sam_pred1 = Sam2SegmentationPrediction(masks=[mask1], confidence=0.9)
    sam_pred2 = Sam2SegmentationPrediction(masks=[mask2], confidence=0.9)
    # Only one class_id/name/detection_id provided
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred1, sam_pred2], image, [1], ["a"], ["id"], 0.1
    ); resp = codeflash_output # 50.9μs -> 43.1μs (18.0% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_duplicate_points():
    # Mask contains duplicate points
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1, 1], [1, 1], [2, 2]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [1], ["dup"], ["pid"], 0.1
    ); resp = codeflash_output # 51.2μs -> 43.1μs (18.8% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_large_coordinates():
    # Mask with very large coordinates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1e6, 2e6], [3e6, 4e6], [5e6, 6e6]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.95)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [9], ["big"], ["pid"], 0.1
    ); resp = codeflash_output # 52.0μs -> 44.2μs (17.7% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_nan_inf_coordinates():
    # Mask with NaN/Inf coordinates should not crash
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[float('nan'), 1], [2, float('inf')], [3, 4]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.95)
    try:
        codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
            [sam_pred], image, [1], ["nan"], ["pid"], 0.1
        ); resp = codeflash_output
        pred = resp.predictions[0]
    except Exception as e:
        pytest.fail(f"Function crashed with nan/inf coordinates: {e}")

# Large Scale Test Cases

def test_large_scale_many_predictions():
    # Large number of predictions, each with a mask
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 500
    sam_preds = []
    class_ids = []
    class_names = []
    detection_ids = []
    for i in range(num_preds):
        mask = [[i, i], [i+1, i+1], [i+2, i+2]]
        sam_preds.append(Sam2SegmentationPrediction(masks=[mask], confidence=0.9))
        class_ids.append(i)
        class_names.append(f"class_{i}")
        detection_ids.append(f"id_{i}")
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, class_ids, class_names, detection_ids, 0.5
    ); resp = codeflash_output # 8.87ms -> 5.87ms (50.9% faster)
    # Spot check a few predictions
    for i in [0, 100, 499]:
        pred = resp.predictions[i]

def test_large_scale_many_masks_per_prediction():
    # Each prediction has many masks
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 10
    num_masks = 50
    sam_preds = []
    for i in range(num_preds):
        masks = []
        for j in range(num_masks):
            masks.append([[j, j], [j+1, j+1], [j+2, j+2]])
        sam_preds.append(Sam2SegmentationPrediction(masks=masks, confidence=0.99))
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [i for i in range(num_preds)], [f"class_{i}" for i in range(num_preds)], [f"id_{i}" for i in range(num_preds)], 0.5
    ); resp = codeflash_output # 8.73ms -> 5.64ms (54.9% faster)
    # Spot check
    for i in [0, 9]:
        for j in [0, 49]:
            idx = i * num_masks + j
            pred = resp.predictions[idx]

def test_large_scale_all_masks_below_threshold():
    # All masks below threshold, should produce no predictions
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 100
    sam_preds = [Sam2SegmentationPrediction(masks=[[[0,0],[1,1],[2,2]]], confidence=0.1) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [0]*num_preds, ["a"]*num_preds, ["id"]*num_preds, 0.5
    ); resp = codeflash_output # 27.3μs -> 22.6μs (20.7% faster)

def test_large_scale_all_masks_empty():
    # All masks have less than 3 points
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 100
    sam_preds = [Sam2SegmentationPrediction(masks=[[ [0,0], [1,1] ]], confidence=0.99) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [0]*num_preds, ["a"]*num_preds, ["id"]*num_preds, 0.5
    ); resp = codeflash_output # 22.6μs -> 24.1μs (6.18% slower)

def test_large_scale_default_prompts():
    # Large number of predictions, default prompts used
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 100
    sam_preds = [Sam2SegmentationPrediction(masks=[[[i, i], [i+1, i+1], [i+2, i+2]]], confidence=0.99) for i in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [], [], [], 0.5
    ); resp = codeflash_output # 1.82ms -> 1.22ms (49.7% faster)
    for pred in resp.predictions:
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-convert_sam2_segmentation_response_to_inference_instances_seg_response-mh9th7dp` and push.

codeflash-ai bot requested a review from mashraf-222 on October 28, 2025 at 00:17
codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels on Oct 28, 2025