Conversation

codeflash-ai bot commented Oct 28, 2025

📄 51% (0.51x) speedup for convert_sam2_segmentation_response_to_inference_instances_seg_response in inference/core/workflows/core_steps/models/foundation/segment_anything2/v1.py

⏱️ Runtime: 30.0 milliseconds → 19.8 milliseconds (best of 209 runs)

📝 Explanation and details

The optimized code achieves a 51% speedup through two key NumPy-based optimizations:

1. Vectorized coordinate operations: Instead of using Python list comprehensions to extract x/y coordinates ([coord[0] for coord in mask]), the code converts each mask to a NumPy array once with np.asarray(mask) and then uses vectorized slicing (mask_coords[:, 0] and mask_coords[:, 1]) and NumPy's optimized min()/max() methods. This eliminates expensive Python loops for coordinate processing.
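
A minimal before/after sketch of this change (illustrative only, not the verbatim source; the surrounding bounding-box math is assumed from context):

```python
import numpy as np

mask = [[10, 10], [20, 10], [20, 20], [10, 20]]  # polygon as [x, y] points

# Before: per-point Python loops, then four more Python-level passes for min/max
x_coords = [coord[0] for coord in mask]
y_coords = [coord[1] for coord in mask]
min_x, max_x = min(x_coords), max(x_coords)
min_y, max_y = min(y_coords), max(y_coords)

# After: one conversion to an (N, 2) array, then C-level slicing and reductions
mask_coords = np.asarray(mask)
min_x, max_x = mask_coords[:, 0].min(), mask_coords[:, 0].max()
min_y, max_y = mask_coords[:, 1].min(), mask_coords[:, 1].max()
```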

2. Early confidence filtering: The confidence threshold check (prediction.confidence < threshold) is moved outside the mask loop, so when a prediction fails the threshold test, all its masks are skipped immediately rather than processing each mask before the confidence check.
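
In sketch form, using a hypothetical `filter_masks` helper over objects with `confidence` and `masks` attributes (as in the test stubs below):

```python
def filter_masks(predictions, threshold):
    kept = []
    for prediction in predictions:
        # One comparison per prediction: a failing prediction is rejected
        # before any of its (possibly many) masks are processed.
        if prediction.confidence < threshold:
            continue
        for mask in prediction.masks:
            # Degenerate polygons are still skipped per mask, as the tests below expect.
            if len(mask) < 3:
                continue
            kept.append(mask)
    return kept
```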

Performance characteristics from tests:

  • Best gains on large-scale scenarios: 55.7% faster with many predictions/masks, 54.9% faster with many masks per prediction
  • Consistent improvements across basic cases: 15-25% faster for typical mask processing
  • Minimal overhead for edge cases with no valid masks (only 1-7% differences)

These optimizations are particularly effective for computer vision workloads where masks typically contain many coordinate points, making the vectorized NumPy operations significantly faster than Python list processing.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 34 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from typing import List, Optional

import numpy as np
# imports
import pytest
from inference.core.workflows.core_steps.models.foundation.segment_anything2.v1 import \
    convert_sam2_segmentation_response_to_inference_instances_seg_response

# --- Dummy entity classes for testing (as per function signature) ---

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x and self.y == other.y

class InstanceSegmentationPrediction:
    def __init__(self, x, y, width, height, points, confidence, class_, class_id, parent_id):
        self.x = x
        self.y = y
        self.width = width
        self.height = height
        self.points = points
        self.confidence = confidence
        self.class_ = class_
        self.class_id = class_id
        self.parent_id = parent_id

    def __eq__(self, other):
        return (
            isinstance(other, InstanceSegmentationPrediction)
            and self.x == other.x
            and self.y == other.y
            and self.width == other.width
            and self.height == other.height
            and self.points == other.points
            and self.confidence == other.confidence
            and self.class_ == other.class_
            and self.class_id == other.class_id
            and self.parent_id == other.parent_id
        )

class InferenceResponseImage:
    def __init__(self, width, height):
        self.width = width
        self.height = height

class InstanceSegmentationInferenceResponse:
    def __init__(self, predictions, image):
        self.predictions = predictions
        self.image = image

class Sam2SegmentationPrediction:
    def __init__(self, masks: List[List[List[int]]], confidence: float):
        self.masks = masks
        self.confidence = confidence

class WorkflowImageData:
    def __init__(self, numpy_image):
        self.numpy_image = numpy_image

# --- Unit tests ---

# BASIC TEST CASES

def test_basic_single_mask_above_threshold():
    # Single mask, confidence above threshold
    img = WorkflowImageData(numpy_image=np.zeros((100, 200)))
    mask = [[10, 10], [20, 10], [20, 20], [10, 20]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.8)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["cat"], ["det1"], 0.5
    ); result = codeflash_output # 65.0μs -> 55.7μs (16.6% faster)
    pred = result.predictions[0]

def test_basic_multiple_masks_mixed_threshold():
    # Two masks, one above and one below threshold
    img = WorkflowImageData(numpy_image=np.zeros((50, 50)))
    mask1 = [[0, 0], [10, 0], [10, 10]]
    mask2 = [[20, 20], [30, 20], [30, 30]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask1, mask2], confidence=0.7)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [2], ["dog"], ["det2"], 0.6
    ); result = codeflash_output # 71.4μs -> 58.3μs (22.6% faster)
    # Check first mask
    pred1 = result.predictions[0]
    # Check second mask
    pred2 = result.predictions[1]

def test_basic_no_masks():
    # No masks in prediction
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    sam2_pred = Sam2SegmentationPrediction(masks=[], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [0], ["none"], [None], 0.5
    ); result = codeflash_output # 13.5μs -> 14.2μs (4.91% slower)

def test_basic_no_class_id_provided():
    # prompt_class_ids is empty, should default to foreground/0/None
    img = WorkflowImageData(numpy_image=np.zeros((5, 5)))
    mask = [[1, 1], [2, 1], [2, 2]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [], [], [], 0.5
    ); result = codeflash_output # 56.0μs -> 48.5μs (15.5% faster)
    pred = result.predictions[0]

# EDGE TEST CASES

def test_edge_mask_with_less_than_three_points():
    # Mask with less than 3 points should be skipped
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 2], [3, 4]]  # Only 2 points
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["thing"], ["det"], 0.5
    ); result = codeflash_output # 13.0μs -> 12.7μs (1.77% faster)

def test_edge_confidence_exactly_at_threshold():
    # Mask with confidence exactly equal to threshold should be included
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.5)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["thing"], ["det"], 0.5
    ); result = codeflash_output # 54.3μs -> 47.5μs (14.3% faster)

def test_edge_confidence_just_below_threshold():
    # Mask with confidence just below threshold should be skipped
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.4999)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["thing"], ["det"], 0.5
    ); result = codeflash_output # 13.1μs -> 12.8μs (2.47% faster)

def test_edge_empty_masks_and_empty_class_ids():
    # No masks and no class ids: should return empty predictions
    img = WorkflowImageData(numpy_image=np.zeros((5, 5)))
    sam2_pred = Sam2SegmentationPrediction(masks=[], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [], [], [], 0.5
    ); result = codeflash_output # 14.4μs -> 14.5μs (0.345% slower)

def test_edge_multiple_predictions_with_mismatched_class_ids():
    # More predictions than class ids: zip truncates to shortest
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam2_pred1 = Sam2SegmentationPrediction(masks=[mask], confidence=0.6)
    sam2_pred2 = Sam2SegmentationPrediction(masks=[mask], confidence=0.7)
    # Only one class id/class name/detection id provided
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred1, sam2_pred2], img, [5], ["apple"], ["id"], 0.5
    ); result = codeflash_output # 58.5μs -> 50.4μs (15.9% faster)

def test_edge_mask_with_negative_coordinates():
    # Mask with negative coordinates should be processed correctly
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[-5, -5], [0, 0], [5, 5]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [1], ["neg"], ["det"], 0.5
    ); result = codeflash_output # 52.6μs -> 44.6μs (18.1% faster)
    pred = result.predictions[0]

def test_edge_mask_with_duplicate_points():
    # Mask with duplicate points should be processed as is
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    mask = [[1, 1], [1, 1], [2, 2]]
    sam2_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [2], ["dup"], ["det"], 0.5
    ); result = codeflash_output # 50.9μs -> 45.1μs (12.9% faster)
    pred = result.predictions[0]

# LARGE SCALE TEST CASES

def test_large_scale_many_predictions_and_masks():
    # Many predictions and masks, check performance and correctness
    img = WorkflowImageData(numpy_image=np.zeros((100, 100)))
    num_preds = 100
    masks_per_pred = 5
    sam2_preds = []
    class_ids = []
    class_names = []
    detection_ids = []
    for i in range(num_preds):
        masks = []
        for j in range(masks_per_pred):
            # Each mask is a triangle
            masks.append([[j, j], [j+1, j], [j, j+1]])
        sam2_preds.append(Sam2SegmentationPrediction(masks=masks, confidence=0.8))
        class_ids.append(i)
        class_names.append(f"class_{i}")
        detection_ids.append(f"id_{i}")
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, class_ids, class_names, detection_ids, 0.5
    ); result = codeflash_output # 8.76ms -> 5.63ms (55.7% faster)
    # Check a few random predictions for correctness
    for idx in [0, 50, 499]:
        pred = result.predictions[idx]

def test_large_scale_masks_with_varying_sizes():
    # Masks with varying number of points
    img = WorkflowImageData(numpy_image=np.zeros((20, 20)))
    masks = []
    # Add a triangle
    masks.append([[0, 0], [1, 0], [0, 1]])
    # Add a quadrilateral
    masks.append([[2, 2], [4, 2], [4, 4], [2, 4]])
    # Add a pentagon
    masks.append([[5, 5], [6, 5], [7, 6], [6, 7], [5, 6]])
    sam2_pred = Sam2SegmentationPrediction(masks=masks, confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam2_pred], img, [0], ["shape"], ["det"], 0.5
    ); result = codeflash_output # 105μs -> 85.9μs (23.1% faster)
    # Triangle
    pred0 = result.predictions[0]
    # Quadrilateral
    pred1 = result.predictions[1]
    # Pentagon
    pred2 = result.predictions[2]

def test_large_scale_all_masks_below_threshold():
    # All masks have confidence below threshold
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    num_preds = 50
    sam2_preds = [Sam2SegmentationPrediction(masks=[[[0, 0], [1, 1], [2, 2]]], confidence=0.1) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, [0]*num_preds, ["low"]*num_preds, [None]*num_preds, 0.5
    ); result = codeflash_output # 19.8μs -> 16.2μs (22.0% faster)

def test_large_scale_empty_masks_in_large_batch():
    # Large batch, but all masks are empty
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    num_preds = 100
    sam2_preds = [Sam2SegmentationPrediction(masks=[], confidence=1.0) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, [], [], [], 0.5
    ); result = codeflash_output # 24.5μs -> 26.3μs (6.91% slower)

def test_large_scale_mixed_valid_and_invalid_masks():
    # Large batch, some masks valid, some invalid (too small or low confidence)
    img = WorkflowImageData(numpy_image=np.zeros((10, 10)))
    num_preds = 50
    sam2_preds = []
    for i in range(num_preds):
        if i % 2 == 0:
            # Valid mask, high confidence
            sam2_preds.append(Sam2SegmentationPrediction(masks=[[[0, 0], [1, 1], [2, 2]]], confidence=0.9))
        else:
            # Invalid mask, low confidence
            sam2_preds.append(Sam2SegmentationPrediction(masks=[[[0, 0], [1, 1], [2, 2]]], confidence=0.2))
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam2_preds, img, [1]*num_preds, ["mixed"]*num_preds, [None]*num_preds, 0.5
    ); result = codeflash_output # 484μs -> 328μs (47.5% faster)
    for pred in result.predictions:
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import List, Optional

import numpy as np
# imports
import pytest
from inference.core.workflows.core_steps.models.foundation.segment_anything2.v1 import \
    convert_sam2_segmentation_response_to_inference_instances_seg_response


# Minimal stubs for required classes
class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y
    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x and self.y == other.y
    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"

class InferenceResponseImage:
    def __init__(self, width: int, height: int):
        self.width = width
        self.height = height
    def __eq__(self, other):
        return isinstance(other, InferenceResponseImage) and self.width == other.width and self.height == other.height

class InstanceSegmentationPrediction:
    def __init__(self, x, y, width, height, points, confidence, class_, class_id, parent_id):
        self.x = x
        self.y = y
        self.width = width
        self.height = height
        self.points = points
        self.confidence = confidence
        self.class_ = class_
        self.class_id = class_id
        self.parent_id = parent_id
    def __eq__(self, other):
        return (
            isinstance(other, InstanceSegmentationPrediction)
            and self.x == other.x
            and self.y == other.y
            and self.width == other.width
            and self.height == other.height
            and self.points == other.points
            and self.confidence == other.confidence
            and self.class_ == other.class_
            and self.class_id == other.class_id
            and self.parent_id == other.parent_id
        )
    def __repr__(self):
        return (f"InstanceSegmentationPrediction(x={self.x}, y={self.y}, width={self.width}, height={self.height}, "
                f"points={self.points}, confidence={self.confidence}, class_={self.class_}, class_id={self.class_id}, parent_id={self.parent_id})")

class InstanceSegmentationInferenceResponse:
    def __init__(self, predictions: List[InstanceSegmentationPrediction], image: InferenceResponseImage):
        self.predictions = predictions
        self.image = image
    def __eq__(self, other):
        return (
            isinstance(other, InstanceSegmentationInferenceResponse)
            and self.predictions == other.predictions
            and self.image == other.image
        )
    def __repr__(self):
        return f"InstanceSegmentationInferenceResponse(predictions={self.predictions}, image={self.image})"

class Sam2SegmentationPrediction:
    def __init__(self, masks: List[List[List[float]]], confidence: float):
        self.masks = masks
        self.confidence = confidence

class WorkflowImageData:
    def __init__(self, numpy_image: np.ndarray):
        self.numpy_image = numpy_image

# unit tests

# Basic Test Cases

def test_basic_single_mask_above_threshold():
    # Single prediction, single mask, above threshold
    image = WorkflowImageData(np.zeros((100, 200, 3)))
    mask = [[10, 20], [30, 40], [50, 60]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [1], ["cat"], ["det1"], 0.5
    ); resp = codeflash_output # 58.9μs -> 49.7μs (18.5% faster)
    pred = resp.predictions[0]

def test_basic_multiple_masks_and_predictions():
    # Multiple predictions, each with multiple masks
    image = WorkflowImageData(np.zeros((50, 50, 3)))
    mask1 = [[0, 0], [10, 0], [10, 10]]
    mask2 = [[5, 5], [15, 5], [15, 15]]
    sam_pred1 = Sam2SegmentationPrediction(masks=[mask1], confidence=0.7)
    sam_pred2 = Sam2SegmentationPrediction(masks=[mask2], confidence=0.8)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred1, sam_pred2], image, [1, 2], ["dog", "cat"], ["id1", "id2"], 0.6
    ); resp = codeflash_output # 71.1μs -> 58.4μs (21.7% faster)
    # First prediction
    pred1 = resp.predictions[0]
    # Second prediction
    pred2 = resp.predictions[1]

def test_basic_empty_masks_are_skipped():
    # Masks with <3 points are skipped
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask1 = [[1, 2], [3, 4]]  # only 2 points
    mask2 = [[0, 0], [1, 0], [1, 1]]  # valid
    sam_pred = Sam2SegmentationPrediction(masks=[mask1, mask2], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [7], ["test"], ["id"], 0.1
    ); resp = codeflash_output # 48.3μs -> 39.1μs (23.7% faster)
    pred = resp.predictions[0]

def test_basic_confidence_threshold():
    # Masks below threshold are skipped
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[0, 0], [1, 0], [1, 1]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.2)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [1], ["low"], ["pid"], 0.5
    ); resp = codeflash_output # 13.5μs -> 12.8μs (5.15% faster)

def test_basic_default_class_ids_names_detection_ids():
    # If prompt_class_ids is empty, defaults are used
    image = WorkflowImageData(np.zeros((5, 5, 3)))
    mask = [[0, 0], [1, 1], [2, 2]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=1.0)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [], [], [], 0.5
    ); resp = codeflash_output # 56.3μs -> 46.9μs (20.1% faster)
    pred = resp.predictions[0]

# Edge Test Cases

def test_edge_no_predictions():
    # No predictions input
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [], image, [], [], [], 0.5
    ); resp = codeflash_output # 14.5μs -> 14.3μs (0.935% faster)

def test_edge_mask_with_exactly_three_points():
    # Mask with exactly three points is valid
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1, 1], [2, 2], [3, 3]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.6)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [5], ["triangle"], ["pid"], 0.5
    ); resp = codeflash_output # 56.5μs -> 48.0μs (17.6% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_non_integer_coordinates():
    # Mask with float coordinates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1.5, 2.5], [3.5, 4.5], [5.5, 6.5]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.8)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [2], ["float"], ["pid"], 0.1
    ); resp = codeflash_output # 55.3μs -> 45.9μs (20.3% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_negative_coordinates():
    # Mask with negative coordinates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[-5, -5], [0, 0], [5, 5]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [3], ["neg"], ["pid"], 0.1
    ); resp = codeflash_output # 51.5μs -> 44.6μs (15.5% faster)
    pred = resp.predictions[0]

def test_edge_mismatched_lengths_of_prompts_and_predictions():
    # If prompts lists are shorter than predictions, zip truncates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask1 = [[0, 0], [1, 1], [2, 2]]
    mask2 = [[3, 3], [4, 4], [5, 5]]
    sam_pred1 = Sam2SegmentationPrediction(masks=[mask1], confidence=0.9)
    sam_pred2 = Sam2SegmentationPrediction(masks=[mask2], confidence=0.9)
    # Only one class_id/name/detection_id provided
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred1, sam_pred2], image, [1], ["a"], ["id"], 0.1
    ); resp = codeflash_output # 50.9μs -> 43.1μs (18.0% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_duplicate_points():
    # Mask contains duplicate points
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1, 1], [1, 1], [2, 2]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.9)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [1], ["dup"], ["pid"], 0.1
    ); resp = codeflash_output # 51.2μs -> 43.1μs (18.8% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_large_coordinates():
    # Mask with very large coordinates
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[1e6, 2e6], [3e6, 4e6], [5e6, 6e6]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.95)
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        [sam_pred], image, [9], ["big"], ["pid"], 0.1
    ); resp = codeflash_output # 52.0μs -> 44.2μs (17.7% faster)
    pred = resp.predictions[0]

def test_edge_mask_with_nan_inf_coordinates():
    # Mask with NaN/Inf coordinates should not crash
    image = WorkflowImageData(np.zeros((10, 10, 3)))
    mask = [[float('nan'), 1], [2, float('inf')], [3, 4]]
    sam_pred = Sam2SegmentationPrediction(masks=[mask], confidence=0.95)
    try:
        codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
            [sam_pred], image, [1], ["nan"], ["pid"], 0.1
        ); resp = codeflash_output
        pred = resp.predictions[0]
    except Exception as e:
        pytest.fail(f"Function crashed with nan/inf coordinates: {e}")

# Large Scale Test Cases

def test_large_scale_many_predictions():
    # Large number of predictions, each with a mask
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 500
    sam_preds = []
    class_ids = []
    class_names = []
    detection_ids = []
    for i in range(num_preds):
        mask = [[i, i], [i+1, i+1], [i+2, i+2]]
        sam_preds.append(Sam2SegmentationPrediction(masks=[mask], confidence=0.9))
        class_ids.append(i)
        class_names.append(f"class_{i}")
        detection_ids.append(f"id_{i}")
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, class_ids, class_names, detection_ids, 0.5
    ); resp = codeflash_output # 8.87ms -> 5.87ms (50.9% faster)
    # Spot check a few predictions
    for i in [0, 100, 499]:
        pred = resp.predictions[i]

def test_large_scale_many_masks_per_prediction():
    # Each prediction has many masks
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 10
    num_masks = 50
    sam_preds = []
    for i in range(num_preds):
        masks = []
        for j in range(num_masks):
            masks.append([[j, j], [j+1, j+1], [j+2, j+2]])
        sam_preds.append(Sam2SegmentationPrediction(masks=masks, confidence=0.99))
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [i for i in range(num_preds)], [f"class_{i}" for i in range(num_preds)], [f"id_{i}" for i in range(num_preds)], 0.5
    ); resp = codeflash_output # 8.73ms -> 5.64ms (54.9% faster)
    # Spot check
    for i in [0, 9]:
        for j in [0, 49]:
            idx = i * num_masks + j
            pred = resp.predictions[idx]

def test_large_scale_all_masks_below_threshold():
    # All masks below threshold, should produce no predictions
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 100
    sam_preds = [Sam2SegmentationPrediction(masks=[[[0,0],[1,1],[2,2]]], confidence=0.1) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [0]*num_preds, ["a"]*num_preds, ["id"]*num_preds, 0.5
    ); resp = codeflash_output # 27.3μs -> 22.6μs (20.7% faster)

def test_large_scale_all_masks_empty():
    # All masks have less than 3 points
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 100
    sam_preds = [Sam2SegmentationPrediction(masks=[[ [0,0], [1,1] ]], confidence=0.99) for _ in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [0]*num_preds, ["a"]*num_preds, ["id"]*num_preds, 0.5
    ); resp = codeflash_output # 22.6μs -> 24.1μs (6.18% slower)

def test_large_scale_default_prompts():
    # Large number of predictions, default prompts used
    image = WorkflowImageData(np.zeros((100, 100, 3)))
    num_preds = 100
    sam_preds = [Sam2SegmentationPrediction(masks=[[[i, i], [i+1, i+1], [i+2, i+2]]], confidence=0.99) for i in range(num_preds)]
    codeflash_output = convert_sam2_segmentation_response_to_inference_instances_seg_response(
        sam_preds, image, [], [], [], 0.5
    ); resp = codeflash_output # 1.82ms -> 1.22ms (49.7% faster)
    for pred in resp.predictions:
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-convert_sam2_segmentation_response_to_inference_instances_seg_response-mh9th7dp` and push.

codeflash-ai bot requested a review from mashraf-222 on October 28, 2025 at 00:17
codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels on Oct 28, 2025