Skip to content

Conversation

@adharshctr
Copy link
Contributor

@adharshctr adharshctr commented Oct 31, 2025

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Important

Add _sanitize_for_serialization() to handle JSON serialization of unpicklable dataclass fields in base.py, updating input/output handling and exception catching.

  • Serialization:
    • Add _sanitize_for_serialization() to base.py to handle JSON serialization of unpicklable dataclass fields.
    • Integrate _sanitize_for_serialization() into _handle_span_input() and _handle_span_output() to sanitize inputs and outputs.
  • Exception Handling:
    • Update _handle_span_output() to catch ValueError in addition to TypeError.

This description was created by Ellipsis for be36de0. You can customize this summary. It will automatically update as commits are pushed.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced serialization handling to support a broader range of data types in traces
    • Extended error handling coverage during trace data processing

@coderabbitai
Copy link

coderabbitai bot commented Oct 31, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

A sanitization layer is introduced to the decorator base module to convert potentially unpicklable objects into JSON-friendly representations before serialization. The new helper function processes span inputs and outputs recursively, handling primitives, containers, dataclasses, and fallback string conversions. Exception handling in the output handler is expanded to catch both TypeError and ValueError.

Changes

Cohort / File(s) Summary
Serialization Robustness
packages/traceloop-sdk/traceloop/sdk/decorators/base.py
Added _sanitize_for_serialization() helper function to recursively convert unpicklable objects into JSON-friendly representations; updated _handle_span_input() to use sanitized args/kwargs before encoding; updated _handle_span_output() to use sanitized results before encoding; expanded exception handling in output handler to catch (TypeError, ValueError)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • The sanitization function includes recursion and dataclass handling logic that warrants verification for edge cases and type coverage
  • Exception handling expansion should be verified to ensure both error types are appropriately handled in context
  • Single-file scope and consistent sanitization pattern minimize overall complexity

Poem

🐰 Hopping through data with JSON so fine,
Converting the tricky to formats divine,
With recursion and care for each unpicklable friend,
Your spans now serialize without end! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "fix(traceloop-sdk): Improve JSON serialization by sanitizing unpicklable dataclass fields" directly and accurately describes the main change introduced in the pull request. The title identifies the component (traceloop-sdk), the type of improvement (JSON serialization), and the core mechanism (sanitizing unpicklable dataclass fields), which aligns with the actual implementation of the new _sanitize_for_serialization() helper function. The title is concise, follows conventional commit format, and is specific enough for teammates to quickly understand the primary purpose of this changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed everything up to be36de0 in 1 minute and 39 seconds. Click for details.
  • Reviewed 102 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/traceloop-sdk/traceloop/sdk/decorators/base.py:205
  • Draft comment:
    Using copy.copy/deepcopy on each dataclass field may incur performance overhead for large or complex objects. Consider optimizing or caching results if performance becomes an issue.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/traceloop-sdk/traceloop/sdk/decorators/base.py:175
  • Draft comment:
    Consider adding type annotations (e.g. obj: Any -> Any) in _sanitize_for_serialization for improved clarity and static type checking.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/traceloop-sdk/traceloop/sdk/decorators/base.py:243
  • Draft comment:
    Good improvement: sanitizing arguments before JSON serialization helps avoid errors from unpicklable objects.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
4. packages/traceloop-sdk/traceloop/sdk/decorators/base.py:269
  • Draft comment:
    Catching ValueError in addition to TypeError in _handle_span_output is a useful enhancement to capture more serialization errors.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_jcTtHgSoDJLwD5j8

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

return span, ctx, ctx_token


def _sanitize_for_serialization(obj):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider handling circular references in _sanitize_for_serialization to prevent infinite recursion when processing cyclic data structures.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
packages/traceloop-sdk/traceloop/sdk/decorators/base.py (1)

175-236: Reconsider the approach: "picklability" ≠ JSON serializability.

The function uses copy.deepcopy() to test whether objects are "picklable," but the actual goal is JSON serialization. These are fundamentally different concerns:

  • An object can be picklable but not JSON-serializable (e.g., custom classes)
  • An object can be JSON-serializable but not picklable
  • Testing with deepcopy() is expensive and conceptually misaligned

Additionally:

  1. Circular references are not handled, which can cause infinite recursion (as noted in the past review).
  2. Redundancy with JSONEncoder: The code already passes cls=JSONEncoder to json.dumps(). This sanitization layer may be unnecessary, or the logic should be integrated into the custom encoder instead.
  3. Inefficiency: Calling copy.copy() (line 208) and copy.deepcopy() (lines 215, 230) just to test copyability is computationally expensive.
  4. Local imports: dataclasses and copy (lines 180-181) should be imported at the module level.
  5. Incomplete type coverage: Common types like set, frozenset, bytes, etc., are not handled.

Consider one of these approaches:

Option 1 (Preferred): Extend the existing JSONEncoder to handle unpicklable objects directly during json.dumps():

# In utils/json_encoder.py or similar
class JSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if dataclasses.is_dataclass(obj):
            try:
                return dataclasses.asdict(obj)
            except (TypeError, ValueError):
                return f"<non-serializable: {type(obj).__name__}>"
        # Handle other types...
        try:
            return super().default(obj)
        except TypeError:
            return f"<non-serializable: {type(obj).__name__}>"

Option 2: If pre-sanitization is required, test JSON serializability directly instead of using deepcopy:

def _sanitize_for_serialization(obj, _seen=None):
    """Recursively sanitize objects for JSON serialization."""
    # Handle circular references
    if _seen is None:
        _seen = set()
    
    obj_id = id(obj)
    if obj_id in _seen:
        return f"<circular reference: {type(obj).__name__}>"
    
    # Primitives pass through
    if isinstance(obj, (type(None), str, int, float, bool)):
        return obj
    
    # Track this object
    _seen.add(obj_id)
    
    try:
        if isinstance(obj, (list, tuple)):
            result = type(obj)(_sanitize_for_serialization(item, _seen) for item in obj)
        elif isinstance(obj, dict):
            result = {key: _sanitize_for_serialization(value, _seen) for key, value in obj.items()}
        elif dataclasses.is_dataclass(obj):
            result = {
                field.name: _sanitize_for_serialization(getattr(obj, field.name), _seen)
                for field in dataclasses.fields(obj)
            }
        else:
            # Test actual JSON serializability
            json.dumps(obj)
            result = obj
    except (TypeError, ValueError):
        result = f"<non-serializable: {type(obj).__name__}>"
    finally:
        _seen.discard(obj_id)
    
    return result

Move dataclasses import to the top of the file.

🧹 Nitpick comments (1)
packages/traceloop-sdk/traceloop/sdk/decorators/base.py (1)

269-269: Verify and document the addition of ValueError to exception handling.

The exception clause was expanded to catch ValueError in addition to TypeError. While this may be necessary, it should be verified and documented:

  • What specific scenarios raise ValueError during serialization or sanitization?
  • Is this a common case or an edge case?

Consider adding a comment explaining when ValueError is expected.

If you can provide specific examples of when ValueError is raised, add a clarifying comment:

-    except (TypeError, ValueError) as e:
+    except (TypeError, ValueError) as e:  # ValueError can occur during dataclass field access
         Telemetry().log_exception(e)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between cf28145 and 6cabe21.

📒 Files selected for processing (1)
  • packages/traceloop-sdk/traceloop/sdk/decorators/base.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/traceloop-sdk/traceloop/sdk/decorators/base.py
🧬 Code graph analysis (1)
packages/traceloop-sdk/traceloop/sdk/decorators/base.py (2)
packages/opentelemetry-instrumentation-mcp/opentelemetry/instrumentation/mcp/fastmcp_instrumentation.py (2)
  • _should_send_prompts (156-160)
  • _truncate_json_if_needed (166-176)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
  • SpanAttributes (64-245)
🪛 Ruff (0.14.2)
packages/traceloop-sdk/traceloop/sdk/decorators/base.py

224-224: Consider moving this statement to an else block

(TRY300)


232-232: Consider moving this statement to an else block

(TRY300)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Lint
🔇 Additional comments (1)
packages/traceloop-sdk/traceloop/sdk/decorators/base.py (1)

242-247: ****

The sanitization and encoder layers are not redundant—they serve complementary, necessary purposes:

  • _sanitize_for_serialization (pre-serialization): Tests picklability via copy.deepcopy() and replaces unpicklable objects with string placeholders before json.dumps() is called. This prevents TypeError exceptions from truly unpicklable types (locks, sockets, etc.).

  • JSONEncoder (during serialization): Converts custom picklable types (dataclasses, objects with to_json(), etc.) to JSON-compatible forms. This runs during json.dumps() and cannot prevent errors from unpicklable objects.

The encoder alone cannot handle unpicklable objects because it operates too late in the pipeline. The current two-layer approach is architecturally correct: sanitize first to eliminate problematic objects, then apply type conversion for the remainder. Moving sanitization logic into the encoder would not solve the fundamental issue—unpicklable objects would still cause json.dumps() to fail.

Likely an incorrect or invalid review comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant