
Conversation


Tprojects66554 commented Nov 6, 2025

Description

This pull request addresses two critical issues in the Non-Sensitivity metric related to the features_in_step parameter and the internal logic of pixel perturbations.

  1. Logical inconsistency in perturbation evaluation
    When features_in_step > 1, multiple pixels (both feature and non-feature) are perturbed simultaneously within the same step.
    As a result, the computed difference between the perturbed and original predictions (y_pred) reflects a mixed group of pixels, so it is impossible to tell whether the model is insensitive to any individual pixel.

  2. Shape mismatch causing ValueError
    When running the metric with features_in_step != 1, Quantus raised:

    ValueError: operands could not be broadcast together with shapes (17,24) (17,150528)
    

    This occurred at:

    return (preds_differences ^ non_features).sum(-1)

    where:

    • non_features → shape (batch_size, n_features)
    • preds_differences → shape (batch_size, n_perturbations)
      Since n_perturbations != n_features for multi-step perturbations, the XOR (^) operation failed due to incompatible dimensions (a minimal reproduction is sketched below).

These issues caused both logical misinterpretation of sensitivity violations and runtime failures during evaluation.
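
For reference, the broadcasting failure can be reproduced with plain NumPy, independent of any model. The array sizes below are illustrative only (150528 would correspond to a 224x224x3 input, and 24 to the number of perturbation steps once features_in_step groups many features per step); the point is simply that the boolean XOR cannot broadcast when the trailing dimensions differ:

    import numpy as np

    # Illustrative sizes: a batch of 17 inputs, 150528 features, 24 perturbation steps.
    batch_size, n_features, n_perturbations = 17, 150528, 24

    non_features = np.zeros((batch_size, n_features), dtype=bool)            # per-feature mask
    preds_differences = np.zeros((batch_size, n_perturbations), dtype=bool)  # per-step result

    try:
        (preds_differences ^ non_features).sum(-1)
    except ValueError as err:
        print(err)  # operands could not be broadcast together with shapes (17,24) (17,150528)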
Link to the issue

  • https://github.com/understandable-machine-intelligence-lab/Quantus/issues/367

Implemented changes

  • Rewrote the perturbation loop to evaluate per-pixel sensitivity, ensuring a clean separation between feature and non-feature perturbations (a simplified sketch of the per-pixel evaluation follows this list).
  • Adjusted the accumulation logic to correctly track prediction stability across perturbation steps.
  • Fixed the broadcasting mismatch by aligning array dimensions and reshaping operations before computing sensitivity violations.
  • Added debug information and improved documentation for reproducibility and clarity.
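
To make the intended behaviour concrete, the sketch below outlines a per-feature evaluation of non-sensitivity for a single input. It is a simplified illustration under stated assumptions, not the code added in this PR: model, x, a, perturb, and eps are hypothetical stand-ins for the corresponding Quantus internals.

    import numpy as np

    def non_sensitivity_per_feature(model, x, a, perturb, eps=1e-5):
        """Count non-sensitivity violations for one flattened input.

        Hypothetical helper, not the Quantus implementation:
          model   -- callable returning a scalar prediction for a flattened input
          x       -- flattened input of shape (n_features,)
          a       -- attribution map of the same shape
          perturb -- callable returning a copy of x with the given indices perturbed
          eps     -- tolerance for "zero" attribution / "unchanged" prediction
        """
        y_orig = model(x)
        non_features = np.abs(a) < eps                 # explanation marks feature as irrelevant
        pred_unchanged = np.zeros_like(non_features)   # model is insensitive to feature i

        # One feature per step, so each prediction difference is attributable to a single pixel.
        for i in range(x.shape[0]):
            pred_unchanged[i] = np.abs(model(perturb(x, indices=[i])) - y_orig) < eps

        # A violation is a feature where irrelevance (zero attribution) and insensitivity
        # (unchanged prediction) disagree, i.e. the symmetric difference of the two sets.
        return int(np.sum(non_features ^ pred_unchanged))

With features_in_step > 1, the inner loop would instead perturb groups of indices, and the resulting per-step differences could no longer be compared element-wise against the per-feature mask, which is exactly the mismatch described above.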

Minimum acceptance criteria

  • All tests under tests/metrics/test_non_sensitivity_metric.py and related evaluation modules pass successfully across supported environments (py310–py311).
  • The metric produces consistent scores for all features_in_step configurations without shape or logic errors (see the usage sketch after this list).
  • Reviewer confirmation by @annahedstroem or a Quantus core maintainer.
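
As a quick sanity check for the second criterion, something along the following lines should run without shape errors once the fix is in. This assumes the metric is exposed as quantus.NonSensitivity and follows the standard Quantus call interface; the model, inputs, and attributions are random placeholders, with image sizes chosen so that every tested features_in_step value divides the feature count evenly:

    import numpy as np
    import torch
    import quantus

    # Random placeholders: 8 RGB images of size 16x16 (3 * 16 * 16 = 768 features).
    x_batch = np.random.rand(8, 3, 16, 16).astype(np.float32)
    y_batch = np.random.randint(0, 10, size=8)
    a_batch = np.random.rand(8, 3, 16, 16).astype(np.float32)

    # A throwaway model; any torch.nn.Module with matching input/output shapes will do.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 16 * 16, 10))

    for features_in_step in (1, 4, 16):
        metric = quantus.NonSensitivity(features_in_step=features_in_step)
        scores = metric(model=model, x_batch=x_batch, y_batch=y_batch, a_batch=a_batch)
        print(features_in_step, scores)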

Tprojects66554 marked this pull request as ready for review November 9, 2025 17:34
Tprojects66554 (Author)

Hi @annahedstroem, I would appreciate it if you could take a look at this PR and let me know what you think.
