Conversation

ericspod
Member

@ericspod ericspod commented Sep 11, 2025

Fixes #8355.

Description

This modifies all uses of torch.load to load weights-only. This changes some behaviour in that data needs to be converted to tensors where possible, i.e. NumPy arrays are converted to tensors.

This issue was reported by @h3rrr in GHSA-6vm5-6jv9-rjpj.

This also replaces uses of pickle for saving and loading with torch.save/torch.load. This likewise requires that NumPy arrays be converted to tensors, but otherwise relies on the weights-only restriction to enforce safe loading.

This issue was reported by @h3rrr in GHSA-p8cm-mm2v-gwjm.
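The save/load pattern described above can be sketched as follows. This is a minimal illustration, not MONAI's actual code; the dictionary contents and the in-memory buffer are made up for the example:

```python
import io

import numpy as np
import torch

# Illustrative data: NumPy arrays must become tensors before saving so that
# the file can later be read back with torch.load(..., weights_only=True).
data = {"image": np.ones((2, 2), dtype=np.float32), "label": 1}
safe_data = {k: torch.as_tensor(v) if isinstance(v, np.ndarray) else v for k, v in data.items()}

buf = io.BytesIO()  # stands in for a cache file on disk
torch.save(safe_data, buf)
buf.seek(0)
# weights_only=True refuses to unpickle arbitrary objects; only tensors and
# primitive containers (dict/list/tuple, str, int, float) are reconstructed.
loaded = torch.load(buf, weights_only=True)
```

Had `data["image"]` been left as a NumPy array, the weights-only load would fail, which is why the conversion step is required everywhere caches are written.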

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

Contributor

coderabbitai bot commented Sep 11, 2025

Walkthrough

Replaces many torch.load calls to use weights_only=True across nnU-Net bundle integration, CheckpointLoader, StateCacher, datasets, and tests. Persistent/LMDB cache I/O now stores tensors, adds LMDBDataset._safe_serialize/_safe_deserialize, and defaults the LMDBDataset/CacheNTransDataset pickle_protocol to torch.serialization.DEFAULT_PROTOCOL. Meta-serialization safe globals are expanded to include MetaObj, MetaTensor, MetaKeys, and SpaceKeys. Removes PICKLE_KEY_SUFFIX and pickle_operations; list_data_collate/decollate_batch no longer perform pickle-based (de)serialization. convert_to_tensor gained a convert_numeric parameter. Tests are updated to match these changes. A few signatures changed (LMDBDataset.__init__, CacheNTransDataset.__init__, convert_to_tensor) and new LMDB methods were added.
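As a rough model of the new convert_numeric parameter on convert_to_tensor, a toy sketch follows. This is not MONAI's implementation (a later review comment notes the real recursion may not thread the flag through nested containers); it only illustrates the flag's intended meaning:

```python
import torch

def convert_to_tensor_sketch(data, convert_numeric=True):
    """Toy model of the convert_numeric flag: when False, bare Python
    numerics pass through unchanged rather than becoming 0-d tensors."""
    if isinstance(data, dict):
        return {k: convert_to_tensor_sketch(v, convert_numeric) for k, v in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(convert_to_tensor_sketch(v, convert_numeric) for v in data)
    if isinstance(data, (bool, int, float)) and not convert_numeric:
        return data  # leave plain numerics alone
    return torch.as_tensor(data)
```

With convert_numeric=False, `{"count": 3}` comes back unchanged; with the default True, the value becomes a 0-d tensor.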

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Linked Issues Check (⚠️ Warning): The PR implements the linked issue's core tactics by switching many torch.load calls to weights_only=True, replacing pickle with torch.save/torch.load, adding LMDB tensor-safe serialization helpers, and registering safe globals for MetaTensor/MetaKeys. However, CI/test output and PR comments report failing cache/meta-load tests, and the PR itself notes legacy caches may become unloadable; additionally, the diff introduces public API changes (removal of PICKLE_KEY_SUFFIX and pickle_operations, and a convert_to_tensor signature change) that create compatibility risks not resolved by the current changes. Given these failing tests and outstanding compatibility/migration concerns, the PR does not yet fully satisfy the linked issue objectives. Resolution: run the full test suite and fix the failing cache/meta-load cases or provide an automated migration path for legacy caches; add unit tests that cover LMDB/cache/meta_tensor loading under weights_only; and either document and justify any public API removals/changes or split them into a separate migration PR so the compatibility-focused fixes can be reviewed and landed first.
  • Out of Scope Changes Check (⚠️ Warning): The branch includes changes beyond the stated compatibility objective, specifically removal of public symbols (PICKLE_KEY_SUFFIX, pickle_operations), commenting out collate/decollate pickle paths, and a signature change to convert_to_tensor (the convert_numeric parameter), all of which alter the public API surface and behavior. These edits are out of scope for a focused weights_only compatibility fix and may cause downstream breakage if merged without separate review/migration guidance. Resolution: revert or extract the public API and behavioral changes into a distinct PR with an explicit deprecation and migration plan, add compatibility tests and changelog entries, and keep this PR focused on the minimal, well-tested changes required to make weights_only loading safe and compatible.
✅ Passed checks (3 passed)
  • Title Check (✅ Passed): The title "Torch and Pickle Safe Load Fixes" is concise and accurately reflects the primary intent of the changeset (making torch.load/pickle usage safer by moving to weights-only loading and replacing pickle paths), so it is clear and appropriate for repository history.
  • Description Check (✅ Passed): The PR description follows the repository template, includes "Fixes #8355", cites the security advisories, and explains the migration to weights-only torch.load and torch.save plus the need to convert NumPy arrays to tensors; it also marks change types and documents intent. Several checklist items (tests, docs) are unchecked, but the description is otherwise specific and actionable for reviewers.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00%, which exceeds the required threshold of 80.00%.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (11)
monai/utils/state_cacher.py (2)

127-127: Load on CPU and fix lint.

Use explicit CPU mapping and drop the lambda to satisfy Ruff ARG005 and avoid accidental GPU allocations.

-        data_obj = torch.load(fn, map_location=lambda storage, location: storage, weights_only=True)
+        data_obj = torch.load(fn, map_location=torch.device("cpu"), weights_only=True)

67-70: Docstring default is stale (protocol).

Defaults are now torch.serialization.DEFAULT_PROTOCOL; update text.

-            pickle_protocol: can be specified to override the default protocol, default to `2`.
+            pickle_protocol: can be specified to override the default protocol, default to `torch.serialization.DEFAULT_PROTOCOL`.
-            pickle_protocol: can be specified to override the default protocol, default to `self.pickle_protocol`.
+            pickle_protocol: can be specified to override the default protocol, default to `self.pickle_protocol` (defaults to `torch.serialization.DEFAULT_PROTOCOL`).

Also applies to: 95-97

monai/data/dataset.py (1)

378-380: Map cache loads to CPU explicitly.

Avoid device surprises when reading user caches.

-                return torch.load(hashfile, weights_only=False)
+                return torch.load(hashfile, map_location=torch.device("cpu"), weights_only=False)
monai/handlers/checkpoint_loader.py (1)

125-125: Default to CPU map_location for safe, predictable loads.

Uses CPU unless caller overrides map_location.

-        checkpoint = torch.load(self.load_path, map_location=self.map_location, weights_only=True)
+        checkpoint = torch.load(
+            self.load_path, map_location=(self.map_location or torch.device("cpu")), weights_only=True
+        )
tests/utils/test_state_cacher.py (1)

30-36: Good switch to DEFAULT_PROTOCOL. Consider protocol matrix.

Add a parametrized case for pickle.HIGHEST_PROTOCOL to guard regressions across protocols.

monai/apps/nnunet/nnunet_bundle.py (4)

136-136: Map pretrained load to CPU.

Prevents accidental CUDA allocs during file IO.

-        state_dict = torch.load(pretrained_model, weights_only=True)
+        state_dict = torch.load(pretrained_model, map_location=torch.device("cpu"), weights_only=True)

390-395: CPU-map nnU-Net checkpoint reads.

Consistent with other loads here.

-    nnunet_checkpoint_final = torch.load(
-        Path(nnunet_model_folder).joinpath(f"fold_{fold}", "checkpoint_final.pth"), weights_only=True
-    )
-    nnunet_checkpoint_best = torch.load(
-        Path(nnunet_model_folder).joinpath(f"fold_{fold}", "checkpoint_best.pth"), weights_only=True
-    )
+    nnunet_checkpoint_final = torch.load(
+        Path(nnunet_model_folder).joinpath(f"fold_{fold}", "checkpoint_final.pth"),
+        map_location=torch.device("cpu"),
+        weights_only=True,
+    )
+    nnunet_checkpoint_best = torch.load(
+        Path(nnunet_model_folder).joinpath(f"fold_{fold}", "checkpoint_best.pth"),
+        map_location=torch.device("cpu"),
+        weights_only=True,
+    )

481-483: CPU-map optional model checkpoint.

-        state_dict = torch.load(model_ckpt, weights_only=True)
+        state_dict = torch.load(model_ckpt, map_location=torch.device("cpu"), weights_only=True)

545-546: CPU-map bundle/monai checkpoint reads.

-    nnunet_checkpoint: dict = torch.load(f"{bundle_root_folder}/models/nnunet_checkpoint.pth", weights_only=True)
+    nnunet_checkpoint: dict = torch.load(
+        f"{bundle_root_folder}/models/nnunet_checkpoint.pth", map_location=torch.device("cpu"), weights_only=True
+    )
-    monai_last_checkpoint: dict = torch.load(
-        f"{bundle_root_folder}/models/fold_{fold}/checkpoint_epoch={final_epoch}.pt", weights_only=True
-    )
+    monai_last_checkpoint: dict = torch.load(
+        f"{bundle_root_folder}/models/fold_{fold}/checkpoint_epoch={final_epoch}.pt",
+        map_location=torch.device("cpu"),
+        weights_only=True,
+    )
-    monai_best_checkpoint: dict = torch.load(
-        f"{bundle_root_folder}/models/fold_{fold}/checkpoint_key_metric={best_key_metric}.pt", weights_only=True
-    )
+    monai_best_checkpoint: dict = torch.load(
+        f"{bundle_root_folder}/models/fold_{fold}/checkpoint_key_metric={best_key_metric}.pt",
+        map_location=torch.device("cpu"),
+        weights_only=True,
+    )

Also applies to: 556-557, 569-570

monai/data/meta_tensor.py (1)

613-615: Allowlist looks right; add LazyAttr too for pending-ops safety.

Pending ops frequently store LazyAttr keys; without allowlisting LazyAttr, loads may fail for objects carrying pending metadata. Extend the list.

Apply:

-if hasattr(torch.serialization, "add_safe_globals"):
-    torch.serialization.add_safe_globals([MetaObj, MetaTensor, MetaKeys, SpaceKeys])
+if hasattr(torch.serialization, "add_safe_globals"):
+    torch.serialization.add_safe_globals([MetaObj, MetaTensor, MetaKeys, SpaceKeys, LazyAttr])
tests/data/meta_tensor/test_meta_tensor.py (1)

248-248: Gate weights_only kwarg for older PyTorch.

Directly passing weights_only breaks on torch versions that don’t support the kwarg. Make the test tolerant.

Patch in-place:

-            m2 = torch.load(fname, weights_only=True)
+            import inspect
+            if "weights_only" in inspect.signature(torch.load).parameters:
+                m2 = torch.load(fname, weights_only=True)
+            else:
+                m2 = torch.load(fname)

If your CI matrix is 2.6+ only, feel free to skip this change; otherwise this avoids TypeError on older torch.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge Base: Disabled due to Reviews > Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between b5bc69d and 5f1f57c.

📒 Files selected for processing (7)
  • monai/apps/nnunet/nnunet_bundle.py (8 hunks)
  • monai/data/dataset.py (2 hunks)
  • monai/data/meta_tensor.py (1 hunks)
  • monai/handlers/checkpoint_loader.py (1 hunks)
  • monai/utils/state_cacher.py (1 hunks)
  • tests/data/meta_tensor/test_meta_tensor.py (1 hunks)
  • tests/utils/test_state_cacher.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/data/meta_tensor.py
  • tests/data/meta_tensor/test_meta_tensor.py
  • tests/utils/test_state_cacher.py
  • monai/utils/state_cacher.py
  • monai/data/dataset.py
  • monai/handlers/checkpoint_loader.py
  • monai/apps/nnunet/nnunet_bundle.py
🪛 Ruff (0.12.2)
monai/utils/state_cacher.py

127-127: Unused lambda argument: location

(ARG005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: build-docs
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: flake8-py3 (pytype)

@MattTheCuber
Contributor

This MR fixed part of #8565

@ericspod ericspod mentioned this pull request Sep 12, 2025
53 tasks
Signed-off-by: Eric Kerfoot <[email protected]>
…ing to tensors before saving.

Signed-off-by: Eric Kerfoot <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
monai/data/dataset.py (1)

376-389: Fix misleading comment and treat legacy/incompatible cache files as misses.

Comment says weights_only=False but code uses True; also handle UnpicklingError to auto-rebuild.

-            try:
-                # Loading with weights_only=False is expected to be safe as these should be the user's own cached data
-                return torch.load(hashfile, weights_only=True)
+            try:
+                # Load with weights_only=True for safety. Legacy cache files created with older MONAI/PyTorch
+                # may be incompatible; they will be deleted and recomputed below.
+                return torch.load(hashfile, weights_only=True)
             except PermissionError as e:
                 if sys.platform != "win32":
                     raise e
-            except RuntimeError as e:
-                if "Invalid magic number; corrupt file" in str(e):
+            except (pickle.UnpicklingError, RuntimeError) as e:
+                # Treat incompatible or corrupt cache as a cache miss.
+                if "Invalid magic number; corrupt file" in str(e) or isinstance(e, pickle.UnpicklingError):
                     warnings.warn(f"Corrupt cache file detected: {hashfile}. Deleting and recomputing.")
                     hashfile.unlink()
                 else:
                     raise e
♻️ Duplicate comments (1)
monai/data/dataset.py (1)

210-213: Tighten wording; capitalize NumPy; add legacy-cache note.

Clarify allowed types and call out that old cache files may need deletion.

-        Cached data is expected to be tensors, primitives, or dictionaries keying to these values. Numpy arrays will
-        be converted to tensors, however any other object type returned by transforms will not be loadable since
-        `torch.load` will be used with `weights_only=True` to prevent loading of potentially malicious objects.
+        Cached data is expected to be tensors, primitives, or dictionaries mapping to these types. NumPy arrays are
+        converted to tensors; other object types returned by transforms are not loadable because `torch.load` uses
+        `weights_only=True` to prevent deserializing arbitrary objects. If you have legacy cache files created before
+        this change, delete them to force a rebuild.
🧹 Nitpick comments (1)
monai/data/dataset.py (1)

1678-1678: Surface a clearer error when meta cache fails to load.

Wrap to emit an actionable message suggesting cache cleanup.

-            return torch.load(self.cache_dir / meta_hash_file_name, weights_only=True)
+            try:
+                return torch.load(self.cache_dir / meta_hash_file_name, weights_only=True)
+            except (pickle.UnpicklingError, RuntimeError) as e:
+                raise RuntimeError(
+                    f"Failed to load meta cache '{meta_hash_file_name}' with weights_only=True. "
+                    f"Please delete cached files under '{self.cache_dir}' and retry."
+                ) from e
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 93a5dd1 and 64221d0.

📒 Files selected for processing (1)
  • monai/data/dataset.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/data/dataset.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: build-docs
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: quick-py3 (macOS-latest)
🔇 Additional comments (2)
monai/data/dataset.py (2)

400-400: Good change: tensorizing cached payload makes weights-only reload safe.

This aligns cache writes with safe loads.


1658-1659: Confirm meta contents are tensor/primitives-only before tensorization.

If MetaTensor.meta ever carries non-primitive objects, weights-only load will fail. Add/verify a unit test covering typical meta (affine, spacing, orientation, strings, numbers).
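A check along the suggested lines could be sketched as follows, with hypothetical metadata keys chosen to mimic typical MetaTensor meta (affine, spacing, orientation, strings, numbers):

```python
import io

import torch

# Typical metadata: tensors, tuples of floats, strings, and ints should all
# survive a weights-only round trip.
meta = {
    "affine": torch.eye(4),
    "spacing": (1.0, 1.0, 1.0),
    "orientation": "RAS",
    "original_channel_dim": 0,
}
buf = io.BytesIO()
torch.save(meta, buf)
buf.seek(0)
restored = torch.load(buf, weights_only=True)
```

If meta ever carried a non-primitive object, the load above would raise rather than silently degrade, which is exactly what such a unit test should catch.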

Signed-off-by: Eric Kerfoot <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
monai/data/dataset.py (3)

377-389: Harden cache-read path; preserve traceback; handle legacy/unsafe payloads

  • Use bare raise (TRY201).
  • Treat EOFError and “weights-only” failures as cache-miss → delete and recompute.
  • Optionally pin to CPU on load.
-            try:
-                return torch.load(hashfile, weights_only=True)
-            except PermissionError as e:
-                if sys.platform != "win32":
-                    raise e
-            except (pickle.UnpicklingError, RuntimeError) as e:  # corrupt or unloadable cached files are recomputed
-                if "Invalid magic number; corrupt file" in str(e) or isinstance(e, pickle.UnpicklingError):
-                    warnings.warn(f"Corrupt cache file detected: {hashfile}. Deleting and recomputing.")
-                    hashfile.unlink()
-                else:
-                    raise e
+            try:
+                return torch.load(hashfile, map_location="cpu", weights_only=True)
+            except PermissionError as e:
+                if sys.platform != "win32":
+                    raise
+            except (pickle.UnpicklingError, EOFError, RuntimeError) as e:
+                msg = str(e)
+                if (
+                    isinstance(e, pickle.UnpicklingError)
+                    or "Invalid magic number; corrupt file" in msg
+                    or "weights only load failed" in msg.lower()
+                    or "only tensors and" in msg.lower()
+                ):
+                    warnings.warn(f"Invalid or legacy cache detected: {hashfile}. Deleting and recomputing.")
+                    try:
+                        hashfile.unlink()
+                    except FileNotFoundError:
+                        pass
+                else:
+                    raise

1658-1662: Don’t tensorize metadata dict; keep shape as tuple for reshape

Tensorizing the meta dict turns shape into a Tensor, breaking downstream reshape expectations. Save plain Python primitives.

-                torch.save(
-                    obj=convert_to_tensor(self._meta_cache[meta_hash_file_name]),
+                torch.save(
+                    obj=self._meta_cache[meta_hash_file_name],
                     f=temp_hash_file,
                     pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
                     pickle_protocol=self.pickle_protocol,
                 )

1606-1614: Fix list-of-dicts GPU path: wrong reshape target and wrong value saved

Two correctness bugs:

  • Line 1612 reshapes item[i] (a dict) instead of the loaded array.
  • Lines 1634–1639 save the whole dict _item instead of the per-key value.
@@
-                    for i, _item in enumerate(item_transformed):
-                        for k in _item:
+                    for i, _item in enumerate(item_transformed):
+                        for k in _item:
                             meta_i_k = self._load_meta_cache(meta_hash_file_name=f"{hashfile.name}-{k}-meta-{i}")
                             item_k = kvikio_numpy.fromfile(
                                 f"{hashfile}-{k}-{i}", dtype=meta_i_k["dtype"], like=cp.empty(())
                             )
-                            item_k = convert_to_tensor(item[i].reshape(meta_i_k["shape"]), device=f"cuda:{self.device}")
+                            shape = tuple(meta_i_k["shape"]) if not isinstance(meta_i_k["shape"], tuple) else meta_i_k["shape"]
+                            item_k = convert_to_tensor(item_k.reshape(shape), device=f"cuda:{self.device}")
                             item[i].update({k: item_k, f"{k}_meta_dict": meta_i_k})
@@
-            for i, _item in enumerate(_item_transformed):
-                for k in _item:
+            for i, _item in enumerate(_item_transformed):
+                for k, v in _item.items():
                     data_hashfile = f"{hashfile}-{k}-{i}"
                     meta_hash_file_name = f"{hashfile.name}-{k}-meta-{i}"
-                    self._create_new_cache(_item, data_hashfile, meta_hash_file_name)
+                    self._create_new_cache(v, data_hashfile, meta_hash_file_name)

Also applies to: 1634-1639

🧹 Nitpick comments (2)
monai/data/dataset.py (2)

210-213: Tighten wording; confirm intended policy for PersistentDataset

Minor phrasing/capitalization. Also, this class now uses weights_only=True, which conflicts with the PR summary stating PersistentDataset would retain non-weights-only. Please confirm the intent.

-        Cached data is expected to be tensors, primitives, or dictionaries keying to these values. Numpy arrays will
-        be converted to tensors, however any other object type returned by transforms will not be loadable since
-        `torch.load` will be used with `weights_only=True` to prevent loading of potentially malicious objects.
-        Legacy cache files may not be loadable and may need to be recomputed.
+        Cached data must be tensors, simple Python primitives, or dicts mapping to these values. NumPy arrays will
+        be converted to tensors; any other object type returned by transforms will not be loadable because
+        `torch.load` is invoked with `weights_only=True` to prevent loading potentially malicious objects.
+        Legacy cache files may not be loadable and may need to be recomputed.

263-263: Doc typo: TraceKeys

Use the correct constant casing in the doc.

-            reset_ops_id: whether to set `TraceKeys.ID` to ``Tracekys.NONE``, defaults to ``True``.
+            reset_ops_id: whether to set `TraceKeys.ID` to ``TraceKeys.NONE``, defaults to ``True``.
-            reset_ops_id: whether to set `TraceKeys.ID` to ``Tracekeys.NONE``, defaults to ``True``.
+            reset_ops_id: whether to set `TraceKeys.ID` to ``TraceKeys.NONE``, defaults to ``True``.

Also applies to: 564-565

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 64221d0 and 8a75795.

📒 Files selected for processing (1)
  • monai/data/dataset.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/data/dataset.py
🪛 Ruff (0.12.2)
monai/data/dataset.py

382-382: Use raise without specifying exception name

Remove exception name

(TRY201)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: build-docs
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-os (macOS-latest)
🔇 Additional comments (3)
monai/data/dataset.py (3)

400-404: Confirm intentional metadata drop

convert_to_tensor(_item_transformed) will drop MetaTensor metadata. That matches the docstring but is a behavior change. Confirm this is desired for PersistentDataset caches.


1678-1678: LGTM: safe load for meta cache

Weights-only load here is consistent with saving plain primitives. Good.


371-405: Add tests for cache compatibility and GPU list-of-dicts path

Please add unit tests to cover:

  • Legacy cache files containing non-tensor objects → load error → file deleted → recompute path.
  • GDSDataset list-of-dicts GPU branch (per-key save/load, reshape correctness).
  • PersistentDataset behavior change (no MetaTensor meta persisted).
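The first bullet (legacy cache with non-tensor objects → load error → file deleted → recompute) could be exercised with a self-contained sketch like this. It does not use MONAI's classes; the cache-miss handling merely mirrors the pattern discussed in this review:

```python
import pickle
import tempfile
import warnings
from pathlib import Path

import torch

class Legacy:
    """Stand-in for an arbitrary object a legacy cache might contain."""

def load_or_recompute(hashfile: Path, compute):
    """Load a cache file weights-only; on failure, delete it and recompute."""
    if hashfile.exists():
        try:
            return torch.load(hashfile, weights_only=True)
        except (pickle.UnpicklingError, RuntimeError):
            warnings.warn(f"Invalid or legacy cache detected: {hashfile}. Deleting and recomputing.")
            hashfile.unlink()
    result = compute()
    torch.save(result, hashfile)
    return result

with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "cache.pt"
    torch.save({"obj": Legacy()}, f)  # legacy-style cache holding a pickled object
    # The weights-only load fails, the stale file is removed, and the
    # recomputed (tensor-only) result is cached in its place.
    out = load_or_recompute(f, lambda: {"obj": torch.zeros(2)})
```

A real test would additionally assert on the emitted warning and on the GDSDataset/PersistentDataset branches named above.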

Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
monai/data/dataset.py (3)

380-389: Handle weights-only failures and preserve traceback.

  • Re-raising with raise e drops the original traceback. Use bare raise.
  • Treat “Weights only load failed” as a recompute signal to avoid hard failures on legacy caches.

Apply:

-            except PermissionError as e:
-                if sys.platform != "win32":
-                    raise e
-            except (pickle.UnpicklingError, RuntimeError) as e:  # corrupt or unloadable cached files are recomputed
-                if "Invalid magic number; corrupt file" in str(e) or isinstance(e, pickle.UnpicklingError):
+            except PermissionError:
+                if sys.platform != "win32":
+                    raise
+            except (pickle.UnpicklingError, RuntimeError) as e:  # corrupt or unloadable cached files are recomputed
+                if (
+                    "Invalid magic number; corrupt file" in str(e)
+                    or "Weights only load failed" in str(e)
+                    or isinstance(e, pickle.UnpicklingError)
+                ):
                     warnings.warn(f"Corrupt cache file detected: {hashfile}. Deleting and recomputing.")
                     hashfile.unlink()
                 else:
                     raise e

1623-1627: Bug: wrong variable in reshape causes exception.

Should reshape item_k, not item[i].

Apply:

-                            item_k = convert_to_tensor(item[i].reshape(meta_i_k["shape"]), device=f"cuda:{self.device}")
+                            item_k = convert_to_tensor(item_k.reshape(meta_i_k["shape"]), device=f"cuda:{self.device}")

1649-1651: Bug: writing wrong value to cache for list-of-dicts.

Pass _item[k] (the tensor/array), not the dict _item.

Apply:

-                    self._create_new_cache(_item, data_hashfile, meta_hash_file_name)
+                    self._create_new_cache(_item[k], data_hashfile, meta_hash_file_name)
🧹 Nitpick comments (3)
monai/utils/type_conversion.py (1)

120-121: Clarify and harden convert_numeric behavior.

  • The recursion doesn’t thread convert_numeric, so nested numerics in lists/tuples/dicts will always tensorize. If intentional, please document explicitly; otherwise, pass the flag through recursion.

Apply if you want nested behavior to mirror the top-level flag:

-    elif isinstance(data, list):
-        list_ret = [convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta) for i in data]
+    elif isinstance(data, list):
+        list_ret = [convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta, convert_numeric=convert_numeric) for i in data]
@@
-    elif isinstance(data, tuple):
-        tuple_ret = tuple(convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta) for i in data)
+    elif isinstance(data, tuple):
+        tuple_ret = tuple(convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta, convert_numeric=convert_numeric) for i in data)
@@
-    elif isinstance(data, dict):
-        return {k: convert_to_tensor(v, dtype=dtype, device=device, track_meta=track_meta) for k, v in data.items()}
+    elif isinstance(data, dict):
+        return {k: convert_to_tensor(v, dtype=dtype, device=device, track_meta=track_meta, convert_numeric=convert_numeric) for k, v in data.items()}

Also applies to: 140-141, 172-173

monai/data/dataset.py (2)

593-594: Avoid print in library code; use logger.

Use logging instead of stdout.

Apply:

-        print(f"Accessing lmdb file: {self.db_file.absolute()}.")
+        logging.getLogger(__name__).info("Accessing lmdb file: %s", self.db_file.absolute())

400-405: Consistency: use the configured pickle module in all torch.save calls.

Some saves use pickle_module; keep consistent to avoid environment-specific issues.

Apply:

-                torch.save(
+                torch.save(
                     obj=convert_to_tensor(_item_transformed, convert_numeric=False),
                     f=temp_hash_file,
-                    pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
+                    pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
                     pickle_protocol=self.pickle_protocol,
                 )

(ensure pickle_module is passed everywhere torch.save is used)

Also applies to: 1670-1675

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 8a75795 and 14e5e6b.

📒 Files selected for processing (4)
  • monai/data/__init__.py (1 hunks)
  • monai/data/dataset.py (9 hunks)
  • monai/data/utils.py (4 hunks)
  • monai/utils/type_conversion.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/data/__init__.py
  • monai/utils/type_conversion.py
  • monai/data/utils.py
  • monai/data/dataset.py
🪛 Ruff (0.12.2)
monai/data/dataset.py

383-383: Use raise without specifying exception name

Remove exception name

(TRY201)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: build-docs
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: flake8-py3 (codeformat)
🔇 Additional comments (2)
monai/data/utils.py (1)

502-504: Pickle path removed; update docs and ensure invertibility paths don’t regress.

Commenting out config.USE_META_DICT and pickle_operations changes collation/decollation behavior. Confirm docs and tests cover applied_operations/inversion flows without the pickle round-trip.

Also applies to: 656-666

monai/data/dataset.py (1)

211-215: Docs conflict with PR summary; align the policy.

Text says PersistentDataset uses weights_only=True; PR summary claims Persistent datasets keep non-weights-only loads. Please reconcile and update docs/implementation accordingly.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
monai/data/dataset.py (1)

604-613: Align LMDB safe (de)serialization with PersistentDataset and pickle module registry.

Use the registered pickle module, avoid converting numerics to tensors, and be explicit about CPU on load.

-    def _safe_serialize(self,val):
-        out=BytesIO()
-        torch.save(convert_to_tensor(val), out, pickle_protocol =self.pickle_protocol)
-        out.seek(0)
-        return out.read()
+    def _safe_serialize(self, val):
+        out = BytesIO()
+        torch.save(
+            convert_to_tensor(val, convert_numeric=False),
+            out,
+            pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
+            pickle_protocol=self.pickle_protocol,
+        )
+        out.seek(0)
+        return out.read()
 
-    def _safe_deserialize(self,val):
-        out=BytesIO(val)
-        return torch.load(out,weights_only=True)
+    def _safe_deserialize(self, val):
+        out = BytesIO(val)
+        return torch.load(out, map_location="cpu", weights_only=True)

Also consider brief Google-style docstrings for these two helpers.
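The security property that `weights_only=True` enforces is conceptually the stdlib's restricted-Unpickler pattern: only an allow-list of globals may be reconstructed, so a payload referencing an arbitrary callable fails to load while plain containers of primitives still round-trip. A minimal sketch of that pattern (illustrative names, not MONAI's implementation):

```python
import io
import pickle


class SafeUnpickler(pickle.Unpickler):
    """Unpickler that only permits an explicit allow-list of globals."""

    ALLOWED = {("builtins", "dict"), ("builtins", "list")}

    def find_class(self, module, name):
        # called for every global reference in the pickle stream
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    return SafeUnpickler(io.BytesIO(data)).load()


# containers of primitives need no global lookups and round-trip fine
payload = pickle.dumps({"shape": [2, 3], "dtype": "float32"})
assert safe_loads(payload) == {"shape": [2, 3], "dtype": "float32"}

# anything referencing an arbitrary global is rejected at load time
evil = pickle.dumps(print)  # stands in for a malicious callable
try:
    safe_loads(evil)
    raised = False
except pickle.UnpicklingError:
    raised = True
assert raised
```

This is why legacy caches holding non-tensor objects stop loading once `weights_only=True` is enforced: the unpickler simply refuses the globals those objects require.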

🧹 Nitpick comments (5)
monai/utils/state_cacher.py (1)

127-131: Use explicit CPU map_location and fix Ruff ARG005.

Avoid the unused lambda arg and be explicit about CPU loads.

-        data_obj = torch.load(fn, map_location=lambda storage, location: storage, weights_only=True)
+        data_obj = torch.load(fn, map_location="cpu", weights_only=True)
monai/data/dataset.py (4)

211-215: Nit: wording/typo.

Use “NumPy” and tighten punctuation.

-        Cached data is expected to be tensors, primitives, or dictionaries keying to these values. Numpy arrays will
-        be converted to tensors, however any other object type returned by transforms will not be loadable since
+        Cached data is expected to be tensors, primitives, or dictionaries keying to these values. NumPy arrays will
+        be converted to tensors; however, any other object type returned by transforms will not be loadable since
         `torch.load` will be used with `weights_only=True` to prevent loading of potentially malicious objects.
         Legacy cache files may not be loadable and may need to be recomputed.

380-380: Prefer explicit CPU load.

Be explicit to avoid accidental device restores when reading caches.

-                return torch.load(hashfile, weights_only=True)
+                return torch.load(hashfile, map_location="cpu", weights_only=True)

383-383: Use bare raise to preserve traceback.

Fixes Ruff TRY201 and keeps context.

-                    raise e
+                    raise

1692-1693: Be explicit about CPU when loading meta.

Avoid any accidental GPU allocations on metadata load.

-            return torch.load(self.cache_dir / meta_hash_file_name, weights_only=True)
+            return torch.load(self.cache_dir / meta_hash_file_name, map_location="cpu", weights_only=True)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 14e5e6b and 77d6992.

📒 Files selected for processing (2)
  • monai/data/dataset.py (13 hunks)
  • monai/utils/state_cacher.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/data/dataset.py
  • monai/utils/state_cacher.py
🪛 Ruff (0.12.2)
monai/data/dataset.py

383-383: Use raise without specifying exception name

Remove exception name

(TRY201)

monai/utils/state_cacher.py

127-127: Unused lambda argument: location

(ARG005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: packaging
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: flake8-py3 (codeformat)
🔇 Additional comments (6)
monai/utils/state_cacher.py (1)

67-69: Docstring tweak LGTM.

Accurately documents DEFAULT_PROTOCOL usage.

monai/data/dataset.py (5)

15-18: Imports LGTM.

BytesIO and UnpicklingError are used below.


401-405: Safe save path LGTM.

Consistent with safer tensor-based serialization and DEFAULT_PROTOCOL.


540-541: Default protocol doc LGTM.

Matches torch.serialization.DEFAULT_PROTOCOL.


685-687: Use helpers — good.

Switch to _safe_deserialize is consistent with the new policy.


211-215: Confirm intent: persistent datasets now use weights_only=True

monai/data/dataset.py (lines 211–215) and docs use weights_only=True, but the PR summary claims persistent datasets remain non-weights-only — confirm intended behavior and update the PR description/docs for consistency. Found torch.load calls without weights_only in tests/bundle/test_bundle_download.py:{335,339,420,424} and monai/apps/nnunet/nnunet_bundle.py:{184,197,390,393,555,568} — reconcile or document exceptions.

@ericspod ericspod changed the title Torch Safe Load Fix Torch and Pickle Safe Load Fix Sep 14, 2025
@ericspod ericspod changed the title Torch and Pickle Safe Load Fix Torch and Pickle Safe Load Fixes Sep 14, 2025
Signed-off-by: Eric Kerfoot <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
monai/utils/type_conversion.py (1)

172-181: Propagate convert_numeric into recursive calls.

Currently nested numerics still convert even when convert_numeric=False. Propagate the flag for consistency.

Apply this diff:

-    elif isinstance(data, list):
-        list_ret = [convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta) for i in data]
+    elif isinstance(data, list):
+        list_ret = [
+            convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta, convert_numeric=convert_numeric)
+            for i in data
+        ]
         return _convert_tensor(list_ret, dtype=dtype, device=device) if wrap_sequence else list_ret
-    elif isinstance(data, tuple):
-        tuple_ret = tuple(convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta) for i in data)
+    elif isinstance(data, tuple):
+        tuple_ret = tuple(
+            convert_to_tensor(i, dtype=dtype, device=device, track_meta=track_meta, convert_numeric=convert_numeric)
+            for i in data
+        )
         return _convert_tensor(tuple_ret, dtype=dtype, device=device) if wrap_sequence else tuple_ret
-    elif isinstance(data, dict):
-        return {k: convert_to_tensor(v, dtype=dtype, device=device, track_meta=track_meta) for k, v in data.items()}
+    elif isinstance(data, dict):
+        return {
+            k: convert_to_tensor(v, dtype=dtype, device=device, track_meta=track_meta, convert_numeric=convert_numeric)
+            for k, v in data.items()
+        }
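The pattern being requested here — threading a keyword argument through recursive calls so nested values honour the caller's choice — in a standalone sketch (hypothetical `convert` function, not the MONAI signature):

```python
def convert(data, to_float=True):
    """Recursively convert numerics, honouring ``to_float`` at every depth."""
    if isinstance(data, bool):
        return data  # bools are ints in Python; leave them alone
    if isinstance(data, (int, float)):
        return float(data) if to_float else data
    if isinstance(data, list):
        # the flag is passed through, so nested numerics obey it too
        return [convert(v, to_float=to_float) for v in data]
    if isinstance(data, tuple):
        return tuple(convert(v, to_float=to_float) for v in data)
    if isinstance(data, dict):
        return {k: convert(v, to_float=to_float) for k, v in data.items()}
    return data


# nested values now mirror the top-level flag instead of always converting
assert convert({"a": [1, (2, 3)]}) == {"a": [1.0, (2.0, 3.0)]}
assert convert({"a": [1, (2, 3)]}, to_float=False) == {"a": [1, (2, 3)]}
```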
monai/data/dataset.py (1)

1621-1627: Bug: reshape uses dict instead of array (crashes).

Should reshape item_k, not item[i].

Apply:

-                            item_k = convert_to_tensor(item[i].reshape(meta_i_k["shape"]), device=f"cuda:{self.device}")
+                            item_k = convert_to_tensor(item_k.reshape(meta_i_k["shape"]), device=f"cuda:{self.device}")
♻️ Duplicate comments (2)
monai/data/utils.py (1)

58-61: Restoring PICKLE_KEY_SUFFIX for BC (deprecated alias).

Removing this symbol is breaking; PR is labeled non‑breaking. Restore it in all and as an alias to TraceKeys.KEY_SUFFIX with a deprecation note.

Suggested changes (outside the current diff):

 __all__ = [
     "AFFINE_TOL",
     "SUPPORTED_PICKLE_MOD",
+    "PICKLE_KEY_SUFFIX",
     ...
 ]
 
+# Deprecated alias for backward compatibility; remove in a future major release.
+PICKLE_KEY_SUFFIX = TraceKeys.KEY_SUFFIX  # deprecated
monai/data/dataset.py (1)

604-612: Honor pickle_module and avoid converting top‑level numerics.

Use the registered pickle module and convert_numeric=False; also mirror pickle_module on load.

Apply:

-    def _safe_serialize(self, val):
-        out = BytesIO()
-        torch.save(convert_to_tensor(val), out, pickle_protocol=self.pickle_protocol)
-        out.seek(0)
-        return out.read()
+    def _safe_serialize(self, val):
+        out = BytesIO()
+        torch.save(
+            convert_to_tensor(val, convert_numeric=False),
+            out,
+            pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
+            pickle_protocol=self.pickle_protocol,
+        )
+        out.seek(0)
+        return out.read()
 
-    def _safe_deserialize(self, val):
-        return torch.load(BytesIO(val), map_location="cpu", weights_only=True)
+    def _safe_deserialize(self, val):
+        return torch.load(
+            BytesIO(val),
+            map_location="cpu",
+            weights_only=True,
+            pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
+        )
🧹 Nitpick comments (3)
monai/utils/type_conversion.py (1)

140-142: Docstring: clarify scope of convert_numeric.

State whether convert_numeric applies recursively. If we propagate it (see next comment), update the docstring accordingly.

Apply:

-        convert_numeric: if `True`, convert numeric Python values to tensors.
+        convert_numeric: if `True`, convert numeric Python values to tensors. This flag is propagated
+            recursively into nested lists/tuples/dicts for consistent behavior.
monai/data/utils.py (1)

419-426: Typo: “collage_meta_tensor” → “collate_meta_tensor”.

Fix docstring reference.

Apply:

-    Collate a sequence of meta tensor into a single batched metatensor. This is called by `collage_meta_tensor`
+    Collate a sequence of meta tensor into a single batched metatensor. This is called by `collate_meta_tensor`
monai/data/dataset.py (1)

1691-1692: Pass configured pickle_module when loading meta cache.

Ensure symmetry with save and support custom pickle modules.

Apply:

-            return torch.load(self.cache_dir / meta_hash_file_name, weights_only=True)
+            return torch.load(
+                self.cache_dir / meta_hash_file_name,
+                weights_only=True,
+                pickle_module=look_up_option(self.pickle_module, SUPPORTED_PICKLE_MOD),
+                map_location="cpu",
+            )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 77d6992 and 11c0ee5.

📒 Files selected for processing (4)
  • monai/data/__init__.py (0 hunks)
  • monai/data/dataset.py (13 hunks)
  • monai/data/utils.py (2 hunks)
  • monai/utils/type_conversion.py (3 hunks)
💤 Files with no reviewable changes (1)
  • monai/data/__init__.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/data/utils.py
  • monai/utils/type_conversion.py
  • monai/data/dataset.py
🪛 Ruff (0.12.2)
monai/data/dataset.py

383-383: Use raise without specifying exception name

Remove exception name

(TRY201)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: build-docs
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-py3 (3.10)
🔇 Additional comments (3)
monai/data/dataset.py (3)

211-215: PersistentDataset docs conflict with PR summary—confirm intent.

Docs say caches load with weights_only=True; the PR summary states PersistentDataset should allow full unpickling (weights_only=False). Reconcile and align code/docs.

If keeping weights_only=True, update the PR description and migration notes; if reverting, change the load sites accordingly and adjust the docstring.


401-405: LGTM: safe save path.

Saving via convert_to_tensor(..., convert_numeric=False) with configured pickle module and protocol looks correct.


684-686: LGTM: LMDB cache uses safe deserialization.

Loading via weights_only=True is aligned with the new strategy.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
monai/utils/type_conversion.py (2)

127-142: Docstring: clarify scope and edge cases.

Tighten wording and document wrap_sequence interaction and NumPy scalars.

@@
-        dtype: target data type to when converting to Tensor.
+        dtype: target data type when converting to Tensor.
@@
-        convert_numeric: if `True`, convert numeric Python values to tensors.
+        convert_numeric: if `True`, convert numeric Python values (int, float, bool) to tensors.
+            Note: NumPy scalars (e.g., np.float32(1)) are also treated as numeric when converting.
+            Note: If `wrap_sequence=True`, sequences may become a single tensor regardless of `convert_numeric`.

179-186: wrap_sequence x convert_numeric nuance—document or assert.

With wrap_sequence=True, a list/tuple of numerics becomes a single tensor even if convert_numeric=False. If this is intentional, add a short note to the docstring; otherwise, gate the wrap with convert_numeric.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 11c0ee5 and 2edf46c.

📒 Files selected for processing (1)
  • monai/utils/type_conversion.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/utils/type_conversion.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: build-docs
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-os (windows-latest)
🔇 Additional comments (1)
monai/utils/type_conversion.py (1)

162-164: Good: propagate dtype/device/track_meta/convert_numeric into recursion.

Excluding wrap_sequence here is correct to avoid collapsing nested sequences.

Signed-off-by: Eric Kerfoot <[email protected]>
@KumoLiu
Contributor

KumoLiu commented Sep 16, 2025

/build

2 similar comments
@KumoLiu
Contributor

KumoLiu commented Sep 16, 2025

/build

@KumoLiu
Contributor

KumoLiu commented Sep 16, 2025

/build

@KumoLiu
Contributor

KumoLiu commented Sep 16, 2025


[2025-09-16T12:56:49.390Z] ======================================================================

[2025-09-16T12:56:49.390Z] ERROR: test_cache (tests.data.test_gdsdataset.TestDataset)

[2025-09-16T12:56:49.390Z] testing no inplace change to the hashed item

[2025-09-16T12:56:49.390Z] ----------------------------------------------------------------------

[2025-09-16T12:56:49.390Z] Traceback (most recent call last):

[2025-09-16T12:56:49.390Z]   File "/home/jenkins/agent/workspace/MONAI-premerge/monai/tests/data/test_gdsdataset.py", line 93, in test_cache

[2025-09-16T12:56:49.390Z]     assert_allclose(ds[0], ds1[0], type_test=False)

[2025-09-16T12:56:49.390Z]   File "/home/jenkins/agent/workspace/MONAI-premerge/monai/monai/data/dataset.py", line 109, in __getitem__

[2025-09-16T12:56:49.390Z]     return self._transform(index)

[2025-09-16T12:56:49.390Z]   File "/home/jenkins/agent/workspace/MONAI-premerge/monai/monai/data/dataset.py", line 418, in _transform

[2025-09-16T12:56:49.390Z]     pre_random_item = self._cachecheck(self.data[index])

[2025-09-16T12:56:49.390Z]   File "/home/jenkins/agent/workspace/MONAI-premerge/monai/monai/data/dataset.py", line 1611, in _cachecheck

[2025-09-16T12:56:49.390Z]     _meta = self._load_meta_cache(meta_hash_file_name=f"{hashfile.name}-meta")

[2025-09-16T12:56:49.390Z]   File "/home/jenkins/agent/workspace/MONAI-premerge/monai/monai/data/dataset.py", line 1691, in _load_meta_cache

[2025-09-16T12:56:49.390Z]     return torch.load(self.cache_dir / meta_hash_file_name, weights_only=True)

[2025-09-16T12:56:49.390Z]   File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1274, in load

[2025-09-16T12:56:49.390Z]     raise pickle.UnpicklingError(_get_wo_message(str(e))) from None

[2025-09-16T12:56:49.390Z] _pickle.UnpicklingError: Weights only load failed. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.

[2025-09-16T12:56:49.390Z] Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Unsupported operand 149

[2025-09-16T12:56:49.390Z] 

[2025-09-16T12:56:49.390Z] Check the documentation of torch.load to learn more about types accepted by default with weights_only 

cc @ericspod

@ericspod
Member Author

This is likely related to which pickle protocol is used; I'm changing the defaults in tests now.
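For context on why the protocol default matters: `pickle.HIGHEST_PROTOCOL` (5 on Python 3.8+) is newer than `pickle.DEFAULT_PROTOCOL`, and `torch.serialization.DEFAULT_PROTOCOL` has historically been 2, so a cache written at the highest protocol may not be what torch's reader expects. A quick stdlib check (no torch required) shows the protocol is recorded in-band at the head of every pickle stream:

```python
import pickle
import pickletools

# a payload written at the highest protocol advertises that protocol in-band
data = pickle.dumps({"weights": [1.0, 2.0]}, protocol=pickle.HIGHEST_PROTOCOL)
opcode, proto_arg, _ = next(pickletools.genops(data))
assert opcode.name == "PROTO"
assert proto_arg == pickle.HIGHEST_PROTOCOL

# the default is never newer than the highest supported protocol
assert pickle.DEFAULT_PROTOCOL <= pickle.HIGHEST_PROTOCOL
```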

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/data/test_gdsdataset.py (1)

88-90: Good switch to torch.serialization.DEFAULT_PROTOCOL; clarify TODO and tie to issue.

The intent is clear; let’s anchor the TODO to the tracked issue and PyTorch behavior for future maintainers.

-                    # TODO: was pickle.HIGHEST_PROTOCOL but this wasn't compatible with torch.load, need to improve compatibility
+                    # TODO(#8355): previously used pickle.HIGHEST_PROTOCOL; incompatible with torch.load(weights_only=True) in PyTorch>=2.6.
+                    # Audit/document supported protocols and add a regression test covering cache read/write under weights_only=True.
tests/data/test_persistentdataset.py (1)

20-20: Import torch solely for DEFAULT_PROTOCOL is acceptable here.

If you want to avoid a heavyweight import in tests, alias once at module scope and reuse:

-import torch
+import torch
+# Optional: TORCH_PICKLE_PROTOCOL = torch.serialization.DEFAULT_PROTOCOL
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9b171d4 and dd1de4a.

📒 Files selected for processing (2)
  • tests/data/test_gdsdataset.py (1 hunks)
  • tests/data/test_persistentdataset.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • tests/data/test_gdsdataset.py
  • tests/data/test_persistentdataset.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: build-docs
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: quick-py3 (ubuntu-latest)

@KumoLiu
Contributor

KumoLiu commented Sep 16, 2025

/build

@ericspod ericspod enabled auto-merge (squash) September 16, 2025 15:49
@ericspod ericspod merged commit 948fbb7 into Project-MONAI:dev Sep 17, 2025
27 checks passed
@ericspod ericspod deleted the torch_load_fix branch September 17, 2025 10:32

Successfully merging this pull request may close these issues.

Weights only load failed In PyTorch 2.6
3 participants