-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Path traversal issue security fix #8568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Signed-off-by: h3rrr <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: h3rrr <[email protected]>
Co-authored-by: Eric Kerfoot <[email protected]> Signed-off-by: h3rrr <[email protected]>
Path traversal security issue fix Zip Slip Signed-off-by: h3rrr <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: h3rrr <[email protected]>
Changed to previous content, the fix will be filed in a new PR Signed-off-by: h3rrr <[email protected]>
The path traversal issue in the previously submitted vulnerability report has been fixed and has been verified and fixed locally. Signed-off-by: h3rrr <[email protected]>
WalkthroughAdds SECURITY.md with reporting, coordinated disclosure, and acknowledgements. Updates monai/apps/utils.py to introduce an internal safe_extract_member helper and replace archive-wide extractall with per-member, context-managed ZIP and TAR extraction that normalizes paths, disallows absolute or parent-traversal entries, and blocks symlinks/hardlinks while streaming files to disk. Adds tests in tests/apps/test_download_and_extract.py (TestPathTraversalProtection) covering valid extraction and malicious ZIP/TAR cases: path traversal, absolute paths, symlinks, and hardlinks. Public APIs unchanged. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
monai/apps/utils.py (1)
123-147
: Consider custom exception types for better error handling.The current ValueError usage works but custom exceptions would improve error categorization.
class ArchiveSecurityError(Exception): """Raised when archive extraction poses security risks.""" pass def safe_extract_member(member, extract_to): """Securely verify compressed package member paths to prevent path traversal attacks""" # Get member path (handle different compression formats) if hasattr(member, 'filename'): member_path = member.filename # zipfile elif hasattr(member, 'name'): member_path = member.name # tarfile else: member_path = str(member) member_path = os.path.normpath(member_path) if os.path.isabs(member_path) or '..' in member_path.split(os.sep): raise ArchiveSecurityError(f"Unsafe path detected in archive: {member_path}") full_path = os.path.join(extract_to, member_path) full_path = os.path.normpath(full_path) extract_to_abs = os.path.abspath(extract_to) full_path_abs = os.path.abspath(full_path) if not (full_path_abs == extract_to_abs or full_path_abs.startswith(extract_to_abs + os.sep)): raise ArchiveSecurityError(f"Path traversal attack detected: {member_path}") return full_path
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (2)
SECURITY.md
(1 hunks)monai/apps/utils.py
(3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.md
⚙️ CodeRabbit configuration file
Remember that documentation must be updated with the latest information.
Files:
SECURITY.md
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/apps/utils.py
🪛 Ruff (0.12.2)
monai/apps/utils.py
136-136: Avoid specifying long messages outside the exception class
(TRY003)
145-145: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: build-docs
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: packaging
🔇 Additional comments (5)
SECURITY.md (1)
1-19
: Good security policy documentation.Clear structure and appropriate content for vulnerability reporting. Documentation follows best practices for coordinated disclosure.
monai/apps/utils.py (4)
14-15
: Import order updated appropriately.Adding
os
andshutil
imports to support the new path traversal security fixes.
123-147
: Excellent path traversal prevention implementation.Robust security function properly validates archive member paths. Handles both zipfile and tarfile formats correctly. Path normalization and traversal checks are comprehensive.
314-323
: Safe ZIP extraction implementation correctly replaces vulnerable extractall.Secure per-member extraction prevents path traversal attacks. Directory creation and file streaming implementation is correct.
325-336
: Safe TAR extraction implementation correctly replaces vulnerable extractall.Secure per-member extraction with proper file-only filtering. Correct use of
extractfile
and streaming copy to prevent directory traversal attacks.
This PR includes the security file as well but this should go away once we merge the other PR. This happened because you made your |
Hi, @ericspod the pull request for SECURITY.md mentioned the need for testing. |
What I mean to say is that we need unit tests added to the |
Sorry... I chose the wrong branch before. I added some anti-bypass measures to make the decompression operation safer. Signed-off-by: h3rrr <[email protected]>
for more information, see https://pre-commit.ci
New test cases added as required Signed-off-by: h3rrr <[email protected]>
for more information, see https://pre-commit.ci
I understand what you mean. In the latest pull request, I fixed some potential exploits and added the corresponding testing module. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (9)
tests/apps/test_download_and_extract.py (5)
88-103
: Avoid broad exception catch in success-path test (Ruff BLE001).Let the test fail naturally; no need for try/except.
- try: - extractall(str(zip_path), str(extract_dir)) + extractall(str(zip_path), str(extract_dir)) @@ - except Exception as e: - self.fail(f"Valid zip extraction should not raise exception: {e}")
143-157
: Avoid broad exception catch in success-path test (Ruff BLE001).Same here; remove try/except.
- try: - extractall(str(tar_path), str(extract_dir)) + extractall(str(tar_path), str(extract_dir)) @@ - except Exception as e: - self.fail(f"Valid tar extraction should not raise exception: {e}")
219-224
: Make assertion resilient to message variants.Current code raises “Symbolic link …”; test looks for “symlink”. Either keep as-is if you change code to include “symlink” (see utils.py comment), or loosen the check to accept both.
- error_msg = str(context.exception).lower() - self.assertTrue("unsafe path" in error_msg or "symlink" in error_msg) + error_msg = str(context.exception).lower() + self.assertTrue( + any(x in error_msg for x in ("unsafe path", "symlink", "symbolic link")) + )
245-249
: Same resilience for hard link message.Match “hardlink” or “hard link”.
- error_msg = str(context.exception).lower() - self.assertTrue("unsafe path" in error_msg or "hardlink" in error_msg) + error_msg = str(context.exception).lower() + self.assertTrue( + any(x in error_msg for x in ("unsafe path", "hardlink", "hard link")) + )
71-199
: Optional: add a ZIP symlink test for parity with TAR cases.ZIP symlinks can be represented via external_attr bits; add a test to ensure they’re rejected or handled consistently. I can open a follow-up PR if preferred.
def test_malicious_zip_symlink_protection(self): import stat with tempfile.TemporaryDirectory() as tmp_dir: zip_path = Path(tmp_dir) / "malicious_symlink.zip" extract_dir = Path(tmp_dir) / "extract" extract_dir.mkdir() with zipfile.ZipFile(zip_path, "w") as zf: zi = zipfile.ZipInfo("link_to_outside") zi.create_system = 3 # UNIX zi.external_attr = (stat.S_IFLNK | 0o777) << 16 zf.writestr(zi, "../../../etc/passwd") with self.assertRaises(ValueError) as ctx: extractall(str(zip_path), str(extract_dir)) self.assertIn("symlink", str(ctx.exception).lower())monai/apps/utils.py (4)
140-151
: Harden path check using commonpath.
startswith
is okay;os.path.commonpath
is more robust.- if not (full_path_abs == extract_to_abs or full_path_abs.startswith(extract_to_abs + os.sep)): - raise ValueError(f"Path traversal attack detected: {member_path}") + if os.path.commonpath([extract_to_abs, full_path_abs]) != extract_to_abs: + raise ValueError(f"Unsafe path: traversal {member_path}")
123-152
: Docstring: add Args/Raises per guidelines.Enhance discoverability and consistency with project style.
-def safe_extract_member(member, extract_to): - """Securely verify compressed package member paths to prevent path traversal attacks""" +def safe_extract_member(member, extract_to): + """Validate and resolve a safe extraction path for an archive member. + + Args: + member: ZipInfo or TarInfo (or path-like name) representing the archive entry. + extract_to: Directory where the member should be extracted. + + Returns: + str: A normalized path under `extract_to` where the member may be safely written. + + Raises: + ValueError: If the member is a symlink/hardlink, is absolute, attempts traversal, + or would resolve outside of `extract_to`. + """
14-25
: Optional: detect ZIP symlinks via external_attr.ZIP symlinks aren’t caught by TarInfo checks. If desired, block them explicitly.
import os import shutil +import stat @@ def safe_extract_member(member, extract_to): @@ - else: - member_path = str(member) + else: + member_path = str(member) + + # ZIP symlink detection via UNIX mode bits + if hasattr(member, "external_attr"): + mode = (member.external_attr >> 16) & 0xFFFF + if stat.S_IFMT(mode) == stat.S_IFLNK: + raise ValueError(f"Unsafe path: symlink {member_path}")
319-329
: ZIP extraction OK; consider preserving permissions (optional).Current code writes bytes but drops exec bits/timestamps. If that matters, apply mode and mtime from metadata.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (2)
monai/apps/utils.py
(3 hunks)tests/apps/test_download_and_extract.py
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
tests/apps/test_download_and_extract.py
monai/apps/utils.py
🪛 Ruff (0.12.2)
tests/apps/test_download_and_extract.py
101-101: Do not catch blind exception: Exception
(BLE001)
155-155: Do not catch blind exception: Exception
(BLE001)
monai/apps/utils.py
134-134: Avoid specifying long messages outside the exception class
(TRY003)
136-136: Avoid specifying long messages outside the exception class
(TRY003)
141-141: Avoid specifying long messages outside the exception class
(TRY003)
150-150: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: build-docs
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: packaging
🔇 Additional comments (1)
monai/apps/utils.py (1)
331-343
: tar.gz extraction is broken: use transparent read mode.
tarfile.open(..., 'r')
won't read gzip members; tests using .tar.gz will fail. User:*
.- with tarfile.open(filepath, 'r') as tar_file: + with tarfile.open(filepath, 'r:*') as tar_file:Likely an incorrect or invalid review comment.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: h3rrr <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
monai/apps/utils.py (1)
133-136
: Exception messages now match prior review guidance. LGTM.“Unsafe path: symlink … / hardlink …” aligns with tests.
🧹 Nitpick comments (5)
monai/apps/utils.py (5)
123-151
: Path traversal mitigation is solid; add types + full docstring.Give this helper a typed signature and a Google-style docstring per guidelines.
-def safe_extract_member(member, extract_to): - """Securely verify compressed package member paths to prevent path traversal attacks""" +def safe_extract_member(member: Any, extract_to: PathLike) -> str: + """Validate an archive member path and return a safe extraction target. + + Args: + member: A member object from zipfile (ZipInfo) or tarfile (TarInfo). + extract_to: Directory to extract into. + + Returns: + The normalized safe filesystem path for this member inside extract_to. + + Raises: + ValueError: If the member is a symlink/hardlink, absolute path, contains + parent traversal, or resolves outside extract_to. + """
133-136
: Keep exact messages; silence Ruff TRY003 inline.Tests likely depend on these messages; add noqa rather than changing text.
- if hasattr(member, 'issym') and member.issym(): - raise ValueError(f"Unsafe path: symlink {member_path}") - if hasattr(member, 'islnk') and member.islnk(): - raise ValueError(f"Unsafe path: hardlink {member_path}") + if hasattr(member, 'issym') and member.issym(): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + if hasattr(member, 'islnk') and member.islnk(): + raise ValueError(f"Unsafe path: hardlink {member_path}") # noqa: TRY003 @@ - if os.path.isabs(member_path) or '..' in member_path.split(os.sep): - raise ValueError(f"Unsafe path detected in archive: {member_path}") + if os.path.isabs(member_path) or '..' in member_path.split(os.sep): + raise ValueError(f"Unsafe path detected in archive: {member_path}") # noqa: TRY003 @@ - if not (full_path_abs == extract_to_abs or full_path_abs.startswith(extract_to_abs + os.sep)): - raise ValueError(f"Path traversal attack detected: {member_path}") + if not (full_path_abs == extract_to_abs or full_path_abs.startswith(extract_to_abs + os.sep)): + raise ValueError(f"Path traversal attack detected: {member_path}") # noqa: TRY003Also applies to: 139-141, 148-149
126-137
: Optionally detect ZIP symlinks too.Zip symlinks aren’t caught by issym/islnk; check external_attr to reject them early.
if hasattr(member, 'filename'): member_path = member.filename # zipfile @@ - if hasattr(member, 'issym') and member.issym(): + # ZIP symlink detection (posix zips set S_IFLNK in external_attr) + if isinstance(member, zipfile.ZipInfo): + import stat # local import to avoid global dependency if unused + if member.create_system == 3 and stat.S_ISLNK(member.external_attr >> 16): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + + if hasattr(member, 'issym') and member.issym(): raise ValueError(f"Unsafe path: symlink {member_path}")Note: hardlinks don’t exist in ZIP; this keeps behavior consistent while tightening checks.
330-342
: TAR extraction: use r: and restore metadata.*Be explicit about compression auto-detect and preserve mode/mtime.
- with tarfile.open(filepath, 'r') as tar_file: + with tarfile.open(filepath, 'r:*') as tar_file: @@ with open(safe_path, 'wb') as target: shutil.copyfileobj(source, target) + # restore basic metadata + try: + os.chmod(safe_path, member.mode & 0o777) + os.utime(safe_path, (member.mtime, member.mtime)) + except Exception: + pass
318-328
: ZIP extraction: preserve perms/mtime to avoid subtle regressions.Restore basic metadata (unix permission bits and mtime) after writing each zip member so extracted files keep executable bits/timestamps.
Apply the patch at monai/apps/utils.py (around lines 318–328):
with zip_file.open(member) as source: with open(safe_path, 'wb') as target: shutil.copyfileobj(source, target) + # restore basic metadata + perms = (member.external_attr >> 16) & 0o777 + if perms: + os.chmod(safe_path, perms) + from datetime import datetime + try: + mtime = datetime(*member.date_time).timestamp() + os.utime(safe_path, (mtime, mtime)) + except Exception: + passAlso update monai/bundle/scripts.py (uses ZipFile.extractall at ~lines 290–293) to either call this helper or apply equivalent metadata restoration. Run tests that exercise download_and_extract/extractall (tests/apps/test_download_and_extract.py and related integration tests) after the change.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (1)
monai/apps/utils.py
(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/apps/utils.py
🪛 Ruff (0.12.2)
monai/apps/utils.py
134-134: Avoid specifying long messages outside the exception class
(TRY003)
136-136: Avoid specifying long messages outside the exception class
(TRY003)
140-140: Avoid specifying long messages outside the exception class
(TRY003)
149-149: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
monai/apps/utils.py (1)
14-15
: Imports are appropriate.Adding os and shutil is necessary for the secure extraction logic.
|
||
with zipfile.ZipFile(zip_path, 'w') as zf: | ||
# Try to use absolute path | ||
zf.writestr("/etc/passwd", "malicious content") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests all look good, but please change this to be something other than "/etc/passwd" just in case someone somehow overwrites that file. Perhaps use "/etc/passwd_bad" here and elsewhere instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
This looks good now, just run |
Signed-off-by: h3rrr <[email protected]>
Signed-off-by: h3rrr <[email protected]>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
monai/apps/utils.py (1)
135-139
: Standardize messages to include “unsafe path”; silence TRY003 locally.Aligns with tests and prior feedback; avoids linter noise.
- if hasattr(member, "issym") and member.issym(): - raise ValueError(f"Symbolic link detected in archive: {member_path}") - if hasattr(member, "islnk") and member.islnk(): - raise ValueError(f"Hard link detected in archive: {member_path}") + if hasattr(member, "issym") and member.issym(): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + if hasattr(member, "islnk") and member.islnk(): + raise ValueError(f"Unsafe path: hardlink {member_path}") # noqa: TRY003
🧹 Nitpick comments (6)
monai/apps/utils.py (4)
142-144
: Unify absolute/parent check message and quiet TRY003.Keep the check; just normalize the wording.
- if os.path.isabs(member_path) or ".." in member_path.split(os.sep): - raise ValueError(f"Unsafe path detected in archive: {member_path}") + if os.path.isabs(member_path) or ".." in member_path.split(os.sep): + raise ValueError(f"Unsafe path: {member_path}") # noqa: TRY003
125-131
: Docstring + typing: clarify contract of the helper.Add types and Google-style sections per repo guidelines.
-def safe_extract_member(member, extract_to): - """Securely verify compressed package member paths to prevent path traversal attacks""" +def safe_extract_member(member: Any, extract_to: PathLike) -> str: + """ + Securely validate an archive member's target path and return a safe path within `extract_to`. + + Args: + member: zipfile.ZipInfo or tarfile.TarInfo describing the entry. + extract_to: Destination directory for extraction. + + Returns: + The validated filesystem path for this member within `extract_to`. + + Raises: + ValueError: If the member is a symlink/hardlink or resolves outside `extract_to`. + """
322-332
: ZIP extraction looks good; consider a lightweight zip-bomb guard.Optional: cap total uncompressed bytes per archive.
- with zipfile.ZipFile(filepath, "r") as zip_file: - for member in zip_file.infolist(): + with zipfile.ZipFile(filepath, "r") as zip_file: + total_uncompressed = 0 + for member in zip_file.infolist(): safe_path = safe_extract_member(member, output_dir) if member.is_dir(): continue - + total_uncompressed += getattr(member, "file_size", 0) + if total_uncompressed > MAX_TOTAL_UNCOMPRESSED: # define at module level + raise ValueError("Unsafe path: archive too large") # noqa: TRY003 os.makedirs(os.path.dirname(safe_path), exist_ok=True) with zip_file.open(member) as source: with open(safe_path, "wb") as target: shutil.copyfileobj(source, target)Add near the top of this module (outside this hunk):
# Max total uncompressed bytes allowed when extracting archives (defense-in-depth) MAX_TOTAL_UNCOMPRESSED = 2 * 1024 * 1024 * 1024 # 2 GiB
334-346
: Open tar with auto-detected compression.Be explicit; improves readability and handles gz/xz transparently.
- with tarfile.open(filepath, "r") as tar_file: + with tarfile.open(filepath, "r:*") as tar_file:tests/apps/test_download_and_extract.py (2)
74-103
: Drop broad exception catch; let the test fail naturally (fixes BLE001).Catching
Exception
isn’t needed; if it raises, the test already fails.- # This should not raise any exception - try: - extractall(str(zip_path), str(extract_dir)) - - # Verify files were extracted correctly - self.assertTrue((extract_dir / "normal_file.txt").exists()) - self.assertTrue((extract_dir / "subdir" / "nested_file.txt").exists()) - self.assertTrue((extract_dir / "another_file.json").exists()) - - # Verify content - with open(extract_dir / "normal_file.txt") as f: - self.assertEqual(f.read(), "This is a normal file") - - except Exception as e: - self.fail(f"Valid zip extraction should not raise exception: {e}") + extractall(str(zip_path), str(extract_dir)) + # Verify files were extracted correctly + self.assertTrue((extract_dir / "normal_file.txt").exists()) + self.assertTrue((extract_dir / "subdir" / "nested_file.txt").exists()) + self.assertTrue((extract_dir / "another_file.json").exists()) + # Verify content + with open(extract_dir / "normal_file.txt") as f: + self.assertEqual(f.read(), "This is a normal file")
124-157
: Same here: remove blanket catch in the TAR happy-path test.Keeps the test simpler and fixes BLE001.
- # This should not raise any exception - try: - extractall(str(tar_path), str(extract_dir)) - - # Verify files were extracted correctly - self.assertTrue((extract_dir / "normal_file.txt").exists()) - self.assertTrue((extract_dir / "subdir" / "nested_file.txt").exists()) - - # Verify content - with open(extract_dir / "normal_file.txt") as f: - self.assertEqual(f.read(), "This is a normal file") - - except Exception as e: - self.fail(f"Valid tar extraction should not raise exception: {e}") + extractall(str(tar_path), str(extract_dir)) + # Verify files were extracted correctly + self.assertTrue((extract_dir / "normal_file.txt").exists()) + self.assertTrue((extract_dir / "subdir" / "nested_file.txt").exists()) + # Verify content + with open(extract_dir / "normal_file.txt") as f: + self.assertEqual(f.read(), "This is a normal file")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (2)
monai/apps/utils.py
(2 hunks)tests/apps/test_download_and_extract.py
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
tests/apps/test_download_and_extract.py
monai/apps/utils.py
🪛 Ruff (0.12.2)
tests/apps/test_download_and_extract.py
101-101: Do not catch blind exception: Exception
(BLE001)
155-155: Do not catch blind exception: Exception
(BLE001)
monai/apps/utils.py
136-136: Avoid specifying long messages outside the exception class
(TRY003)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
152-152: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: packaging
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: build-docs
🔇 Additional comments (3)
tests/apps/test_download_and_extract.py (3)
181-198
: Good change using “/etc/passwd_bad”.Safer test target; aligns with review feedback.
199-224
: Symlink protection test LGTM.Covers the critical case; message assertion is flexible.
225-250
: Hardlink protection test LGTM.Completes coverage with links.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: h3rrr <[email protected]>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
monai/apps/utils.py (2)
135-139
: Standardize messages and detect ZIP symlinks (tests currently fail).Messages must include “Unsafe path:” and explicit link type; also detect symlinks in ZIP via
external_attr
. This aligns with tests and hardens ZIP handling.- if hasattr(member, "issym") and member.issym(): - raise ValueError(f"Symbolic link detected in archive: {member_path}") - if hasattr(member, "islnk") and member.islnk(): - raise ValueError(f"Hard link detected in archive: {member_path}") + # tar: symbolic/hard links + if hasattr(member, "issym") and member.issym(): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + if hasattr(member, "islnk") and member.islnk(): + raise ValueError(f"Unsafe path: hardlink {member_path}") # noqa: TRY003 + # zip: symlink via external attributes (UNIX) + if hasattr(member, "external_attr"): + mode = (member.external_attr >> 16) & 0xFFFF + try: + import stat # local import to avoid top-level dependency where not needed + if stat.S_ISLNK(mode): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + except Exception: + pass
142-144
: Unify path-traversal error text to match tests.Use the same “Unsafe path: path traversal …” message for both checks.
- if os.path.isabs(member_path) or ".." in member_path.split(os.sep): - raise ValueError(f"Unsafe path detected in archive: {member_path}") + if os.path.isabs(member_path) or ".." in member_path.split(os.sep): + raise ValueError(f"Unsafe path: path traversal {member_path}") # noqa: TRY003 @@ - if os.path.commonpath([extract_root, target_real]) != extract_root: - raise ValueError(f"Unsafe path: path traversal {member_path}") + if os.path.commonpath([extract_root, target_real]) != extract_root: + raise ValueError(f"Unsafe path: path traversal {member_path}") # noqa: TRY003Also applies to: 151-152
🧹 Nitpick comments (4)
monai/apps/utils.py (4)
154-154
: Return the canonical path.Return
target_real
to reflect the resolved path you validated.- return full_path + return target_real
125-127
: Docstring: expand to Google style with Args/Returns/Raises.Improves maintainability and meets repo guidelines.
-def safe_extract_member(member, extract_to): - """Securely verify compressed package member paths to prevent path traversal attacks""" +def safe_extract_member(member, extract_to): + """Resolve and validate a member's extraction path to prevent path traversal. + + Args: + member: A zipfile.ZipInfo or tarfile.TarInfo-like object describing an archive entry. + extract_to: Destination root directory for extraction. + + Returns: + str: Canonical absolute path within ``extract_to`` where the member should be written. + + Raises: + ValueError: If the member is a symlink/hardlink, has an absolute path, or escapes ``extract_to``. + """
321-331
: Mitigate TOCTOU: use a safe, exclusive writer (no symlink follow) for files.Reduces symlink race risk on the leaf path and prevents clobbering existing files.
- with zip_file.open(member) as source: - with open(safe_path, "wb") as target: - shutil.copyfileobj(source, target) + with zip_file.open(member) as source: + with _open_for_write_safely(safe_path) as target: + shutil.copyfileobj(source, target)Add this helper once at module scope:
def _open_for_write_safely(path: str): """ Create a new file exclusively and avoid following final-path symlinks where supported. """ import os if os.name == "posix": flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL if hasattr(os, "O_NOFOLLOW"): flags |= os.O_NOFOLLOW fd = os.open(path, flags, 0o644) return os.fdopen(fd, "wb") # Fallback for platforms without O_NOFOLLOW return open(path, "xb")
333-345
: Same safe writer for TAR; and be explicit with mode.Use the safe writer; consider
r:*
for clarity.- with tarfile.open(filepath, "r") as tar_file: + with tarfile.open(filepath, "r:*") as tar_file: @@ - if source is not None: - with source: - with open(safe_path, "wb") as target: - shutil.copyfileobj(source, target) + if source is not None: + with source: + with _open_for_write_safely(safe_path) as target: + shutil.copyfileobj(source, target)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (1)
monai/apps/utils.py
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/apps/utils.py
🪛 Ruff (0.12.2)
monai/apps/utils.py
136-136: Avoid specifying long messages outside the exception class
(TRY003)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
152-152: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: packaging
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: build-docs
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: flake8-py3 (codeformat)
Signed-off-by: h3rrr <[email protected]>
Signed-off-by: h3rrr <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
monai/apps/utils.py (2)
154-155
: Return the canonical path resolved via realpath.Prevents later consumers from following symlinked parents.
- return full_path + return target_real
135-154
: Unify error messages to "Unsafe path" + link type; silence TRY003.Matches test expectations and quiets Ruff.
- if hasattr(member, "issym") and member.issym(): - raise ValueError(f"Symbolic link detected in archive: {member_path}") - if hasattr(member, "islnk") and member.islnk(): - raise ValueError(f"Hard link detected in archive: {member_path}") + if hasattr(member, "issym") and member.issym(): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + if hasattr(member, "islnk") and member.islnk(): + raise ValueError(f"Unsafe path: hardlink {member_path}") # noqa: TRY003 @@ - if os.path.isabs(member_path) or ".." in member_path.split(os.sep): - raise ValueError(f"Unsafe path detected in archive: {member_path}") + if os.path.isabs(member_path) or ".." in member_path.split(os.sep): + raise ValueError(f"Unsafe path: path traversal {member_path}") # noqa: TRY003 @@ - if os.path.commonpath([extract_root, target_real]) != extract_root: - raise ValueError(f"Unsafe path: path traversal {member_path}") + if os.path.commonpath([extract_root, target_real]) != extract_root: + raise ValueError(f"Unsafe path: path traversal {member_path}") # noqa: TRY003
🧹 Nitpick comments (1)
monai/apps/utils.py (1)
125-132
: Add type hints and a proper Google-style docstring.Improves clarity and meets guidelines.
-def safe_extract_member(member, extract_to): - """Securely verify compressed package member paths to prevent path traversal attacks""" +def safe_extract_member(member: Any, extract_to: PathLike) -> str: + """Validate an archive member and return a safe, canonical extraction path. + + Args: + member: ZipInfo or TarInfo-like object for an archive entry. + extract_to: Target directory into which the entry should be extracted. + + Returns: + Absolute, canonical file path under ``extract_to`` that is safe to write. + + Raises: + ValueError: If the entry is a symlink/hardlink or attempts path traversal. + """
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (1)
monai/apps/utils.py
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/apps/utils.py
🪛 Ruff (0.12.2)
monai/apps/utils.py
136-136: Avoid specifying long messages outside the exception class
(TRY003)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
152-152: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
monai/apps/utils.py (1)
334-334
: tar.gz extraction broken: use tarfile.open(..., 'r:*').Mode "r" won’t read gzip/bzip2/xz tars. Use auto-detect.
- with tarfile.open(filepath, "r") as tar_file: + with tarfile.open(filepath, "r:*") as tar_file:Likely an incorrect or invalid review comment.
Signed-off-by: h3rrr <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (5)
monai/apps/utils.py (5)
135-139
: Zip symlink bypass; unify messages with tests; add noqa.issym/islnk only cover tar; zip symlinks (external_attr) slip through. Also align error text to “Unsafe path: symlink/hardlink …”.
- if hasattr(member, "issym") and member.issym(): - raise ValueError(f"Symbolic link detected in archive: {member_path}") - if hasattr(member, "islnk") and member.islnk(): - raise ValueError(f"Hard link detected in archive: {member_path}") + # Reject symlinks and hardlinks (tar) + if hasattr(member, "issym") and member.issym(): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003 + if hasattr(member, "islnk") and member.islnk(): + raise ValueError(f"Unsafe path: hardlink {member_path}") # noqa: TRY003 + # Reject symlinks in ZIP (mode is encoded in external_attr) + if hasattr(member, "external_attr"): + mode = (member.external_attr >> 16) + if stat.S_ISLNK(mode): + raise ValueError(f"Unsafe path: symlink {member_path}") # noqa: TRY003Add import:
# top of file import stat
142-144
: Be explicit: absolute vs traversal; add noqa.Improves testability and lints.
- if os.path.isabs(member_path) or ".." in member_path.split(os.sep): - raise ValueError(f"Unsafe path detected in archive: {member_path}") + if os.path.isabs(member_path): + raise ValueError(f"Unsafe path: absolute path {member_path}") # noqa: TRY003 + if ".." in member_path.split(os.sep): + raise ValueError(f"Unsafe path: path traversal {member_path}") # noqa: TRY003
277-287
: ZIP: create dirs, preserve permissions, and use canonical path.Restores parity with extractall and avoids losing +x; also handles explicit dir entries.
with zipfile.ZipFile(filepath, "r") as zip_file: for member in zip_file.infolist(): safe_path = safe_extract_member(member, output_dir) - if member.is_dir(): - continue - os.makedirs(os.path.dirname(safe_path), exist_ok=True) + if member.is_dir(): + os.makedirs(safe_path, exist_ok=True) + perm = (member.external_attr >> 16) & 0o777 + if perm: + try: + os.chmod(safe_path, perm) + except OSError: + pass + continue + os.makedirs(os.path.dirname(safe_path), exist_ok=True) with zip_file.open(member) as source: with open(safe_path, "wb") as target: shutil.copyfileobj(source, target) + # Preserve file permissions + perm = (member.external_attr >> 16) & 0o777 + if perm: + try: + os.chmod(safe_path, perm) + except OSError: + pass
289-301
: TAR: handle directory entries and preserve permissions.Matches tarfile.extractall behavior for modes.
with tarfile.open(filepath, "r") as tar_file: for member in tar_file.getmembers(): safe_path = safe_extract_member(member, output_dir) - if not member.isfile(): - continue + if member.isdir(): + os.makedirs(safe_path, exist_ok=True) + perm = member.mode & 0o777 + if perm: + try: + os.chmod(safe_path, perm) + except OSError: + pass + continue + if not member.isfile(): + continue os.makedirs(os.path.dirname(safe_path), exist_ok=True) source = tar_file.extractfile(member) if source is not None: with source: with open(safe_path, "wb") as target: shutil.copyfileobj(source, target) + # Preserve file permissions + perm = member.mode & 0o777 + if perm: + try: + os.chmod(safe_path, perm) + except OSError: + pass
154-154
: Return canonical path to reduce TOCTOU risk.Use the resolved path for subsequent file ops.
- return full_path + return target_realNote: consider using os.open(..., O_NOFOLLOW) on POSIX for the final write to further mitigate races.
🧹 Nitpick comments (2)
monai/apps/utils.py (2)
125-132
: Docstring: specify args/returns/raises (Google-style).Clarify behavior and exceptions raised.
-def safe_extract_member(member, extract_to): - """Securely verify compressed package member paths to prevent path traversal attacks""" +def safe_extract_member(member, extract_to): + """Validate an archive member path and return a safe, canonical destination path. + + Args: + member: Archive entry (zipfile.ZipInfo or tarfile.TarInfo). + extract_to: Root directory under which to extract. + + Returns: + str: Canonical absolute path (symlinks resolved) to write the member. + + Raises: + ValueError: If the member path is unsafe (absolute path, traversal, symlink, hardlink). + """
148-153
: Lint and consistency.Keep the second containment check; add noqa to satisfy Ruff TRY003.
- if os.path.commonpath([extract_root, target_real]) != extract_root: - raise ValueError(f"Unsafe path: path traversal {member_path}") + if os.path.commonpath([extract_root, target_real]) != extract_root: + raise ValueError(f"Unsafe path: path traversal {member_path}") # noqa: TRY003
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base
setting
📒 Files selected for processing (1)
monai/apps/utils.py
(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/apps/utils.py
🪛 Ruff (0.12.2)
monai/apps/utils.py
136-136: Avoid specifying long messages outside the exception class
(TRY003)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
152-152: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: build-docs
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: packaging
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-py3 (3.11)
🔇 Additional comments (1)
monai/apps/utils.py (1)
347-352
: LGTM: safer per-member extraction dispatch.Type routing reads cleanly and keeps public API stable.
def safe_extract_member(member, extract_to): | ||
"""Securely verify compressed package member paths to prevent path traversal attacks""" | ||
# Get member path (handle different compression formats) | ||
if hasattr(member, "filename"): | ||
member_path = member.filename # zipfile | ||
elif hasattr(member, "name"): | ||
member_path = member.name # tarfile | ||
else: | ||
member_path = str(member) | ||
|
||
if hasattr(member, "issym") and member.issym(): | ||
raise ValueError(f"Symbolic link detected in archive: {member_path}") | ||
if hasattr(member, "islnk") and member.islnk(): | ||
raise ValueError(f"Hard link detected in archive: {member_path}") | ||
|
||
member_path = os.path.normpath(member_path) | ||
|
||
if os.path.isabs(member_path) or ".." in member_path.split(os.sep): | ||
raise ValueError(f"Unsafe path detected in archive: {member_path}") | ||
|
||
full_path = os.path.join(extract_to, member_path) | ||
full_path = os.path.normpath(full_path) | ||
|
||
extract_root = os.path.realpath(extract_to) | ||
target_real = os.path.realpath(full_path) | ||
# Ensure the resolved path stays within the extraction root | ||
if os.path.commonpath([extract_root, target_real]) != extract_root: | ||
raise ValueError(f"Unsafe path: path traversal {member_path}") | ||
|
||
return full_path | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Quick check: lints and tests.
After applying, run Ruff and the targeted tests.
Also applies to: 277-301
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Lint the changed file
python -m pip install ruff pytest
ruff check monai/apps/utils.py
# Run only the new extraction tests
pytest -q tests/apps/test_download_and_extract.py -k "PathTraversalProtection or extract"
Length of output: 14852
Fix Ruff errors in monai/apps/utils.py and re-run the targeted tests
- Ruff failed: 21 issues (1 auto-fixable); ruff exited before pytest ran.
- High-priority fixes:
- RUF022: all not sorted (line 45).
- S310: unsafe URL opens (urlretrieve / urlopen) — lines 115, 119, 247.
- BLE001: broad
except Exception
in check_hash (line 180). - TRY400 / TRY201: use
logging.exception
and avoidraise e
(lines 121–122, 181). - B028: add
stacklevel
towarnings.warn
(line 118). - Multiple TRY003 / TRY002: long/generic exception messages and generic
Exception
re-raises — shorten messages or use specific exception types (lines 136–152, 232, 242, 260, 271, 342, 378, 420).
- Action: run
ruff --fix
where appropriate, address the security/broad-except issues, then re-run:
ruff check monai/apps/utils.py && pytest -q tests/apps/test_download_and_extract.py -k "PathTraversalProtection or extract"
🧰 Tools
🪛 Ruff (0.12.2)
136-136: Avoid specifying long messages outside the exception class
(TRY003)
138-138: Avoid specifying long messages outside the exception class
(TRY003)
143-143: Avoid specifying long messages outside the exception class
(TRY003)
152-152: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
monai/apps/utils.py lines ~45 and 115–420: ruff reported multiple issues — sort
the __all__ export at line ~45 (RUF022); replace unsafe urlretrieve/urlopen
usages at lines ~115, 119, 247 with safer, validated downloads or use requests
with explicit timeouts and URL validation (S310); in check_hash around line 180
narrow the broad "except Exception" to specific exceptions (e.g.,
FileNotFoundError, OSError, hashlib-related errors) and avoid re-raising with
"raise e" (use plain "raise"); replace logging.error + raise with
logging.exception where stack traces are useful (TRY400/TRY201 at ~121–122,
~181); add stacklevel to warnings.warn calls at ~118 (B028); shorten long
generic exception messages and replace generic excepts with specific exception
types at the other listed locations (~136–152, 232, 242, 260, 271, 342, 378,
420) to provide concise messages and proper exception handling. After changes
run ruff --fix for auto-fixable issues and then run: ruff check
monai/apps/utils.py && pytest -q tests/apps/test_download_and_extract.py -k
"PathTraversalProtection or extract" to verify.
/build |
The path traversal issue in the previously submitted vulnerability report has been fixed and has been verified and fixed locally.
GHSA-x6ww-pf9m-m73m
Description
Added a security function to fix a security issue caused by directly using extractall in zip_file.extractall(output_dir).


This has been verified locally and does not affect previous functionality.
Figure 1: Malicious zip file loaded, Figure 2: Normal zip file loaded
Types of changes
./runtests.sh -f -u --net --coverage
../runtests.sh --quick --unittests --disttests
.make html
command in thedocs/
folder.