
Conversation


@IamTingTing IamTingTing commented Jul 18, 2025

Fixes #8496

Description

A few sentences describing the changes proposed in this pull request.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.


coderabbitai bot commented Jul 18, 2025

Walkthrough

The output head for both DiffusionModelUNet and DiffusionModelEncoder was changed from a prebuilt nn.Sequential with a hardcoded input size to a lazily initialized Optional attribute. self.out is now None at init and is constructed on the first forward pass, using the runtime flattened feature dimension (h.shape[1]) to build the Linear→ReLU→Dropout→Linear head, which is then applied to h.
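
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the lazily initialized head described above. The class name LazyHeadEncoder and the flatten-only feature extractor are illustrative stand-ins, not the actual MONAI implementation.

```python
from typing import Optional

import torch
import torch.nn as nn


class LazyHeadEncoder(nn.Module):
    """Illustrative module with an output head that is built on first use."""

    def __init__(self, out_channels: int) -> None:
        super().__init__()
        self.out_channels = out_channels
        # The head's input size is unknown until runtime, so defer its creation.
        self.out: Optional[nn.Module] = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stand-in for the real convolutional feature extraction and flattening.
        h = x.flatten(start_dim=1)
        if self.out is None:
            # Build the Linear -> ReLU -> Dropout -> Linear head, sized from h.shape[1].
            self.out = nn.Sequential(
                nn.Linear(h.shape[1], 512), nn.ReLU(), nn.Dropout(0.1), nn.Linear(512, self.out_channels)
            )
        return self.out(h)


model = LazyHeadEncoder(out_channels=3)
y = model(torch.randn(2, 1, 16, 16))  # flattened size is 256, so the head is built with in_features=256
print(y.shape)  # torch.Size([2, 3])
```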

Changes

File(s): monai/networks/nets/diffusion_model_unet.py
Change summary: Replaced fixed self.out (nn.Sequential with hardcoded 4096 input) in DiffusionModelUNet and DiffusionModelEncoder with self.out: Optional[nn.Module] = None; self.out is created lazily on first forward() using the runtime flattened feature size.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Model as DiffusionModel (UNet/Encoder)
    participant Head as nn.Sequential

    User->>Model: forward(input)
    Model->>Model: Compute and flatten features (h)
    alt self.out not initialized
        Model->>Head: Create head with input size = h.shape[1]
        Model->>Model: Assign to self.out
    end
    Model->>Head: Pass h through self.out
    Head-->>Model: Output tensor
    Model-->>User: Return output

Poem

A lazy head built when called on the fly,
No more fixed 4096 to make shapes cry.
UNet and Encoder now size at runtime,
Features align neat, one forward at a time.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Description Check: ⚠️ Warning
  Explanation: The description only contains "Fixes #8496" and a placeholder sentence rather than a real summary of the implementation, and the checklist shows no tests or doc updates were confirmed. Because it lacks a clear description of the actual code changes, testing performed, and documentation/docstring status, the PR description is incomplete.
  Resolution: Update the description to state what was changed, which files/classes were affected, and the test/documentation status. Replace the placeholder with a concise summary of the implementation (e.g., lazy initialization of the output head, removal of hardcoded 4096, classes/files changed), state which tests were run or add tests that reproduce the original failure, and update the checklist and any docstrings or docs as appropriate.

Docstring Coverage: ⚠️ Warning
  Explanation: Docstring coverage is 50.00%, which is insufficient; the required threshold is 80.00%.
  Resolution: You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (3 passed)

Title Check: ✅ Passed
  The title succinctly and accurately describes the primary change: removing a hardcoded input dimension in DiffusionModelEncoder. It is specific to the affected component, concise, and useful for reviewers scanning history. No extraneous wording or noise is present.

Linked Issues Check: ✅ Passed
  The changes address issue #8496 by replacing the hardcoded Linear(4096, ...) with a runtime-sized head created from h.shape[1] in DiffusionModelEncoder, which resolves the flattened-feature-size mismatch; the same pattern was also applied to DiffusionModelUNet. This directly meets the linked issue's coding objective to adapt the final linear layer to the actual flattened feature size. Note a potential runtime device placement concern: the lazy-created module should be placed on the same device as the model/input to avoid device-mismatch errors.

Out of Scope Changes Check: ✅ Passed
  All modifications are confined to diffusion_model_unet.py and are focused on converting the final output head to lazy initialization; there are no unrelated file changes or functional additions beyond this scope. The change to DiffusionModelUNet is a closely related symmetry rather than an unrelated feature. No out-of-scope changes were detected.

✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Up to 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
monai/networks/nets/diffusion_model_unet.py (1)

2053-2056: Consider device placement and thread safety for lazy initialization.

The lazy initialization logic is correct, but consider these improvements:

  1. Device placement: The dynamically created module should be moved to the same device as the input tensor.
  2. Thread safety: In multi-threaded environments, this could create race conditions.

Apply this diff to ensure proper device placement:

-        if self.out is None:
-            self.out = nn.Sequential(
-                nn.Linear(h.shape[1], 512), nn.ReLU(), nn.Dropout(0.1), nn.Linear(512, self.out_channels)
-            )
+        if self.out is None:
+            self.out = nn.Sequential(
+                nn.Linear(h.shape[1], 512), nn.ReLU(), nn.Dropout(0.1), nn.Linear(512, self.out_channels)
+            ).to(h.device)

For thread safety, consider using a lock or moving the initialization to a separate method called during model setup.
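
As a rough sketch of that suggestion (the lock attribute, the _ensure_head helper, and the double-checked pattern are hypothetical illustrations, not code from this PR):

```python
import threading
from typing import Optional

import torch
import torch.nn as nn


class ThreadSafeLazyHead(nn.Module):
    """Illustrative lock-guarded lazy head; names are hypothetical, not from the PR."""

    def __init__(self, out_channels: int) -> None:
        super().__init__()
        self.out_channels = out_channels
        self.out: Optional[nn.Module] = None
        self._init_lock = threading.Lock()  # hypothetical attribute

    def _ensure_head(self, h: torch.Tensor) -> None:
        # Double-checked locking: only one thread constructs the head, and the
        # new module is moved to the same device as the incoming features.
        if self.out is None:
            with self._init_lock:
                if self.out is None:
                    self.out = nn.Sequential(
                        nn.Linear(h.shape[1], 512), nn.ReLU(), nn.Dropout(0.1), nn.Linear(512, self.out_channels)
                    ).to(h.device)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x.flatten(start_dim=1)  # stand-in for the real feature extraction
        self._ensure_head(h)
        return self.out(h)
```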

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e499362 and 3e45145.

📒 Files selected for processing (1)
  • monai/networks/nets/diffusion_model_unet.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-pytorch (2.4.1)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: build-docs
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: packaging
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: flake8-py3 (pytype)
🔇 Additional comments (1)
monai/networks/nets/diffusion_model_unet.py (1)

2008-2008: LGTM: Proper lazy initialization setup.

Setting self.out = None correctly defers the output layer creation until runtime, enabling dynamic input dimension adaptation.

@KumoLiu KumoLiu requested a review from ericspod July 18, 2025 14:26
@ericspod ericspod mentioned this pull request Sep 12, 2025
@KumoLiu

KumoLiu commented Sep 16, 2025

/build


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
monai/networks/nets/diffusion_model_unet.py (1)

1985-1996: Off‑by‑one: final downsample flag is wrong in Encoder.

is_final_block = i == len(channels) is never true; the last block still downsamples. This diverges from UNet’s logic and likely shrinks features one step too far.

Apply:

-            is_final_block = i == len(channels)  # - 1
+            is_final_block = i == len(channels) - 1
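
A quick illustration of the off-by-one, assuming the enclosing loop iterates with for i in range(len(channels)), as in the comparable UNet block:

```python
channels = (64, 128, 256)

for i in range(len(channels)):
    buggy = i == len(channels)        # i is at most 2 while len(channels) is 3, so never True
    fixed = i == len(channels) - 1    # True only on the last iteration (i == 2)
    print(i, buggy, fixed)

# 0 False False
# 1 False False
# 2 False True
```
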
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 595ca84 and 99e9491.

📒 Files selected for processing (1)
  • monai/networks/nets/diffusion_model_unet.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definitions and should describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical errors or inconsistencies, and suggest what may be changed to address these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/networks/nets/diffusion_model_unet.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: build-docs
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: flake8-py3 (codeformat)
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: packaging
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-py3 (3.11)
🔇 Additional comments (2)
monai/networks/nets/diffusion_model_unet.py (2)

36-36: Import Optional is appropriate for new lazy attr.

No issues.


2010-2010:

```shell
#!/bin/bash
set -euo pipefail
FILE="monai/networks/nets/diffusion_model_unet.py"
echo "PWD: $(pwd)"
if [ -f "$FILE" ]; then
    echo "=== SNIPPET 1960..2040 ==="
    sed -n '1960,2040p' "$FILE" || true
    echo "=== LINES WITH self.out ==="
    rg -n "self.out" "$FILE" || true
    echo "=== LINES WITH 'self.out =' IN REPO ==="
    rg -n "self.out\s*=" || true
    echo "=== SEARCH LazyLinear ==="
    rg -n "LazyLinear" || true
    echo "=== def forward occurrences in file ==="
    rg -n "def forward" "$FILE" || true
    FORWARD_LINE=$(rg -n "def forward" "$FILE" | head -n1 | cut -d: -f1 || true)
    if [ -n "$FORWARD_LINE" ]; then
        END=$((FORWARD_LINE+400))
        echo "=== FORWARD FUNCTION LINES $FORWARD_LINE..$END ==="
        sed -n "${FORWARD_LINE},${END}p" "$FILE" || true
    fi
else
    echo "File not found at $FILE; searching repo for diffusion_model_unet.py"
    rg -n "diffusion_model_unet.py" || true
fi
echo "=== SEARCH repo for torch usage / requirements ==="
rg -n --hidden --no-ignore "torch" || true
rg -n --hidden --no-ignore "torch.nn" || true
echo "=== CHECK common packaging files for torch version pins ==="
for f in requirements.txt pyproject.toml setup.cfg; do
    if [ -f "$f" ]; then
        echo "---- $f ----"
        rg -n "torch" "$f" || true
    fi
done
echo "=== DONE ==="
```



Development

Successfully merging this pull request may close these issues.

Hardcoded 4096 dimension in DiffusionModelEncoder prevents architectural flexibility
4 participants