zjgemi
Collaborator

@zjgemi zjgemi commented Aug 26, 2025

Summary by CodeRabbit

  • New Features

    • Configurable asynchronous fingerprinting via a new async_ratio setting.
    • Optional async FP mode in concurrent learning that produces async configurations and async labeled data carried through the workflow.
    • New async_confs artifact propagated across selection, block, loop, and data collection stages.
  • Improvements

    • Workflow step matching extended to recognize async iteration keys.
  • Tests

    • Added and updated tests to validate async FP flow and async_confs outputs.

zjgemi and others added 3 commits August 26, 2025 15:56

coderabbitai bot commented Aug 26, 2025

📝 Walkthrough

Adds configurable asynchronous FP handling controlled by fp.async_ratio. Selectors can split configurations into async_confs; async artifacts and labeled data are threaded through submit, block, loop, ops, and tests. CLI arg and submit wiring expose the async ratio and enable an async FP path when ratio > 0.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI FP args (dpgen2/entrypoint/args.py) | Adds optional float argument async_ratio with docstring; default 0.0. |
| Submit wiring (dpgen2/entrypoint/submit.py) | Reads fp.async_ratio; introduces async_fp flag, passes async_fp and ratio into concurrent learning and exploration scheduler creation; emits async_confs artifact. |
| Selector base API (dpgen2/exploration/selector/conf_selector.py) | ConfSelector.select return type changed to (confs, async_confs, report). |
| Frame selector with async split (dpgen2/exploration/selector/conf_selector_frame.py) | Adds async_ratio ctor arg; select now returns (confs, async_confs, report); implements split_multisystems(ms, ratio) and writes async subset to async_confs. |
| DP loop artifacts (dpgen2/flow/dpgen_loop.py) | Adds optional input artifact async_confs to ConcurrentLearningLoop/ConcurrentLearning; threads async_confs through loop, block, and next steps. |
| OP: collect data (dpgen2/op/collect_data.py) | Adds optional input artifact async_labeled_data; processing merges labeled_data and async_labeled_data when present. |
| OP: select confs (dpgen2/op/select_confs.py) | Output signature now includes async_confs; execution unpacks three-tuple from selector. |
| Superop block (CL) (dpgen2/superop/block.py) | Adds async_fp flag to ConcurrentLearningBlock; optional -async--prep-run-fp step consumes async_confs and yields async_labeled_data; block exposes async_confs in/out; collect_data wired to accept async_labeled_data. |
| DFlow query utils (dpgen2/utils/dflow_query.py) | Extends matched_step_key regex to match async iteration keys (-async- variants). |
| Tests: selector frames (tests/exploration/test_conf_selector_frame.py) | Updated to unpack triple return; adds test_async_confs validating async split; ConfSelectorFrames ctor supports async_ratio. |
| Tests: mocked ops (tests/mocked_ops.py) | Adds MockedAsyncConfSelector producing confs and async_confs; MockedSelectConfs updated to include async_confs in outputs. |
| Tests: block async FP (tests/test_block_cl.py) | Adds TestBlockCLAsyncFp exercising async FP path, verifying -async--prep-run-fp labeled data and final async_confs artifact. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant Submit as submit.workflow_concurrent_learning
  participant CL as ConcurrentLearningBlock
  participant Sel as SelectConfs (ConfSelectorFrames)
  participant FP as prep-run-fp (sync)
  participant AFP as prep-run-fp (async)
  participant Col as CollectData
  Note over Submit: Read fp.async_ratio -> async_fp=(ratio>0)
  Submit->>CL: make_concurrent_learning_op(async_fp)
  CL->>Sel: select(trajs, model_devis, async_ratio)
  Sel-->>CL: confs, async_confs, report
  CL->>FP: run on confs
  alt async_fp enabled
    CL->>AFP: run on async_confs
    AFP-->>CL: async_labeled_data
  end
  FP-->>CL: labeled_data
  CL->>Col: labeled_data + async_labeled_data?
  Col-->>CL: iter_data
  CL-->>Submit: outputs include async_confs
sequenceDiagram
  participant Flow as ConcurrentLearningLoop/_dpgen
  participant Block as block step
  participant Next as next step
  Note over Flow: `async_confs` added as optional InputArtifact
  Flow->>Block: artifacts {..., async_confs}
  Block-->>Next: artifacts {..., async_confs}
  Next-->>Flow: propagate for subsequent iterations

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • wanghan-iapcm

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4339411 and f8d6ca1.

📒 Files selected for processing (1)
  • tests/test_block_cl.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_block_cl.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build (3.9)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.8)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
dpgen2/exploration/selector/conf_selector.py (1)

34-41: Add docstring to ConfSelector.select and align all implementations to the triple‐return contract

The abstract declaration in dpgen2/exploration/selector/conf_selector.py must document the three outputs, and every subclass (including tests) must update its return‐type annotation to Tuple[List[Path], List[Path], ExplorationReport] to match the actual values returned.

• In dpgen2/exploration/selector/conf_selector.py, amend the select stub:

@@ class ConfSelector(ABC):
     @abstractmethod
     def select(
         self,
         trajs: Union[List[Path], List[HDF5Dataset]],
         model_devis: Union[List[Path], List[HDF5Dataset]],
         type_map: Optional[List[str]] = None,
         optional_outputs: Optional[List[Path]] = None,
     ) -> Tuple[List[Path], List[Path], ExplorationReport]:
+        """
+        Returns
+        -------
+        confs : List[Path]
+            Selected configurations for the standard FP path.
+        async_confs : List[Path]
+            Selected configurations for the asynchronous FP path. May be empty when disabled.
+        report : ExplorationReport
+            Detailed report of the exploration process.
+        """
         pass

• In tests/mocked_ops.py, both mock selectors must update their return‐type annotations:

@@ class MockedConfSelector(ConfSelector):
-    def select(...) -> Tuple[List[Path], ExplorationReport]:
+    def select(...) -> Tuple[List[Path], List[Path], ExplorationReport]:
@@ class MockedAsyncConfSelector(ConfSelector):
-    def select(...) -> Tuple[List[Path], ExplorationReport]:
+    def select(...) -> Tuple[List[Path], List[Path], ExplorationReport]:

• Double-check any other ConfSelector implementations (e.g., in dpgen2/exploration/selector/*) to ensure their signatures and return statements consistently reflect the three‐item contract.
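
The three-item contract described in this comment can be illustrated with a minimal, self-contained sketch. `Report`, `Selector`, and `SyncOnlySelector` are hypothetical stand-ins written for this example, not dpgen2 classes, and the argument list is shortened (the real `select` also takes `type_map` and `optional_outputs`).

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Tuple


class Report:
    """Hypothetical stand-in for ExplorationReport."""


class Selector(ABC):
    """Hypothetical stand-in for the ConfSelector base class."""

    @abstractmethod
    def select(
        self, trajs: List[Path], model_devis: List[Path]
    ) -> Tuple[List[Path], List[Path], Report]:
        """Return (confs, async_confs, report)."""


class SyncOnlySelector(Selector):
    """A selector that never routes anything to the async FP path."""

    def select(
        self, trajs: List[Path], model_devis: List[Path]
    ) -> Tuple[List[Path], List[Path], Report]:
        # async_confs is an empty list rather than None, so callers can
        # always unpack three items regardless of the async setting.
        return [Path("confs")], [], Report()


confs, async_confs, report = SyncOnlySelector().select([], [])
```

Keeping the annotation identical on the base class and every subclass lets a type checker flag any implementation that still returns the old two-item tuple.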

tests/test_block_cl.py (1)

59-77: Fix missing import causing NameError: add MockedAsyncConfSelector to imports.

The async test uses MockedAsyncConfSelector (Line 499) but it isn’t imported here, causing the unit test failure reported by CI.

Apply this diff to import the symbol:

 from mocked_ops import (
   MockedCollectData,
   MockedCollectDataCheckOptParam,
   MockedConfSelector,
+  MockedAsyncConfSelector,
   MockedExplorationReport,
   MockedExplorationTaskGroup,
   MockedPrepDPTrain,
   MockedPrepVasp,
   MockedRunDPTrain,
   MockedRunDPTrainCheckOptParam,
   MockedRunLmp,
   MockedRunVasp,
   MockedSelectConfs,
   make_mocked_init_data,
   make_mocked_init_models,
   mocked_incar_template,
   mocked_numb_models,
   mocked_numb_select,
   mocked_template_script,
 )
🧹 Nitpick comments (13)
dpgen2/entrypoint/args.py (1)

470-470: Clarify semantics and bounds for async_ratio

Please make the doc explicit that async_ratio is a fraction in [0.0, 1.0] controlling the proportion of selected configurations routed to the async FP path. Right now it’s ambiguous.

Apply this doc tweak:

-    doc_async_ratio = "Configuration ratio for async fp"
+    doc_async_ratio = (
+        "Fraction [0.0, 1.0] of selected configurations routed to the asynchronous "
+        "FP path. 0.0 disables async FP; 1.0 routes all selected configurations."
+    )
dpgen2/op/collect_data.py (1)

71-90: Update execute() docstring to document async_labeled_data

The docstring omits the new optional async input. Please add it for discoverability.

@@
             - `labeled_data`: (`Artifact(List[Path])`) The paths of labeled data generated by FP tasks of the current iteration.
+            - `async_labeled_data`: (`Artifact(List[Path])`, optional)
+              The paths of labeled data generated by asynchronous FP tasks of the current iteration.
             - `iter_data`: (`Artifact(List[Path])`) The data paths previous iterations.
dpgen2/op/select_confs.py (1)

71-80: Fix return docs: pluralize confs and add async_confs

Docs still say “conf” and don’t mention async_confs.

@@
-            - `report`: (`ExplorationReport`) The report on the exploration.
-            - `conf`: (`Artifact(List[Path])`) The selected configurations.
+            - `report`: (`ExplorationReport`) The report on the exploration.
+            - `confs`: (`Artifact(List[Path])`) The selected configurations.
+            - `async_confs`: (`Artifact(List[Path])`) The configurations routed to the async FP path
+              (may be an empty list when async_ratio == 0).
dpgen2/flow/dpgen_loop.py (1)

467-478: Optional: expose async_confs at the loop/concurrent-learning outputs for observability.

If downstream consumers (e.g., resubmission tooling) need access to accumulated async_confs, consider adding an OutputArtifact on ConcurrentLearningLoop and ConcurrentLearning and forwarding the latest value similarly to models/iter_data via if_expression. This keeps the external interface symmetric.

tests/exploration/test_conf_selector_frame.py (1)

196-216: Clean up async_confs directory in tearDown to avoid test pollution.

select() writes an async_confs directory. Tear down should remove it, like confs, to keep tests hermetic.

Apply this diff to extend teardown:

   def tearDown(self):
     for ii in ["foo.dump", "bar.dump", "foo.md", "bar.md"]:
         if Path(ii).is_file():
             os.remove(ii)
-    for ii in ["confs"]:
+    for ii in ["confs", "async_confs"]:
         if Path(ii).is_dir():
             shutil.rmtree(ii)
tests/mocked_ops.py (2)

870-893: Align MockedConfSelector.select signature with the new abstract API.

The abstract ConfSelector.select now returns (confs, async_confs, report). Update the annotation accordingly to avoid confusion and future type-check issues.

Apply this diff:

   def select(
       self,
       trajs: List[Path],
       model_devis: List[Path],
       type_map: List[str] = None,
       optional_outputs: Optional[List[Path]] = None,
-  ) -> Tuple[List[Path], ExplorationReport]:
+  ) -> Tuple[List[Path], List[Path], ExplorationReport]:
       confs = []
       ...
       return confs, [], report

895-915: Fix MockedAsyncConfSelector.select return annotation (currently mismatched).

The function returns three values but the type hint advertises two. Update to the triple tuple to match the base class and actual return value.

Apply this diff:

 class MockedAsyncConfSelector(ConfSelector):
     def __init__(
         self,
         conv_accuracy: float = 0.9,
     ):
         self.conv_accuracy = conv_accuracy

     def select(
         self,
         trajs: List[Path],
         model_devis: List[Path],
         type_map: List[str] = None,
         optional_outputs: Optional[List[Path]] = None,
-    ) -> Tuple[List[Path], ExplorationReport]:
+    ) -> Tuple[List[Path], List[Path], ExplorationReport]:
         fname = Path("confs")
         fname.write_text("conf of confs")
         fname = Path("async_confs")
         fname.write_text("conf of async_confs")
         report = MockedExplorationReport(conv_accuracy=self.conv_accuracy)
         return [Path("confs")], [Path("async_confs")], report
dpgen2/entrypoint/submit.py (1)

310-333: Backward-compatible default for fp.async_ratio.

Direct indexing will KeyError on older configs lacking fp.async_ratio. Use .get(..., 0) to default to synchronous behavior.

Apply these diffs:

 def make_naive_exploration_scheduler_without_conf(config, explore_style):
     model_devi_jobs = config["explore"]["stages"]
     fp_task_max = config["fp"]["task_max"]
-    fp_async_ratio = config["fp"]["async_ratio"]
+    fp_async_ratio = config["fp"].get("async_ratio", 0)
     ...
     selector = ConfSelectorFrames(
         render,
         report,
         fp_task_max,
         conf_filters,
         fp_async_ratio,
     )
 def make_lmp_naive_exploration_scheduler(config):
     ...
     fp_task_max = config["fp"]["task_max"]
-    fp_async_ratio = config["fp"]["async_ratio"]
+    fp_async_ratio = config["fp"].get("async_ratio", 0)
     ...
     selector = ConfSelectorFrames(
         render,
         report,
         fp_task_max,
         conf_filters,
         fp_async_ratio,
     )
 def workflow_concurrent_learning(
     config: Dict,
 ) -> Step:
     ...
-    fp_async_ratio = config["fp"]["async_ratio"]
+    fp_async_ratio = config["fp"].get("async_ratio", 0)

Also applies to: 368-395, 497-504
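
The effect of the suggested `.get(...)` fallback can be sketched on plain dictionaries. `read_async_settings` is a hypothetical helper written for this example; the `fp`, `task_max`, and `async_ratio` keys come from the diff above, and the `ratio > 0` rule comes from the walkthrough.

```python
from typing import Any, Dict, Tuple


def read_async_settings(config: Dict[str, Any]) -> Tuple[float, bool]:
    """Hypothetical helper: return (async_ratio, async_fp) from a config dict."""
    # .get() falls back to 0.0 when an older config has no "async_ratio" key,
    # whereas config["fp"]["async_ratio"] would raise KeyError.
    ratio = float(config["fp"].get("async_ratio", 0.0))
    # The async FP path is enabled only for a strictly positive ratio.
    return ratio, ratio > 0


old_config = {"fp": {"task_max": 10}}
new_config = {"fp": {"task_max": 10, "async_ratio": 0.25}}

old_settings = read_async_settings(old_config)
new_settings = read_async_settings(new_config)
```

With this fallback, a config written before the feature existed behaves exactly like one that sets the ratio to zero.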

dpgen2/exploration/selector/conf_selector_frame.py (3)

86-90: Clarify return contract for async_confs possibly being empty

Doc currently states “list only has one item,” which isn’t true when no async selection occurs. Clarify to “zero or one item” to match runtime behavior.

Apply this diff:

-        async_confs : List[Path]
-            The selected confgurations for async fp, stored in a folder in deepmd/npy format, can be parsed as dpdata.MultiSystems. The `list` only has one item.
+        async_confs : List[Path]
+            The selected configurations for async FP, stored in a folder in deepmd/npy format, can be parsed as dpdata.MultiSystems.
+            This list has either zero or one item depending on the split result.

111-119: Handle 100% async split and empty write for ms to avoid failures in to_deepmd_npy

When async_ratio == 1.0 (or all frames randomly fall into the async bucket), ms can be empty. Depending on dpdata.MultiSystems semantics, ms.to_deepmd_npy(out_path) may error, and downstream steps expecting non-empty confs may break.

  • Verify dpdata.MultiSystems.to_deepmd_npy supports empty MultiSystems without error.
  • If not, either:
    • return an empty confs list to signal “no non-async confs,” and conditionally skip the normal FP step in the block (see block.py suggestion), or
    • ensure at least one frame remains in the non-async set when async_ratio < 1.0.

Proposed minimal guard:

         out_path = Path("confs")
         out_path.mkdir(exist_ok=True)
-        ms.to_deepmd_npy(out_path)  # type: ignore
+        if len(ms) > 0:
+            ms.to_deepmd_npy(out_path)  # type: ignore

And adjust the return to reflect emptiness:

-        return [out_path], async_confs, copy.deepcopy(self.report)
+        confs = [out_path] if len(ms) > 0 else []
+        return confs, async_confs, copy.deepcopy(self.report)

Follow-up: see coupled change to conditionally schedule the normal prep-run-fp step in block.py.


127-140: Make split_multisystems robust (type hints, clamped ratio, stable ordering)

  • Clamp ratio to [0, 1] defensively.
  • Preserve frame order in unselected_indices for reproducibility.
  • Add type hints to improve readability and tooling.

Apply this diff:

-def split_multisystems(ms, ratio):
-    selected_ms = dpdata.MultiSystems()
-    unselected_ms = dpdata.MultiSystems()
-    for s in ms:
-        nsel = math.floor(len(s) * ratio)
-        if random.random() < len(s) * ratio - nsel:
-            nsel += 1
-        selected_indices = random.sample(range(len(s)), nsel)
-        unselected_indices = list(set(range(len(s))).difference(selected_indices))
-        if len(selected_indices) > 0:
-            selected_ms.append(s.sub_system(selected_indices))
-        if len(unselected_indices) > 0:
-            unselected_ms.append(s.sub_system(unselected_indices))
-    return selected_ms, unselected_ms
+from typing import Tuple
+
+def split_multisystems(
+    ms: dpdata.MultiSystems, ratio: float
+) -> Tuple[dpdata.MultiSystems, dpdata.MultiSystems]:
+    ratio = max(0.0, min(1.0, float(ratio)))
+    selected_ms = dpdata.MultiSystems()
+    unselected_ms = dpdata.MultiSystems()
+    for s in ms:
+        n = len(s)
+        nsel = math.floor(n * ratio)
+        # randomized rounding: round up with probability equal to the fractional part
+        if random.random() < n * ratio - nsel:
+            nsel += 1
+        if nsel > 0:
+            selected_indices = sorted(random.sample(range(n), nsel))
+        else:
+            selected_indices = []
+        selected_set = set(selected_indices)
+        # preserve original ordering for unselected indices
+        unselected_indices = [i for i in range(n) if i not in selected_set]
+        if selected_indices:
+            selected_ms.append(s.sub_system(selected_indices))
+        if unselected_indices:
+            unselected_ms.append(s.sub_system(unselected_indices))
+    return selected_ms, unselected_ms

Optional: expose a seed or RNG to make the split reproducible across runs.
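
As a sketch of the optional seeding idea, the same randomized-rounding split can be written over plain index lists, with no dpdata dependency. `split_indices` and its `rng` parameter are hypothetical names for this illustration, not part of the PR.

```python
import math
import random
from typing import List, Optional, Tuple


def split_indices(
    n: int, ratio: float, rng: Optional[random.Random] = None
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: split range(n) into (selected, unselected)."""
    rng = rng or random.Random()
    # Clamp defensively so random.sample never sees an invalid count.
    ratio = max(0.0, min(1.0, float(ratio)))
    nsel = math.floor(n * ratio)
    # Randomized rounding: round up with probability equal to the
    # fractional part, so the expected count is n * ratio.
    if rng.random() < n * ratio - nsel:
        nsel += 1
    selected = sorted(rng.sample(range(n), nsel))
    chosen = set(selected)
    # Keep the original frame order for the unselected part.
    unselected = [i for i in range(n) if i not in chosen]
    return selected, unselected


selected, unselected = split_indices(10, 0.5, random.Random(0))
```

Passing a `random.Random` instance built from a fixed seed makes the split repeatable, while omitting it falls back to a fresh, unseeded generator.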

dpgen2/superop/block.py (2)

277-301: Archive async_confs from select_confs to ensure artifact persistence

PythonOPTemplate currently archives only “confs”. If “async_confs” must be persisted (e.g., downloaded or passed via artifact store), include it in output_artifact_archive as well.

Apply this diff:

         select_confs = Step(
             name=name + "-select-confs",
             template=PythonOPTemplate(
                 select_confs_op,
-                output_artifact_archive={"confs": None},
+                output_artifact_archive={"confs": None, "async_confs": None},
                 python_packages=upload_python_packages,
                 **select_confs_template_config,
             ),

Please confirm whether dflow requires explicit listing for all output artifacts to be archived. If not required, this change is still harmless and makes intent explicit.


143-151: Expose async prep-run-fp key for query tooling (consistency with keys contract)

If your dflow_query logic relies on ConcurrentLearningBlock.keys, consider including the async variant so users/tools can discover it uniformly.

Apply this diff:

         self._my_keys = ["select-confs", "collect-data"]
         self._keys = (
             prep_run_dp_train_op.keys
             + prep_run_explore_op.keys
             + self._my_keys[:1]
             + prep_run_fp_op.keys
             + self._my_keys[1:2]
         )
+        if async_fp:
+            # Advertise async prep-run-fp key variant for discoverability
+            self._keys = self._keys + [f"{k}-async" for k in prep_run_fp_op.keys]

If dflow_query was explicitly updated to synthesize “-async” keys, you can skip this; otherwise this keeps the API self-descriptive.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between edbaa44 and 4339411.

📒 Files selected for processing (12)
  • dpgen2/entrypoint/args.py (2 hunks)
  • dpgen2/entrypoint/submit.py (9 hunks)
  • dpgen2/exploration/selector/conf_selector.py (1 hunks)
  • dpgen2/exploration/selector/conf_selector_frame.py (4 hunks)
  • dpgen2/flow/dpgen_loop.py (5 hunks)
  • dpgen2/op/collect_data.py (2 hunks)
  • dpgen2/op/select_confs.py (3 hunks)
  • dpgen2/superop/block.py (8 hunks)
  • dpgen2/utils/dflow_query.py (1 hunks)
  • tests/exploration/test_conf_selector_frame.py (5 hunks)
  • tests/mocked_ops.py (3 hunks)
  • tests/test_block_cl.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
dpgen2/exploration/selector/conf_selector.py (1)
dpgen2/exploration/report/report.py (1)
  • ExplorationReport (18-102)
dpgen2/op/select_confs.py (3)
dpgen2/exploration/selector/conf_selector_frame.py (1)
  • select (61-124)
tests/mocked_ops.py (2)
  • select (864-892)
  • select (902-914)
dpgen2/exploration/selector/conf_selector.py (1)
  • select (34-41)
tests/exploration/test_conf_selector_frame.py (3)
dpgen2/exploration/selector/conf_selector_frame.py (1)
  • select (61-124)
tests/mocked_ops.py (2)
  • select (864-892)
  • select (902-914)
dpgen2/exploration/render/traj_render_lammps.py (1)
  • TrajRenderLammps (40-134)
tests/test_block_cl.py (4)
tests/mocked_ops.py (7)
  • MockedPrepVasp (432-473)
  • MockedRunVasp (476-528)
  • make_mocked_init_models (109-115)
  • make_mocked_init_data (118-124)
  • MockedAsyncConfSelector (895-914)
  • MockedSelectConfs (917-940)
  • MockedCollectData (645-675)
dpgen2/superop/prep_run_fp.py (2)
  • PrepRunFp (49-125)
  • keys (124-125)
dpgen2/fp/vasp_input.py (1)
  • VaspInputs (23-110)
dpgen2/superop/block.py (2)
  • ConcurrentLearningBlock (87-190)
  • keys (189-190)
dpgen2/exploration/selector/conf_selector_frame.py (3)
tests/mocked_ops.py (2)
  • select (864-892)
  • select (902-914)
dpgen2/exploration/selector/conf_selector.py (1)
  • select (34-41)
dpgen2/exploration/report/report.py (1)
  • ExplorationReport (18-102)
tests/mocked_ops.py (2)
dpgen2/exploration/selector/conf_selector.py (2)
  • ConfSelector (30-41)
  • select (34-41)
dpgen2/exploration/selector/conf_selector_frame.py (1)
  • select (61-124)
🪛 Ruff (0.12.2)
tests/test_block_cl.py

499-499: Undefined name MockedAsyncConfSelector

(F821)

🪛 GitHub Actions: Python unit-tests
tests/test_block_cl.py

[error] 499-499: NameError: name 'MockedAsyncConfSelector' is not defined during test setup (dpgen2/dpgen2/tests/test_block_cl.py:499).

🔇 Additional comments (15)
dpgen2/entrypoint/args.py (1)

489-495: Consider validating async_ratio ∈ [0, 1]

Given it’s a ratio, values outside [0, 1] should be rejected or clamped before downstream logic uses it. If dargs lacks built-in range validators here, we can add a lightweight check where the value is consumed (e.g., in the selector or block). For this file, the doc update above is sufficient, but please ensure a runtime guard exists downstream.

Would you like me to open a follow-up patch adding a guard where async_ratio is used to split MultiSystems?
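
One possible shape for such a runtime guard is sketched below. `check_async_ratio` is a hypothetical helper, not something the PR or dargs provides; it only shows the range check itself.

```python
def check_async_ratio(ratio: float) -> float:
    """Hypothetical guard: return ratio as a float, or raise if outside [0, 1]."""
    value = float(ratio)
    # NaN fails both comparisons, so it is rejected along with
    # negative values and values above 1.
    if not (0.0 <= value <= 1.0):
        raise ValueError(f"async_ratio must be within [0, 1], got {ratio!r}")
    return value


ok_ratio = check_async_ratio(0.3)

try:
    check_async_ratio(1.5)
    rejected = False
except ValueError:
    rejected = True
```

Raising instead of clamping surfaces a misconfigured ratio immediately rather than silently changing how many configurations go to the async path.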

dpgen2/utils/dflow_query.py (1)

41-41: Verify async-sliced step patterns
No existing keys matching the “async–…–slice” form were found in the codebase, but absence of evidence is not evidence of absence. Please manually confirm whether any async‐sliced step keys are ever generated or consumed at runtime. If you do need to support them, apply the following updates in dpgen2/utils/dflow_query.py:

• In matched_step_key(), extend the async match to include a slice suffix:

@@ def matched_step_key(all_keys: List[str], step_keys: Optional[List[str]] = None,):
                 or re.match(f"iter-[0-9]*-async--{jj}", kk)
+                or re.match(f"iter-[0-9]*-async--{jj}-[0-9]*", kk)

• In find_slice_ranges(), allow an optional “-async” when detecting slices:

@@ def find_slice_ranges(keys: List[str], sliced_subkey: str,):
-            if re.match(f"iter-[0-9]*--{sliced_subkey}-[0-9]*", ii):
+            if re.match(f"iter-[0-9]*(-async)?--{sliced_subkey}-[0-9]*", ii):
@@ def find_slice_ranges(keys: List[str], sliced_subkey: str,):
-            if not re.match(f"iter-[0-9]*--{sliced_subkey}-[0-9]*", ii):
+            if not re.match(f"iter-[0-9]*(-async)?--{sliced_subkey}-[0-9]*", ii):

Please review whether any async-sliced steps exist or could be introduced, and apply these changes if needed.
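
To see what the patterns quoted above accept, here is a small sketch using the standard `re` module. The two helper functions and the example keys are hypothetical; only the pattern strings are taken from the diffs in this comment.

```python
import re


def is_async_step(key: str, step: str) -> bool:
    """True if key looks like an async variant of the given step key."""
    return re.match(f"iter-[0-9]*-async--{step}", key) is not None


def is_sliced_step(key: str, subkey: str) -> bool:
    """True if key is a sliced step, with or without the -async marker."""
    return re.match(f"iter-[0-9]*(-async)?--{subkey}-[0-9]*", key) is not None


# Hypothetical keys, shaped after the patterns in the review.
async_hit = is_async_step("iter-000001-async--prep-run-fp", "prep-run-fp")
sync_miss = is_async_step("iter-000001--prep-run-fp", "prep-run-fp")
sliced_sync = is_sliced_step("iter-000001--run-fp-000003", "run-fp")
sliced_async = is_sliced_step("iter-000001-async--run-fp-000003", "run-fp")
unsliced = is_sliced_step("iter-000001--run-fp", "run-fp")
```

Note that `re.match` anchors only at the start of the string, so these patterns also accept keys with trailing text; `re.fullmatch` would be the stricter choice if that matters.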

dpgen2/op/select_confs.py (2)

52-53: Confirm dflow semantics for empty artifacts; consider making async_confs optional

When async_ratio == 0, implementations return an empty list for async_confs. Some dflow setups treat empty Artifact(List[Path]) as invalid. If that’s the case here, mark async_confs optional.

If needed, apply:

-                "async_confs": Artifact(List[Path]),
+                "async_confs": Artifact(List[Path], optional=True),

92-104: LGTM: triple return wiring and propagation

Unpacking (confs, async_confs, report) and exposing async_confs in outputs matches the new ConfSelector API and keeps backward compatibility for consumers of confs/report.

tests/test_block_cl.py (1)

448-601: Async FP E2E test looks solid and exercises the new data path.

Good coverage: validates the async prep-run-fp step output and the final async_confs artifact on the block step. The assertions match the mocked FP behavior.

dpgen2/flow/dpgen_loop.py (2)

191-192: Threading async_confs into loop inputs is correct.

Adding async_confs as an optional InputArtifact in both ConcurrentLearningLoop and ConcurrentLearning is consistent and backward-compatible.

Also applies to: 283-284


381-382: Correctly wiring async_confs through block and next steps.

Passing inputs.async_confs to the block, and the block’s async_confs to the next iteration ensures persistence of async inflow across iterations.

Also applies to: 454-455, 559-560

tests/exploration/test_conf_selector_frame.py (1)

102-105: Signature adaptation to triple return (confs, async_confs, report) is good.

Using the throwaway placeholder for async_confs in existing tests maintains backward compatibility while adopting the new API.

Also applies to: 127-130, 153-156, 183-185

tests/mocked_ops.py (1)

926-939: MockedSelectConfs now emits async_confs as expected.

Good adaptation of the mocked operator to the extended select() contract and outputs schema.

dpgen2/entrypoint/submit.py (3)

156-174: Public API addition async_fp is clear and well-scoped.

Exposing async_fp on make_concurrent_learning_op is the right place to control the block behavior.


262-273: Correctly forwarding async_fp to ConcurrentLearningBlock.

This ensures the block can enable the async pipeline based on config.


672-676: Providing an empty async_confs input artifact is sensible.

Seeding the DPGEN step with an empty async_confs ensures the loop’s optional artifact wiring remains valid even when not used at iteration 0.

dpgen2/superop/block.py (3)

99-101: LGTM: async_fp flag adds an opt‑in path without breaking defaults

Default False preserves existing behavior. Good backward compatibility.


119-129: Inputs/outputs for async_confs are well introduced

Adding async_confs as an optional input and as an output aligns with the new async flow.


355-357: LGTM: wiring async_confs to block outputs

Output artifact correctly sources from select_confs, enabling loop propagation.

Comment on lines +53 to 60
        async_ratio: float = 0.0,
    ):
        self.max_numb_sel = max_numb_sel
        self.conf_filters = conf_filters
        self.traj_render = traj_render
        self.report = report
        self.async_ratio = async_ratio

🛠️ Refactor suggestion

Validate async_ratio in constructor to prevent out-of-bounds selection and runtime errors

Negative or >1 ratios can lead to ValueError in random.sample and undefined split behavior. Guard early.

Apply this diff:

     def __init__(
         self,
         traj_render: TrajRender,
         report: ExplorationReport,
         max_numb_sel: Optional[int] = None,
         conf_filters: Optional[ConfFilters] = None,
         async_ratio: float = 0.0,
     ):
         self.max_numb_sel = max_numb_sel
         self.conf_filters = conf_filters
         self.traj_render = traj_render
         self.report = report
-        self.async_ratio = async_ratio
+        # Validate [0, 1] to keep downstream splitting safe
+        if not (0.0 <= async_ratio <= 1.0):
+            raise ValueError(f"async_ratio must be within [0, 1], got {async_ratio}")
+        self.async_ratio = float(async_ratio)
🤖 Prompt for AI Agents
In dpgen2/exploration/selector/conf_selector_frame.py around lines 53 to 60,
validate the async_ratio constructor argument to ensure it is within [0.0, 1.0];
if async_ratio < 0.0 or async_ratio > 1.0 raise a ValueError with a clear
message indicating the invalid value and expected range so invalid ratios cannot
later cause random.sample or splitting errors.
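
The validation and split discussed in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `validate_ratio` and `split_indices` are hypothetical helper names, and the split works on plain frame indices rather than on the selector's actual data structures.

```python
import random
from typing import List, Optional, Tuple


def validate_ratio(async_ratio: float) -> float:
    """Reject ratios outside [0, 1] before any splitting happens."""
    if not (0.0 <= async_ratio <= 1.0):
        raise ValueError(f"async_ratio must be within [0, 1], got {async_ratio}")
    return float(async_ratio)


def split_indices(
    n_frames: int, async_ratio: float, seed: Optional[int] = None
) -> Tuple[List[int], List[int]]:
    """Split range(n_frames) into (sync, async) index lists (hypothetical helper)."""
    ratio = validate_ratio(async_ratio)
    # For a valid ratio the sample size always lies in [0, n_frames],
    # so random.sample cannot be asked for more items than exist.
    n_async = int(n_frames * ratio)
    rng = random.Random(seed)
    async_idx = sorted(rng.sample(range(n_frames), n_async))
    chosen = set(async_idx)
    sync_idx = [i for i in range(n_frames) if i not in chosen]
    return sync_idx, async_idx
```

Validating once in the constructor, as the comment proposes, means a bad ratio fails at configuration time rather than in the middle of a selection step.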

Comment on lines +53 to 54
                "async_labeled_data": Artifact(List[Path], optional=True),
                "iter_data": Artifact(List[Path]),

⚠️ Potential issue

Guard optional async_labeled_data to avoid KeyError; simplify concatenation

Accessing ip["async_labeled_data"] raises KeyError when the optional artifact is absent. Use .get() and default to an empty list. Then drop the inner “or []” in the concatenation.

Apply:

@@ class CollectData(OP):
                 "async_labeled_data": Artifact(List[Path], optional=True),
@@ def execute(self, ip: OPIO) -> OPIO:
-        async_labeled_data = ip["async_labeled_data"]
+        async_labeled_data = ip.get("async_labeled_data") or []
@@ def execute(self, ip: OPIO) -> OPIO:
-        for ii in labeled_data + (async_labeled_data or []):
+        for ii in labeled_data + async_labeled_data:

Also applies to: 95-100

🤖 Prompt for AI Agents
In dpgen2/op/collect_data.py around lines 53-54 (and similarly lines 95-100),
the code indexes the optional artifact ip["async_labeled_data"], which raises
KeyError when the key is absent; change that access to
ip.get("async_labeled_data") or [] so that a missing key or a None value yields
an empty list, then simplify the loop to iterate over
labeled_data + async_labeled_data without the inner "or []" fallback.
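
The None-safe access pattern suggested here can be sketched with a plain dict standing in for the OP input. This is an assumption made for illustration: the real `OPIO` object is not used, and `collect_paths` is a hypothetical helper rather than code from the PR.

```python
from pathlib import Path
from typing import Any, List, Mapping


def collect_paths(ip: Mapping[str, Any]) -> List[Path]:
    """Concatenate a required and an optional path list from a dict-like input."""
    # Required input: a KeyError here is the intended failure mode.
    labeled_data = list(ip["labeled_data"])
    # Optional input: covers both a missing key and an explicit None value.
    async_labeled_data = list(ip.get("async_labeled_data") or [])
    return labeled_data + async_labeled_data
```

Note that `ip.get(key, [])` alone only handles a missing key; if the key is present with a `None` value it still returns `None`, which is why the sketch uses `ip.get(key) or []`.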

Comment on lines +239 to +257
    if async_fp:
        async_fp_step = Step(
            name=name + "-async-prep-run-fp",
            template=prep_run_fp_op,
            parameters={
                "block_id": f"{block_steps.inputs.parameters['block_id']}-async",
                "fp_config": block_steps.inputs.parameters["fp_config"],
                "type_map": block_steps.inputs.parameters["type_map"],
            },
            artifacts={
                "confs": block_steps.inputs.artifacts["async_confs"],
            },
            key=f"{block_steps.inputs.parameters['block_id']}-async--prep-run-fp",
        )
        block_steps.add([prep_run_dp_train, async_fp_step])
        async_labeled_data = async_fp_step.outputs.artifacts["labeled_data"]
    else:
        block_steps.add(prep_run_dp_train)
        async_labeled_data = None

🛠️ Refactor suggestion

⚠️ Potential issue

Gate the async FP step when no async_confs are provided to avoid runtime failures

If async_fp is True but the block receives no async_confs (typical for the first iteration), PrepRunFp may receive a missing/empty artifact and fail. Use an Argo “when” guard leveraging argo_len so the step runs only when input exists.

Apply this diff:

     if async_fp:
         async_fp_step = Step(
             name=name + "-async-prep-run-fp",
             template=prep_run_fp_op,
             parameters={
                 "block_id": f"{block_steps.inputs.parameters['block_id']}-async",
                 "fp_config": block_steps.inputs.parameters["fp_config"],
                 "type_map": block_steps.inputs.parameters["type_map"],
             },
             artifacts={
                 "confs": block_steps.inputs.artifacts["async_confs"],
             },
+            # Run only when upstream provided async_confs
+            when=argo_len(block_steps.inputs.artifacts["async_confs"]) > 0,
             key=f"{block_steps.inputs.parameters['block_id']}-async--prep-run-fp",
         )
         block_steps.add([prep_run_dp_train, async_fp_step])
         async_labeled_data = async_fp_step.outputs.artifacts["labeled_data"]
     else:

Follow-up: if you adopt the conf_selector change to return [] when confs are empty, consider adding a similar guard to the normal prep-run-fp step keyed by the length of select_confs.outputs.artifacts["confs"] to keep the DAG robust when the split is 100% async.

🤖 Prompt for AI Agents
In dpgen2/superop/block.py around lines 239-257, the async FP Step is created
and added unconditionally when async_fp is True which can cause PrepRunFp to
fail if the artifact async_confs is missing/empty; guard the async_fp_step with
an Argo "when" condition that checks
argo_len(block_steps.inputs.artifacts['async_confs']) > 0 so the step only
executes when async_confs exists and has items (apply the same condition to its
key/name logic as needed), and as a follow-up consider adding a similar
argo_len-based "when" guard to the normal prep-run-fp step using
select_confs.outputs.artifacts['confs'] to protect against 100% async splits.
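
The intent of the guard can be illustrated without any workflow engine. The sketch below is plain Python and deliberately does not call dflow or Argo; `run_block` and the `run_fp` callable are hypothetical stand-ins for the block and the FP step, and the real condition would be evaluated by the workflow engine at run time rather than in Python.

```python
from typing import Callable, List, Optional


def run_block(
    async_fp: bool,
    async_confs: Optional[List[str]],
    run_fp: Callable[[List[str]], List[str]],
) -> List[str]:
    """Return async labeled data, skipping the FP call when there is no input."""
    if not async_fp:
        return []  # async path disabled by configuration
    if not async_confs:
        return []  # enabled, but nothing to label yet (e.g. the first iteration)
    return run_fp(async_confs)
```

The two early returns correspond to the two conditions in the comment: the build-time `async_fp` flag decides whether the step exists at all, and the run-time emptiness check decides whether it executes.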

codecov bot commented Aug 26, 2025

Codecov Report

❌ Patch coverage is 95.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.28%. Comparing base (edbaa44) to head (f8d6ca1).
⚠️ Report is 1 commit behind head on master.

Files with missing lines                              Patch %   Lines
dpgen2/entrypoint/submit.py                           66.66%    1 Missing ⚠️
dpgen2/exploration/selector/conf_selector_frame.py    96.15%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #308      +/-   ##
==========================================
+ Coverage   84.22%   84.28%   +0.05%     
==========================================
  Files         104      104              
  Lines        6112     6148      +36     
==========================================
+ Hits         5148     5182      +34     
- Misses        964      966       +2     
