“Actor object is missing” in GRPO example run #577

@zch42

🐛 Describe the bug

Following the installation tutorial (conda env setup), I ran:

python -m apps.grpo.main --config apps/grpo/qwen3_1_7b.yaml

With the default TORCHSTORE_RDMA_ENABLED=1, I first hit an RDMA-related failure that seems related to this issue.

After disabling RDMA (TORCHSTORE_RDMA_ENABLED=0), the run still failed with:

ActorError: A remote actor call has failed.
AssertionError: Actor object is missing when executing init_backends
Actor object is missing … local_fetcher_actor error

It appears that by the time register_fetcher runs, the policy mesh or its local_fetcher_actor has failed to start, so Monarch reports “actor object is missing” when metric loggers attempt to access it.

Versions

  • torch: 2.9.0+cu128
  • torchmonarch: 0.1.2
  • torchtitan: 0.2.0
  • vLLM: 0.10.1.dev0 (commit 6d8d0a24c, built 2025-11-14)
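
For anyone trying to compare environments, the version list above can be gathered with a short script. This is only a minimal sketch: the distribution names (`torch`, `torchmonarch`, `torchtitan`, `vllm`) are assumed from the list above and may differ from what pip reports on a given install, and `collect_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names, taken from the version list in this report.
PACKAGES = ["torch", "torchmonarch", "torchtitan", "vllm"]


def collect_versions(packages):
    """Map each package name to its installed version, or 'not installed'."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package metadata is absent from the current environment.
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in collect_versions(PACKAGES).items():
        print(f"{name}: {ver}")
```

Running it inside the same conda env used for the GRPO example should make it easier to spot a mismatch between the installed packages and the ones the tutorial expects.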
