
@erranlli
Contributor

Summary

Implements graceful degradation for prompts exceeding max_prompt_length, preventing training crashes.

Problem

Training crashed when prompts exceeded the max length:

    Exception: Trajectory {idx}: initial prompt length 3302 already exceeded max_prompt_length 2048, retrying

Solution

  • ✅ Overlong prompts return None and are skipped gracefully
  • ✅ Batch size dynamically adjusts to match the number of surviving trajectories

Key Changes

  1. agent_execution_engine.py: Return None for overlong prompts instead of crashing (sketched below)
  2. agent_ppo_trainer.py: Track skipped indices and filter batch to match
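
A minimal sketch of the engine-side behavior, assuming the prompt is already tokenized before rollout; the helper name and logging here are illustrative rather than the actual agent_execution_engine.py code:

    import logging

    logger = logging.getLogger(__name__)

    def prompt_is_overlong(prompt_ids, max_prompt_length):
        """Return True when the prompt should be skipped rather than raised on."""
        if len(prompt_ids) > max_prompt_length:
            logger.warning(
                "Skipping trajectory: prompt length %d exceeds max_prompt_length %d",
                len(prompt_ids),
                max_prompt_length,
            )
            return True
        return False

    # Inside the engine, the trajectory builder would then do roughly:
    #   if prompt_is_overlong(prompt_ids, max_prompt_length):
    #       return None  # caller treats None as "skipped", not as an error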

Benefits

  • Training continues instead of failing
  • No NaN gradients from division by zero
  • Dynamic batch size adjustment (sketched below)
  • Clean and simple implementation
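
The batch-side adjustment can be pictured with a small sketch; filter_skipped below is an illustrative helper, not the actual agent_ppo_trainer.py code, and it assumes trajectories arrive as a list parallel to the batch rows:

    def filter_skipped(trajectories, batch_rows):
        """Drop rows whose trajectory was skipped so the batch stays aligned."""
        skipped_indices = [i for i, t in enumerate(trajectories) if t is None]
        kept = [(t, row) for t, row in zip(trajectories, batch_rows) if t is not None]
        if not kept:
            # An all-skipped batch would otherwise divide by zero when the loss
            # is averaged, which is where NaN gradients would come from.
            return [], [], skipped_indices
        kept_trajectories, kept_rows = map(list, zip(*kept))
        return kept_trajectories, kept_rows, skipped_indices

    # Example: one overlong prompt out of three is skipped, batch shrinks to match.
    trajs = [{"id": 0}, None, {"id": 2}]
    rows = ["row0", "row1", "row2"]
    kept_trajs, kept_rows, skipped = filter_skipped(trajs, rows)
    # kept_trajs == [{'id': 0}, {'id': 2}], kept_rows == ['row0', 'row2'], skipped == [1]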

Testing

bash examples/deepscaler/test_graceful_degradation.sh

Expected: Training continues with warnings when the 3302-token prompt is encountered; no crashes.

Files Changed

  • rllm/engine/agent_execution_engine.py (graceful degradation)
  • rllm/trainer/verl/agent_ppo_trainer.py (batch alignment)

@LianShuQuan
Contributor

Given this in generate_agent_trajectories_async():

                async for item in self.agent_execution_engine.trajectory_generator(timing_raw=timing_raw, mode=mode, meta_info=meta_info):
                    # This item cannot be None; overlong prompts are skipped instead.
                    queue.put(item)

and because None is already handled here:

            if item is None:
                break

the None check added in generate_agent_trajectory():

                for trajectory in gen_seq_generator:
                    # Skip None trajectories (overlong prompts)
                    if trajectory is not None:
                        trajectories.append(trajectory)

is not necessary.
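
For reference, a self-contained toy version of the flow being discussed, assuming None is used as an end-of-stream sentinel on the queue (function names here are illustrative): the generator drops overlong prompts before anything is enqueued, so the only None the consumer ever sees is the sentinel, and an extra not-None filter downstream never fires.

    import queue

    def producer(items, q):
        # Stands in for trajectory_generator: overlong prompts are dropped here,
        # so no None is ever enqueued as data.
        for item in items:
            if item is not None:
                q.put(item)
        q.put(None)  # sentinel meaning "no more items", not a skipped trajectory

    def consumer(q):
        trajectories = []
        while True:
            item = q.get()
            if item is None:  # the sentinel check is the only None handling needed
                break
            trajectories.append(item)
        return trajectories

    q = queue.Queue()
    producer([{"id": 0}, None, {"id": 1}], q)  # the None mimics a skipped overlong prompt
    print(consumer(q))  # [{'id': 0}, {'id': 1}]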
