feat(AsyncExecutor): AsyncExecutor allows adding tasks after shutdown, which may cause unpredictable exceptions. #1092

CLFutureX · 2025-11-07T03:39:43Z

Background:
AsyncExecutor currently still accepts new tasks after it has been shut down. This can lead to unpredictable exceptions.
Optimizations:
1. Add closed-loop logic to improve the lifecycle management of the async executor.
2. Add pre-checks to reduce concurrent locking overhead.
3. Strengthen validation of the current loop, requiring it to be in a running state.
4. Optimize the waiting process for loop startup.
5. Fix resource leaks that can occur with the current shutdown logic.
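
The failure mode described under Background can be illustrated with plain asyncio, independent of AsyncExecutor itself. The sketch below is only an assumption about what "unpredictable" may look like in practice: a submission to a loop that is no longer running is silently accepted and never executed, while a submission to a closed loop fails immediately. The names `work`, `stopped_outcome`, and `closed_outcome` are hypothetical.

```python
import asyncio
import concurrent.futures


async def work():
    return 42


# Case 1: the loop exists but is not running (for example, already stopped).
# The submission is accepted, but nothing ever executes it.
loop = asyncio.new_event_loop()
coro = work()
fut = asyncio.run_coroutine_threadsafe(coro, loop)
try:
    fut.result(timeout=0.1)
    stopped_outcome = "completed"
except concurrent.futures.TimeoutError:
    stopped_outcome = "timed out"
fut.cancel()
coro.close()  # never scheduled, so close it to avoid a "never awaited" warning
loop.close()

# Case 2: the loop has been closed. Submission now fails immediately.
coro = work()
try:
    asyncio.run_coroutine_threadsafe(coro, loop)
    closed_outcome = "accepted"
except RuntimeError as exc:
    closed_outcome = f"RuntimeError: {exc}"
finally:
    coro.close()
```

Which of the two cases a caller hits depends on how far shutdown has progressed, which is why an explicit shutdown flag gives more predictable behavior than relying on the loop's own state.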

Signed-off-by: CLFutureX <[email protected]>

CLFutureX · 2025-11-07T03:40:56Z

@ryanhoangt @xingyaoww Hey, PTAL. Thanks!

github-actions

Based on my analysis of the AsyncExecutor changes, here is my review:

Issues Found

🔴 Critical Issues

Race condition in _ensure_loop() early return (lines 32-33): The optimization that checks if self._loop is not None and self._loop.is_running() BEFORE acquiring the lock introduces a critical race condition. Between checking if the loop is running and returning it, another thread could call _shutdown_loop(), making the returned loop invalid. This violates the "thread safety" guarantee stated in the class docstring.

Impact: A caller could receive a reference to a loop that's being shut down or already stopped, leading to RuntimeErrors when trying to schedule tasks.

Recommendation: Remove the early return at lines 32-33, or move the check inside the lock to maintain thread safety.

Bypass of shutdown check: The early return at line 33 bypasses the _shutdown.is_set() check at lines 35-36. This means that if shutdown has been initiated but the loop hasn't fully stopped yet, _ensure_loop() could still return the loop, allowing new tasks to be scheduled during shutdown.

Recommendation: Ensure the shutdown check always happens before returning a loop reference.
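
A minimal sketch of the second option in the recommendations above (keeping every check inside the lock) might look like the following. `LoopOwner` is a hypothetical stand-in, not the SDK's AsyncExecutor; only the names `_ensure_loop`, `_lock`, `_shutdown`, and `_loop` are taken from the review text, and the rest is assumed for illustration.

```python
import asyncio
import threading


class LoopOwner:
    """Hypothetical stand-in for AsyncExecutor's loop management."""

    def __init__(self):
        self._lock = threading.Lock()
        self._shutdown = threading.Event()
        self._loop = None
        self._thread = None

    def _ensure_loop(self):
        # Every read and write of self._loop happens under the lock, and the
        # shutdown flag is checked before any loop reference is handed out.
        with self._lock:
            if self._shutdown.is_set():
                raise RuntimeError("AsyncExecutor has been shut down")
            if self._loop is None:
                self._loop = asyncio.new_event_loop()
                self._thread = threading.Thread(
                    target=self._loop.run_forever, daemon=True
                )
                self._thread.start()
            return self._loop

    def close(self):
        with self._lock:
            if self._shutdown.is_set():
                return
            self._shutdown.set()
            loop, thread = self._loop, self._thread
            self._loop = self._thread = None
        if loop is None:
            return
        # Stop and join outside the lock so other callers are not blocked.
        loop.call_soon_threadsafe(loop.stop)
        thread.join(timeout=5)
        if not loop.is_running():
            loop.close()
```

Note that a reference returned by `_ensure_loop()` can still go stale once the lock is released; the sketch only guarantees that the shutdown check is never skipped.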

🟡 Important Issues

Typo in error message (line 36): Error message says "asyncExecutor has been shut down" but should be "AsyncExecutor" (capital A) for consistency with the class name.

String continuation style (lines 41-42): The warning message uses backslash line continuation (\), which is not the preferred Python style. Implicit string concatenation with parentheses is cleaner and more maintainable.

# Current:
logger.warning("The loop is not empty, but it is not in a running state." \
" Under normal circumstances, this should not happen.")

# Better:
logger.warning(
    "The loop is not empty, but it is not in a running state. "
    "Under normal circumstances, this should not happen."
)

Questionable "should not happen" warning: The warning at lines 41-42 claims this scenario "should not happen under normal circumstances," but the new early return optimization actually makes this more likely due to potential race conditions.

🟢 Minor Issues

Import ordering (line 7): import time is placed after from collections.abc import Callable, breaking the convention of grouping standard library imports alphabetically. Should be moved up to line 4 (after import threading).
Double-check locking optimization may be premature: The performance benefit of the early return check (lines 32-33) is likely minimal for this use case, while it introduces significant correctness risks. Consider whether this optimization is actually needed based on profiling data.

CLFutureX · 2025-11-08T13:28:24Z

Good analysis – this is indeed an area that needs further optimization. Here's the new adjustment: since the loop is a shared resource, we should add a lock for every write operation on it.

CLFutureX · 2025-11-08T13:34:26Z

@openhands-agent please help review it

github-actions

Based on my analysis of the PR changes, here is my review:

Issues Found

🔴 Critical Issues

Event loop resource leak in _shutdown_loop (lines 62-81): The method stops the event loop but never calls loop.close(), causing a ResourceWarning: unclosed event loop. After stopping the loop and joining the thread, you must add loop.close() to properly clean up resources.
```
# After line 81, add:
if loop and not loop.is_closed():
    loop.close()
```
Missing error handling for loop.close() (line 43): In _safe_execute_on_loop, loop.close() is called without a try-except block. If the loop has pending callbacks or is in an invalid state, this will raise an exception and crash the method while holding the lock. Wrap this in proper error handling.
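
As a hedged illustration of the error handling requested above, the helper below wraps `loop.close()` so that a failure is logged rather than propagated. `close_loop_safely` and its boolean return value are hypothetical and not part of the PR.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


def close_loop_safely(loop):
    """Hypothetical helper: close a loop without letting errors escape."""
    if loop.is_closed():
        return True
    try:
        loop.close()
    except RuntimeError as exc:
        # Raised, for example, when the loop is still running.
        logger.warning("Could not close event loop: %s", exc)
        return False
    return True
```

Returning a flag instead of raising lets the caller decide whether to retry after the loop thread has been joined.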

🟡 Important Issues

Coroutine leak when run_async is called with pre-created coroutine after shutdown (lines 106-113): If a user creates a coroutine and then calls run_async(coro) after shutdown, the coroutine will never be awaited, causing a RuntimeWarning: coroutine was never awaited. When raising RuntimeError at line 107, you should check if awaitable_or_fn is already a coroutine and close it with awaitable_or_fn.close().
Misleading log message in _shutdown_loop (line 65): The message "AsyncExecutor has been shutdown" implies past tense, but it's checking if shutdown was already called. Change to "AsyncExecutor is already shut down" or "Shutdown already in progress" for clarity.
No timeout for loop startup wait (lines 55-56): The while not loop.is_running() loop could potentially hang forever if the thread fails to start properly. Add a timeout mechanism (e.g., 5 seconds) and raise an exception if the loop doesn't start in time.
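
The coroutine-leak item above can be sketched as follows. `reject_submission` is a hypothetical helper, and `awaitable_or_fn` mirrors the parameter name quoted in the review; the actual run_async signature is not shown in this thread.

```python
import inspect


def reject_submission(awaitable_or_fn):
    """Hypothetical helper called when the executor is already shut down."""
    if inspect.iscoroutine(awaitable_or_fn):
        # Closing marks the coroutine as finished, so the interpreter does not
        # emit "coroutine ... was never awaited" when it is garbage-collected.
        awaitable_or_fn.close()
    raise RuntimeError("AsyncExecutor has been shut down")
```

Plain callables are left untouched, since no coroutine object exists until they are called.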

🟢 Minor Issues

Confusing warning message wording (line 41): "The loop is not empty" is unclear. Consider changing to "The loop exists but is not running. This usually indicates improper cleanup." for better clarity.
Minor typo in branch name: Branch is named fix_acync_executor instead of fix_async_executor ('c' typed in place of 's').

enyst · 2025-11-08T17:17:30Z

@openhands-agent please help review it

Just a note, sorry: I'm not sure whether the tag works right now on agent-sdk. If it does, it runs in GitHub Actions on the repo, like the 'ask for review' that I just did, so it will require write permissions on the repo and probably a repo branch rather than a fork branch. But you could install it on your own repo and run it. 😅

CLFutureX · 2025-11-10T02:19:11Z

@enyst All adjusted, PTAL.

enyst · 2025-11-10T02:57:21Z

@OpenHands Do a /codereview-roasted on this PR.

Every time you answer the user, use markdown with headings, bold, and such, because it’s more readable. Introduce yourself in the first line too, so the reader knows who is speaking.

openhands-ai · 2025-11-10T02:57:30Z

I'm on it! enyst can track my progress at all-hands.dev

openhands-ai · 2025-11-10T03:25:20Z

Hi, I’m OpenHands — your brutally honest code reviewer. Here’s a concise summary of my “/codereview-roasted” on PR #1092.

What I Did

Used the GitHub API to fetch PR metadata and the diff for PR feat(AsyncExecutor): AsyncExecutor allows adding tasks after shutdown, which may cause unpredictable exceptions. #1092.
Reviewed the changes in openhands-sdk/openhands/sdk/utils/async_executor.py.

Key Findings

Critical: Timeout doesn’t cancel tasks
- run_async calls fut.result(timeout) without canceling on asyncio.TimeoutError. The task continues running in the background and can leak resources.
Critical: Shutdown race window
- _shutdown.set() is set after taking the lock and clearing loop/thread. There’s a narrow window where submissions can slip through before the shutdown flag is visible.
Behavior Change: close() becomes one-way
- Previously, the executor could lazily recreate the loop after close(). Now, once closed, it’s permanently disabled. This is a breaking change unless explicitly documented.
Concurrency/Design: Lock held too long and callback-under-lock
- Loop startup waits while holding the class lock (even with sleep), and the callback-under-lock pattern complicates lifecycle vs scheduling. Both increase contention and complexity.
Shutdown completeness
- Shutdown stops and closes the loop but doesn’t cancel pending tasks, risking “Task was destroyed but pending!” warnings and leaks.
Style: Ruff E501 violations (line-length=88)
- Long lines at 31, 41, 123; plus a minor log grammar nit (“shutdown” vs “shut down”).
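
The first finding above (a timed-out call leaves its task running) and the usual remedy can be sketched like this. `run_with_timeout` is a hypothetical helper, not the PR's run_async. The sketch catches `concurrent.futures.TimeoutError`, which is what `Future.result()` raises; on Python 3.11+ it is the same class as the built-in `TimeoutError`.

```python
import asyncio
import concurrent.futures
import threading


def run_with_timeout(loop, coro, timeout):
    """Hypothetical helper: run coro on a loop that lives in another thread."""
    fut = asyncio.run_coroutine_threadsafe(coro, loop)
    try:
        return fut.result(timeout)
    except concurrent.futures.TimeoutError:
        # Cancelling the concurrent future also cancels the asyncio task.
        fut.cancel()
        raise


# Small demonstration on a background loop.
loop = asyncio.new_event_loop()
thread = threading.Thread(target=loop.run_forever, daemon=True)
thread.start()

was_cancelled = threading.Event()


async def slow():
    try:
        await asyncio.sleep(30)
    except asyncio.CancelledError:
        was_cancelled.set()
        raise


try:
    run_with_timeout(loop, slow(), timeout=0.2)
except concurrent.futures.TimeoutError:
    timed_out = True
else:
    timed_out = False

cancel_seen = was_cancelled.wait(timeout=5)
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()
```

Without the `fut.cancel()` call, `slow()` would keep sleeping on the loop long after the caller had given up.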

Recommendations

Cancel on timeout
- Wrap fut.result(timeout) and on asyncio.TimeoutError call fut.cancel() then re-raise.
Set shutdown gate earlier
- Call self._shutdown.set() before acquiring the lock in _shutdown_loop to immediately block new submissions.
Clarify lifecycle semantics
- If “close is final,” document RuntimeError on run_async post-close. If not, restore restart behavior and separate “closing” vs “closed and restartable” states.
Reduce lock hold-time and simplify control flow
- Use a threading.Event signaled by the runner when loop starts; release the lock while waiting. Avoid running scheduling callbacks while holding the lock.
Graceful shutdown of pending tasks
- Schedule a coroutine to cancel and await all pending tasks before stop/close.
Fix E501 and minor wording
- Break long function signatures/strings and adjust log messaging.
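
The "Reduce lock hold-time" recommendation above suggests a `threading.Event` signaled by the runner thread. One possible shape, under the assumption that the caller invokes it without holding the executor lock, is sketched below; `start_loop_thread` and `startup_timeout` are hypothetical names.

```python
import asyncio
import threading


def start_loop_thread(startup_timeout=5.0):
    """Hypothetical helper: start a loop thread and wait until it runs."""
    loop = asyncio.new_event_loop()
    started = threading.Event()

    def runner():
        asyncio.set_event_loop(loop)
        # call_soon callbacks only run once run_forever is processing them,
        # so the event is set after the loop is actually running.
        loop.call_soon(started.set)
        loop.run_forever()

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    # Bounded wait instead of polling loop.is_running() in a sleep loop.
    if not started.wait(startup_timeout):
        raise RuntimeError("event loop did not start in time")
    return loop, thread
```

This replaces a busy-wait with a single blocking call that has an upper bound, which also covers the earlier concern about a startup wait that could hang forever.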

Verdict

Needs small-but-important rework: address timeout cancellation, shutdown gating, lifecycle clarity, and lock-hold simplification. Then it’s a robust, predictable async-to-sync bridge.
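
Two of the points in the verdict, shutdown gating and graceful handling of pending tasks, can be combined in one sketch. `ClosableLoop`, `submit`, and `_cancel_pending` are hypothetical names. As a variation on the recommendation, the flag is set under the lock here so that check-and-set is atomic and `close()` stays idempotent; submissions check the flag under the same lock.

```python
import asyncio
import threading


async def _cancel_pending():
    """Cancel every task on the running loop except this one, then wait."""
    current = asyncio.current_task()
    pending = [t for t in asyncio.all_tasks() if t is not current]
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)


class ClosableLoop:
    """Hypothetical wrapper around an already running loop thread."""

    def __init__(self, loop, thread):
        self._lock = threading.Lock()
        self._shutdown = threading.Event()
        self._loop, self._thread = loop, thread

    def submit(self, coro):
        with self._lock:
            if self._shutdown.is_set():
                coro.close()  # avoid a "never awaited" warning
                raise RuntimeError("AsyncExecutor has been shut down")
            # Non-blocking, so holding the lock here is cheap.
            return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def close(self, timeout=5.0):
        with self._lock:
            if self._shutdown.is_set():
                return False  # already closed or closing
            self._shutdown.set()
            loop, thread = self._loop, self._thread
            self._loop = self._thread = None
        # Blocking work happens after the lock is released.
        asyncio.run_coroutine_threadsafe(_cancel_pending(), loop).result(timeout)
        loop.call_soon_threadsafe(loop.stop)
        thread.join(timeout)
        if not loop.is_running():
            loop.close()
        return True
```

Cancelling and awaiting the tasks before `loop.stop()` is what avoids the "Task was destroyed but it is pending!" warnings mentioned in the findings.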

No Code Changes Made

Per your request, I only performed a review and did not modify or push any code. If you want, I can implement the suggested fixes next and run pre-commit/pytest accordingly.

CLFutureX · 2025-11-10T06:02:16Z

Hey, here are the new adjustments:
1. Add cancellation for pending tasks.
2. Move self._shutdown.set() inside the lock, so that concurrent calls do not run the subsequent shutdown logic twice:
    with self._lock:
        if self._shutdown.is_set():
            return
        self._shutdown.set()
3. Formatting fixes.
4. The existing docstring of the close method arguably already states clearly that the executor's lifecycle ends there:
"""Close the async executor and cleanup resources."""

openhands-ai · 2025-11-10T06:02:26Z

I'm on it! CLFutureX can track my progress at all-hands.dev

openhands-ai · 2025-11-10T06:05:29Z

📋 Final Summary: AsyncExecutor Code Review Completed

What I Did

I provided a comprehensive "/codereview-roasted" analysis of PR #1092 for the AsyncExecutor changes, as requested. Here's what I accomplished:

✅ Review Process Completed:

Fetched PR Details - Retrieved PR feat(AsyncExecutor): AsyncExecutor allows adding tasks after shutdown, which may cause unpredictable exceptions. #1092 metadata and diff from GitHub API
Applied Changes Locally - Created fix_acync_executor branch and applied the diff to examine the actual code
Analyzed Code Quality - Ran ruff checks and identified linting issues (E501, import sorting, TimeoutError usage)
Delivered Brutal Review - Provided Linus-style technical critique with specific line-by-line feedback

🔥 Key Findings from My Review:

Taste Rating: 🟡 Acceptable (Works but violates simplicity principles)

Critical Issues Identified:

Over-engineered callback pattern in _safe_execute_on_loop() - unnecessary complexity
Double-checked locking anti-pattern in shutdown logic - race condition prone
Poor abstraction choices - turned simple 3-line logic into complex callback soup

Improvements Needed:

Fix E501 line length violations in logging statements
Eliminate unnecessary nested functions
Address fragile task cancellation logic with race conditions

What They Got Right:

✅ Task cancellation on timeout (real bug fix)
✅ Shutdown state tracking with threading.Event
✅ Proper resource cleanup during shutdown

My Recommendation

❌ Needs rework - The author is solving a real problem (preventing task submission after shutdown) but with an over-engineered solution. The callback pattern should be reverted to a simple _ensure_loop() approach with shutdown checks added directly.

No Code Changes Made

As requested for a code review, I did not modify any code - only provided detailed technical feedback for the author to address. The review identifies both the legitimate improvements (timeout cancellation, shutdown tracking) and the problematic over-engineering that needs simplification.

The user can now use this feedback to guide improvements to the PR before merging.

CLFutureX · 2025-11-10T06:38:00Z

Hey, please review again.

CLFutureX and others added 2 commits November 7, 2025 11:38

optimize asyncExecutor

98845f4

Signed-off-by: CLFutureX <[email protected]>

Merge branch 'main' into fix_acync_executor

f92bdb0

xingyaoww requested review from openhands-agent and ryanhoangt November 7, 2025 15:39

github-actions bot reviewed Nov 7, 2025

CLFutureX added 3 commits November 8, 2025 21:06

update

9575713

Signed-off-by: CLFutureX <[email protected]>

update

1bf460b

Signed-off-by: CLFutureX <[email protected]>

update

7f944ca

Signed-off-by: CLFutureX <[email protected]>

Merge branch 'fix_acync_executor' of https://github.com/CLFutureX/age…

856b96b

…nt-sdk into fix_acync_executor

enyst requested review from openhands-agent and removed request for openhands-agent November 8, 2025 17:05

github-actions bot reviewed Nov 8, 2025

update

27ab7a9

Signed-off-by: CLFutureX <[email protected]>

CLFutureX added 2 commits November 10, 2025 11:46

update

7840df6

Signed-off-by: CLFutureX <[email protected]>

update

992e2a8

Signed-off-by: CLFutureX <[email protected]>

update

aa37af2

Signed-off-by: CLFutureX <[email protected]>

Merge branch 'main' into fix_acync_executor

93c9539
