Skip to content

Conversation

@matdev83
Copy link
Owner

@matdev83 matdev83 commented Oct 12, 2025

Summary

  • tighten pytest command detection in the full-suite steering handler so installation commands like pip install pytest are ignored
  • treat pytest token matches more strictly and add tests covering installation scenarios

Testing

  • python -m pytest tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
  • python -m pytest (fails: existing suite issues such as gemini request counter warnings and quality checks)

https://chatgpt.com/codex/tasks/task_e_68ec25355ce08333b87d0889ca85b9a5

Summary by CodeRabbit

  • Bug Fixes

    • Improved detection of pytest full-suite runs using robust tokenization and segmented command analysis, reducing false positives.
    • Ignores installation/update commands (e.g., pip install pytest/plugins) and treats empty or complex shell commands correctly.
    • Recognizes platform-specific pytest executables (including .exe/.bat and path-qualified invocations) and minimizes repeated steering within a session.
  • Tests

    • Added unit tests covering path-qualified invocations and installation/plugin install cases.

@coderabbitai
Copy link

coderabbitai bot commented Oct 12, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds token-aware command parsing and detection to determine whether a shell command truly invokes pytest (including path-qualified executables and Windows extensions), ignores installation-related segments, tightens regex anchoring, refines extraction for shell/provider contexts, and adds tests covering path-based and install commands.

Changes

Cohort / File(s) Summary
Pytest full-suite handler logic
src/core/services/tool_call_handlers/pytest_full_suite_handler.py
Added shlex-based tokenization with fallback, _token_invokes_pytest, _split_command_tokens, _command_invokes_pytest, and _segment_represents_installation; updated _PYTEST_ROOT_PATTERN to be anchored and include .exe/.bat; switched to token/fullmatch checks and refined _extract_pytest_command behavior for shell tools and provider-mapped names while preserving session deduplication and swallow metadata.
Unit tests for handler behavior
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
Expanded tests to assert detection for path-qualified pytest executables and to ensure installation-related commands (e.g., pip install pytest, pip install pytest-cov) are ignored (no swallow), covering positive and negative invocation forms.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Tool as Tool / User
  participant Handler as PytestFullSuiteHandler
  participant Tokenizer as Tokenizer (shlex / split)
  participant Matcher as Invocation Matcher

  Tool->>Handler: send tool call (tool name, args/command)
  Handler->>Tokenizer: split command into tokens
  Tokenizer-->>Handler: tokens
  Handler->>Matcher: evaluate tokens
  Note over Matcher: checks include\n- anchored pytest pattern (.exe/.bat)\n- full token matches\n- skip install/update segments
  alt Tokens indicate pytest invocation
    Matcher-->>Handler: true
    Handler-->>Tool: swallow & steering metadata
  else Not a pytest invocation or install-like
    Matcher-->>Handler: false
    Handler-->>Tool: ignore / pass-through
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers at paths in the night,
Sniff tokens and patterns until one feels right.
If it's just installs, I hop past with grace—
But a true pytest call gets its proper place.
Thump-thump, I guide the warren to trace. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly describes the primary change, which is to prevent steering behavior for pytest installation commands, matching the pull request’s focus on tightening detection logic to ignore installation scenarios. It is clear, specific, and directly reflects the main update without extraneous detail.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/fix-a-bug-in-reactor-tool

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting

Comment on lines 29 to +34
# Matches commands invoking pytest (pytest, python -m pytest, py.test, etc.)
_PYTEST_ROOT_PATTERN = re.compile(r"\b(pytest|py\.test)(?:\b|\.py\b)", re.IGNORECASE)
_PYTEST_ROOT_PATTERN = re.compile(
r"^(pytest|py\.test)(?:$|\.py$|\.exe$|\.bat$)",
re.IGNORECASE,
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore detection for path-qualified pytest commands

The updated regex now insists that each token be exactly pytest/py.test (via fullmatch), so _extract_pytest_command ignores commands like ./.venv/bin/pytest or C:\venv\Scripts\pytest.exe. These are common ways to invoke pytest from virtual environments and were previously matched because the pattern searched for pytest anywhere in the token. After this change, full‑suite runs executed via a path skip the steering logic entirely, defeating the handler’s purpose.

Useful? React with 👍 / 👎.

@codecov
Copy link

codecov bot commented Oct 12, 2025

Codecov Report

❌ Patch coverage is 72.22222% with 10 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...es/tool_call_handlers/pytest_full_suite_handler.py 72.22% 5 Missing and 5 partials ⚠️

📢 Thoughts on this report? Let us know!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (1)

105-129: Consider edge cases with pytest token mentions.

The logic assumes any pytest token not preceded by installation keywords represents a pytest invocation. Commands like which pytest, type pytest, or echo pytest would return True, even though they don't invoke pytest.

While this may be acceptable for the shell tool use case (where pytest tokens typically indicate invocation), consider if additional filtering is needed for common query/info commands.

If needed, extend the filtering logic:

 def _segment_represents_installation(tokens: list[str]) -> bool:
     if not tokens:
         return False
 
     installation_keywords = {
         "install",
         "add",
         "remove",
         "uninstall",
         "update",
         "upgrade",
     }
+    # Also filter out query/info commands that mention pytest without invoking it
+    non_invocation_keywords = {
+        "which",
+        "type",
+        "echo",
+        "whereis",
+        "show",
+        "info",
+        "cat",
+        "ls",
+    }
 
-    return any(token.lower() in installation_keywords for token in tokens)
+    all_filter_keywords = installation_keywords | non_invocation_keywords
+    return any(token.lower() in all_filter_keywords for token in tokens)

Alternatively, rename _segment_represents_installation to _segment_does_not_invoke_pytest for clarity.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 326702b and b3e4f38.

📒 Files selected for processing (2)
  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py (6 hunks)
  • tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Avoid using emojis in Python code comments or docstrings
Follow PEP 8 and use type hints for all functions
Import order must be: standard library, third-party, then local imports, separated by blank lines
Naming conventions: snake_case for variables and functions; PascalCase for classes
Error handling: use specific exceptions and include meaningful error messages
Prefer f-strings for string formatting

Files:

  • tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Catch specific exceptions; avoid broad except Exception blocks
If a broad exception must be caught, log with exc_info=True and re-raise a specific custom exception
Use the most specific exception class from src.core.common.exceptions that accurately describes the error
Provide clear, helpful error messages and include relevant details in the details dictionary of custom exceptions

Files:

  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py
🧬 Code graph analysis (1)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1)
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (3)
  • PytestFullSuiteHandler (226-326)
  • can_handle (243-256)
  • handle (258-291)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (3.10)
  • GitHub Check: Analyze (python)
🔇 Additional comments (10)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1)

189-208: LGTM! Comprehensive test coverage for installation scenarios.

The new tests correctly verify that installation commands (both direct pytest and pytest plugins) are ignored by the handler, ensuring can_handle returns False and the handler does not swallow these commands.

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (9)

16-16: LGTM! Shlex import enables proper shell command parsing.

The shlex module import is necessary for robust tokenization of shell commands with proper handling of quotes and special characters.


31-34: LGTM! Stricter pattern matching with cross-platform support.

The updated pattern uses anchoring (^) and fullmatch semantics, and includes .exe and .bat extensions for Windows compatibility. This makes pytest token detection more precise.


98-102: LGTM! Robust tokenization with fallback.

The function uses shlex.split for proper shell parsing with a simple split fallback for edge cases like unterminated quotes.


158-158: LGTM! Defensive check for empty commands.

Adding an empty string check prevents potential issues downstream and aligns with robust input validation.


167-167: LGTM! Stricter token matching with fullmatch.

Using fullmatch instead of search ensures the entire token matches the pytest pattern, preventing false positives from partial matches.


311-314: LGTM! Filters out installation commands for shell tools.

The added check ensures that shell commands must actually invoke pytest. This correctly prevents installation commands from being handled by this handler.


317-317: LGTM! Stricter tool name matching.

Using fullmatch ensures the entire tool name matches the pytest pattern, preventing false positives.


320-324: LGTM! Consistent pytest invocation verification.

The check ensures that even for non-shell tools, commands are verified to invoke pytest before being processed. This maintains consistency with the shell tool handling.


132-145: Verify coverage of installation keywords
The keyword set currently covers common pip verbs; extend or document it to include other package manager commands (e.g., apt, yum, npm, cargo, poetry, pipx) based on the workflows you intend to support.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (1)

31-45: Restored detection for path-qualified pytest via Path.name — good fix.

Using Path(token).name with a strict fullmatch correctly re-enables "./.venv/bin/pytest" and similar, while keeping matching tight.

🧹 Nitpick comments (5)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (2)

149-158: Add a handler-level Windows path test.

We only validate "./.venv/bin/pytest" through the handler. Please also cover Windows-style paths for can_handle/handle; current tokenization may miss them (see handler review).

Apply this test in addition:

+@pytest.mark.asyncio
+async def test_handler_detects_windows_path_pytest_invocation() -> None:
+    handler = PytestFullSuiteHandler(enabled=True)
+    context = _build_context(r"C:\\venv\\Scripts\\pytest.exe")
+
+    assert await handler.can_handle(context) is True
+    result = await handler.handle(context)
+    assert result.should_swallow is True

203-223: Nice: install commands are ignored. Consider adding a couple more common forms.

Add coverage for chained install+run and python -m pip variants.

Apply one or both:

+@pytest.mark.asyncio
+async def test_handler_ignores_chained_install_then_run_detection_only_triggers_on_run() -> None:
+    handler = PytestFullSuiteHandler(enabled=True)
+    context = _build_context("pip install pytest && pytest -q")
+    assert await handler.can_handle(context) is True
+    result = await handler.handle(context)
+    assert result.should_swallow is True
+
+@pytest.mark.asyncio
+async def test_handler_ignores_python_m_pip_install_pytest() -> None:
+    handler = PytestFullSuiteHandler(enabled=True)
+    context = _build_context("python -m pip install -U pytest")
+    assert await handler.can_handle(context) is False
+    result = await handler.handle(context)
+    assert result.should_swallow is False
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (3)

31-34: Broaden executable suffixes and distro variants.

Consider supporting common wrappers/variants: .cmd, .ps1, and distro pytest-3.

-_PYTEST_ROOT_PATTERN = re.compile(
-    r"^(pytest|py\.test)(?:$|\.py$|\.exe$|\.bat$)",
+_PYTEST_ROOT_PATTERN = re.compile(
+    r"^(pytest(?:-?\d+)?|py\.test)(?:$|\.py$|\.exe$|\.bat$|\.cmd$|\.ps1$)",
     re.IGNORECASE,
 )

120-126: Include single ampersand as a separator (cmd.exe).

Windows cmd chains commands with &. Including it improves segment detection in a & pytest.

-    separators = {"&&", ";", "||", "|"}
+    separators = {"&&", ";", "||", "|", "&"}

167-175: Use the same tokenizer to respect quoting and paths.

_looks_like_full_suite uses split() which ignores quoting and Windows paths. Reuse _split_command_tokens for consistency.

-    tokens = normalized.split()
+    tokens = _split_command_tokens(normalized)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3e4f38 and dfb57ad.

📒 Files selected for processing (2)
  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py (6 hunks)
  • tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Avoid using emojis in Python code comments or docstrings
Follow PEP 8 and use type hints for all functions
Import order must be: standard library, third-party, then local imports, separated by blank lines
Naming conventions: snake_case for variables and functions; PascalCase for classes
Error handling: use specific exceptions and include meaningful error messages
Prefer f-strings for string formatting

Files:

  • tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Catch specific exceptions; avoid broad except Exception blocks
If a broad exception must be caught, log with exc_info=True and re-raise a specific custom exception
Use the most specific exception class from src.core.common.exceptions that accurately describes the error
Provide clear, helpful error messages and include relevant details in the details dictionary of custom exceptions

Files:

  • src/core/services/tool_call_handlers/pytest_full_suite_handler.py
🔇 Additional comments (1)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1)

17-19: Good additions to cover path-qualified and Windows executable names in detection.

Including "./.venv/bin/pytest" and "C:\venv\Scripts\pytest.exe" improves parity across platforms; negative case for "./.venv/bin/pytest tests/unit" is also correct.

Also applies to: 27-27

Comment on lines +108 to +114
def _split_command_tokens(command: str) -> list[str]:
try:
return shlex.split(command, posix=True)
except ValueError:
return command.split()


Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Windows path tokenization bug: shlex POSIX parsing mangles backslashes.

For commands like C:\venv\Scripts\pytest.exe -q, shlex.split(posix=True) collapses backslashes, so _token_invokes_pytest never sees pytest.exe. Handler then fails to detect/steer on Windows strings passed via shell tools.

Apply a Windows-aware fallback in _split_command_tokens:

 def _split_command_tokens(command: str) -> list[str]:
-    try:
-        return shlex.split(command, posix=True)
-    except ValueError:
-        return command.split()
+    # Prefer Windows parsing when an absolute Windows path is present (e.g., C:\...).
+    # This avoids POSIX backslash-escaping mangling Windows paths.
+    if re.search(r'(?i)\b[A-Z]:\\', command):
+        try:
+            return shlex.split(command, posix=False)
+        except ValueError:
+            pass
+    try:
+        return shlex.split(command, posix=True)
+    except ValueError:
+        return command.split()

Please add a handler-level test for C:\\venv\\Scripts\\pytest.exe (see tests suggestion).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _split_command_tokens(command: str) -> list[str]:
try:
return shlex.split(command, posix=True)
except ValueError:
return command.split()
def _split_command_tokens(command: str) -> list[str]:
# Prefer Windows parsing when an absolute Windows path is present (e.g., C:\...).
# This avoids POSIX backslash-escaping mangling Windows paths.
if re.search(r'(?i)\b[A-Z]:\\', command):
try:
return shlex.split(command, posix=False)
except ValueError:
pass
try:
return shlex.split(command, posix=True)
except ValueError:
return command.split()
🤖 Prompt for AI Agents
In src/core/services/tool_call_handlers/pytest_full_suite_handler.py around
lines 108 to 114, shlex.split(..., posix=True) mangles Windows backslashes
causing tokens like C:\venv\Scripts\pytest.exe to be altered; update
_split_command_tokens to try shlex.split(command, posix=True) and on ValueError
or when the result lacks backslashes on Windows fall back to using
shlex.split(command, posix=False) or a raw command.split() that preserves
backslashes (detect via os.name == "nt" or pathlib.Path heuristics), and add a
unit test that asserts _split_command_tokens("C:\\venv\\Scripts\\pytest.exe -q")
returns ["C:\\venv\\Scripts\\pytest.exe", "-q"] to ensure the handler sees
pytest.exe on Windows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant