Avoid steering pytest installation commands #634

matdev83 · 2025-10-12T22:33:33Z

Summary

- Tighten pytest command detection in the full-suite steering handler so that installation commands such as `pip install pytest` are ignored.
- Match pytest tokens more strictly and add tests covering installation scenarios.

Testing

- `python -m pytest tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py`
- `python -m pytest` (fails: existing suite issues such as gemini request counter warnings and quality checks)

https://chatgpt.com/codex/tasks/task_e_68ec25355ce08333b87d0889ca85b9a5

Summary by CodeRabbit

Bug Fixes
- Improved detection of pytest full-suite runs using robust tokenization and segmented command analysis, reducing false positives.
- Ignores installation/update commands (e.g., pip install pytest/plugins) and treats empty or complex shell commands correctly.
- Recognizes platform-specific pytest executables (including .exe/.bat and path-qualified invocations) and minimizes repeated steering within a session.
Tests
- Added unit tests covering path-qualified invocations and installation/plugin install cases.

coderabbitai · 2025-10-12T22:33:41Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds token-aware command parsing and detection to determine whether a shell command truly invokes pytest (including path-qualified executables and Windows extensions), ignores installation-related segments, tightens regex anchoring, refines extraction for shell/provider contexts, and adds tests covering path-based and install commands.

Changes

Cohort / File(s)	Summary
Pytest full-suite handler logic `src/core/services/tool_call_handlers/pytest_full_suite_handler.py`	Added shlex-based tokenization with fallback, `_token_invokes_pytest`, `_split_command_tokens`, `_command_invokes_pytest`, and `_segment_represents_installation`; updated `_PYTEST_ROOT_PATTERN` to be anchored and include `.exe`/`.bat`; switched to token/fullmatch checks and refined `_extract_pytest_command` behavior for shell tools and provider-mapped names while preserving session deduplication and swallow metadata.
Unit tests for handler behavior `tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py`	Expanded tests to assert detection for path-qualified pytest executables and to ensure installation-related commands (e.g., `pip install pytest`, `pip install pytest-cov`) are ignored (no swallow), covering positive and negative invocation forms.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Tool as Tool / User
  participant Handler as PytestFullSuiteHandler
  participant Tokenizer as Tokenizer (shlex / split)
  participant Matcher as Invocation Matcher

  Tool->>Handler: send tool call (tool name, args/command)
  Handler->>Tokenizer: split command into tokens
  Tokenizer-->>Handler: tokens
  Handler->>Matcher: evaluate tokens
  Note over Matcher: checks include\n- anchored pytest pattern (.exe/.bat)\n- full token matches\n- skip install/update segments
  alt Tokens indicate pytest invocation
    Matcher-->>Handler: true
    Handler-->>Tool: swallow & steering metadata
  else Not a pytest invocation or install-like
    Matcher-->>Handler: false
    Handler-->>Tool: ignore / pass-through
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers at paths in the night,
Sniff tokens and patterns until one feels right.
If it's just installs, I hop past with grace—
But a true pytest call gets its proper place.
Thump-thump, I guide the warren to trace. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly describes the primary change, which is to prevent steering behavior for pytest installation commands, matching the pull request’s focus on tightening detection logic to ignore installation scenarios. It is clear, specific, and directly reflects the main update without extraneous detail.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch codex/fix-a-bug-in-reactor-tool

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Comment `@coderabbitai help` to get the list of available commands and usage tips.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

chatgpt-codex-connector · 2025-10-12T22:35:32Z

src/core/services/tool_call_handlers/pytest_full_suite_handler.py

 # Matches commands invoking pytest (pytest, python -m pytest, py.test, etc.)
-_PYTEST_ROOT_PATTERN = re.compile(r"\b(pytest|py\.test)(?:\b|\.py\b)", re.IGNORECASE)
+_PYTEST_ROOT_PATTERN = re.compile(
+    r"^(pytest|py\.test)(?:$|\.py$|\.exe$|\.bat$)",
+    re.IGNORECASE,
+)


Restore detection for path-qualified pytest commands

The updated regex now insists that each token be exactly pytest/py.test (via fullmatch), so _extract_pytest_command ignores commands like ./.venv/bin/pytest or C:\venv\Scripts\pytest.exe. These are common ways to invoke pytest from virtual environments and were previously matched because the pattern searched for pytest anywhere in the token. After this change, full‑suite runs executed via a path skip the steering logic entirely, defeating the handler’s purpose.
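
The regression described above can be reproduced with the two patterns quoted in the diff. The following is a minimal sketch, not the handler's code: the `compare` helper and the token list are hypothetical, and it assumes the old pattern was applied with `search` and the new one with `fullmatch`, as the comment states.

```python
import re

# Old pattern from the diff: matches "pytest" anywhere at a word boundary.
OLD_PATTERN = re.compile(r"\b(pytest|py\.test)(?:\b|\.py\b)", re.IGNORECASE)
# New pattern from the diff: anchored to the start and end of the token.
NEW_PATTERN = re.compile(
    r"^(pytest|py\.test)(?:$|\.py$|\.exe$|\.bat$)",
    re.IGNORECASE,
)


def compare(token: str) -> tuple[bool, bool]:
    """Return (old search result, new fullmatch result) for one token."""
    return bool(OLD_PATTERN.search(token)), bool(NEW_PATTERN.fullmatch(token))


# Hypothetical sample tokens for illustration.
for token in ["pytest", "./.venv/bin/pytest", r"C:\venv\Scripts\pytest.exe", "pytest-cov"]:
    print(token, compare(token))
```

Under these assumptions the old pattern accepts all four tokens, including the plugin name `pytest-cov`, while the new pattern accepts only the bare `pytest` token. That explains both the false positive the PR fixes and the path-qualified false negative raised here.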

codecov · 2025-10-12T22:37:09Z

Codecov Report

❌ Patch coverage is 72.22222% with 10 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...es/tool_call_handlers/pytest_full_suite_handler.py	72.22%	5 Missing and 5 partials ⚠️

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (1)
105-129: Consider edge cases with pytest token mentions.

The logic assumes any pytest token not preceded by installation keywords represents a pytest invocation. Commands like which pytest, type pytest, or echo pytest would return True, even though they don't invoke pytest.

While this may be acceptable for the shell tool use case (where pytest tokens typically indicate invocation), consider if additional filtering is needed for common query/info commands.

If needed, extend the filtering logic:
 def _segment_represents_installation(tokens: list[str]) -> bool:
     if not tokens:
         return False
 
     installation_keywords = {
         "install",
         "add",
         "remove",
         "uninstall",
         "update",
         "upgrade",
     }
+    # Also filter out query/info commands that mention pytest without invoking it
+    non_invocation_keywords = {
+        "which",
+        "type",
+        "echo",
+        "whereis",
+        "show",
+        "info",
+        "cat",
+        "ls",
+    }
 
-    return any(token.lower() in installation_keywords for token in tokens)
+    all_filter_keywords = installation_keywords | non_invocation_keywords
+    return any(token.lower() in all_filter_keywords for token in tokens)
Alternatively, rename _segment_represents_installation to _segment_does_not_invoke_pytest for clarity.
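
One possible variation on the suggestion above is to check query commands only in the leading position of a segment, so that an argument such as `-k echo` does not suppress a real pytest run. This is a hedged sketch: the function name follows the proposed rename, and both keyword sets are illustrative assumptions rather than the handler's actual configuration.

```python
import shlex

# Assumed keyword sets, adapted from the review suggestion above.
INSTALL_KEYWORDS = {"install", "add", "remove", "uninstall", "update", "upgrade"}
QUERY_COMMANDS = {"which", "type", "echo", "whereis", "cat", "ls"}


def segment_does_not_invoke_pytest(tokens: list[str]) -> bool:
    """Return True when a command segment merely mentions pytest."""
    if not tokens:
        return True
    # Query commands only count when they are the command being run.
    if tokens[0].lower() in QUERY_COMMANDS:
        return True
    # Installation verbs may appear after the package manager name.
    return any(token.lower() in INSTALL_KEYWORDS for token in tokens)


print(segment_does_not_invoke_pytest(shlex.split("which pytest")))
print(segment_does_not_invoke_pytest(shlex.split("pytest -k echo")))
```

The installation check still scans every token, so a run such as `pytest -k install` would be filtered out. Whether that trade-off is acceptable depends on the workflows the handler needs to support.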

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 326702b and b3e4f38.

📒 Files selected for processing (2)

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (6 hunks)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Avoid using emojis in Python code comments or docstrings
Follow PEP 8 and use type hints for all functions
Import order must be: standard library, third-party, then local imports, separated by blank lines
Naming conventions: snake_case for variables and functions; PascalCase for classes
Error handling: use specific exceptions and include meaningful error messages
Prefer f-strings for string formatting

Files:

tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py
src/core/services/tool_call_handlers/pytest_full_suite_handler.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Catch specific exceptions; avoid broad except Exception blocks
If a broad exception must be caught, log with exc_info=True and re-raise a specific custom exception
Use the most specific exception class from src.core.common.exceptions that accurately describes the error
Provide clear, helpful error messages and include relevant details in the details dictionary of custom exceptions

Files:

src/core/services/tool_call_handlers/pytest_full_suite_handler.py

🧬 Code graph analysis (1)

tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1)

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (3)

PytestFullSuiteHandler (226-326)

can_handle (243-256)

handle (258-291)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: test (3.10)
GitHub Check: Analyze (python)

🔇 Additional comments (10)

tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1)

189-208: LGTM! Comprehensive test coverage for installation scenarios.

The new tests correctly verify that installation commands (both direct pytest and pytest plugins) are ignored by the handler, ensuring can_handle returns False and the handler does not swallow these commands.

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (9)

16-16: LGTM! Shlex import enables proper shell command parsing.

The shlex module import is necessary for robust tokenization of shell commands with proper handling of quotes and special characters.

31-34: LGTM! Stricter pattern matching with cross-platform support.

The updated pattern uses anchoring (^) and fullmatch semantics, and includes .exe and .bat extensions for Windows compatibility. This makes pytest token detection more precise.

98-102: LGTM! Robust tokenization with fallback.

The function uses shlex.split for proper shell parsing with a simple split fallback for edge cases like unterminated quotes.

158-158: LGTM! Defensive check for empty commands.

Adding an empty string check prevents potential issues downstream and aligns with robust input validation.

167-167: LGTM! Stricter token matching with fullmatch.

Using fullmatch instead of search ensures the entire token matches the pytest pattern, preventing false positives from partial matches.

311-314: LGTM! Filters out installation commands for shell tools.

The added check ensures that shell commands must actually invoke pytest. This correctly prevents installation commands from being handled by this handler.

317-317: LGTM! Stricter tool name matching.

Using fullmatch ensures the entire tool name matches the pytest pattern, preventing false positives.

320-324: LGTM! Consistent pytest invocation verification.

The check ensures that even for non-shell tools, commands are verified to invoke pytest before being processed. This maintains consistency with the shell tool handling.

132-145: Verify coverage of installation keywords
The keyword set currently covers common pip verbs; extend or document it to include other package manager commands (e.g., apt, yum, npm, cargo, poetry, pipx) based on the workflows you intend to support.
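
The tokenization-with-fallback behavior praised in the 98-102 comment can be illustrated as follows. This sketch mirrors the `_split_command_tokens` helper quoted later in this thread, under a public name chosen here for illustration. The sample commands are hypothetical.

```python
import shlex


def split_command_tokens(command: str) -> list[str]:
    """Tokenize a shell command, falling back to whitespace splitting."""
    try:
        return shlex.split(command, posix=True)
    except ValueError:
        # shlex raises ValueError on input such as an unterminated quote.
        return command.split()


# Balanced quotes: shlex groups the quoted expression into one token.
print(split_command_tokens('pytest -k "slow and db"'))
# Unterminated quote: the fallback keeps the raw whitespace-separated pieces.
print(split_command_tokens('pytest -k "slow'))
```

The fallback keeps detection working on malformed input, at the cost of leaving stray quote characters inside tokens.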

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (1)

31-45: Restored detection for path-qualified pytest via Path.name — good fix.

Using Path(token).name with a strict fullmatch correctly re-enables "./.venv/bin/pytest" and similar, while keeping matching tight.
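
A basename-based check of this kind might look like the sketch below. It is an assumption-laden illustration, not the PR's code: it uses `PureWindowsPath` (which treats both `/` and `\` as separators) instead of `Path`, so the result does not depend on the host platform, and the `PYTEST_NAME` pattern and `token_invokes_pytest` name are hypothetical.

```python
import re
from pathlib import PureWindowsPath

# Hypothetical pattern: bare pytest names with an optional known suffix.
PYTEST_NAME = re.compile(r"(pytest|py\.test)(\.py|\.exe|\.bat)?", re.IGNORECASE)


def token_invokes_pytest(token: str) -> bool:
    """Return True when the final path component is a pytest executable name."""
    # PureWindowsPath splits on both "/" and "\", so POSIX-style and
    # Windows-style tokens are handled by the same call.
    name = PureWindowsPath(token).name
    return PYTEST_NAME.fullmatch(name) is not None


print(token_invokes_pytest("./.venv/bin/pytest"))
print(token_invokes_pytest(r"C:\venv\Scripts\pytest.exe"))
print(token_invokes_pytest("pytest-cov"))
```

Matching only the final component keeps the strict `fullmatch` semantics while accepting virtual-environment paths, so plugin names such as `pytest-cov` remain excluded.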

🧹 Nitpick comments (5)

tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (2)
149-158: Add a handler-level Windows path test.

We only validate "./.venv/bin/pytest" through the handler. Please also cover Windows-style paths for can_handle/handle; current tokenization may miss them (see handler review).

Apply this test in addition:
+@pytest.mark.asyncio
+async def test_handler_detects_windows_path_pytest_invocation() -> None:
+    handler = PytestFullSuiteHandler(enabled=True)
+    context = _build_context(r"C:\venv\Scripts\pytest.exe")
+
+    assert await handler.can_handle(context) is True
+    result = await handler.handle(context)
+    assert result.should_swallow is True
203-223: Nice: install commands are ignored. Consider adding a couple more common forms.

Add coverage for chained install+run and python -m pip variants.

Apply one or both:
+@pytest.mark.asyncio
+async def test_handler_ignores_chained_install_then_run_detection_only_triggers_on_run() -> None:
+    handler = PytestFullSuiteHandler(enabled=True)
+    context = _build_context("pip install pytest && pytest -q")
+    assert await handler.can_handle(context) is True
+    result = await handler.handle(context)
+    assert result.should_swallow is True
+
+@pytest.mark.asyncio
+async def test_handler_ignores_python_m_pip_install_pytest() -> None:
+    handler = PytestFullSuiteHandler(enabled=True)
+    context = _build_context("python -m pip install -U pytest")
+    assert await handler.can_handle(context) is False
+    result = await handler.handle(context)
+    assert result.should_swallow is False
src/core/services/tool_call_handlers/pytest_full_suite_handler.py (3)
31-34: Broaden executable suffixes and distro variants.

Consider supporting common wrappers/variants: .cmd, .ps1, and distro pytest-3.
-_PYTEST_ROOT_PATTERN = re.compile(
-    r"^(pytest|py\.test)(?:$|\.py$|\.exe$|\.bat$)",
+_PYTEST_ROOT_PATTERN = re.compile(
+    r"^(pytest(?:-?\d+)?|py\.test)(?:$|\.py$|\.exe$|\.bat$|\.cmd$|\.ps1$)",
     re.IGNORECASE,
 )
120-126: Include single ampersand as a separator (cmd.exe).

Windows cmd chains commands with &. Including it improves segment detection in a & pytest.
-    separators = {"&&", ";", "||", "|"}
+    separators = {"&&", ";", "||", "|", "&"}
167-175: Use the same tokenizer to respect quoting and paths.

_looks_like_full_suite uses split() which ignores quoting and Windows paths. Reuse _split_command_tokens for consistency.
-    tokens = normalized.split()
+    tokens = _split_command_tokens(normalized)
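
To make the separator suggestion above concrete, the sketch below splits a token list into command segments, with the single `&` included. The `split_segments` helper is hypothetical and only approximates what the handler's segment analysis might do.

```python
import shlex

# Separator set from the suggestion above, including cmd.exe's single "&".
SEPARATORS = {"&&", ";", "||", "|", "&"}


def split_segments(tokens: list[str]) -> list[list[str]]:
    """Group tokens into command segments, dropping the separators."""
    segments: list[list[str]] = []
    current: list[str] = []
    for token in tokens:
        if token in SEPARATORS:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(token)
    if current:
        segments.append(current)
    return segments


print(split_segments(shlex.split("pip install pytest && pytest -q")))
print(split_segments(shlex.split("a & pytest")))
```

Each segment can then be classified independently, so the install segment is skipped while the trailing `pytest -q` is still detected. A limitation of this sketch is that `shlex.split` only separates on whitespace, so a separator glued to a word (for example `a;pytest`) stays inside one token.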

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3e4f38 and dfb57ad.

📒 Files selected for processing (2)

src/core/services/tool_call_handlers/pytest_full_suite_handler.py (6 hunks)
tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (4 hunks)

🔇 Additional comments (1)

tests/unit/core/services/tool_call_handlers/test_pytest_full_suite_handler.py (1)

17-19: Good additions to cover path-qualified and Windows executable names in detection.

Including "./.venv/bin/pytest" and "C:\venv\Scripts\pytest.exe" improves parity across platforms; negative case for "./.venv/bin/pytest tests/unit" is also correct.

Also applies to: 27-27

coderabbitai · 2025-10-15T21:15:33Z

src/core/services/tool_call_handlers/pytest_full_suite_handler.py

+def _split_command_tokens(command: str) -> list[str]:
+    try:
+        return shlex.split(command, posix=True)
+    except ValueError:
+        return command.split()
+
+


⚠️ Potential issue | 🟠 Major

Windows path tokenization bug: shlex POSIX parsing mangles backslashes.

For commands like C:\venv\Scripts\pytest.exe -q, shlex.split(command, posix=True) treats each backslash as an escape character and drops it, so the token becomes C:venvScriptspytest.exe and _token_invokes_pytest never sees pytest.exe. The handler then fails to detect and steer Windows-style commands passed via shell tools.

Apply a Windows-aware fallback in _split_command_tokens:

 def _split_command_tokens(command: str) -> list[str]:
-    try:
-        return shlex.split(command, posix=True)
-    except ValueError:
-        return command.split()
+    # Prefer Windows parsing when an absolute Windows path is present (e.g., C:\...).
+    # This avoids POSIX backslash-escaping mangling Windows paths.
+    if re.search(r'(?i)\b[A-Z]:\\', command):
+        try:
+            return shlex.split(command, posix=False)
+        except ValueError:
+            pass
+    try:
+        return shlex.split(command, posix=True)
+    except ValueError:
+        return command.split()

Please add a handler-level test for C:\\venv\\Scripts\\pytest.exe (see tests suggestion).
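
The difference between the two `shlex` modes on a Windows path can be checked directly. This is a small demonstration using only the standard library, with the sample command taken from the comment above.

```python
import shlex

command = r"C:\venv\Scripts\pytest.exe -q"

# POSIX mode: each backslash escapes the next character and is dropped.
posix_tokens = shlex.split(command, posix=True)
# Non-POSIX mode: backslashes are ordinary characters and survive.
windows_tokens = shlex.split(command, posix=False)

print(posix_tokens)
print(windows_tokens)
```

One caveat with the proposed fallback is that non-POSIX mode also leaves quote characters inside the tokens, so quoted arguments may need extra cleanup before matching.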

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-def _split_command_tokens(command: str) -> list[str]:
-    try:
-        return shlex.split(command, posix=True)
-    except ValueError:
-        return command.split()
+def _split_command_tokens(command: str) -> list[str]:
+    # Prefer Windows parsing when an absolute Windows path is present (e.g., C:\...).
+    # This avoids POSIX backslash-escaping mangling Windows paths.
+    if re.search(r'(?i)\b[A-Z]:\\', command):
+        try:
+            return shlex.split(command, posix=False)
+        except ValueError:
+            pass
+    try:
+        return shlex.split(command, posix=True)
+    except ValueError:
+        return command.split()

🤖 Prompt for AI Agents

In src/core/services/tool_call_handlers/pytest_full_suite_handler.py around lines 108 to 114, shlex.split(..., posix=True) mangles Windows backslashes causing tokens like C:\venv\Scripts\pytest.exe to be altered; update _split_command_tokens to try shlex.split(command, posix=True) and on ValueError or when the result lacks backslashes on Windows fall back to using shlex.split(command, posix=False) or a raw command.split() that preserves backslashes (detect via os.name == "nt" or pathlib.Path heuristics), and add a unit test that asserts _split_command_tokens("C:\\venv\\Scripts\\pytest.exe -q") returns ["C:\\venv\\Scripts\\pytest.exe", "-q"] to ensure the handler sees pytest.exe on Windows.

Filter pytest installation commands from full-suite steering

b3e4f38

matdev83 added the codex label Oct 12, 2025 — with ChatGPT Codex Connector

chatgpt-codex-connector bot reviewed Oct 12, 2025

coderabbitai bot reviewed Oct 12, 2025

Restore path-qualified pytest detection

dfb57ad

coderabbitai bot reviewed Oct 15, 2025
