@tgrunnagle
Collaborator

Creates a GitHub action to run the [claude-code-harness](https://github.com/StacklokLabs/claude-code-harness) against the mcp-optimizer.

Note: still determining how to run the action before merging.
@claude

claude bot commented Oct 30, 2025

Code Review

Overall: Well-structured GitHub Actions workflow with good security practices (pinned action versions with SHA hashes). A few suggestions for improvement:

Issues & Recommendations

Critical

  • Lines 54-56: Missing error handling. The test harness step should have continue-on-error or explicit failure handling
  • Line 49: Uses a PAT secret but does not document the required scopes. Add a comment specifying the required permissions (e.g., # Requires: repo read access)
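
The two critical points above could look roughly like the sketch below. Step names, paths, and the secret name mirror the PR; the `harness` step id, the wording of the scope comment, and the follow-up warning step are assumptions added for illustration, and the version tag stands in for the SHA pin a real workflow should keep.

```yaml
# Sketch only: documents the PAT scope and surfaces a harness failure
# without failing the whole job.
- name: Checkout claude-code-harness code
  uses: actions/checkout@v5  # pin to a commit SHA in the real workflow
  with:
    repository: StacklokLabs/claude-code-harness
    # Requires: PAT with read-only access to the claude-code-harness repo contents
    token: ${{ secrets.GHA_CLAUDE_CODE_HARNESS_READ_PAT }}
    path: claude-code-harness

- name: Run Claude Code Test Harness
  id: harness                      # hypothetical id, used by the next step
  continue-on-error: true          # later steps still run if this one fails
  working-directory: claude-code-harness
  run: uv run python -m src ./configs/test/gha.json

- name: Report harness failure
  # outcome holds the step result before continue-on-error is applied
  if: steps.harness.outcome == 'failure'
  run: echo "::warning::Test harness failed; check the uploaded logs"
```

With `continue-on-error: true` the job stays green, so an explicit warning step like this is one way to keep a failed harness run visible.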

Improvements

  • Line 28: platforms: linux/amd64. Consider whether multi-platform support (arm64) is needed
  • Line 31: cache-from: type=gha is set without cache-to. Add cache-to: type=gha,mode=max so the cache is actually written
  • Line 63: Path pattern logs/*.jsonl may fail silently if no files exist. Consider adding if-no-files-found: warn or error
  • Lines 54-56: Hard-coded config paths. Consider making these configurable via workflow inputs for reusability
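
A minimal sketch of how the caching, artifact, and input suggestions above might fit together. It assumes the image is built with docker/build-push-action (the PR excerpt does not show which action is used); the job name and the `harness_config` input are hypothetical, and version tags stand in for SHA pins.

```yaml
on:
  workflow_dispatch:
    inputs:
      harness_config:                # hypothetical input name
        description: Path to the harness test config
        type: string
        default: ./configs/test/gha.json

jobs:
  harness:                           # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: docker/setup-buildx-action@v3   # needed for the gha cache backend
      - name: Build image
        uses: docker/build-push-action@v6     # assumed build action
        with:
          context: .
          platforms: linux/amd64
          load: true
          cache-from: type=gha
          cache-to: type=gha,mode=max         # write all layers back to the cache
      - name: Show selected config
        env:
          HARNESS_CONFIG: ${{ inputs.harness_config }}
        run: echo "Using harness config $HARNESS_CONFIG"
      - name: Upload Test Harness Run Logs
        uses: actions/upload-artifact@v5
        with:
          name: claude-code-harness-logs
          path: claude-code-harness/logs/*.jsonl
          if-no-files-found: warn             # or "error" to fail the step
```

Passing the input through `env` rather than interpolating it into the script keeps the value out of the shell command text.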

Minor

  • Line 34: Comment says "install uv and thv" but only uv is installed. Update comment to "install uv and ToolHive" or just "install dependencies"
  • Line 39: Python 3.13 is specified; verify that the project's requirements are compatible with it

Security

✅ Good use of SHA-pinned actions
✅ Minimal permissions (contents: read)
⚠️ PAT stored in secrets (unavoidable, but ensure it has minimal scopes)

Suggested additions

Consider adding timeout-minutes to the job to prevent hung workflows
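
As a rough illustration of the timeout suggestion, assuming a hypothetical job name and arbitrary limits (GitHub's default job timeout is 360 minutes):

```yaml
jobs:
  claude-code-harness:          # hypothetical job name
    runs-on: ubuntu-latest
    timeout-minutes: 30         # assumed limit for the whole job
    permissions:
      contents: read
    steps:
      - name: Run Claude Code Test Harness
        timeout-minutes: 20     # optional, tighter limit for this step only
        run: echo "harness run goes here"
```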


on:
  workflow_call:
  workflow_dispatch:
Member

It would be good to call this workflow from code-checks.yml.
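
Since the workflow already declares a `workflow_call` trigger, calling it could look something like the sketch below. The job name and the workflow filename are assumptions; the PR excerpt does not show the actual file path.

```yaml
# .github/workflows/code-checks.yml (excerpt, sketch only)
jobs:
  claude-code-harness:                                  # hypothetical job name
    uses: ./.github/workflows/claude-code-harness.yml   # hypothetical filename
    secrets: inherit   # forwards the PAT and ANTHROPIC_API_KEY secrets to the called workflow
```

`secrets: inherit` is the shortest option; listing the two secrets explicitly under `secrets:` would work as well, provided the called workflow declares them under `on.workflow_call.secrets`.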

Comment on lines 64 to 88
- name: Checkout claude-code-harness code
  uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
  with:
    repository: StacklokLabs/claude-code-harness
    ref: wait-for-running_2025-10-30
    # PAT with read-only access to the claude-code-harness repo
    token: ${{ secrets.GHA_CLAUDE_CODE_HARNESS_READ_PAT }}
    path: claude-code-harness

# Run the test harness
- name: Run Claude Code Test Harness
  run: |
    cd claude-code-harness
    export ANTHROPIC_API_KEY="${{ secrets.ANTHROPIC_API_KEY }}"
    uv run python -m src ./configs/test/gha.json --setup ./configs/test/gha_server_setup.json --persist-servers
    thv logs mcp-optimizer > ./mcp-optimizer-server.log || echo "Failed to get mcp-optimizer logs"
  continue-on-error: true

# Upload the results as an artifact
- name: Upload Test Harness Run Logs
  uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # v5.0.0
  with:
    name: claude-code-harness-logs
    path: claude-code-harness/logs/*.jsonl
    if-no-files-found: warn
Member

It would be good to call this as a GitHub Action instead of cloning the whole repo, meaning something like this in the workflow:

- name: Run Claude Code test harness
  uses: StacklokLabs/claude-code-harness@<version>
  with:
    mcp_servers: ["time", "fetch"]
    cases:
      query: "In what timezone is Mexico?"
      expected: "The timezone in Mexico is foo"
    optimizer_config:
      toolhive_host: "172.17.0.1"
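
For the step above to work, the harness repo would need an action metadata file. The sketch below is purely hypothetical: no such `action.yml` is confirmed to exist, and the input names simply follow the suggestion above. Action inputs are always strings, so lists and nested cases would have to be passed as JSON-encoded strings; how the harness would consume them is left open here.

```yaml
# action.yml at the root of the harness repo (hypothetical sketch)
name: Claude Code test harness
description: Run the Claude Code test harness against a set of MCP servers
inputs:
  mcp_servers:
    description: JSON array of MCP server names, e.g. '["time", "fetch"]'
    required: true
  cases:
    description: JSON array of objects with "query" and "expected" keys
    required: true
  toolhive_host:
    description: Host address where ToolHive is reachable
    required: false
    default: "172.17.0.1"
runs:
  using: composite
  steps:
    - name: Run harness
      shell: bash
      env:
        # Assumption: the harness would read its settings from these variables
        MCP_SERVERS: ${{ inputs.mcp_servers }}
        CASES: ${{ inputs.cases }}
        TOOLHIVE_HOST: ${{ inputs.toolhive_host }}
      run: |
        cd "$GITHUB_ACTION_PATH"
        uv run python -m src ./configs/test/gha.json
```

A caller would then quote the structured values, for example `mcp_servers: '["time", "fetch"]'`, rather than passing a YAML list directly.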

