stdio_client hangs during session.initialize() due to failed message transfer via internal anyio memory stream #382

@RansSelected

Labels: bug, transport:stdio, client, server, anyio

Body:

Environment:

  • OS: Debian GNU/Linux 11 (bullseye) / kernel: Linux 5.10.0-34-cloud-amd64 / x86-64 (on GCP Vertex AI Workbench)
  • Python Version: 3.10.16
  • modelcontextprotocol SDK Version: v1.5.0
  • anyio Version: v4.9.0

Description:

When using the documented mcp.client.stdio.stdio_client to connect to an mcp.server.fastmcp.FastMCP server running over the stdio transport (await mcp.run_stdio_async()), the client consistently hangs in the await session.initialize() call and eventually times out.

Extensive debugging using monkeypatching revealed the following sequence:

  1. The client connects successfully via stdio_client.
  2. The client sends the initialize request.
  3. The server process starts correctly.
  4. The background task within mcp.server.stdio.stdio_server successfully reads the initialize request from the process's stdin (using anyio.wrap_file(TextIOWrapper(...))).
  5. This background task successfully sends the validated JSONRPCMessage onto the anyio memory stream (read_stream_writer) intended for the server's main processing loop.
  6. The server's main processing loop, specifically within mcp.shared.session.BaseSession._receive_loop, awaits messages on the receiving end of that same anyio memory stream (async for message in self._read_stream:).
  7. Crucially, the async for loop in BaseSession._receive_loop never yields the message that was sent to the memory stream. It remains blocked.
  8. Because the initialize message is never received by the BaseSession loop, no response is generated.
  9. The client eventually times out waiting for the initialize response.

This indicates a failure in message passing across the anyio memory stream used internally by the stdio transport implementation, specifically between the task group managing stdio bridging and the task group managing session message processing, when running under the asyncio backend in this configuration.

A separate test confirmed that replacing the internal anyio memory streams with standard asyncio.Queue objects lets the message pass between these task contexts, so initialization and subsequent communication proceed. This strongly suggests the issue lies within the anyio memory stream implementation or its usage in this specific cross-task-group stdio scenario.
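
For readers who want the shape of that replacement without opening the attachment, here is a minimal sketch of a queue-based handoff between a bridging task and a processing task. It is not the patched SDK code: the names stdin_bridge, session_loop, and the _EOF sentinel are hypothetical, and a plain list of strings stands in for stdin.

```python
import asyncio

_EOF = object()  # hypothetical sentinel marking end of input


async def stdin_bridge(lines, queue):
    """Stand-in for the stdio bridging task: forward each line, then signal EOF."""
    for line in lines:
        await queue.put(line)
    await queue.put(_EOF)


async def session_loop(queue, received):
    """Stand-in for the session receive loop: consume messages until EOF."""
    while True:
        message = await queue.get()
        if message is _EOF:
            break
        received.append(message)


async def main():
    queue = asyncio.Queue(maxsize=1)
    received = []
    bridge = asyncio.create_task(stdin_bridge(['{"method": "initialize"}'], queue))
    consumer = asyncio.create_task(session_loop(queue, received))
    # Fail fast instead of hanging if the handoff ever stalls.
    await asyncio.wait_for(asyncio.gather(bridge, consumer), timeout=5)
    return received


received_messages = asyncio.run(main())
print(received_messages)
```

The wait_for guard turns a stalled handoff into a TimeoutError, which makes the sketch usable as a quick sanity check rather than something that can hang silently.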

Steps to Reproduce:

  1. Save the following server code as mcp_file_server.py:
    (Use the original, unpatched version that calls await mcp.run_stdio_async())

    # mcp_file_server.py (Original - Demonstrates Hang)
    import asyncio
    import sys
    from pathlib import Path
    import logging
    
    logging.basicConfig(level=logging.DEBUG, format='%(asctime)s [%(name)s] %(levelname)s: %(message)s')
    log = logging.getLogger("MCPFileServer_Original")
    
    try:
        import pandas as pd
        from mcp.server.fastmcp import FastMCP
        import mcp.server.stdio as mcp_stdio
    except ImportError as e:
        log.error(f"Import error: {e}")
        sys.exit(1)
    
    mcp = FastMCP("FileToolsServer")
    log.info("FastMCP server 'FileToolsServer' initialized.")
    
    @mcp.tool()
    def FileReaderTool(uri: str) -> str:
        log.info(f"Tool 'FileReaderTool' called with URI: {uri}")
        if not uri.startswith("file:"):
            return "Error: Invalid URI scheme."
        try:
            fp = Path(uri.replace("file://", "")).resolve()
            if not fp.is_file():
                return f"Error: File not found: {fp}"
            content = fp.read_text(encoding="utf-8")
            log.info(f"Read {len(content)} chars from {fp}")
            return content
        except Exception as e:
            log.exception(f"Error reading file {uri}")
            return f"Error: Failed to read file '{uri}'. Reason: {str(e)}"
    
    @mcp.tool()
    def CsvReaderTool(uri: str) -> str:
        log.info(f"Tool 'CsvReaderTool' called with URI: {uri}")
        if not uri.startswith("file:"):
            return "Error: Invalid URI scheme."
        try:
            fp = Path(uri.replace("file://", "")).resolve()
            if not fp.is_file():
                return f"Error: CSV file not found: {fp}"
            df = pd.read_csv(fp)
            content_str = df.to_string(index=False)
            log.info(f"Read and formatted CSV from {fp}")
            return content_str
        except Exception as e:
            log.exception(f"Error reading CSV file {uri}")
            return f"Error: Failed to read CSV file '{uri}'. Reason: {str(e)}"
    
    async def main():
        log.info("Starting MCP server main() coroutine.")
        try:
            log.info("Entering stdio_server context manager...")
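            # stdio_server yields anyio memory streams. Note that read_stream and
            # write_stream are only logged below; they are not passed to
            # mcp.run_stdio_async().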
            async with mcp_stdio.stdio_server() as (read_stream, write_stream):
                log.debug(f"stdio_server provided read_stream: {type(read_stream)}")
                log.debug(f"stdio_server provided write_stream: {type(write_stream)}")
                log.info("stdio streams established. Calling mcp.run_stdio_async()...")
                log.debug(">>> About to await mcp.run_stdio_async()")
                # This internally calls Server.run which uses BaseSession._receive_loop
                await mcp.run_stdio_async()
                log.debug("<<< mcp.run_stdio_async() completed") # This is never reached before client disconnect
                log.info("mcp.run_stdio_async() finished.")
            log.info("stdio_server context exited.")
        except Exception as e:
            log.exception("Exception occurred within stdio_server or mcp.run_stdio_async()")
        finally:
            log.info("MCP server main() function exiting.")
    
    if __name__ == "__main__":
        log.info(f"Executing server script: {__file__}")
        try:
            asyncio.run(main())
        except KeyboardInterrupt: log.info("Server stopped by user.")
        except Exception as e: log.exception("An unexpected error occurred at the top level.")
  2. Save the following client code as minimal_client.py:
    (Use the version corrected for Python 3.10 timeouts and list_tools processing)

    # minimal_client.py
    import asyncio
    import sys
    import logging
    from pathlib import Path
    
    logging.basicConfig(level=logging.INFO, format='%(asctime)s [Minimal Client] %(levelname)s: %(message)s')
    log = logging.getLogger("MinimalClient")
    
    try:
        from mcp import ClientSession, StdioServerParameters, types as mcp_types
        from mcp.client.stdio import stdio_client
    except ImportError as e:
        sys.exit(f"Import Error: {e}. Ensure the 'mcp' package is installed.")
    
    SERVER_SCRIPT_PATH = Path("./mcp_file_server.py").resolve()
    
    async def run_minimal_test_inner():
        log.info("Starting minimal client test.")
        if not SERVER_SCRIPT_PATH.is_file():
            log.error(f"Server script not found: {SERVER_SCRIPT_PATH}")
            return False
        server_params = StdioServerParameters(command=sys.executable, args=[str(SERVER_SCRIPT_PATH)])
        log.info(f"Server params: {sys.executable} {SERVER_SCRIPT_PATH}")
        init_successful = False
        try:
            log.info("Attempting to connect via stdio_client...")
            async with stdio_client(server_params) as (reader, writer):
                log.info("stdio_client connected. Creating ClientSession...")
                async with ClientSession(reader, writer) as session:
                    log.info("ClientSession created. Initializing...")
                    try:
                        init_timeout = 30.0
                        init_result = await asyncio.wait_for(session.initialize(), timeout=init_timeout)
                        log.info(f"Initialize successful! Server capabilities: {init_result.capabilities}")
                        init_successful = True
                        try:
                            list_timeout = 15.0
                            list_tools_response = await asyncio.wait_for(session.list_tools(), timeout=list_timeout)
                            log.info(f"Raw tools list response: {list_tools_response!r}")
                            tools_list = getattr(list_tools_response, 'tools', None)
                            if tools_list is not None and isinstance(tools_list, list):
                                tool_names = [t.name for t in tools_list if hasattr(t, 'name')]
                                if tool_names: log.info(f"Successfully listed tools: {tool_names}")
                                else: log.warning("Tools list present but no tool names found.")
                            else: log.warning("Could not get tools list from response.")
                        except asyncio.TimeoutError: log.error("Timeout listing tools.")
                        except Exception as e_list: log.exception("Error listing tools.")
                    except asyncio.TimeoutError: log.error(f"Timeout ({init_timeout}s) waiting for session.initialize().")
                    except Exception as e_init: log.exception("Error during session.initialize().")
                log.info("Exiting ClientSession context.")
            log.info("Exiting stdio_client context.")
        except Exception as e_main: log.exception(f"An error occurred connecting or during session: {e_main}")
        return init_successful
    
    async def main_with_overall_timeout():
        overall_timeout = 45.0
        log.info(f"Running test with overall timeout: {overall_timeout}s")
        try:
            success = await asyncio.wait_for(run_minimal_test_inner(), timeout=overall_timeout)
            if success: log.info("Minimal client test: INITIALIZATION SUCCEEDED.")
            else: log.error("Minimal client test: INITIALIZATION FAILED (within timeout).")
        except asyncio.TimeoutError: log.error(f"Minimal client test: OVERALL TIMEOUT ({overall_timeout}s) REACHED.")
        except Exception as e: log.exception("Unexpected error in main_with_overall_timeout")
    
    if __name__ == "__main__":
        try: asyncio.run(main_with_overall_timeout())
        except KeyboardInterrupt: log.info("Test interrupted.")
  3. Install dependencies: pip install mcp pandas (the SDK is published on PyPI as mcp; the equivalent uv command also works)

  4. Run the client: python minimal_client.py
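
To take the client out of the picture, the server can also be fed a hand-built initialize request on stdin. The sketch below only constructs the newline-delimited JSON-RPC line. The helper name build_initialize_line and the clientInfo values are hypothetical, and the protocol version string is an assumption about what SDK v1.5.0 accepts.

```python
import json


def build_initialize_line(request_id=0, protocol_version="2024-11-05"):
    """Return one newline-terminated JSON-RPC 2.0 initialize request."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,  # assumed version string
            "capabilities": {},
            # Hypothetical client identity, used only for this probe.
            "clientInfo": {"name": "manual-probe", "version": "0.0.0"},
        },
    }
    # The stdio transport exchanges one JSON message per line.
    return json.dumps(request, separators=(",", ":")) + "\n"


if __name__ == "__main__":
    print(build_initialize_line(), end="")
```

Saved under any name (say probe.py, a hypothetical filename), its output can be piped into python mcp_file_server.py. Whether a response appears on stdout can indicate whether the server stalls independently of the client.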

Expected Behavior:

The client connects, initializes successfully, lists tools, and exits cleanly.

Actual Behavior:

The client connects but hangs at the Initializing... step. After the 30-second timeout for session.initialize() expires, it logs the timeout error. Server logs confirm that mcp.run_stdio_async() was awaited but did not process the incoming message; it returned only after the client began disconnecting. The client then stalls again while leaving the stdio_client context, until the 45-second overall timeout fires.

Logs:

(Logs showing the client timeout and the server hanging after >>> About to await mcp.run_stdio_async())

uv run minimal_client.py 
2025-03-27 09:59:39,140 [Minimal Client] INFO: Running test with overall timeout: 45.0s
2025-03-27 09:59:39,140 [Minimal Client] INFO: Starting minimal client test.
2025-03-27 09:59:39,140 [Minimal Client] INFO: Server params: /home/jupyter/MCP_TEST/.venv/bin/python3 /home/jupyter/MCP_TEST/mcp_file_server.py
2025-03-27 09:59:39,140 [Minimal Client] INFO: Attempting to connect via stdio_client...
2025-03-27 09:59:39,144 [Minimal Client] INFO: stdio_client connected. Creating ClientSession...
2025-03-27 09:59:39,144 [Minimal Client] INFO: ClientSession created. Initializing...
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Initializing server 'FileToolsServer'
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ListToolsRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for CallToolRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ListResourcesRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ReadResourceRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for PromptListRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for GetPromptRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ListResourceTemplatesRequest
2025-03-27 09:59:39,811 [MCPFileServer_Original] INFO: FastMCP server 'FileToolsServer' initialized.
2025-03-27 09:59:39,813 [MCPFileServer_Original] INFO: Executing server script: /home/jupyter/MCP_TEST/mcp_file_server.py
2025-03-27 09:59:39,813 [asyncio] DEBUG: Using selector: EpollSelector
2025-03-27 09:59:39,813 [MCPFileServer_Original] INFO: Starting MCP server main() coroutine.
2025-03-27 09:59:39,814 [MCPFileServer_Original] INFO: Entering stdio_server context manager...
2025-03-27 09:59:39,817 [MCPFileServer_Original] DEBUG: stdio_server provided read_stream: <class 'anyio.streams.memory.MemoryObjectReceiveStream'>
2025-03-27 09:59:39,817 [MCPFileServer_Original] DEBUG: stdio_server provided write_stream: <class 'anyio.streams.memory.MemoryObjectSendStream'>
2025-03-27 09:59:39,817 [MCPFileServer_Original] INFO: stdio streams established. Calling mcp.run_stdio_async()...
2025-03-27 09:59:39,817 [MCPFileServer_Original] DEBUG: >>> About to await mcp.run_stdio_async()
2025-03-27 10:00:09,175 [Minimal Client] ERROR: Timeout (30.0s) waiting for session.initialize().
2025-03-27 10:00:09,175 [Minimal Client] INFO: Exiting ClientSession context.
2025-03-27 10:00:09,176 [MCPFileServer_Original] DEBUG: <<< mcp.run_stdio_async() completed
2025-03-27 10:00:09,176 [MCPFileServer_Original] INFO: mcp.run_stdio_async() finished.
2025-03-27 10:00:24,165 [Minimal Client] ERROR: Minimal client test: OVERALL TIMEOUT (45.0s) REACHED.
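
The configured timeouts can be cross-checked against the timestamps above. This is a small sketch using timestamps copied from the log; the helper name elapsed is hypothetical.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed(start, end):
    """Seconds between two timestamps in the logging module's default asctime format."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return round(delta.total_seconds(), 3)


# "ClientSession created. Initializing..." -> "Timeout (30.0s) waiting for session.initialize()."
init_wait = elapsed("2025-03-27 09:59:39,144", "2025-03-27 10:00:09,175")

# "Running test with overall timeout" -> "OVERALL TIMEOUT (45.0s) REACHED."
overall_wait = elapsed("2025-03-27 09:59:39,140", "2025-03-27 10:00:24,165")

# Initialize timeout -> overall timeout: time spent after initialize() gave up.
teardown_wait = elapsed("2025-03-27 10:00:09,175", "2025-03-27 10:00:24,165")

print(init_wait, overall_wait, teardown_wait)
```

The first two values land just over the configured 30 s and 45 s limits. The third shows about 15 s passing between the initialize timeout and the overall timeout, during which "Exiting stdio_client context." is never logged.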

Additional Context:

  • Further debugging using extensive monkeypatching confirmed that the background task in mcp.server.stdio.stdio_server does successfully read the initialize request from stdin and sends it to the internal anyio memory stream.
  • However, the async for loop within mcp.shared.session.BaseSession._receive_loop (which reads from that memory stream) never yields the message.
  • Replacing the internal anyio memory streams with standard asyncio.Queues allowed the communication to succeed, isolating the problem to the anyio memory stream communication between the stdio bridging task group and the session processing task group.

This appears to be a bug in the stdio transport implementation related to anyio memory streams and task group interaction under the asyncio backend.

The patched, working version that uses asyncio.Queue is attached:

[working_code.zip](https://github.com/user-attachments/files/19485125/working_code.zip)

Run via: uv run minimal_client.py
