Add backpressure when rapidly creating new stateful Streamable HTTP sessions without closing them #677
Also reduce MaxIdleSessionCount to default to 10,000
I tested the AspNetCoreMcpSample using https://github.com/wg/wrk/ and the following command:
And the following lua script:
The ~30,000 RPS isn't too bad considering I get only ~5,000 RPS more using the same parameters against the following minimal endpoint:
However, @DavidParks8 made another benchmark that created a new session per request, which quickly overloaded the server. I emulated this benchmark by simply commenting out the
-- wrk.headers["Mcp-Session-Id"] = args[1]
part of the lua script, and sure enough, after just over 100,000 requests the server became overwhelmed. I attached a debugger and saw that GC could not keep up with disposing all the McpServers/McpSessions. The hundreds of thread pool threads that had been started to call DisposeAsync and unreference the pruned idle sessions stalled waiting on GC, leading to thread pool starvation.
With these changes, since we only allow 11,000 sessions to exist at once with the new default idle session limit of 10,000, we'll only prune at most 1,000 idle sessions every 5 seconds if the client doesn't gracefully close the session with a DELETE request.
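To illustrate the idea of backpressure described above, here is a minimal Python sketch. It is not the SDK's implementation: `IdleSessionLimiter`, `headroom`, and `demo` are hypothetical names, and the numbers are scaled down (10 idle sessions plus 1 slot of headroom instead of 10,000 plus 1,000). New sessions wait on a semaphore once the cap is reached, and pruning releases slots so waiting callers can proceed.

```python
import asyncio


class IdleSessionLimiter:
    """Hypothetical sketch: callers creating sessions wait once the cap is hit."""

    def __init__(self, max_idle, headroom):
        self.max_idle = max_idle
        self.idle = 0
        # Total sessions allowed before new callers must wait.
        self._slots = asyncio.Semaphore(max_idle + headroom)

    async def create_session(self):
        # Waits (applies backpressure) while no slot is free.
        await self._slots.acquire()
        self.idle += 1

    def prune(self):
        """Drop idle sessions above max_idle and free their slots."""
        excess = max(0, self.idle - self.max_idle)
        self.idle -= excess
        for _ in range(excess):
            self._slots.release()
        return excess


async def demo():
    limiter = IdleSessionLimiter(max_idle=10, headroom=1)
    for _ in range(11):
        await limiter.create_session()  # fills every slot

    try:
        # The 12th session has to wait; give up after 50 ms.
        await asyncio.wait_for(limiter.create_session(), timeout=0.05)
        blocked = False
    except asyncio.TimeoutError:
        blocked = True

    pruned = limiter.prune()  # frees one slot
    await asyncio.wait_for(limiter.create_session(), timeout=0.05)
    return blocked, pruned, limiter.idle


if __name__ == "__main__":
    print(asyncio.run(demo()))  # (True, 1, 11)
```

In this sketch the waiting caller simply times out; a real server would presumably keep the request pending until the next prune frees a slot.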
The new default steady-states to approximately 200 new sessions per second. We can look at improving the maximum new session rate once we have better object pooling to avoid GC pressure when creating new sessions so rapidly. We could also look into proactively pruning idle sessions when new sessions are waiting rather than waiting on the background service.
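The ~200 sessions per second figure follows from the numbers above: (11,000 - 10,000) sessions freed per prune, divided by the 5-second prune interval. The following Python sketch checks that arithmetic with a deliberately simplified one-second-tick model. The function and parameter names are hypothetical, and it assumes clients never send DELETE and always offer more new sessions than the server can admit.

```python
def simulate(offered_per_s, max_idle, session_cap, prune_interval_s, seconds):
    """Return the number of new sessions admitted in each simulated second."""
    idle = 0
    admitted_per_second = []
    for t in range(1, seconds + 1):
        # Admit only as many new sessions as the cap leaves room for.
        admitted = min(offered_per_s, session_cap - idle)
        idle += admitted
        admitted_per_second.append(admitted)
        # The background pruner trims idle sessions back down to max_idle.
        if t % prune_interval_s == 0:
            idle = min(idle, max_idle)
    return admitted_per_second


if __name__ == "__main__":
    rates = simulate(
        offered_per_s=30_000,
        max_idle=10_000,
        session_cap=11_000,
        prune_interval_s=5,
        seconds=20,
    )
    print(rates[0])             # 11000 (initial burst up to the cap)
    print(sum(rates[5:]) / 15)  # 200.0 (average after the first prune)
```

The model admits sessions in bursts right after each prune, so the 200 per second is an average over each prune interval rather than a smooth rate.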
Here are results that show that the new session RPS remains stable after this change even if you try to open more concurrent connections to allow more parallel requests. Memory usage looks stable as well at about 400 MB in these tests.
These changes do not have any apparent impact on single-session performance, which remains a little over 30k RPS.