Reimplement the default object pool for improved performance #45251
Conversation
|
Thanks for your PR, @geeknoid. Someone from the team will get assigned to your PR shortly and we'll get it reviewed. |
|
/benchmark plaintext aspnet-citrine-win kestrel |
|
Benchmark started for plaintext on aspnet-citrine-win with kestrel. Logs: link |
|
Oh wait, this is ObjectPool, that wouldn't affect Kestrel specific benchmarks |
|
Benchmark code from the zip:

```csharp
// © Microsoft Corporation. All rights reserved.

using System.Collections.Generic;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;
using Microsoft.Extensions.ObjectPool;

namespace Microsoft.R9.Extensions.Pools.Bench;

#pragma warning disable R9A038 // Use 'Microsoft.R9.Extensions.Pools.ScaledObjectPool' instead of 'Microsoft.Extensions.ObjectPool.DefaultObjectPool' for improved performance
#pragma warning disable S109 // Magic numbers should not be used
#pragma warning disable R9A041 // Use concrete types when possible for improved performance

[MemoryDiagnoser]
public class Pools
{
    private ObjectPool<Foo> _defaultPool = null!;
    private ObjectPool<Foo> _scaledPool = null!;

    [Params(2, 8, 16, 32, 256, 1024, 2048)]
    public int MaxRetained { get; set; }

    [Params(0, 8, 16)]
    public int OverflowCount { get; set; }

    [GlobalSetup]
    public void GlobalSetup()
    {
        _defaultPool = new DefaultObjectPool<Foo>(new DefaultPooledObjectPolicy<Foo>(), MaxRetained);
        _scaledPool = PoolFactory.CreatePool<Foo>(new DefaultPooledObjectPolicy<Foo>(), MaxRetained);

        for (int i = 0; i < MaxRetained; i++)
        {
            _defaultPool.Return(new Foo());
            _scaledPool.Return(new Foo());
        }
    }

    [Benchmark]
    public void DefaultPool()
    {
        UsePool(_defaultPool);
    }

    [Benchmark]
    public void ScaledPool()
    {
        UsePool(_scaledPool);
    }

    [Benchmark]
    public void DefaultPoolThreaded()
    {
        var t0 = Task.Run(() => UsePool(_defaultPool));
        var t1 = Task.Run(() => UsePool(_defaultPool));

#pragma warning disable VSTHRD002 // Avoid problematic synchronous waits
        Task.WaitAll(t0, t1);
#pragma warning restore VSTHRD002 // Avoid problematic synchronous waits
    }

    [Benchmark]
    public void ScaledPoolThreaded()
    {
        var t0 = Task.Run(() => UsePool(_scaledPool));
        var t1 = Task.Run(() => UsePool(_scaledPool));

#pragma warning disable VSTHRD002 // Avoid problematic synchronous waits
        Task.WaitAll(t0, t1);
#pragma warning restore VSTHRD002 // Avoid problematic synchronous waits
    }

    private void UsePool(ObjectPool<Foo> pool)
    {
        var list = new List<Foo>(MaxRetained + OverflowCount);

        for (int i = 0; i < MaxRetained + OverflowCount; i++)
        {
            var foo = pool.Get();
            list.Add(foo);
        }

        foreach (var f in list)
        {
            pool.Return(f);
        }
    }

    private sealed class Foo
    {
    }
}
```
|
@sebastienros PTAL |
|
It would be good to check the microbenchmark into a I think it would probably make sense to test with more threads concurrently getting and then immediately returning pooled objects. As it stands, I don't think this test is creating as much contention as a high-scale web application with objects being both added and removed simultaneously. I think in this more realistic scenario, using the most-recently returned item (i.e. We should figure out what scale we want to focus on. We've discussed moving a similar type to the runtime (see dotnet/runtime#49680), but I don't think that should stop us from improving the existing type. |
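The tighter-contention test suggested here could look something like the following rough harness: each worker gets an object and returns it immediately, so every operation contends on the pool's head instead of first draining the pool into a list. This is a hypothetical sketch (thread and iteration counts are arbitrary), not code from the PR, and it uses a plain `Stopwatch` rather than BenchmarkDotNet.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Extensions.ObjectPool;

public static class ContentionHarness
{
    // Spin up `threads` workers that each do get-then-immediately-return,
    // maximizing churn on the pool's synchronization.
    public static void Hammer(ObjectPool<object> pool, int threads, int opsPerThread)
    {
        var tasks = new Task[threads];
        for (int t = 0; t < threads; t++)
        {
            tasks[t] = Task.Run(() =>
            {
                for (int i = 0; i < opsPerThread; i++)
                {
                    var o = pool.Get();
                    pool.Return(o); // return immediately: maximal contention
                }
            });
        }
        Task.WaitAll(tasks);
    }

    public static void Main()
    {
        var pool = new DefaultObjectPool<object>(new DefaultPooledObjectPolicy<object>(), 32);
        var sw = Stopwatch.StartNew();
        Hammer(pool, Environment.ProcessorCount, 1_000_000);
        Console.WriteLine($"elapsed: {sw.ElapsedMilliseconds} ms");
    }
}
```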
|
@halter73 We run our pools usually with 1024 or 2048 entries so that we can recover from spiky traffic efficiently (if you imagine traffic spiking at 1K RPS and sometimes dropping to <100 RPS). Thus in our environment, it's not bound to the number of processors in the machine, it's bound to the number of concurrent operations that may need one of the pooled resources. That is, we use these pools not so much to avoid repeated allocations of the same object while processing a single request, but more to avoid allocations across multiple incoming concurrent requests.

Using a stack would likely be a net improvement over using a bag for this case, but alas ConcurrentStack allocates and is thus ineffective for this use. But the stack only has a significant effect for quick get/put cases in quick succession, and shouldn't matter nearly as much when dealing with the broader cross-request cases as described above.

Let me run the test with more contention, although I suspect the results will be very similar. The existing pool implementation is not designed for anything but very small retained sets. |
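The allocation problem mentioned above is that `ConcurrentStack<T>.Push` allocates a linked-list node per push. A common allocation-free alternative, sketched below under my own naming (`SlotPool` is illustrative, not the PR's code), is to scan a pre-allocated slot array with `Interlocked.CompareExchange`, which is roughly the pattern the old `DefaultObjectPool` used:

```csharp
using System;
using System.Threading;

// Illustrative sketch: once the array exists, Get/Return allocate nothing
// except when the pool is empty and we must construct a fresh instance.
public sealed class SlotPool<T> where T : class, new()
{
    private readonly T?[] _slots;

    public SlotPool(int maxRetained) => _slots = new T?[maxRetained];

    public T Get()
    {
        var slots = _slots;
        for (int i = 0; i < slots.Length; i++)
        {
            var item = slots[i];
            // Claim the slot atomically; another thread may race us for it.
            if (item != null && Interlocked.CompareExchange(ref slots[i], null, item) == item)
                return item;
        }
        return new T(); // pool empty: fall back to allocation
    }

    public void Return(T item)
    {
        var slots = _slots;
        for (int i = 0; i < slots.Length; i++)
        {
            // First empty slot wins; if the pool is full, drop the item.
            if (Interlocked.CompareExchange(ref slots[i], item, null) == null)
                return;
        }
    }
}
```

The linear scan is cheap at small sizes but degrades badly at 1024 or 2048 retained entries, which matches the "not designed for anything but very small retained sets" observation.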
|
What changed between this run and the previous one? The original numbers showed worse perf at lower scale, e.g. |
|
@BrennanConroy Seems like my benchmark wasn't right. I'm updating it now and will report back once done. |
|
I've updated the implementation a bit and have a brand new, more detailed benchmark. I'm attaching the benchmark as a zip file; the README file in there shows the benchmark results.

A key observation is that there tend to be two modalities of use for pooled objects: short-lived rentals vs. long-lived rentals. Short-lived is something like "I need a StringBuilder and will return it very quickly", as opposed to "I need a FooBar for the duration of this request's lifetime". To satisfy the first case, you only need a few items in the pool. To satisfy the second case, you need as many items as there is potential variation in the number of concurrent requests processed.

Anyway, the implementation update deals with this duality and delivers better perf all around than the existing implementation. |
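One way to serve both modalities, sketched here under assumed names (`TwoTierPool`, `_fastItem`, `_queue` are illustrative, not necessarily the PR's exact code), is a single "fast item" slot for the get-and-return-quickly case layered over a `ConcurrentQueue<T>` that absorbs the larger cross-request population:

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Illustrative two-tier pool: one CAS-guarded fast slot plus a queue,
// with an Interlocked counter bounding how many items the queue retains.
public sealed class TwoTierPool<T> where T : class, new()
{
    private readonly ConcurrentQueue<T> _queue = new();
    private readonly int _maxQueued;
    private T? _fastItem;
    private int _queued;

    public TwoTierPool(int maxRetained)
    {
        _maxQueued = maxRetained - 1; // one item lives in the fast slot
    }

    public T Get()
    {
        var item = _fastItem;
        if (item != null && Interlocked.CompareExchange(ref _fastItem, null, item) == item)
            return item; // short-lived rental: no queue traffic at all

        if (_queue.TryDequeue(out item))
        {
            Interlocked.Decrement(ref _queued);
            return item;
        }
        return new T(); // pool empty: allocate
    }

    public void Return(T item)
    {
        if (_fastItem == null &&
            Interlocked.CompareExchange(ref _fastItem, item, null) == null)
            return; // fast slot was free: done

        if (Interlocked.Increment(ref _queued) <= _maxQueued)
            _queue.Enqueue(item);
        else
            Interlocked.Decrement(ref _queued); // over capacity: let the GC take it
    }
}
```

Because `ConcurrentQueue<T>` is segment-based, large retained counts don't require scanning, which is what lets this shape scale to the 1024/2048-entry configurations discussed earlier.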
|
Thanks for re-optimizing the return-it-very-quickly case. Given the microbenchmark results seem to show at most a small regression in some already-fast cases and much better performance in "scaled" cases, I'm personally leaning towards taking this. Can you rename Pools.Bench to Microsoft.Extensions.ObjectPool.Microbenchmarks and put the project in |
|
@halter73 I added the benchmark code as requested. I had to modify it to remove the comparison with the old version (since the code base only contains the new version at this point). But I left the results intact in the readme with an explanation. |
halter73 left a comment:
This looks really good! ConcurrentQueue really does seem to be the best for this kind of thing. Maybe if we exposed ConcurrentQueueSegment as a runtime ObjectPool as suggested by
@stephentoub and @benaadams in dotnet/runtime#49680 and dotnet/runtime#23700, we could do even better. But perfect shouldn't be the enemy of better.
Thanks for checking in the benchmarks too. @sebastienros @BrennanConroy @davidfowl Do we have any objections to merging this?
|
Merge away! |
|
@geeknoid, this change will be considered for inclusion in the blog post for the release it'll ship in. Nice work! Please ensure that the original comment in this thread contains a clear explanation of what the change does, why it's important (what problem does it solve?), and, if relevant, include things like code samples and/or performance numbers. This content may not be exactly what goes into the blog post, but it will help the team putting together the announcement. Thanks! |
|
Do we / how much do we care about performance of the pool on .NET Framework? If we don't care, great. If we do care, I think there's a good chance this will have regressed performance there, since |
|
Hi @stephentoub. It looks like you just commented on a closed PR. The team will most probably miss it. If you'd like to bring something important up to their attention, consider filing a new issue and add enough details to build context. |
|
@stephentoub good point. I know we "care" but we don't have any benchmarks for .NET Framework. @geeknoid How much do you care about this regressing .NET Framework performance? |
|
I care a little, but I'm not overly concerned. In general, if teams care deeply about perf, they will not be using the legacy runtimes. However, we could put back the old code as-is with a #if. It's not a big deal. |
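The `#if` escape hatch floated here would look roughly like the following. This is purely an illustrative sketch; the PR did not actually add such a guard, and the type/constant names are invented:

```csharp
// Hypothetical conditional-compilation guard: keep the legacy
// implementation for .NET Framework targets, use the new queue-backed
// one on modern runtimes.
public static class PoolImplementationSelector
{
#if NETFRAMEWORK
    // On .NET Framework, ConcurrentQueue is an older, slower implementation,
    // so the old array-scanning pool could be retained here.
    public const string Active = "legacy";
#else
    public const string Active = "queue-backed";
#endif
}
```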
|
Hi @geeknoid. It looks like you just commented on a closed PR. The team will most probably miss it. If you'd like to bring something important up to their attention, consider filing a new issue and add enough details to build context. |
Reimplement the default object pool for improved performance
Rewrite the default pool so that it scales better.
Description
The existing object pool performs very poorly at scale; this new one is considerably better. A benchmark is attached, and here are the results: