Right-size Lists when created #23714

martincostello · 2020-07-06T21:15:27Z

Changes

Create new instances of List<T> with an appropriate capacity for the items that will be added.
Use Array.Empty<T>() where appropriate, rather than create an empty list and then return it.
~~Create lists with the default capacity for a list containing only a small number of items, which is 4, so that a resize is not immediately required when the first item is added.~~
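As a rough illustration of the first two points (this is not code from the PR; `TagSource` and `GetTags` are hypothetical names), a method can return `Array.Empty<T>()` when there is nothing to add and otherwise size the list from a known upper bound:

```csharp
using System;
using System.Collections.Generic;

public static class TagSource
{
    // Hypothetical helper: returns the non-blank entries of the input.
    public static IReadOnlyList<string> GetTags(string[] raw)
    {
        if (raw == null || raw.Length == 0)
        {
            // Array.Empty<T>() returns a cached zero-length array,
            // so no new list or array is allocated on this path.
            return Array.Empty<string>();
        }

        // The upper bound on the item count is known, so size the list up front.
        var tags = new List<string>(raw.Length);
        foreach (var item in raw)
        {
            if (!string.IsNullOrWhiteSpace(item))
            {
                tags.Add(item);
            }
        }

        return tags;
    }
}
```

This only works when callers treat the result as read-only; anything that casts the result back to `List<T>` or tries to add to it would fail on the array.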

Rationale

I noticed a few places where lists were created and then immediately added to, so I thought the capacity could be specified up front instead. In many cases the lists can be added to again after the method that creates them returns, so rather than the exact number of items I've used the capacity the list would have had before this change once the code had added its members. Otherwise, any later resize would happen sooner than before, and the points at which the backing array doubles in size would shift, which could increase the memory footprint.

I've separated this PR into two commits, as the second commit uses 4 as a magic number quite a lot. I've put a comment on each instance of its use explaining why, but if the change is wanted there might be a better way to do it than sprinkling 4 everywhere.

I used the following micro-benchmarks to guide the use of 4, rather than say 1, as the default capacity.

As you can see from the `*ThenAddOnce` benchmarks, adding 1 item to an empty list allocates the same memory as adding 1 item to a list with a capacity of 4, but the capacity-4 list is ~37% faster because the internal array does not need to be resized from 0 to 4 when the first item is added.

While setting a capacity of only 1 saves 8 bytes compared to initial capacities of 0 and 4, it nearly doubles the time required to add a second item later and requires an extra 24 bytes of allocation, as the `*ThenAddTwice` benchmarks show.

Overall, CreateWithFour* with new List<T>(4) seems the best option as it uses the same memory as new List<T>(), while being faster for adding 1 and 2 items using Add().
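The growth behaviour behind these numbers can be observed directly through `List<T>.Capacity`. The following is a minimal sketch rather than part of the PR (`CapacityDemo` and `CapacitiesAfterAdds` are hypothetical names), and the exact growth policy is an implementation detail of the runtime:

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Returns the list's capacity after each of `adds` calls to Add(),
    // starting from the given initial capacity.
    public static int[] CapacitiesAfterAdds(int initialCapacity, int adds)
    {
        var list = new List<int>(initialCapacity);
        var capacities = new int[adds];

        for (var i = 0; i < adds; i++)
        {
            list.Add(i);
            capacities[i] = list.Capacity;
        }

        return capacities;
    }
}
```

On current .NET runtimes, starting from 0 or 4 leaves the capacity at 4 after both adds, while starting from 1 gives capacities of 1 and then 2. The second add therefore has to allocate a new backing array and copy the existing item, which is consistent with the slower, larger `CreateWithOneThenAddTwice` result.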

Benchmarks

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.329 (2004/?/20H1)
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.301
  [Host]     : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT
  DefaultJob : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT

| Method | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| CreateEmptyThenAddOnce | 23.92 ns | 0.874 ns | 2.521 ns | 23.18 ns | 1.00 | 0.00 | 0.0229 | - | - | 72 B |
| CreateWithOneThenAddOnce | 13.99 ns | 0.450 ns | 1.276 ns | 13.77 ns | 0.59 | 0.08 | 0.0204 | - | - | 64 B |
| CreateWithFourThenAddOnce | 14.95 ns | 0.665 ns | 1.949 ns | 14.31 ns | 0.63 | 0.11 | 0.0229 | - | - | 72 B |
| CreateEmptyThenAddTwice | 24.28 ns | 0.788 ns | 2.237 ns | 23.85 ns | 1.02 | 0.13 | 0.0229 | - | - | 72 B |
| CreateWithOneThenAddTwice | 44.18 ns | 1.267 ns | 3.511 ns | 43.88 ns | 1.87 | 0.25 | 0.0306 | - | - | 96 B |
| CreateWithFourThenAddTwice | 16.94 ns | 0.612 ns | 1.796 ns | 16.68 ns | 0.71 | 0.10 | 0.0229 | - | - | 72 B |

using System.Collections.Generic;
using BenchmarkDotNet.Attributes;

namespace ListBenchmark
{
    [MemoryDiagnoser]
    public class CreateListBenchmarks
    {
        [Benchmark(Baseline = true)]
        public List<int> CreateEmptyThenAddOnce()
        {
            var list = new List<int>();
            list.Add(1);
            return list;
        }

        [Benchmark]
        public List<int> CreateWithOneThenAddOnce()
        {
            var list = new List<int>(1);
            list.Add(1);
            return list;
        }

        [Benchmark]
        public List<int> CreateWithFourThenAddOnce()
        {
            var list = new List<int>(4);
            list.Add(1);
            return list;
        }

        [Benchmark]
        public List<int> CreateEmptyThenAddTwice()
        {
            var list = new List<int>();
            list.Add(1);
            list.Add(2);
            return list;
        }

        [Benchmark]
        public List<int> CreateWithOneThenAddTwice()
        {
            var list = new List<int>(1);
            list.Add(1);
            list.Add(2);
            return list;
        }

        [Benchmark]
        public List<int> CreateWithFourThenAddTwice()
        {
            var list = new List<int>(4);
            list.Add(1);
            list.Add(2);
            return list;
        }
    }
}

BrennanConroy · 2020-07-06T21:35:49Z

Interesting benchmarks, however these are super micro-optimizations and make the code a bit uglier to look at.

We should check which of these (if any) are per request and consider keeping only those ones and drop the rest of the changes.

martincostello · 2020-07-06T21:48:49Z

> Interesting benchmarks, however these are super micro-optimizations and make the code a bit uglier to look at.

Yeah, it's not the prettiest thing once the comments explaining it get added.

> We should check which of these (if any) are per request and consider keeping only those ones and drop the rest of the changes.

Yep, happy to cull anything not considered impactful enough to preserve the readability. I started with a Find All and reviewed all the non-test usage, and already pre-culled a few of them that looked like start-up one-offs to me.

I've been having a few issues with the build locally on my laptop, so pushed it up as a draft to start with to make sure it built correctly and I hadn't broken anything in the tests.

JamesNK · 2020-07-07T03:32:33Z

Whenever we're initializing a collection from a loop, pre-initializing the size is a good improvement.

I would rather not initialize collections to 1/2/3/4 sizes because it doesn't add much, and it adds a small maintainability burden. If we change the number of items manually added to the collection then we need to remember to change the initial capacity size. We should only make that sort of micro-optimization if it is on a hot path (per-request).
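A minimal sketch of the loop case described above, where the final size is known before the loop starts (`HeaderNames` and `ToUpper` are hypothetical names, not code from the repository):

```csharp
using System.Collections.Generic;

public static class HeaderNames
{
    // Hypothetical example: exactly one item is added per source element,
    // so the list never needs to resize while the loop runs.
    public static List<string> ToUpper(IReadOnlyCollection<string> names)
    {
        var result = new List<string>(names.Count);

        foreach (var name in names)
        {
            result.Add(name.ToUpperInvariant());
        }

        return result;
    }
}
```

Because the capacity is derived from `names.Count` rather than a literal, it stays correct when the source changes, which avoids the maintenance burden of hard-coded sizes.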

martincostello · 2020-07-07T07:14:23Z

Anything either of you would like me to do on this before I un-draft it? I'm just conscious that with changes in quite a few files over the repo it will ping quite a few code owners for review 😅

martincostello · 2020-07-07T07:15:38Z

Quick minor note: any instances of 4 without the `// 4 is the default capacity after 1 item is added` comment are cases where 4 items are specifically added to the created list.

src/Http/Headers/src/HttpHeaderParser.cs

src/Middleware/HttpsPolicy/src/HstsOptions.cs

gfoidl · 2020-07-07T08:11:10Z

src/Mvc/Mvc.Core/src/Formatters/InputFormatter.cs

I'd change these, and the following ones in the next files, to the `mediaTypes ??= new List<string>(/* ... */)` pattern while you're at it.
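For readers unfamiliar with it, the suggestion refers to the C# 8 null-coalescing assignment operator. A small sketch, assuming a lazily created list field (`MediaTypeCollector` and its members are hypothetical, and the constructor arguments elided in the comment above are left out here):

```csharp
using System.Collections.Generic;

public class MediaTypeCollector
{
    // Hypothetical field; stays null until the first media type is added.
    private List<string> _mediaTypes;

    public void Add(string mediaType)
    {
        // Allocates the list only if the field is still null; equivalent to
        // `if (_mediaTypes == null) { _mediaTypes = new List<string>(); }`.
        _mediaTypes ??= new List<string>();
        _mediaTypes.Add(mediaType);
    }

    // Reports 0 when the list has never been created.
    public int Count => _mediaTypes?.Count ?? 0;
}
```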

src/SignalR/common/Http.Connections/src/Internal/HttpConnectionDispatcher.cs

martincostello · 2020-08-05T17:59:17Z

@pranavkm Anything specific you'd like me to edit on this PR before hitting the Ready for review button?

pranavkm · 2020-08-05T18:09:40Z

The few times that I've tried initializing collections with the right size, I hadn't noticed any significant performance improvement in application throughput / allocation profiles. But it would be uncool to not get this in since you've already made the changes.

@halter73 \ @BrennanConroy thoughts on this?

BrennanConroy · 2020-08-05T18:13:24Z

I think we mostly wanted to scope the change down to initializing the collection with a size when we know how many elements are being added, such as loops
https://github.com/dotnet/aspnetcore/pull/23714/files#diff-cfe0ce07896d89a04f6140a2eb808b08R101

martincostello · 2020-08-05T18:16:30Z

Ok cool, I'll edit the changes soon to just do the capacities when the number of items is known 👍

Create new instances of List<T> with an appropriate capacity for the items that will be added. Use Array.Empty<T>() where appropriate, rather than create an empty list and then return it.

BrennanConroy

SignalR looks good except for the one comment

src/SignalR/common/Protocols.MessagePack/src/Protocol/MessagePackHubProtocolWorker.cs


src/Razor/Razor/src/TagHelpers/ReadOnlyTagHelperAttributeList.cs

pranavkm · 2020-08-07T18:47:59Z

src/Servers/Kestrel/Transport.Sockets/src/Internal/SocketSender.cs

            if (_bufferList == null)
            {
-                _bufferList = new List<ArraySegment<byte>>();
+                _bufferList = new List<ArraySegment<byte>>((int)buffer.Length);


Could we revert this change? The usage is a little different from other changes (we're pre-allocating the list based on the number of spans in a sequence) and I worry there might be security implications to this.

Revert the change to set the capacity of the list.

Use Array.Empty<TagHelperAttribute>() in two places. Remove static readonly field containing zero-length array.

Co-authored-by: Pranav K <[email protected]>

pranavkm · 2020-08-07T19:09:59Z

Thanks for the PR!

Use Array.Empty<T>() instead of creating a new list.

Revert two changes to use Array.Empty<T>(), as it breaks things even though the class says it's read-only...

martincostello · 2020-08-19T07:25:50Z

Any more feedback on this? Looks like the failing test is just some Selenium flakiness.

pranavkm · 2020-08-19T16:45:59Z

Thanks again!

martincostello force-pushed the Rightsize-Lists branch from dab5b2d to 766aa6f Compare July 6, 2020 21:32

gfoidl reviewed Jul 7, 2020

mkArtakMSFT added the area-blazor Includes: Blazor, Razor Components label Jul 8, 2020

mkArtakMSFT assigned pranavkm Jul 8, 2020

mkArtakMSFT added the community-contribution Indicates that the PR has been added by a community member label Jul 20, 2020

martincostello force-pushed the Rightsize-Lists branch from 766aa6f to e8ae502 Compare August 2, 2020 12:42

Right-size List<T> instances

42df132

Create new instances of List<T> with an appropriate capacity for the items that will be added. Use Array.Empty<T>() where appropriate, rather than create an empty list and then return it.

martincostello force-pushed the Rightsize-Lists branch from e8ae502 to 42df132 Compare August 5, 2020 20:56

martincostello marked this pull request as ready for review August 5, 2020 20:57

martincostello requested review from BrennanConroy, Tratcher, halter73, javiercn and jkotalik as code owners August 5, 2020 20:57

BrennanConroy reviewed Aug 5, 2020

src/SignalR/common/Protocols.MessagePack/src/Protocol/MessagePackHubProtocolWorker.cs (outdated)

jaredpar mentioned this pull request Aug 5, 2020

OSX machines are de-provisioned during CI / PR runs leading to failures dotnet/runtime#34472

Closed

Update src/SignalR/common/Protocols.MessagePack/src/Protocol/MessageP…

ae9b4c8

…ackHubProtocolWorker.cs Co-authored-by: Brennan <[email protected]>

pranavkm approved these changes Aug 7, 2020

pranavkm added this to the 5.0.0-rc1 milestone Aug 7, 2020

pranavkm reviewed Aug 7, 2020

Revert change to set capacity

4679db0

Revert the change to set the capacity of the list.

martincostello and others added 2 commits August 7, 2020 20:00

Use Array.Empty<T>

5ca65ec

Use Array.Empty<TagHelperAttribute>() in two places. Remove static readonly field containing zero-length array.

Update src/Razor/Razor/src/TagHelpers/ReadOnlyTagHelperAttributeList.cs

6d3eef3

Co-authored-by: Pranav K <[email protected]>

pranavkm approved these changes Aug 7, 2020

martincostello added 2 commits August 7, 2020 21:03

Use Array.Empty<T>

fa3147b

Use Array.Empty<T>() instead of creating a new list.

Revert Array.Empty<T>

5669e5e

Revert two changes to use Array.Empty<T>(), as it breaks things even though the class says it's read-only...

pranavkm changed the base branch from master to release/5.0 August 19, 2020 13:22

pranavkm merged commit b22512d into dotnet:release/5.0 Aug 19, 2020

martincostello deleted the Rightsize-Lists branch August 19, 2020 16:52

xtqqczze mentioned this pull request Jul 15, 2021

Right-size Lists when created PowerShell/PowerShell#15782

Closed
