Fix inefficiency with ArrayList.insertSlice #17312
I suggest running the std lib tests locally before pushing; it will help you catch mistakes before the CI runs.
Includes a more robust implementation of replaceRange, which updates the ArrayListUnmanaged if state changes in the managed part of the code before returning an error. Co-authored-by: Andrew Kelley <[email protected]>
* Move `computeBetterCapacity` to the bottom so that `pub` stuff shows up first.
* Rename `computeBetterCapacity` to `growCapacity`. Every function implicitly computes something; that word is always redundant in a function name. "better" is vague. Better in what way? Instead we describe what is actually happening. "grow".
* Improve doc comments to be very explicit about when element pointers are invalidated or not.
* Rename `addManyAtIndex` to `addManyAt`. The parameter is named `index`; that is enough.
* Extract some duplicated code into `addManyAtAssumeCapacity` and make it `pub`.
* Since I audited every line of code for correctness, I changed the style to my personal preference.
* Avoid a redundant `@memset` to `undefined` - memory allocation does that already.
* Fixed comment giving the wrong reason for not calling `ensureTotalCapacity`.
I rebased this branch, squashed your commits into one, and then added my own commit on top to make some more changes, and then finally have set this PR to auto-merge upon successful completion of the CI. Thanks for working on this 👍
This is a weird thing to worry about. The array needs to grow by a factor of its length every time, or it will turn an O(1) amortized append into an O(n) append. realloc for most allocators may give you a few bytes. There may be recently freed space it can coalesce, or it may just have rounded the last allocation up. You might get the rest of the page if the original request was large enough to cause malloc to mmap a new region. Ideally, in the last case the list will know that and bump up request sizes to page multiples and always be as large as possible. Unless these allocs are hundreds of megs, don't bother with mremap. The page table games it plays are very slow and cause a full TLB flush. In the end, this is overly complex, making up for a poor allocation strategy at creation. Just make your next
@jnordwick note that this 'try to resize in place, if it fails then allocate a new slice' strategy is used throughout Lines 1003 to 1012 in ae2cd5f
So benchmarks showing that
There’s an inefficiency in insertSlice of std.ArrayList. This is the current implementation:
The problem is that if ensureUnusedCapacity triggers a resize, it will copy all the elements to the new allocation. insertSlice will then copy a slice of these same items toward the end of the allocation to make space for the insertion, so those items are moved twice. It is better to copy everything to its final place in a single pass.
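To make the single-pass idea concrete, here is a minimal sketch on plain slices. It is not the std implementation: `insertCopyOnce` is a hypothetical helper, and it assumes a new allocation is needed anyway. It copies the prefix, the inserted items, and the suffix straight to their final positions, so each existing element is moved exactly once.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): returns a freshly allocated slice
/// holding `old[0..index] ++ items ++ old[index..]`. Each element of `old`
/// is copied exactly once, directly to its final position.
fn insertCopyOnce(
    comptime T: type,
    allocator: std.mem.Allocator,
    old: []const T,
    index: usize,
    items: []const T,
) std.mem.Allocator.Error![]T {
    std.debug.assert(index <= old.len);
    const new = try allocator.alloc(T, old.len + items.len);
    // Prefix stays where it was, relative to the start.
    @memcpy(new[0..index], old[0..index]);
    // Inserted items go into the gap.
    @memcpy(new[index..][0..items.len], items);
    // Suffix lands after the gap, with no intermediate copy.
    @memcpy(new[index + items.len ..], old[index..]);
    return new;
}

test "insertCopyOnce usage" {
    const a = std.testing.allocator;
    const result = try insertCopyOnce(u8, a, "abef", 2, "cd");
    defer a.free(result);
    try std.testing.expectEqualStrings("abcdef", result);
}
```

By contrast, growing first and then shifting the tail copies the suffix `old[index..]` twice: once into the new allocation and once more to open the gap.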
This pull request introduces a new function to the public interface of ArrayList / ArrayListUnmanaged:
This function makes space for a new slice of items at any position in the list. As with the other add* functions, the items are left uninitialized. With this, it is trivial to reimplement insertSlice.
addManyAtIndex solves the inefficiency mentioned earlier by adapting code from ensureTotalCapacityPrecise and the old insertSlice. It manually grows the memory, if necessary, and copies everything only once.
In the process of implementing this function, I noticed none of the tests for insertSlice covered the case where the ArrayList needs to grow the capacity, so I wrote one.
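The approach described above can be sketched with a small stand-alone type. Everything here is an assumption for illustration: `TinyList` is a hypothetical byte list, not `std.ArrayList`; its `growCapacity` is just one possible geometric growth rule; and the sketch skips the resize-in-place attempt mentioned in the thread, always allocating a new block when it must grow. The function is named `addManyAt`, following the rename in the follow-up commit.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

/// Hypothetical minimal byte list, only to illustrate the technique.
const TinyList = struct {
    items: []u8 = &[_]u8{},
    capacity: usize = 0,

    fn deinit(self: *TinyList, allocator: Allocator) void {
        allocator.free(self.items.ptr[0..self.capacity]);
    }

    /// Opens a gap of `count` uninitialized bytes at `index` and returns it.
    fn addManyAt(self: *TinyList, allocator: Allocator, index: usize, count: usize) Allocator.Error![]u8 {
        std.debug.assert(index <= self.items.len);
        const old_len = self.items.len;
        const new_len = old_len + count;

        if (new_len > self.capacity) {
            // Growth needed: allocate once and copy the prefix and the
            // suffix directly to their final places around the gap.
            const new_cap = growCapacity(self.capacity, new_len);
            const new_mem = try allocator.alloc(u8, new_cap);
            @memcpy(new_mem[0..index], self.items[0..index]);
            @memcpy(new_mem[index + count ..][0 .. old_len - index], self.items[index..]);
            allocator.free(self.items.ptr[0..self.capacity]);
            self.items = new_mem[0..new_len];
            self.capacity = new_cap;
        } else {
            // Enough room: shift the tail back inside the same allocation.
            self.items.len = new_len;
            std.mem.copyBackwards(u8, self.items[index + count ..], self.items[index..old_len]);
        }
        return self.items[index..][0..count];
    }

    /// insertSlice becomes a thin wrapper: open the gap, then fill it.
    fn insertSlice(self: *TinyList, allocator: Allocator, index: usize, items: []const u8) Allocator.Error!void {
        const dst = try self.addManyAt(allocator, index, items.len);
        @memcpy(dst, items);
    }
};

/// Assumed growth rule for this sketch: grow by about 1.5x until it fits.
fn growCapacity(current: usize, minimum: usize) usize {
    var new = current;
    while (new < minimum) new +|= new / 2 + 8;
    return new;
}

test "insertSlice when the list must grow" {
    const a = std.testing.allocator;
    var list = TinyList{};
    defer list.deinit(a);
    try list.insertSlice(a, 0, "ad"); // first allocation
    try list.insertSlice(a, 1, "bc"); // fits in existing capacity
    try list.insertSlice(a, 2, "0123456789"); // forces a new allocation
    try std.testing.expectEqualStrings("ab0123456789cd", list.items);
}
```

The last insertion in the test exercises the case the description says was previously untested: an insert in the middle of the list that exceeds the current capacity.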