feat(lib-storage): improve performance by reducing buffer copies #2612


Closed

Conversation

Linkgoron
Contributor

@Linkgoron Linkgoron commented Jul 22, 2021

Reduce memory usage by avoiding buffer copies and returning the original buffers instead.

Description

This PR reduces the number of buffers that are copied or created from existing buffers. The most significant change is to streams that return Buffers and Uint8Arrays, but some minor improvements were made for string and Uint8Array bodies as well. The following changes were made:

  • Yield the given Buffer/Uint8Array as-is, without copying it, in getDataReadable.ts and getDataReadableStream.ts (probably the most significant change).
  • In byteLength.ts, use Buffer.byteLength(string) instead of Buffer.from(string).byteLength, so the string is not copied into a Buffer just to measure its byte length.
  • Treat Uint8Array the same as Buffer in chunker.ts.
  • Use subarray instead of slice in getChunkBuffer.ts so that Uint8Array and Buffer inputs behave the same. In Node, .slice on a Uint8Array copies the data, while .subarray returns a view without copying; for Buffers, .slice and .subarray are equivalent (neither copies). A small before/after sketch follows this list.
  • getChunkStream.ts was changed in the same way as getChunkBuffer.ts, plus an additional change so that the last buffer is not copied when only one buffer remains (this last part is probably relatively minor).
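
Roughly, these are the before/after patterns (a simplified sketch with illustrative names, not the exact lib-storage source):

import { Readable } from 'stream';

// byteLength: measure a string without materializing a Buffer.
const sizeBefore = (body: string) => Buffer.from(body).byteLength; // copies the string into a new Buffer
const sizeAfter = (body: string) => Buffer.byteLength(body); // computes the UTF-8 length without copying

// chunking: .subarray returns a view, while .slice on a Uint8Array copies.
const chunkBefore = (data: Uint8Array, start: number, end: number) => data.slice(start, end); // copy
const chunkAfter = (data: Uint8Array, start: number, end: number) => data.subarray(start, end); // view, no copy

// data readables: yield the incoming chunk as-is instead of wrapping it in a new Buffer.
async function* passThroughChunks(stream: Readable): AsyncGenerator<Buffer | Uint8Array> {
    for await (const chunk of stream) {
        yield chunk; // previously roughly Buffer.from(chunk), which copied every chunk
    }
}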

Testing

I ran the following file for different iteration counts:

import { getChunk } from './chunker';
import { Readable } from 'stream';
const buf = Buffer.from('#'.repeat(1024 * 1024 * 10)); // 10 MB source buffer
const getIter = () => getChunk(Readable.from((async function* () {
    const chunkSize = 1024 * 1024; // stream the buffer in 1 MB chunks
    for (let i = 0; i < buf.byteLength; i += chunkSize) {
        yield buf.subarray(i, i + chunkSize);
    }
})()), 1024 * 1024 * 5); // split into 5 MB parts

async function iterateTimes(times: number) {
    console.time('timing')
    for (let i = 0; i < times; i++) {
        for await (const z of getIter()) { }
    }
    console.log(`times ${times}`);
    const used = process.memoryUsage() as any;
    for (let key in used) {
        console.log(`${key} ${Math.round(used[key] / 1024 / 1024 * 100) / 100} MB`);
    }
    console.timeEnd('timing')
}

iterateTimes(10);
Results:

Original code:

times 1
rss 59.67 MB
heapTotal 15.68 MB
heapUsed 12.78 MB
external 30.91 MB
arrayBuffers 30.02 MB
timing: 17.289ms

times 2
rss 79.71 MB
heapTotal 15.68 MB
heapUsed 12.82 MB
external 50.91 MB
arrayBuffers 50.02 MB
timing: 26.743ms

times 10
rss 173.15 MB
heapTotal 7.18 MB
heapUsed 2.3 MB
external 90.91 MB
arrayBuffers 90.02 MB
timing: 90.863ms

New code:
times 1
rss 49.52 MB
heapTotal 15.68 MB
heapUsed 12.78 MB
external 20.91 MB
arrayBuffers 20.02 MB
timing: 13.046ms

times 2
rss 59.64 MB
heapTotal 15.43 MB
heapUsed 12.8 MB
external 30.91 MB
arrayBuffers 30.02 MB
timing: 17.994ms

times 10
rss 129.96 MB
heapTotal 7.18 MB
heapUsed 2.32 MB
external 60.91 MB
arrayBuffers 60.02 MB
timing: 59.366ms

Note that I executed the above multiple times to confirm that the results are essentially the same, but I didn't do a z-test or any formal statistics. If there are benchmarking tools I could use to show that the results are significant, I'd be happy to use them. In my testing, the new chunking was faster and used less memory when chunking a stream of a 10 MB buffer into 1 MB chunks. I also checked whether performance degraded for streams that return strings instead of buffers; it was essentially the same.

For byteLength, I ran a similar script that called byteLength 10 times with 10 MB and 1 MB strings, and also saw a significant memory reduction.
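
That script was roughly of this shape (an illustrative sketch, not the exact file; it assumes the helper is exported as byteLength from lib-storage's bytelength.ts):

import { byteLength } from './bytelength';

const body = '#'.repeat(1024 * 1024 * 10); // 10 MB string; use 1024 * 1024 for the 1 MB case

function measureTimes(times: number) {
    console.time('timing');
    for (let i = 0; i < times; i++) {
        byteLength(body);
    }
    console.log(`times ${times}`);
    const used = process.memoryUsage() as any;
    for (const key in used) {
        console.log(`${key} ${Math.round(used[key] / 1024 / 1024 * 100) / 100} MB`);
    }
    console.timeEnd('timing');
}

measureTimes(10);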

Results, 10 MB strings:

Original code:

times 10
rss 129.17 MB
heapTotal 14.93 MB
heapUsed 12.1 MB
external 40.93 MB
arrayBuffers 40.01 MB
timing: 90.462ms

New code:
times 10
rss 29.49 MB
heapTotal 15.43 MB
heapUsed 12.66 MB
external 0.93 MB
arrayBuffers 0.01 MB
timing: 26.115ms

Results, 1 MB strings:

Original code:
times 10
rss 30.39 MB
heapTotal 6.68 MB
heapUsed 3.66 MB
external 0.93 MB
arrayBuffers 0.01 MB
timing: 13.741ms

New code:
times 10
rss 20.36 MB
heapTotal 6.68 MB
heapUsed 3.67 MB
external 0.93 MB
arrayBuffers 0.01 MB
timing: 7.668ms

I did not check Uint8Array buffers or streams (i.e. streams that return such chunks), but I expect similar performance improvements to the above. If necessary, I'll check those as well.

Additional context

If the general direction of this PR is OK, I'd be happy to add more (regression) tests for Uint8Array buffers and streams and some other edge cases. Those tests don't currently exist, so if I did break something for such inputs, there is nothing to catch it. Also, if there's benchmark tooling that I missed, I'd be happy to add benchmarks.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov-commenter

codecov-commenter commented Jul 22, 2021

Codecov Report

Merging #2612 (79d5b58) into main (f6a3ef5) will decrease coverage by 14.52%.
The diff coverage is 87.75%.

❗ Current head 79d5b58 differs from pull request most recent head 767c18b. Consider uploading reports for the commit 767c18b to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##             main    #2612       +/-   ##
===========================================
- Coverage   75.19%   60.67%   -14.53%     
===========================================
  Files         474      516       +42     
  Lines       20721    27808     +7087     
  Branches     4755     6832     +2077     
===========================================
+ Hits        15581    16872     +1291     
- Misses       5140    10936     +5796     
Impacted Files Coverage Δ
...ib/lib-storage/src/chunks/getDataReadableStream.ts 81.81% <75.00%> (-7.08%) ⬇️
...ckages/node-http-handler/src/write-request-body.ts 94.11% <80.00%> (-5.89%) ⬇️
packages/node-http-handler/src/server.mock.ts 90.00% <82.60%> (-4.74%) ⬇️
lib/lib-storage/src/bytelength.ts 82.35% <100.00%> (+2.35%) ⬆️
lib/lib-storage/src/chunker.ts 72.22% <100.00%> (ø)
lib/lib-storage/src/chunks/getChunkBuffer.ts 100.00% <100.00%> (ø)
lib/lib-storage/src/chunks/getChunkStream.ts 100.00% <100.00%> (ø)
lib/lib-storage/src/chunks/getDataReadable.ts 100.00% <100.00%> (ø)
packages/util-body-length-node/src/index.ts 66.66% <100.00%> (ø)
packages/s3-request-presigner/src/presigner.ts 95.23% <0.00%> (-4.77%) ⬇️
... and 583 more

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 921e9b4...767c18b. Read the comment docs.

@Linkgoron
Contributor Author

Linkgoron commented Jul 26, 2021

I've added two extra changes. They're technically not in lib-storage, but they're in the same spirit as my other changes, so I thought it made sense to put them here instead of spamming PRs. Please tell me if you'd prefer a separate PR.

  • In util-body-length-node/src/index.ts, don't copy the buffer just to get its size; return Buffer.byteLength(string) instead, which avoids the copy.
  • In node-http-handler/src/write-request-body.ts: from what I've read in #1433 (Lambda InvokeCommand payload with UInt8Array type), string and Buffer are fine as inputs for end, so there's no reason to copy them before sending. This change is a bit more delicate, though, because it does change how strings are fed (however, this was the behaviour ~9 months ago). A small sketch of the idea follows this list.
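
Roughly, the second change amounts to this kind of before/after (a simplified sketch; the real write-request-body signature may differ):

import { ClientRequest } from 'http';

// before: the body was copied into a fresh Buffer before being written to the request
function writeBodyBefore(request: ClientRequest, body: string | Buffer) {
    request.end(Buffer.from(body)); // extra copy
}

// after: strings and Buffers are valid inputs to end(), so pass them through untouched
function writeBodyAfter(request: ClientRequest, body: string | Buffer) {
    request.end(body); // no copy
}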

@Linkgoron
Contributor Author

@AllanZhengYP Hi, I'm pinging you because you also replied in my other PR. Is there anything I need to do to get this PR rolling? Am I missing something, or are there changes I need to make?

@AllanZhengYP AllanZhengYP self-requested a review August 7, 2021 00:47
@AllanZhengYP AllanZhengYP self-assigned this Aug 7, 2021
@Linkgoron Linkgoron force-pushed the lib-storage-improve-memory-usage branch from 96acaae to d4f8231 on August 7, 2021 16:03
@Linkgoron
Contributor Author

Linkgoron commented Aug 7, 2021

I've added an additional tweak for Uint8Arrays in node-http-handler so that their memory is reused as well, and I've added tests to verify my changes (for all of the components). The test changes look larger than they actually are; most of the diff is whitespace changes to reuse the tests for both Buffers and Uint8Arrays.
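
Reusing the memory here means something like wrapping the same bytes in a Buffer view rather than copying them; roughly (a simplified sketch, not the exact diff):

// wrap the Uint8Array's underlying ArrayBuffer in a Buffer view instead of copying the bytes
const toBufferView = (bytes: Uint8Array): Buffer =>
    Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);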

@aws-sdk-js-automation

AWS CodeBuild CI Report

  • CodeBuild project: sdk-staging-test
  • Commit ID: 79d5b58
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@MadaraUchiha

@AllanZhengYP Hey, could you post an update on this? Why is it stuck? The performance impact is affecting our team as well.

@Linkgoron
Contributor Author

Hey, any updates regarding this PR? Is there anything that's missing, or something else that's blocking this PR? I'd be happy to make any modifications that are needed.

@Linkgoron Linkgoron force-pushed the lib-storage-improve-memory-usage branch from 79d5b58 to 767c18b on January 26, 2022 13:00
@Linkgoron Linkgoron requested a review from a team as a code owner January 26, 2022 13:00
@kuhe kuhe added the pr/needs-review This PR needs a review from a Member. label Jun 3, 2022
@AllanZhengYP AllanZhengYP assigned kuhe and unassigned AllanZhengYP Sep 3, 2022
@uncvrd

uncvrd commented Dec 7, 2022

@Linkgoron hey! I've been dealing with OOM errors when bulk uploading using the lib-storage package (I've reduced my code to a test case that loads a stream from S3 and then re-uploads it using lib-storage). Since it doesn't look like this update to reduce the memory footprint will be implemented any time soon, I was curious what alternative solutions you have taken? I've really struggled to find a stream-only approach that doesn't use buffers and isn't a 6+ year old solution, haha. Would love some guidance if possible.

I assume you were running into memory issues when using this library as well? Honestly, I'm shocked I don't see more issues pop up about the steady memory increase over time until bulk uploading completes.

I wrote an issue where I assume it has something to do with backpressure, or with the buffers not being released by the GC quickly enough, but honestly I'm not experienced in this area of Node at all.

Here's my issue ticket: #4257

@Linkgoron
Contributor Author

I didn't actually have any memory issues, but while working on a different PR I noticed how many copies there were. This PR is a bit old now, so I might not remember correctly, but I believe every buffer is copied at least twice and in some cases three times. I'd be happy to put this PR back into a mergeable state if the team has interest in it.

@uncvrd

uncvrd commented Dec 7, 2022

Got it - yeah, it turns out my issue was not a leak but the fact that this package takes up around 500 MB before stabilizing when bulk uploading. Because my staging sites had 512 MB, that was enough to make it look like memory was constantly increasing, which crashed my server. Increasing the memory to more than 512 MB solved it.

[screenshot attached]

I'm trying to bring more attention to your PR as it would be great to update this and eventually have it merged!!

@kuhe
Contributor

kuhe commented Aug 11, 2023

Due to package migration, the work has been moved into new PRs:

#5078
smithy-lang/smithy-typescript#867

@kuhe kuhe closed this Aug 11, 2023
@github-actions

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 26, 2023