Skip to content

controllers/krate/publish: Validate JSON metadata before reading tarball #10069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 26, 2024

Conversation

Turbo87
Copy link
Member

@Turbo87 Turbo87 commented Nov 25, 2024

Previously, the BytesRequest allocated memory for the full tarball (and the JSON metadata blob) before even validating that the name and vers fields of the JSON metadata correspond to a crate and version that the user has publish access too. This commit changes the code to first read the JSON metadata from the request body stream, validate it, and then read the tarball bytes afterwards.

Unfortunately, this means that we still buffer the full tarball in memory, but since we only want to upload the tarball to S3 once we validated the content of the tarball, this is somewhat hard to fix. Ideas are welcome! 😉

@Turbo87 Turbo87 added C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear A-backend ⚙️ labels Nov 25, 2024
@Turbo87 Turbo87 requested a review from eth3lbert November 25, 2024 14:31
Copy link

codecov bot commented Nov 25, 2024

Codecov Report

Attention: Patch coverage is 91.17647% with 6 lines in your changes missing coverage. Please review.

Project coverage is 89.49%. Comparing base (154e004) to head (1fb5bed).

Files with missing lines Patch % Lines
src/controllers/krate/publish.rs 87.75% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10069      +/-   ##
==========================================
- Coverage   89.50%   89.49%   -0.01%     
==========================================
  Files         295      295              
  Lines       31257    31309      +52     
==========================================
+ Hits        27976    28021      +45     
- Misses       3281     3288       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@eth3lbert
Copy link
Contributor

nice, I'll have a look tomorrow :D

Previously, the `BytesRequest` allocated memory for the full tarball (and the JSON metadata blob) before even validating that the `name` and `vers` fields of the JSON metadata correspond to a crate and version that the user has publish access too. This commit changes the code to first read the JSON metadata from the request body stream, validate it, and then read the tarball bytes afterwards.
Copy link
Contributor

@eth3lbert eth3lbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@eth3lbert
Copy link
Contributor

Unfortunately, this means that we still buffer the full tarball in memory, but since we only want to upload the tarball to S3 once we validated the content of the tarball, this is somewhat hard to fix. Ideas are welcome!

Yeah, it would be great if we could stream uploads to S3 and abort them if they're invalid. But this might incur additional costs(?).

@Turbo87
Copy link
Member Author

Turbo87 commented Nov 26, 2024

But this might incur additional costs(?).

yeah, that's the part that I'm slightly worried about. not sure if we can protect against that in a reasonable way.

@Turbo87 Turbo87 enabled auto-merge November 26, 2024 15:18
@Turbo87 Turbo87 merged commit 76cc6c4 into rust-lang:main Nov 26, 2024
8 checks passed
@Turbo87 Turbo87 deleted the publish-stream branch November 26, 2024 15:27
Turbo87 added a commit to Turbo87/crates.io that referenced this pull request Nov 28, 2024
Turbo87 added a commit that referenced this pull request Nov 28, 2024
Turbo87 added a commit to Turbo87/crates.io that referenced this pull request Nov 29, 2024
Turbo87 added a commit to Turbo87/crates.io that referenced this pull request Dec 5, 2024
Turbo87 added a commit to Turbo87/crates.io that referenced this pull request Dec 6, 2024
Turbo87 added a commit to Turbo87/crates.io that referenced this pull request Dec 9, 2024
Turbo87 added a commit that referenced this pull request Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-backend ⚙️ C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants