Fix calculation of multipart S3 upload chunk size #1648
base: main
Conversation
@jaroeichler Adding you as a second pair of eyes to check that this calculation is correct.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: 0 of 2 LGTMs obtained, and all files reviewed, and 3 discussions need to be resolved (waiting on @jaroeichler)
-- commits
line 7 at r1:
This message seems to be out-of-sync with the code
nativelink-store/src/s3_store.rs
line 82 at r1 (raw file):
// S3 parts cannot be more than this number. See:
// https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
const MAX_UPLOAD_PARTS: u64 = 10_000;
nit: Consider doing one of the following:
- Leave a comment to explain that this makes calculations with this value easier
- Make this a `u16` and use some `u64::from` conversions in the calculations.
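For readers following along, a rough sketch of the second option; the value of `MIN_MULTIPART_SIZE` and the `clamp_chunk_count` helper are assumptions for illustration, not code from s3_store.rs:

```rust
// Reviewer's alternative, sketched: keep the limit as a u16 and widen it only
// where the arithmetic needs u64. Illustrative only.
const MAX_UPLOAD_PARTS: u16 = 10_000;
const MIN_MULTIPART_SIZE: u64 = 5 * 1024 * 1024; // 5 MiB, assumed from the thread

fn clamp_chunk_count(max_size: u64) -> u64 {
    (max_size / MIN_MULTIPART_SIZE).clamp(1, u64::from(MAX_UPLOAD_PARTS))
}
```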
nativelink-store/src/s3_store.rs
line 425 at r1 (raw file):
// Sanity check
if max_size > MAX_UPLOAD_SIZE {
fyi I think this is fine.
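For context, the shape of the guard under discussion; the 5 TiB value and the plain `String` error are assumptions for this sketch, not the store's actual constant or error type:

```rust
// Hypothetical form of the sanity check quoted above; not copied from s3_store.rs.
const MAX_UPLOAD_SIZE: u64 = 5 * 1024 * 1024 * 1024 * 1024; // 5 TiB S3 object limit

fn check_upload_size(max_size: u64) -> Result<(), String> {
    // Reject anything S3 cannot store before starting a multipart upload.
    if max_size > MAX_UPLOAD_SIZE {
        return Err(format!("{max_size} bytes exceeds the S3 object size limit"));
    }
    Ok(())
}
```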
nativelink-store/src/s3_store.rs
line 546 at r1 (raw file):
// First approximation of number of chunks if we upload in 5MB chunks (min chunk size),
// clamp to 10,000 parts.
let chunk_count = (max_size / MIN_MULTIPART_SIZE).clamp(1, MAX_UPLOAD_PARTS) + 1;
Let's create a regression test for this.
I think with a test, this PR will be close.
Reviewable status: 0 of 2 LGTMs obtained, and all files reviewed, and 5 discussions need to be resolved
nativelink-store/src/s3_store.rs
line 546 at r1 (raw file):
// First approximation of number of chunks if we upload in 5MB chunks (min chunk size),
// clamp to 10,000 parts.
let chunk_count = (max_size / MIN_MULTIPART_SIZE).clamp(1, MAX_UPLOAD_PARTS) + 1;
Shouldn't it be `clamp(0, MAX_UPLOAD_PARTS - 1)` here to adjust for the +1?
nativelink-store/src/s3_store.rs
line 551 at r1 (raw file):
// chunk, excluding last chunk, clamping to min/max upload size 5MB, 5GB.
let bytes_per_upload_part =
    (max_size / chunk_count).clamp(MIN_MULTIPART_SIZE, MAX_MULTIPART_SIZE);
Can't you just cut this off at the minimum `MIN_MULTIPART_SIZE` and then check if `bytes_per_upload_part < MAX_MULTIPART_SIZE` is satisfied? Then you wouldn't have to recalculate `chunk_count`. Also, why is there no +1 here? Don't we again have to take into account rounding from integer division to ensure that `chunk_count * bytes_per_upload_part >= max_size`?
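To make the rounding concern concrete, here is a minimal sketch of one way ceiling division could guarantee `chunk_count * bytes_per_upload_part >= max_size` without +1/-1 adjustments; the constants mirror the quoted snippet, and `plan_upload` is a hypothetical helper rather than the PR's actual code:

```rust
const MIN_MULTIPART_SIZE: u64 = 5 * 1024 * 1024; // 5 MiB
const MAX_MULTIPART_SIZE: u64 = 5 * 1024 * 1024 * 1024; // 5 GiB
const MAX_UPLOAD_PARTS: u64 = 10_000;

// Ceiling division keeps the parts covering max_size for any object within the
// 5 TiB S3 limit; the clamp then only enforces the per-part bounds.
fn plan_upload(max_size: u64) -> (u64, u64) {
    let chunk_count = max_size
        .div_ceil(MIN_MULTIPART_SIZE)
        .clamp(1, MAX_UPLOAD_PARTS);
    let bytes_per_upload_part = max_size
        .div_ceil(chunk_count)
        .clamp(MIN_MULTIPART_SIZE, MAX_MULTIPART_SIZE);
    (chunk_count, bytes_per_upload_part)
}
```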
Apologies for the long reply time. Thank you for the thoughtful comments; I hope to have addressed them all to satisfaction, except for the test. As the `update()` method currently stands, I feel like the test could be "pretty involved"; it would be nice to isolate the finer points into a unit test, separate from the other business logic...
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 1 files reviewed, and pending CI: Bazel Dev / macos-15, Bazel Dev / ubuntu-24.04, Cargo Dev / macos-15, Cargo Dev / ubuntu-24.04, Coverage, Installation / macos-13, Installation / macos-14, Installation / macos-15, Installation / ubuntu-22.04, Installation / ubuntu-24.04, Local / lre-rs / macos-15, NativeLink.com Cloud / Remote Cache / macos-15, NativeLink.com Cloud / Remote Cache / ubuntu-24.04, Publish image, Publish nativelink-worker-init, Remote / lre-cc / large-ubuntu-22.04, Remote / lre-rs / large-ubuntu-22.04, asan / ubuntu-24.04, macos-15, ubuntu-24.04, ubuntu-24.04 / stable, windows-2022 / stable, and 5 discussions need to be resolved (waiting on @jaroeichler)
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
This message seems to be out-of-sync with the code
Concur; will fix
nativelink-store/src/s3_store.rs
line 82 at r1 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
nit: Consider doing one of the following:
- Leave a comment to explain that this makes calculations with this value easier
- Make this a `u16` and use some `u64::from` conversions in the calculations.
Comment added
nativelink-store/src/s3_store.rs
line 425 at r1 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
fyi I think this is fine.
As in - good addition (to which I'll say thanks)? Or...?
nativelink-store/src/s3_store.rs
line 546 at r1 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Let's create a regression test for this.
Been looking at the mechanisms already used to test the S3 store and should have a solution in the next 24 hours. Is an approach with `StaticReplayClient` what you were expecting?
nativelink-store/src/s3_store.rs
line 546 at r1 (raw file):
Previously, jaroeichler wrote…
Shouldn't it be `clamp(0, MAX_UPLOAD_PARTS - 1)` here to adjust for the +1?
Concur; appreciate the fresh eyes
nativelink-store/src/s3_store.rs
line 551 at r1 (raw file):
Previously, jaroeichler wrote…
Can't you just cut this off at the minimum `MIN_MULTIPART_SIZE` and then check if `bytes_per_upload_part < MAX_MULTIPART_SIZE` is satisfied? Then you wouldn't have to recalculate `chunk_count`. Also, why is there no +1 here? Don't we again have to take into account rounding from integer division to ensure that `chunk_count * bytes_per_upload_part >= max_size`?
Not sure; I was being overly cautious. If we have gotten past the first sanity check, there is no reason to assume that we cannot break the file down into no more than 10k parts of sizes in [5 MiB, 5 GiB]. I have made the corrections with associated comment clarifications.
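A back-of-the-envelope check of that claim, using the published S3 quotas (10,000 parts, 5 MiB/5 GiB part bounds, 5 TiB object limit):

```rust
// Illustrative bounds check: any object that passes a 5 TiB sanity check can be
// split into at most 10,000 parts whose size lands inside [5 MiB, 5 GiB].
fn main() {
    const MIB: u64 = 1024 * 1024;
    const GIB: u64 = 1024 * MIB;
    const TIB: u64 = 1024 * GIB;

    // Worst case for part size: the largest allowed object split into the most parts.
    let part_at_limit = (5 * TIB).div_ceil(10_000);
    assert!(part_at_limit >= 5 * MIB && part_at_limit <= 5 * GIB);
    println!("part size at the 5 TiB limit: {part_at_limit} bytes (~524 MiB)");
}
```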
Reviewed 1 of 1 files at r2.
Reviewable status: 0 of 2 LGTMs obtained, and all files reviewed, and pending CI: Bazel Dev / macos-15, Bazel Dev / ubuntu-24.04, Cargo Dev / macos-15, Cargo Dev / ubuntu-24.04, Coverage, NativeLink.com Cloud / Remote Cache / macos-15, NativeLink.com Cloud / Remote Cache / ubuntu-24.04, asan / ubuntu-24.04, macos-15, ubuntu-24.04, ubuntu-24.04 / stable, windows-2022 / stable, and 4 discussions need to be resolved (waiting on @jaroeichler)
nativelink-store/src/s3_store.rs
line 425 at r1 (raw file):
Previously, byteOfChaos (Kyle Wheat) wrote…
As in - good addition (to which, I'll say thanks)? Or..?
As in "good addition" 😊 (IIUC the commit message mentioned that you weren't sure about this)
nativelink-store/src/s3_store.rs
line 546 at r1 (raw file):
Previously, byteOfChaos (Kyle Wheat) wrote…
Been looking at the mechanisms already used to test the S3 store and should have a solution in the next 24 hours. Is an approach with `StaticReplayClient` what you were expecting?
Ideally yes. Although I noticed that it might be a bit tricky since we probably can't easily test this with a "naive" approach due to limited resources on the github runners. Maybe a streaming approach could work, though even with that, the time needed to chunk a 5TB file might be a bit too time consuming for CI. I'd say please try it, but if it turns out to be too much of a hassle it would also be ok to treat this as an exception where a regression test might simply be infeasible.
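For reference, the general shape of a `StaticReplayClient`-backed S3 client, following the AWS SDK for Rust test-util examples; the URI, XML body, and bucket/key names are placeholders, and the `test-util` features of `aws-smithy-runtime` and `aws-credential-types` plus the `http` crate are assumed to be available. NativeLink's existing S3 tests may wire this differently:

```rust
use aws_sdk_s3::config::{BehaviorVersion, Credentials, Region};
use aws_smithy_runtime::client::http::test_util::{ReplayEvent, StaticReplayClient};
use aws_smithy_types::body::SdkBody;

// Builds an S3 client whose HTTP layer replays one canned CreateMultipartUpload
// exchange instead of talking to AWS. Sketch only.
fn mock_s3_client() -> (StaticReplayClient, aws_sdk_s3::Client) {
    let http_client = StaticReplayClient::new(vec![ReplayEvent::new(
        http::Request::builder()
            .uri("https://test-bucket.s3.us-east-1.amazonaws.com/some-key?uploads")
            .body(SdkBody::empty())
            .unwrap(),
        http::Response::builder()
            .status(200)
            .body(SdkBody::from(
                "<InitiateMultipartUploadResult>\
                 <UploadId>test-upload-id</UploadId>\
                 </InitiateMultipartUploadResult>",
            ))
            .unwrap(),
    )]);
    let config = aws_sdk_s3::Config::builder()
        .behavior_version(BehaviorVersion::latest())
        .region(Region::new("us-east-1"))
        .credentials_provider(Credentials::for_tests())
        .http_client(http_client.clone())
        .build();
    let client = aws_sdk_s3::Client::from_conf(config);
    // Return the replay client too so a test can call assert_requests_match() later.
    (http_client, client)
}
```

A test could then drive the store's multipart path with this client and assert on the recorded part sizes rather than streaming terabytes of data through CI.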
Testing done. I believe it's impossible for an upload within the max upload size limit to break any of the limits on max upload parts, min/max multipart size, etc., which simplifies the test and lets us avoid needing to test with outrageously large files (unless we want to add a test case where we calculate a chunk size that is within, but not equal to either of, `MIN_MULTIPART_SIZE` and `MAX_MULTIPART_SIZE`). That would probably strain CI.
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 2 files reviewed, and pending CI: pre-commit-checks, and 4 discussions need to be resolved (waiting on @jaroeichler)
nativelink-store/src/s3_store.rs
line 425 at r1 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
As in "good addition" 😊 (IIUC the commit message mentioned that you weren't sure about this)
Ah, I see. I'll go all in and remove my uncertainty from the commit message.
nativelink-store/src/s3_store.rs
line 546 at r1 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Ideally yes. Although I noticed that it might be a bit tricky since we probably can't easily test this with a "naive" approach due to limited resources on the github runners. Maybe a streaming approach could work, though even with that, the time needed to chunk a 5TB file might be a bit too time consuming for CI. I'd say please try it, but if it turns out to be too much of a hassle it would also be ok to treat this as an exception where a regression test might simply be infeasible.
I believe it's impossible to violate `MAX_MULTIPART_SIZE` in a way that causes a clamp, given how we calculate `chunk_count` and then `bytes_per_upload_part`, so the test causes a clamp to `MIN_MULTIPART_SIZE`. I left a comment that outlines some thoughts for a test case that would calculate a chunk size between `MIN_MULTIPART_SIZE` and `MAX_MULTIPART_SIZE`.
- Add check for maximum file size
- Calculate minimum chunk size (maximum chunk count) for multipart S3 file upload, checking for proper limit of chunk count before upload
- Make `MAX_UPLOAD_PARTS` a `u64` for uniformity and simplicity
Description
Fix an incorrect calculation of the chunk size for a multipart file upload to S3. The original code seemed to be targeting the smallest chunk size that would still allow a successful S3 file upload.
Added a few range checks - which may be out of place; I'll happily receive correction - at the top of the `update()` method and after calculation of chunk size and chunk count.

Fixes #1425
Type of change
How Has This Been Tested?
Included a range check after calculation, requiring visual verification. For the S3 store, there seem to be only integration tests, and they do not cover multipart upload.
I'd be happy to explore more thorough testing after receiving some guidance.
Checklist
- `bazel test //...` passes locally
- PR is contained in a single commit, using `git amend` (see some docs)