Upgrade to AWS Java SDK v2 #6165

Merged
merged 33 commits into from
Jul 6, 2025
Changes from 25 commits
Commits
b239ab9 first modifications (jorgee, May 23, 2025)
ce52d31 first nio changes (jorgee, May 27, 2025)
946abe0 convert nio package to sdk v2 (jorgee, May 30, 2025)
eabc184 fix for test in V2 (jorgee, May 30, 2025)
7349661 fix tests (jorgee, May 30, 2025)
623ae2b fix rebase error (jorgee, Jun 5, 2025)
b06ec93 fix some tests (jorgee, Jun 6, 2025)
68f76cd fix upload dir and tagging overwrite in copy (jorgee, Jun 8, 2025)
f88bafe fix get caller account when acl call fails (jorgee, Jun 9, 2025)
8b2f30b fix get caller account when acl call fails (jorgee, Jun 9, 2025)
a1d6cc0 fix tagging test (jorgee, Jun 9, 2025)
9612146 some clean up (jorgee, Jun 16, 2025)
6a9375a Merge branch 'master' into aws-sdk-v2-fs-impl (jorgee, Jun 16, 2025)
5a2105c Return http client builder in S3ClientConfiguration and new tests (jorgee, Jun 17, 2025)
1bcddbe fix test and update sdk version (jorgee, Jun 17, 2025)
7328873 Update docs [ci fast] (bentsherman, Jun 18, 2025)
951f23e cleanup (bentsherman, Jun 18, 2025)
2829e7e add crt client (jorgee, Jun 18, 2025)
ac83841 fix issues with ClientOverrideOptions, netty S3 Async client and S3 c… (jorgee, Jun 26, 2025)
c1f3151 add new crt options (jorgee, Jun 27, 2025)
dfffad5 Add lang config options and fix s3 tranfer manager actions error logs (jorgee, Jun 27, 2025)
6dd2761 Apply suggestions from code review [ci skip] (jorgee, Jul 1, 2025)
2095cdd remove S3 netty client option (jorgee, Jul 1, 2025)
f09b08e remove netty test (jorgee, Jul 1, 2025)
b00f7a2 cleanup docs (bentsherman, Jul 1, 2025)
26da281 fix infinite loop in S3iterator and remove unused import in test (jorgee, Jul 1, 2025)
d491d3c adding v1 to v2 migration text (jorgee, Jul 2, 2025)
a2554a0 changing deprecated comment in uploads options (jorgee, Jul 2, 2025)
90a1457 Merge branch 'master' into aws-sdk-v2-fs-impl (jorgee, Jul 2, 2025)
b456c2f Update docs (bentsherman, Jul 3, 2025)
94bc3af test [e2e prod] (pditommaso, Jul 4, 2025)
9b04ec4 update loggers with sdk v2 package prefix (jorgee, Jul 4, 2025)
fd86ab4 Use aws batch request model adapter (pditommaso, Jul 5, 2025)
66 changes: 64 additions & 2 deletions docs/reference/config.md
@@ -216,13 +216,36 @@ The following settings are available:
:::
: The retrieval tier to use when restoring objects from Glacier, one of [`Expedited`, `Standard`, `Bulk`].

`aws.client.maxConcurrency`
: :::{versionadded} 25.06.0-edge
:::
: The maximum number of concurrent S3 transfers when using the asynchronous S3 client. By default, this setting is determined by `aws.client.targetThroughputInGbps`. Modifying this value can affect the amount of memory used for S3 transfers.

`aws.client.maxConnections`
: The maximum number of allowed open HTTP connections (default: `50`).
: The maximum number of allowed open HTTP connections when using the synchronous S3 client (default: `50`).

`aws.client.maxErrorRetry`
: The maximum number of retry attempts for failed retryable requests (default: `-1`).

`aws.client.maxNativeMemory`
: :::{versionadded} 25.06.0-edge
:::
: The maximum native memory for S3 transfers when using the asynchronous S3 client. By default, this setting is determined by `aws.client.targetThroughputInGbps`. Modifying this value can affect the memory used by the S3 transfers.

`aws.client.minimumPartSize`
: :::{versionadded} 25.06.0-edge
:::
: The minimum part size for multi-part uploads when using the asynchronous S3 client (default: `8 MB`).

`aws.client.multipartThreshold`
: :::{versionadded} 25.06.0-edge
:::
: The object size threshold for performing multi-part uploads when using the asynchronous S3 client (default: same as `aws.client.minimumPartSize`).

`aws.client.protocol`
: :::{deprecated} 25.06.0-edge
This option is no longer supported.
:::
: The protocol to use when connecting to AWS. Can be `http` or `https` (default: `'https'`).

`aws.client.proxyHost`
@@ -231,6 +254,11 @@ The following settings are available:
`aws.client.proxyPort`
: The port to use when connecting through a proxy.

`aws.client.proxyScheme`
: :::{versionadded} 25.06.0-edge
:::
: The protocol scheme to use when connecting through a proxy. Can be `http` or `https` (default: `'http'`).

`aws.client.proxyUsername`
: The user name to use when connecting through a proxy.

@@ -240,18 +268,27 @@ The following settings are available:
`aws.client.requesterPays`
: :::{versionadded} 24.05.0-edge
:::
: Use [Rrequester Pays](https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html) for S3 buckets (default: `false`).
: Use [Requester Pays](https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html) for S3 buckets (default: `false`).

`aws.client.s3PathStyleAccess`
: Use the path-based access model to access objects in S3-compatible storage systems (default: `false`).

`aws.client.signerOverride`
: :::{deprecated} 25.06.0-edge
This option is no longer supported.
:::
: The name of the signature algorithm to use for signing requests made by the client.

`aws.client.socketSendBufferSizeHint`
: :::{deprecated} 25.06.0-edge
This option is no longer supported.
:::
: The size hint (in bytes) for the low-level TCP send buffer (default: `0`).

`aws.client.socketRecvBufferSizeHint`
: :::{deprecated} 25.06.0-edge
This option is no longer supported.
:::
: The size hint (in bytes) for the low-level TCP receive buffer (default: `0`).

`aws.client.socketTimeout`
@@ -265,20 +302,45 @@ The following settings are available:
:::
: The AWS KMS key Id to be used to encrypt files stored in the target S3 bucket.

`aws.client.targetThroughputInGbps`
: :::{versionadded} 25.06.0-edge
:::
: The target network throughput (in Gbps) when using the asynchronous S3 client (default: `10`). This setting is not used when `aws.client.maxConcurrency` and `aws.client.maxNativeMemory` are specified.

`aws.client.transferManagerThreads`
: :::{versionadded} 25.06.0-edge
:::
: The number of threads used by the S3 transfer manager (default: `10`).

`aws.client.userAgent`
: :::{deprecated} 25.06.0-edge
This option is no longer supported.
:::
: The HTTP user agent header passed with all HTTP requests.

`aws.client.uploadChunkSize`
: The size of a single part in a multipart upload (default: `100 MB`).
: :::{versionchanged} 25.06.0-edge
This option only applies when uploading via `Files.newOutputStream()`. Other uploads are managed by the S3 transfer manager.
:::

`aws.client.uploadMaxAttempts`
: The maximum number of upload attempts after which a multipart upload returns an error (default: `5`).
: :::{versionchanged} 25.06.0-edge
This option only applies when uploading via `Files.newOutputStream()`. Other uploads are managed by the S3 transfer manager.
:::

`aws.client.uploadMaxThreads`
: The maximum number of threads used for multipart upload (default: `10`).
: :::{versionchanged} 25.06.0-edge
This option only applies when uploading via `Files.newOutputStream()`. Other uploads are managed by the S3 transfer manager.
:::

`aws.client.uploadRetrySleep`
: The time to wait after a failed upload attempt to retry the part upload (default: `500ms`).
: :::{versionchanged} 25.06.0-edge
This option only applies when uploading via `Files.newOutputStream()`. Other uploads are managed by the S3 transfer manager.
:::

`aws.client.uploadStorageClass`
: The S3 storage class applied to stored objects. Can be `STANDARD`, `STANDARD_IA`, `ONEZONE_IA`, or `INTELLIGENT_TIERING` (default: `STANDARD`).
8 changes: 6 additions & 2 deletions modules/nf-commons/src/main/nextflow/file/FileHelper.groovy
@@ -880,7 +880,9 @@ class FileHelper {

@Override
FileVisitResult visitFile(Path fullPath, BasicFileAttributes attrs) throws IOException {
final path = folder.relativize(fullPath)
final path = fullPath.isAbsolute()
? folder.relativize(fullPath)
: fullPath
log.trace "visitFiles > file=$path; includeFile=$includeFile; matches=${matcher.matches(path)}; isRegularFile=${attrs.isRegularFile()}"

if (includeFile && matcher.matches(path) && (attrs.isRegularFile() || (options.followLinks == false && attrs.isSymbolicLink())) && (includeHidden || !isHidden(fullPath))) {
@@ -912,7 +914,9 @@ class FileHelper {
}

static protected Path relativize0(Path folder, Path fullPath) {
def result = folder.relativize(fullPath)
final result = fullPath.isAbsolute()
? folder.relativize(fullPath)
: fullPath
String str
if( folder.is(FileSystems.default) || !(str=result.toString()).endsWith('/') )
return result
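
Both `FileHelper` hunks apply the same guard: relativize only when the visited path is absolute, otherwise use it unchanged. The `java.nio.file.Path.relativize` contract states that a relative path cannot be constructed when only one of the two paths has a root component, so a guard of this kind avoids that case. The following is a minimal Java sketch of the same pattern against the default file system. The class and method names are hypothetical, and it makes no claim about which paths the S3 provider actually passes to the visitor.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper mirroring the guard added in FileHelper. */
final class RelativizeGuard {

    /** Relativizes absolute paths against the folder; returns relative paths unchanged. */
    static Path relativeTo(Path folder, Path fullPath) {
        return fullPath.isAbsolute() ? folder.relativize(fullPath) : fullPath;
    }

    public static void main(String[] args) {
        // An absolute base folder, built in a platform-independent way.
        Path folder = Paths.get("data").toAbsolutePath();

        // Absolute input: reduced to the part below the folder.
        Path absolute = folder.resolve("run1").resolve("out.txt");
        System.out.println(relativeTo(folder, absolute));

        // Relative input: passed through as-is instead of being relativized.
        Path relative = Paths.get("run1", "out.txt");
        System.out.println(relativeTo(folder, relative));
    }
}
```

With this guard, both calls in `main` yield the same relative path, regardless of whether the visitor reports absolute or already-relative paths.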
@@ -49,6 +49,12 @@ The amount of time to wait (in milliseconds) when initially establishing a conne
""")
public String endpoint;

@ConfigOption
@Description("""
The maximum number of concurrent S3 transfers when using the asynchronous S3 client.
""")
public int maxConcurrency;

@ConfigOption
@Description("""
The maximum number of allowed open HTTP connections.
@@ -61,6 +67,24 @@ The amount of time to wait (in milliseconds) when initially establishing a conne
""")
public int maxErrorRetry;

@ConfigOption
@Description("""
The maximum native memory used by the S3 asynchronous client for S3 transfers.
""")
public MemoryUnit maxNativeMemory;

@ConfigOption
@Description("""
The minimum size of a single part in a multipart upload (default: `8 MB`).
""")
public MemoryUnit minimumPartSize;

@ConfigOption
@Description("""
The S3 Async client threshold to create multipart S3 transfers. Default is the same as `minimumPartSize`.
""")
public MemoryUnit multipartThreshold;

@ConfigOption
@Description("""
The protocol (i.e. HTTP or HTTPS) to use when connecting to AWS.
@@ -79,6 +103,12 @@ The protocol (i.e. HTTP or HTTPS) to use when connecting to AWS.
""")
public int proxyPort;

@ConfigOption
@Description("""
The protocol scheme to use when connecting through a proxy (http/https).
""")
public String proxyScheme;

@ConfigOption
@Description("""
The user name to use when connecting through a proxy.
@@ -139,6 +169,18 @@ The amount of time to wait (in milliseconds) for data to be transferred over an
""")
public String storageKmsKeyId;

@ConfigOption
@Description("""
The S3 Async client target network throughput in Gbps. This value is used to automatically set `maxConcurrency` and `maxNativeMemory` (default: `10`).
""")
public Double targetThroughputInGbps;

@ConfigOption
@Description("""
The number of threads used by the S3 transfer manager (default: `10`).
""")
public int transferManagerThreads;

@ConfigOption
@Description("""
The HTTP user agent header passed with all HTTP requests.
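
The `proxyScheme` option added in this file complements `proxyHost` and `proxyPort`. As an illustration of how the three values could combine into a single proxy endpoint, here is a small sketch using only `java.net.URI`. The class name, the validation, and the fallback to `http` (the default given in the reference docs) are assumptions made for this example. How the plugin actually passes these options to the SDK HTTP clients is not shown in this diff.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical helper that assembles a proxy endpoint from scheme, host and port. */
final class ProxyEndpoint {

    /** Builds "scheme://host:port", falling back to "http" when no scheme is given. */
    static URI build(String scheme, String host, int port) {
        String effective = (scheme == null || scheme.isEmpty())
                ? "http"
                : scheme.toLowerCase(Locale.ROOT);
        if (!effective.equals("http") && !effective.equals("https")) {
            throw new IllegalArgumentException("Unsupported proxy scheme: " + scheme);
        }
        // Note: a literal IPv6 host would need to be wrapped in brackets before this call.
        return URI.create(effective + "://" + host + ":" + port);
    }

    public static void main(String[] args) {
        // No scheme configured: the documented default applies.
        System.out.println(build(null, "proxy.example.com", 8080));
        // Explicit HTTPS proxy.
        System.out.println(build("https", "proxy.example.com", 3128));
    }
}
```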
25 changes: 14 additions & 11 deletions plugins/nf-amazon/build.gradle
@@ -38,17 +38,20 @@ dependencies {
compileOnly 'org.pf4j:pf4j:3.12.0'

api ('javax.xml.bind:jaxb-api:2.4.0-b180830.0359')
api ('com.amazonaws:aws-java-sdk-s3:1.12.777')
api ('com.amazonaws:aws-java-sdk-ec2:1.12.777')
api ('com.amazonaws:aws-java-sdk-batch:1.12.777')
api ('com.amazonaws:aws-java-sdk-iam:1.12.777')
api ('com.amazonaws:aws-java-sdk-ecs:1.12.777')
api ('com.amazonaws:aws-java-sdk-logs:1.12.777')
api ('com.amazonaws:aws-java-sdk-codecommit:1.12.777')
api ('com.amazonaws:aws-java-sdk-sts:1.12.777')
api ('com.amazonaws:aws-java-sdk-ses:1.12.777')
api ('software.amazon.awssdk:sso:2.26.26')
api ('software.amazon.awssdk:ssooidc:2.26.26')
api ('software.amazon.awssdk:s3:2.31.64')
api ('software.amazon.awssdk:ec2:2.31.64')
api ('software.amazon.awssdk:batch:2.31.64')
api ('software.amazon.awssdk:iam:2.31.64')
api ('software.amazon.awssdk:ecs:2.31.64')
api ('software.amazon.awssdk:cloudwatchlogs:2.31.64')
api ('software.amazon.awssdk:codecommit:2.31.64')
api ('software.amazon.awssdk:sts:2.31.64')
api ('software.amazon.awssdk:ses:2.31.64')
api ('software.amazon.awssdk:sso:2.31.64')
api ('software.amazon.awssdk:ssooidc:2.31.64')
api ('software.amazon.awssdk:s3-transfer-manager:2.31.64')
api ('software.amazon.awssdk:apache-client:2.31.64')
api ('software.amazon.awssdk:aws-crt-client:2.31.64')

constraints {
api 'com.fasterxml.jackson.core:jackson-databind:2.12.7.1'
@@ -35,8 +35,6 @@ class AmazonPlugin extends BasePlugin {
@Override
void start() {
super.start()
// disable aws sdk v1 warning
System.setProperty("aws.java.v1.disableDeprecationAnnouncement", "true")
FileHelper.getOrInstallProvider(S3FileSystemProvider)
}
