Skip stacktrace in info log for s3 region loading issue #131587

Open
wants to merge 1 commit into
base: main

ywangd
Member

@ywangd ywangd commented Jul 21, 2025

The stack trace is excessive given the level of the log message. The information provided in the exception message should be sufficient for taking action. Also adjust the wording to better match the contribution guide for logging messages.

@ywangd ywangd requested a review from DaveCTurner July 21, 2025 02:03
@ywangd ywangd added >non-issue :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs auto-backport Automatically create backport pull requests when merged v9.2.0 v9.1.1 v8.19.1 labels Jul 21, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Jul 21, 2025
Contributor

@DaveCTurner DaveCTurner left a comment

Hmm I think we need all the inner detail sometimes. But you're right that it'd be best not to emit this when starting the node since many nodes will not need this detail. Instead, I would rather we captured the exception here and then only emit the log message (once per node run) if org.elasticsearch.repositories.s3.S3Service#getClientRegion falls through to trying to use the default region.
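
A minimal sketch of the deferred approach suggested above, under stated assumptions: `DeferredRegionWarning`, `resolve`, and the `Supplier`/`Consumer` parameters are hypothetical names for illustration, not Elasticsearch or AWS SDK APIs. The idea is to capture the lookup failure at startup without logging it, then report it at most once, and only when region resolution actually needs the default region.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: remembers a default-region lookup failure and reports it at most once. */
final class DeferredRegionWarning {
    private final String defaultRegion;      // null when the lookup failed
    private final RuntimeException failure;  // captured at startup, not logged yet
    private final AtomicBoolean reported = new AtomicBoolean();
    private final Consumer<String> log;      // stand-in for a real logger

    DeferredRegionWarning(Supplier<String> regionLookup, Consumer<String> log) {
        String region = null;
        RuntimeException caught = null;
        try {
            region = regionLookup.get();
        } catch (RuntimeException e) {
            caught = e; // keep the exception so its detail is available later
        }
        this.defaultRegion = region;
        this.failure = caught;
        this.log = log;
    }

    /**
     * An explicitly configured region wins. Otherwise fall through to the default
     * region; if that lookup failed, report the failure once and return null.
     */
    String resolve(String configuredRegion) {
        if (configuredRegion != null) {
            return configuredRegion;
        }
        if (defaultRegion != null) {
            return defaultRegion;
        }
        if (failure != null && reported.compareAndSet(false, true)) {
            log.accept("unable to obtain region from default provider chain: " + failure.getMessage());
        }
        return null;
    }
}
```

With this shape, a node that never needs the default region never sees the message, while a node that does need it gets the message exactly once per run. Because the exception itself is retained, the call site could still choose to log inner causes if they turn out to matter.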

@ywangd
Member Author

ywangd commented Jul 21, 2025

The stack trace looks something like the following:

[2025-07-18T11:14:02,464][INFO ][o.e.r.s.S3RepositoryPlugin][testMultiProjectSnapshots] failed to obtain region from default provider chain
software.amazon.awssdk.core.exception.SdkClientException: Unable to load region from any of the providers in the chain software.amazon.awssdk.regions.providers.DefaultAwsRegionProviderChain@64e725ed: [software.amazon.awssdk.regions.providers.SystemSettingsRegionProvider@8b63edf: Unable to load region from system settings. Region must be specified either via environment variable (AWS_REGION) or  system property (aws.region)., software.amazon.awssdk.regions.providers.AwsProfileRegionProvider@5895fb34: No region provided in profile: default, software.amazon.awssdk.regions.providers.InstanceProfileRegionProvider@34273343: Unable to retrieve region information from EC2 Metadata service. Please make sure the application is running on EC2.]
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:130) ~[sdk-core-2.31.78.jar:?]
	at software.amazon.awssdk.regions.providers.AwsRegionProviderChain.getRegion(AwsRegionProviderChain.java:70) ~[regions-2.31.78.jar:?]
	at org.elasticsearch.repositories.s3.S3RepositoryPlugin.getDefaultRegion(S3RepositoryPlugin.java:102) ~[repository-s3-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
	at org.elasticsearch.repositories.s3.S3Service.lambda$new$0(S3Service.java:132) ~[repository-s3-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.RunOnce.run(RunOnce.java:41) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
	at org.elasticsearch.repositories.s3.S3Service.doStart(S3Service.java:418) ~[repository-s3-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
	at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:51) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
	at java.lang.Iterable.forEach(Iterable.java:75) ~[?:?]
	at org.elasticsearch.node.Node.start(Node.java:278) ~[elasticsearch-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
...

I don't see how the stack frames are useful. The only useful part is the one preserved in this PR, i.e.:

[2025-07-18T11:14:02,464][INFO ][o.e.r.s.S3RepositoryPlugin][testMultiProjectSnapshots] unable to obtain region from default provider chain: Unable to load region from any of the providers in the chain software.amazon.awssdk.regions.providers.DefaultAwsRegionProviderChain@64e725ed: [software.amazon.awssdk.regions.providers.SystemSettingsRegionProvider@8b63edf: Unable to load region from system settings. Region must be specified either via environment variable (AWS_REGION) or  system property (aws.region)., software.amazon.awssdk.regions.providers.AwsProfileRegionProvider@5895fb34: No region provided in profile: default, software.amazon.awssdk.regions.providers.InstanceProfileRegionProvider@34273343: Unable to retrieve region information from EC2 Metadata service. Please make sure the application is running on EC2.]

Info logs are meant to be read by end-users. I'd think we don't want to burden them with a stack trace that adds no value when they decide what action to take?
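
To illustrate the difference between the two log lines above, here is a small sketch that uses `java.util.logging` as a stand-in for the Log4j logger that Elasticsearch uses. The class and method names (`RegionLogging`, `logWithStackTrace`, `logMessageOnly`) are hypothetical and are not taken from the PR diff.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical illustration of logging a failure with and without its stack trace. */
final class RegionLogging {

    /** Before: attaching the throwable lets handlers print the full stack trace. */
    static void logWithStackTrace(Logger logger, Exception e) {
        logger.log(Level.INFO, "failed to obtain region from default provider chain", e);
    }

    /** After: only the exception message is appended, so no stack trace is emitted. */
    static void logMessageOnly(Logger logger, Exception e) {
        logger.log(Level.INFO, "unable to obtain region from default provider chain: " + e.getMessage());
    }
}
```

The trade-off is that the second form discards any inner causes, so it fits only when the top-level message already carries everything a reader needs to act on.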

@DaveCTurner
Contributor

DaveCTurner commented Jul 21, 2025

I suspect there may sometimes be more detail in some inner exceptions.

> Info logs are meant to be read by end-users. I'd think we don't want to burden them with a stack trace that adds no value when they decide what action to take?

Exactly, that's why I think we shouldn't log anything unless there's actually some action to take, which would only be because S3Service#getClientRegion fell through to the bottom.

ETA: I opened ES-12449 to track this work.
