Make it possible to set capabilities #366

Open
wants to merge 6 commits into base: main

jaronoff97
Contributor

Closes #365

@jaronoff97 jaronoff97 requested a review from a team as a code owner March 24, 2025 20:21
@jaronoff97
Contributor Author

For context, I need this to solve open-telemetry/opentelemetry-operator#3822. Introducing leader election to the bridge for HA would mean that one of the bridge pods needs to change its capabilities while running. I could accomplish this by shutting down and then restarting the OpAMP client, but that is very heavy and unnecessary IMO.

codecov bot commented Mar 24, 2025

Codecov Report

Attention: Patch coverage is 81.81818% with 16 lines in your changes missing coverage. Please review.

Project coverage is 80.14%. Comparing base (4b62964) to head (a130307).
Report is 16 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| client/internal/clientcommon.go | 81.15% | 8 Missing and 5 partials ⚠️ |
| client/internal/clientstate.go | 76.92% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #366      +/-   ##
==========================================
+ Coverage   80.10%   80.14%   +0.04%     
==========================================
  Files          25       26       +1     
  Lines        2423     2549     +126     
==========================================
+ Hits         1941     2043     +102     
- Misses        374      391      +17     
- Partials      108      115       +7     

@@ -10,6 +10,8 @@ import (
"github.com/open-telemetry/opamp-go/protobufs"
)

var _ OpAMPClient = &httpClient{}
Contributor

nit: We use both styles in the code, but this one is more common.

Suggested change
var _ OpAMPClient = &httpClient{}
var _ OpAMPClient = (*httpClient)(nil)

@@ -22,6 +22,8 @@ const (
defaultShutdownTimeout = 5 * time.Second
)

var _ OpAMPClient = &wsClient{}
Contributor

Suggested change
var _ OpAMPClient = &wsClient{}
var _ OpAMPClient = (*wsClient)(nil)
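
For readers unfamiliar with the two styles mentioned in the nit, here is a minimal sketch of a compile-time interface assertion. The `Client` interface and the `httpClient` type below are hypothetical stand-ins, not the actual opamp-go definitions. Both declarations ask the compiler to verify that `*httpClient` implements `Client`; the typed nil form does so without constructing a value.

```go
package main

import "fmt"

// Client is a hypothetical stand-in for the OpAMPClient interface.
type Client interface {
	Name() string
}

// httpClient is a hypothetical implementation with a pointer receiver.
type httpClient struct{}

func (c *httpClient) Name() string { return "http" }

// Style 1: assert using a pointer to a composite literal.
var _ Client = &httpClient{}

// Style 2: assert using a typed nil pointer; no httpClient value is constructed.
var _ Client = (*httpClient)(nil)

func main() {
	var c Client = &httpClient{}
	fmt.Println(c.Name())
}
```

If `httpClient` stopped implementing `Client` (for example, after a method was added to the interface), either declaration would fail to compile, which is the point of the check.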

Contributor

@andykellr andykellr left a comment

LGTM

@tigrannajaryan
Member

The spec does not say whether this is a supported operation (to change capabilities after they are first reported). Let me think about it.

Member

@tigrannajaryan tigrannajaryan left a comment

Blocking temporarily; give me a bit of time to think about the implications of this.

@andykellr
Contributor

andykellr commented Mar 25, 2025

> The spec does not say whether this is a supported operation (to change capabilities after they are first reported). Let me think about it.

Sounds good. I held off on merging until you had a chance to look at it.

My reading of the spec was that, since this is neither specifically mentioned nor prohibited, it would be OK to allow it. Some servers may expect the capabilities to stay the same for the life of the connection (based on the existing Go implementation, not on a requirement of the spec), but since they are passed with every message, I think the server should be able to adjust as needed.

@tigrannajaryan
Member

tigrannajaryan commented Mar 25, 2025

I don't think we can make this blanket change.

There are capabilities which are checked at the Start() and the corresponding invariants for the capabilities are checked at the same time.

For example, in PrepareStart() we verify that if AcceptsPackages is set then PackagesStateProvider is also provided. Later in receivedProcessor we rely on this invariant. If we break the invariant (which you can easily do via SetCapabilities) then receivedProcessor will attempt to use a nil PackagesStateProvider (which will either crash or error, I didn't look further).

At least from an implementation perspective, this is not a change we can make.

I am also not sure that, conceptually, changing capabilities on the fly works for every single capability.

Before we make this blanket change, I would like to see an analysis that explains why it is OK to change each particular capability on the fly (I am not sure that is true). Then we will need to make sure the implementation is ready for that (it currently is not). We will also need to update the spec to explain that this is an allowed mode of operation.

As an alternative, is there a particular capability that you need to change after Start()? We can look at supporting just that.

@jaronoff97
Contributor Author

Yes, I did notice that. I think we could instantiate on demand, though, based on the capabilities being set. Initially, I just need to be able to enable AcceptsRemoteConfig, and I'd be okay with limiting SetCapabilities to only flipping that on and off for now, and adding more allowed changes in the future?

@tigrannajaryan
Member

I think AcceptsRemoteConfig is doable. I don't see any special invariants for this capability in the codebase.

We can do this:

  1. Modify OpAMP spec to explicitly call out that the agent may send a different set of capabilities in subsequent AgentToServer messages after the initial one. Explain that only AcceptsRemoteConfig can change.
  2. Restrict SetCapabilities to only allow certain capabilities to be changed after Start. For now it will only allow AcceptsRemoteConfig. We will need to vet all new capabilities one by one if we want to allow changing them.
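
The restriction in step 2 can be sketched as an allowlist over the bits that differ between the current and the requested capabilities. This is an illustration under assumptions, not the code in this PR: the `Capabilities` type, its bit values, `mutableAfterStart`, and `validateCapabilityChange` are hypothetical names.

```go
package main

import "fmt"

// Capabilities is a hypothetical bitmask; the bit values are illustrative only.
type Capabilities uint64

const (
	ReportsStatus Capabilities = 1 << iota
	AcceptsRemoteConfig
	AcceptsPackages
)

// mutableAfterStart lists the capabilities that may be toggled after Start.
// New capabilities would be added here only after being vetted one by one.
const mutableAfterStart = AcceptsRemoteConfig

// validateCapabilityChange rejects any change that touches a bit outside
// the allowlist.
func validateCapabilityChange(current, requested Capabilities) error {
	changed := current ^ requested // bits that differ between the two sets
	if disallowed := changed &^ mutableAfterStart; disallowed != 0 {
		return fmt.Errorf("capability bits %#b cannot change after Start", uint64(disallowed))
	}
	return nil
}

func main() {
	current := ReportsStatus
	// Toggling AcceptsRemoteConfig is allowed.
	fmt.Println(validateCapabilityChange(current, current|AcceptsRemoteConfig))
	// Toggling AcceptsPackages is rejected.
	fmt.Println(validateCapabilityChange(current, current|AcceptsPackages))
}
```

Using XOR followed by AND NOT keeps the check symmetric: turning an allowed capability off is accepted in the same way as turning it on.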

@jaronoff97
Contributor Author

@tigrannajaryan I made the updates. I have one open question on this:

Comment on lines +532 to +534
// QUESTION: Do we want this?
// SetPackagesStateProvider allows the configuration of the packages state provider after start.
// func (c *ClientCommon) SetPackagesStateProvider(packagesStateProvider types.PackagesStateProvider) error {
Contributor Author

Right now there would be no way for a client to set the package state provider if they wanted to modify the package capabilities. Should this be exposed to allow for this?

Member

I am not sure I understand. Why would the client need to set the state provider after the Start? It can be set before Start, but can return a different set of packages anytime.

//
// For more details, refer to the OpAMP specification:
// https://github.com/open-telemetry/opamp-spec/blob/main/specification.md#agenttoservercapabilities
SetCapabilities(capabilities *protobufs.AgentCapabilities) error
Member

This results in an inconsistent way of setting capabilities depending on when you want to do it: before Start() it requires setting a field in StartSettings, while after Start() it requires calling this method.

For functionality that MUST be set before Start() and MAY be set after Start() we previously used only a method. See for example SetAgentDescription().

Should we do the same for SetCapabilities() and deprecate Capabilities field from StartSettings?

Contributor Author

Hm... I think that makes sense to me. @andykellr does that sound alright to you?

Member

@jaronoff97 I am not sure if you updated the PR to make this change. Is it ready for another review round?

Contributor Author

I was waiting on Andy's review, and when you said "we decided to go ahead with this feature" I thought it was dismissing this comment. I see now; I'll update the PR based on what you said above in this thread.

Member

Yeah, no problem, I just wasn't sure if I need to take another look.
I would like @andykellr's opinion too.

Member

@jaronoff97 in case I wasn't clear, I am waiting for you to tell me if this is ready for another round of review.

Contributor Author

I was waiting on Andy's opinion here, but either way I can make your requested changes from above before it's ready for more review. Apologies.

Member

@jaronoff97 sounds good, thanks. Ping me when you update the PR, I will take a look.

Contributor Author

@tigrannajaryan I had two questions about the implementation below, but I think this should be good! Thank you for your patience.

Contributor

This approach is fine with me.

@tigrannajaryan tigrannajaryan dismissed their stale review April 23, 2025 20:04

We decided to go ahead with this feature.

@jaronoff97 jaronoff97 requested a review from tigrannajaryan May 2, 2025 14:38
@@ -129,6 +131,12 @@ func prepareClient(t *testing.T, settings *types.StartSettings, c OpAMPClient) {
prepareSettings(t, settings, c)
err := c.SetAgentDescription(createAgentDescr())
assert.NoError(t, err)
// We ignore the error here.
if settings.Capabilities != 0 {
Contributor Author

I had to ignore the error here because a later test depends on the checks happening when the client starts.

if err := c.ClientSyncedState.SetCapabilities(&capabilities); err != nil {
return err
// Deprecated: Use client.SetCapabilities() instead.
if settings.Capabilities != 0 {
Contributor Author

@tigrannajaryan this could technically be a breaking change if someone isn't setting any capabilities. Should I just default to setting ReportsStatus here, and then we remove this in the future?

Member

I think we should emit a warning in the log if SetCapabilities is not called before Start. Later, we can make it an error. Then finally some time after that we remove Settings.Capabilities field.

In any of these 3 states, ReportsStatus should always be automatically added, regardless of how the Capabilities are set (via Settings or via method call).
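
A rough sketch of the first stage of this deprecation path (warn, but keep working) might look like the following. Everything here is hypothetical and simplified: the `client` type, its fields, and the choice to let the method call take precedence over the deprecated settings value are assumptions for illustration, not the behavior of the PR.

```go
package main

import "fmt"

// Capabilities is a hypothetical bitmask; the bit values are illustrative only.
type Capabilities uint64

const (
	ReportsStatus Capabilities = 1 << iota
	AcceptsRemoteConfig
)

// client is a hypothetical, heavily reduced client.
type client struct {
	caps     Capabilities
	capsSet  bool
	warnings []string // stands in for a real logger
}

// SetCapabilities records the capabilities and always adds ReportsStatus.
func (c *client) SetCapabilities(caps Capabilities) {
	c.caps = caps | ReportsStatus
	c.capsSet = true
}

// Start falls back to the deprecated settings value, with a warning, when
// SetCapabilities was not called first. ReportsStatus is added on both paths.
func (c *client) Start(deprecatedCaps Capabilities) {
	if !c.capsSet {
		c.warnings = append(c.warnings,
			"SetCapabilities was not called before Start; using the deprecated settings field")
		c.caps = deprecatedCaps | ReportsStatus
	}
}

func main() {
	legacy := &client{}
	legacy.Start(AcceptsRemoteConfig)
	fmt.Println(legacy.caps&ReportsStatus != 0, len(legacy.warnings))

	modern := &client{}
	modern.SetCapabilities(AcceptsRemoteConfig)
	modern.Start(0)
	fmt.Println(modern.caps&ReportsStatus != 0, len(modern.warnings))
}
```

In the later stages described above, the warning branch would first become an error and the deprecated parameter would eventually be removed.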

Contributor Author

@tigrannajaryan I do make sure ReportsStatus is always set (I add it in this method as well as in the ClientCommon SetCapabilities), but I'll add a log now.

Contributor Author

@tigrannajaryan should be all good now. PTAL 🙇

Successfully merging this pull request may close these issues.

Client should be able to change capabilities while running
3 participants