Rough sketch (draft 3) limiter extension and middleware #12700
base: main
Conversation
…missed a form of gRPC interceptor. For both client and server gRPC cases in the middleware extension API, introduce a method to obtain a stats handler. ClientStatsHandler() and ServerStatsHandler() methods will be added, returning (grpc.StatsHandler, error). In limitermiddleware, add support for the two new methods. Implement the StatsHandler interface with empty methods for now. The type is named stats.Handler, package documented https://pkg.go.dev/google.golang.org/[email protected]/stats#Handler
I reviewed extensionmiddleware and extensionlimiter:
- I have very few comments that we can discuss;
- We should merge these ASAP; this matches exactly what I expected to see.
- Next on my list will be the configs, then the converters.
I like the way things have shaped up here. I like the interfaces you've defined and I think they set us going in the right direction.
Thanks @bogdandrutu, @mattsains. I've improved this PR (still a draft) based on your feedback.
I'm happy with how the gRPC rate limits are implemented, and I added HTTP network-bytes limits to flesh out the skeleton of this approach. The http.RoundTripper and http.Handler will be created when either of the two weight keys it supports is used: network_bytes and request_count.
Note that request_items and resident_bytes will be implemented at a different level in the receiver(s).
However, there are still likely some challenges. There are existing middlewares with a pre-defined order, including auth, headers, compression, and opentelemetry instrumentation. To add a limiter, we want it to go before compression. Turning compression or opentelemetry instrumentation into extensions, which sounds nice, also implies a transition plan. I could imagine making the middleware order configurable with a default of [compression, opentelemetry]; if you want to add a rate limiter, you'll have to include compression and opentelemetry in the proper order.
P.S. The core Provider interface and contract with middleware and other callers is really the only thing bothering me; otherwise this is looking great. Thank you for working on it! ❤️
@axw I think it's a good question, whether the Weight key should be a fixed enum or an open set. I can imagine a user who decides they want to rate-limit auth requests, having added special support in their auth extension. Then the auth extension would list a limiter extension, and potentially it could use a Weight key like "authorization_count". 🤷 So, I made the value an enum, but I don't really understand the implications for adding values in the future.
@axw @bogdandrutu Please review the changes in configgrpc, confighttp, and otlpreceiver. If I don't hear more input, I'll start to send out single-package changes after next week, starting with extensionlimiter, extensionmiddleware, configlimiter, configmiddleware, then configgrpc, confighttp, then limitermiddleware, memorylimiter, and work starts on two new limiters (admission and rate). Note: the OTLP receiver now implements network_bytes and request_count limits using middleware, but it implements request_items and memory_size limits directly, after it knows this information. The functionality could potentially be implemented in a shim layer between Receiver and pipeline, but there isn't a precedent for this and I would be glad for items/size limits to be opt-in. Is it OK for the limitermiddleware to present itself as an extensionlimiter.Provider so that OTLP receiver can call it directly? I like this, and this is how OTel-Arrow receiver would like it as well -- we can't calculate items/size until the data is parsed and processed a bit.
LGTM (can start PRs):
- configgrpc
- configmiddleware
- confighttp
- extensionmiddleware
For the rest I need more time to think about how it will be used. For example, does the limitermiddlewareextension need to be public (or how does a dev use it)? Or can we have it as part of the extensionmiddleware, to support conversion when the middleware ID is a limiter extension?
I agree with @bogdandrutu, the middleware bits look straightforward and ready to move ahead with.
For the limiter API, I would still prefer to completely remove the weight key.
AFAICS the OTLP receiver doesn't need to call the limiter itself, since the limit is acquired after decoding has happened; so I think this could just as well be done by a processor, and then it's immediately usable with other receivers too. Maybe I'm missing some subtle detail, though.
Is it OK for the limitermiddleware to present itself as an extensionlimiter.Provider so that OTLP receiver can call it directly? I like this, and this is how OTel-Arrow receiver would like it as well -- we can't calculate items/size until the data is parsed and processed a bit.
Not sure I understand the rationale here. Would it still be relevant if we do the item/size limiting in a processor?
@bogdandrutu I need us to reach agreement on the limiter APIs before I proceed. @axw Let's focus on "can this be a processor?"
My position is that processors are too late to effectively govern memory use. This is why memorylimiterprocessor basically doesn't work, and it was the genesis of memorylimiterextension. I understand the intuition behind this question: in some receivers, very little happens between the point where a receiver knows the memory_size and request_items and the call to Consume(), so why not perform these functions in the first processor instead of complicating receiver logic? But in some receivers, a lot happens between the point where a receiver knows memory_size/request_items and starting to process the request. gRPC-unary and HTTP servers typically hide the creation of a goroutine per request, so it looks like there's no difference between "the last thing a receiver does" and "the first thing a processor does", but for many protocols this is not the case. I'll give two examples:
Note: I'm not actually interested in item-count limits. Item-count limits can be implemented in a processor, but I don't think we should. I think it would be appropriate to give receivers an option, a sort of […]. @bogdandrutu wrote:
I showed inside this PR how I would use it, for example where @axw raised the question above. (Note the OTLP receiver is a special case, because (a) it receives OTLP data, therefore middleware can easily limit request_items/memory_size, and (b) as a gRPC-unary server, a goroutine has been allocated before the limiter is called.) @axw, considering your suggested alternative, the point I want to make is that not all uses of limiters will be middleware. In my proposal, the limitermiddleware is nothing more than a reference to a limiter. In yours, a lot of configuration (i.e., "weight keys") goes into the limitermiddleware, which means I'll have to re-create it in receivers that do not support middleware. Narrowing in on @axw's statement:
I read "need to be documented" to mean "sounds complex", but the two look equally complex to me. All solutions need to be documented. :)
This is not how my proposal goes. Every limiter will support all keys, because limiters are expected to operate on weight information alone. There would be no reason for a limiter to support a subset of keys. In my counter-example, the rate-limiter has one section per weight key (and unconfigured keys are unlimited).
Summarizing: we have four weight-specific entries in every limiter, and there will be at least three limiters: memorylimiter, admissionlimiter, ratelimiter. Limiters are expected to support all weight keys and behave identically. Adding new weight keys will be simple for limiters, complex for receivers.
Disagree. I see it as every receiver's responsibility to ensure that all four standard limiter weights are implemented, through middleware or otherwise. A gRPC or HTTP receiver can document that middleware includes limiters; however, we know that a middleware-limiter adapter can generally only limit network_bytes and request_count, leaving the receiver itself responsible for request_items and memory_size (which can be provided as helpers). Therefore, documentation is not what is needed: what's needed is to update the receivers to support middleware and/or call receiver helpers and/or directly call the limiters. Here is what I think we have to document to mitigate complexity:
I hope this helps convince you both that limiter extensions should configure weight keys and that receivers need direct access to limiters. IMO this draft is the best path forward.
@jmacd re "why not a processor?": thanks for the additional context, I get your point now with the OTel-Arrow & syslog examples. My overall takeaway is: we should perform the limiting at the earliest opportunity, and in non-OTLP receivers we will have the information before we have converted to OTLP/pdata, in which case a processor will not be the earliest opportunity.
OK. I had in mind that the limiter implementations would care about the keys, and the receivers were expected to use specific values. Please bear with me, there's a lot to get my head around here... In https://github.com/open-telemetry/opentelemetry-collector/pull/12700/files#r2020428860 you mentioned having some interstitial; do you have an idea of what that would look like already? I'm not sure I understand how that would help address my concern, which is that a user who cares about (only) one kind of rate limiting shouldn't pay for any other kinds.
@axw Thank you! I refactored some of the code from the prior state into a helper library. Note that network_bytes and request_count limits are applied in the limitermiddleware (protocol-specific). There are two interfaces in the helper library, and now the changes in otlpreceiver are quite small. For now, I placed this helper library into extension/extensionlimiter/limiterhelper. Please see:
// Provider is an interface that provides access to different limiter types
// for specific weight keys.
type Provider interface {
	// RateLimiter returns a RateLimiter for the specified weight key.
	RateLimiter(key WeightKey) RateLimiter

	// ResourceLimiter returns a ResourceLimiter for the specified weight key.
	//
	// In cases where a component supports a rate limiter and does not use
	// a release function, the component may return a ResourceLimiterFunc
	// which calls the underlying rate limiter and returns a nil ReleaseFunc.
	ResourceLimiter(key WeightKey) ResourceLimiter
}
I'm having a hard time understanding when you would use a RateLimiter vs. a ResourceLimiter.
In limiterMiddleware we have:
- ResourceLimiter for request_count
- RateLimiter for network_bytes
Using RateLimiter makes sense to me for network_bytes: there's an unlimited stream of bytes coming in, and we want to limit the rate at which they will be consumed.
Why would you not do the same for request_count? IIUC, if you use a ResourceLimiter then the rate at which requests are admitted would depend on both the defined limit and how quickly each resource is released. That's because as soon as you release, the resource becomes available for another consumer.
I would expect a ResourceLimiter to be used only for limited resources, like memory_size. Another case where I think it would make sense is for limiting concurrency. In those cases we don't care about rates; we're just enforcing a limit on that limited resource.
Having said all that, I can think of cases where it would be useful to have an upper limit on concurrent requests in the pipeline in order to protect the backend from a rate it cannot possibly handle. I guess that's what you're going for here?
If that's the case, I'm wondering how the ResourceLimiter would work in a pipeline where the receiver and exporter are separated by a buffer (e.g. Kafka). If the receiver releases the resource immediately after producing to Kafka, then at sustained high receive rate the buffer growth may exceed the backend's capacity.
I've added details about the Rate- and Resource-limiter APIs to the package documentation. I also mention a connection with the OTel metrics data model: Rates apply to Counters, Resources apply to UpDownCounters. I had already documented that Rate limiters can be applied as Resource limiters, simply by using a no-op ReleaseFunc; the same analogy holds in the OTel metrics data model: you can count the rate of increments to an UpDownCounter while ignoring decrements.
Other forms of limiting that I've seen or heard discussed recently, in our context:
- goroutines (a resource)
- auth requests (a rate)
- retries (a rate)
We would be able to add new weight keys for these.
Thank you, that helps. I suppose for keys like request_count there might be configuration to choose between rate or resource limiting then?
	"go.opentelemetry.io/collector/pdata/ptrace"
)

// Consumer is a builder for creating wrapped consumers with resource limiters
Thanks for adding this. Starting to become clearer. Would you also expect to add some common config struct that can be embedded in receiver config, similar to how exporters have exporterhelper.TimeoutConfig and exporterhelper.QueueBatchConfig? Then the receiver can indicate which limits it supports with the With*Limit options, and users can configure the receiver to enable/disable specific limits.
Yes, this is a natural conclusion. I won't go as far as to implement this, but it would be an easy next step.
Those other structures (e.g., TimeoutConfig) are applied at the factory level, whereas the behavior I've illustrated uses the HTTP- or gRPC-level configuration of middleware to infer the limiters. Now that we're here, I have mixed feelings about this aspect of the design. To go in the direction you suggested, here's what I'll do:
Middlewares will continue to apply network_bytes and request_count limits, no changes in interceptor logic.
A config named receiverhelper.LimiterConfig will list []configlimiter.Limiter entries, thus factory-level logic can initialize limiters for request_items and memory_size. However, note that only a list of limiters is configured, with no mention of weight keys. The receiver logic will specify which weight keys it expects to handle where, in code, while configuring the factory.
For example, the OTLP receiver covered in this draft would specify in its factory that receiver-level limiters should be configured for request_items and memory_size, while middleware-level limiters should be configured for request_count and network_bytes.
This is not the only design for weight keys; I'd like your opinion, @axw. Would you extend configlimiter.Limiter to refer to the weight key or keys to use for the binding? We might end up with configuration like:
receivers:
otlp:
protocols:
grpc:
middlewares:
- middleware: limitermiddleware/rate12
http:
middlewares:
- middleware: limitermiddleware/rate12
limiters:
- limiter: admissionlimiter/memory1
key: memory_size
extensions:
limitermiddleware/rate12:
- limiter: ratelimiter/rate1
key: network_bytes
- limiter: ratelimiter/rate2
key: request_count
This is maybe a little confusing and verbose, but it addresses the problem that request_count could be provided by the middleware or by the helper, and now we're configuring this detail instead of providing it in code. The consequence is that misconfiguration is now possible, leading to start failures (e.g., requesting a key that is not provided); the benefit is that it's explicit.
One downside of this configuration, an orthogonal one, is that we cannot use separate limiters for gRPC and HTTP traffic on the same port because, at the factory level, the HTTP and gRPC traffic are identical. Potentially this could be addressed by adding context metadata indicating which protocol is in use, so that Acquire() and Limit() have access to this information, moving this configuration into the limiters.
Nit:
grpc:
middlewares:
- middleware: limitermiddleware/rate12
http:
middlewares:
- middleware: limitermiddleware/rate12
Would this look better as:
grpc:
middlewares: [limitermiddleware/rate12]
http:
middlewares: [limitermiddleware/rate12]
This is maybe a little confusing and verbose, but it addresses the problem that request_count could be provided by the middleware or by the helper, and now we're configuring this detail instead of providing it in code. The consequence is that misconfiguration is now possible, leading to start failures (e.g., requesting a key that is not provided); the benefit is that it's explicit.
My preference is to be explicit over implicit, and bring all the complexity to the surface. That does mean more verbose config, but I think it should be clearer.
Re "requesting a key that is not provided": (IIUC) this is why I was proposing in #12700 (comment) to be even more explicit, and instead of having keys, make each type of limit its own configuration setting. Rebasing on your example above, what I have in mind is this:
receivers:
otlp:
protocols:
grpc:
middleware: [limitermiddleware/rate12]
http:
middleware: [limitermiddleware/rate12]
memory_size_limiter: admissionlimiter/memory1
extensions:
limitermiddleware/rate12:
network_bytes_limiter: ratelimiter/rate1
request_count_limiter: ratelimiter/rate2
One downside of this configuration, an orthogonal one, is that we cannot use separate limiters for gRPC and HTTP traffic on the same port because, at the factory level, the HTTP and gRPC traffic are identical. Potentially this could be addressed by adding context metadata indicating which protocol is in use, so that Acquire() and Limit() have access to this information, moving this configuration into the limiters.
Is this a hypothetical issue, or are there receivers that support both gRPC & HTTP on the same port? Anyway, if there are then they will need to figure out which requests are gRPC and which are not, so I agree that conveying through context metadata should be viable.
// Provider is an interface that provides access to different limiter types
// for specific weight keys.
//
// Extensions implementing this interface can be referenced by their
// names from component rate limiting configurations (e.g., limitermiddleware).
type Provider interface {
Not sure I understand the need for the Provider. Does an extension need to implement both rate and resource? Why not have a RateLimiterProvider (or RateLimiterExtension)?
Can one extension implement only one of the limiters?
}

// Option represents the consumer options
type Option func(*Config)
We prefer the pattern of an interface with a private func for options.
}

// NewConsumer creates a new limiterhelper Consumer
func NewConsumer(provider extensionlimiter.Provider, options ...Option) *Consumer {
Not sure you need the Consumer; you can make:

func WrapTraces(provider extensionlimiter.Provider, nextConsumer consumer.Traces, options ...Option)
type Config struct {
	// Limiter configures the underlying extension used for limiting.
	Limiter configlimiter.Limiter `mapstructure:",squash"`
}
Where would this be used?
)

// NewFactory returns a new factory for the Limiter Middleware extension.
func NewFactory() extension.Factory {
Not yet clear to me where this will be used.
// Limit attempts to apply rate limiting based on the provided weight value.
// Limit is expected to block the caller until the weight can be admitted.
This is very confusing: I see that https://github.com/open-telemetry/opentelemetry-collector/pull/12700/files#diff-50228098c7b1cb3e86d349cf309bfc70dae29f56dc0374f9a5e432f9659cf926R38 returns immediately if memory is over the limit, which in my opinion is the right thing. Here, "Limit is expected to block the caller until the weight can be admitted": I don't agree with this.
// It may block until resources are available or return an error if the limit | ||
// cannot be satisfied. |
Who controls the blocking vs not blocking behavior?
…tor into jmacd/limiter_v3
Description
After prior drafts, summarized in #12603, with feedback from @bogdandrutu and @axw, I explored adding limiters via middleware, structured as two separate configurations and two separate extensions.
This draft includes only the outline of 6 (six!) new modules, which piece together to support a variety of limiter and interceptor behaviors. While I am concerned about the scope of this (#9591, #7441), this appears to be a good direction:
Two new configuration modules, updates to configgrpc and confighttp:
Two extension interfaces:
Three extensions added/modified:
One helper library:
One receiver demonstrating item-count and memory-size limits:
Next steps:
Link to tracking issue
Part of #9591 #7441 #12603
Testing
NONE: for discussion
Documentation
NONE: TODO