feat: support endpoint override policy based routing #6458
483e17a to a9e8062
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6458      +/-   ##
==========================================
- Coverage   71.06%   71.04%   -0.02%
==========================================
  Files         225      225
  Lines       39264    39404     +140
==========================================
+ Hits        27903    27996      +93
- Misses       9744     9789      +45
- Partials     1617     1619       +2
a9e8062 to 364e8a8
/retest
364e8a8 to 9fb2b51
91de7b0 to be0b252
be0b252 to 3944377
/retest
Endpoint Picker General Implementation Logic
This is how the EPP logic is generally implemented in an Envoy-based API gateway, regardless of whether the control plane is Envoy Gateway, Istio, or KGateway:

Envoy Original Dst Cluster vs Host Override LbPolicy
Original Dst Cluster: Easy to implement, and it does not need the real cluster endpoints. However, it does not support fallback: if endpoint selection fails, routing fails immediately.
Host Override LbPolicy: Somewhat more complex to implement than the original dst cluster. It requires the real cluster endpoints, and the selected endpoint must be one of them; otherwise the policy falls back.

So when a gateway implements InferencePool with the host override LbPolicy, it usually needs a real Service that selects the inference workload endpoints. The host override LbPolicy operates on that Kubernetes Service, and the endpoint selection logic in the EndpointPicker must choose from the same set of endpoints (Istio creates a Service with the same label selectors as the InferencePool selectors).

How to implement the Endpoint Picker logic in Envoy Gateway?
Different AI gateways built on Envoy Gateway take different approaches to reach the above goal:

Envoy AI Gateway: uses the Envoy Gateway Extension Server. It edits the cluster, route, and listener to make this work. This is quite challenging because the edits are complex and must work well with the existing config, so it is not suitable for adopters like AIBrix. The default EPP implementation is GIE.
AIBrix Inference Gateway: uses Envoy Gateway CRD configuration. The EPP implementation is the AIBrix Gateway Plugin (similar to GIE, it provides an intelligent endpoint picker).
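
To make the fallback difference between the two approaches concrete, here is a minimal Python sketch of the routing decision only. It is not Envoy or Envoy Gateway code: the function names, the `NoHealthyUpstream` exception, and the "fall back to the first known endpoint" rule are all assumptions made for illustration. The real fallback behaviour depends on whichever load balancing policy is configured behind the host override policy.

```python
from typing import Optional, Sequence


class NoHealthyUpstream(Exception):
    """Hypothetical error: the request cannot be routed to any endpoint."""


def route_original_dst(selected: Optional[str]) -> str:
    """Original-dst style: use the picked address as-is, with no fallback.

    The cluster does not know the real endpoints, so a missing selection
    fails the request immediately.
    """
    if not selected:
        raise NoHealthyUpstream("endpoint picker returned no address")
    return selected


def route_host_override(selected: Optional[str], endpoints: Sequence[str]) -> str:
    """Host-override style: honour the pick only if it is a known endpoint.

    `endpoints` stands for the real cluster endpoints (for example, the pods
    behind the Kubernetes Service). Falling back to the first endpoint is an
    assumption of this sketch, standing in for a real fallback policy.
    """
    if not endpoints:
        raise NoHealthyUpstream("cluster has no endpoints")
    if selected in endpoints:
        return selected
    return endpoints[0]
```

With this model, a pick that is outside the Service's endpoints still gets routed under host override, while a failed pick under original dst fails the request. This is why the EndpointPicker and the Service need to agree on the same endpoint set.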
ping @Xunzhuo
5e0ac7f to 8383e80
/retest
d2260c2 to 771f02e
seeing consistent test failures @Xunzhuo
can we skip this for now and track it with a GH issue?
@arkodg did it happen only with IPv6?
yeah, looks like it, but it passes on dual
771f02e to 9cf2be3
@Xunzhuo please fix gen-check
Signed-off-by: bitliu <[email protected]>
7c3cefc to 6c792b6
What type of PR is this?
feat: support host override policy based routing
What this PR does / why we need it:
Support routing based on a host override policy. A typical use case is the LLM Endpoint Picker.
Which issue(s) this PR fixes:
Fixes #6456
Release Notes: Yes
Use in this way: