feat: support endpoint override policy based routing #6458

Merged: 8 commits merged into envoyproxy:main on Jul 30, 2025

Conversation

Xunzhuo
Member

@Xunzhuo Xunzhuo commented Jul 3, 2025

What type of PR is this?

feat: support host override policy based routing

What this PR does / why we need it:

Support host override policy based routing. A typical scenario is the LLM Endpoint Picker.

Which issue(s) this PR fixes:

Fixes #6456

Release Notes: Yes

Example usage:

- apiVersion: gateway.envoyproxy.io/v1alpha1
  kind: BackendTrafficPolicy
  metadata:
    namespace: default
    name: policy-for-header-override
  spec:
    targetRef:
      group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: httproute
    loadBalancer:
      type: RoundRobin
      endpointOverride:
        extractFrom:
        - header: "x-gateway-destination-endpoint"
- apiVersion: gateway.networking.k8s.io/v1
  kind: HTTPRoute
  metadata:
    namespace: default
    name: httproute
  spec:
    hostnames:
    - gateway.envoyproxy.io
    parentRefs:
    - namespace: envoy-gateway
      name: inference-gateway
      sectionName: http
    rules:
    - matches:
      - path:
          value: "/v1"
      backendRefs:
      - name: fallback-inference-service
        port: 8080
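
To exercise a policy like the one above, a request needs to carry the override header. The following is a minimal sketch, not code from this PR: it assumes the header value is an `ip:port` string (with brackets around IPv6 addresses), and `format_endpoint` and `override_headers` are hypothetical helper names.

```python
import ipaddress

# Header name taken from the BackendTrafficPolicy example above.
OVERRIDE_HEADER = "x-gateway-destination-endpoint"


def format_endpoint(ip: str, port: int) -> str:
    """Format an endpoint as host:port (assumed value format), bracketing IPv6."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version == 6:
        return f"[{addr}]:{port}"
    return f"{addr}:{port}"


def override_headers(ip: str, port: int) -> dict:
    """Build the request headers that ask the gateway for a specific endpoint."""
    return {OVERRIDE_HEADER: format_endpoint(ip, port)}


if __name__ == "__main__":
    # Hypothetical pod IP and port, for illustration only.
    print(override_headers("10.0.0.5", 8080))
```

If the requested address is not one of the backend's endpoints, the request is expected to fall back to the configured load balancer type (RoundRobin in the example).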

@Xunzhuo Xunzhuo marked this pull request as ready for review July 3, 2025 09:51
@Xunzhuo Xunzhuo requested a review from a team as a code owner July 3, 2025 09:51
@Xunzhuo Xunzhuo force-pushed the feat-host-override branch 3 times, most recently from 483e17a to a9e8062 Compare July 3, 2025 09:59
Member Author

@Xunzhuo Xunzhuo left a comment

e2e passed locally.

codecov bot commented Jul 3, 2025

Codecov Report

❌ Patch coverage is 61.42857% with 54 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.04%. Comparing base (907b90b) to head (6c792b6).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
internal/xds/translator/cluster.go 56.80% 49 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6458      +/-   ##
==========================================
- Coverage   71.06%   71.04%   -0.02%     
==========================================
  Files         225      225              
  Lines       39264    39404     +140     
==========================================
+ Hits        27903    27996      +93     
- Misses       9744     9789      +45     
- Partials     1617     1619       +2     

@Xunzhuo Xunzhuo force-pushed the feat-host-override branch from a9e8062 to 364e8a8 Compare July 3, 2025 13:49
@Xunzhuo
Member Author

Xunzhuo commented Jul 4, 2025

/retest

@Xunzhuo Xunzhuo force-pushed the feat-host-override branch from 364e8a8 to 9fb2b51 Compare July 4, 2025 14:26
@arkodg arkodg added this to the v1.5.0-rc.1 Release milestone Jul 9, 2025
@Xunzhuo Xunzhuo changed the title feat: support host override policy based routing feat: support endpoint override policy based routing Jul 10, 2025
@Xunzhuo Xunzhuo force-pushed the feat-host-override branch 4 times, most recently from 91de7b0 to be0b252 Compare July 10, 2025 14:14
@Xunzhuo Xunzhuo requested a review from arkodg July 10, 2025 14:17
@Xunzhuo Xunzhuo force-pushed the feat-host-override branch from be0b252 to 3944377 Compare July 10, 2025 14:25
@Xunzhuo
Member Author

Xunzhuo commented Jul 11, 2025

/retest

@Xunzhuo
Member Author

Xunzhuo commented Jul 11, 2025

Endpoint Picker General Implementation Logic

This is how the EPP logic is generally implemented in an Envoy-based API gateway, regardless of whether the control plane is Envoy Gateway, Istio, or KGateway:

  1. Control plane: tells Envoy how to route (host override LbPolicy or original dst cluster) and how to connect to the EPP ext-proc (HTTP ext-proc filter plus a route-level EPP ext-proc config override; if the ext-proc needs to read or write metadata, receiving_namespaces/forwarding_namespaces must also be set in the ext-proc config).
  2. Data plane: the EPP ext-proc selects the endpoint and adds it to metadata or a header. Envoy routes to that endpoint based on the rules sent by the control plane.
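
The data plane step can be sketched from the EPP side as follows. This is an illustrative model only, not the GIE or AIBrix implementation: the `Endpoint` type, the `queue_depth` load signal, and the function names are all hypothetical, and the real exchange happens over Envoy's ext-proc gRPC stream rather than a plain function call.

```python
from dataclasses import dataclass

OVERRIDE_HEADER = "x-gateway-destination-endpoint"


@dataclass(frozen=True)
class Endpoint:
    address: str      # "ip:port" of an inference workload (assumed format)
    queue_depth: int  # hypothetical load signal used for picking


def pick_endpoint(endpoints):
    """Choose the least-loaded endpoint (a stand-in for real picker logic)."""
    if not endpoints:
        raise ValueError("no endpoints to pick from")
    return min(endpoints, key=lambda e: e.queue_depth)


def epp_process(headers, endpoints):
    """Return a copy of the request headers with the chosen endpoint added.

    This mirrors the header mutation an ext-proc would send back to Envoy.
    """
    chosen = pick_endpoint(endpoints)
    mutated = dict(headers)
    mutated[OVERRIDE_HEADER] = chosen.address
    return mutated


if __name__ == "__main__":
    pool = [Endpoint("10.0.0.1:8000", 7), Endpoint("10.0.0.2:8000", 2)]
    print(epp_process({":path": "/v1/completions"}, pool))
```

Envoy then reads the header (or metadata) according to whatever routing rule the control plane installed.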

Envoy Original Dst Cluster vs Host Override LbPolicy

Original Dst Cluster: It is easy to implement and doesn't need the real cluster endpoints. But it does not support fallback, which means that if the selection fails, routing fails immediately.

Host Override LbPolicy: It is somewhat more complex to implement than the original dst cluster. It requires the real cluster endpoints, and the selected endpoint must be among them; otherwise it falls back. So when a gateway implements InferencePool with the host override LbPolicy, we usually need a real Service that selects the inference workload endpoints, the host override LbPolicy works on that Kubernetes Service, and the endpoint selection logic in the EndpointPicker must pick from the same set of endpoints (Istio creates a Service with the same label selectors as the InferencePool selectors).
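
The difference in fallback behavior can be modeled with a small sketch. This is a simplified simulation of the two strategies as described above, not Envoy's actual implementation; `RoutingError`, `route_original_dst`, and `HostOverrideBalancer` are hypothetical names, and round robin is assumed as the fallback policy.

```python
import itertools

OVERRIDE_HEADER = "x-gateway-destination-endpoint"


class RoutingError(Exception):
    """Raised when no upstream can be chosen."""


def route_original_dst(headers):
    """Original dst style: trust the selected address, no endpoint list, no fallback."""
    target = headers.get(OVERRIDE_HEADER)
    if not target:
        raise RoutingError("no destination selected and no fallback available")
    return target


class HostOverrideBalancer:
    """Host override style: honor the selection only if it is a known endpoint."""

    def __init__(self, endpoints):
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one real endpoint is required")
        # Fallback policy: plain round robin over the real endpoints.
        self._fallback = itertools.cycle(self._endpoints)

    def pick(self, headers):
        target = headers.get(OVERRIDE_HEADER)
        if target in self._endpoints:
            return target
        # Missing or unknown selection: fall back instead of failing.
        return next(self._fallback)


if __name__ == "__main__":
    lb = HostOverrideBalancer(["10.0.0.1:8000", "10.0.0.2:8000"])
    print(lb.pick({OVERRIDE_HEADER: "10.0.0.2:8000"}))  # selection honored
    print(lb.pick({OVERRIDE_HEADER: "10.9.9.9:8000"}))  # unknown, falls back
```

The membership check is why the picker and the cluster must agree on the endpoint set: a selection outside the cluster's endpoints is ignored in favor of the fallback policy.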

How to implement the Endpoint Picker logic in Envoy Gateway?

Different AI gateways based on Envoy Gateway take different approaches to reach the above goal:

Envoy AI Gateway: uses the Envoy Gateway Extension Server.

It edits the cluster, route, and listener to make this work. This is quite challenging, since it is complex work that needs to coexist with the existing config. This is not suitable for adopters like AIBrix.

The default EPP implementation is GIE.

AIBrix Inference Gateway: uses Envoy Gateway CRD configuration.

The EPP implementation is the AIBrix Gateway Plugin. (Similar to GIE, it provides an intelligent endpoint picker.)

  1. v1 (current): uses Envoy Gateway EnvoyPatchPolicy (patches the original cluster config) plus EnvoyExtensionPolicy (adds the EPP ext-proc config to the gateway). This is static and not easy to maintain or orchestrate.
  2. v2 (planned): uses Envoy Gateway BTP (adds the host override LB policy) plus EEP (adds the EPP ext-proc config to the gateway, and also adds receiving/forwarding namespaces with 'envoy.lb' if needed). This can greatly improve UX and also adds fallback capability to the gateway.
  3. v3 (after v2): uses a controller to do automatically what is configured manually in v2, and supports the InferencePool API. This simplifies UX and makes it possible to adopt the GIE conformance tests.

@zhaohuabing
Member

ping @Xunzhuo

@Xunzhuo Xunzhuo force-pushed the feat-host-override branch 2 times, most recently from 5e0ac7f to 8383e80 Compare July 29, 2025 02:47
@zirain
Member

zirain commented Jul 29, 2025

/retest

zirain
zirain previously approved these changes Jul 30, 2025
arkodg
arkodg previously approved these changes Jul 30, 2025
@zirain zirain enabled auto-merge (squash) July 30, 2025 02:53
@arkodg
Contributor

arkodg commented Jul 30, 2025

seeing consistent test failures @Xunzhuo

    --- FAIL: TestE2E/EndpointOverrideLoadBalancing (4.26s)
        --- FAIL: TestE2E/EndpointOverrideLoadBalancing/header-based_endpoint_override_with_valid_pod_IP_should_route_to_specific_pod (1.02s)

can we skip this for now, and track it with a GH issue

@Xunzhuo
Member Author

Xunzhuo commented Jul 30, 2025

@arkodg it happened only when ipv6?

@arkodg
Contributor

arkodg commented Jul 30, 2025

@arkodg it happened only when ipv6?

yeah, looks like it, but it passes on dual

@Xunzhuo Xunzhuo dismissed stale reviews from arkodg and zirain via 9cf2be3 July 30, 2025 06:18
@Xunzhuo Xunzhuo force-pushed the feat-host-override branch from 771f02e to 9cf2be3 Compare July 30, 2025 06:18
@zirain
Member

zirain commented Jul 30, 2025

@Xunzhuo please fix gen-check

Xunzhuo added 8 commits July 30, 2025 15:20
Signed-off-by: bitliu <[email protected]>
@zirain zirain force-pushed the feat-host-override branch from 7c3cefc to 6c792b6 Compare July 30, 2025 07:20
@Xunzhuo Xunzhuo disabled auto-merge July 30, 2025 08:30
@Xunzhuo Xunzhuo merged commit 8673702 into envoyproxy:main Jul 30, 2025
44 of 48 checks passed
Development

Successfully merging this pull request may close these issues.

support endpoint picker based on host override policy
6 participants