Add InferencePool Integration Support to Gateway Plugin #1233

@Xunzhuo

🚀 Feature Description and Motivation

Currently, we patch the EPP into the gateway configuration manually and statically; see https://github.com/vllm-project/aibrix/blob/main/config/gateway/gateway.yaml#L100

This proposal has three stages:

  1. v1 (current): use the Envoy Gateway EnvoyPatchPolicy (to patch the original cluster config) plus an EnvoyExtensionPolicy (to add the EPP ext-proc config to the gateway). This setup is static and hard to maintain, orchestrate, or extend.
  2. v2 (planned): use an Envoy Gateway BackendTrafficPolicy (BTP) to add a host override LB policy, plus an EnvoyExtensionPolicy (EEP) to add the EPP ext-proc config to the gateway and, if needed, the receiving/forwarding namespaces with 'envoy.lb'. This would greatly improve UX and add fallback capability to the AIBrix gateway.
  3. v3 (after v2): have the AIBrix controller automate what v2 configures manually, and support the InferencePool API. This simplifies UX and lets AIBrix adopt the GIE conformance tests.
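
To make the v2 fallback idea concrete, here is a minimal Python sketch of host override load balancing: the endpoint picked by the EPP is used when it is known and healthy, otherwise the balancer falls back to round robin. This only models the behavior and is not Envoy's or AIBrix's implementation. The header name `x-override-host`, the `Endpoint` type, and the `HostOverrideBalancer` class are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical header name; the proposal does not specify how the EPP's
# choice is communicated (it may be a header or 'envoy.lb' metadata).
OVERRIDE_HEADER = "x-override-host"


@dataclass(frozen=True)
class Endpoint:
    address: str
    healthy: bool = True


class HostOverrideBalancer:
    """Prefer the host named by the EPP; fall back to round robin."""

    def __init__(self, endpoints: List[Endpoint]) -> None:
        self._endpoints = list(endpoints)
        self._by_address = {e.address: e for e in self._endpoints}
        self._cursor = 0  # next position to try for round-robin fallback

    def pick(self, headers: Dict[str, str]) -> Optional[str]:
        wanted = headers.get(OVERRIDE_HEADER)
        endpoint = self._by_address.get(wanted) if wanted else None
        if endpoint is not None and endpoint.healthy:
            return endpoint.address  # honor the EPP's decision
        return self._next_healthy()  # fallback path

    def _next_healthy(self) -> Optional[str]:
        count = len(self._endpoints)
        for offset in range(count):
            index = (self._cursor + offset) % count
            endpoint = self._endpoints[index]
            if endpoint.healthy:
                self._cursor = (index + 1) % count
                return endpoint.address
        return None  # no healthy endpoint is available
```

In this sketch, a request whose override names an unhealthy or unknown host is still served by another healthy endpoint, which is the fallback property v2 aims to add.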

Use Case

  1. Simplify configuration and improve UX.
  2. Support fallback and improve reliability.
  3. Integrate with GIE and the upstream inference conformance tests.
