Description
🐛 Describe the bug
When I scale up my deployment that serves a LoRA adapter, all traffic to the adapter keeps going to the pod that existed when the ModelAdapter was created. Requests to the base model, by contrast, are load balanced across the new backends.
Additionally, after scaling down, if that initial pod is removed I see errors from the system when requesting the LoRA module, BUT `kubectl describe modeladapter lora-name`
still shows the adapter as Running (though it is likely pointing to a dead resource).
Steps to Reproduce
- Deploy a LoRA adapter, with an HPA on the backing deployment
- Run a load test to trigger a scale-up
- Observe that the running requests for the adapter are all concentrated on one pod
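For reference, the scale-up in the steps above was driven by an HPA along these lines. This is only a sketch: the names `my-llm-deployment` / `my-llm-hpa` and the CPU threshold are placeholders, not the exact manifest from my cluster.

```yaml
# Illustrative HPA targeting the base-model Deployment that also
# serves the LoRA adapter; names and thresholds are placeholders.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-llm-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-llm-deployment
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Once the HPA adds replicas, per-pod request counts show all adapter traffic pinned to the original pod.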
Expected behavior
Load should be balanced across all running pods
Environment
Version: 2.1.0