[Intel][Gaudi] Remove HPU's dependency on vllm #6061

yangw1234 · 2025-05-06T18:34:05Z

Motivation

This PR removes HPU's dependency on vllm.

This PR depends on #5252 and #5923.

Modifications

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

fix import error and improve contiguous pa clean up clean up clean up style pre-commit clean up enable warmup add moe layer fix accuracy disable radix cache automatically remove sort clean up fix fix model type refine style style add hpu test add change try directly install vllm-fork clean up unnecessary changes fix awq address comments fix acc address comments fix warmup fix cpu scheduler fix style fix device remove block scales refactor refactor fix fix remove change fix attn mask fix attn bias add to device fix style address comments address comments
Update python/sglang/srt/model_executor/forward_batch_info.py
Co-authored-by: JieXin Liang <[email protected]>
Update python/sglang/srt/model_executor/model_runner.py
Co-authored-by: JieXin Liang <[email protected]>
address comments optimize allocator and rope add profile add heap based allocator


yangw1234 added 11 commits April 28, 2025 03:40

fix style

50a8428

pin numpy try running scheduler in hpu disable overlap schedule fix style remove schedule func

add type hint

b689283

remove rope

4668342

remove unnecessary changes

e9de267

change to hpu_attn_backend

6ff2e12

fix reference

5c9efd2

fix reference

cd2bcae

fix style

f09d5e2

initial support for prefix caching

940b52c

update xgrammar fix tests fix warmup add fix optimize perf a bit optimize some remove

remove vllm as a dependency

450fb96

revert registry fix rope
