Pull requests: vllm-project/vllm
Pull requests list
[Bugfix] Fix vllm_flash_attn rotary import
ci/build
#17247
opened Apr 27, 2025 by
jeejeelee
[Core] Use platform-agnostic device control for DP engine core
v1
#17245
opened Apr 27, 2025 by
jianzs
[MISC] Use string annotation types for class definitions
ready
ONLY add when PR is ready to merge/full CI is needed
#17244
opened Apr 27, 2025 by
jianzs
[Bug fix] ROCm FlashAttention: add missing full_scales argument to Triton wrapper
needs-rebase
#17235
opened Apr 26, 2025 by
zhewenl
[Docs] Add a security guide
documentation
Improvements or additions to documentation
#17230
opened Apr 26, 2025 by
russellb
[ROCm] Effort to reduce the number of environment variables in command line
ci/build
rocm
Related to AMD ROCm
#17229
opened Apr 26, 2025 by
hongxiayang
Use CUDA 12.6 as default for release and nightly wheels
ci/build
documentation
Improvements or additions to documentation
[Doc] Clarify note for H2O-VL
documentation
Improvements or additions to documentation
#17219
opened Apr 26, 2025 by
DarkLight1337
[V1][Spec Decode] Apply torch.compile & cudagraph to EAGLE
documentation
Improvements or additions to documentation
v1
#17211
opened Apr 26, 2025 by
luyuzhe111
[Misc][Tools][Benchmark] Publish script to auto tune server parameters
#17207
opened Apr 25, 2025 by
Chenyaaang
[Benchmark] Add single turn MTBench to Serving Bench
#17202
opened Apr 25, 2025 by
ekagra-ranjan
[Hardware][Apple] Allows VLLM_TARGET_DEVICE=empty on MacOs
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#17200
opened Apr 25, 2025 by
wallashss
[WIP] Support vLLM in transformers hybrid attention implementation
#17198
opened Apr 25, 2025 by
wuisawesome
[Security] Don't bind tcp zmq socket to all interfaces
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
security
Security related issues and PRs
[WIP][Bugfix] Fix 'MistralTokenizer' object has no attribute 'init_kwargs'
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
[V1] Remove num_input_tokens from attn_metadata
tpu
Related to Google TPUs
v1
#17193
opened Apr 25, 2025 by
heheda12345
[Bugfix] support local dataset path in benchmark_serving
#17179
opened Apr 25, 2025 by
wubai
Add option "--expand-tools-even-if-tool-choice-none"
frontend
tool-calling
#17177
opened Apr 25, 2025 by
okdshin
[CI] Add mteb testing to test the accuracy of the embedding model
ci/build
#17175
opened Apr 25, 2025 by
noooop
[Bugfix] Modifications to error handling of multiple vllm api endpoints
frontend
#17165
opened Apr 25, 2025 by
tunglinwood
[Hardware][Power] Enable compressed tensor W8A8 INT8 quantization for POWER
ci/build
#17153
opened Apr 25, 2025 by
Akashcodes732
[Misc] Add gemma3 chat template with pythonic-style function calling
documentation
Improvements or additions to documentation
tool-calling
#17149
opened Apr 25, 2025 by
philipchung