Pull requests: sgl-project/sglang
Pull requests list
[Feat] Support FlashMLA backend with MTP and FP8 KV cache
#6109
opened May 8, 2025 by
quinnrong94
6 tasks
Cutlass MLA: Disable split kv due to https://github.com/NVIDIA/cutlass/issues/2274
#6101
opened May 7, 2025 by
trevor-m
6 tasks
[AMD] switch to custom allreduce regardless of MSCCL setting on ROCm
#6097
opened May 7, 2025 by
hubertlu-tw
6 tasks
Log the timestamps of each prefill/decode iteration
#6094
opened May 7, 2025 by
yuhsuan-t
1 of 6 tasks
[1/2] Add Kernel support for Cutlass based Fused FP4 MoE
#6093
opened May 7, 2025 by
pavanimajety
Draft
6 tasks
[Performance] Use the max num_tokens per DP rank as the CUDA graph batch size
#6092
opened May 7, 2025 by
xpmemeda
1 of 6 tasks
[Feat] Integrate flashmla into sgl-kernel
#6065
opened May 7, 2025 by
PopSoda2002
1 task done
Fix is_all_greedy flag update when merging batches
high priority
#6055
opened May 6, 2025 by
YavorGIvanov
6 tasks done
restruct fused moe code style to avoid long lines files
#6052
opened May 6, 2025 by
BBuf
6 tasks
[Docs] Minor changes to development_guide_using_docker.md
#6049
opened May 6, 2025 by
windsonsea
3 of 6 tasks