Pull requests: vllm-project/vllm

Pull requests list

fix[Docs]: link anchor is incorrect #20309 (labels: documentation, structured-output)
#20315 opened Jul 1, 2025 by yyzxw
4 tasks
Add support for Prithvi geospatial model in serving mode (labels: documentation, frontend, multi-modality, needs-rebase, structured-output, v1)
#20307 opened Jul 1, 2025 by mgazz Draft
1 of 4 tasks
[doc] quark_mxfp4_introduction (labels: documentation)
#20306 opened Jul 1, 2025 by lihaoyang-amd Draft
[CUDA graphs] Enable full cuda graphs with FA3 AoT scheduling (labels: ci/build, ready, v1)
#20301 opened Jul 1, 2025 by WoosukKwon
[Misc] Minor refactoring for scheduler (labels: ready, v1)
#20299 opened Jul 1, 2025 by WoosukKwon
[Feature] Support Minimax-M1 function calls features (labels: documentation, frontend, tool-calling)
#20297 opened Jul 1, 2025 by qscqesze
Enable fp8 kv cache on rocm aiter backend. (labels: rocm, v1)
#20295 opened Jul 1, 2025 by fsx950223 Draft
4 tasks
[Optimization] Cache sampled token ids in model runner (labels: ready, v1)
#20291 opened Jul 1, 2025 by WoosukKwon
Enable group size 64 for Machete
#20290 opened Jul 1, 2025 by czhu-cohere
3 of 4 tasks
[Model] Adds support for SlimMoE models Phi-tiny-MoE-instruct
#20286 opened Jun 30, 2025 by zichongli5
3 of 4 tasks
[Misc][Doc] Add missing comment for LLM frontend
#20285 opened Jun 30, 2025 by draftbk
1 of 4 tasks
Support DeepSeekV3-style block FP8 quantization with CT
#20279 opened Jun 30, 2025 by mgoin
[TPU] Temporary fix vmem oom for long model len by reducing page size (labels: tpu, v1)
#20278 opened Jun 30, 2025 by Chenyaaang
[Docs] use uv in GPU installation docs (labels: documentation)
#20277 opened Jun 30, 2025 by davidxia
Dummy commit
#20273 opened Jun 30, 2025 by dhonnappa-amd
[Docs] Update transcriptions API to use openai client with stream=True (labels: documentation, frontend, ready)
#20271 opened Jun 30, 2025 by NickLucche
[V1] [ROCm] Enable EP with AITER Fused MoE (labels: rocm)
#20270 opened Jun 30, 2025 by tjtanaa
3 of 4 tasks
[Benchmark] Add benchmark tool for multi turn conversations (labels: performance)
#20267 opened Jun 30, 2025 by pliops-daniels
[misc]refactor Platform.set_device method (labels: rocm, tpu, v1)
#20262 opened Jun 30, 2025 by jikunshang
4 tasks
[WIP][Model][VLM] Support JinaVL Reranker (labels: documentation, frontend)
#20260 opened Jun 30, 2025 by shineran96 Draft