Pull requests: vllm-project/flash-attention


#92  Add DCP parameters (Draft) · opened Sep 16, 2025 by MatthewBonanni
#88  Vllm_flash_attn_with_attention_weights · opened Sep 11, 2025 by SiriusPaul
#82  FA3 variable length attention sort/swizzle · opened Aug 22, 2025 by jayhshah
#67  WIP stream k scheduling (Draft) · opened Apr 29, 2025 by LucasWilkinson
#65  sync · opened Apr 23, 2025 by AlpinDale
#54  fix: add "typename" prior to dependent type name · opened Feb 28, 2025 by zhiweij1
#41  AMD ROCm Build (Draft) · opened Jan 29, 2025 by ProExpertProg
#27  support KV-Compress paged KV cache · opened Nov 27, 2024 by IsaacRe
#26  Add CUDA 8.7 arch for Jetson Orin · opened Nov 27, 2024 by conroy-cheers
#25  Update torch to 2.5.1 · opened Nov 7, 2024 by ayakzob
#21  Don't disable uneven k to support more headdims · opened Sep 27, 2024 by njhill
#16  Update .gitignore to ignore *env/ directories · opened Aug 8, 2024 by wasertech