[Proposal] Topology-Aware KV Cache Compression for Memory-Efficient Inference
April 1, 2026 · #38725
Difficulty: Medium
Labels
performance
Parent Repository
vllm-project/vllm
Python repository
75,721 stars · 15,332 forks