[Proposal] Topology-Aware KV Cache Compression for Memory-Efficient Inference

April 1, 2026 · #38725
Language: Python · Difficulty: Medium

Labels: performance
