KV cache compression via E8 lattice VQ — 10-33x with PagedAttention integration
April 7, 2026 · #39241
Python
Difficulty: Easy
Parent Repository
vllm-project/vllm
Python repository
75,721 stars · 15,332 forks