[RFC]: Incremental MoE Expert Offloading — GPU Cache + Async Pipeline
March 26, 2026 · #38256
Python
Difficulty: Medium
Parent Repository
vllm-project/vllm
Python repository
75,721 15,332