[Performance]: llmcompressor W8A8 Inference: decoding stage speed is lower than FP16
April 1, 2026 · #38697
Python
Difficulty: Medium
Labels
performance
Parent Repository
vllm-project/vllm
Python repository
75,721 15,332