[Bug]: MLA attention casts activations to int32 when using Marlin FP8 on GPUs without native FP8 support (sm < 89)
March 31, 2026 · #38658
Difficulty: Easy
Labels: bug
Parent repository: vllm-project/vllm (Python, 75,721 stars, 15,332 forks)
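For context on the dispatch the title describes: FP8 tensor-core support begins at compute capability sm_89 (Ada), so on older GPUs (e.g. Ampere sm_80/sm_86) vLLM routes FP8-quantized weights through the Marlin kernels instead. The sketch below is a hypothetical illustration of that capability check, not vLLM's actual code; the function names `supports_native_fp8` and `choose_fp8_backend` are invented for this example.

```python
# Hypothetical sketch of FP8 backend selection by CUDA compute capability.
# Not vLLM's real implementation; names here are illustrative only.

def supports_native_fp8(capability: tuple[int, int]) -> bool:
    """Native FP8 (e4m3/e5m2) tensor-core support starts at sm_89 (Ada)."""
    return capability >= (8, 9)

def choose_fp8_backend(capability: tuple[int, int]) -> str:
    # On sm < 89, FP8 weights fall back to the Marlin kernels, which
    # dequantize on the fly rather than running FP8 matmuls natively.
    return "native-fp8" if supports_native_fp8(capability) else "marlin-fp8"

print(choose_fp8_backend((8, 0)))  # A100 (Ampere) → "marlin-fp8"
print(choose_fp8_backend((9, 0)))  # H100 (Hopper) → "native-fp8"
```

In a real deployment the capability tuple would come from `torch.cuda.get_device_capability()`; it is hard-coded here to keep the sketch self-contained.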