[Bug]: MLA attention casts activations to int32 when using Marlin FP8 on GPUs without native FP8 support (sm < 89)

March 31, 2026 · #38658
Language: Python · Difficulty: Easy

Labels: bug
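
For context, a quick way to tell whether a GPU lacks native FP8 support (and would therefore take vLLM's Marlin FP8 fallback path that this issue concerns) is to query its compute capability: FP8 tensor cores are only available from sm_89 (Ada) onward. The snippet below is an illustrative sketch, not part of the original report, and assumes a CUDA device is available:

```python
import torch

# Native FP8 tensor cores require compute capability >= 8.9 (Ada/Hopper).
# On older GPUs (e.g. A100 at sm_80), FP8-quantized checkpoints are served
# through the Marlin FP8 kernels instead -- the code path this issue is about.
major, minor = torch.cuda.get_device_capability()
if (major, minor) < (8, 9):
    print(f"sm_{major}{minor}: no native FP8; Marlin FP8 fallback would be used")
else:
    print(f"sm_{major}{minor}: native FP8 supported")
```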
