[Bug] FlashInfer + MTP speculative decoding crashes on SM121 (DGX Spark) with GQA=16 model

March 21, 2026 · #37754
Python Difficulty: Easy
