GQA/MQA attention broken — only MHA (Q_heads == KV_heads) produces coherent output

April 12, 2026 · #61
Difficulty: Medium

Labels

bug
