[Bug] GRPO text-only path treats logits as hidden states for Qwen3.5 and Gemma 4, causing matmul shape mismatch

April 21, 2026 · #5121
Python Difficulty: Medium

Labels

bug

