Correctness issue when training Gemma4 E2B/E4B models (KV-sharing models) with activation_checkpointing enabled

April 7, 2026 · #1705
Python Difficulty: Medium

Labels

bug
