[Bug] Training on a ChatML-like dataset somehow uses much, much more VRAM than on an Alpaca-like dataset

March 21, 2026 · #4504
Python Difficulty: Medium

Labels

bug

