Question regarding KL Divergence gradient flow in `AdaptiveLayerLoss` (edit: and possible issue with layers weighting schema)
May 8, 2026 · #3757
Python
Difficulty: Medium
Parent Repository
huggingface/sentence-transformers
Python repository
18,649 stars · 2,786 forks