Why are old_logprob and new_logprob different, and coef_1 != 1, in GRPO when num_iterations=1?
March 21, 2026 · #4502
Python
Difficulty: Medium
Parent Repository
unslothai/unsloth
Python repository
60,135 5,153