feat(grpo_trainer.py): STARE — Surprisal-guided Token-Level Advantage Reweighting #6167

Open
smellslikeml wants to merge 2 commits into huggingface:main from smellslikeml:stare-surprisal-guided-token-level-advantage-reweighting-for

Commits

Commits on Jun 24, 2026