feat(grpo_trainer.py): STARE — Surprisal-guided Token-Level Advantage Reweighting #6167

Open
smellslikeml wants to merge 2 commits into huggingface:main from smellslikeml:stare-surprisal-guided-token-level-advantage-reweighting-for

fix(`grpo_trainer.py`): reject `loss_type='stare'` + `use_liger_kerne…

22337c6