Skip to content

Move _get_kl_completion_ids into _get_kl_dataset#6181

Open
qgallouedec wants to merge 3 commits into
mainfrom
move-get-kl-dataset
Open

Move _get_kl_completion_ids into _get_kl_dataset#6181
qgallouedec wants to merge 3 commits into
mainfrom
move-get-kl-dataset

Conversation

@qgallouedec

@qgallouedec qgallouedec commented Jun 25, 2026

Copy link
Copy Markdown
Member

_get_kl_completion_ids was a module-level helper used in exactly one place. Move it into its only caller, KTOTrainer._get_kl_dataset, as a nested function, alongside the existing rename_kl_fn.

The direct unit test that imported the helper is removed; its behavior stays covered through the trainer's KL-dataset preparation.


Note

Low Risk
Pure structural refactor with identical KL dataset mapping logic; only test surface area shrinks for the former public helper.

Overview
Refactors where KL mismatching is defined: the module-level _get_kl_completion_ids helper is removed and the same batched completion cycling logic lives as a nested get_kl_completion_ids inside KTOTrainer._get_kl_dataset, next to rename_kl_fn. The vision collator comment now points at that nested name.

Tests drop the standalone unit test that imported _get_kl_completion_ids and asserted rotation on a synthetic Dataset; imports are trimmed accordingly. KL pairing behavior is unchanged and remains covered via trainer/collator paths (e.g. test_kl_cycling, end-to-end KTO training).

Reviewed by Cursor Bugbot for commit 35edb17. Bugbot is set up for automated code reviews on this repo. Configure here.

@bot-ci-comment

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant