Align data collators across DPO / SFT / Reward / KTO #6178

Open
qgallouedec wants to merge 3 commits into main from align-collators

@qgallouedec qgallouedec commented Jun 25, 2026

Member

Consistency pass over the data collators so the same thing is written the same way everywhere. No behavior change.

  • Docstrings: unified intro lines, arg wording (max_length, truncation_mode, return_tensors) and the "# special case for Qwen2.5-VL" comment. Added the missing Examples blocks to the KTO collators.
  • Naming: KTO batch → output, ex → example, torch.int64 → torch.long.
  • Structure: KTO vision now uses inline "token_type_ids" in processed_prompts checks (dropping the has_tti / has_mm_tti locals) so its flush-left / truncate / output blocks match DPO/SFT word-for-word; simplified mm_token_type_ids handling to match.
  • Fixed a misplaced comma in the repeated BOS comment.
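
For readers skimming the Structure item, here is a minimal sketch of why swapping the has_tti local for an inline membership check should not change behavior. The two function names and the plain dict/list inputs are hypothetical stand-ins for illustration, not the collators' actual code.

```python
# Hypothetical stand-ins: processed_prompts is modeled as a plain dict,
# and input_ids as a list, rather than a processor output with tensors.

def build_output_with_local(processed_prompts, input_ids):
    # Old style: cache the membership test in a local flag first.
    has_tti = "token_type_ids" in processed_prompts
    output = {"input_ids": input_ids}
    if has_tti:
        output["token_type_ids"] = processed_prompts["token_type_ids"]
    return output


def build_output_inline(processed_prompts, input_ids):
    # New style: test membership inline, at the point of use.
    output = {"input_ids": input_ids}
    if "token_type_ids" in processed_prompts:
        output["token_type_ids"] = processed_prompts["token_type_ids"]
    return output


with_tti = {"input_ids": [[1, 2]], "token_type_ids": [[0, 1]]}
without_tti = {"input_ids": [[1, 2]]}
for prompts in (with_tti, without_tti):
    same = build_output_with_local(prompts, [[1, 2]]) == build_output_inline(prompts, [[1, 2]])
    print(same)
```

The inline form only stays equivalent as long as nothing adds or removes the key between the point where the flag used to be computed and the point where it is read.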

Deeper output-key naming/semantics (e.g. KTO's completion_input_ids holds prompt+completion) is left for a follow-up PR.


Note

Low Risk
Collator-only refactor with stated no behavior change; the mm_token_type_ids merge simplification could affect edge-case VLMs if completion tensors previously carried non-zero mm types.

Overview
Consistency pass across DPO, SFT, and KTO data collators so docstrings, naming, and vision collator control flow match; the PR description states no intended behavior change.

Documentation: Unified max_length, truncation_mode, and return_tensors wording; expanded KTO text and vision collator docs with Examples blocks; reordered DPO vision output key list; clarified SFT max_length as truncate-before-pad for text collators.
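
To make the truncate-before-pad wording concrete, the following is a list-based sketch and not the collators' implementation. The function name, the "keep_start" / "keep_end" mode values, right-side padding, and pad_token_id=0 are all assumptions chosen for illustration.

```python
def truncate_then_pad(sequences, max_length=None, truncation_mode="keep_start", pad_token_id=0):
    """Truncate each sequence to max_length first, then pad to the longest survivor.

    Assumes a non-empty batch of token-id lists and right-side padding.
    """
    if max_length is not None:
        if truncation_mode == "keep_start":
            sequences = [seq[:max_length] for seq in sequences]
        elif truncation_mode == "keep_end":
            # max(..., 0) keeps short sequences intact and handles max_length=0.
            sequences = [seq[max(len(seq) - max_length, 0):] for seq in sequences]
        else:
            raise ValueError(f"Unknown truncation_mode: {truncation_mode!r}")

    # Pad only after truncation, so the batch width never exceeds max_length.
    width = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_token_id] * (width - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = truncate_then_pad([[1, 2, 3, 4, 5], [6, 7]], max_length=3)
print(batch["input_ids"])       # [[1, 2, 3], [6, 7, 0]]
print(batch["attention_mask"])  # [[1, 1, 1], [1, 1, 0]]
```

Padding first and truncating second would instead cut into a batch already widened to the longest raw sequence, so the order matters for which tokens survive under "keep_end".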

KTO text collator: Renames batch → output, ex → example; uses torch.long instead of explicit int64 on tensors; adds inline Truncate / Pad comments aligned with DPO.

Vision collators (KTO, DPO, SFT): Replaces has_tti / has_mm_tti locals with inline "token_type_ids" in processed_prompts checks; aligns flush-left, truncate, and output blocks with DPO/SFT; sets completion-side mm_token_type_ids via torch.zeros_like(completion_ids) instead of merging processor completion mm_token_type_ids (KTO KL path similarly); fixes BOS comment typo ("BOS, twice" → "BOS twice"); DPO adds a "Truncate if necessary" comment before the vision truncation block.
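
The edge case flagged in the risk note can be illustrated with a small list-based sketch. Plain Python lists stand in for tensors, `[0] * len(completion_ids)` stands in for torch.zeros_like(completion_ids), and both helper names plus the prompt-then-completion concatenation are assumptions for illustration only.

```python
def mm_types_merged(prompt_mm_types, completion_mm_types):
    # Previous behavior as described above: reuse the processor's
    # completion-side mm_token_type_ids.
    return prompt_mm_types + completion_mm_types


def mm_types_zeroed(prompt_mm_types, completion_ids):
    # New behavior as described above: completion side is all zeros,
    # a list-based stand-in for torch.zeros_like(completion_ids).
    return prompt_mm_types + [0] * len(completion_ids)


prompt_mm_types = [0, 1, 1, 0]   # 1 marks a multimodal (e.g. image) token
completion_ids = [11, 12, 13]

# Text-only completion: both approaches agree.
print(mm_types_merged(prompt_mm_types, [0, 0, 0]) == mm_types_zeroed(prompt_mm_types, completion_ids))

# Completion carrying a non-zero mm type: the results diverge.
print(mm_types_merged(prompt_mm_types, [0, 1, 0]) == mm_types_zeroed(prompt_mm_types, completion_ids))
```

Under these assumptions the change is a no-op whenever completions are text-only, which matches the "no behavior change" claim for the common case and isolates the scenario the risk note describes.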

Reviewed by Cursor Bugbot for commit 640f53b. Bugbot is set up for automated code reviews on this repo.

@bot-ci-comment

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
