Promote KTO to stable API by albertvillanova · Pull Request #6175 · huggingface/trl

albertvillanova · 2026-06-25T05:56:15Z

Promote KTO to stable API.

This PR promotes the KTOTrainer and KTOConfig from the experimental API (trl.experimental.kto) to the stable API (trl). It updates all relevant documentation, scripts, and tests to use the new import paths, and deprecates the old experimental import with a warning. This change simplifies usage for end users and signals that the KTO API is now considered stable.

Changes

API Promotion and Deprecation:

The experimental KTOConfig and KTOTrainer classes now inherit from the stable API and emit a deprecation warning when imported from the experimental path, telling users to switch to the stable import (from trl import ...).
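The shim pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: both classes are stand-ins defined locally, the `beta` field is only a placeholder attribute, and the choice to warn at instantiation time is an assumption (the actual shim may emit the warning at a different point).

```python
import warnings


class KTOConfig:
    """Stand-in for the stable config class (not the real TRL implementation)."""

    def __init__(self, beta: float = 0.1):
        self.beta = beta


class ExperimentalKTOConfig(KTOConfig):
    """Stand-in for the experimental shim: inherits everything, only adds a warning."""

    def __init__(self, *args, **kwargs):
        # Assumption: the warning fires on instantiation; the real shim may differ.
        warnings.warn(
            "Importing KTOConfig from the experimental path is deprecated; "
            "use the stable import path instead.",
            FutureWarning,
            stacklevel=2,
        )
        # Delegate all behavior to the stable class.
        super().__init__(*args, **kwargs)


# Record warnings so the example can inspect them instead of printing to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config = ExperimentalKTOConfig(beta=0.2)

print(isinstance(config, KTOConfig), config.beta, caught[0].category.__name__)
```

Because the shim is a subclass, isinstance checks against the stable class keep working for code that still uses the old path.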

Documentation Updates:

docs/source/kto_trainer.md, docs/source/paper_index.md, docs/source/reducing_memory_usage.md, docs/source/speeding_up_training.md: All references and code examples now use the stable import path for KTOTrainer and KTOConfig.
The warning about KTO being experimental has been removed.

Code and Script Updates:

examples/scripts/kto.py, trl/scripts/kto.py: Updated imports to use from trl import KTOConfig, KTOTrainer and related types.
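Downstream projects need the same import update. The helper below is a hypothetical sketch of how that rewrite could be automated; `migrate_line` and `migrate_source` are illustrative names, not part of TRL. It assumes only KTOConfig and KTOTrainer are re-exported from the stable package, so any line importing other names, aliases, or parenthesized lists is left untouched for manual review.

```python
import re

# Matches single-line imports from the old experimental path.
OLD_IMPORT = re.compile(r"^(\s*)from\s+trl\.experimental\.kto\s+import\s+(.+)$")

# Assumption: only these two names are known to be available from the stable path.
PROMOTED = {"KTOConfig", "KTOTrainer"}


def migrate_line(line: str) -> str:
    """Rewrite one import line to the stable path when it is safe to do so."""
    match = OLD_IMPORT.match(line)
    if match is None:
        return line
    indent, names = match.groups()
    imported = {name.strip() for name in names.split(",")}
    if not imported <= PROMOTED:
        # Aliases, comments, or other names: leave for manual review.
        return line
    return f"{indent}from trl import {names.strip()}"


def migrate_source(source: str) -> str:
    """Apply migrate_line to every line of a source file's text."""
    return "\n".join(migrate_line(line) for line in source.split("\n"))


old = "from trl.experimental.kto import KTOConfig, KTOTrainer\nprint('train')"
print(migrate_source(old))
```

The conservative subset check is deliberate: skipping an import and leaving the deprecation warning in place is safer than rewriting it to a name the stable package may not export.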

Test Updates:

tests/test_kto_trainer.py (renamed from tests/experimental/test_kto_trainer.py): Updated to import from the stable API and moved to the main test directory, reflecting the stable status.

Note

Medium Risk
Large code move with import-path churn for downstream users still on trl.experimental.kto, though behavior is intended to be unchanged via delegation. KTO training touches reference models, KL batching, and PEFT paths, so regressions would affect alignment workflows.

Overview
KTO (KTOTrainer, KTOConfig) is promoted from trl.experimental.kto to the stable trl / trl.trainer surface. The full trainer and config implementations now live under trl/trainer/; the experimental modules are thin subclasses that delegate to the stable types and emit a FutureWarning (removal planned in v2.0.0).

Docs, examples (examples/scripts/kto.py, trl/scripts/kto.py), and memory/speed guides now show from trl import KTOConfig, KTOTrainer. The KTO trainer doc drops the experimental-only warning block. tests/test_kto_trainer.py imports the stable API and trainer helpers from trl.trainer.kto_trainer.
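Since the experimental path is described as emitting a FutureWarning ahead of removal, a downstream project could check whether its code still triggers it. The sketch below uses only the standard warnings module; `legacy_kto_import` and `stable_kto_import` are hypothetical stand-ins for user code, not TRL functions.

```python
import warnings


def legacy_kto_import():
    """Hypothetical stand-in for code that still uses the deprecated path."""
    warnings.warn("trl.experimental.kto is deprecated", FutureWarning, stacklevel=2)
    return "trainer"


def stable_kto_import():
    """Hypothetical stand-in for code that already uses the stable path."""
    return "trainer"


def emits_future_warning(func) -> bool:
    """Run func and report whether it raised any FutureWarning."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that would normally be shown once.
        warnings.simplefilter("always")
        func()
    return any(issubclass(w.category, FutureWarning) for w in caught)


print(emits_future_warning(legacy_kto_import), emits_future_warning(stable_kto_import))
```

A stricter variant is to turn the warning into an error in CI with warnings.simplefilter("error", FutureWarning), at the cost of also failing on unrelated FutureWarnings.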

Reviewed by Cursor Bugbot for commit 021039c. Bugbot is set up for automated code reviews on this repo.

bot-ci-comment · 2026-06-25T05:59:04Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2026-06-25T16:22:56Z

I'm a bit worried that this would break the git history:

Both trl/trainer/kto_trainer.py and trl/experimental/kto/kto_trainer.py already exist on main. This PR swaps their contents rather than renaming a file, so git sees ~1500 lines deleted from one and added to the other as independent edits. As a result, no rename is detected, and git blame and git log --follow on the promoted implementation will collapse to this one commit.

We could make it a real rename instead: git mv the experimental implementation onto trl/trainer/kto_trainer.py (removing the old shim first), then add the new experimental shim as a fresh file. That lets git detect the rename and preserves blame history. Same applies to kto_config.py.

albertvillanova · 2026-06-26T04:37:11Z

@qgallouedec, thanks for pointing this out. I agree with the concern in principle: this promotion creates a large history boundary for kto_trainer.py/kto_config.py.

That said, since this PR will be squash-merged, I don’t think using git mv inside the branch would materially change the final repository history. Git does not store renames explicitly; rename detection is inferred from the final diff. Because both files already exist on main and both still exist after the PR, the squashed commit would still look like a large rewrite/content swap rather than a clean rename.

I tried the suggested approach anyway, but GitHub still shows additions/deletions rather than a rename.
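The rename-detection argument in this thread can be illustrated with a toy model. This is a simplified sketch, not git's algorithm: it pairs paths that disappear with paths that appear and compares their contents with difflib, whereas git uses its own similarity heuristic and options such as -M and -B that are not reproduced here. The file contents are made-up placeholders.

```python
from difflib import SequenceMatcher


def detect_renames(before: dict, after: dict, threshold: float = 0.5) -> list:
    """Pair deleted paths with added paths whose contents are similar enough."""
    deleted = [path for path in before if path not in after]
    added = [path for path in after if path not in before]
    renames = []
    for old in deleted:
        old_lines = before[old].splitlines()
        # Score every added file against the deleted one, line by line.
        scored = [
            (SequenceMatcher(None, old_lines, after[new].splitlines(), autojunk=False).ratio(), new)
            for new in added
        ]
        if scored:
            score, new = max(scored)
            if score >= threshold:
                renames.append((old, new))
    return renames


impl = "\n".join(f"line {i} of the KTO implementation" for i in range(50))
shim = "from stable import KTOTrainer"
EXP = "trl/experimental/kto/kto_trainer.py"
STABLE = "trl/trainer/kto_trainer.py"

# Content swap: both paths exist before and after, so nothing is a rename candidate.
swap = detect_renames({EXP: impl, STABLE: shim}, {EXP: shim, STABLE: impl})

# Clean move: the old path disappears and the new one appears with the same content.
clean = detect_renames({EXP: impl}, {STABLE: impl})

print(swap, clean)
```

In this model the squashed result of the PR matches the first case regardless of how the branch got there, which is why an intermediate git mv does not change what the final diff looks like.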

albertvillanova added 4 commits June 25, 2026 07:31

Promote KTO to stable API

a484978

Update imports

8b2b98a

Move file with KTO tests

33a8de2

Update docs

b170c6e

albertvillanova added 5 commits June 26, 2026 06:20

Revert "Promote KTO to stable API"

9b017b5

This reverts commit a484978.

Remove KTO shims from stable

77ded33

Move KTO from experimental to stable

6b94bba

Add KTO shims to experimental

ec40029

Merge remote-tracking branch 'upstream/main' into align-kto-dpo-stable

e62f812

Fix imports

021039c

Promote KTO to stable API #6175: albertvillanova wants to merge 10 commits into main from align-kto-dpo-stable
