KTOTrainer should work when actual batch size==1 #2554

Open
starmpcc opened this issue Jan 10, 2025 · 0 comments
Labels
✨ enhancement New feature or request 🏋 KTO Related to KTO

Comments

@starmpcc

if args.per_device_train_batch_size <= 1:
    raise ValueError(
        "Actual (not effective) batch size must be > 1. KTO will not work properly because the KL term will be equivalent to the implied reward."
    )

This check was introduced in #2153.
However, the KL logits are computed after unlinking prompt_input_ids from answer_input_ids, so each prompt is no longer paired with its own answer and the KL term is not equivalent to the reward term.
Accordingly, KTOTrainer should work even when the actual batch size is 1.
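
To make the argument concrete, here is a toy sketch of what "unlinking" could look like. It assumes the mismatched pairs are built ahead of time by rotating answers by one position over a window of several examples; the helper name `make_mismatched_pairs` and the sample data are hypothetical, and this is not the trainer's actual code.

```python
def make_mismatched_pairs(prompts, answers):
    """Pair each prompt with the answer of the next example (rotate by one).

    Hypothetical helper for illustration only.
    """
    if len(prompts) != len(answers):
        raise ValueError("prompts and answers must have the same length")
    rotated = answers[1:] + answers[:1]
    return list(zip(prompts, rotated))


prompts = ["p0", "p1", "p2"]
answers = ["a0", "a1", "a2"]

# Matched pairs feed the reward term; mismatched pairs feed the KL term.
matched = list(zip(prompts, answers))
mismatched = make_mismatched_pairs(prompts, answers)

# Compare the two pairings example by example, as a training batch of
# size 1 would see them.
differs = [m != k for m, k in zip(matched, mismatched)]

# Degenerate case: a window of a single example rotates onto itself.
single = make_mismatched_pairs(["p0"], ["a0"])
```

Under this assumption, every KL pair differs from its reward pair as long as the rotation window holds more than one example, regardless of how many examples are later fed per training step. Only a window of exactly one example collapses to the matched pair, which is the situation the error message describes.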

Thank you!

@August-murr added the 🏋 KTO (Related to KTO) and ✨ enhancement (New feature or request) labels on Jan 11, 2025