KTOTrainer should work when actual batch size==1 #2554

starmpcc · 2025-01-10T07:55:13Z

Lines 662 to 665 in edabe0a

    
    if args.per_device_train_batch_size <= 1:
        raise ValueError(
            "Actual (not effective) batch size must be > 1. KTO will not work properly because the KL term will be equivalent to the implied reward."
        )

This check was introduced in #2153.
However, the KL logits are calculated after unlinking prompt_input_ids and answer_input_ids, so the KL term is not equivalent to the reward term, even for a single example.
Accordingly, KTOTrainer should work when the actual (per-device) batch size is 1, and the check could be relaxed or removed.
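
To make the argument concrete, here is a minimal sketch. It is not TRL's actual code. It assumes the KL pairs are built by rotating completions by one position within a group of examples, and `rotate_completions` and `count_matched` are hypothetical helpers written only for this illustration. Under that assumption, whether the KL pairs are mismatched depends on the size of the group the rotation is applied over, which need not be the per-device training batch size.

```python
def rotate_completions(prompts, completions):
    """Pair each prompt with the previous example's completion (rotation by one).

    Hypothetical helper for illustration only; not part of TRL.
    """
    if len(prompts) != len(completions):
        raise ValueError("prompts and completions must have the same length")
    # Move the last completion to the front so every prompt gets a neighbour's completion.
    rotated = completions[-1:] + completions[:-1]
    return list(zip(prompts, rotated))


def count_matched(pairs, prompts, completions):
    """Count pairs that are identical to an original (prompt, completion) pair."""
    original = set(zip(prompts, completions))
    return sum(pair in original for pair in pairs)


prompts = ["p0", "p1", "p2", "p3"]
completions = ["c0", "c1", "c2", "c3"]

# Rotation applied over one group of four examples: every pair is mismatched.
kl_pairs_group4 = rotate_completions(prompts, completions)

# Rotation applied over groups of one example: the rotation is the identity,
# so every "KL" pair is just the original pair.
kl_pairs_group1 = [
    pair
    for p, c in zip(prompts, completions)
    for pair in rotate_completions([p], [c])
]

print(count_matched(kl_pairs_group4, prompts, completions))
print(count_matched(kl_pairs_group1, prompts, completions))
```

If the mismatched pairs are produced over a group larger than one example before batching, a training step with a single example would still see a prompt paired with a different example's completion.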

Thank you!


August-murr added the 🏋 KTO (Related to KTO) and ✨ enhancement (New feature or request) labels on Jan 11, 2025
