generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Add _compute_score method to PPOTrainer
#2560
opened Jan 11, 2025 by
oliveiraeliel
•
Draft
2 of 5 tasks
PPO/RLOO/OnlineDPO sequence generation: make deepspeed 3 weight gathering optional
#2557
opened Jan 10, 2025 by
dawidm
4 tasks done
Add generation caching in TextEnvironment and fix bugs in TextEnvironment
#2556
opened Jan 10, 2025 by
konrad-gerlach
Reintroduce truncation_mode in DPOTrainer
#2551
opened Jan 8, 2025 by
anakin87
4 of 5 tasks
custom reward function support for ppo trainer
#2540
opened Jan 3, 2025 by
August-murr
•
Draft
1 of 5 tasks
PPOTrainer: fix progress bar for num_mini_batches > 1
#2531
opened Dec 29, 2024 by
dawidm
4 tasks done
Include stop token in policy model's generation_config
#2528
opened Dec 28, 2024 by
dawidm
2 of 5 tasks
RLOO trainer: fix calculations of steps, episodes and epochs
#2516
opened Dec 23, 2024 by
dawidm