generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
feat: implement DeepSeek unbiased KL estimator for GRPO
#4638
opened Dec 7, 2025 by jlcanta
2 tasks done
Disable gradient checkpointing during no-grad inference to avoid PyTorch warning
#4636
opened Dec 7, 2025 by qgallouedec
Fix KTOTrainer CUDA error for large-vocab models via tensor indexing
#4635
opened Dec 6, 2025 by bhuvanprakash
Support async reward functions and parallelize call to reward functions.
#4567
opened Nov 24, 2025 by pramodith
3 of 5 tasks
Add cross-tokenizer distillation support for GKD and MiniLLM trainers
#4561
opened Nov 22, 2025 by sambhavnoobcoder
Add PSPO trust region method as alternative to clipping in GRPOTrainer
#4548
opened Nov 19, 2025 by MCDwyer
2 of 5 tasks
Make skip_special_tokens configurable
#4521
opened Nov 13, 2025 by taha-yassine
3 of 5 tasks
[GRPO] switch grpo liger loss to triton version
#4519
opened Nov 13, 2025 by kashif
1 of 8 tasks
adding [SimPER](https://arxiv.org/abs/2502.00883)
#4486
opened Nov 6, 2025 by leeparkuky
2 of 5 tasks
added 10 papers (+trainer cross-links) for #4407
#4441
opened Nov 3, 2025 by SSusantAchary
4 tasks done
docs: Unify model examples to use trl-lib namespace
#4431
opened Nov 2, 2025 by behroozazarkhalili
Use explicit tiny-Qwen2ForCausalLM-2.5 model_id param in CI tests
#4331
opened Oct 23, 2025 by albertvillanova
refactor: simplify parameter freezing in modeling_base.py
#4305
opened Oct 20, 2025 by Ki-Seki
2 of 5 tasks
feat: Add Multi-Token Prediction (MTP) support to SFTTrainer
#4290
opened Oct 15, 2025 by KLGR123
Remove FSDP1 support: use FSDP2 exclusively
#4260
opened Oct 11, 2025 by behroozazarkhalili