Hackernews
DPO fine-tuning outperforms SFT
1 point
posted a year ago
by kcorbitt
(openpipe.ai)
No comments yet