Supervised Fine Tuning on Curated Data Is Reinforcement Learning

3 points, posted 13 hours ago
by saijajin

1 Comment

user

13 hours ago

[deleted]