Hackernews
Batched reward model inference and Best-of-N sampling
34 points, posted a year ago by rawsh (raw.sh)