Hackernews
Training a small model to write better OCaml with RLVR and GRPO
1 point
posted 8 hours ago
by sriharis
(blog.nilenso.com)
No comments yet