Predicting When RL Training Breaks Chain-of-Thought Monitorability

1 point, posted 9 hours ago

by gmays

(lesswrong.com)
