Hacker News
Show HN: AST-guard - A gradient-immune structural guard against RL reward hacking
3 points
posted 8 hours ago
by thinking-nick
(github.com)
1 comment
thinking-nick
8 hours ago
[flagged]