Terminal-bench: a benchmark for AI agents in terminal environments

3 points, posted 13 hours ago
by cpard
