Terminal-bench: a benchmark for AI agents in terminal environments

3 points, posted 5 months ago
by cpard

No comments yet