DeepSWE: Training an Open-Sourced Coding Agent by Scaling RL

3 points, posted 17 hours ago
by sijuntan

1 comment

sijuntan

17 hours ago

We introduce `DeepSWE-Preview`, a reasoning-enabled coding agent trained from `Qwen3-32B` with reinforcement learning (RL) alone. It achieves 59.0% on SWE-Bench-Verified with test-time scaling, which is state of the art for open-weight coding agents (42.2% Pass@1, 71.0% Pass@16).
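As a point of reference for the numbers above, Pass@k metrics like these are commonly reported with the unbiased estimator from the Codex paper: given n sampled solutions of which c are correct, it estimates the probability that at least one of k draws succeeds. A minimal sketch (the function name and parameters here are illustrative, not from the DeepSWE codebase):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total samples generated per task
    c: number of correct samples
    k: number of draws considered
    Returns the probability that at least one of k draws is correct.
    """
    if n - c < k:
        # Fewer incorrect samples than draws: some draw must be correct.
        return 1.0
    # 1 - P(all k draws are incorrect)
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 16 samples, 8 correct
print(round(pass_at_k(16, 8, 1), 3))   # estimated pass@1
print(round(pass_at_k(16, 8, 16), 3))  # estimated pass@16
```

Per-task estimates like these are then averaged over the benchmark to get the headline Pass@1 and Pass@16 figures.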

DeepSWE is trained using [rLLM](https://www.notion.so/rLLM-A-Framework-for-Post-Training-Lan...), our framework for post-training language agents. We've open-sourced everything, including our dataset, code, and training and eval logs, so that everyone can make progress on scaling and improving agents with RL.