Ask HN: Is there any local LLM I can run without a GPU for local agentic AI workflows?

7 points, posted 20 hours ago
by limondas

Item id: 48487147

4 Comments

denn-gubsky

20 hours ago

Try the qwen3-coder or qwen3-coder-next models, whichever fits your configuration. These are mixture-of-experts (MoE) models, which may load only the experts that are actually active into the GPU.
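For anyone unfamiliar with the term, here is a toy sketch of top-k expert routing, the general idea behind mixture-of-experts models. It is an illustration only and not the actual router of qwen3-coder; the function name `top_k_experts` and the example logits are made up for this sketch.

```python
import math


def top_k_experts(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Hypothetical helper for illustration: real MoE routers work on tensors
    and differ in details such as normalization and load balancing.
    """
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")

    # Indices of the k largest logits (ties keep their original order).
    chosen = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]

    # Softmax over the selected logits only, shifted for numerical stability.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Only the chosen experts run for this token; the rest stay idle.
routing = top_k_experts([0.1, 2.0, -1.0, 2.0], k=2)
```

Because only the selected experts do work for a given token, the compute per token is much smaller than the total parameter count suggests.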

limondas

18 hours ago

Thanks for your reply, but it's too big for my PC. On my PC, models of around 1.5 GB get about 20 tokens/s, which is too slow for an agentic workflow.
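To put the 20 tokens/s figure in perspective, here is a rough back-of-the-envelope sketch. The step count and tokens per step are assumed numbers chosen for illustration, not measurements, and the estimate ignores prompt processing and tool-call latency.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Time to generate output_tokens at a steady decode speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def workflow_seconds(steps, tokens_per_step, tokens_per_second):
    """Total generation time for an agent loop with equal-sized steps."""
    return steps * generation_seconds(tokens_per_step, tokens_per_second)


# Assumed workload: 10 agent steps, each producing 500 tokens.
total = workflow_seconds(steps=10, tokens_per_step=500, tokens_per_second=20.0)
print(f"{total:.0f} s of pure generation")
```

Under these assumptions a single task spends over four minutes just generating tokens, which is why a speed that feels fine for chat can feel slow in an agent loop.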

denn-gubsky

9 hours ago

Try the latest gemma4:12b. It fits into 16 GB with a 256K context window.
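A quick way to sanity-check whether a model's weights fit in memory is to multiply the parameter count by the bits per weight. The sketch below assumes 12 billion parameters (taken from the "12b" tag) and a 4-bit quantization, which is an assumption and not something stated above. It counts weights only; the KV cache for a long context and runtime overhead come on top.

```python
def weight_gigabytes(num_parameters, bits_per_weight):
    """Approximate size of the model weights in decimal gigabytes."""
    if num_parameters <= 0 or bits_per_weight <= 0:
        raise ValueError("inputs must be positive")
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed: 12e9 parameters, 4-bit quantized weights.
size_q4 = weight_gigabytes(12e9, 4)
# The same model with 16-bit weights, for comparison.
size_fp16 = weight_gigabytes(12e9, 16)
print(f"4-bit: {size_q4:.1f} GB, 16-bit: {size_fp16:.1f} GB")
```

Under these assumptions the 4-bit weights leave headroom within 16 GB, while 16-bit weights alone would exceed it.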