
strickvl

6 hours ago

Hi HN, I am one of the people behind Kitaru.

Over the last year, we kept seeing teams in our ZenML community stretch pipeline DAGs to run agents: dynamic branching, conditional steps, state passed through artifact-store workarounds, and so on. It technically worked, but they were fighting the abstraction the whole way.

The core issue is that pipelines assume you know the graph upfront, and agents don’t have one. They loop, branch on LLM outputs, pause for input from a human or another agent, and fail in expensive ways when you have to restart from scratch. Kitaru is our attempt to build the missing layer for that: durable execution for Python agents. It’s not an agent framework, and it’s not just tracing/observability. It’s the layer underneath your existing agent code that gives you crash recovery, pause/resume while waiting on a human, another agent, or a webhook, and replay from any checkpoint.

We kept onboarding intentionally small: add `@flow` and `@checkpoint` to normal Python functions and run your agent. No graph DSL, no big rewrite. It’s built on top of ZenML’s engine, so you get persisted artifacts and replayability, and the same code can run locally or on your own infrastructure.
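
To make the idea concrete, here is a toy sketch of what checkpoint-style durability means. It is not Kitaru's implementation or API: the `flow` and `checkpoint` decorators below are hypothetical stand-ins defined in the snippet itself, the `run_id` argument and the JSON-files-in-a-temp-directory store are assumptions made for illustration, and results are keyed by call order, which only works if the flow calls its checkpoints in the same order on every run.

```python
import functools
import json
import tempfile
from pathlib import Path

# Hypothetical toy stand-ins for illustration only; NOT Kitaru's API.
_STORE = Path(tempfile.mkdtemp(prefix="toy_checkpoints_"))
_state = {"run_id": None, "counter": 0}


def flow(fn):
    """Run fn under a run id so saved checkpoint results can be found again."""
    @functools.wraps(fn)
    def wrapper(*args, run_id, **kwargs):
        _state["run_id"] = run_id
        _state["counter"] = 0
        try:
            return fn(*args, **kwargs)
        finally:
            _state["run_id"] = None
    return wrapper


def checkpoint(fn):
    """Persist fn's JSON-serializable result; reuse it when the run is retried."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if _state["run_id"] is None:
            return fn(*args, **kwargs)  # called outside a flow: plain call
        index = _state["counter"]  # position of this call within the run
        _state["counter"] += 1
        path = _STORE / f"{_state['run_id']}-{index}-{fn.__name__}.json"
        if path.exists():
            return json.loads(path.read_text())  # replay the saved result
        result = fn(*args, **kwargs)
        path.write_text(json.dumps(result))
        return result
    return wrapper


calls = {"research": 0, "summarize": 0}


@checkpoint
def research(topic):
    calls["research"] += 1  # stands in for an expensive LLM or tool call
    return f"notes on {topic}"


@checkpoint
def summarize(notes):
    calls["summarize"] += 1
    if calls["summarize"] == 1:
        raise RuntimeError("simulated crash")  # fails on the first attempt only
    return notes.upper()


@flow
def agent(topic):
    return summarize(research(topic))


def demo():
    try:
        agent("durable execution", run_id="demo-run")
    except RuntimeError:
        pass  # first attempt crashed after research() was saved
    # Retrying with the same run id skips research() and reruns summarize().
    return agent("durable execution", run_id="demo-run")
```

Calling `demo()` runs the flow twice with the same run id. The second attempt reads the saved `research` result from disk instead of recomputing it, so only the step that failed is executed again.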

We’re still early, so I would love feedback on the product and the idea.

Happy to answer anything technical.

Show HN: Kitaru – Open-source infrastructure for async agents
