Travis_Cole
11 hours ago
I've been using Claude Code heavily for months. It's great for velocity, but I kept hitting the same problems:
- Agent hallucinates file paths that don't exist
- Claims "tests pass" without running them
- Same errors recurring across sessions
- No way to catch failures that aren't crashes
Tooling exists to catch crashes. Nothing exists to catch semantic failures - cases where the agent confidently gives a wrong answer.
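For a concrete example, here's a toy version of catching my first pain point - scanning the agent's reply for file paths and flagging any that don't exist on disk. This is purely illustrative (the helper name and regex are made up for this post, not the actual implementation, which has to extract paths far more carefully):

```python
import re
from pathlib import Path

# Naive path extraction: anything shaped like dir/subdir/file.ext.
PATH_PATTERN = re.compile(r"(?:\.{0,2}/)?[\w.-]+(?:/[\w.-]+)+\.\w+")

def find_hallucinated_paths(agent_output: str, repo_root: str = ".") -> list[str]:
    """Return paths mentioned in the agent's output that don't exist under repo_root."""
    root = Path(repo_root)
    candidates = set(PATH_PATTERN.findall(agent_output))
    return sorted(p for p in candidates if not (root / p).exists())

reply = "Fixed the bug in src/utils/retry.py and updated tests/test_retry.py."
missing = find_hallucinated_paths(reply)
if missing:
    print(f"Possible hallucinated paths: {missing}")  # flag before trusting the claim
```

The principle generalizes: the agent's claims ("tests pass", "file updated") are checkable against ground truth, and checking is cheap compared to debugging a confident lie.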
So I built Task Orchestrator - an MCP server that adds an "immune system" to Claude Code:
1. Semantic failure detection - catches hallucinations, not just crashes
2. ML-powered learning - remembers failure patterns, warns before similar prompts
3. Human-in-the-loop - queues high-risk operations for approval
4. Cost tracking - see exactly what you're spending
5. Self-healing circuit breakers - trip after repeated failures, then recover automatically (sketch below)
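Number 5 is the standard circuit-breaker pattern applied to agent operations. A stripped-down version of the idea (the real feature tracks more state, but this is the shape):

```python
import time

class CircuitBreaker:
    """Trips open after repeated failures; self-heals after a cooldown."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: skipping call")
            self.opened_at = None  # cooldown elapsed: let one probe through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success heals the breaker
        return result
```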
The math problem: at 95% per-step reliability, a 20-step workflow has only a 36% success rate. That's not a bug - it's compound probability.
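The arithmetic, if you want to check it - end-to-end success requires every step to succeed, so (assuming steps fail independently) the per-step probabilities multiply:

```python
p_step, steps = 0.95, 20
print(p_step ** steps)  # 0.3585 - roughly a 36% chance the whole workflow succeeds
```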
Technical details:
- 680+ tests
- Provider-agnostic (works with any LLM)
- MCP native for Claude Code
- MIT licensed
What features would most improve your AI agent workflows?