Llama-dash – One go-to control plane for local inference

2 pointsposted 5 hours ago
by ndom91

1 Comments

ndom91

5 hours ago

Been working on this side-project for the last few months and finally put together a nice marketing page and docs site.

It's a beautiful dashboard and proxy for monitoring models, requests, and API keys. It also supports routing rules, and ,proxy metrics, playground tabs, additional attribution headers, and much more. In addition to local models, it also supports proxying anthropic / openai requests from local coding agents like claude-code so you get much more insight into what they're doing too.

Currently relies on llama-swap / llama.cpp as the underlying inference engine, but that part is designed to be very generic and easy to swap out / build support for additional inference applications like ollama, vllm, etc. as long as they expose similar APIs.

It's entirely self-hostable without any cloud services. I'd love to hear any feedback you all may have!