
ij23

2 hours ago

LiteLLM maintainer here. Some context on why we are doing this

Over the past year we've heard the same thing from our users and community: they want the fastest and lightest AI gateway.

This change lets us address the two most common problems we hear about from users: latency spikes under load, and memory leaks/OOM kills that take pods down.

We believe a Rust hot path will be faster and memory-bounded, which would eliminate both classes of issues.

It will be a gradual, non-breaking change. The Python SDK and proxy stay exactly the same; under the hood, they start calling the Rust binary through PyO3, one component at a time, each proven in production before the next. The sub-1ms figure is gateway overhead (what we add on top of the upstream call), and we're aiming for a sub-100MB binary. Happy to share benchmark methodology if folks want to poke at it.
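For anyone unsure what "gateway overhead" means here, a rough sketch follows: time the whole request through the gateway, time the upstream call on its own, and subtract. This is only an illustration and not the actual benchmark methodology; `fake_upstream`, `fake_gateway`, `timed`, and `gateway_overhead_ms` are hypothetical names made up for this example.

```python
import time
from typing import Callable, Tuple


def timed(fn: Callable[[], object]) -> Tuple[object, float]:
    """Run fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn()
    return result, (time.perf_counter() - start) * 1000.0


def gateway_overhead_ms(total_ms: float, upstream_ms: float) -> float:
    """Overhead = end-to-end time through the gateway minus upstream time."""
    return max(total_ms - upstream_ms, 0.0)


def fake_upstream() -> str:
    # Hypothetical stand-in for a provider call (assumption: ~20 ms).
    time.sleep(0.02)
    return "ok"


def fake_gateway() -> Tuple[str, float]:
    # Hypothetical gateway: does some local work, then calls upstream
    # and reports how long the upstream call itself took.
    _ = sum(range(1000))  # stand-in for routing/auth/logging work
    response, upstream_ms = timed(fake_upstream)
    return response, upstream_ms


if __name__ == "__main__":
    (response, upstream_ms), total_ms = timed(fake_gateway)
    overhead = gateway_overhead_ms(total_ms, upstream_ms)
    print(f"upstream={upstream_ms:.3f} ms total={total_ms:.3f} ms overhead={overhead:.3f} ms")
```

A single run like this is noisy; a real benchmark would presumably report percentiles over many requests under load.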

The whole gateway will be running on Rust by December 1, 2026.

Full announcement: https://docs.litellm.ai/blog/litellm-rust-launch

lackoftactics

an hour ago

Two questions.

How do you handle competition like https://github.com/ENTERPILOT/GOModel and Bifrost? They have already moved to more performant languages like Go.

What is LiteLLM's moat currently, and why such a radical move?

LiteLLM Migrates to Rust

3 Comments
