MKuykendall
3 days ago
Hey HN! I built this because I was tired of waiting 10 seconds for Ollama's 680MB binary to start just to run a 4GB model locally.
Quick demo - working VSCode + local AI in 30 seconds:

curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/late...
./shimmy serve  # Point VSCode/Cursor to localhost:11435
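Once it's running, a quick sanity check (a /v1/models listing is the standard way OpenAI-compatible servers report what's loaded):

curl http://localhost:11435/v1/models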
The technical achievement: I got the binary down to 5.1MB by stripping everything except pure inference. It's written in Rust and uses llama.cpp's engine.
One feature I'm excited about: You can use LoRA adapters directly without converting them. Just point to your .gguf base model and .gguf LoRA - it handles the merge at runtime. Makes iterating on fine-tuned models much faster since there's no conversion step.
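A sketch of what that looks like (the flag names here are illustrative; check the README for the exact syntax):

# --model/--lora are illustrative flag names, not necessarily the real ones
./shimmy serve --model ./models/base.gguf --lora ./models/adapter.gguf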
Your data never leaves your machine. No telemetry. No accounts. Just a tiny binary that makes GGUF models work with your AI coding tools.
Would love feedback on the auto-discovery feature - it finds your models automatically so you don't need any configuration.
What's your local LLM setup? Are you using LoRA adapters for anything specific?
carlos_rpn
3 days ago
You may have noticed already, but the link to the binary is throwing a 404.
MKuykendall
3 days ago
This should be fixed now!
sunscream89
2 days ago
How do I use it with ollama models?
MKuykendall
2 days ago
To use Shimmy (instead of Ollama):
1. Install Shimmy:
cargo install shimmy
2. Get GGUF models (same models you'd use with Ollama):
# Download to ./models/ directory
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-gguf --local-dir ./models/
# Or use existing Ollama models from ~/.ollama/models/
3. Start serving:
./shimmy serve
4. Use with any OpenAI-compatible client at http://localhost:11435, for example:
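(The model name below is a placeholder; use whatever auto-discovery lists for you.)

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi-3-mini", "messages": [{"role": "user", "content": "Hello"}]}'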
sunscream89
a day ago
I'm trying to use ~/.ollama/models/; I even linked it to ~/models. I don't have phi-3, so it's possible none of my models are supported. It acts as though it sees nothing.
How do I know for sure it's checking ~/.ollama/models/ (if linking isn't the right approach)?
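For reference, the link I made was roughly:

# point ~/models at Ollama's model directory
ln -s ~/.ollama/models ~/models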
MKuykendall
a day ago
I didn't have that path in the auto-discovery list; pull the newest version and this is fixed now!