Show HN: RepoCrunch – CLI to analyze GitHub repos

2 points, posted 10 hours ago
by chillkim

2 Comments

chillkim

10 hours ago

I kept getting surprised by what's actually inside popular repos. Like, Next.js? 13.5% Rust. Turbopack is a way bigger chunk than I expected. Deno is 59% Rust — the "TypeScript runtime" is mostly not TypeScript. Python's fastest package manager (uv) is 98% Rust, 1.6% Python. And LangChain is 99.3% Python with zero compiled extensions, which honestly explains a lot.

So I built a CLI that just... tells you this stuff. Point it at any repo, get back structured data — language breakdown, dependency counts, CI setup, health signals, security basics.

  pip install repocrunch
  repocrunch analyze vercel/next.js --pretty

No AI, no LLMs, no API keys beyond a GitHub token. Same repo, same output, every time. I mainly use it for dependency due diligence — is this library actually maintained, what's it written in, does it have basic security hygiene.
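
To make the percentages above concrete, here is a minimal sketch of how a language breakdown can be derived from per-language byte counts, which is the shape GitHub's repository languages endpoint returns. This is illustrative only and not necessarily how RepoCrunch computes it; the function name `language_percentages` and the sample numbers are made up for the example.

```python
def language_percentages(byte_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-language byte counts into percentages, largest first."""
    total = sum(byte_counts.values())
    if total == 0:
        # Empty repo (or no recognized source files): nothing to report.
        return {}
    ranked = sorted(byte_counts.items(), key=lambda kv: kv[1], reverse=True)
    return {lang: round(100 * n / total, 1) for lang, n in ranked}


# Made-up byte counts, for illustration only (not real repo data).
sample = {"Rust": 590, "TypeScript": 250, "JavaScript": 160}
print(language_percentages(sample))
```

Because the result is pure arithmetic over the counts, the same input always gives the same output.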

Also works as an MCP server if you want to give your AI coding assistant real repo data instead of hallucinated star counts.

Happy to talk about the implementation or the weird things I've found analyzing repos.

squid_protocol

4 hours ago

How does your system determine language prevalence? How does it deal with extension collisions?