Add Typeahead and Semantic Search to Your GitHub Searchbar

2 pointsposted 5 months ago
by iamjiamingliu

1 Comments

iamjiamingliu

5 months ago

How it works: 1. A Python ingestor continuously pulls repositories and READMEs from GitHub’s GraphQL API and streams them into Kafka. 2. An indexer consumes from Kafka, processes the content, and writes it into Qdrant, Elasticsearch, and PostgreSQL for vector, keyword, and structured search respectively. 3. At query time, the system analyzes the search request, retrieves candidate results from Qdrant and Elasticsearch, and ranks them using multiple signals — including reranker similarity, click-through rate, recency, and more.

Where it’s hosted: Linode’s 8GB ram virtual machine costing $48 a month + voyage AI

Lemme know if you'd like to request new features and report bugs. Thanks!