themadmedic
9 days ago
I OCR’d all my PDFs separately, then used FAISS with all-MiniLM-L12-v2 to generate embeddings and vectors. After that, I wrote a separate Python script using Tkinter that acts as my search “GUI.” It takes natural language input, creates an embedding, and queries the FAISS output. It can send the results and an extract to a local LLM (i host it using llama.cpp), which re-ranks them and provides a justification for the ranking.
I wouldn't call it plug and play the actual scripting took me about a weekend, and creating the embeddings took several hours on my Mac but the results have been great.