Refrag: Rethinking RAG Based Decoding

4 points posted 5 months ago
by datadrivenangel

1 Comment

datadrivenangel

5 months ago

Am I misunderstanding this, or is it basically just taking the RAG results, doing a vector search over them, and passing only some of them to the context window?

Also, why do these AI papers never report speedups in human time units?