Emerging Patterns in Building GenAI Products

Tuesday, February 4, 2025

We then handle user requests by using the embedding model to create an embedding for the query. We use that embedding with an ANN (approximate nearest neighbor) similarity search on the vector store to retrieve matching fragments. Next, we use the RAG prompt template to combine the results with the original query, and send the complete input to the LLM.

A nice quick rundown of RAG implementation.
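
The query flow in the quoted passage can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: `embed` is a bag-of-words counter standing in for a real embedding model, the "vector store" is an in-memory list, the search is an exact brute-force cosine ranking rather than a true ANN index, and `FRAGMENTS`, `RAG_TEMPLATE`, `retrieve`, and `build_prompt` are hypothetical names that do not come from the article. The final step, sending the prompt to an LLM, is left out because it depends on whichever provider is in use.

```python
import math
import re
from collections import Counter

# Assumption: a toy bag-of-words "embedding" standing in for a real embedding model.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Assumption: a hypothetical in-memory "vector store" of pre-embedded fragments.
FRAGMENTS = [
    "The vector store holds an embedding for each document fragment.",
    "Bananas are rich in potassium.",
    "An embedding model maps text to a numeric vector.",
]
STORE = [(embed(fragment), fragment) for fragment in FRAGMENTS]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Embed the query and return the k most similar fragments.

    Exact brute-force ranking; a real system would use an ANN index here.
    """
    query_vec = embed(query)
    ranked = sorted(STORE, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [fragment for _, fragment in ranked[:k]]

# Assumption: a hypothetical RAG prompt template.
RAG_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)

def build_prompt(query: str, k: int = 2) -> str:
    """Combine the retrieved fragments with the original query."""
    context = "\n".join(f"- {fragment}" for fragment in retrieve(query, k))
    return RAG_TEMPLATE.format(context=context, question=query)

if __name__ == "__main__":
    # The resulting string is what would be sent to the LLM.
    print(build_prompt("What does an embedding model do?"))
```

Swapping in a real embedding model and an ANN index changes only `embed` and `retrieve`; the shape of the flow (embed the query, search, fill the template) stays the same.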
