What is Retrieval-Augmented Generation (RAG)?

Large language models are impressive, but they have a fundamental limitation: they only know what they were trained on. Ask a model about something that happened after its training cutoff, or about a document sitting in your company’s internal knowledge base, and it either makes something up or tells you it doesn’t know.

Retrieval-augmented generation, almost always shortened to RAG, is the approach the industry has largely settled on to address this.

The idea is straightforward. Instead of relying purely on what the model has memorized, you let it pull in relevant information from an external source, then use that information to generate a response.
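To make the retrieve-then-generate flow concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `DOCUMENTS` list stands in for the external source, `retrieve` ranks passages by simple word overlap (real systems typically use a search index or embeddings), and `build_prompt` only assembles the text you would hand to a model. No actual model is called.

```python
import re

# Hypothetical in-memory "external source" (assumption for illustration).
# A real system would query a search index or vector store instead.
DOCUMENTS = [
    "The office is closed on public holidays.",
    "Expense reports are due by the fifth of each month.",
    "The VPN requires two-factor authentication.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


question = "When are expense reports due?"
passages = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, passages)
print(prompt)  # this prompt would then be sent to a language model
```

The point of the sketch is the shape of the pipeline: retrieval happens first, and the model only ever sees the question together with whatever the retrieval step returned.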

What is an Embedding Model?

Computers are good at numbers. They’re not naturally good at understanding that “dog” and “puppy” are related, that a photo of a beach and the phrase “summer vacation” share something in common, or that a five-star review and the sentence “this product is amazing” mean roughly the same thing.

Embedding models are how we bridge that gap.
