AI Models & Technology

RAG (Retrieval-Augmented Generation)

📖

Definition

Retrieval-Augmented Generation (RAG) is an architectural pattern that enhances language model outputs by dynamically retrieving relevant information from an external knowledge base — such as a vector database, document store, or search index — and injecting it into the model's context at inference time. Rather than relying solely on knowledge encoded in model weights during training, RAG grounds model responses in up-to-date, curated, or proprietary information the model was not trained on.
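The retrieve-then-inject flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the `DOCUMENTS` dictionary, the function names, and the prompt wording are all hypothetical, and simple word-count cosine similarity stands in for the embedding search a vector database would normally perform.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base (assumption for illustration);
# a real system would query a vector database, document store, or search index.
DOCUMENTS = {
    "returns-policy": "Items can be returned within 30 days of delivery for a full refund.",
    "shipping-policy": "Standard shipping takes 3 to 5 business days within the country.",
    "warranty": "Electronics carry a one year limited warranty covering defects.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the ids of up to k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    # Drop documents that share no words with the query.
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Inject the retrieved passages into the text sent to the model."""
    ids = retrieve(query, documents, k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("Can items be returned for a refund?", DOCUMENTS, k=1))
```

The string returned by `build_prompt` is what would be passed to a language model at inference time; the model call itself is omitted because it depends on the provider.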

RAG has become a dominant pattern for enterprise LLM deployment because it mitigates two critical limitations of standalone LLMs: knowledge cutoffs and hallucination. A commerce platform using RAG can build AI assistants that answer questions about current product catalog details, live inventory status, customer order history, or internal policy documents from retrieved records, without retraining the model. RAG also improves auditability, since retrieved source documents can be surfaced alongside generated answers. Key engineering challenges include retrieval quality, context window management, and handling conflicting information across retrieved documents.
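Two of the points above, context window management and surfacing sources for auditability, can be illustrated together. The sketch below is one possible approach under stated assumptions: the `Passage` type, the `pack_context` function, and the whitespace-based token estimate are hypothetical, and a real deployment would count tokens with the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source_id: str  # identifier shown to the user as a citation
    text: str       # retrieved passage text
    score: float    # retrieval relevance score (higher is better)


def approx_tokens(text):
    """Crude token estimate by whitespace word count (assumption)."""
    return len(text.split())


def pack_context(passages, max_tokens):
    """Greedily keep the highest-scoring passages that fit the token budget.

    Returns the context string to inject into the prompt and the list of
    source ids, so the sources can be displayed alongside the answer.
    """
    selected = []
    used = 0
    for p in sorted(passages, key=lambda p: p.score, reverse=True):
        cost = approx_tokens(p.text)
        if used + cost > max_tokens:
            continue  # skip passages that would overflow the budget
        selected.append(p)
        used += cost
    context = "\n".join(f"[{p.source_id}] {p.text}" for p in selected)
    sources = [p.source_id for p in selected]
    return context, sources


if __name__ == "__main__":
    passages = [
        Passage("A", "one two three four five", 0.9),
        Passage("B", "one two three four five six seven eight nine ten", 0.8),
        Passage("C", "one two three", 0.5),
    ]
    context, sources = pack_context(passages, max_tokens=8)
    print(sources)
```

Greedy packing by score is only one policy. It does not resolve conflicting information between passages; that would need additional logic such as preferring the most recent source.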

🔗

Related Terms

Retrieval-Augmented Generation (RAG)
AI as an Appreciating Asset
AI Assistant
AI Flywheel

Last updated: May 12, 2026
