Retrieval-Augmented Generation (RAG)
Definition
Retrieval-Augmented Generation (RAG) is an AI architecture that combines a large language model with a retrieval system, allowing the model to access and incorporate external knowledge at inference time rather than relying solely on information encoded in its parameters during training. In a RAG pipeline, a user query triggers a search over an external knowledge base (such as product catalogs, documentation, customer records, or enterprise knowledge repositories), and the retrieved content is injected into the model's prompt as context before the model generates its response.
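The retrieve-then-inject flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the knowledge base entries are made-up sample data, the function names (tokenize, score, retrieve, build_prompt) are hypothetical, and the token-overlap scoring is a deliberately crude stand-in for the keyword or vector search a production retriever would use. The call to the language model itself is left out; the sketch stops at the prompt that would be sent to it.

```python
import re
from collections import Counter

# Made-up sample data. A real system would query a search index or vector store.
KNOWLEDGE_BASE = [
    {"id": "policy-returns", "text": "Items can be returned within 30 days of delivery."},
    {"id": "sku-1042", "text": "The trail backpack holds 40 liters and weighs 1.2 kg."},
    {"id": "shipping", "text": "Standard shipping takes 3 to 5 business days."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc_text):
    """Count tokens shared by query and document (a crude relevance proxy)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(doc_text))
    return sum(shared.values())


def retrieve(query, k=2):
    """Return up to k documents with a non-zero score, best match first."""
    scored = [(score(query, doc["text"]), doc) for doc in KNOWLEDGE_BASE]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query, docs):
    """Inject the retrieved documents into the prompt as labeled context."""
    context = "\n".join(f"[{doc['id']}] {doc['text']}" for doc in docs)
    return (
        "Answer using only the context below. Cite source ids in brackets.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
    )


query = "How many liters does the trail backpack hold?"
docs = retrieve(query)
prompt = build_prompt(query, docs)
print(prompt)
```

Because the context is rebuilt on every query, updating an entry in the knowledge base changes the next answer without any retraining of the model.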
RAG addresses two fundamental limitations of standalone LLMs in enterprise settings: knowledge staleness (a model's knowledge is frozen at training time) and hallucination risk (a model may fabricate facts that its training data did not cover). In commerce, RAG enables AI assistants to accurately answer questions about current inventory, pricing, policies, and product specifications without retraining. It also provides a mechanism for grounding model outputs in auditable source documents, which is important for compliance, customer trust, and debugging when responses are incorrect. Effective RAG implementation requires careful attention to retrieval quality, chunk sizing, context window management, and citation handling.
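Three of the implementation concerns named above (chunk sizing, context window management, and citation handling) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the helper names chunk_words and pack_context are hypothetical, the sizes are arbitrary, and word counts stand in for the token counts a real system would measure with the model's own tokenizer.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def pack_context(ranked_chunks, budget_words=120):
    """Add chunks in ranked order until the next one would exceed the word budget.

    Each chunk is prefixed with its source id so an answer can cite it.
    Word count is only a rough proxy for the model's real token count.
    """
    lines, used = [], 0
    for source_id, chunk in ranked_chunks:
        cost = len(chunk.split())
        if used + cost > budget_words:
            break
        lines.append(f"[{source_id}] {chunk}")
        used += cost
    return "\n".join(lines), used


# Synthetic 120-word document; the chunks are assumed to be already ranked.
document = " ".join(f"w{i}" for i in range(120))
chunks = chunk_words(document, size=50, overlap=10)
ranked = [(f"doc-1#chunk-{i}", chunk) for i, chunk in enumerate(chunks)]
context, used = pack_context(ranked, budget_words=120)
print(context)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, while the budget check stops the injected context from crowding out the question and the model's answer. Stopping at the first chunk that does not fit is one simple policy; skipping it and trying smaller chunks further down the ranking is an equally reasonable alternative.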
Last updated: May 12, 2026