Context window
Definition
A context window is the maximum amount of text, measured in tokens (roughly word fragments), that a large language model can process in a single inference call, counting both the input it receives and the output it generates. Everything the model can "see" and reason about at one time must fit within this window: the system prompt, conversation history, retrieved documents, user input, and any structured data provided. Information outside the context window is not accessible to the model during that inference call, regardless of how relevant it might be.
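As a rough illustration of the "everything must fit" rule, the sketch below adds up the estimated size of each prompt component and checks the total against a window limit, leaving some room for the model's reply. The helper names (`estimate_tokens`, `fits_in_window`), the parameter names, and the 4-characters-per-token ratio are assumptions made for illustration only; a real system would count tokens with the tokenizer that belongs to the model in use.

```python
import math

# Assumption: about 4 characters per token, a crude rule of thumb for English.
# Real token counts depend on the specific model's tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate (hypothetical helper, not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(
    parts: dict[str, str],
    window_tokens: int,
    reserved_output_tokens: int,
) -> tuple[bool, int]:
    """Return (fits, input_tokens_used) for the given prompt components.

    `parts` maps a label (e.g. "system", "history", "user") to its text.
    `reserved_output_tokens` is the space set aside for the model's reply.
    """
    used = sum(estimate_tokens(text) for text in parts.values())
    fits = used + reserved_output_tokens <= window_tokens
    return fits, used


if __name__ == "__main__":
    prompt_parts = {
        "system": "You are a helpful shopping assistant.",
        "history": "User: Do you ship to Canada?\nAssistant: Yes, we do.",
        "user": "What is the return policy for electronics?",
    }
    ok, used = fits_in_window(prompt_parts, window_tokens=8000,
                              reserved_output_tokens=1000)
    print(f"estimated input tokens: {used}, fits: {ok}")
```

If the check fails, something has to be dropped or shortened before the call is made, since the model cannot read anything that does not fit.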
Context window size has direct, practical implications for AI system design in commerce and enterprise applications. A small context window limits how much product catalog data, policy documentation, or conversation history can be included in a single query, forcing architects to make careful choices about what to retrieve and include. Larger context windows enable richer, more coherent interactions: an AI agent reviewing a lengthy contract, analyzing a full customer service thread, or reasoning across a large product specification document benefits substantially from expanded context capacity. However, larger contexts also increase inference latency and cost, creating engineering trade-offs that must be balanced against the requirements of each application.
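One common way to handle the "what to include" choice for conversation history is to keep only the most recent messages that fit a token budget. The sketch below shows that idea under stated assumptions: `select_recent` and `estimate_tokens` are hypothetical helpers, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer. Other strategies, such as summarizing older turns or retrieving only relevant passages, are equally valid and not shown here.

```python
import math

# Assumption: about 4 characters per token; replace with a real tokenizer
# for the target model in practice.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate (hypothetical helper, not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def select_recent(messages: list[str], budget_tokens: int) -> list[str]:
    """Keep the newest messages whose combined estimate fits the budget.

    Walks the history from newest to oldest and stops at the first message
    that would exceed the budget, so the kept messages are contiguous.
    The result is returned in the original chronological order.
    """
    kept: list[str] = []
    used = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    return kept


if __name__ == "__main__":
    history = [
        "User: I ordered a blender last week and it arrived damaged.",
        "Assistant: Sorry to hear that. Could you share the order number?",
        "User: It is on my account under last Tuesday's purchase.",
    ]
    for line in select_recent(history, budget_tokens=35):
        print(line)
```

A smaller budget lowers latency and cost per call but discards more of the earlier conversation, which is the trade-off described above.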
Last updated: May 12, 2026