Context Window

The maximum amount of text (measured in tokens) an LLM can process in a single request — covering both the prompt and the response.

Quick answer

What is a context window?

Last updated: May 2026

The context window is the LLM's working memory. Older models like GPT-3.5 had 4K-16K token windows; modern frontier models support 128K (GPT-4 Turbo), 200K (Claude 3.5 Sonnet, Claude Opus 4), and even 1M+ tokens (Gemini 1.5 Pro). At roughly 0.75 English words per token, a 200K context window holds about 150,000 words, which is often enough to fit an entire Salesforce account history.
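The words-to-tokens arithmetic above can be sketched in a few lines. This is a rough estimate only: it assumes about 0.75 English words per token (the ratio implied by 200K tokens being roughly 150,000 words), and the function names `estimate_tokens` and `fits_in_window` are illustrative, not part of any library. A model's real tokenizer will give different counts, especially for code, IDs, or non-English text.

```python
import math

# Assumption: ~0.75 English words per token, the ratio implied by
# "200K tokens is roughly 150,000 words". Real tokenizers vary by model and text.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count."""
    word_count = len(text.split())
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_window(text: str, window_tokens: int, reserved_for_response: int = 0) -> bool:
    """True if the prompt plus the tokens reserved for the response fit the window."""
    return estimate_tokens(text) + reserved_for_response <= window_tokens


history = "word " * 150_000  # stand-in for a 150,000-word account history
print(estimate_tokens(history))                                       # 200000
print(fits_in_window(history, 200_000))                               # True
print(fits_in_window(history, 200_000, reserved_for_response=4_000))  # False
```

The last call shows why the response matters: the window covers both prompt and response, so a prompt that exactly fills the window leaves no room for the model to answer.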

Why this matters for Salesforce: with a small context window, you can send only a snippet of an account's history to the LLM, which forces manual selection and can miss relationships between records. With 200K+ tokens, you can often send the full Account plus its Cases, Opportunities, and emails in a single request, so the LLM sees the complete picture.
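As a minimal sketch of that difference, the snippet below packs account records into a prompt until a token budget is exhausted. Everything here is hypothetical: the `Record` type, the `build_prompt` helper, and the word-based token estimate are assumptions for illustration, not GPTfy's or Salesforce's API. With a small window some records get skipped; with a 200K window the same records all fit.

```python
from dataclasses import dataclass


@dataclass
class Record:
    kind: str  # e.g. "Account", "Case", "Opportunity", "Email"
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: ~4 tokens per 3 English words, rounded up.
    # A model's real tokenizer would give different counts.
    return (len(text.split()) * 4 + 2) // 3


def build_prompt(records: list[Record], window_tokens: int, reserved_for_response: int):
    """Add records in the given (priority) order while they fit the token budget."""
    budget = window_tokens - reserved_for_response
    included, skipped, used = [], [], 0
    for record in records:
        block = f"[{record.kind}]\n{record.text}"
        cost = estimate_tokens(block)
        if used + cost <= budget:
            included.append(block)
            used += cost
        else:
            skipped.append(record)  # caller decides: summarize, retrieve, or drop
    return "\n\n".join(included), skipped


records = [
    Record("Account", "Acme Corp, manufacturing, customer since 2019"),
    Record("Case", "Login failures reported after SSO migration"),
    Record("Email", "Customer asked about renewal pricing and contract terms"),
]

_, skipped_small = build_prompt(records, window_tokens=30, reserved_for_response=10)
_, skipped_large = build_prompt(records, window_tokens=200_000, reserved_for_response=4_000)
print(len(skipped_small), len(skipped_large))
```

Records are taken in list order, so the caller controls priority; a skipped record is exactly the "manual selection" problem that a larger window reduces.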

Trade-offs: filling a larger context window makes each request slower and more expensive (cost scales linearly with token count), and accuracy can degrade on very long inputs (the "lost in the middle" problem, where information buried mid-prompt is used less reliably). Production architectures often combine RAG (retrieving only the relevant pieces) with selective long-context use to control cost.
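The cost side of this trade-off is simple arithmetic, sketched below. The per-million-token prices and the routing threshold are made-up placeholders, and `request_cost` and `choose_strategy` are hypothetical helper names; substitute your provider's actual pricing and your own limits.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Linear token pricing: cost grows in direct proportion to token count."""
    return (input_tokens * usd_per_m_input + output_tokens * usd_per_m_output) / 1_000_000


def choose_strategy(estimated_tokens: int, long_context_threshold: int) -> str:
    """Send everything when the input is small enough; otherwise retrieve first."""
    return "long_context" if estimated_tokens <= long_context_threshold else "rag"


# Hypothetical prices in USD per million tokens, for illustration only.
PRICE_IN, PRICE_OUT = 3.00, 15.00

full = request_cost(200_000, 1_000, PRICE_IN, PRICE_OUT)     # whole account history
trimmed = request_cost(20_000, 1_000, PRICE_IN, PRICE_OUT)   # retrieved subset only
print(f"full history: ${full:.3f}  retrieved subset: ${trimmed:.3f}")

# Assumed threshold: route anything above 50K estimated tokens through retrieval.
print(choose_strategy(20_000, long_context_threshold=50_000))
print(choose_strategy(180_000, long_context_threshold=50_000))
```

A fixed threshold is the simplest possible routing rule. In practice the decision might also weigh latency targets and whether the question needs cross-record context that retrieval could miss.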

