GPTfy Glossary
Context Window
The maximum amount of text (measured in tokens) an LLM can process in a single request — covering both the prompt and the response.
The context window is the LLM's working memory. Older models like GPT-3.5 had 4K-16K token windows; modern frontier models support 128K (GPT-4 Turbo), 200K (Claude 3.5 Sonnet), and even 1M+ tokens (Gemini 1.5 Pro). At roughly 0.75 English words per token, a 200K context window holds about 150,000 words, which is often enough to fit an entire Salesforce account history.
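The 200K-token to 150,000-word figure implies a rule of thumb of about 0.75 English words per token. A minimal sketch of that arithmetic follows, assuming the ratio holds for your text (real tokenizers vary by model and language). The function names are illustrative and not part of any GPTfy or vendor API.

```python
# Rule-of-thumb ratio implied by "200K tokens is roughly 150,000 words".
# Assumption: actual ratios depend on the tokenizer and the language.
WORDS_PER_TOKEN = 0.75


def approx_words(tokens: int) -> int:
    """Rough number of English words that fit in `tokens` tokens."""
    return int(tokens * WORDS_PER_TOKEN)


def max_prompt_tokens(window: int, reserved_for_response: int) -> int:
    """Tokens left for the prompt once space for the response is set aside.

    The window covers both prompt and response, so the two share one budget.
    """
    if reserved_for_response > window:
        raise ValueError("response reservation exceeds the context window")
    return window - reserved_for_response


print(approx_words(200_000))               # 150000
print(max_prompt_tokens(200_000, 4_000))   # 196000
```

Reserving response tokens up front is a design choice: a prompt that fills the whole window leaves the model no room to answer.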
Why this matters for Salesforce: with a small context window, you can send only a snippet of an account's history to the LLM, which forces manual selection and can miss relationships between records. With 200K+ tokens, you can often send the full Account, Cases, Opportunities, and emails in a single request, so the LLM sees the complete picture.
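To illustrate the selection problem a small window creates, here is a hedged sketch that packs records into a token budget in priority order. The `Record` shape, the priority ranking, and the 4-tokens-per-3-words estimate are all assumptions made for this example. A real integration would use the model's own tokenizer and actual Salesforce objects.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    kind: str      # e.g. "Account", "Case", "Opportunity", "Email"
    text: str
    priority: int  # hypothetical ranking: lower means more important


def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 tokens for every 3 English words."""
    return math.ceil(len(text.split()) * 4 / 3)


def pack_records(records: list[Record], budget_tokens: int) -> tuple[list[Record], int]:
    """Greedily keep the most important records that still fit the budget."""
    chosen: list[Record] = []
    used = 0
    for rec in sorted(records, key=lambda r: r.priority):
        cost = estimate_tokens(rec.text)
        if used + cost <= budget_tokens:
            chosen.append(rec)
            used += cost
    return chosen, used


history = [
    Record("Account", "Acme Corp renewal", 0),
    Record("Case", "word " * 30, 1),           # a long case thread
    Record("Email", "Thanks for the call", 2),
]

small, used_small = pack_records(history, budget_tokens=10)
large, used_large = pack_records(history, budget_tokens=200_000)
print([r.kind for r in small], used_small)
print([r.kind for r in large], used_large)
```

With the small budget the long case thread is dropped. With the large one every record fits, which is the "complete picture" scenario described above.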
Trade-offs: larger contexts are slower and more expensive (with per-token pricing, cost grows linearly with the number of tokens sent), and accuracy can degrade on very long inputs (the "lost in the middle" problem, where details buried mid-prompt are recalled less reliably). Production architectures often combine RAG (retrieving only the relevant pieces) with selective long-context use to control cost.
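The linear cost relationship can be sketched with simple arithmetic. The prices below are placeholders chosen for illustration, not any vendor's actual rates, and the token counts are hypothetical. Substitute your provider's published pricing.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Cost of one request under simple per-token pricing (linear in tokens)."""
    return (
        input_tokens * usd_per_million_input
        + output_tokens * usd_per_million_output
    ) / 1_000_000


# Hypothetical prices in USD per million tokens (assumptions, not real rates).
PRICE_IN, PRICE_OUT = 3.0, 15.0

# Sending a full account history vs. a RAG-retrieved subset of it.
full_context = request_cost(180_000, 1_000, PRICE_IN, PRICE_OUT)
rag_subset = request_cost(8_000, 1_000, PRICE_IN, PRICE_OUT)

print(f"full context: ${full_context:.3f} per request")
print(f"RAG subset:   ${rag_subset:.3f} per request")
```

Under these assumed numbers the full-context request costs roughly fourteen times as much as the retrieved subset. That gap is why long context is often reserved for the requests that genuinely need it.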