

If you have spent any significant time using Microsoft Copilot, you have likely encountered this frustrating scenario: you are deep into a productive chat, refining a document or analysing data. You ask a follow-up question, and Copilot responds with a generic answer, seemingly having forgotten everything you discussed five prompts earlier.
This "conversation drift" or "AI amnesia" is not a bug. It is a fundamental architectural constraint of all Large Language Models (LLMs) known as the context window. In this article, we’ll explain what context windows are, how they work behind the scenes, the challenges they pose (from token limits to privacy), and best practices to help you maximise Copilot’s effectiveness.
For more expert tips on how to boost your Copilot efficiency, make sure to check out our instructor-led training courses!
A context window refers to the amount of text (measured in tokens) that an AI model can consider at one time. Tokens are pieces of words; for example, “networking” might be two tokens (“network” + “ing”). As a rule of thumb, 1 token is roughly 3/4 of a word in English. So if a model has an 8,000-token context window, that’s about 6,000 words of text it can handle in its “working memory” before it starts forgetting old content.
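The rule of thumb above can be turned into a quick planning heuristic. The sketch below is illustrative only (real tokenisers count tokens exactly; this just applies the "1 token ≈ 3/4 of a word" approximation from the paragraph above):

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count: 1 token is roughly 3/4 of an English word."""
    words = len(text.split())
    return round(words * 4 / 3)

def fits_in_window(text: str, window_tokens: int = 8_000) -> bool:
    """Check whether text plausibly fits an 8,000-token context window
    (about 6,000 words, per the rule of thumb)."""
    return estimate_tokens(text) <= window_tokens
```

For example, a 6,000-word report estimates to about 8,000 tokens, right at the edge of the hypothetical window, while anything longer would start forcing old content out of working memory.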
In an enterprise setting, managing this limitation is the single most critical skill for productive AI adoption. It is the difference between a tool that accelerates work and one that creates constant friction. Mastering this goes beyond simple "prompt engineering". It is the vital discipline of context management.
This is the most important concept for M365 Copilot users to understand. Your organisation's 10 TB of data in SharePoint is not your context window.
Context Window: This is the finite "working memory" of the LLM, measured in tokens. It is what the model can "see" at the exact moment it generates a response.
Grounding: This is the process of retrieving information from your organisation's persistent knowledge.
Microsoft 365 Copilot is a Retrieval Augmented Generation (RAG) system. It does not put your entire file library into the model's brain. Instead, it searches your tenant for the passages most relevant to your prompt, injects those snippets into the context window, and generates a response grounded in them.
This resolves a common puzzle about context windows: Copilot can read a 300-page file, but it will only inject the most relevant passages (e.g. headings, introduction, conclusion) into the finite token budget when generating a summary.
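The selection step can be sketched as a greedy, relevance-ranked fill of the token budget. This is a conceptual illustration, not Copilot's actual algorithm; in a real system the relevance scores would come from a semantic search index:

```python
def select_passages(passages, budget_tokens):
    """Greedy RAG-style selection: take the highest-scoring passages
    until the token budget reserved for grounding data is exhausted.

    `passages` is a list of (relevance_score, token_count, text) tuples.
    """
    chosen, used = [], 0
    for score, tokens, text in sorted(passages, reverse=True):
        if used + tokens <= budget_tokens:
            chosen.append(text)
            used += tokens
    return chosen

# Hypothetical passages from a 300-page document:
docs = [
    (0.9, 120, "Executive summary"),
    (0.8, 300, "Conclusion"),
    (0.3, 900, "Appendix tables"),
    (0.7, 250, "Key findings"),
]
print(select_passages(docs, budget_tokens=700))
# → ['Executive summary', 'Conclusion', 'Key findings']
```

Note how the large, low-relevance appendix never makes it into the window: this is exactly why a summary of a long file leans on headings, introductions, and conclusions.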
To be effective, you must be aware of the three types of context Copilot is balancing.
Understanding the mechanics of this process is key to controlling it. Every prompt initiates a complex, invisible sequence of events that allocates your finite token budget.
Think of the model's total context window as a "token budget". This total budget is fixed and must be divided among multiple components. The available space for your data is what is left over.
A conceptual formula for this trade-off is:
Available_Tokens_for_User_Data =
Max_Model_Context_Window
- (System_Instructions + Safety_Guardrails + Tool_Definitions)
- (Chat_History)
- (User_Prompt)
- (Reserved_Response_Budget)
This model explains why things fail. If your Chat_History is very long and your User_Prompt is verbose, the Available_Tokens_for_User_Data (the space for retrieved documents) becomes tiny. This either forces the RAG system to be hyper-selective - leading to a weak, ungrounded answer - or it shrinks the Reserved_Response_Budget, resulting in a truncated, cut-off response.
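The conceptual formula above can be expressed directly in code. The numbers below are hypothetical, chosen only to show how a long chat history squeezes the space left for retrieved documents:

```python
def available_tokens_for_user_data(
    max_context: int,
    system_overhead: int,   # system instructions + guardrails + tool definitions
    chat_history: int,
    user_prompt: int,
    reserved_response: int,
) -> int:
    """Token budget left over for retrieved documents (the grounding data)."""
    return (max_context - system_overhead - chat_history
            - user_prompt - reserved_response)

# Hypothetical figures for an 8,000-token model:
remaining = available_tokens_for_user_data(
    max_context=8_000,
    system_overhead=1_500,
    chat_history=4_000,   # a long, meandering conversation
    user_prompt=300,
    reserved_response=1_000,
)
print(remaining)  # → 1200 tokens left for grounding data
```

With only 1,200 tokens remaining, the RAG system must be hyper-selective about which passages it injects, which is exactly the failure mode described above.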
When you interact with Copilot in M365, several processes happen in milliseconds: your prompt is interpreted, a grounding query retrieves relevant passages from your tenant, the final prompt is assembled within the token budget, and the model generates its response.
Truncation and Retrieval Strategies
You need to be aware that when the retrieved content and chat history exceed the available token budget, Copilot employs several strategies: truncating the oldest chat history, prioritising the most relevant retrieved passages, and dropping or condensing lower-value content.
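History truncation is the simplest of these strategies to illustrate. A minimal sketch (not Copilot's actual implementation) shows why the earliest turns of a long conversation are the first to disappear:

```python
def truncate_history(messages, history_budget):
    """Keep the most recent messages that fit the history budget,
    dropping the oldest first -- which is why an assistant 'forgets'
    early turns in a long conversation.

    `messages` is a chronological list of (token_count, text) pairs.
    """
    kept, used = [], 0
    for tokens, text in reversed(messages):   # walk newest-first
        if used + tokens > history_budget:
            break
        kept.append(text)
        used += tokens
    return list(reversed(kept))               # restore chronological order

chat = [(500, "turn 1"), (400, "turn 2"), (300, "turn 3"), (200, "turn 4")]
print(truncate_history(chat, history_budget=600))
# → ['turn 3', 'turn 4']
```

Turns 1 and 2 are silently discarded: from the model's point of view they never happened, which is the "AI amnesia" described at the start of this article.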
Whilst powerful, this architecture presents clear limitations and risks that technical managers must mitigate.
As of October 2025, the following practical limits are key operational thresholds:
On the flip side of the token limit issue, providing too little or ambiguous context can also be problematic. If Copilot doesn’t have specific information to ground an answer, it may produce a plausible-sounding but incorrect response – a phenomenon known as hallucination.
For example, asking “Explain the client’s requirements for Project Y” in Copilot Chat without providing or referencing any documents might lead it to guess or generalise, possibly giving a wrong summary. The AI will try to be helpful even if the context window is essentially empty or missing key facts.
Not all Copilot users have the same abilities, which indirectly affects context usage. Microsoft 365 Copilot add-on license users get the full Graph-integrated experience (the Work tab in Copilot Chat, ability to reference Teams messages, meetings, etc.), whereas users without that license are limited to public web context or their own provided text.
If you don’t have a Copilot license, you effectively have a smaller context scope (only web data, or whatever you paste in). Those with the license have a much richer context via their tenant data. Additionally, Microsoft has introduced priority access vs standard access modes. Priority access might mean faster responses or early feature releases, whereas standard could mean slower retrieval or lower rate limits.
Controlling Copilot's context is a skill. This playbook provides a concrete process for getting focused, accurate, and reliable results.
A 7-Step Playbook for Focused Copilot Prompts
Always remember to run a 'Lost-Information Check'. After Copilot gives you a summary of a long document, force it to re-examine the context by asking it to self-critique. This counters the "lost in the middle" context window problem. Try prompts such as: "What details from the middle sections of the document did your summary omit?" or "List three points you chose not to include, and explain why."
Context windows might be an “under the hood” technical concept, but for end-users and admins guiding Microsoft Copilot deployments, they make a tangible difference in daily use. Knowing the limits and behaviours of Copilot’s context window helps explain why it sometimes forgets things or produces off-base answers – and more importantly, how to prevent those issues.
For teams rolling out Copilot, the next steps should be to teach everyone these best practices. Get in touch today to find out how we can help you and your team level up!
Why does Copilot forget what I told it five prompts ago?
How can I force Copilot to only use one specific file?
How is the context window different from "memory"?
How big is the Copilot context window?



