Commit 810a96f: Merge pull request #58 from microsoft/gughini ("Add new MCP Resources as input blog post")

---
layout: post
title: "Free Up Your Context Window: Pass MCP Resources, Not Raw Data"
date: 2025-11-24 23:50:00 +0100
categories: [copilot-studio, mcp, patterns]
tags: [context-window, resources, mcp-server, tooling, scalability]
description: How to avoid context window saturation in Copilot Studio by passing MCP resources between tools when a tool's output is too large to fit into the agent's context window.
author: giorgioughini
image:
  path: /assets/posts/mcp-resources-as-tool-inputs/header-res.png
  alt: "Common image used by Giorgio Ughini in all his posts, with the Copilot Studio icon that is linked to the MCP logo which in turn is linked to MCP resources"
  no_bg: true
---

> **Note**: This article assumes some familiarity with MCP resources. If you're new to the concept, check out the general tutorial: [MCP Tools & Resources](https://microsoft.github.io/mcscatblog/posts/mcp-tools-resources/).
{: .prompt-info }

A common error when working with tools that produce a huge amount of output is the token limit error. For example:

- A connector or an MCP tool returns a massive unfiltered payload
- The orchestrator faithfully feeds it into the LLM
- The context window explodes → the agent becomes very slow or fails with a token limit error

Sometimes such a token-heavy tool is producing a lot of unwanted data, but other times you actually need that data later in the plan, so you can't just throw it away.

In this article we'll look at a pattern that fixes this at the root: **use MCP resources and pass resource IDs between tools instead of passing the whole text or JSON into the context window.**

We'll use a simple MCP sample with two tools (random characters generator + character counter) to illustrate the pattern, and then project it onto a more realistic scenario.

> TL;DR: For token-heavy outputs, let your MCP server keep the data. Your agent should pass lightweight resource IDs between tools and only pull data into the context window when absolutely necessary.
{: .prompt-tip }

---

## The problem: tools that flood the context window

Every time a Copilot Studio agent calls a tool (a connector action, an MCP tool, etc.), the orchestrator decides what to send to the tool (inputs), and it gets back some tool-defined outputs that are **fed into the LLM's context** for further reasoning.

If a tool returns small or medium amounts of data, no problem.

If a tool returns things like:

- A huge JSON with thousands of records
- A long text (logs, transcripts, documents, concatenated content)

...you will quickly hit the **context window limit**.

Result:

- Best case: the agent becomes slower and less sharp in its decision making due to a bloated context.
- Worst case: you hit an “input too long / too many tokens” error and the run fails.

![Image 1: Copilot Studio run failed because the tool returned a string that is too large to fit in the context window](/assets/posts/mcp-resources-as-tool-inputs/ignored-too-many.png){: .shadow w="972" h="589" }
_A run failing when the generate character tool returns 30,000 characters and the agent tries to keep them in context._

Sometimes Copilot Studio doesn't need every byte of that payload for reasoning, but you still need it downstream for reporting, notifications, or other tools.

So how do we keep the agent smart and fast without losing the data, if you can't edit the API itself?

---

## Option 1 (the usual way): filter at the connector boundary with C# custom code

Before we get to MCP resources (the real focus of this post), there's an important alternative worth mentioning.

If the tool that's producing too much data is a “normal” HTTP API or a Power Platform connector, you can:

1. Create a **custom connector**.
2. Add a bit of **C# code** that runs after the API call.
3. In C#, **filter / reshape / trim** the response before returning it to Copilot Studio.

Example:

- The upstream API returns a full user profile:
  - Name, surname, date of birth, address, phone, preferences, metadata, etc.
- Your Copilot Studio agent only needs:
  - Name, surname, date of birth

You can write a few lines of C# in the custom connector that:

- Run after the original API has been called
- Extract only the three required fields
- Return a tiny, cleaned JSON to Copilot Studio

This way:

- You don't touch the original microservice
- You drastically reduce the token footprint
- You avoid polluting or saturating the context window

> This pattern is great when you control the connector layer and can safely throw away fields the agent will never need.
{: .prompt-info }
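
To make the trimming step concrete, here is the filtering logic sketched in Python. In a real custom connector this would be a few lines of C# in the connector's code; the field names below are purely hypothetical:

```python
import json

# Hypothetical field names: the three values the agent actually needs.
KEEP_FIELDS = {"name", "surname", "dateOfBirth"}

def trim_profile(raw_response: str) -> str:
    """Reduce a full user-profile payload to only the fields the agent needs."""
    profile = json.loads(raw_response)
    trimmed = {k: v for k, v in profile.items() if k in KEEP_FIELDS}
    return json.dumps(trimmed)

# A full profile as the upstream API might return it (illustrative data).
full = json.dumps({
    "name": "Ada", "surname": "Lovelace", "dateOfBirth": "1815-12-10",
    "address": "...", "phone": "...", "preferences": {}, "metadata": {},
})
print(trim_profile(full))  # only name, surname, dateOfBirth survive
```

The C# version inside the connector follows the same shape: parse, pick the fields, re-serialize.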

However, sometimes:

- You need to keep the full data around for later tools
- You're working with MCP servers and want to stay in that ecosystem

That's where **MCP resources** shine.

---

## Option 2 (the new MCP way): use MCP resources instead of raw text

With MCP servers, you don't have to push every byte of data into the model's context if it isn't needed.

You can:

- Generate and store large outputs as **resources** on the MCP server
- Pass around **resource identifiers** between tools
- Only read and pull the data into the context window when absolutely required

Conceptually:

- A “resource” is a named, addressable piece of content managed by the MCP server
  - e.g., a file, a JSON blob, a string, a document
- Tools can:
  - Produce resources (create/store them)
  - Consume resources as inputs (by ID)
- The Copilot Studio orchestrator only sees:
  - Small metadata + resource IDs (a handful of tokens), **not** the entire raw payload.
  - If the orchestrator thinks it's useful, it can send a _read request_ to the MCP server and pull the resource into its context. But only when necessary, not by default.

This pattern changes the game for **tool-to-tool orchestration**:

- Tool A: creates or updates a resource and returns an ID
- Copilot Studio: keeps just the ID in the plan
- Tool B: receives the resource ID as input, loads the content on the MCP side, processes it, and returns **only the final summary / aggregated result** to the LLM

No massive JSON in the context window and no token-heavy strings slowing everything down.
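
The Tool A / Tool B handoff above can be sketched with a minimal in-memory stand-in for the server's resource store (plain Python, not the actual MCP SDK; all names are hypothetical):

```python
import uuid

# Stand-in for the MCP server's resource store: lives server-side,
# entirely outside the LLM's context window.
RESOURCES: dict[str, str] = {}

def tool_a_generate(n: int) -> str:
    """Tool A: produce a large payload, store it server-side, return only an ID."""
    payload = "x" * n
    resource_id = f"random-string-{uuid.uuid4().hex[:8]}"
    RESOURCES[resource_id] = payload
    return resource_id  # a handful of tokens, not n characters

def tool_b_count(resource_id: str) -> int:
    """Tool B: resolve the ID server-side and return only the final result."""
    return len(RESOURCES[resource_id])

rid = tool_a_generate(100_000)
print(tool_b_count(rid))  # prints 100000; the payload never left the "server"
```

The orchestrator only ever handles `rid` and the final integer; the 100,000-character string stays inside the store.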

---

## The sample scenario: random characters generator and counter

Let's use the sample MCP server from this repo:
[Github.com: Microsoft/CopilotStudioSamples](https://github.com/microsoft/CopilotStudioSamples/tree/main/MCPSamples/pass-resources-as-inputs)

It exposes two tools:

1. `generate_text`
   - Generates a random string with a configurable number of characters.
2. `count_characters`
   - Counts the number of characters of a given input.

### The naive approach (no resources used)

If we ignore MCP resources and just pass the text between the two tools:

1. The agent calls `generate_text`.
2. The tool returns a string of 100,000 characters.
3. The orchestrator feeds those 100,000 characters into the LLM's context.
4. The agent calls `count_characters`, passing the entire string as input.
5. The second tool counts the characters and returns a number.

In theory this could work. But in practice, with token-heavy inputs like this, the context window isn't big enough to hold the huge string: the run becomes heavy, slow, and can fail outright. That's exactly what you see in the failing screenshot above.

### The resource-based approach

Now let's do it the “MCP-native” way and use resources.

1. `generate_text` creates a **resource** with the random string (server-side).
2. The tool returns a **resource ID**, not the raw string.
   - e.g., `resourceId: "random-string-12345"`
3. Since the orchestrator knows there would be no advantage in reading the resource, it doesn't send a read request. The agent only needs to count the characters, so instead of passing the full string, it passes the **resource ID** to `count_characters`.
4. `count_characters`:
   - Uses the resource ID to fetch the content from the MCP server
   - Counts the characters server-side
   - Returns just the **count** (a small integer) to Copilot Studio

From Copilot Studio's point of view:

- The LLM never sees the 100,000 characters
- It only sees the ID and the final count
- The context window remains clean and small

![Image 2: Copilot Studio run succeeding by passing only the MCP resource ID to the second tool, not the full 100,000-character string](/assets/posts/mcp-resources-as-tool-inputs/working-resource.png){: .shadow w="972" h="589" }
_A run where the agent calls the same tools, but uses MCP resources. The context stays tiny and the plan completes successfully._

> The orchestrator can still decide to call `resources/read` on that ID if it genuinely needs to reason over the full content. The key is: it doesn't have to, and by default, it shouldn't for purely mechanical tasks like counting.
{: .prompt-tip }
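
To get a feel for the difference, here's a back-of-the-envelope comparison of the two flows' context footprint, using the rough 4-characters-per-token heuristic (an approximation, not a real tokenizer):

```python
# Rough context-footprint comparison between the naive flow (whole string
# enters the context) and the resource flow (only the ID does).

def approx_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

big_string = "x" * 100_000
resource_id = "random-string-12345"  # the ID from the example above

naive_cost = approx_tokens(big_string)       # the whole string in context
resource_cost = approx_tokens(resource_id)   # just the handle in context

print(naive_cost, resource_cost)  # 25000 4
```

Four orders of magnitude fewer tokens enter the context, before the model has done any reasoning at all.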

---

## How this looks in Copilot Studio

Assuming you've deployed the sample MCP server from the GitHub repo:

1. In Copilot Studio, go to your agent and open the **Tools / MCP** section.
2. Add the MCP server from the sample.
3. Make sure both tools are enabled:
   - `generate_text`
   - `count_characters`

Now when you test:

- Prompt:

> Generate a random string of 100,000 characters and then tell me how many characters it has.

Behind the scenes:

- The large string lives entirely on the MCP server
- Your Copilot Studio agent passes just one short ID between tools
- The LLM only sees small, structured inputs and outputs

---

## A more realistic example: training completion for 200,000 employees

The character-count example is intentionally simple. Let's apply the same pattern to something closer to a real-world scenario.

Imagine a **nightly autonomous agent** that:

1. Calls a training system MCP tool to retrieve the **training completion status** for 200,000 employees.
   - The tool returns a huge JSON: a list of employees + courses completed / not completed.
2. Passes that data into a **report generation** tool.
3. Calls another tool to **send reminders** to people who are not compliant.

Naive approach:

- The first tool returns a massive JSON
- The agent tries to keep it in the context window to “think” about it
- You immediately hit context limits, or at least degrade performance heavily

Resource-based approach:

1. The “get training status” tool writes the full JSON to a **resource**:
   - `resourceId: "training-status-2025-11-25"`
2. The tool returns only that ID and maybe some small metadata (e.g., counts).
3. The reporting tool receives the resource ID, loads the JSON from the MCP server, generates a **compact summary or aggregated report**, and returns just that summary to Copilot Studio.
4. The reminder tool receives another resource ID, perhaps filtered down to non-compliant employees, loads the JSON server-side, identifies who needs reminders, and sends them, again returning only a small result object.

In the latter flow:

- Copilot Studio never sees the full 200,000-record JSON
- The MCP server does all the heavy lifting
- The LLM works with small, meaningful artifacts (summaries, counts, statuses) instead of raw bulk data

This is another scenario where MCP resources shine: you need to keep large data around and feed it through several tools, but the LLM doesn't need to “read” every record.
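
Under the same assumptions as the earlier sketch (an in-memory stand-in for the server-side resource store; tool names and data shapes are hypothetical), the reporting step might look like:

```python
# Stand-in for the MCP server's resource store: the 200,000-record JSON
# lives here, never in the LLM's context.
RESOURCES: dict[str, list[dict]] = {}

def get_training_status() -> str:
    """Writes the full employee/course data as a resource, returns its ID."""
    records = [{"employee": f"emp-{i}", "completed": i % 3 != 0}
               for i in range(200_000)]  # synthetic data for the sketch
    rid = "training-status-2025-11-25"
    RESOURCES[rid] = records
    return rid

def build_report(resource_id: str) -> dict:
    """Reporting tool: loads the resource server-side, returns only aggregates."""
    records = RESOURCES[resource_id]
    done = sum(1 for r in records if r["completed"])
    return {"total": len(records), "completed": done,
            "non_compliant": len(records) - done}

rid = get_training_status()
print(build_report(rid))  # a three-field summary, not 200,000 records
```

The reminder tool would follow the same shape: resolve the ID, iterate server-side, return a small result object.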

---

## When can you pass resources between tools?

This pattern works under a few conditions:

- The tools are part of the **same MCP server** and:
  - The first tool can create or reference a resource
  - The second (and subsequent) tools accept resource IDs as inputs and are set up to read them
- Or, if you are using that resource in **another agent**, that agent should:
  - Have access to the same MCP server
  - Know how to consume that resource ID

In other words:

> You're not “teleporting” raw data between completely independent systems.
> You're passing handles (resource IDs) to a shared MCP backend that can resolve them.

Implementation details may vary by MCP server, but the idea is consistent:

- Define a convention for resource IDs
- Make sure your tool schemas expose inputs like `resourceId` (string) or match the MCP resource type
- In Copilot Studio, steer the orchestrator with instructions so it passes resource IDs instead of inlining content
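
As an illustration of the schema point, a resource-consuming tool's input schema might look roughly like this (a hypothetical shape, not the sample repo's actual schema):

```python
import json

# Hypothetical input schema for a resource-consuming tool: the only required
# input is the ID, so the orchestrator never needs to inline the content.
count_characters_schema = {
    "name": "count_characters",
    "inputSchema": {
        "type": "object",
        "properties": {
            "resourceId": {
                "type": "string",
                "description": "ID of a resource previously created by "
                               "generate_text on this MCP server",
            }
        },
        "required": ["resourceId"],
    },
}
print(json.dumps(count_characters_schema, indent=2))
```

A clear `description` on the `resourceId` input doubles as an instruction to the orchestrator about where that ID comes from.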

---

## Design guidelines and best practices

A few practical tips when applying this pattern:

- Whenever you're tempted to pass a big JSON or string between tools, ask:
  “Is this too heavy? Can this be a resource instead?”

- Minimize what enters the context window
  - Only pull resources into the LLM when you truly need the model to reason over them (e.g., summarizing or analyzing content).
  - For mechanical operations (counting, filtering, sending, moving), keep everything on the MCP side.

- Combine with connector-level filtering
  - If you're consuming APIs via Power Platform connectors, you can still use custom connectors + C# to pre-trim data.
  - Then, for what remains large, use MCP resources to avoid flooding the LLM.

- Use clear agent instructions
  - Explicitly tell the agent to pass resource IDs between specific tools.
  - Add guardrails like:
    > Do not expand or print the full contents of large resources unless the user explicitly asks for it.

- Monitor and iterate
  - Look at traces:
    - Are you seeing huge tool outputs injected into the context?
    - Can those be converted to resource-based flows instead?

---

## Try it yourself with the sample MCP server

To see this pattern live:

1. Clone the sample:
   [Github.com: Microsoft/CopilotStudioSamples](https://github.com/microsoft/CopilotStudioSamples/tree/main/MCPSamples/pass-resources-as-inputs)
2. Deploy the MCP server as described in the repo.
3. Connect it to your Copilot Studio agent.
4. Run your tests.

You'll see immediately how much healthier your runs are when the heavy data stays on the MCP side.

---

## Key takeaways

- Big tool outputs can **silently kill** your Copilot Studio agents by saturating the context window.
- A first mitigation is to use **custom connectors with C#** to filter API responses before they ever reach the LLM.
- For MCP servers, or for situations in which you don't want to filter because all the data is used by subsequent tools, the better long-term pattern is to **use resources**:
  - Tools write large outputs as resources
  - Agents pass around **resource IDs**
  - Other tools read and process those resources server-side
- This pattern scales from toy examples (like the 100,000 random characters used in this post) to serious workloads.

If you're building serious agents with heavy data flows, **resource-based orchestration might be an option** for reliability and performance.

For more background on MCP resources in Copilot Studio, see the [general tutorial here](https://microsoft.github.io/mcscatblog/posts/mcp-tools-resources/).