
If you’re using Claude Desktop in Bring-Your-Own-Bedrock mode, the obvious question is: what exactly gets sent to Amazon Bedrock on every turn?
I went through a real CloudWatch model-invocation export from a Claude Desktop session captured on May 5, 2026, and the answer is refreshingly un-mysterious: Claude Desktop sends a standard Anthropic Messages API payload over Bedrock, including the full system prompt, tool definitions, and conversation history on every request.
Scope note: This post is grounded in one observed Claude Desktop BYO-Bedrock session captured in CloudWatch on May 5, 2026. It explains what this deployment mode sends over the wire; it does not claim that every Anthropic product path behaves identically.
At a glance, every call is an `InvokeModelWithResponseStream` request carrying:

- a standard Anthropic payload: `system`, `messages`, `tools`, `max_tokens`, `temperature`, and related fields;
- `cache_control: { type: "ephemeral" }` markers, which reduce recompute and billing but do not hide the full prompt from Bedrock or CloudWatch model-invocation logs;
- for tools, only the `tool_use` request and the `tool_result` content that the client sends back later.

The biggest misconception is thinking Claude Desktop sends only the latest user message while Bedrock keeps the session state somewhere on the server side.
That is not what the observed request flow shows.
Claude Desktop treats each Bedrock call as a fresh, self-contained request. Every turn includes the full `system` array, the complete tool definitions, and the entire `messages` history up to that point.
In other words: Bedrock stores nothing between turns for this flow. The client rebuilds and re-sends the full context every time.
Here is the practical shape:
```
messages: [
  { role: user, content: [...] },
  { role: assistant, content: [...] },
  { role: user, content: [...] },
  { role: assistant, content: [...] },
  ...
]
```
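That rebuild-and-resend behavior can be sketched in a few lines of Python. This is a minimal illustration, not Claude Desktop's actual internals; `SYSTEM_BLOCKS`, `TOOLS`, and `build_request` are hypothetical names:

```python
# Sketch of a client that owns all session state itself and re-sends
# the full context to Bedrock on every turn.
SYSTEM_BLOCKS = [{"type": "text", "text": "You are a Claude agent..."}]
TOOLS = []  # tool definitions would be re-sent on every turn too

history = []  # the client, not Bedrock, keeps this between turns

def build_request(user_text):
    history.append(
        {"role": "user", "content": [{"type": "text", "text": user_text}]}
    )
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "system": SYSTEM_BLOCKS,
        "tools": TOOLS,
        "messages": list(history),  # full history, every single call
    }

req1 = build_request("hello")
# After the model replies, the client appends the assistant turn locally:
history.append(
    {"role": "assistant", "content": [{"type": "text", "text": "hi"}]}
)
req2 = build_request("what did I just say?")
# req2 now carries all three messages, not just the newest one.
```

The server never needs a session ID because there is no server-side session: the second request is intelligible entirely on its own.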
That is why Bedrock model-invocation logging is so useful here. If you enable it, you do not have to guess what the model saw.
On every observed call, the system array had three main blocks.
One block is a client-identification header:

```
x-anthropic-billing-header: cc_version=2.1.121.540; cc_entrypoint=claude-desktop-3p;
```
This identifies the client and the Claude Desktop BYO-Bedrock entrypoint. The important part is cc_entrypoint=claude-desktop-3p, which marks the traffic as Claude Desktop running against the customer’s own Bedrock account.
Another block opens the main agent system prompt:

```
You are a Claude agent, built on Anthropic's Claude Agent SDK.
```
The observed prompt was about 25K characters long and marked with `cache_control: { type: "ephemeral" }`.
According to the captured payload, this block includes the usual agent instructions you'd expect from a coding assistant.
The key point is not just the size. It is the visibility: this text is plain request content, not a hidden opaque layer outside Bedrock’s logging surface.
This is where the behavior gets interesting.
Claude Desktop uses prompt caching aggressively, but prompt caching does not mean “only a tiny delta is sent to Bedrock.” It means Bedrock can re-use previously cached prefix work instead of re-processing and re-billing the stable portion of the request.
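In the Messages API, the cache marker rides on a content block inside the request itself. A minimal sketch of how a stable system prefix gets marked (the prompt text here is a placeholder):

```python
# Mark the large, stable system prefix as cacheable. Everything up to
# and including the marked block becomes the reusable cached prefix.
system = [
    {
        "type": "text",
        "text": "You are a Claude agent, built on Anthropic's Claude Agent SDK. ...",
        "cache_control": {"type": "ephemeral"},
    }
]

request = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 4096,
    "system": system,
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "hi"}]}
    ],
}
```

Note that the marked text is still fully present in `request`; the marker changes how Bedrock processes and bills it, not what travels over the wire.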
The observed token pattern looked like this:
| Turn | input_tokens | cache_read_input_tokens | cache_creation_input_tokens |
|---|---|---|---|
| 1 | 6 | 0 | 50,983 |
| 2 | 1 | 50,983 | 285 |
| 3 | 1 | 51,268 | 306 |
| 4 | 1 | 51,574 | 113 |
| 5 | 1 | 51,687 | 2,923 |
| 6 | 1 | 54,825 | 441 |
| 7 | 1 | 55,266 | 159 |
The practical interpretation:
- Turn 1 writes the large stable prefix (~51K tokens) into the cache; every later turn reads it back, which is why `input_tokens` collapses to 1.
- The small per-turn `cache_creation_input_tokens` values reflect new history being folded into the cache.

But the most important operational takeaway is this:
Prompt caching changes billing and recomputation, not wire visibility.
The full prompt still belongs to the request envelope. If model-invocation logging is enabled, Bedrock can log what it received, subject to its body-size truncation limits.
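To make the billing effect concrete, here is a sketch that prices the table above in "input-token equivalents" using Anthropic's commonly published cache multipliers (reads at 0.1x the base input rate, writes at 1.25x). Treat the multipliers as assumptions to verify against your own price sheet:

```python
# Assumed cache multipliers: read = 0.1x base rate, write = 1.25x.
CACHE_READ_MULT = 0.10
CACHE_WRITE_MULT = 1.25

# (input_tokens, cache_read_input_tokens, cache_creation_input_tokens)
# per turn, taken from the observed session table above.
turns = [
    (6, 0, 50_983),
    (1, 50_983, 285),
    (1, 51_268, 306),
    (1, 51_574, 113),
    (1, 51_687, 2_923),
    (1, 54_825, 441),
    (1, 55_266, 159),
]

def billed_equiv(inp, read, write):
    # Effective cost of a cached turn, in base-input-token equivalents.
    return inp + read * CACHE_READ_MULT + write * CACHE_WRITE_MULT

def uncached_equiv(inp, read, write):
    # What the same turn would cost if every token were plain input.
    return inp + read + write

billed = sum(billed_equiv(*t) for t in turns)
uncached = sum(uncached_equiv(*t) for t in turns)
print(f"billed ~ {billed:,.0f} vs uncached ~ {uncached:,.0f} token equivalents")
```

Under these assumed multipliers, the seven-turn session bills roughly a quarter of what re-processing the full context from scratch every turn would cost, which is exactly what makes the resend-everything design affordable.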
Cache entries also expire: the request can lean on `ephemeral_5m` and `ephemeral_1h` style TTL behavior to survive longer idle periods.

Another useful clarification: Bedrock does not directly watch local tool execution.
What Bedrock sees is:

- the `tool_use` block in the assistant output
- the `tool_result` block that the client includes in the next request

That means the model can ask for Read, Edit, Bash, or MCP-style tools, but Bedrock itself only receives the structured content that the client chooses to send back in the next turn.
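The round trip is plain message construction: the assistant turn carries `tool_use`, and the client's next request appends a matching `tool_result`. A sketch (tool name, ID, and file path are illustrative):

```python
# The model asks for a tool inside its assistant turn...
assistant_turn = {
    "role": "assistant",
    "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "Read",
         "input": {"file_path": "/tmp/notes.txt"}},
    ],
}

# ...the client runs the tool locally, then decides what to send back.
local_output = "line one\nline two"  # produced on the client machine

next_user_turn = {
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01",
         "content": local_output},
    ],
}

# Bedrock only ever sees these two structured blocks -- never the
# local execution that produced local_output.
messages = [assistant_turn, next_user_turn]
```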
So the boundary is clean: tools execute locally, and the only thing that crosses the wire to Bedrock is the serialized `tool_result`.

Based on the observed session, a few things are worth calling out explicitly.
Local files, command output, and other machine state reach Bedrock only when the client forwards them as message content or a `tool_result`. This is an important distinction. Saying “Claude Desktop can use local files and tools” is not the same as saying “Bedrock automatically receives your local machine state.” The latter only happens when the client intentionally forwards content.
In the observed BYO-Bedrock flow, requests go straight to a Bedrock endpoint in your own AWS account, the payload is a standard Anthropic Messages request, and model-invocation logging can capture exactly what the model received.
That last point is why this deployment model is attractive to teams that care about observability. The request does not disappear into a black box you cannot inspect.
If you want proof in your own environment instead of trusting anybody’s blog post, do this:
1. Enable Bedrock model-invocation logging.
2. Run a short Claude Desktop session.
3. In the logged request bodies, inspect:
   - the full `system` array
   - the complete `messages` history
   - the `cache_control` markers
   - the `cc_entrypoint=claude-desktop-3p` header

Once you do that, the architecture becomes much easier to reason about. You can stop debating from screenshots and inspect the actual wire payload.
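Once the logs exist, those checks are a few lines of Python. This sketch assumes the Bedrock invocation-log shape in which the request body lands under `input.inputBodyJson`; verify the field names against your own log entries, and note the record below is a trimmed, hypothetical example:

```python
# A trimmed, hypothetical Bedrock model-invocation log record.
# Real records carry more fields (identity, region, requestId, output, ...).
record = {
    "operation": "InvokeModelWithResponseStream",
    "modelId": "anthropic.claude-example",
    "input": {
        "inputBodyJson": {
            "system": [
                {"type": "text",
                 "text": ("x-anthropic-billing-header: cc_version=2.1.121.540; "
                          "cc_entrypoint=claude-desktop-3p;")},
                {"type": "text", "text": "You are a Claude agent...",
                 "cache_control": {"type": "ephemeral"}},
            ],
            "messages": [
                {"role": "user",
                 "content": [{"type": "text", "text": "hello"}]},
            ],
        }
    },
}

def verify(rec):
    """Summarize the checks from the checklist above for one record."""
    body = rec["input"]["inputBodyJson"]
    blocks = body.get("system", [])
    system_text = " ".join(b.get("text", "") for b in blocks)
    return {
        "is_desktop_3p": "cc_entrypoint=claude-desktop-3p" in system_text,
        "has_cache_control": any("cache_control" in b for b in blocks),
        "message_count": len(body.get("messages", [])),
    }

summary = verify(record)
```

Pointing `verify` at real exported records (after `json.loads` on each log line) gives you the same yes/no answers for your own traffic.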
If you remember one thing from this post, let it be this:
Claude Desktop on BYO-Bedrock is not hiding some magical proprietary request shape.
It is sending a standard, inspectable Anthropic-style messages payload to Bedrock, re-sending full context on every turn, and relying on prompt caching to make that affordable.
That is great news if you care about transparency, logging, and understanding exactly what your AI tooling is doing inside your own AWS account.