What Claude Desktop Sends to Amazon Bedrock

5 mins
Published on 06 May 2026

If you’re using Claude Desktop in Bring-Your-Own-Bedrock mode, the obvious question is: what exactly gets sent to Amazon Bedrock on every turn?

I went through a real CloudWatch model-invocation export from a Claude Desktop session captured on May 5, 2026, and the answer is refreshingly un-mysterious: Claude Desktop sends a standard Anthropic Messages API payload over Bedrock, including the full system prompt, tool definitions, and conversation history on every request.

Scope note: This post is grounded in one observed Claude Desktop BYO-Bedrock session captured in CloudWatch on May 5, 2026. It explains what this deployment mode sends over the wire; it does not claim that every Anthropic product path behaves identically.

TL;DR - What Leaves Your Machine

  • Claude Desktop talks to Bedrock through InvokeModelWithResponseStream.
  • The payload shape is the standard Anthropic Messages API: system, messages, tools, max_tokens, temperature, and related fields.
  • The API is stateless, so the full conversation context is re-sent on every turn.
  • Prompt caching with cache_control: { type: "ephemeral" } reduces recompute and billing, but it does not hide the full prompt from Bedrock or CloudWatch model-invocation logs.
  • Tool execution happens locally on the client; Bedrock only sees the tool_use request and the tool_result content that the client sends back later.
  • In BYO-Bedrock mode, the request lands in your AWS account and region.

The Mental Model Most People Get Wrong

The biggest misconception is thinking Claude Desktop sends only the latest user message while Bedrock keeps the session state somewhere on the server side.

That is not what the observed request flow shows.

Claude Desktop treats each Bedrock call as a fresh, self-contained request. That means every turn includes:

  • the system prompt
  • the available tool definitions
  • the full message history
  • the latest user input or latest tool result

In other words: Bedrock stores nothing between turns for this flow. The client rebuilds and re-sends the full context every time.

Here is the practical shape:

messages: [
  { role: "user",      content: [...] },
  { role: "assistant", content: [...] },
  { role: "user",      content: [...] },
  { role: "assistant", content: [...] },
  ...
]
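
To make that concrete, here is a minimal sketch of what a single turn could look like through boto3. The model ID, region, message text, and system block are placeholders, and the body is trimmed to the fields discussed in this post, so treat it as an approximation of the observed payload rather than a reproduction of it.

import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # your region

# The client keeps the history locally and re-sends all of it on every turn.
history = [
    {"role": "user", "content": [{"type": "text", "text": "First question"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "First answer"}]},
    {"role": "user", "content": [{"type": "text", "text": "Follow-up question"}]},
]

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "system": [{"type": "text", "text": "Stand-in for the real system prompt."}],
    "messages": history,  # full history, not just the newest message
}

response = bedrock.invoke_model_with_response_stream(
    modelId="anthropic.claude-sonnet-4-20250514-v1:0",  # placeholder model ID
    body=json.dumps(body),
)

# The completion comes back as a stream of JSON chunks.
for event in response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    print(chunk.get("type"))

Every later turn appends the assistant reply and the next user input to the local history and sends the whole thing again.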

That is why Bedrock model-invocation logging is so useful here. If you enable it, you do not have to guess what the model saw.

What the System Prompt Actually Contains

On every observed call, the system array had three main blocks.

1. A billing and entrypoint tag

x-anthropic-billing-header: cc_version=2.1.121.540; cc_entrypoint=claude-desktop-3p;

This identifies the client and the Claude Desktop BYO-Bedrock entrypoint. The important part is cc_entrypoint=claude-desktop-3p, which marks the traffic as Claude Desktop running against the customer’s own Bedrock account.

2. A short agent identity line

You are a Claude agent, built on Anthropic's Claude Agent SDK.

3. The large Claude Agent SDK system prompt

The observed prompt was about 25K characters long and marked with cache_control: ephemeral.

According to the captured payload, this block includes the usual agent instructions you’d expect from a coding assistant:

  • tool-use rules
  • output-formatting rules
  • permission-mode behavior
  • security and refusal guidance
  • URL guardrails
  • context-management guidance
  • model metadata

The key point is not just the size. It is the visibility: this text is plain request content, not a hidden opaque layer outside Bedrock’s logging surface.
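
Put together, the observed system array looks roughly like the sketch below. The wrapping into type/text blocks and the abbreviated third entry are my reading of the captured payload, not an official schema:

system = [
    {
        "type": "text",
        "text": "x-anthropic-billing-header: cc_version=2.1.121.540; cc_entrypoint=claude-desktop-3p;",
    },
    {
        "type": "text",
        "text": "You are a Claude agent, built on Anthropic's Claude Agent SDK.",
    },
    {
        "type": "text",
        "text": "<~25K characters of Claude Agent SDK instructions>",
        "cache_control": {"type": "ephemeral"},  # marks the block as a cacheable prefix
    },
]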

Prompt Caching Saves Money, Not Visibility

This is where the behavior gets interesting.

Claude Desktop uses prompt caching aggressively, but prompt caching does not mean “only a tiny delta is sent to Bedrock.” It means Bedrock can re-use previously cached prefix work instead of re-processing and re-billing the stable portion of the request.

The observed token pattern looked like this:

Turn   input_tokens   cache_read_input_tokens   cache_creation_input_tokens
1      6              0                         50,983
2      1              50,983                    285
3      1              51,268                    306
4      1              51,574                    113
5      1              51,687                    2,923
6      1              54,825                    441
7      1              55,266                    159

The practical interpretation:

  • Turn 1 writes the big reusable prefix into cache.
  • Later turns read that prefix back from cache.
  • The small later cache_creation_input_tokens values reflect new history being folded into the cache.

But the most important operational takeaway is this:

Prompt caching changes billing and recomputation, not wire visibility.

The full prompt is still sent in the request body. If model-invocation logging is enabled, Bedrock can log what it received, subject to its body-size truncation limits.
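
If you want to watch those counters live rather than pulling them out of CloudWatch afterwards, the streamed response carries them too. A minimal sketch, assuming the standard Anthropic streaming format in which the message_start event carries a usage block (field names as seen in the Messages API; confirm them against your own capture):

import json

def print_cache_usage(response):
    """Print token accounting from a streamed Bedrock invocation (see the earlier sketch)."""
    for event in response["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "message_start":
            usage = chunk["message"]["usage"]
            uncached = usage.get("input_tokens", 0)
            read = usage.get("cache_read_input_tokens", 0)
            created = usage.get("cache_creation_input_tokens", 0)
            # The full context the model saw is roughly the sum of all three buckets.
            print(f"uncached={uncached} cache_read={read} cache_created={created} "
                  f"total~{uncached + read + created}")

Applied to turn 7 in the table above, that arithmetic still works out to roughly 55K tokens of context, even though input_tokens is just 1.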

Two More Caching Details That Matter

  • Cache keys are prefix-sensitive. A single change near the top of the prompt can invalidate the whole cached prefix.
  • The observed flow used ephemeral_5m and ephemeral_1h style TTL behavior to survive longer idle periods (see the sketch below).
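
On that second point: recent versions of the Anthropic API let a cache_control block carry an explicit TTL. Whether Claude Desktop expresses it exactly this way, and whether your Bedrock model version needs a beta flag for it, are assumptions on my part; the shape would be roughly:

# Hypothetical longer-lived cache block; the ttl value and any required beta flag
# are assumptions -- check Anthropic's prompt-caching docs and your own capture.
long_lived_block = {
    "type": "text",
    "text": "<large stable prefix>",
    "cache_control": {"type": "ephemeral", "ttl": "1h"},
}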

Tool Calls Are Local Until You Send the Result Back

Another useful clarification: Bedrock does not directly watch local tool execution.

What Bedrock sees is:

  1. A tool_use block in the assistant output
  2. A later tool_result block that the client includes in the next request

That means the model can ask for Read, Edit, Bash, or MCP-style tools, but Bedrock itself only receives the structured content that the client chooses to send back in the next turn.
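
Concretely, the round trip looks something like the following. The tool_use and tool_result block types are the standard Anthropic Messages API shapes; the IDs, tool arguments, and file contents are invented for illustration:

# Turn N: the assistant's output asks the client to run a tool.
assistant_turn = {
    "role": "assistant",
    "content": [
        {
            "type": "tool_use",
            "id": "toolu_01",                            # hypothetical ID
            "name": "Read",
            "input": {"file_path": "/tmp/example.txt"},  # hypothetical arguments
        },
    ],
}

# The client executes the tool locally, then includes the result in turn N+1.
next_user_turn = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": "toolu_01",
            "content": "contents of /tmp/example.txt, as read by the local client",
        },
    ],
}

# Bedrock sees these two structured blocks; it never touches the file itself.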

So the boundary looks like this:

  • local execution: runs on the client
  • Bedrock visibility: sees the tool request and the returned result payload
  • no direct Bedrock access to your filesystem or terminal unless the client forwards content into tool_result

What Is Not Sent

Based on the observed session, a few things are worth calling out explicitly.

  • No raw filesystem contents are sent unless the client first reads them and includes them in a later tool_result.
  • No AWS credentials or secret tokens were visible in the observed request structure.
  • The billing header is a plain identifier, not an auth secret.
  • Cache, logs, and billing stay scoped to the customer’s AWS account and region in this BYO-Bedrock path.

This is an important distinction. Saying “Claude Desktop can use local files and tools” is not the same as saying “Bedrock automatically receives your local machine state.” The latter only happens when the client intentionally forwards content.

What Actually Reaches the Customer’s AWS Account

In the observed BYO-Bedrock flow:

  • Inbound to Bedrock: the request body over TLS, including system prompt, tools, and conversation history
  • Outbound from Bedrock: the streaming completion back to the desktop client
  • CloudWatch visibility: the request/response envelope, assuming model-invocation logging is enabled

That last point is why this deployment model is attractive to teams that care about observability. The request does not disappear into a black box you cannot inspect.
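
Enabling that logging is a one-time, per-account-and-region setting. A minimal sketch using boto3, assuming you already have a log group and an IAM role Bedrock can write with (parameter names follow the boto3 bedrock control-plane client; double-check them against the current docs):

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")  # control plane, not bedrock-runtime

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/bedrock/model-invocations",                    # your log group
            "roleArn": "arn:aws:iam::123456789012:role/BedrockLoggingRole",  # your role
        },
        "textDataDeliveryEnabled": True,  # include prompt/completion text in the logs
    }
)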

How To Verify This Yourself

If you want proof in your own environment instead of trusting anybody’s blog post, do this:

  1. Enable Bedrock model-invocation logging to CloudWatch Logs or S3 in the relevant AWS account and region.
  2. Run Claude Desktop against Bedrock in BYO mode.
  3. Inspect the captured request JSON.
  4. Look specifically for:
    • the system array
    • the full messages history
    • cache_control markers
    • token-accounting fields
    • the cc_entrypoint=claude-desktop-3p header

Once you do that, the architecture becomes much easier to reason about. You can stop debating from screenshots and inspect the actual wire payload.
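
If the logs land in CloudWatch Logs, one quick way to pull just the Claude Desktop invocations is to filter on the entrypoint marker. A rough sketch, assuming the log group name from the earlier sketch and that the billing header appears verbatim in the logged request body:

import json
import boto3

logs = boto3.client("logs", region_name="us-east-1")

events = logs.filter_log_events(
    logGroupName="/bedrock/model-invocations",          # your log group
    filterPattern='"cc_entrypoint=claude-desktop-3p"',  # literal string match
)

for event in events["events"]:
    record = json.loads(event["message"])
    # The exact log schema is AWS-defined; print the record and inspect the
    # system, messages, cache_control, and token-accounting fields yourself.
    print(json.dumps(record, indent=2)[:2000])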

Final Thought

If you remember one thing from this post, let it be this:

Claude Desktop on BYO-Bedrock is not hiding some magical proprietary request shape.

It is sending a standard, inspectable Anthropic-style messages payload to Bedrock, re-sending full context on every turn, and relying on prompt caching to make that affordable.

That is great news if you care about transparency, logging, and understanding exactly what your AI tooling is doing inside your own AWS account.
