handle data parts in AIChat #900

Merged
threepointone merged 7 commits into main from store-data-parts on Feb 17, 2026

Conversation

@deathbyknowledge (Member) commented Feb 13, 2026

Add support for data-* stream parts in the shared message builder. Data parts are developer-defined typed JSON blobs that the AI SDK attaches to messages alongside text — they were previously silently dropped.

What changed

packages/ai-chat/src/message-builder.ts — Handle data-* chunk types in applyChunkToParts. This is the shared parser used by both the server (_streamSSEReply) and client (stream resume / cross-tab broadcast), so data parts are handled consistently everywhere. Transient parts (transient: true) are broadcast but not persisted. Non-transient parts support reconciliation by type+id — a second chunk with the same composite key updates .data in-place. A sketch of this logic follows this list.

packages/ai-chat/src/react.tsx — Forward the onData callback for data-* chunks on the stream resume and cross-tab broadcast codepaths. The transport path (new messages from this tab) already goes through the AI SDK's internal pipeline, which invokes onData. onData is the correct way to consume transient data parts on the client, since transient parts never appear in message.parts.

packages/ai-chat/src/tests/message-builder.test.ts — 9 new tests covering append, id preservation, type+id reconciliation, non-matching ids/types, id-less parts, transient skip, and coexistence with other part types.

examples/resumable-stream-chat/ — Updated to demonstrate three data part patterns: data-sources (reconciliation), data-thinking (transient via onData), and data-usage (persisted metadata). The client uses UIMessage<unknown, { sources; thinking; usage }> for fully typed data parts — no manual casts needed.
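For orientation, a rough sketch of the data-* branch in applyChunkToParts follows. It is illustrative only, not the exact implementation; the chunk and part shapes are simplified stand-ins for the real types:

    // Simplified stand-ins for the real chunk/part types (illustrative only).
    type DataChunk = {
      type: `data-${string}`;
      id?: string;
      data: unknown;
      transient?: boolean;
    };
    type DataPart = { type: `data-${string}`; id?: string; data: unknown };

    function applyDataChunk(parts: DataPart[], chunk: DataChunk): void {
      // Transient parts are broadcast to connected clients but never persisted.
      if (chunk.transient) return;

      // Reconcile by composite key (type + id): a later chunk with the same key
      // updates the existing part's data in place instead of appending a new part.
      const existing =
        chunk.id != null
          ? parts.find((p) => p.type === chunk.type && p.id === chunk.id)
          : undefined;

      if (existing) {
        existing.data = chunk.data;
      } else {
        parts.push({
          type: chunk.type,
          ...(chunk.id != null && { id: chunk.id }),
          data: chunk.data
        });
      }
    }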

What about the items from this comment?

  1. onData callback — Addressed. Extracted from options, stored in a ref, forwarded to useChat, and explicitly called for the resume/broadcast paths.
  2. File part size validation — Removed per "don't skip persistence" feedback. PR #899 ("ai-chat: architecture cleanup, bug fixes, storage guards, new features, docs rewrite") handles oversized rows with compression at the storage layer.
  3. Typed data parts on useAgentChat — Already works. The ChatMessage generic flows from useAgentChat through useChat end-to-end. dataPartSchemas is available via the inherited ChatInit options. The example now demonstrates this — part.type === "data-sources" narrows part.data to the correct shape automatically (a minimal client-side sketch follows this list).
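A minimal client-side sketch of the typed access described in item 3. The payload shapes below are hypothetical placeholders; only the UIMessage<unknown, {...}> generic usage and the narrowing on part.type mirror the example:

    import type { UIMessage } from "ai";

    // Hypothetical payload shapes for the example's three data parts.
    type MyDataParts = {
      sources: { status: "searching" | "found"; urls?: string[] };
      thinking: { text: string };
      usage: { inputTokens: number; outputTokens: number };
    };
    type MyMessage = UIMessage<unknown, MyDataParts>;

    function getSources(message: MyMessage) {
      for (const part of message.parts) {
        // The "data-sources" discriminant narrows part.data to MyDataParts["sources"].
        if (part.type === "data-sources") return part.data;
      }
      return undefined;
    }

Transient parts (data-thinking in the example) never appear in message.parts, so they are consumed via the onData callback that useAgentChat now forwards, rather than by iterating parts.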

Closes #575

@changeset-bot commented Feb 13, 2026

🦋 Changeset detected

Latest commit: 08fa18f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

Name                  Type
@cloudflare/ai-chat   Patch


@threepointone (Contributor)

I think I handled this in my pr, can you check?

@threepointone (Contributor)

Oh this is different

@pkg-pr-new commented Feb 13, 2026

Open in StackBlitz

npm i https://pkg.pr.new/cloudflare/agents@900

commit: 08fa18f

@deathbyknowledge (Member, Author)

Oh this is different

Yes. I think your PR is missing data parts, but you added compression for messages that are too large, whereas I added a simpler size check that skips persistence. Should I remove that?

deathbyknowledge marked this pull request as ready for review February 13, 2026 11:28
@threepointone (Contributor)

Yeah don't skip persistence

@whoiskatrin (Contributor)

/bonk review this PR

@ask-bonk commented Feb 13, 2026

Now I have a thorough understanding of the PR. Here's my review:


PR Review: Handle data parts in AIChat

Summary

This PR adds server-side persistence for data-* stream parts in AIChatAgent._streamSSEReply, which were previously silently dropped. It also adds a size guard on file parts and updates the resumable-stream-chat example to demonstrate three data part patterns.

Core SDK Changes (packages/ai-chat/src/index.ts)

The good:

  • The default case with data.type.startsWith("data-") guard is the right place to handle this — it's a catch-all that won't interfere with existing named cases.
  • The reconciliation logic (find-by-type+id, update in-place) correctly mirrors AI SDK client behavior for progressive updates.
  • Transient parts are correctly excluded from persistence while still being broadcast to clients (broadcast happens at line 1838-1848, after the switch).
  • The size guard on file parts is a sensible addition.

Issues:

  1. Skipping persistence on oversized parts contradicts maintainer feedback. @threepointone explicitly said "don't skip persistence" in the PR comments. The MAX_DATA_PART_SIZE_BYTES guard and the "skip persistence" behavior for both file parts and data parts need to be removed or reworked per that feedback. This is a blocker.

  2. Unsafe as cast on message.parts.push. At line 1813:

    message.parts.push({
      type: dataChunk.type,
      ...(dataChunk.id != null && { id: dataChunk.id }),
      data: dataChunk.data
    } as ChatMessage["parts"][number]);

    ChatMessage is UIMessage from the AI SDK, and UIMessage["parts"] has a fixed union type. A data-* part won't match any member of that union, so the as cast silences a real type mismatch. This will work at runtime (it's just an array push), but it's hiding a type system gap. Consider whether ChatMessage should be extended to include a generic data part shape, or add a comment explaining why the cast is necessary and safe.

  3. "id" in p check is fragile. At line 1800:

    const existing = message.parts.find(
      (p) =>
        p.type === dataChunk.type &&
        "id" in p &&
        (p as { id?: string }).id === dataChunk.id
    );

    The "id" in p check will also match other part types that happen to have an id field (e.g., reasoning parts have an id). Since you're already filtering by dataChunk.type (which starts with data-), and non-data parts won't share those type strings, this is safe in practice — but a comment noting that would help future readers.

  4. new TextEncoder().encode(...) on potentially large strings for size checking. This allocates a full UTF-8 byte array just to check .byteLength. For the 2MB threshold this is fine in practice, but new Blob([serialized]).size would be more memory-efficient if this ever matters. Not a blocker.
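    A quick illustration of the two size checks being compared (the part literal is made up):

    // Both measure the UTF-8 byte length of the serialized part; the Blob route
    // avoids materializing a full byte array just to read its length.
    const serialized = JSON.stringify({ type: "data-usage", data: { tokens: 123 } });
    const viaEncoder = new TextEncoder().encode(serialized).byteLength;
    const viaBlob = new Blob([serialized]).size;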

Example Changes (examples/resumable-stream-chat/)

The good:

  • Clean demonstration of all three patterns (reconciliation, transient, persisted metadata).
  • The client-side rendering with dedicated components (SourcesPart, ThinkingPart, UsagePart) is well-structured.

Issues:

  1. Data parts are written after writer.merge(result.toUIMessageStream()). In server.ts, the writer.merge() call returns immediately (it pipes asynchronously), so the data-sources "searching" write at line ~52 fires right away, likely before any text has been emitted. This means the sources card will appear above the text in the UI, which matches the client layout. But the "found" write at line ~58 happens after a 1500ms setTimeout, which is a race with the LLM stream — the sources update might arrive mid-text-stream or after it finishes depending on model speed. This is probably fine for a demo, but a comment noting the intentional race would be helpful.

  2. Hardcoded results in the example. The data-sources "found" results are always the same three strings regardless of the query. Fine for a demo, but worth a brief inline comment so users don't think real search is happening.

  3. The ThinkingPart transient part is written with transient: true but the client also guards on isStreaming. This is belt-and-suspenders — after reload the transient part won't be in persisted messages, AND the streaming check would hide it anyway. Not a bug, just redundancy worth noting.

Changeset

The changeset bumps both @cloudflare/agents-resumable-stream-chat (example) and @cloudflare/ai-chat (SDK) as patch. The SDK bump is appropriate. The example bump is unnecessary if examples aren't independently published, but it doesn't hurt.

Verdict

The main blocker is item #1 — the maintainer has asked to not skip persistence for oversized parts. That feedback needs to be addressed before merge. The approach should either be removed entirely (let SQLite fail naturally, or handle it at the storage layer with compression as @threepointone's PR does) or reworked to match whatever the agreed strategy is.

Everything else is minor and the core reconciliation/transient logic is correct.


@deathbyknowledge (Member, Author)

/bonk review again

@ask-bonk commented Feb 13, 2026

Good. Now I have a full picture. Let me write the review.


PR Review: Handle data parts in AIChat (re-review)

Previous blocker — resolved

The MAX_DATA_PART_SIZE_BYTES guard and skip-persistence behavior were removed in d8895f7, addressing @threepointone's feedback. Data parts are now always persisted (unless transient). This is correct.

Remaining issues

1. Stray whitespace change (nit)

The diff removes a blank line between CLEANUP_AGE_THRESHOLD_MS and const decoder at line 152. This is unrelated to the feature and will cause a noisy diff. Should be reverted.

2. Transient parts are still broadcast AND stored as stream chunks

Transient parts are correctly excluded from message.parts (the persisted message), but the code falls through to _storeStreamChunk at line 1806 — meaning the raw SSE chunk is stored in the replay buffer. If a client reconnects mid-stream, it will receive the transient chunk during replay, which is fine (the stream is still in-flight). But after the stream completes, the chunk stays in the replay buffer until cleanup. This is probably harmless, but worth a comment noting that transient parts are excluded from message persistence but are still included in the stream replay buffer (by design, since they're part of the SSE stream).

3. as ChatMessage["parts"][number] cast — unchanged from previous review

The cast at line 1777 is still there. UIMessage["parts"] is a closed union that doesn't include data-* shapes, so this as silences a real type error. It works at runtime, but if the AI SDK ever tightens the union or adds runtime validation on parts, this will break silently. At minimum, add a comment like:

// Cast needed because UIMessage["parts"] doesn't include data-* part shapes.
// Safe at runtime — parts is a plain array and data-* types are pass-through.

4. "id" in p type narrowing — unchanged from previous review

The find predicate at line 1764:

(p) =>
  p.type === dataChunk.type &&
  "id" in p &&
  (p as { id?: string }).id === dataChunk.id

The p.type === dataChunk.type guard (where dataChunk.type starts with data-) makes the "id" in p check safe in practice since no built-in part types share data-* prefixes. But the double cast ("id" in p then as { id?: string }) is awkward. A cleaner approach:

const existing = message.parts.find(
  (p) =>
    p.type === dataChunk.type &&
    "id" in p &&
    (p as Record<string, unknown>).id === dataChunk.id
);

Or just cast once at the top since you're already in the data-* branch. Minor nit, not a blocker.

Example changes — look good

  • The hardcoded results comment from the previous review was added (line ~75 in server.ts). Good.
  • The three patterns (reconciliation, transient, persisted) are clearly structured and well-commented.
  • The data-thinking part with transient: true on the server and isStreaming guard on the client is intentionally redundant (belt-and-suspenders), which is reasonable for a demo.

One observation: the writer.merge(result.toUIMessageStream()) call at line 62 starts piping the LLM stream, and the data-sources "searching" write at line 65 follows immediately. Since merge pipes asynchronously, the sources write interleaves with text chunks — the ordering depends on model latency. The demo happens to look fine because the client renders sources above text regardless of arrival order. No action needed, just noting it.
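To make the shape of that ordering concrete, here is a hedged sketch of the server-side pattern being described. The function name, model, and payloads are illustrative, and createUIMessageStream plus the type imports are assumptions about the AI SDK surface; writer.merge, writer.write, and toUIMessageStream are the calls the review refers to. This is not the example's actual server.ts:

    import { createUIMessageStream, streamText } from "ai";
    import type { LanguageModel, ModelMessage } from "ai";

    // Hedged sketch only; not the example's real code.
    export function buildChatStream(model: LanguageModel, messages: ModelMessage[]) {
      return createUIMessageStream({
        execute: async ({ writer }) => {
          const result = streamText({ model, messages });

          // merge() starts piping the LLM's UI message stream and returns immediately,
          writer.merge(result.toUIMessageStream());

          // so this data part is written right away and interleaves with the text chunks.
          writer.write({ type: "data-sources", id: "s1", data: { status: "searching" } });

          // A later write with the same type + id reconciles (updates) the same part.
          // The fixed delay races the LLM stream, exactly as noted above.
          setTimeout(() => {
            writer.write({
              type: "data-sources",
              id: "s1",
              data: { status: "found", urls: ["https://example.com"] }
            });
          }, 1500);
        }
      });
    }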

Verdict

No blockers remaining. The main feedback from the previous review (skip persistence) has been addressed. The remaining items are minor — the as cast comment (#3) and the stray whitespace (#1) are worth fixing before merge but shouldn't block it. The core reconciliation and transient logic is correct.


@threepointone (Contributor)

cursor told me this:

What's still NOT covered even after this PR:
Looking at the AI SDK's full data parts feature set:
onData callback on the client side — The AI SDK's useChat has an onData callback that fires for all data parts as they stream (including transient ones). Our useAgentChat hook doesn't expose this. The example works around it by checking isStreaming to conditionally render the transient thinking part, but that's a UI hack — once streaming ends, the part vanishes from the render, not because the framework removed it, but because the render check hides it. A proper onData callback would give users a clean way to handle transient parts (show toasts, update external state, etc.).
File part size validation — The PR description mentions it, but I don't see it in the diff for index.ts. The diff only shows the data-* handling in the default case.
Typed data parts on useAgentChat — The AI SDK supports useChat with typed data part schemas, so message.parts gives you type-safe access to your custom data-weather, data-sources, etc. The example uses manual type assertions (as DataPart). Our useAgentChat would need to pass through the generic type parameter to fully support this.

@threepointone (Contributor)

I'll review this later this evening, let's get it in shape by then?

@threepointone (Contributor)

what parts of that feedback I shared were addressed? could you update the pr description? could you add tests (unit/e2e as relevant) thanks! I'll review tomorrow morning before landing.

@deathbyknowledge (Member, Author)

what parts of that feedback I shared were addressed? could you update the pr description? could you add tests (unit/e2e as relevant) thanks! I'll review tomorrow morning before landing.

I've added tests too 🫡

Add a clarifying comment in message-builder explaining that chunk.data should always be provided, because data: undefined on a persisted part is dropped by JSON.stringify (resulting in no data field on reload). Add unit tests covering that transient parts with the same id are skipped (not reconciled) and that undefined data is handled (the part is persisted without a data field).
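For context on that commit, JSON.stringify silently drops object properties whose value is undefined, so a part persisted with data: undefined comes back from storage with no data field at all:

    const part = { type: "data-usage", data: undefined };

    // The data key disappears in the serialized form...
    JSON.stringify(part); // '{"type":"data-usage"}'

    // ...so after a storage round-trip the part simply has no data field.
    JSON.parse(JSON.stringify(part)); // { type: "data-usage" }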
@threepointone
Copy link
Contributor

love this, banger. will land once CI passes.

threepointone and others added 2 commits February 17, 2026 11:14
- Add Data Parts section to docs/chat-agents.md covering server-side
  writer.write(), three patterns (reconciliation, append, transient),
  typed client access, and onData callback
- Update docs/AGENTS.md to reflect that the sync-docs workflow has been
  removed — docs must be manually ported to cloudflare-docs

Co-authored-by: Cursor <cursoragent@cursor.com>
Adjust Markdown table spacing and column alignment in the "Three Patterns" section of docs/chat-agents.md to improve readability and rendering. No semantic changes to content or behavior.
threepointone merged commit 16b2dca into main on Feb 17, 2026
4 checks passed
threepointone deleted the store-data-parts branch February 17, 2026 11:20
github-actions bot mentioned this pull request Feb 17, 2026
taowen pushed a commit to taowen/agents that referenced this pull request Feb 17, 2026
* feat(ai-chat): handle data-* stream parts and forward onData callback (cloudflare#900)

* typed data parts in example

* add data parts in message builder test

* Clarify chunk.data and add tests

Add a clarifying comment in message-builder explaining that chunk.data should always be provided, because data: undefined on a persisted part is dropped by JSON.stringify (resulting in no data field on reload). Add unit tests covering that transient parts with the same id are skipped (not reconciled) and that undefined data is handled (the part is persisted without a data field).

* docs: add data parts section to chat-agents, update AGENTS.md

- Add Data Parts section to docs/chat-agents.md covering server-side
  writer.write(), three patterns (reconciliation, append, transient),
  typed client access, and onData callback
- Update docs/AGENTS.md to reflect that the sync-docs workflow has been
  removed — docs must be manually ported to cloudflare-docs

Co-authored-by: Cursor <cursoragent@cursor.com>

* Reformat table in chat-agents.md

Adjust Markdown table spacing and column alignment in the "Three Patterns" section of docs/chat-agents.md to improve readability and rendering. No semantic changes to content or behavior.

---------

Co-authored-by: Sunil Pai <spai@cloudflare.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

Development

Successfully merging this pull request may close these issues.

Custom data parts (data-*) not persisted to message history
