@lokitoth lokitoth commented Jan 15, 2026

Motivation and Context

Subworkflows run into issues with Checkpointing and the Chat Protocol:

  • The concurrency rework made subtle changes in behaviour that introduced a hang when using subworkflows with ChatProtocol and streaming execution.
  • The ResetAsync() implementation in WorkflowHostExecutor improperly reset the joinContext: on checkpoint restore, the reset ran after the join context had already been attached at executor instantiation
  • Subworkflows cannot be used as the start node when hosted AsAgent(), because a Catch-All handler is not accepted as implementing the Chat Protocol
  • Subworkflows hit an ownership issue when used in non-concurrent mode after finishing a run

Also fixes:

  • When ChatMessages are output by executors that are not agents, there is no corresponding AgentResponseUpdate/AgentResponse event

Breaking Changes

  • [BREAKING CHANGE] It is possible to provide the wrong runId when resuming from CheckpointInfo (even though the run ID already exists on CheckpointInfo), which could lead to odd errors such as "unable to find checkpoint with ID=..."

Description

Multiple fixes targeting related capabilities around Subworkflows and Checkpointing, both when hosted AsAgent() and when run directly.

Also fixes:

  • Subworkflow outputs do not get output through the parent workflow (fixes .NET : Subworkflow Output behavior #2163)
  • When ChatMessages are output by executors that are not agents, there is no corresponding AgentRunUpdate/AgentRunResponse event

Likely relevant to #2419; I will validate and add a test in a separate PR.

Breaking Changes

  • [BREAKING CHANGE] Removes the runId argument from ResumeAsync() and ResumeStreamAsync()
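
To illustrate why a separate runId argument was error-prone, here is a minimal, self-contained sketch. Everything in it (the two-field CheckpointInfo record, CheckpointStore, ResumeOld, Resume) is a hypothetical stand-in written for this example and is not the framework's actual API; the only thing taken from this PR is the idea that the run ID already lives on CheckpointInfo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in: a checkpoint handle that already carries its run ID.
public sealed record CheckpointInfo(string RunId, string CheckpointId);

// Hypothetical in-memory store keyed by (run ID, checkpoint ID).
public sealed class CheckpointStore
{
    private readonly Dictionary<(string RunId, string CheckpointId), string> _states = new();

    public CheckpointInfo Save(string runId, string checkpointId, string state)
    {
        _states[(runId, checkpointId)] = state;
        return new CheckpointInfo(runId, checkpointId);
    }

    // Old shape: the caller repeats the run ID and can get it wrong.
    public string ResumeOld(CheckpointInfo checkpoint, string runId) =>
        Lookup(runId, checkpoint.CheckpointId);

    // New shape: the run ID is read from the checkpoint itself.
    public string Resume(CheckpointInfo checkpoint) =>
        Lookup(checkpoint.RunId, checkpoint.CheckpointId);

    private string Lookup(string runId, string checkpointId) =>
        _states.TryGetValue((runId, checkpointId), out var state)
            ? state
            : throw new KeyNotFoundException($"unable to find checkpoint with ID={checkpointId}");
}

public static class Program
{
    public static void Main()
    {
        var store = new CheckpointStore();
        CheckpointInfo checkpoint = store.Save("run-1", "cp-3", "state-at-step-3");

        // The single-argument shape cannot be called with a mismatched run ID.
        Console.WriteLine(store.Resume(checkpoint)); // prints "state-at-step-3"

        try
        {
            // Passing a run ID that does not match the checkpoint fails the lookup.
            store.ResumeOld(checkpoint, "run-2");
        }
        catch (KeyNotFoundException ex)
        {
            Console.WriteLine(ex.Message); // prints "unable to find checkpoint with ID=cp-3"
        }
    }
}
```

In this sketch the failure comes purely from the redundant argument disagreeing with the checkpoint, which is the class of mistake that dropping the parameter rules out.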

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings January 15, 2026 20:02
@lokitoth lokitoth added .NET workflows Related to Workflows in agent-framework breaking change Introduces changes that are not backward compatible and may require updates to dependent code. labels Jan 15, 2026
@lokitoth lokitoth changed the title .NET: [BREAKING CHANGE] fix: Subworkflows do not work well with Chat Protocol and Checkpointing .NET: [BREAKING] fix: Subworkflows do not work well with Chat Protocol and Checkpointing Jan 15, 2026

Copilot AI left a comment


Pull request overview

This PR fixes multiple issues related to subworkflows when combined with checkpointing and the chat protocol. The changes address concurrency issues, ownership problems, and output propagation when subworkflows are used in different execution modes (AsAgent and direct streaming).

Changes:

  • Removed the runId parameter from Resume methods as it's redundant (already contained in CheckpointInfo)
  • Fixed subworkflow ownership release to properly restore previous ownership state instead of always releasing to null
  • Added YieldOutputAsync support to ISuperStepJoinContext for propagating subworkflow outputs
  • Modified ChatProtocol validation to allow catch-all handlers as valid chat protocol executors
  • Fixed WorkflowHostExecutor reset logic to avoid detaching join context prematurely
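
The ownership change in the list above can be pictured with a small sketch. It is only one possible reading of "restore previous ownership state instead of always releasing to null": OwnershipSlot, Take, and Release are hypothetical names invented for this example, and the real ReleaseOwnershipAsync and targetOwnerToken in Workflow.cs may behave differently.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical stand-in for an ownership slot; not the framework's Workflow class.
public sealed class OwnershipSlot
{
    private object? _owner;

    public object? Owner => Volatile.Read(ref _owner);

    // Takes ownership only if the slot is currently held by expectedOwner (null = unowned).
    public void Take(object newOwner, object? expectedOwner = null)
    {
        object? seen = Interlocked.CompareExchange(ref _owner, newOwner, expectedOwner);
        if (!ReferenceEquals(seen, expectedOwner))
        {
            throw new InvalidOperationException("Slot is held by a different owner.");
        }
    }

    // Releases ownership held by owner and hands it to targetOwner.
    // Passing null for targetOwner leaves the slot unowned.
    public void Release(object owner, object? targetOwner = null)
    {
        object? seen = Interlocked.CompareExchange(ref _owner, targetOwner, owner);
        if (!ReferenceEquals(seen, owner))
        {
            throw new InvalidOperationException("Only the current owner can release.");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var slot = new OwnershipSlot();
        var host = new object(); // e.g. the executor hosting a subworkflow
        var run = new object();  // e.g. a single run of that subworkflow

        slot.Take(host);                      // host owns the slot
        slot.Take(run, expectedOwner: host);  // the run borrows ownership from the host
        slot.Release(run, targetOwner: host); // run ends: ownership goes back to the host

        Console.WriteLine(ReferenceEquals(slot.Owner, host)); // prints "True"
    }
}
```

If the run released to null instead, the host's later attempt to release or re-lend the slot would fail the owner check, which is the kind of symptom an "always release to null" policy would produce in this model.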

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| IWorkflowExecutionEnvironment.cs | Removed redundant runId parameter from Resume methods |
| InProcessExecution.cs | Updated static helper methods to remove runId parameter |
| InProcessExecutionEnvironment.cs | Removed runId parameter from ResumeRunAsync, now uses CheckpointInfo.RunId |
| WorkflowThread.cs | Added includeWorkflowOutputsInResponse support and new CreateUpdate overload for ChatMessage |
| WorkflowHostingExtensions.cs | Added includeWorkflowOutputsInResponse parameter to AsAgent extension |
| WorkflowHostAgent.cs | Plumbed through includeWorkflowOutputsInResponse and enabled catch-all for protocol validation |
| Workflow.cs | Modified ReleaseOwnershipAsync to accept targetOwnerToken for proper ownership chaining |
| InProcessRunnerContext.cs | Added ownership tracking fields and YieldOutputAsync implementation |
| WorkflowHostExecutor.cs | Fixed checkpoint manager initialization, join context detachment, and output forwarding |
| OutputMessagesExecutor.cs | Moved from nested class to top-level with explicit executor options |
| ChatProtocolExecutor.cs | Set explicit executor options to disable auto-send/yield |
| ChatProtocol.cs | Added allowCatchAll parameter for protocol validation |
| ProtocolDescriptor.cs | Added AcceptsAll property and constructor parameter |
| Executor.cs | Pass HasCatchAll to ProtocolDescriptor |
| ISuperStepJoinContext.cs | Added YieldOutputAsync method |
| AIAgentHostExecutor.cs | Removed blank line (formatting) |
| SubworkflowBinding.cs | Formatting adjustment for constructor parameters |
| Sample files | Updated to remove runId parameter and added new test case |
| TestRunContext.cs | Added QueuedOutputs and YieldOutputAsync test support |

@lokitoth lokitoth force-pushed the dev/dotnet_workflow/fix_subworkflow_checkpointing branch from 199a520 to da54134 Compare January 15, 2026 20:32
@lokitoth lokitoth force-pushed the dev/dotnet_workflow/fix_subworkflow_checkpointing branch from da54134 to f7b8b95 Compare January 16, 2026 14:28