fix: avoid 500s when a2a cleanup is cancelled by lithammer · Pull Request #1268 · kagent-dev/kagent

lithammer · 2026-02-05T13:33:55Z

We hit a case where a client retried on 5xx responses, and the agent's Slack tool had already executed before the server returned 500. The request itself was effectively successful, but cleanup raised CancelledError, which bubbled into a 500 and triggered the retry. That led to duplicate Slack messages.
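To make the failure mode concrete, here is a minimal simulation, not the actual kagent or Slack code. The names `make_server`, `send_with_retry`, and the three-attempt retry policy are all hypothetical; the only point is that a side effect performed before a 5xx response is repeated on every retry.

```python
def make_server(fail_in_cleanup: bool):
    """Hypothetical server: performs the side effect, then reports a status."""
    sent = []

    def handle(message: str) -> int:
        sent.append(message)  # side effect (e.g. posting to Slack) happens first
        # Assumption: a failure during cleanup surfaces as a 500 to the client.
        return 500 if fail_in_cleanup else 200

    return handle, sent


def send_with_retry(handle, message: str, attempts: int = 3) -> int:
    """Hypothetical client that retries on any 5xx response."""
    status = 0
    for _ in range(attempts):
        status = handle(message)
        if status < 500:
            break
    return status


def demo(fail_in_cleanup: bool):
    handle, sent = make_server(fail_in_cleanup)
    status = send_with_retry(handle, "deploy finished")
    return status, len(sent)
```

With `fail_in_cleanup=True` the message is recorded once per attempt, even though the first attempt already did the work.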

This change treats CancelledError during _cleanup_producer as non‑fatal so a completed request doesn't get turned into a 500 during cleanup. The result is that clients don't retry after work has already been done, avoiding duplicate side effects.
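The idea can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the code in this PR and not the a2a-python API: `BaseHandler`, `SafeHandler`, the `closed` list, and the `_cleanup_producer` signature used here are stand-ins invented for the example.

```python
import asyncio


class BaseHandler:
    """Hypothetical stand-in for the upstream request handler."""

    def __init__(self) -> None:
        self.closed: list[str] = []

    async def _cleanup_producer(self, producer: asyncio.Task, task_id: str) -> None:
        # Awaiting a cancelled task re-raises CancelledError in the caller.
        await producer
        self.closed.append(task_id)


class SafeHandler(BaseHandler):
    """Treats cancellation during cleanup as non-fatal."""

    async def _cleanup_producer(self, producer: asyncio.Task, task_id: str) -> None:
        try:
            await super()._cleanup_producer(producer, task_id)
        except asyncio.CancelledError:
            # Best-effort cleanup instead of letting the error propagate.
            self.closed.append(task_id)


async def demo() -> list[str]:
    handler = SafeHandler()
    producer = asyncio.create_task(asyncio.sleep(60))
    producer.cancel()  # simulate the producer being cancelled before cleanup
    await handler._cleanup_producer(producer, "task-1")
    return handler.closed
```

One trade-off worth keeping in mind: `asyncio.CancelledError` is also how asyncio tells the current task that it has been cancelled, so swallowing it unconditionally can hide a genuine cancellation request from the caller.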

Copilot

Pull request overview

This PR prevents duplicate side effects when clients retry on 5xx responses by treating CancelledError during A2A cleanup as non-fatal. Previously, if cleanup was cancelled after a request completed successfully, the server would return a 500 error, triggering client retries and causing duplicate operations (e.g., multiple Slack messages).

Changes:

- Introduced SafeRequestHandler, which catches CancelledError during cleanup and performs best-effort resource cleanup instead of propagating the error
- Replaced DefaultRequestHandler with SafeRequestHandler in the A2A application
- Added unit tests verifying that cleanup handles both cancelled and successful task scenarios

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
python/packages/kagent-adk/src/kagent/adk/_safe_request_handler.py	New handler class that catches `CancelledError` during cleanup and performs manual resource cleanup
python/packages/kagent-adk/src/kagent/adk/_a2a.py	Updated to use `SafeRequestHandler` instead of `DefaultRequestHandler`
python/packages/kagent-adk/tests/unittests/test_safe_request_handler.py	Added tests for both cancelled and normal cleanup scenarios


python/packages/kagent-adk/src/kagent/adk/_safe_request_handler.py

Swallow CancelledError during cleanup so successful requests don't turn into 5xx responses. This prevents clients retrying after side effects already happened.

Signed-off-by: Peter Lithammer <peter.lithammer@embark-studios.com>

lithammer · 2026-02-06T08:43:31Z

python/packages/kagent-adk/src/kagent/adk/_safe_request_handler.py

+            # Make a best-effort attempt to clean up resources.
+            await self._queue_manager.close(task_id)
+            async with self._running_agents_lock:
+                self._running_agents.pop(task_id, None)


A bit unsure about this part.

lithammer · 2026-02-06T09:01:08Z

Or would you prefer I try to solve this upstream in https://github.com/a2aproject/a2a-python?

lithammer · 2026-02-06T11:09:20Z

> Or would you prefer I try to solve this upstream in https://github.com/a2aproject/a2a-python?

a2aproject/a2a-python#669

lithammer marked this pull request as ready for review February 5, 2026 13:57

lithammer requested review from EItanya, peterj and yuval-k as code owners February 5, 2026 13:57

Copilot AI review requested due to automatic review settings February 5, 2026 13:57

Copilot AI reviewed Feb 5, 2026


fix: avoid 500s when a2a cleanup is cancelled

abeed0d


lithammer force-pushed the handle-cancellation-during-cleanup branch from 0d98064 to abeed0d Compare February 5, 2026 14:04

lithammer wants to merge 1 commit into kagent-dev:main from lithammer:handle-cancellation-during-cleanup
