
Conversation


@clykins90 clykins90 commented Jan 27, 2026


Summary

When using preemptive generation with Gemini 3, client-initiated request
cancellations (HTTP 499) can occur frequently as users interrupt the agent. These cancellations don't
indicate that the LLM service is unhealthy - they're normal behavior when speculative requests are no
longer needed.

Previously, any exception in the fallback adapter would mark the LLM as unavailable, triggering
unnecessary fallback cascades even though the primary LLM was functioning correctly and simply
returning 499 status codes for requests cancelled mid-stream.

Changes

  • Added a check for APIStatusError with status code 499 before marking the LLM as unavailable (see the sketch below)
  • Client cancellations are now handled gracefully without affecting LLM availability status
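
For illustration, a minimal sketch of the check, using only the names visible in the diff plus a stand-in status record; the import path and helper function are assumptions, not the exact adapter code:

from dataclasses import dataclass

from livekit.agents import APIStatusError  # import path assumed for this sketch


@dataclass
class _LLMStatus:
    """Stand-in for the adapter's per-LLM availability record."""
    available: bool = True


def handle_generation_error(e: Exception, llm_status: _LLMStatus) -> None:
    """Mark the LLM unavailable unless the error is a client cancellation (HTTP 499)."""
    is_client_cancellation = isinstance(e, APIStatusError) and e.status_code == 499

    if llm_status.available and not is_client_cancellation:
        llm_status.available = False  # only genuine failures affect availability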

Test plan

  • Tested locally with preemptive generation enabled using Gemini 3
  • Verified 499 errors no longer trigger "LLM unavailable" warnings
  • Confirmed the fallback adapter continues using the primary LLM after 499 errors (a unit-test sketch of this check follows below)
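
Not part of this change, but a rough sketch of how the 499 path could also be asserted in a unit test; the APIStatusError constructor arguments used here are assumed, not verified against the library:

from livekit.agents import APIStatusError  # import path assumed


def marks_llm_unavailable(e: Exception) -> bool:
    """Mirror of the new check: only non-499 errors should affect availability."""
    return not (isinstance(e, APIStatusError) and e.status_code == 499)


def test_client_cancellation_keeps_llm_available() -> None:
    cancelled = APIStatusError("request cancelled", status_code=499)  # keyword name assumed
    assert marks_llm_unavailable(cancelled) is False


def test_server_error_marks_llm_unavailable() -> None:
    failed = APIStatusError("upstream error", status_code=500)
    assert marks_llm_unavailable(failed) is True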

Before/After

Before fix: 499 error → LLM marked unavailable → fallback cascade triggered
After fix: 499 error → LLM remains available → no unnecessary fallback

Summary by CodeRabbit

  • Bug Fixes
    • Improved fallback LLM error handling to distinguish between client-initiated cancellations and actual service failures. Client cancellations no longer incorrectly mark the LLM provider as unavailable.

✏️ Tip: You can customize this high-level summary in your review settings.

…llation

When using preemptive generation, client-initiated request cancellations
(HTTP 499) can occur frequently as users interrupt the agent. These
cancellations don't indicate that the LLM service is unhealthy - they're
normal behavior when speculative requests are no longer needed.

Previously, any exception in the fallback adapter would mark the LLM as
unavailable, triggering unnecessary fallback cascades even though the
primary LLM was functioning correctly.

This change checks if the exception is an APIStatusError with status
code 499 before marking the LLM as unavailable. Client cancellations
are now handled gracefully without affecting LLM availability status.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Court Lykins seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Contributor

coderabbitai bot commented Jan 27, 2026

📝 Walkthrough


Modified exception handling in FallbackLLMStream._run to distinguish between client-initiated cancellations (APIStatusError with status_code 499) and other errors. Client cancellations no longer mark the LLM as unavailable; other exceptions trigger unavailability marking and emit a notification.

Changes

  • Cohort: Exception handling logic
    File(s): livekit-agents/livekit/agents/llm/fallback_adapter.py
    Summary: Added APIStatusError import and modified FallbackLLMStream._run to conditionally mark the LLM unavailable based on exception type, skipping unavailability marking for client-initiated cancellations (status_code 499).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A client says "stop," with code 499,
And once we hear it, we're feeling fine—
No blacklist needed, just let it be,
While other errors? Marked unavailable, you see!
Smart fallbacks hopping, errors in place, ✨

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change: handling HTTP 499 client cancellations in the fallback adapter without marking the LLM unavailable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/llm/fallback_adapter.py`:
- Around lines 262-269: fix the typo in the comment on the except block that
handles errors from the LLM call (the block that checks is_client_cancellation,
APIStatusError, and llm_status.available): change the reference from
"_try_synthesize" to the correct method name "_try_generate" so the comment
matches the method actually used.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02c11d3 and 5e09976.

📒 Files selected for processing (1)
  • livekit-agents/livekit/agents/llm/fallback_adapter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-agents/livekit/agents/llm/fallback_adapter.py
🧠 Learnings (1)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.

Applied to files:

  • livekit-agents/livekit/agents/llm/fallback_adapter.py
🧬 Code graph analysis (1)
livekit-agents/livekit/agents/llm/fallback_adapter.py (1)
livekit-agents/livekit/agents/_exceptions.py (3)
  • APIConnectionError (84-88)
  • APIError (14-42)
  • APIStatusError (45-81)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
  • GitHub Check: unit-tests
🔇 Additional comments (2)
livekit-agents/livekit/agents/llm/fallback_adapter.py (2)

10-10: LGTM!

The import of APIStatusError is correctly added and necessary for the new 499 status code check.


265-274: The 499 handling logic is correct.

The implementation properly detects client-initiated cancellations and preserves LLM availability. One consideration: after a 499, the loop continues to try subsequent LLMs. If the request was intentionally cancelled (e.g., user interruption), attempting other LLMs may be unnecessary. If this is a concern, you could re-raise the exception for 499 to stop the fallback cascade entirely:

if is_client_cancellation:
    raise  # Don't attempt other LLMs for intentional cancellations

However, if the current behavior is intentional (letting higher-level code handle the cancellation while the adapter continues), the implementation is fine as-is.
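
For context, a rough sketch of where such an early exit would sit in the retry loop; the loop shape and the _try_generate signature are assumptions, not the actual _run body:

async def _run_sketch(self):
    for llm_status in self._llm_status:  # assumed iteration over the configured LLMs
        try:
            return await self._try_generate(llm_status.llm)  # signature assumed
        except Exception as e:  # exceptions already logged inside _try_generate
            is_client_cancellation = isinstance(e, APIStatusError) and e.status_code == 499

            if is_client_cancellation:
                raise  # intentional cancellation: don't cascade to other LLMs

            if llm_status.available:
                llm_status.available = False  # genuine failure: try the next LLM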

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +262 to +269
                except Exception as e:  # exceptions already logged inside _try_synthesize
                    # Don't mark LLM unavailable if this was a client-initiated cancellation (499)
                    # The LLM service is healthy - we just cancelled the request
                    is_client_cancellation = (
                        isinstance(e, APIStatusError) and e.status_code == 499
                    )

                    if llm_status.available and not is_client_cancellation:
Contributor


⚠️ Potential issue | 🟡 Minor

Minor typo in comment: "_try_synthesize" should be "_try_generate".

The comment references a non-existent method.

Proposed fix
-                except Exception as e:  # exceptions already logged inside _try_synthesize
+                except Exception as e:  # exceptions already logged inside _try_generate
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-                except Exception as e:  # exceptions already logged inside _try_synthesize
-                    # Don't mark LLM unavailable if this was a client-initiated cancellation (499)
-                    # The LLM service is healthy - we just cancelled the request
-                    is_client_cancellation = (
-                        isinstance(e, APIStatusError) and e.status_code == 499
-                    )
-                    if llm_status.available and not is_client_cancellation:
+                except Exception as e:  # exceptions already logged inside _try_generate
+                    # Don't mark LLM unavailable if this was a client-initiated cancellation (499)
+                    # The LLM service is healthy - we just cancelled the request
+                    is_client_cancellation = (
+                        isinstance(e, APIStatusError) and e.status_code == 499
+                    )
+                    if llm_status.available and not is_client_cancellation:
🤖 Prompt for AI Agents
In `@livekit-agents/livekit/agents/llm/fallback_adapter.py` around lines 262-269,
fix the typo in the comment on the except block that handles errors from the LLM
call (the block that checks is_client_cancellation, APIStatusError, and
llm_status.available): change the reference from "_try_synthesize" to the correct
method name "_try_generate" so the comment matches the method actually used.
