fix(fallback_adapter): don't mark LLM unavailable on 499 client cancellation gemini 3 #4632
base: main
Conversation
fix(fallback_adapter): don't mark LLM unavailable on 499 client cancellation

When using preemptive generation, client-initiated request cancellations (HTTP 499) can occur frequently as users interrupt the agent. These cancellations don't indicate that the LLM service is unhealthy - they're normal behavior when speculative requests are no longer needed.

Previously, any exception in the fallback adapter would mark the LLM as unavailable, triggering unnecessary fallback cascades even though the primary LLM was functioning correctly.

This change checks if the exception is an APIStatusError with status code 499 before marking the LLM as unavailable. Client cancellations are now handled gracefully without affecting LLM availability status.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Court Lykins seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
📝 Walkthrough

Modified exception handling in FallbackLLMStream._run to distinguish between client-initiated cancellations (APIStatusError with status_code 499) and other errors. Client cancellations no longer mark the LLM as unavailable; other exceptions trigger unavailability marking and emit a notification.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/llm/fallback_adapter.py`:
- Around lines 262-269: Fix the typo in the comment on the except block that handles errors from the LLM call: change the reference from "_try_synthesize" to the correct method name "_try_generate" (the block that checks is_client_cancellation, APIStatusError, and llm_status.available), so the comment matches the method actually used.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
livekit-agents/livekit/agents/llm/fallback_adapter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py:
- Format code with ruff
- Run ruff linter and auto-fix issues
- Run mypy type checker in strict mode
- Maintain line length of 100 characters maximum
- Ensure Python 3.9+ compatibility
- Use Google-style docstrings
Files:
livekit-agents/livekit/agents/llm/fallback_adapter.py
🧠 Learnings (1)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.
Applied to files:
livekit-agents/livekit/agents/llm/fallback_adapter.py
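As a hedged illustration of that learning (resolve_active_llm is a hypothetical helper, not part of the library; ctx is assumed to be a livekit-agents RunContext-like object):

```python
def resolve_active_llm(ctx):
    """Return the LLM actually handling the current interaction.

    ctx.session.llm may be a fallback wrapper (e.g. a FallbackAdapter),
    so capability checks should use the activity's LLM instead,
    per the learning above.
    """
    return ctx.session.current_agent._get_activity_or_raise().llm
```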
🧬 Code graph analysis (1)
livekit-agents/livekit/agents/llm/fallback_adapter.py (1)
livekit-agents/livekit/agents/_exceptions.py (3)
APIConnectionError (84-88), APIError (14-42), APIStatusError (45-81)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
- GitHub Check: unit-tests
🔇 Additional comments (2)
livekit-agents/livekit/agents/llm/fallback_adapter.py (2)
10-10: LGTM! The import of APIStatusError is correctly added and necessary for the new 499 status code check.
265-274: The 499 handling logic is correct.

The implementation properly detects client-initiated cancellations and preserves LLM availability. One consideration: after a 499, the loop continues to try subsequent LLMs. If the request was intentionally cancelled (e.g., user interruption), attempting other LLMs may be unnecessary. If this is a concern, you could re-raise the exception for 499 to stop the fallback cascade entirely:

```python
if is_client_cancellation:
    raise  # Don't attempt other LLMs for intentional cancellations
```

However, if the current behavior is intentional (letting higher-level code handle the cancellation while the adapter continues), the implementation is fine as-is.
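To make that alternative concrete, here is a minimal runnable sketch of the re-raise behavior; run_with_fallback, llms, attempt, and the status objects are stand-ins for FallbackLLMStream internals, not the committed implementation:

```python
from livekit.agents import APIStatusError

def run_with_fallback(llms, attempt):
    """Try each (llm, status) pair in order; re-raise on client cancellation."""
    for llm, status in llms:
        try:
            return attempt(llm)
        except APIStatusError as e:
            if e.status_code == 499:
                raise  # client cancelled: don't cascade to other LLMs
            status.available = False  # provider error: mark unavailable
        except Exception:
            status.available = False
    raise RuntimeError("all LLMs failed")
```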
The diff context under review:

```python
except Exception as e:  # exceptions already logged inside _try_synthesize
    # Don't mark LLM unavailable if this was a client-initiated cancellation (499)
    # The LLM service is healthy - we just cancelled the request
    is_client_cancellation = (
        isinstance(e, APIStatusError) and e.status_code == 499
    )

    if llm_status.available and not is_client_cancellation:
```
Minor typo in comment: "_try_synthesize" should be "_try_generate".
The comment references a non-existent method.
Proposed fix

```diff
- except Exception as e:  # exceptions already logged inside _try_synthesize
+ except Exception as e:  # exceptions already logged inside _try_generate
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
except Exception as e:  # exceptions already logged inside _try_generate
    # Don't mark LLM unavailable if this was a client-initiated cancellation (499)
    # The LLM service is healthy - we just cancelled the request
    is_client_cancellation = (
        isinstance(e, APIStatusError) and e.status_code == 499
    )
    if llm_status.available and not is_client_cancellation:
```
Body:
Summary
When using preemptive generation with Gemini 3, client-initiated request
cancellations (HTTP 499) can occur frequently as users interrupt the agent. These cancellations don't
indicate that the LLM service is unhealthy - they're normal behavior when speculative requests are no
longer needed.
Previously, any exception in the fallback adapter would mark the LLM as unavailable, triggering
unnecessary fallback cascades even though the primary LLM was functioning correctly. Gemini 3 returns 499 status codes when requests are cancelled mid-stream.
Changes
- Check whether the exception is an APIStatusError with status code 499 before marking the LLM as unavailable

Test plan
Before/After
Before fix: 499 error → LLM marked unavailable → fallback cascade triggered
After fix: 499 error → LLM remains available → no unnecessary fallback
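For illustration, a minimal sketch of the predicate this PR adds, factored out as a standalone helper (assuming APIStatusError is importable from livekit.agents and accepts status_code as a keyword, per the exceptions module cited in the review):

```python
from livekit.agents import APIStatusError

def is_client_cancellation(exc: Exception) -> bool:
    # HTTP 499 ("Client Closed Request") means the caller cancelled the
    # request; it says nothing about the provider's health.
    return isinstance(exc, APIStatusError) and exc.status_code == 499

# A 499 should not count against availability; a 500 should.
assert is_client_cancellation(APIStatusError("client closed request", status_code=499))
assert not is_client_cancellation(APIStatusError("server error", status_code=500))
```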