@VinayJogani14 VinayJogani14 commented Jan 25, 2026

…ions

Summary by CodeRabbit

  • Bug Fixes
    • Improved ElevenLabs speech-to-text stability with automatic reconnection when streams close unexpectedly.
    • Added exponential backoff between reconnect attempts to reduce transient failures.
    • Enforced a configurable max reconnect limit to avoid indefinite retry loops.
    • Restored and maintained proper speaking/state handling across reconnects.
    • Marked certain disconnects as retryable so transient issues are retried intelligently.


…ions

- Add retryable=True to APIStatusError when connection closes unexpectedly
- Implement exponential backoff for reconnection attempts (max 5 attempts)
- Reset _speaking state on reconnection to prevent state corruption
- Add logging for reconnection attempts
- Reset reconnection counter after successful reconnection

Fixes livekit#4609
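
As a rough illustration of the backoff schedule described above, here is a minimal sketch. It assumes the delay formula `min(2 ** attempt, 10)` quoted later in the review; the function name `backoff_delay` and its `base` and `cap` parameters are hypothetical and not part of the plugin.

```python
def backoff_delay(attempt: int, base: float = 2.0, cap: float = 10.0) -> float:
    """Return the delay in seconds before reconnect attempt number `attempt`.

    Hypothetical helper: the delay doubles per attempt and is capped at `cap`.
    """
    return min(base ** attempt, cap)


# Delays for the five reconnect attempts allowed by the PR description.
delays = [backoff_delay(n) for n in range(1, 6)]
print(delays)  # [2.0, 4.0, 8.0, 10.0, 10.0]
```

With a 10-second cap, the fourth and fifth attempts wait the same amount of time, so the total worst-case wait across five attempts is 34 seconds under these assumptions.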
coderabbitai bot commented Jan 25, 2026

📝 Walkthrough

Adds reconnection logic to ElevenLabs STT WebSocket handling: introduces reconnect attempts with exponential backoff, a maximum attempts limit, and resets relevant state on successful reconnects; raises APIStatusError(retryable=True) when the socket closes unexpectedly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| ElevenLabs STT reconnection: livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py | Add reconnect loop with reconnect_attempt and max_reconnect_attempts, exponential backoff, reset of speaking/reconnect state on success, and raise APIStatusError(..., retryable=True) for unexpected WS closures. Applies same behavior in _run() and recv_task paths. |

Sequence Diagram

sequenceDiagram
    participant Agent as Agent/SpeechStream
    participant WS as WebSocket
    participant EL as ElevenLabs Service

    Agent->>WS: establish connection
    activate WS
    WS->>EL: open stream
    activate EL
    EL-->>WS: stream messages
    WS-->>Agent: deliver messages

    WS-->>Agent: connection closes unexpectedly
    deactivate WS
    deactivate EL

    Agent->>Agent: raise APIStatusError(retryable=True)
    Agent->>Agent: increment reconnect_attempt
    Agent->>Agent: compare to max_reconnect_attempts

    alt attempts remaining
        Agent->>Agent: exponential backoff delay
        Agent->>WS: reconnect attempt
        activate WS
        WS->>EL: reopen stream
        activate EL
        EL-->>WS: stream resumes
        WS-->>Agent: messages resume
        Agent->>Agent: reset reconnect_attempt = 0
        Agent->>Agent: clear reconnect state
    else max exceeded
        Agent->>Agent: stop stream (hard stop)
    end
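
The flow in the diagram can be sketched roughly as follows. This is not the plugin's code: `RetryableDisconnect` is a hypothetical stand-in for `APIStatusError(retryable=True)`, and `run_with_reconnect`, `session`, and `mark_healthy` are names invented for this illustration. The sleep function is injectable so the demo runs without real delays.

```python
import asyncio


class RetryableDisconnect(Exception):
    """Hypothetical stand-in for APIStatusError(retryable=True)."""


async def run_with_reconnect(session, max_attempts=5, sleep=asyncio.sleep):
    """Run `session` until it returns, reconnecting on retryable disconnects."""
    attempt = 0

    def mark_healthy():
        # Called by the session once messages flow again: reset the counter.
        nonlocal attempt
        attempt = 0

    while True:
        try:
            return await session(mark_healthy)
        except RetryableDisconnect:
            attempt += 1
            if attempt > max_attempts:
                raise  # hard stop: attempts exhausted
            await sleep(min(2 ** attempt, 10))  # exponential backoff, capped


async def demo():
    calls = 0
    delays = []

    async def flaky_session(mark_healthy):
        # Fails on the first two connections, then succeeds.
        nonlocal calls
        calls += 1
        if calls <= 2:
            raise RetryableDisconnect("socket closed")
        mark_healthy()
        return "done"

    async def fake_sleep(seconds):
        delays.append(seconds)  # record the delay instead of waiting

    result = await run_with_reconnect(flaky_session, sleep=fake_sleep)
    return result, calls, delays


print(asyncio.run(demo()))  # ('done', 3, [2, 4])
```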

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately describes the main change: adding reconnection support for ElevenLabs STT WebSocket disconnects. It is concise and directly related to the changeset. |
| Linked Issues check | ✅ Passed | The pull request implementation fully addresses the plugin-level fix for issue #4609: it raises APIStatusError with retryable=True on unexpected WebSocket closure, implements reconnection logic with exponential backoff, resets speaking state on reconnection, and logs attempts. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to resolving issue #4609 and are confined to the ElevenLabs STT module. The 25 lines added implement the required reconnection semantics with no unrelated modifications. |


Removed unnecessary whitespace in exponential backoff calculation.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (1)

400-447: Reconnection loop still exits on first disconnect.

The APIStatusError raised by recv_task (and any error from a failed reconnect attempt) currently bubbles out of _run, so the backoff and attempt-limit logic never executes. Catch retryable connection errors, continue the loop until the attempt limit is reached, and reset _speaking on every reconnect (including those triggered by _reconnect_event) so that START_OF_SPEECH isn’t suppressed.

🔧 Proposed fix
         while True:
             try:
                 if reconnect_attempt > 0:
                     logger.info(
                         "Reconnecting to ElevenLabs STT (attempt %d/%d)",
                         reconnect_attempt + 1,
                         max_reconnect_attempts,
                     )
                     # Reset speaking state on reconnection
                     self._speaking = False
                     # Add exponential backoff
                     await asyncio.sleep(min(2 ** reconnect_attempt, 10))

                 reconnect_attempt += 1
                 if reconnect_attempt > max_reconnect_attempts:
                     logger.error("Max reconnection attempts reached for ElevenLabs STT")
                     break

-                ws = await self._connect_ws()
+                try:
+                    ws = await self._connect_ws()
+                except APIConnectionError as e:
+                    if reconnect_attempt >= max_reconnect_attempts:
+                        raise
+                    logger.warning(
+                        "ElevenLabs STT connect failed; retrying",
+                        exc_info=e,
+                    )
+                    continue
                 tasks = [
                     asyncio.create_task(send_task(ws)),
                     asyncio.create_task(recv_task(ws)),
                     asyncio.create_task(keepalive_task(ws)),
                 ]
                 tasks_group = asyncio.gather(*tasks)
                 wait_reconnect_task = asyncio.create_task(self._reconnect_event.wait())

                 try:
                     done, _ = await asyncio.wait(
                         (tasks_group, wait_reconnect_task),
                         return_when=asyncio.FIRST_COMPLETED,
                     )

                     for task in done:
                         if task != wait_reconnect_task:
                             task.result()

                     if wait_reconnect_task not in done:
                         break

                     self._reconnect_event.clear()
                     # Reset reconnection counter on successful reconnection
                     reconnect_attempt = 0
+                    self._speaking = False
+                except APIStatusError as e:
+                    if reconnect_attempt >= max_reconnect_attempts:
+                        raise
+                    logger.warning(
+                        "ElevenLabs STT disconnected; retrying",
+                        exc_info=e,
+                    )
+                    continue
                 finally:
                     await utils.aio.gracefully_cancel(*tasks, wait_reconnect_task)
                     tasks_group.cancel()
                     tasks_group.exception()  # Retrieve exception to prevent it from being logged
             finally:
                 if ws is not None:
                     await ws.close()
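
To see why the missing handler matters, the following self-contained sketch contrasts a loop that only has `finally` with one that also has `except`. It is a simplified model, not the plugin's code: `Disconnected` is a hypothetical stand-in for a retryable `APIStatusError`, and `recv_once` simulates a receive task whose socket closes immediately.

```python
import asyncio


class Disconnected(Exception):
    """Hypothetical stand-in for a retryable APIStatusError."""


async def recv_once():
    raise Disconnected("connection closed unexpectedly")


async def loop_without_except(log):
    while True:
        log.append("connect")
        task = asyncio.create_task(recv_once())
        try:
            await task  # re-raises Disconnected
        finally:
            log.append("cleanup")  # runs, but does not swallow the error


async def loop_with_except(log, max_attempts=3):
    attempt = 0
    while True:
        attempt += 1
        log.append("connect")
        task = asyncio.create_task(recv_once())
        try:
            await task
        except Disconnected:
            if attempt >= max_attempts:
                raise
            continue  # the finally block still runs before the next pass
        finally:
            log.append("cleanup")


async def demo():
    first, second = [], []
    for fn, log in ((loop_without_except, first), (loop_with_except, second)):
        try:
            await fn(log)
        except Disconnected:
            log.append("gave up")
    return first, second


first, second = asyncio.run(demo())
print(first)   # ['connect', 'cleanup', 'gave up']
print(second)  # three connect/cleanup pairs, then 'gave up'
```

Without the `except` clause the loop body runs exactly once, which matches the behavior the review describes: cleanup happens, but the `while True` is exited by the propagating exception.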
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7fe642d and c7f4314.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (1)
livekit-agents/livekit/agents/_exceptions.py (1)
  • APIStatusError (45-81)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.9)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.13)
🔇 Additional comments (1)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (1)

385-388: Retryable close error handling looks good.

This correctly marks unexpected WebSocket closures as retryable for upstream recovery logic.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (1)

432-454: Missing exception handler for reconnection on unexpected disconnection.

The reconnection logic with exponential backoff is added, but there's no except block to catch the APIStatusError raised by recv_task when the WebSocket closes unexpectedly. The exception will propagate through both finally blocks and exit the while True loop, causing the stream to end without any reconnection attempts.

To enable automatic reconnection on unexpected disconnection, you need to catch APIStatusError (or at least retryable errors) and continue the loop:

🐛 Suggested fix to catch exceptions and retry
                 try:
                     done, _ = await asyncio.wait(
                         (tasks_group, wait_reconnect_task),
                         return_when=asyncio.FIRST_COMPLETED,
                     )

                     for task in done:
                         if task != wait_reconnect_task:
                             task.result()

                     if wait_reconnect_task not in done:
                         break

                     self._reconnect_event.clear()
                     # Reset reconnection counter on successful reconnection
                     reconnect_attempt = 0
+                except APIStatusError as e:
+                    if not e.retryable:
+                        raise
+                    logger.warning("ElevenLabs STT connection error (retryable): %s", e)
+                    # Continue the loop to trigger reconnection
+                    continue
                 finally:
                     await utils.aio.gracefully_cancel(*tasks, wait_reconnect_task)
                     tasks_group.cancel()
                     tasks_group.exception()  # Retrieve exception to prevent it from being logged

Note: With this fix, you should also reset reconnect_attempt = 0 after a successful connection is established (e.g., after ws = await self._connect_ws() succeeds and tasks start running without immediate failure), not just after an intentional reconnect via _reconnect_event.
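
The effect of where the counter is reset can be illustrated with a small simulation. This is a sketch under stated assumptions: `survives` and its event vocabulary (`"ok"` for a healthy connection, `"drop"` for a retryable disconnect) are hypothetical and only model the retry budget, not the WebSocket itself.

```python
def survives(events, max_attempts=5, reset_on_success=True):
    """Return True if the stream outlives `events` without exhausting its budget.

    `events` is a sequence of "ok" (connection established and healthy)
    and "drop" (retryable disconnect).
    """
    attempt = 0
    for event in events:
        if event == "drop":
            attempt += 1
            if attempt > max_attempts:
                return False  # retry budget exhausted
        elif reset_on_success:
            attempt = 0  # a healthy connection restores the full budget
    return True


# Six isolated drops, each followed by a healthy connection.
events = ["drop", "ok"] * 6
print(survives(events, reset_on_success=True))   # True
print(survives(events, reset_on_success=False))  # False
```

If the counter is never reset after a successful connection, a long-lived stream with occasional transient drops eventually hits the limit even though every reconnect succeeded.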

🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (1)

407-421: Off-by-one confusion in reconnection logging.

The log message is misleading. When reconnect_attempt=1 (first time through the reconnect path), it logs "attempt 2/5". This is because the log uses reconnect_attempt + 1, but reconnect_attempt is already 1 at this point (incremented from the initial connection attempt).

Consider logging reconnect_attempt directly to show the actual reconnection attempt number:

♻️ Suggested fix
                 if reconnect_attempt > 0:
                     logger.info(
                         "Reconnecting to ElevenLabs STT (attempt %d/%d)",
-                        reconnect_attempt + 1,
+                        reconnect_attempt,
                         max_reconnect_attempts,
                     )
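
The off-by-one can be reproduced in isolation. The sketch below mirrors only the counter handling from the loop shown in the earlier review comment (log if `reconnect_attempt > 0`, then increment); `reconnect_labels` and its `offset` parameter are hypothetical names used to compare the two variants.

```python
def reconnect_labels(passes, max_attempts=5, offset=1):
    """Collect the log labels produced over `passes` iterations of the loop.

    offset=1 models the original `reconnect_attempt + 1`;
    offset=0 models the suggested `reconnect_attempt`.
    """
    labels = []
    reconnect_attempt = 0
    for _ in range(passes):
        if reconnect_attempt > 0:
            labels.append(f"attempt {reconnect_attempt + offset}/{max_attempts}")
        reconnect_attempt += 1  # incremented after the logging check
    return labels


# Three passes: the initial connection plus two reconnects.
print(reconnect_labels(3, offset=1))  # ['attempt 2/5', 'attempt 3/5']
print(reconnect_labels(3, offset=0))  # ['attempt 1/5', 'attempt 2/5']
```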
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between c7f4314 and 6aea0ad.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests
🔇 Additional comments (2)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (2)

385-388: LGTM - Explicit retryable flag improves clarity.

The explicit retryable=True makes the intent clear, even though APIStatusError defaults to retryable=True when status_code=-1. This correctly addresses the root cause from issue #4609.


413-414: Good practice: resetting _speaking state on reconnection.

Resetting _speaking = False ensures that START_OF_SPEECH will be properly emitted after reconnection, preventing state corruption where the stream might think it's already in a speech segment from the previous connection.
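
A simplified model shows what goes wrong when the flag is left set. This is an assumption-laden sketch: `SpeechState`, `on_transcript`, and `on_reconnect` are hypothetical names, and the real stream's event handling is not shown in this thread. Only the idea that START_OF_SPEECH is emitted when `_speaking` is False comes from the comment above.

```python
class SpeechState:
    """Hypothetical, simplified model of the stream's speaking flag."""

    def __init__(self):
        self._speaking = False
        self.events = []

    def on_transcript(self, text):
        # START_OF_SPEECH is only emitted when no segment is in progress.
        if not self._speaking:
            self._speaking = True
            self.events.append("START_OF_SPEECH")
        self.events.append(f"TRANSCRIPT:{text}")

    def on_reconnect(self, reset):
        if reset:
            self._speaking = False  # forget the segment from the old socket


def run(reset):
    state = SpeechState()
    state.on_transcript("hello")  # speech begins on the first connection
    state.on_reconnect(reset)     # socket drops mid-speech and reconnects
    state.on_transcript("world")  # speech arrives on the new connection
    return state.events


print(run(reset=False))  # ['START_OF_SPEECH', 'TRANSCRIPT:hello', 'TRANSCRIPT:world']
print(run(reset=True))   # ['START_OF_SPEECH', 'TRANSCRIPT:hello', 'START_OF_SPEECH', 'TRANSCRIPT:world']
```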


Development

Successfully merging this pull request may close these issues.

ElevenLabs STT stream does not reconnect after mid-stream WebSocket disconnection