AGT-2474: add commit user turn support for realtime models #4622

chenghao-mou · 2026-01-26T15:44:08Z

This adds commit_user_turn support for realtime models:

- The OpenAI realtime model sends the 3 messages recommended by the official docs: https://platform.openai.com/docs/guides/realtime-conversations#disable-vad
- Other realtime models ignore this call.

This allows users to use turn_detection="manual" with a realtime model.
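For reference, the three client events that the linked OpenAI guide describes for sessions with VAD disabled can be sketched as plain JSON payloads. This is only an illustration of the wire messages, not the plugin code: the constant and function names are hypothetical, and every field other than `type` is omitted.

```python
import json

# Client events named in OpenAI's "disable VAD" guidance, in the order the
# guide lists them. Only the "type" field is shown in this sketch.
COMMIT = {"type": "input_audio_buffer.commit"}  # turn buffered audio into a user item
CREATE_RESPONSE = {"type": "response.create"}  # ask the model to respond
CLEAR = {"type": "input_audio_buffer.clear"}  # reset the buffer before new input


def manual_turn_messages() -> list:
    """Serialize the three events as JSON text frames (hypothetical helper)."""
    return [json.dumps(event) for event in (COMMIT, CREATE_RESPONSE, CLEAR)]
```
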

Summary by CodeRabbit

New Features
- Added explicit "commit user turn" for real-time agent sessions to finalize user turns.
- OpenAI provider: full commit triggers response creation.
- Google/AWS/Ultravox providers: method present but logs warnings or acts as a placeholder per provider support.
- Voice agent: when a realtime session is active, commits use the realtime path instead of audio-only processing.

coderabbitai · 2026-01-26T15:44:36Z

Walkthrough

Adds an abstract commit_user_turn() to RealtimeSession and implements it across realtime provider plugins; agent_activity now delegates to the realtime session when present, bypassing the audio-recognition commit path. Provider implementations either perform turn-finalization (OpenAI) or log unsupported warnings.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Abstract Interface & Agent Layer `livekit-agents/livekit/agents/llm/realtime.py`, `livekit-agents/livekit/agents/voice/agent_activity.py` | Adds abstract `commit_user_turn()` to `RealtimeSession`. `agent_activity.commit_user_turn()` now calls `_rt_session.commit_user_turn()` if present, otherwise falls back to the AudioRecognition flow. |
| AWS Realtime Plugin `livekit-plugins/livekit-plugins-aws/.../realtime/realtime_model.py` | Adds `commit_user_turn()` that logs a warning stating that the Nova Sonic Realtime API does not support user-turn commit. |
| Google Realtime Plugin `livekit-plugins/livekit-plugins-google/.../realtime/realtime_api.py` | Adds `commit_user_turn()` and changes `commit_audio()` / `clear_audio()` to log warnings for actions the Gemini Realtime API does not support. |
| OpenAI Realtime Plugin (stable & beta) `livekit-plugins/livekit-plugins-openai/.../realtime/realtime_model.py`, `.../realtime_model_beta.py` | Implements `commit_user_turn()` to warn on auto-response/turn-detection combos, call `commit_audio()`, and emit a `ResponseCreateEvent` (empty params) to finalize the user turn. |
| Ultravox Realtime Plugin `livekit-plugins/livekit-plugins-ultravox/.../realtime/realtime_model.py` | Adds `commit_user_turn()` that logs an unsupported warning; replaces the `push_video()` no-op with a warning log. |
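The shape of the change summarized in the table can be sketched as follows. This is a minimal stand-in, not the actual `RealtimeSession` class: all class names, the logger name, and the warning text are hypothetical, and the real sessions carry many more methods.

```python
import logging
from abc import ABC, abstractmethod

logger = logging.getLogger("realtime-sketch")  # hypothetical logger name


class RealtimeSessionSketch(ABC):
    """Hypothetical stand-in for RealtimeSession; only the new method is shown."""

    @abstractmethod
    def commit_user_turn(self) -> None:
        """Finalize the current user turn, if the provider supports it."""


class SupportingSession(RealtimeSessionSketch):
    """A provider that can finalize a turn (OpenAI, per the summary)."""

    def __init__(self) -> None:
        self.sent: list = []

    def commit_user_turn(self) -> None:
        # Commit buffered audio, then request a response.
        self.sent.append("commit_audio")
        self.sent.append("response.create")


class UnsupportedSession(RealtimeSessionSketch):
    """A provider without manual turn commit: the call only logs a warning."""

    def commit_user_turn(self) -> None:
        logger.warning("commit_user_turn is not supported by this provider")
```

Because the method is abstract, a provider session that does not define it cannot be instantiated, which is presumably why every plugin in the table gains at least a warning-only implementation.
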

Sequence Diagram(s)

mermaid
sequenceDiagram
participant AgentActivity as AgentActivity
participant RTSession as RealtimeSession
participant AudioRec as AudioRecognition
Note over AgentActivity,AudioRec: User turn commit decision flow
AgentActivity->>RTSession: commit_user_turn()
alt RT session exists
RTSession-->>AgentActivity: commit_user_turn handled
else No RT session
AgentActivity->>AudioRec: commit_user_turn(audio_detached, timeout)
AudioRec-->>AgentActivity: audio commit result
end
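The decision flow in the diagram amounts to a single branch in the agent layer. The sketch below uses hypothetical stand-in classes; the `audio_detached` and `timeout` parameter names come from the diagram, while their default values here are assumptions.

```python
from typing import Optional


class FakeRealtimeSession:
    """Hypothetical realtime session that counts commits."""

    def __init__(self) -> None:
        self.commits = 0

    def commit_user_turn(self) -> None:
        self.commits += 1


class FakeAudioRecognition:
    """Hypothetical audio-recognition path that records its arguments."""

    def __init__(self) -> None:
        self.calls: list = []

    def commit_user_turn(self, *, audio_detached: bool, timeout: float) -> None:
        self.calls.append((audio_detached, timeout))


class ActivitySketch:
    def __init__(
        self,
        audio_recognition: FakeAudioRecognition,
        rt_session: Optional[FakeRealtimeSession] = None,
    ) -> None:
        self._audio_recognition = audio_recognition
        self._rt_session = rt_session

    def commit_user_turn(self, *, audio_detached: bool = False, timeout: float = 2.0) -> None:
        # Default values above are assumptions made for this sketch.
        if self._rt_session is not None:
            # Realtime path: the audio-recognition commit is bypassed entirely.
            self._rt_session.commit_user_turn()
            return
        self._audio_recognition.commit_user_turn(
            audio_detached=audio_detached, timeout=timeout
        )
```
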

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I thumped my paw and tapped the clock,
A turn is closed, no more to talk,
Plugins nod, some warn, some send,
A tiny hop to mark the end,
— rabbit jubilation 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately describes the main change: adding commit_user_turn support across realtime models, which is the primary focus of all file modifications in this changeset. |

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 267bea1 and dd8f80a.

📒 Files selected for processing (2)

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model_beta.py

🚧 Files skipped from review as they are similar to previous changes (1)

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model_beta.py

🧠 Learnings (1)

📚 Learning: 2026-01-19T23:21:47.799Z

Learnt from: vishal-seshagiri-infinitusai
Repo: livekit/agents PR: 4559
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:122-123
Timestamp: 2026-01-19T23:21:47.799Z
Learning: Note from PR `#4559`: response_format was added as a passthrough to the OpenAI Responses API in livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py, but this is scoped only to the Google provider and not for OpenAI. Reviewers should ensure that this passthrough behavior is gated by the provider (Google) and that OpenAI paths do not inadvertently reuse the same passthrough. Consider adding explicit provider checks, and update tests to verify that only the Google provider uses this passthrough while the OpenAI provider ignores it.

Applied to files:

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model_beta.py

🧬 Code graph analysis (1)

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model_beta.py (4)

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py (3)

commit_user_turn (1295-1310)

commit_audio (1286-1289)

send_event (691-693)

livekit-plugins/livekit-plugins-ultravox/livekit/plugins/ultravox/realtime/realtime_model.py (2)

commit_user_turn (1141-1142)

commit_audio (1135-1136)

livekit-plugins/livekit-plugins-google/livekit/plugins/google/realtime/realtime_api.py (2)

commit_user_turn (1233-1234)

commit_audio (1227-1228)

livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/experimental/realtime/realtime_model.py (2)

commit_user_turn (2008-2009)

commit_audio (2002-2003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: livekit-plugins-deepgram
GitHub Check: unit-tests
GitHub Check: type-check (3.9)
GitHub Check: type-check (3.13)

🔇 Additional comments (1)

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model_beta.py (1)

1123-1138: LGTM! Implementation is correct and consistent with the non-beta version.

The commit_user_turn method correctly:

Warns when VAD auto-response is enabled (which could conflict with manual turn commits)

Commits any buffered audio via commit_audio()

Sends a ResponseCreateEvent to trigger the model response

The use of Response() is appropriate for the beta API (vs RealtimeResponseCreateParams() in the non-beta version).
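The three steps listed above can be sketched with plain dictionaries in place of the SDK event classes. The session and turn-detection classes are hypothetical stand-ins, and the exact condition that triggers the warning is an assumption (here: turn detection is configured and set to create responses automatically); the warning text is illustrative.

```python
import logging
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger("openai-realtime-sketch")  # hypothetical logger name


@dataclass
class TurnDetectionSketch:
    """Hypothetical stand-in for the server-side turn-detection config."""

    create_response: bool = True


@dataclass
class OpenAISessionSketch:
    turn_detection: Optional[TurnDetectionSketch] = None
    sent: list = field(default_factory=list)

    def send_event(self, event: dict) -> None:
        self.sent.append(event)

    def commit_audio(self) -> None:
        self.send_event({"type": "input_audio_buffer.commit"})

    def commit_user_turn(self) -> None:
        # Assumed condition: server VAD may also create a response on its own.
        if self.turn_detection is not None and self.turn_detection.create_response:
            logger.warning("manual commit may conflict with automatic responses")
        self.commit_audio()  # 1. commit any buffered audio
        self.send_event({"type": "response.create"})  # 2. trigger the model response
```
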

One minor note: line 1129 exceeds the 100-character limit per coding guidelines (~110 chars), but this matches the pattern in the non-beta implementation and breaking the string would reduce readability. If the linter flags it, consider using a shorter message or a line continuation.
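If the linter does flag the line, one option is implicit concatenation of adjacent string literals inside parentheses, which keeps each source line under the limit without changing the logged text. The message below is a hypothetical placeholder, not the string used in the PR.

```python
# Adjacent string literals are joined at compile time, so the logged message is
# identical to a single long literal. The wording here is illustrative only.
WARNING_MESSAGE = (
    "commit_user_turn was called while server-side turn detection can still create "
    "responses automatically; the manual commit may conflict with it"
)
```
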

bml1g12

This looks good to me, as it means the docs here https://docs.livekit.io/agents/logic/turns/#manual would also apply to realtime models.

It might make sense to have clear_user_turn() also call self.clear_audio() for realtime models? I say this because I think https://docs.livekit.io/agents/logic/turns/#manual would then fully apply:

# When user starts speaking
@ctx.room.local_participant.register_rpc_method("start_turn")
async def start_turn(data: rtc.RpcInvocationData):
    session.interrupt()  # Stop any current agent speech
    session.clear_user_turn()  # Clear any previous input
    session.input.set_audio_enabled(True)  # Start listening

For cascaded models, clear_user_turn() clears any previous model input, but for realtime models I think we also need to clear the audio.

longcw · 2026-01-27T12:27:16Z

livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py

+                response=RealtimeResponseCreateParams(),
+            )
+        )
+        self.clear_audio()


Why is a clear_audio call needed here?

I know, right? It seems redundant, but it is required for OpenAI according to their doc:

Send input_audio_buffer.clear before beginning a new user input.

I think it means you need to clear the buffer before the next time you start new user speech, not that it is required after response.create for this turn.

Maybe it's similar to the session.clear_user_turn call in the example @bml1g12 mentioned above:

# When user starts speaking
@ctx.room.local_participant.register_rpc_method("start_turn")
async def start_turn(data: rtc.RpcInvocationData):
    session.interrupt()  # Stop any current agent speech
    session.clear_user_turn()  # Clear any previous input
    session.input.set_audio_enabled(True)  # Start listening

~~Yeah, I can put that call in the clear_user_turn part.~~

Turns out we don't need this if we call session.clear_user_turn.

Ah yes, it seems clear_user_turn already calls clear_audio() under the hood. So when you start a new turn you probably want to clear the audio, not when you end the turn, which means it is probably not needed in this PR.
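The conclusion of this thread, that the buffer is cleared when a turn starts and not when it ends, can be sketched with a stand-in session that records what it would send. All names are hypothetical, `set_audio_enabled` is flattened onto the session for brevity, and the sketch assumes (as the thread suggests) that clearing the user turn also clears buffered realtime audio.

```python
class ManualTurnSessionSketch:
    """Hypothetical session that records the events a manual turn would emit."""

    def __init__(self) -> None:
        self.events: list = []
        self.audio_enabled = False

    def interrupt(self) -> None:
        self.events.append("interrupt")

    def clear_user_turn(self) -> None:
        # Assumption from the thread: this also clears the realtime audio buffer.
        self.events.append("input_audio_buffer.clear")

    def commit_user_turn(self) -> None:
        # No clear here: the buffer is reset at the start of the next turn.
        self.events.extend(["input_audio_buffer.commit", "response.create"])

    def set_audio_enabled(self, enabled: bool) -> None:
        self.audio_enabled = enabled


def start_turn(session: ManualTurnSessionSketch) -> None:
    session.interrupt()  # stop any current agent speech
    session.clear_user_turn()  # clear any previous input
    session.set_audio_enabled(True)  # start listening


def end_turn(session: ManualTurnSessionSketch) -> None:
    session.set_audio_enabled(False)  # stop listening
    session.commit_user_turn()  # finalize the turn and request a response
```
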

chenghao-mou added 2 commits January 26, 2026 15:36

add commit user turn support for realtime models

f0ca5f1

minor fixes

267bea1

chenghao-mou requested a review from a team January 26, 2026 15:44

bml1g12 approved these changes Jan 27, 2026

longcw reviewed Jan 27, 2026

remove clear audio call

dd8f80a
