AGT-2474: add commit user turn support for realtime models #4622
Walkthrough
Adds an abstract commit_user_turn() to RealtimeSession and implements it across realtime provider plugins; agent_activity now delegates to the realtime session when present, bypassing the audio-recognition commit path. Provider implementations either perform turn-finalization (OpenAI) or log unsupported warnings.
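The delegation pattern described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the class names FinalizingSession and UnsupportedSession, the `sent` list, and the free function commit_user_turn(rt_session, audio_recognition_commit) are hypothetical stand-ins. The event names come from the OpenAI Realtime API; which events the PR sends beyond a response create is an assumption.

```python
import logging
from abc import ABC, abstractmethod
from typing import Callable, List, Optional

logger = logging.getLogger("realtime-sketch")


class RealtimeSession(ABC):
    """Stand-in for the abstract realtime session (hypothetical, simplified)."""

    @abstractmethod
    def commit_user_turn(self) -> None:
        """Finalize the current user turn, if the provider supports it."""


class FinalizingSession(RealtimeSession):
    """Provider that can finalize a turn (assumed event sequence)."""

    def __init__(self) -> None:
        self.sent: List[str] = []  # records outgoing client events

    def commit_user_turn(self) -> None:
        # Assumption: commit the buffered audio, then request a response.
        self.sent.append("input_audio_buffer.commit")
        self.sent.append("response.create")


class UnsupportedSession(RealtimeSession):
    """Provider without manual turn finalization: only logs a warning."""

    def commit_user_turn(self) -> None:
        logger.warning("commit_user_turn is not supported by this realtime model")


def commit_user_turn(
    rt_session: Optional[RealtimeSession],
    audio_recognition_commit: Callable[[], None],
) -> str:
    """Delegate to the realtime session when present, else use the STT path."""
    if rt_session is not None:
        rt_session.commit_user_turn()
        return "realtime"
    audio_recognition_commit()
    return "audio_recognition"
```

With a realtime session present, the audio-recognition callback is never invoked; without one, the existing commit path runs unchanged.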
bml1g12
left a comment
This looks good to me, as it means the docs at https://docs.livekit.io/agents/logic/turns/#manual would also apply to realtime models.
It might make sense to have clear_user_turn() also call self.clear_audio() for realtime models. I say this because I think https://docs.livekit.io/agents/logic/turns/#manual would then fully apply:
# When user starts speaking
@ctx.room.local_participant.register_rpc_method("start_turn")
async def start_turn(data: rtc.RpcInvocationData):
    session.interrupt()  # Stop any current agent speech
    session.clear_user_turn()  # Clear any previous input
    session.input.set_audio_enabled(True)  # Start listening
For cascaded models, clear_user_turn() clears any previous model input, but for a realtime model I think we also need to clear the audio.
        response=RealtimeResponseCreateParams(),
    )
)
self.clear_audio()
Why is clear_audio needed here?
Send input_audio_buffer.clear before beginning a new user input.
I think it means you need to clear the buffer before the next time you start a new user speech, not that it is required after response.create for this turn.
Maybe it's similar to the session.clear_user_turn in the example @bml1g12 mentioned above:
# When user starts speaking
@ctx.room.local_participant.register_rpc_method("start_turn")
async def start_turn(data: rtc.RpcInvocationData):
    session.interrupt()  # Stop any current agent speech
    session.clear_user_turn()  # Clear any previous input
    session.input.set_audio_enabled(True)  # Start listening
Yeah, I can put that call in the clear_user_turn part.
Turns out we don't need this if we call session.clear_user_turn.
Ah yes, it seems clear_user_turn already calls clear_audio() under the hood. So you probably want to clear the audio when you start a new turn, not when you end the turn, which means it is probably not needed in this PR.
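The conclusion of this thread (clear the buffer when a turn starts, commit when it ends) can be sketched with a small stand-in. FakeRealtimeSession, push_audio, start_turn, and end_turn here are hypothetical names for illustration only and do not model the real plugin; the only detail taken from the discussion is that clearing belongs at the start of a turn.

```python
from typing import List


class FakeRealtimeSession:
    """Hypothetical stand-in that records buffer operations."""

    def __init__(self) -> None:
        self.buffer: List[bytes] = []  # pending input audio frames
        self.events: List[str] = []    # recorded client events

    def push_audio(self, frame: bytes) -> None:
        self.buffer.append(frame)

    def clear_audio(self) -> None:
        # Drop any audio left over from before the turn began.
        self.buffer.clear()
        self.events.append("input_audio_buffer.clear")

    def commit_user_turn(self) -> int:
        # Finalize the turn; return how many frames were committed.
        self.events.append("input_audio_buffer.commit")
        return len(self.buffer)


def start_turn(session: FakeRealtimeSession) -> None:
    """Start of a manual turn: discard stale input first."""
    session.clear_audio()


def end_turn(session: FakeRealtimeSession) -> int:
    """End of a manual turn: commit only, no extra clear needed here."""
    return session.commit_user_turn()
```

In this sketch, audio pushed before start_turn never reaches the commit, so a second clear right after committing would add nothing for the current turn.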

This adds commit_user_turn support for realtime models. This allows users to use turn_detection="manual" with a realtime model.