Conversation


@tinalenguyen tinalenguyen commented Jan 26, 2026

Syntax: `llm = azure.responses.LLM()`

https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses?view=foundry-classic&tabs=python-key#getting-started-with-the-responses-api

Summary by CodeRabbit

  • New Features
    • Added Azure OpenAI LLM support to LiveKit plugins, enabling use of Azure-hosted language models with configurable endpoints, deployments, API/auth options, timeouts, and model selection.
    • Made the Azure responses module publicly accessible so the new LLM integration is directly available from the plugin surface for easier import and use.

✏️ Tip: You can customize this high-level summary in your review settings.

@chenghao-mou chenghao-mou requested a review from a team January 26, 2026 09:36

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

The changes add a new responses package to the Azure plugin and expose it via the plugin's public API. The package introduces an LLM class that wraps/configures an Azure-specific AsyncOpenAI client and forwards parameters to the existing OpenAI responses LLM.

Changes

  • Azure plugin initialization: livekit-plugins-azure/livekit/plugins/azure/__init__.py
    Import and export the new responses module by adding from . import responses and including "responses" in __all__ (+2/-1).
  • Responses package initializer: livekit-plugins-azure/livekit/plugins/azure/responses/__init__.py
    New package file that imports LLM from .llm and sets __all__ = ["LLM"] (+3/-0).
  • LLM implementation: livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
    New LLM class extending openai.responses.LLM. Accepts Azure-specific config (endpoint, deployment, api_version, api_key, AD token/provider, base_url, organization, project) plus standard LLM params; builds an AsyncAzureOpenAI client (with timeout fallback) and delegates to the superclass (+70/-0).
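The wrapper shape the summary describes can be sketched with stand-ins. The real base class is openai.responses.LLM and the real client is the SDK's AsyncAzureOpenAI; the class bodies and the 30-second fallback below are illustrative assumptions, not the plugin's actual code:

```python
from __future__ import annotations

DEFAULT_TIMEOUT = 30.0  # illustrative fallback value, not the plugin's actual default


class _BaseResponsesLLM:
    """Stand-in for openai.responses.LLM: records what the wrapper forwards."""

    def __init__(self, *, model: str, client: object) -> None:
        self.model = model
        self.client = client


class _FakeAzureClient:
    """Stand-in for the SDK's AsyncAzureOpenAI: records its configuration."""

    def __init__(self, *, azure_endpoint: str, azure_deployment: str | None, timeout: float) -> None:
        self.azure_endpoint = azure_endpoint
        self.azure_deployment = azure_deployment
        self.timeout = timeout


class LLM(_BaseResponsesLLM):
    """Shape of the new class: build the Azure client, delegate everything else."""

    def __init__(
        self,
        *,
        model: str = "gpt-4o",
        azure_endpoint: str,
        azure_deployment: str | None = None,
        timeout: float | None = None,
    ) -> None:
        client = _FakeAzureClient(
            azure_endpoint=azure_endpoint,
            azure_deployment=azure_deployment,
            # timeout fallback: use the caller's value when given, else the default
            timeout=timeout if timeout is not None else DEFAULT_TIMEOUT,
        )
        super().__init__(model=model, client=client)


llm = LLM(azure_endpoint="https://example.openai.azure.com", azure_deployment="my-gpt4o")
print(llm.client.timeout)
print(llm.model)
```

The point of the pattern is that the Azure wrapper owns only client construction; every model-level option still flows through the superclass unchanged.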

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 In Azure fields I hop and hum,
I wired a client, now responses come.
Endpoints, tokens, deployments too—
I stitch the bits so they speak true. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'add azure openai responses' directly and accurately summarizes the main change: adding Azure OpenAI Responses support to the plugin via a new LLM wrapper class.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py`:
- Around line 45-59: The code calls openai.AsyncAzureOpenAI but currently
imports livekit.plugins.openai (which doesn't export AsyncAzureOpenAI); fix by
importing the real OpenAI SDK (e.g., add "import openai" or "from openai import
AsyncAzureOpenAI") at the top of the file so that azure_client =
openai.AsyncAzureOpenAI(...) resolves to the SDK class; ensure the new import
coexists with the existing "from livekit.plugins import openai" without name
conflict (prefer qualified import or direct AsyncAzureOpenAI import).
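The shadowing issue can be reproduced without either package by using stand-in modules (the module objects below are stand-ins, not the real openai SDK or livekit plugin):

```python
import types

# Stand-ins: "openai_sdk" plays the real OpenAI SDK, "plugin" plays
# livekit.plugins.openai, which does not re-export AsyncAzureOpenAI.
openai_sdk = types.ModuleType("openai_sdk")
openai_sdk.AsyncAzureOpenAI = type("AsyncAzureOpenAI", (), {})
plugin = types.ModuleType("plugin")

openai = plugin  # what `from livekit.plugins import openai` binds at the call site
try:
    openai.AsyncAzureOpenAI()  # the bug: the class lives on the SDK, not the plugin
except AttributeError:
    print("AttributeError")

# The fix: bind the class directly, as `from openai import AsyncAzureOpenAI` would.
AsyncAzureOpenAI = openai_sdk.AsyncAzureOpenAI
client = AsyncAzureOpenAI()
print(type(client).__name__)
```

A direct import of the class also sidesteps the name conflict entirely, since the local name `openai` can keep referring to the livekit plugin module.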
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py (1)

14-34: Consider exposing store and metadata parameters for API consistency.

The base openai.responses.LLM class accepts store and metadata parameters that are not exposed in this Azure wrapper. If these parameters are useful for Azure OpenAI Responses API users, consider adding them for feature parity.

♻️ Suggested addition
         tool_choice: NotGivenOr[ToolChoice] = NOT_GIVEN,
         timeout: httpx.Timeout | None = None,
         reasoning: NotGivenOr[Reasoning] = NOT_GIVEN,
+        store: NotGivenOr[bool] = NOT_GIVEN,
+        metadata: NotGivenOr[dict[str, str]] = NOT_GIVEN,
     ) -> None:

And pass them to super().__init__():

         super().__init__(
             model=model,
             client=azure_client,
             user=user,
             temperature=temperature,
             parallel_tool_calls=parallel_tool_calls,
             tool_choice=tool_choice,
             reasoning=reasoning,
+            store=store,
+            metadata=metadata,
         )
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e084f6b and e4058ec.

📒 Files selected for processing (3)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/__init__.py
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/__init__.py
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/__init__.py
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/__init__.py
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: vishal-seshagiri-infinitusai
Repo: livekit/agents PR: 4559
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:122-123
Timestamp: 2026-01-19T23:21:50.552Z
Learning: In PR `#4559`, response_format was intentionally added as a passthrough (accepted but not utilized) to the OpenAI Responses API in `livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`. The implementation is scoped only for Google provider, not for OpenAI.
Learnt from: davidzhao
Repo: livekit/agents PR: 4548
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:77-81
Timestamp: 2026-01-18T01:09:01.847Z
Learning: In the OpenAI responses LLM (`livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`), reasoning effort defaults are intentionally set lower than OpenAI's API defaults for voice interactions: "none" for gpt-5.1/gpt-5.2 and "minimal" for other reasoning-capable models like gpt-5, to avoid enabling reasoning by default in voice contexts.
📚 Learning: 2026-01-19T23:21:50.552Z
Learnt from: vishal-seshagiri-infinitusai
Repo: livekit/agents PR: 4559
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:122-123
Timestamp: 2026-01-19T23:21:50.552Z
Learning: In PR `#4559`, response_format was intentionally added as a passthrough (accepted but not utilized) to the OpenAI Responses API in `livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`. The implementation is scoped only for Google provider, not for OpenAI.

Applied to files:

  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/__init__.py
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
📚 Learning: 2026-01-18T01:09:01.847Z
Learnt from: davidzhao
Repo: livekit/agents PR: 4548
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:77-81
Timestamp: 2026-01-18T01:09:01.847Z
Learning: In the OpenAI responses LLM (`livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`), reasoning effort defaults are intentionally set lower than OpenAI's API defaults for voice interactions: "none" for gpt-5.1/gpt-5.2 and "minimal" for other reasoning-capable models like gpt-5, to avoid enabling reasoning by default in voice contexts.

Applied to files:

  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
🧬 Code graph analysis (2)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/__init__.py (2)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py (1)
  • LLM (14-69)
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py (1)
  • LLM (52-166)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py (1)
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py (2)
  • LLM (52-166)
  • model (111-112)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: livekit-plugins-openai
  • GitHub Check: livekit-plugins-cartesia
  • GitHub Check: livekit-plugins-deepgram
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)
🔇 Additional comments (4)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/__init__.py (1)

1-3: LGTM!

Clean and minimal module initialization following standard Python packaging conventions.

livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/__init__.py (1)

20-25: LGTM!

The responses module is properly imported and exported in __all__, following the existing pattern for exposing submodules.

Note: The module docstring (line 15) references the OpenAI plugin for Azure OpenAI, which may need updating now that this plugin also provides Azure OpenAI responses functionality. Consider updating the docstring to reflect the new capability.

livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py (2)

1-11: LGTM on imports.

The imports are well-organized and appropriate for the Azure OpenAI client configuration.


61-69: LGTM on parent initialization.

The super().__init__() call correctly passes the Azure client along with the relevant configuration options. The base class will handle reasoning defaults appropriately for voice interactions. Based on learnings, the OpenAI responses LLM intentionally sets lower reasoning effort defaults for voice contexts.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py`:
- Around line 19-22: The constructor/function currently defaults model to
"gpt-4o" which is incompatible with Azure; change the model parameter to be
optional (e.g., model: str | ResponsesModel | None = None) and, wherever the
Azure-specific request is constructed (use the parameter names model,
azure_endpoint, azure_deployment, api_version to find the usage), derive the
model value from azure_deployment when model is None and azure_deployment is
provided; if neither model nor azure_deployment is available and
azure_endpoint/azure usage is expected, raise a clear error. Apply the same
change for the other signature occurrence (lines ~62-64) so both callsites
accept None and fall back to azure_deployment.
🧹 Nitpick comments (2)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py (2)

29-35: Expose store and metadata for parity with OpenAI responses LLM.
The base openai.responses.LLM supports store and metadata, but this wrapper doesn’t surface them. Exposing these keeps the Azure API aligned with the OpenAI plugin.

♻️ Proposed change
         user: NotGivenOr[str] = NOT_GIVEN,
         temperature: NotGivenOr[float] = NOT_GIVEN,
         parallel_tool_calls: NotGivenOr[bool] = NOT_GIVEN,
         tool_choice: NotGivenOr[ToolChoice] = NOT_GIVEN,
+        store: NotGivenOr[bool] = NOT_GIVEN,
+        metadata: NotGivenOr[dict[str, str]] = NOT_GIVEN,
         timeout: httpx.Timeout | None = None,
         reasoning: NotGivenOr[Reasoning] = NOT_GIVEN,
     ) -> None:
@@
         super().__init__(
             model=model,
             client=azure_client,
             user=user,
             temperature=temperature,
             parallel_tool_calls=parallel_tool_calls,
             tool_choice=tool_choice,
+            store=store,
+            metadata=metadata,
             reasoning=reasoning,
         )

Also applies to: 62-70


36-44: Use Google-style docstring and avoid noqa: E501.
Current docstring format and line-length exemption conflict with the Python guidelines. Please convert it to Google style and wrap lines to ≤100. As per coding guidelines, ...

✍️ Example (partial)
-        """
-        This automatically infers the following arguments from their corresponding environment variables if they are not provided:
-        - `api_key` from `AZURE_OPENAI_API_KEY`
-        - `organization` from `OPENAI_ORG_ID`
-        - `project` from `OPENAI_PROJECT_ID`
-        - `azure_ad_token` from `AZURE_OPENAI_AD_TOKEN`
-        - `api_version` from `OPENAI_API_VERSION`
-        - `azure_endpoint` from `AZURE_OPENAI_ENDPOINT`
-        """  # noqa: E501
+        """Initializes an Azure OpenAI Responses LLM.
+
+        Environment variables:
+            AZURE_OPENAI_API_KEY: Fallback for api_key.
+            OPENAI_ORG_ID: Fallback for organization.
+            OPENAI_PROJECT_ID: Fallback for project.
+            AZURE_OPENAI_AD_TOKEN: Fallback for azure_ad_token.
+            OPENAI_API_VERSION: Fallback for api_version.
+            AZURE_OPENAI_ENDPOINT: Fallback for azure_endpoint.
+        """
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4058ec and 1387e8e.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py (1)
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py (2)
  • LLM (52-166)
  • model (111-112)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: livekit-plugins-deepgram
  • GitHub Check: livekit-plugins-openai
  • GitHub Check: unit-tests
  • GitHub Check: livekit-plugins-cartesia
  • GitHub Check: type-check (3.9)
  • GitHub Check: type-check (3.13)

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +19 to +22
model: str | ResponsesModel = "gpt-4o",
azure_endpoint: str | None = None,
azure_deployment: str | None = None,
api_version: str | None = None,


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "llm.py" | grep -E "azure.*responses" | head -5

Repository: livekit/agents

Length of output: 136


🏁 Script executed:

cat -n livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py

Repository: livekit/agents

Length of output: 3059


🌐 Web query:

Azure OpenAI Python SDK AsyncAzureOpenAI model parameter deployment name

💡 Result:

In the OpenAI Python SDK when using AsyncAzureOpenAI against Azure OpenAI, the model= argument should be your Azure deployment name (the name you assigned in the Azure portal), not the underlying model ID (e.g., not gpt-4o, text-embedding-3-large, etc.). Azure requires the deployment name even though the parameter is still called model. [1]

Example:

from openai import AsyncAzureOpenAI
import os

client = AsyncAzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

resp = await client.chat.completions.create(
    model="my-gpt4o-deployment",  # <- deployment name
    messages=[{"role": "user", "content": "Hello!"}],
)

If you have multiple deployments (e.g., one for chat and one for embeddings), you typically pass the appropriate deployment name via model= per call; the SDK also has an azure_deployment concept discussed in the library, but it doesn’t replace the need to provide the right deployment identifier for each operation in common multi-deployment setups. [2]

Sources: Microsoft Learn (Azure OpenAI + Python SDK switching endpoints) [1]; openai-python issue discussing AsyncAzureOpenAI deployment handling [2].


🌐 Web query:

OpenAI Responses API model parameter requirements Azure deployment

💡 Result:

For the OpenAI Responses API, the request body expects a model string that is the model ID (for example gpt-4o, o3). [2]

For Azure OpenAI’s Responses API, you still send a model string in the body, but its value must be your Azure “model deployment name” (the name you assigned when you deployed the model), and you call the Azure endpoint /openai/v1/responses on your resource. [1]

Also note a practical Azure difference: in Azure Responses API, response.model should be assumed to identify the deployment name (and Azure does not guarantee exposing the underlying base model/version in the response). [3]

Sources: [1] [2] [3]


🏁 Script executed:

find . -type f -path "*/livekit/plugins/openai/responses/llm.py" | head -1 | xargs wc -l

Repository: livekit/agents

Length of output: 142


🏁 Script executed:

find . -type f -path "*/livekit/plugins/openai/responses/llm.py" | head -1 | xargs head -100

Repository: livekit/agents

Length of output: 3378


Default model is incompatible with Azure OpenAI.
The Azure OpenAI SDK requires the model parameter to be the Azure deployment name (e.g., "my-gpt4o-deployment"), not a model ID like "gpt-4o". Defaulting to "gpt-4o" will cause 404 errors for users who only set azure_deployment. Make model optional and derive it from azure_deployment when not explicitly provided.

Proposed fix
-        model: str | ResponsesModel = "gpt-4o",
+        model: str | ResponsesModel | None = None,
@@
+        if model is None:
+            if azure_deployment is None:
+                raise ValueError("Either model or azure_deployment must be provided")
+            model = azure_deployment
+
         super().__init__(
             model=model,

Also applies to: 62-64

🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/responses/llm.py`
around lines 19 - 22, The constructor/function currently defaults model to
"gpt-4o" which is incompatible with Azure; change the model parameter to be
optional (e.g., model: str | ResponsesModel | None = None) and, wherever the
Azure-specific request is constructed (use the parameter names model,
azure_endpoint, azure_deployment, api_version to find the usage), derive the
model value from azure_deployment when model is None and azure_deployment is
provided; if neither model nor azure_deployment is available and
azure_endpoint/azure usage is expected, raise a clear error. Apply the same
change for the other signature occurrence (lines ~62-64) so both callsites
accept None and fall back to azure_deployment.
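The fallback behavior the review asks for reduces to a small resolver. This is a sketch of the suggested logic, not the plugin's code, and the function name is hypothetical:

```python
from __future__ import annotations


def resolve_model(model: str | None, azure_deployment: str | None) -> str:
    """Prefer an explicit model; otherwise fall back to the Azure deployment name."""
    if model is not None:
        return model
    if azure_deployment is None:
        raise ValueError("Either model or azure_deployment must be provided")
    return azure_deployment


print(resolve_model(None, "my-gpt4o-deployment"))  # falls back to the deployment name
print(resolve_model("gpt-4o", None))  # an explicit model wins
```

On Azure, the resolved value is what gets sent as the request's model field, which is why the deployment name is the right default when no model is given.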
