Unable to use LLM-as-a-Judge with Azure model. #4325

@omarbessi04

🔴 Required Information

Please ensure all items in this section are completed to allow for efficient
triaging. Requests without complete information may be rejected / deprioritized.
If an item is not applicable to you - please mark it as N/A

Describe the Bug:
When using LLM-as-a-judge, the resolve function in google/adk/models/registry fails when "judge_model" is "azure/gpt-4o".

This is because r"azure/.*" is missing from the return value of LiteLlm.supported_models() in adk/models/lite_llm. The bug does not occur if this pattern is included.
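To illustrate the failure mode, here is a minimal, self-contained sketch of a regex-based model registry. It is not the ADK source: `ToyRegistry`, `try_resolve`, and both pattern lists are hypothetical stand-ins, and the real registry may differ in names and matching details. It only shows why a model name with no matching pattern ends in a "not found" error, and how adding r"azure/.*" changes the outcome.

```python
import re


class ToyRegistry:
    """Hypothetical stand-in for a regex-keyed model registry (not ADK code)."""

    def __init__(self):
        self._patterns = {}  # regex string -> backend name

    def register(self, backend, patterns):
        # Map every supported-model regex to the backend that handles it.
        for pattern in patterns:
            self._patterns[pattern] = backend

    def resolve(self, model):
        # Return the first backend whose regex matches the whole model name.
        for pattern, backend in self._patterns.items():
            if re.fullmatch(pattern, model):
                return backend
        raise ValueError(f"Model {model} not found.")


# Assumed pattern lists, for illustration only.
PATTERNS_WITHOUT_AZURE = [r"openai/.*", r"anthropic/.*"]
PATTERNS_WITH_AZURE = PATTERNS_WITHOUT_AZURE + [r"azure/.*"]


def try_resolve(patterns, model):
    """Resolve `model` against `patterns`; return the backend or the error text."""
    registry = ToyRegistry()
    registry.register("LiteLlm", patterns)
    try:
        return registry.resolve(model)
    except ValueError as exc:
        return str(exc)


print(try_resolve(PATTERNS_WITHOUT_AZURE, "azure/gpt-4o"))
print(try_resolve(PATTERNS_WITH_AZURE, "azure/gpt-4o"))
```

With the first list the lookup falls through to the error; with the second, "azure/gpt-4o" resolves to the LiteLlm backend.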

Steps to Reproduce:
Run an eval on an agent with any LLM-as-a-judge criterion, on any evalset, with judge_model set to azure/gpt-4o.

Expected Behavior:
A clear and concise description of what you expected to happen.

The judge model should resolve so that the LLM-as-a-judge criteria can be evaluated. Instead, the program is unable to find a suitable model because 'azure/gpt-4o' does not match any regex in the _llm_registry_dict.

Observed Behavior:
What actually happened? Include error messages or crash stack traces here.

"2026-01-30 16:50:50,906 - ERROR - local_eval_service.py:357 - Metric evaluation failed for metric final_response_match_v2 for eval case id '[eval case id]' with following error `Model azure/gpt-4o not found.`"
In the table, all criteria that use an LLM-as-a-judge will have the Status "NOT_EVALUATED".

Environment Details:

  • ADK Library Version (pip show google-adk):

Name: google-adk
Version: 1.23.0
Summary: Agent Development Kit
Home-page: https://google.github.io/adk-docs/
Author:
Author-email: Google LLC googleapis-packages@google.com
License:
Location: C:\Projects\IsIT.Agents.Search.venv\Lib\site-packages
Requires: aiosqlite, anyio, authlib, click, fastapi, google-api-python-client, google-auth, google-cloud-aiplatform, google-cloud-bigquery, google-cloud-bigquery-storage, google-cloud-bigtable, google-cloud-discoveryengine, google-cloud-pubsub, google-cloud-secret-manager, google-cloud-spanner, google-cloud-speech, google-cloud-storage, google-genai, graphviz, jsonschema, mcp, opentelemetry-api, opentelemetry-exporter-gcp-logging, opentelemetry-exporter-gcp-monitoring, opentelemetry-exporter-gcp-trace, opentelemetry-exporter-otlp-proto-http, opentelemetry-resourcedetector-gcp, opentelemetry-sdk, pyarrow, pydantic, python-dateutil, python-dotenv, PyYAML, requests, sqlalchemy, sqlalchemy-spanner, starlette, tenacity, typing-extensions, tzlocal, uvicorn, watchdog, websockets
Required-by: Search, toolbox-adk

  • Desktop OS: [e.g., macOS, Linux, Windows]
    Windows
  • Python Version (python -V):
    Python 3.12.10

Model Information:

  • Are you using LiteLLM: Yes/No
    Yes
  • Which model is being used: (e.g., gemini-2.5-pro)
    azure/gpt-4o

🟡 Optional Information

Providing this information greatly speeds up the resolution process.

Regression:
Did this work in a previous version of ADK? If so, which one?
I don't think so.

How often has this issue occurred?:

  • Always (100%)

Labels: models ([Component] Issues related to model support)
