
Conversation


@alexkuzmik alexkuzmik commented Jan 28, 2026

Problem:

Hi! An Opik engineer here :)

Our integration tests caught a serious regression in google-adk after your latest release (1.23.0).

If you're using agents with LiteLLM models, your application might degrade significantly starting from 1.23.0.
Reason: when converting content parts to the litellm format, there is a piece of logic that checks whether a content part from the user contains a payload. If it doesn't, a user message "Handle the requests as specified in the System Instruction." is added to the end of the message list. The function google.adk.models.lite_llm._part_has_payload returns False if the content is just a function response!

So this fallback user message is added to the end of the chat payload after every tool call, and it can affect the agent drastically.
The issue is not consistent; sometimes models ignore this instruction, and sometimes they go into a long and expensive loop where they fixate on "Handle the requests as specified in the System Instruction." Here are the trace screenshots for the simplest weather-time agent, which is supposed to have an "llm -> tool -> llm" structure but instead explodes into a 30-second run that just burns tokens.

"Lite" regression. A few unnecessary tool calls.
[trace screenshot]

Hard case; pay attention to the token consumption and time: 24k tokens and 25 seconds (I've seen cases with 60-70k tokens, so this one is actually quite far from the worst scenario).
[trace screenshot]
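
To make the mechanism concrete, here is an illustrative sketch of the failure mode. This is not the actual ADK code; the real logic lives in google.adk.models.lite_llm and is structured differently, and the names FALLBACK_USER_MESSAGE and append_fallback_if_needed are made up for illustration:

from google.adk.models.lite_llm import _part_has_payload

# Hypothetical names, for illustration only.
FALLBACK_USER_MESSAGE = "Handle the requests as specified in the System Instruction."

def append_fallback_if_needed(messages: list, last_user_part) -> list:
    # Before the fix, _part_has_payload() returned False for a part that held
    # only a function_response, so this branch fired after every tool call and
    # appended the confusing fallback user message to the chat payload.
    if not _part_has_payload(last_user_part):
        messages.append({"role": "user", "content": FALLBACK_USER_MESSAGE})
    return messages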

Solution:

Updated _part_has_payload to respect a function response and treat it as payload too.
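
Roughly, the change makes the check behave like this (a minimal sketch of the intended behavior; the exact set of fields checked in lite_llm.py may differ):

from google.genai import types

def _part_has_payload(part: types.Part) -> bool:
    # A part that only carries a tool/function result now counts as payload,
    # so the fallback user message is no longer appended after tool calls.
    return bool(
        part.text
        or part.inline_data
        or part.file_data
        or part.function_call
        or part.function_response  # new: a function response is payload too
    )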

Testing Plan

Tested manually with OpenAI models, and also looked into the LiteLLM spans directly to make sure this extra instruction is no longer passed to litellm.completion/acompletion.
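
If you want to double-check this without an observability tool, one rough way is to wrap the LiteLLM call and print the outgoing messages. This is a sketch that assumes litellm.acompletion is the call path; depending on how lite_llm imports it, you may need to patch google.adk.models.lite_llm instead of the litellm module:

import litellm

_original_acompletion = litellm.acompletion

async def logged_acompletion(*args, **kwargs):
    # Dump role/content of every outgoing message so an injected
    # "Handle the requests as specified in the System Instruction." is easy to spot.
    for message in kwargs.get("messages", []):
        print(message.get("role"), "->", str(message.get("content"))[:100])
    return await _original_acompletion(*args, **kwargs)

litellm.acompletion = logged_acompletion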

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Too many warnings at the end, but all 4,027 unit tests passed :)
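
For reference, the added check boils down to something like this (illustrative only; the actual test in test_litellm.py may be structured differently):

from google.genai import types
from google.adk.models.lite_llm import _part_has_payload


def test_part_has_payload_accepts_function_response():
    # A part holding only a tool result should count as payload.
    part = types.Part(
        function_response=types.FunctionResponse(
            name="get_weather",
            response={"status": "success", "report": "Sunny, 25 degrees Celsius."},
        )
    )
    assert _part_has_payload(part)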

Manual End-to-End (E2E) Tests:
I used the script below, but I suppose any other setup might reproduce it as well with a higher or lower chance, depending on the model/system instruction/tools/input message. OpenAI gpt-4o-mini fails most of the time; gpt-5-nano seems smarter and more resilient to this "last message attack". I'm not sure about other providers, but I don't think it matters.
(The script contains Opik logic; feel free to remove it if you're using another observability tool.)

import os
import datetime
import uuid
from zoneinfo import ZoneInfo
from typing import Optional, AsyncIterator, Union

import opik

from google.adk.agents import Agent
from google.adk import agents as adk_agents
from google.adk import runners as adk_runners
from google.adk import sessions as adk_sessions
from google.adk.events import Event
from google.genai import types

from opik.integrations.adk import OpikTracer, track_adk_agent_recursive


MODEL_NAME = "openai/gpt-4o-mini"

def get_weather(city: str) -> dict:
    if city.lower() == "new york":
        return {
            "status": "success",
            "report": (
                "The weather in New York is sunny with a temperature of 25 degrees"
                " Celsius (41 degrees Fahrenheit)."
            ),
        }
    else:
        return {
            "status": "error",
            "error_message": f"Weather information for '{city}' is not available.",
        }


def get_current_time(city: str) -> dict:
    if city.lower() == "new york":
        tz_identifier = "America/New_York"
    else:
        return {
            "status": "error",
            "error_message": (f"Sorry, I don't have timezone information for {city}."),
        }

    tz = ZoneInfo(tz_identifier)
    now = datetime.datetime.now(tz)
    report = f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'
    return {"status": "success", "report": report}


async def async_build_runner(
    root_agent: Union[adk_agents.Agent, adk_agents.SequentialAgent],
    session_id: str,
) -> adk_runners.Runner:
    session_service = adk_sessions.InMemorySessionService()
    await session_service.create_session(
        app_name="ADK_app", user_id="ADK_test_user", session_id=session_id
    )
    runner = adk_runners.Runner(
        agent=root_agent, app_name="ADK_app", session_service=session_service
    )
    return runner


async def async_extract_final_response_text(
    events_generator: AsyncIterator[Event],
) -> Optional[str]:
    collected_events = []
    async for event in events_generator:
        collected_events.append(event)

    if len(collected_events) == 0:
        raise Exception("Agent failed to execute.")

    last_event = collected_events[-1]
    assert (
        last_event.is_final_response()
        and last_event.content
        and last_event.content.parts
    )
    return last_event.content.parts[0].text


async def main():
    root_agent = Agent(
        name="openai_weather_time_agent",
        model=MODEL_NAME,
        description=("Agent to answer questions about the time and weather in a city."),
        instruction=("I can answer your questions about the time and weather in a city."),
        tools=[get_weather, get_current_time],
    )

    opik_tracer = OpikTracer()
    track_adk_agent_recursive(root_agent, opik_tracer)

    session_id = "ADK_" + str(uuid.uuid4())
    runner = await async_build_runner(root_agent, session_id)
   
    events = runner.run_async(
        user_id="ADK_test_user",
        session_id=session_id,
        new_message=types.Content(
            role="user",
            parts=[types.Part(text="What's the weather in New York?")],
        ),
    )
    final_response = await async_extract_final_response_text(events)
    print(f"Response: {final_response}")


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
    opik.flush_tracker()
    print("Done")

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

alexkuzmik and others added 3 commits January 28, 2026 16:55
…-tool-response-content-for-litellm-models

Make _part_has_payload respect function response

google-cla bot commented Jan 28, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist

Summary of Changes

Hello @alexkuzmik, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical bug affecting agents utilizing LiteLLM models in google-adk. The issue stemmed from an incorrect payload detection mechanism for function responses, which inadvertently injected a misleading instruction into the agent's message history. This led to agents consuming excessive tokens and entering inefficient processing loops. The fix ensures that function responses are properly recognized as valid content, thereby preventing the erroneous instruction from being passed to the models and restoring expected agent behavior.

Highlights

  • Bug Fix: Agent Degradation with LiteLLM: Addresses a critical regression introduced in google-adk version 1.23.0, where agents using LiteLLM models could degrade significantly due to incorrect message handling.
  • Incorrect Payload Detection: The _part_has_payload function incorrectly returned False for content parts that were function responses, leading to an unintended fallback user message being added to the chat payload.
  • Impact on Agents: This erroneous message ('Handle the requests as specified in the System Instruction.') caused agents to enter long, expensive loops, burning tokens and increasing run times, as illustrated by the provided trace screenshots.
  • Solution: Updated Payload Logic: The _part_has_payload function has been updated to correctly identify function_response as a valid payload, preventing the addition of the problematic fallback message.
  • New Unit Test: A new unit test case has been added to test_litellm.py to specifically verify that _part_has_payload correctly handles function_response parts.


@adk-bot adk-bot added the models [Component] Issues related to model support label Jan 28, 2026

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a critical bug that could cause agent runs to enter expensive, token-burning loops. The fix correctly identifies function_response as a valid payload within a content part, preventing an erroneous fallback message from being added after tool calls. The change in src/google/adk/models/lite_llm.py is correct and effectively resolves the issue.

While a unit test was added, it was placed in a test suite that doesn't validate the actual bug fix. I've provided a detailed comment with a suggestion for a new test function that will properly ensure this regression doesn't occur in the future. Once the test is corrected, this PR should be good to merge.

@ryanaiagent ryanaiagent self-assigned this Jan 28, 2026
@ryanaiagent
Collaborator

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a critical bug that could cause agent runs to enter expensive, infinite loops when using LiteLLM models. The pull request description clearly explains the problem with illustrative examples, which is very helpful for understanding the impact of the bug. The root cause was that a function_response was not being correctly identified as a payload, which triggered the addition of a fallback message that confused the agent.

The fix is simple and effective: _part_has_payload is updated to correctly recognize function_response as a valid payload. The accompanying unit test is well-written and thoroughly validates that the fallback message is no longer added in this scenario.

Overall, this is an excellent contribution that resolves a significant issue. The code changes are clean, targeted, and well-tested.

@ryanaiagent ryanaiagent added the needs review [Status] The PR/issue is awaiting review from the maintainer label Jan 28, 2026
@ryanaiagent
Collaborator

Hi @alexkuzmik, thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.

