refactor(llm): use litellm params for reasoning support#1990
Conversation
Derive reasoning_effort support from get_supported_openai_params and update Nova 2 Lite expectations in tests. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
🟡 Acceptable - Works but introduces risks without adequate safeguards
Core Trade-off: You replaced a simple, explicit list with a runtime dependency on LiteLLM. This is DRY and reduces maintenance, but sacrifices control and introduces failure modes. The old approach was "worse is better" - ugly but bulletproof. The new approach is elegant but brittle.
Verdict: ❌ Needs rework - The bare except Exception: is unacceptable, cache is unbounded, and error paths are untested.
Key Insight: Dynamic lookups are only better than static lists when the external source is more reliable than you are—prove that LiteLLM is.
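A minimal sketch of the safeguards this verdict asks for (bounded cache, narrowed exception handling); the function name and the chosen exception type are assumptions, not the PR's actual code:

```python
# Illustrative only: the name and the exception choice are assumptions, not the PR's code.
from functools import lru_cache

import litellm
from litellm.exceptions import BadRequestError


@lru_cache(maxsize=128)  # bounded cache instead of an unbounded dict
def supports_reasoning_effort(model: str) -> bool:
    """True if LiteLLM reports reasoning_effort support for this model."""
    try:
        params = litellm.get_supported_openai_params(model=model) or []
    except BadRequestError:
        # Unmapped model/provider: treat as unsupported rather than
        # swallowing every exception with a bare `except Exception:`.
        return False
    return "reasoning_effort" in params
```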
all-hands-bot
left a comment
Additional Critical Issues
Following up on my initial review with remaining inline comments:
all-hands-bot
left a comment
🟠 Testing Gaps - Critical Missing Coverage
You added complex logic (prefix stripping, provider splitting, exception handling, LRU cache) but only test the happy path through get_features().
Missing test coverage for `_supports_reasoning_effort()`:
- Error handling: what happens when `get_supported_openai_params` raises an exception? Your bare `except Exception:` hides this.
- LRU cache behavior: does it actually work? Test: call the same model twice and verify only one LiteLLM call is made.
- Edge cases:
  - Empty string: `_supports_reasoning_effort("")`
  - None: `_supports_reasoning_effort(None)`
  - Whitespace-only: `_supports_reasoning_effort(" ")`
  - Malformed provider: `_supports_reasoning_effort("a/b/c/d")`
- Prefix handling: does `litellm_proxy/` prefix stripping work correctly? `_supports_reasoning_effort("litellm_proxy/o1")`, `_supports_reasoning_effort("litellm_proxy/openai/o1")`
- Custom provider splitting: does it handle multiple slashes correctly?
Tests should prove the code works, not just assert it was called. Add unit tests for _supports_reasoning_effort directly (yes, it's private, but it has complex logic that deserves dedicated tests).
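A sketch of such direct unit tests; the import path is a placeholder, and the patch target assumes the helper is decorated with `functools.lru_cache` and calls `litellm.get_supported_openai_params` at call time:

```python
# Sketch only: the helper's import path is hypothetical, and the patch target
# assumes the helper looks up litellm.get_supported_openai_params at call time.
from unittest.mock import patch

import pytest

from openhands.sdk.llm.options import _supports_reasoning_effort  # hypothetical path


@pytest.mark.parametrize("model", ["", "   ", "a/b/c/d", "litellm_proxy/o1"])
def test_edge_cases_return_a_bool(model):
    # Edge cases should resolve to a bool and never propagate an exception.
    assert isinstance(_supports_reasoning_effort(model), bool)


def test_litellm_failure_falls_back_to_false():
    with patch(
        "litellm.get_supported_openai_params", side_effect=RuntimeError("boom")
    ):
        assert _supports_reasoning_effort("openai/o1") is False


def test_lookup_is_cached():
    _supports_reasoning_effort.cache_clear()  # assumes functools.lru_cache
    with patch(
        "litellm.get_supported_openai_params", return_value=["reasoning_effort"]
    ) as mocked:
        assert _supports_reasoning_effort("o1") is True
        assert _supports_reasoning_effort("o1") is True
    assert mocked.call_count == 1  # second call served from the cache
```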
Bottom line: Without these tests, you're flying blind. When LiteLLM changes or edge cases hit production, you'll have no safety net.
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
@OpenHands The goal of this PR is good, but the implementation is bad. Remove the cached thing. LiteLLM has a supports_reasoning method or something named like that, right? Make the project with uv, find the litellm codebase in .venv, investigate. Let's use that for detecting which models support reasoning. Commit, run pre-commit, push.
Final summary (new changes since the last PR state)
Checklist against the request
Conciseness / extraneous changes check
Branch is clean and fully pushed (no pending local changes).
HUMAN: So my problem here with your code is this: I think litellm does more complex work to find out what the real provider is. We always send litellm the "full model name", which includes the provider, so litellm figures it out; we never extracted the provider ourselves. Why it matters: apart from the obvious nice cases, there are cases like "openrouter/anthropic/sonnet-4-5", or "bedrock/anthropic/haiku-3-5:v1", or "openai/llama-4" (which is NOT OpenAI, but an OpenAI-compatible provider, including custom, local providers). Litellm knows how to deal with them and extract the provider correctly. Investigate its code. Find out if litellm has some method we can call so that it parses the provider from our model, instead of us duplicating that logic ourselves.

ANSWER: I investigated LiteLLM 1.80.10 (the version pinned in this repo).

What LiteLLM uses internally
Source (from LiteLLM's internals):

```python
model, custom_llm_provider, _, _ = litellm.get_llm_provider(
    model=model, custom_llm_provider=custom_llm_provider
)
```

So the intended public entrypoint for parsing a "full model name" is `litellm.get_llm_provider(model=...)`.

It returns a tuple of (model, custom_llm_provider, dynamic_api_key, api_base), and the implementation contains a lot of special-case logic beyond a naive split on "/".
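For illustration (not part of the original comment), this is how that entrypoint behaves when called directly on the full model strings discussed above; the exact resolved names depend on the installed LiteLLM version:

```python
# Illustrative use of litellm.get_llm_provider on "full model name" strings;
# outputs depend on the installed LiteLLM version and its model registry.
import litellm

for name in [
    "openrouter/anthropic/sonnet-4-5",
    "bedrock/anthropic/haiku-3-5:v1",
    "openai/llama-4",
]:
    model, provider, _, _ = litellm.get_llm_provider(model=name)
    print(f"{name!r}: provider={provider!r}, model={model!r}")
```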
Implication for our code

We should not pre-split the provider ourselves just to call `supports_reasoning`. Instead we can pass the full model string (after stripping our own `litellm_proxy/` prefix) and let LiteLLM do the parsing.

LiteLLM will internally call `get_llm_provider` to resolve the real provider.

About the examples you mentioned

Your examples like:
- "openrouter/anthropic/sonnet-4-5"
- "bedrock/anthropic/haiku-3-5:v1"
- "openai/llama-4"

are exactly the kind of thing `get_llm_provider` is built to handle.

(As a side note: when I ran
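To make the implication concrete, a minimal sketch (not the PR's code) that passes the full model string straight to LiteLLM; it assumes `litellm.supports_reasoning` exists as discussed earlier in the thread and that `litellm_proxy/` is the only prefix we add ourselves:

```python
# Sketch under the assumptions above: LiteLLM resolves the provider internally
# (via get_llm_provider), so we only strip our own routing prefix.
import litellm
from litellm.exceptions import BadRequestError


def model_supports_reasoning(full_model: str) -> bool:
    model = full_model.removeprefix("litellm_proxy/")
    try:
        return bool(litellm.supports_reasoning(model=model))
    except BadRequestError:
        # Defensive: models LiteLLM cannot map read as "no reasoning support".
        return False


print(model_supports_reasoning("openrouter/anthropic/sonnet-4-5"))
```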
@OpenHands CI is failing, please fix. Make sure you read your AGENTS.md to use uv correctly!
I'm on it! enyst can track my progress at all-hands.dev
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
…openhands <openhands@all-hands.dev>
Final summary (new changes since my last summary)
Double-check: request + instructions
What changed (per git diff)
Validation performed
Commit / push
… <openhands@all-hands.dev>
Summary
Testing
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.12-nodejs22
- golang:1.21-bookworm

Pull (multi-arch manifest)
```bash
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:52bc1a0-python
```

Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. 52bc1a0-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. 52bc1a0-python-amd64) are also available if needed