feat: add support for input overrides for multimodal evals #1101

AAgnihotry · 2026-01-13T20:36:26Z

Summary

Added support for input overrides in multimodal evaluations
Extended _ConfigurableFactory to handle file path overrides for multimodal inputs (images, audio, video, documents)
Enhanced _Runtime to resolve and validate file paths during eval execution
Added comprehensive unit and E2E tests to verify override functionality
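
To illustrate what "resolve and validate file paths" could look like, here is a minimal sketch. It is not the code from this PR: the function name `resolve_override_path`, the `base_dir` parameter, and the rule that relative paths are resolved against `base_dir` are all assumptions made for the example.

```python
from pathlib import Path


def resolve_override_path(raw_path: str, base_dir: Path) -> Path:
    """Resolve an override file path and check that it points to a real file.

    Hypothetical helper: the name and behavior are assumptions, not the PR's API.
    """
    candidate = Path(raw_path).expanduser()
    if not candidate.is_absolute():
        # Assumption: relative override paths are relative to base_dir.
        candidate = base_dir / candidate
    resolved = candidate.resolve()
    if not resolved.is_file():
        # Fail early so a bad override surfaces before the eval runs.
        raise FileNotFoundError(f"Override file not found: {resolved}")
    return resolved
```

Validating up front, rather than when the agent first opens the file, keeps a mistyped override from showing up as a confusing mid-run failure.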

Changes

src/uipath/_cli/_evals/_configurable_factory.py (+70 lines): Added _resolve_file_path_overrides method to handle multimodal input overrides
src/uipath/_cli/_evals/_runtime.py (+19 lines): Integrated file path resolution into the runtime execution flow
src/uipath/_cli/cli_eval.py (+10 lines): Extended CLI to support input overrides parameter
tests/cli/eval/test_configurable_factory.py (+157 lines): Unit tests for file path override logic
tests/cli/eval/test_input_overrides_e2e.py (+310 lines, new): E2E tests covering various override scenarios

Test plan

Unit tests pass for _ConfigurableFactory override resolution
E2E tests validate end-to-end override behavior with actual files
All linting, formatting, and type checks pass
Package builds successfully

🤖 Generated with Claude Code

mjnovice · 2026-01-13T22:21:45Z

LGTM. Can we add some E2E test cases as well, please?

Here is an example: 32cabaf

src/uipath/_cli/_evals/_configurable_factory.py

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Updated the input override deep merge logic to be truly recursive, handling nested dictionaries at any depth. Previously it merged only one level deep. Added 9 new tests covering edge cases, including multiple nesting levels, type replacements, empty dicts, lists, None values, and immutability guarantees.

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Fixed two mypy errors:

1. Added type parameters to deep_merge function signature
   - Changed from `def deep_merge(base: dict, override: dict) -> dict:`
   - To: `def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:`
2. Added type annotation and import to test file
   - Added `from typing import Any` import
   - Added explicit type annotation to original_inputs variable

All mypy checks now pass successfully.

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
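
The recursive merge described in these commits can be sketched as follows. The signature is the typed one quoted above. The body is an assumption: in particular, the choice that lists, None values, and type changes replace the base value wholesale is a guess at the intended semantics, not something the commit messages confirm.

```python
import copy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with `override` merged into `base` at any depth."""
    merged = copy.deepcopy(base)  # work on a copy so the caller's dict is untouched
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: recurse so sibling keys in base survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Assumption: scalars, lists, None, and type changes replace wholesale.
            merged[key] = copy.deepcopy(value)
    return merged


# Usage: only the overridden leaf changes; "mime" is kept from the base.
inputs = {"image": {"path": "a.png", "mime": "image/png"}}
merged = deep_merge(inputs, {"image": {"path": "b.png"}})
```

A one-level merge would have replaced the whole `image` dict and dropped `mime`, which is the behavior the recursive version is meant to avoid.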

…s parameter

- Created new _eval_util.py with apply_input_overrides() and deep_merge() functions
- Moved input override logic from ConfigurableRuntimeFactory to standalone utility
- Added input_overrides as optional parameter to execute_runtime() method
- Removed _configure_input_overrides() method and factory-based override storage
- Updated all tests to use utility function directly
- ConfigurableRuntimeFactory now focuses only on model settings overrides

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
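
The refactor above moves override handling out of the factory and into a plain function that the runtime calls with an optional argument. A rough sketch of that shape follows. Only the names `apply_input_overrides` and `execute_runtime` and the optional `input_overrides` parameter come from the commit message. The signatures, the `run` callable, the synchronous free-function form of `execute_runtime`, and the private `_merge` helper are assumptions for illustration.

```python
import copy
from typing import Any, Callable, Optional


def _merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical stand-in for the deep_merge() utility named in the commit.
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = _merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_input_overrides(
    inputs: dict[str, Any],
    input_overrides: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a copy of `inputs` with any overrides merged in."""
    if not input_overrides:
        return copy.deepcopy(inputs)  # nothing to override
    return copy.deepcopy(_merge(inputs, input_overrides))


def execute_runtime(
    run: Callable[[dict[str, Any]], Any],
    inputs: dict[str, Any],
    input_overrides: Optional[dict[str, Any]] = None,
) -> Any:
    # Overrides are passed per call instead of being stored on a factory.
    return run(apply_input_overrides(inputs, input_overrides))
```

Passing overrides as a call argument keeps the factory free of per-run state, which matches the stated goal of having ConfigurableRuntimeFactory handle only model settings overrides.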

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Made the changes.

testcases/eval-input-overrides/run.sh

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Added /add-integration-test command to help create integration tests
- Follows the pattern established in PR #1101 (eval-input-overrides)
- Bumped version from 2.4.21 to 2.4.22

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Added /add-integration-test command following PR #1101 pattern. This skill helps create new integration test cases in the testcases/ directory, following the established structure with:

- run.sh execution script
- pyproject.toml dependencies
- entry-points.json and uipath.json configuration
- src/ directory with eval sets and assert.py

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

github-actions bot added the test:uipath-langchain (triggers tests in the uipath-langchain-python repository) and test:uipath-llamaindex (triggers tests in the uipath-llamaindex-python repository) labels Jan 13, 2026

AAgnihotry force-pushed the feat/multiModalEvals branch from cd8bf89 to 9533f66 January 13, 2026 20:58

mjnovice approved these changes Jan 13, 2026

AAgnihotry force-pushed the feat/multiModalEvals branch from 4281ea4 to 452d38c January 13, 2026 22:48

cristipufu reviewed Jan 14, 2026

src/uipath/_cli/_evals/_configurable_factory.py (outdated, resolved)

cristipufu previously requested changes Jan 14, 2026

src/uipath/_cli/_evals/_configurable_factory.py (outdated, resolved)

AAgnihotry force-pushed the feat/multiModalEvals branch from 4ee8835 to 5aafbbb January 14, 2026 20:10

AAgnihotry and others added 7 commits January 14, 2026 12:20

feat: add support for input overrides for multimodal evals

a7d8280

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

fix: add integration testcase for eval input overrides with calculator

d959865

fix: add missing import for apply_input_overrides in runtime

a681598

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

fix: remove unused factory variables from E2E tests

82244cb

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

AAgnihotry force-pushed the feat/multiModalEvals branch from f3a013c to 82244cb January 14, 2026 20:20

AAgnihotry requested a review from cristipufu January 14, 2026 21:16

Chibionos approved these changes Jan 14, 2026

testcases/eval-input-overrides/run.sh (outdated, resolved)

AAgnihotry and others added 3 commits January 14, 2026 13:29

fix: remove unnecessary echo from integration test script

0dbdaaf

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

AAgnihotry merged commit 34511d9 into main Jan 14, 2026
157 of 159 checks passed

AAgnihotry deleted the feat/multiModalEvals branch January 14, 2026 21:58

4 participants
