@weiyuanyue weiyuanyue commented Dec 12, 2025

This PR fixes a production-breaking bug that blocked model downloads from Foundry Local in AIDG after upgrading to Foundry Local v0.8.x+.
We restore functionality and improve robustness by migrating from custom HTTP calls to the official Microsoft.AI.Foundry.Local.WinML SDK.
Bottom line: Users can again download and prepare models reliably, and the system is significantly more resilient to future upstream API changes.

Background

In July, Foundry Local changed the string format of a critical Name field in its HTTP API.
The change was handled internally by Foundry Local but not communicated externally, causing silent incompatibility with AIDG’s direct HTTP API calls.
After upgrading to v0.8.x, AIDG’s download workflow broke: calls failed and downstream features relying on those models were effectively blocked.

Impact: the breakage disrupts the core user flow (Foundry model download → prepare → chat/inference). Left unfixed, it would stall developer productivity, increase support tickets, and erode trust in AIDG’s stability.

Decision

To eliminate this class of failures and avoid future format drift, we migrate to the official SDK:

  • The SDK shields us from low-level API format changes.
  • The SDK provides supported, versioned, and stable integration points.
  • We gain safer concurrency control, simpler code, and clearer error handling.

Changes

Architecture Improvements

  • Replaced the custom HTTP client implementation with the official SDK's FoundryLocalManager API
  • Eliminated the CLI dependency: Removed command-line invocation of the foundry executable in favor of native SDK integration
  • Improved concurrency control: Implemented thread-safe model preparation with semaphore-based locking to prevent race conditions

Code Simplification

  • Reduced codebase by ~182 net lines (421 deletions, 239 insertions)
  • Removed 3 obsolete utility classes:
    • FoundryServiceManager - API-based service orchestration
    • FoundryJsonContext - custom JSON serialization contexts
    • Utils - process execution wrappers
  • Simplified FoundryCatalogModel: Reduced from 100+ lines to 21 lines by leveraging SDK types

Enhanced Functionality

  • Added EnsureModelReadyAsync(): Explicit model preparation API to prevent deadlocks when accessing IChatClient
  • Multi-variant support: Models are now grouped by alias, enabling automatic selection of optimal variant per device
  • Background model loading: Cached models are prepared asynchronously during initialization for a faster first use
  • Improved error handling: Better exception messages and graceful degradation
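
The alias-based grouping described above can be illustrated with a short sketch. This is not the code from the PR: ModelVariant, DeviceKind, VariantSelector, and the preference order are hypothetical names and assumptions made for illustration, and the SDK's own catalog types are deliberately not used here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical device categories; the real SDK may expose different values.
public enum DeviceKind { Cpu, Gpu, Npu }

// Hypothetical stand-in for a catalog entry: several variants share one alias.
public sealed record ModelVariant(string Alias, string Id, DeviceKind Device);

public static class VariantSelector
{
    // Groups variants by alias and keeps, for each alias, the variant whose
    // device appears earliest in the caller-supplied preference order.
    // Variants whose device is not in the preference list are ignored.
    public static Dictionary<string, ModelVariant> PickPerAlias(
        IEnumerable<ModelVariant> variants,
        DeviceKind[] preference)
    {
        return variants
            .Where(v => Array.IndexOf(preference, v.Device) >= 0)
            .GroupBy(v => v.Alias, StringComparer.OrdinalIgnoreCase)
            .ToDictionary(
                g => g.Key,
                g => g.OrderBy(v => Array.IndexOf(preference, v.Device)).First(),
                StringComparer.OrdinalIgnoreCase);
    }
}
```

A caller would pass the full variant list and an order such as NPU, then GPU, then CPU (an assumed order, not one stated in this PR), and show one entry per alias in the UI while keeping the selected variant's Id for download.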

Technical Details

  • Model preparation caching: Prepared models are stored in an in-memory dictionary to avoid redundant initialization
  • Double-checked locking pattern: Ensures thread-safe, single preparation per model alias
  • Dynamic service URL resolution: Web service starts on-demand with dynamic port allocation
  • Null-safe runtime handling: Added checks for nullable Runtime properties to prevent UI crashes
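
For reviewers unfamiliar with the pattern, the caching and double-checked locking described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation in this PR: ModelPreparationCache and the prepareAsync delegate are hypothetical, and the method name EnsureModelReadyAsync is borrowed from the description with an assumed signature. Only standard .NET types (ConcurrentDictionary, SemaphoreSlim) are used.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; class name, delegate, and signature are assumptions.
public sealed class ModelPreparationCache<TModel> : IDisposable
{
    // In-memory cache of prepared models, keyed by model alias.
    private readonly ConcurrentDictionary<string, TModel> _prepared =
        new(StringComparer.OrdinalIgnoreCase);

    // Single-entry semaphore: an async-friendly replacement for lock().
    private readonly SemaphoreSlim _gate = new(1, 1);

    // Caller-supplied preparation routine (download, load, and so on).
    private readonly Func<string, CancellationToken, Task<TModel>> _prepareAsync;

    public ModelPreparationCache(Func<string, CancellationToken, Task<TModel>> prepareAsync)
        => _prepareAsync = prepareAsync ?? throw new ArgumentNullException(nameof(prepareAsync));

    public async Task<TModel> EnsureModelReadyAsync(string alias, CancellationToken ct = default)
    {
        // First check: fast path without taking the semaphore.
        if (_prepared.TryGetValue(alias, out var ready))
            return ready;

        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            // Second check: another caller may have prepared the model
            // while this one was waiting for the semaphore.
            if (_prepared.TryGetValue(alias, out ready))
                return ready;

            var model = await _prepareAsync(alias, ct).ConfigureAwait(false);
            _prepared[alias] = model;
            return model;
        }
        finally
        {
            _gate.Release();
        }
    }

    public void Dispose() => _gate.Dispose();
}
```

One global semaphore serializes preparation across all aliases, which keeps the sketch simple; a per-alias semaphore would allow different models to prepare in parallel at the cost of more bookkeeping.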

Dependencies

  • Added: Microsoft.AI.Foundry.Local.WinML v0.8.2.1
  • Updated: Microsoft.ML.OnnxRuntimeGenAI packages from v0.10.1 → v0.11.4

Testing

  • Verify model download and caching functionality
  • Validate IChatClient creation after EnsureModelReadyAsync()
  • Confirm UI correctly displays EP variants
  • Test on x64 architectures
  • Test on ARM64 architectures
  • Test on Foundry Local 0.6.x
  • Test on Foundry Local 0.8.x+

IDisposableAnalyzers Build Errors

This PR introduces the Microsoft.AI.Foundry.Local.WinML (0.8.2.1) package as a new dependency. The package brings in IDisposableAnalyzers (4.0.8) as a transitive dependency, a code analyzer that enforces correct use of the IDisposable pattern.

The main branch does not reference this package, so the analyzer never runs there and these errors do not appear. On this branch the analyzer runs automatically during the build and reports 237+ violations across the codebase, all related to improper disposal of IDisposable objects.

Temporary Solution

To unblock the build while keeping the scope of this PR focused on Foundry Local integration, these analyzer diagnostics have been temporarily suppressed:

  • Directory.Build.props: Added NoWarn configuration with TODO comment indicating these issues will be addressed in a follow-up PR
  • AIDevGallery.csproj & AIDevGallery.UnitTests.csproj: Added all IDISP error codes to the NoWarn list

Next Steps

A separate PR should be created to properly address these IDisposable violations by:

  • Implementing proper disposal patterns for objects holding unmanaged resources
  • Adding IDisposable implementation to classes with disposable members
  • Wrapping short-lived disposable objects in using statements where appropriate
  • Ensuring proper resource cleanup throughout the codebase

@weiyuanyue weiyuanyue marked this pull request as ready for review December 12, 2025 12:42
@weiyuanyue weiyuanyue requested a review from a team as a code owner December 12, 2025 14:24
@weiyuanyue weiyuanyue changed the title from "Migrate FoundryLocal integration to official Microsoft.AI.Foundry.Local SDK" to "Migrate FoundryLocal integration to official Microsoft.AI.Foundry.Local.WinML SDK" Dec 13, 2025
@weiyuanyue weiyuanyue changed the title from "Migrate FoundryLocal integration to official Microsoft.AI.Foundry.Local.WinML SDK" to "[Fix][Refactor]Migrate FoundryLocal integration to official Microsoft.AI.Foundry.Local.WinML SDK" Dec 13, 2025