feat: migrate from Granite 3 to Granite 4 hybrid models #357

planetf1 · 2026-01-26T12:05:48Z

Migrate from Granite 3.x to Granite 4.0 Models

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

Link to Issue: Closes #344 (feat: Update tests & examples to use granite4)

Summary

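This PR migrates Mellea from Granite 3.x to Granite 4.0 hybrid models across backends, tests, and documentation. There are two exceptions, both explained below: the HuggingFace tests remain on Granite 3.3 because the required adapters are not yet available for Granite 4, and the Ollama vision tests remain on granite3.2-vision.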

Changes

Model Definitions (`mellea/backends/model_ids.py`)

Added Granite 4 hybrid model identifiers:
- IBM_GRANITE_4_HYBRID_MICRO (granite-4.0-h-micro)
- IBM_GRANITE_4_HYBRID_TINY (granite-4.0-h-tiny)
- IBM_GRANITE_4_HYBRID_SMALL (granite-4-h-small)
Restored IBM_GRANITE_4_MICRO_3B with per-backend model selection (Ollama: MICRO, Watsonx: SMALL)
Marked Granite 3 models as deprecated (kept for backward compatibility)
Added vision model: IBM_GRANITE_3_3_VISION_2B
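
The per-backend selection for IBM_GRANITE_4_MICRO_3B can be pictured with a small sketch. The `ModelIdentifier` class below is a simplified stand-in written for illustration, not a copy of the class in `mellea/backends/model_ids.py`; its field names and the `name_for` helper are assumptions. The model names are the ones quoted in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelIdentifier:
    """Simplified stand-in: one logical model, one optional name per backend."""

    ollama_name: Optional[str] = None
    watsonx_name: Optional[str] = None

    def name_for(self, backend: str) -> str:
        """Return the backend-specific name, or raise if none is defined."""
        names = {"ollama": self.ollama_name, "watsonx": self.watsonx_name}
        if backend not in names:
            raise ValueError(f"unknown backend: {backend!r}")
        name = names[backend]
        if name is None:
            raise ValueError(f"no {backend} name defined for this model")
        return name


# Hybrid micro: small enough for CI, but has no Watsonx name (per this PR).
IBM_GRANITE_4_HYBRID_MICRO = ModelIdentifier(ollama_name="granite4:micro-h")

# Hybrid small: available on Watsonx, too large for the CI runners under Ollama.
IBM_GRANITE_4_HYBRID_SMALL = ModelIdentifier(
    ollama_name="granite4:small-h",
    watsonx_name="granite-4-h-small",
)

# Restored constant: MICRO for Ollama, SMALL for Watsonx.
IBM_GRANITE_4_MICRO_3B = ModelIdentifier(
    ollama_name="granite4:micro-h",
    watsonx_name="granite-4-h-small",
)
```

With this shape, code that asks for the Ollama name gets the micro model while code that asks for the Watsonx name gets the small one, so a single default constant can serve both CI and Watsonx users.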

Backend Updates

WatsonxAIBackend: Default model → IBM_GRANITE_4_HYBRID_SMALL
All other backends: Use Granite 4 hybrid models in tests (except HuggingFace, which remains on Granite 3.3)

Test Updates (19 files)

✅ Migrated to Granite 4:

test/backends/test_watsonx.py
test/backends/test_ollama.py
test/backends/test_litellm_*.py (3 files)
test/backends/test_vllm*.py (2 files)
test/stdlib/components/*.py (8 files)
test/stdlib/requirements/*.py (3 files)

⚠️ Remains on Granite 3.3:

test/backends/test_huggingface.py - See "HuggingFace Test Exception" below

⚠️ Remains on Granite 3.2:

test/backends/test_vision_ollama.py - See "Vision Model Exception" below

Documentation Updates

docs/tutorial.md: Updated all examples to Granite 4
docs/alora.md: Updated training examples, added note about non-hybrid models for adapter training
docs/examples/*.py: Updated all example scripts

Test Infrastructure

Removed 48GB memory markers (Granite 4 micro models require ~16GB)
Fixed CI memory constraints by using MICRO models for Ollama tests
Restored per-backend model selection for IBM_GRANITE_4_MICRO_3B (matches upstream pattern)

HuggingFace Test Exception

HuggingFace tests remain on Granite 3.3 due to missing aLoRA adapters for Granite 4.

The HF tests require the requirement_check intrinsic adapter, which is only available for Granite 3.x models in ibm-granite/rag-intrinsics-lib. While ibm-granite/granite-lib-rag-r1.0 has Granite 4 support for RAG intrinsics (answerability, context_relevance, etc.), the core intrinsics needed for tests are not yet available.

Follow-up Issue: #359 tracks migration once Granite 4 adapters are released.

Vision Model Exception

Vision tests remain on granite3.2-vision due to Ollama compatibility issues.

The ibm/granite3.3-vision:2b model crashes the Ollama server with a segmentation fault (null pointer dereference in the llama runner). The vision tests were reverted to granite3.2-vision, which works reliably.

Follow-up Issue: #360 documents the crash with full stack traces and debugging information.

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Local Testing

# Fast tests (skip LLM quality checks)
uv run pytest -m "not qualitative"

# Full test suite
uv run pytest

Test Results: 204 passed, 6 skipped, 69 deselected, 1 xpassed

CI Testing

All tests pass in CI with CICD=1 (skips qualitative markers).
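
For readers unfamiliar with this kind of switch, here is a minimal sketch of how an environment variable can turn off marked tests in a `conftest.py`. This PR does not show how Mellea implements it, so the helper name `running_in_ci` and the exact comparison against `"1"` are assumptions; `pytest_collection_modifyitems`, `get_closest_marker`, and `add_marker` are standard pytest APIs.

```python
import os
from typing import Mapping


def running_in_ci(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the CICD variable is set to "1" (assumed convention)."""
    return env.get("CICD") == "1"


def pytest_collection_modifyitems(config, items):
    """pytest hook: skip every test marked `qualitative` when running in CI."""
    if not running_in_ci():
        return

    # Imported lazily so running_in_ci stays importable without pytest.
    import pytest

    skip = pytest.mark.skip(reason="qualitative test skipped because CICD=1")
    for item in items:
        if item.get_closest_marker("qualitative") is not None:
            item.add_marker(skip)
```

Locally, `uv run pytest -m "not qualitative"` deselects the same tests by marker expression, without needing the environment variable.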

Related Issues

Closes #344, "feat: Update tests & examples to use granite4" (Granite 4 migration)
Follow-up: #359, "Migrate HuggingFace Tests to Granite 4 When Adapters Available" (HuggingFace Granite 4 adapter migration)
Follow-up: #360, "Investigate granite-vision-3.3-2b Ollama Compatibility Issue" (vision model Ollama crash investigation)

github-actions · 2026-01-26T12:06:01Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

mergify · 2026-01-26T12:06:25Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:
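
To check a title against this rule before pushing, the same pattern can be tried locally. This is a sketch that assumes the Mergify `~=` operator behaves like an unanchored regex search (Python's `re.search`); the function name `is_conventional` is made up for the example.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)"
    r"(?:\(.+\))?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


if __name__ == "__main__":
    for title in (
        "feat: migrate from Granite 3 to Granite 4 hybrid models",
        "fix(backends): use micro model in CI",
        "Migrate from Granite 3.x to Granite 4.0 Models",
    ):
        print(f"{is_conventional(title)!s:5}  {title}")
```

The optional `(?:\(.+\))?` group is what allows a scope such as `fix(backends):`, while a title without a recognized type and colon is rejected.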

planetf1 · 2026-01-26T12:47:23Z

Issue with HuggingFace tests

I moved the HuggingFace tests to Granite 4, first the hybrid models and then the regular (non-hybrid) ones. However, the intrinsics repo doesn't have Granite 4 options yet.
Checking the status of intrinsics/aLoRA support, including for hybrid models (which involve extra parameters for Mamba).

planetf1 · 2026-01-26T15:57:37Z

Looking at CI failures...

- Add Granite 4 hybrid model definitions (micro 3B, tiny 7B, small 8B)
- Add Granite 3.3 vision model (no G4 vision available yet)
- Update Watsonx backend default to Granite 4 hybrid small
- Update all tests to use Granite 4 models
- Add HF_TEST_MODEL env var for flexible model selection in HF tests
- Update documentation and examples to use Granite 4
- Maintain backward compatibility with deprecated constants

Closes generative-computing#344

- Keep HuggingFace tests on Granite 3.3 due to missing requirement_check adapter
- Add GRANITE4_ADAPTER_STATUS.md documenting adapter availability analysis
- Update alora.md to recommend non-hybrid models for adapter training
- Clean up model_ids.py comments

The ibm-granite/rag-intrinsics-lib repository only has adapters for Granite 3.3 models. While granite-lib-rag-r1.0 has Granite 4 support for RAG intrinsics (answerability, context_relevance, etc.), the core requirement_check intrinsic used in HF tests is not available for Granite 4. All other backends (Ollama, Watsonx, LiteLLM, vLLM) successfully migrated to Granite 4 hybrid models.

Searched all 180+ ibm-granite repositories on HuggingFace:
- Found 22 standalone adapter repositories, ALL Granite 3.x
- Zero Granite 4 standalone adapters exist
- Only granite-lib-rag-r1.0 has Granite 4 support (RAG intrinsics only)
- requirement_check remains Granite 3.x only in both consolidated and standalone repos

The test uses IBM_GRANITE_4_HYBRID_MICRO (3B model) which only requires ~16GB RAM, not the 48GB+ indicated by requires_heavy_ram marker. Updated comment to reflect actual model being used.

- Remove HF_TEST_MODEL environment variable (unnecessary complexity)
- Use hardcoded granite-3.3-8b-instruct (matches upstream/main)
- Restore requires_heavy_ram marker (8B model needs 48GB+ RAM)
- Simplify docstring to match upstream approach

HuggingFace tests remain on Granite 3.3 because aLoRA adapters for the requirement_check intrinsic are only available for Granite 3.x models.

The granite-vision-3.3-2b model causes Ollama server crashes with 'model runner has unexpectedly stopped' error. Reverting to granite3.2-vision which works reliably. Issue to be created to track granite-vision-3.3-2b Ollama compatibility.

The deprecated constant IBM_GRANITE_4_MICRO_3B is used as the default in start_session(). In upstream it had watsonx_name set, so we need to maintain that for backward compatibility. Map it to IBM_GRANITE_4_HYBRID_SMALL which has watsonx support, rather than MICRO which doesn't. Fixes test_start_session_watsonx which relies on this default.

CI runners have 17.7 GB memory but granite4:small-h requires 18.3 GB. Switch Ollama-based tests to explicitly use IBM_GRANITE_4_HYBRID_MICRO (granite4:micro-h, ~2GB) instead of IBM_GRANITE_4_MICRO_3B which now points to SMALL for watsonx compatibility. This restores the upstream pattern where tests use small models while the deprecated constant maintains backward compatibility for production code that needs watsonx support. Fixes CI OOM failures in test_litellm_ollama, test_openai_ollama, and test_vision_openai.
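
The arithmetic behind this choice is simple enough to sketch. The helper `pick_model` and the lookup table are hypothetical and exist only for this illustration; the memory figures are the approximate ones quoted in the commit note above.

```python
from typing import Optional, Sequence

# Approximate memory footprints in GB, as quoted in the commit note.
MODEL_MEMORY_GB = {
    "granite4:small-h": 18.3,
    "granite4:micro-h": 2.0,
}


def pick_model(available_gb: float, candidates: Sequence[str]) -> Optional[str]:
    """Return the first candidate whose footprint fits in memory, else None."""
    for name in candidates:
        if MODEL_MEMORY_GB[name] <= available_gb:
            return name
    return None


if __name__ == "__main__":
    # A 17.7 GB runner cannot hold the 18.3 GB model, so the micro model is chosen.
    print(pick_model(17.7, ["granite4:small-h", "granite4:micro-h"]))
```

The tests in this PR hardcode the micro model instead of selecting at runtime; the sketch only shows why the small model could not run on the CI runners.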

Restore IBM_GRANITE_4_MICRO_3B as a proper ModelIdentifier with per-backend model selection, matching upstream's pattern:
- Ollama/HF: granite4:micro-h (~2GB, fits in CI)
- Watsonx: granite-4-h-small (required for watsonx support)

This fixes CI memory failures in test_session.py, which uses the start_session() default parameter. The previous approach of aliasing to IBM_GRANITE_4_HYBRID_SMALL broke CI because all backends got the large SMALL model (18.3GB). Upstream had this per-backend selection intentionally to balance CI constraints with watsonx compatibility. This restores that pattern using hybrid models throughout. Fixes test_session_copy_with_context_ops CI failure.

planetf1 force-pushed the feat/issue-344 branch from 8c9543e to d19052b on January 26, 2026 12:11

planetf1 marked this pull request as ready for review January 26, 2026 15:43

planetf1 added 8 commits January 26, 2026 16:41

fix: remove incorrect heavy_ram marker from test_spans.py

c56fb4a

planetf1 force-pushed the feat/issue-344 branch from 8ed5e3a to c16e261 on January 26, 2026 16:42

planetf1 added 3 commits January 26, 2026 16:55

Merge branch 'main' into feat/issue-344

0a93180

Merge branch 'main' into feat/issue-344

1e70336
