feat: migrate from Granite 3 to Granite 4 hybrid models #357
Open
planetf1 wants to merge 11 commits into generative-computing:main from planetf1:feat/issue-344 (base: main)
+204
−94
Conversation
Contributor
The PR description has been updated. Please fill out the template for your PR to be reviewed.
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit
Wonderful, this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
Force-pushed from 8c9543e to d19052b
Contributor (Author)
Issue with HuggingFace tests
Contributor (Author)
Looking at CI failures...
- Add Granite 4 hybrid model definitions (micro 3B, tiny 7B, small 8B)
- Add Granite 3.3 vision model (no G4 vision available yet)
- Update Watsonx backend default to Granite 4 hybrid small
- Update all tests to use Granite 4 models
- Add HF_TEST_MODEL env var for flexible model selection in HF tests
- Update documentation and examples to use Granite 4
- Maintain backward compatibility with deprecated constants

Closes generative-computing#344
- Keep HuggingFace tests on Granite 3.3 due to missing requirement_check adapter
- Add GRANITE4_ADAPTER_STATUS.md documenting adapter availability analysis
- Update alora.md to recommend non-hybrid models for adapter training
- Clean up model_ids.py comments

The ibm-granite/rag-intrinsics-lib repository only has adapters for Granite 3.3 models. While granite-lib-rag-r1.0 has Granite 4 support for RAG intrinsics (answerability, context_relevance, etc.), the core requirement_check intrinsic used in the HF tests is not available for Granite 4. All other backends (Ollama, Watsonx, LiteLLM, vLLM) successfully migrated to Granite 4 hybrid models.
Searched all 180+ ibm-granite repositories on HuggingFace:
- Found 22 standalone adapter repositories, all Granite 3.x
- Zero Granite 4 standalone adapters exist
- Only granite-lib-rag-r1.0 has Granite 4 support (RAG intrinsics only)
- requirement_check remains Granite 3.x only in both consolidated and standalone repos
The test uses IBM_GRANITE_4_HYBRID_MICRO (a 3B model), which requires only ~16 GB of RAM, not the 48 GB+ implied by the requires_heavy_ram marker. Updated the comment to reflect the model actually in use.
- Remove HF_TEST_MODEL environment variable (unnecessary complexity)
- Use hardcoded granite-3.3-8b-instruct (matches upstream/main)
- Restore requires_heavy_ram marker (8B model needs 48 GB+ RAM)
- Simplify docstring to match upstream approach

HuggingFace tests remain on Granite 3.3 because aLoRA adapters for the requirement_check intrinsic are only available for Granite 3.x models.
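As a rough illustration of how a marker like requires_heavy_ram could gate tests on physical memory, here is a minimal, stdlib-only sketch (Mellea's actual marker implementation may differ; the 48 GB threshold comes from the commit message above, and the function names are hypothetical):

```python
import os

HEAVY_RAM_GB = 48  # threshold implied by the requires_heavy_ram marker


def total_ram_gb() -> float:
    """Total physical memory in GiB via POSIX sysconf (no third-party deps)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30


def should_skip_heavy(threshold_gb: float = HEAVY_RAM_GB) -> bool:
    """True when this machine cannot hold the 8B model in memory."""
    return total_ram_gb() < threshold_gb
```

In a pytest suite, `should_skip_heavy()` would typically back a `pytest.mark.skipif` condition so the 8B-model tests are skipped automatically on small CI runners.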
The granite-vision-3.3-2b model causes Ollama server crashes with a 'model runner has unexpectedly stopped' error. Reverting to granite3.2-vision, which works reliably. An issue will be created to track granite-vision-3.3-2b Ollama compatibility.
The deprecated constant IBM_GRANITE_4_MICRO_3B is used as the default in start_session(). Upstream set its watsonx_name, so we need to maintain that for backward compatibility. Map it to IBM_GRANITE_4_HYBRID_SMALL, which has watsonx support, rather than MICRO, which does not. Fixes test_start_session_watsonx, which relies on this default.
CI runners have 17.7 GB of memory, but granite4:small-h requires 18.3 GB. Switch Ollama-based tests to explicitly use IBM_GRANITE_4_HYBRID_MICRO (granite4:micro-h, ~2 GB) instead of IBM_GRANITE_4_MICRO_3B, which now points to SMALL for watsonx compatibility. This restores the upstream pattern: tests use small models, while the deprecated constant maintains backward compatibility for production code that needs watsonx support. Fixes CI OOM failures in test_litellm_ollama, test_openai_ollama, and test_vision_openai.
Force-pushed from 8ed5e3a to c16e261
Restore IBM_GRANITE_4_MICRO_3B as a proper ModelIdentifier with per-backend model selection, matching upstream's pattern:
- Ollama/HF: granite4:micro-h (~2 GB, fits in CI)
- Watsonx: granite-4-h-small (required for watsonx support)

This fixes CI memory failures in test_session.py, which uses the start_session() default parameter. The previous approach of aliasing to IBM_GRANITE_4_HYBRID_SMALL broke CI because every backend got the large SMALL model (18.3 GB). Upstream used per-backend selection intentionally to balance CI constraints with watsonx compatibility; this restores that pattern using hybrid models throughout. Fixes the test_session_copy_with_context_ops CI failure.
Migrate from Granite 3.x to Granite 4.0 Models
Type of PR
Description
Summary
This PR migrates Mellea from Granite 3.x to Granite 4.0 hybrid models across all backends, tests, and documentation. Note: HuggingFace tests remain on Granite 3.3 due to adapter availability constraints (see below).
Changes
Model Definitions (mellea/backends/model_ids.py)
- IBM_GRANITE_4_HYBRID_MICRO (granite-4.0-h-micro)
- IBM_GRANITE_4_HYBRID_TINY (granite-4.0-h-tiny)
- IBM_GRANITE_4_HYBRID_SMALL (granite-4-h-small)
- IBM_GRANITE_4_MICRO_3B with per-backend model selection (Ollama: MICRO, Watsonx: SMALL)
- IBM_GRANITE_3_3_VISION_2B

Backend Updates
- Watsonx backend default: IBM_GRANITE_4_HYBRID_SMALL

Test Updates (19 files)

✅ Migrated to Granite 4:
- test/backends/test_watsonx.py
- test/backends/test_ollama.py
- test/backends/test_litellm_*.py (3 files)
- test/backends/test_vllm*.py (2 files)
- test/stdlib/components/*.py (8 files)
- test/stdlib/requirements/*.py (3 files)

Exceptions:
- test/backends/test_huggingface.py - see "HuggingFace Test Exception" below
- test/backends/test_vision_ollama.py - see "Vision Model Exception" below

Documentation Updates
- docs/tutorial.md: updated all examples to Granite 4
- docs/alora.md: updated training examples; added a note about using non-hybrid models for adapter training
- docs/examples/*.py: updated all example scripts

Test Infrastructure
- IBM_GRANITE_4_MICRO_3B (matches upstream pattern)

HuggingFace Test Exception
HuggingFace tests remain on Granite 3.3 due to missing aLoRA adapters for Granite 4.
The HF tests require the requirement_check intrinsic adapter, which is only available for Granite 3.x models in ibm-granite/rag-intrinsics-lib. While ibm-granite/granite-lib-rag-r1.0 has Granite 4 support for RAG intrinsics (answerability, context_relevance, etc.), the core intrinsics needed for tests are not yet available.

Follow-up Issue: #359 tracks migration once Granite 4 adapters are released.
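The fallback this implies can be sketched as a small lookup keyed on adapter availability. This is an illustrative sketch only, not Mellea's API: the table, the family labels, and hf_test_model() are hypothetical, and only the availability facts (requirement_check is Granite 3.x-only; answerability and context_relevance have Granite 4 support) come from this PR:

```python
# Granite families with a published adapter per intrinsic, per the survey
# in this PR (requirement_check has no Granite 4 adapter yet).
ADAPTER_SUPPORT: dict[str, set[str]] = {
    "requirement_check": {"granite-3.x"},
    "answerability": {"granite-3.x", "granite-4.x"},
    "context_relevance": {"granite-3.x", "granite-4.x"},
}

GRANITE_4_HF = "ibm-granite/granite-4.0-h-micro"       # assumed HF id
GRANITE_33_HF = "ibm-granite/granite-3.3-8b-instruct"  # assumed HF id


def hf_test_model(required_intrinsic: str) -> str:
    """Fall back to Granite 3.3 when the needed aLoRA adapter lacks G4 support."""
    if "granite-4.x" in ADAPTER_SUPPORT.get(required_intrinsic, set()):
        return GRANITE_4_HF
    return GRANITE_33_HF
```

Under this scheme the HF tests, which need requirement_check, keep resolving to Granite 3.3 until a Granite 4 adapter is published, at which point only the table needs updating.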
Vision Model Exception
Vision tests remain on granite3.2-vision due to Ollama compatibility issues.

The ibm/granite3.3-vision:2b model causes Ollama server crashes with a segmentation fault (null pointer dereference in the llama runner). Reverted to granite3.2-vision, which works reliably.

Follow-up Issue: #360 documents the crash with full stack traces and debugging information.
Testing
Local Testing
Test Results: 204 passed, 6 skipped, 69 deselected, 1 xpassed
CI Testing
All tests pass in CI with CICD=1 (skips qualitative markers).

Related Issues