feat: add Python SDK for model spec types and validation#175
feat: add Python SDK for model spec types and validation#175pradhyum6144 wants to merge 1 commit intomodelpack:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new Python SDK for the CNCF ModelPack specification, providing a robust and consistent way to define, serialize, and validate AI model configurations in Python. By mirroring the existing Go SDK's type definitions and utilizing the same JSON schema for validation, this SDK ensures interoperability and a unified experience for developers working with ModelPack across different language ecosystems. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a well-structured Python SDK for ModelPack specifications, mirroring the existing Go implementation. The use of dataclasses for model types and a shared JSON schema for validation are excellent choices for ensuring consistency across languages. However, I've identified a couple of critical issues. Firstly, there's a compatibility problem with Python 3.10 regarding the parsing of RFC3339 timestamps with a 'Z' suffix, which will cause errors. Secondly, the method for loading the JSON schema file is not robust for a distributable package and will fail upon installation. My review includes specific suggestions to address these critical issues to ensure the SDK is both correct and packageable.
| if "mtime" in data: | ||
| mod_time = datetime.fromisoformat(data["mtime"]) |
There was a problem hiding this comment.
datetime.fromisoformat() does not support the 'Z' suffix for UTC timezones in Python versions before 3.11. Since your setup.py supports Python 3.10, this will cause a ValueError for valid RFC3339 timestamps ending in 'Z'. You can fix this by replacing 'Z' with '+00:00' before parsing.
| if "mtime" in data: | |
| mod_time = datetime.fromisoformat(data["mtime"]) | |
| if "mtime" in data and data["mtime"]: | |
| mod_time = datetime.fromisoformat(data["mtime"].replace("Z", "+00:00")) |
specs-python/modelpack/v1/config.py
Outdated
| if "knowledgeCutoff" in data: | ||
| kwargs["knowledge_cutoff"] = datetime.fromisoformat(data["knowledgeCutoff"]) |
There was a problem hiding this comment.
datetime.fromisoformat() does not support the 'Z' suffix for UTC timezones in Python versions before 3.11. Since your setup.py supports Python 3.10, this will cause a ValueError for valid RFC3339 timestamps ending in 'Z'. You can fix this by replacing 'Z' with '+00:00' before parsing.
| if "knowledgeCutoff" in data: | |
| kwargs["knowledge_cutoff"] = datetime.fromisoformat(data["knowledgeCutoff"]) | |
| if "knowledgeCutoff" in data and data["knowledgeCutoff"]: | |
| kwargs["knowledge_cutoff"] = datetime.fromisoformat(data["knowledgeCutoff"].replace("Z", "+00:00")) |
specs-python/modelpack/v1/config.py
Outdated
| if "createdAt" in data: | ||
| created_at = datetime.fromisoformat(data["createdAt"]) |
There was a problem hiding this comment.
datetime.fromisoformat() does not support the 'Z' suffix for UTC timezones in Python versions before 3.11. Since your setup.py supports Python 3.10, this will cause a ValueError for valid RFC3339 timestamps ending in 'Z'. You can fix this by replacing 'Z' with '+00:00' before parsing.
| if "createdAt" in data: | |
| created_at = datetime.fromisoformat(data["createdAt"]) | |
| if "createdAt" in data and data["createdAt"]: | |
| created_at = datetime.fromisoformat(data["createdAt"].replace("Z", "+00:00")) |
| def _load_schema() -> dict: | ||
| """Load and return the config JSON schema.""" | ||
| with open(_CONFIG_SCHEMA_PATH) as f: | ||
| return json.load(f) |
There was a problem hiding this comment.
The current method of locating config-schema.json via a hardcoded relative path is not robust for packaging and will fail when the package is installed via pip. The schema file should be included as package data and accessed using importlib.resources.
To fix this, you should:
- Move
schema/config-schema.jsonintospecs-python/modelpack/v1/. - Update
setup.pyto include this file (I have left a separate comment there). - Update this function to load the schema as a resource. You will also need to add
from importlib import resourcesat the top of the file and remove the global_SCHEMA_DIRand_CONFIG_SCHEMA_PATHvariables.
| def _load_schema() -> dict: | |
| """Load and return the config JSON schema.""" | |
| with open(_CONFIG_SCHEMA_PATH) as f: | |
| return json.load(f) | |
| def _load_schema() -> dict: | |
| """Load and return the config JSON schema.""" | |
| import importlib.resources | |
| schema_file = importlib.resources.files("modelpack.v1").joinpath("config-schema.json") | |
| with schema_file.open(encoding="utf-8") as f: | |
| return json.load(f) |
| setup( | ||
| name="modelpack", | ||
| version="0.1.0", | ||
| description="Python SDK for the CNCF ModelPack specification", | ||
| packages=find_packages(), | ||
| python_requires=">=3.10", | ||
| install_requires=[ | ||
| "jsonschema[format]>=4.20.0", | ||
| ], | ||
| extras_require={ | ||
| "dev": [ | ||
| "pytest>=7.0", | ||
| "ruff>=0.4.0", | ||
| ], | ||
| }, | ||
| ) |
There was a problem hiding this comment.
To fix the packaging issue I mentioned in validator.py, you need to tell setuptools to include config-schema.json in the package. After moving the file to specs-python/modelpack/v1/, you can use the package_data argument in setup().
| setup( | |
| name="modelpack", | |
| version="0.1.0", | |
| description="Python SDK for the CNCF ModelPack specification", | |
| packages=find_packages(), | |
| python_requires=">=3.10", | |
| install_requires=[ | |
| "jsonschema[format]>=4.20.0", | |
| ], | |
| extras_require={ | |
| "dev": [ | |
| "pytest>=7.0", | |
| "ruff>=0.4.0", | |
| ], | |
| }, | |
| ) | |
| setup( | |
| name="modelpack", | |
| version="0.1.0", | |
| description="Python SDK for the CNCF ModelPack specification", | |
| packages=find_packages(), | |
| package_data={"modelpack.v1": ["config-schema.json"]}, | |
| python_requires=">=3.10", | |
| install_requires=[ | |
| "jsonschema[format]>=4.20.0", | |
| ], | |
| extras_require={ | |
| "dev": [ | |
| "pytest>=7.0", | |
| "ruff>=0.4.0", | |
| ], | |
| }, | |
| ) |
Signed-off-by: pradhyum6144 <pradhyum314@gmail.com>
1d701f6 to
1c931e5
Compare
Adds a Python SDK under specs-python/ that mirrors the Go types in specs-go/v1/. It provides dataclasses for Model, ModelDescriptor, ModelConfig, ModelCapabilities, ModelFS, and Modality with JSON serialization that matches the Go JSON tags exactly. The validator uses the same config-schema.json as the Go side so both languages enforce identical validation rules. Includes all media type and annotation constants, and 64 tests covering serialization round-trips, all Go validation test cases, and edge cases.