Conversation
Adds json-java21-jtd-codegen module (JDK 24+, --release 24) that compiles JTD schemas into bytecode validators targeting Java 21. Generated classfiles use the JDK 24 ClassFile API (JEP 484) and are loaded at runtime via MethodHandles.Lookup.defineClass(). Runtime API (json-java21-jtd, Java 21): - JtdValidator functional interface: JsonValue -> JtdValidationResult - JtdValidator.compile(schema) -- interpreter path, always available - JtdValidator.compileGenerated(schema) -- codegen path via reflection - JtdValidationResult record with RFC 8927 (instancePath, schemaPath) - InterpreterValidator wraps the existing stack machine Codegen module (json-java21-jtd-codegen, JDK 24+): - Modular emitter architecture: EmitNode dispatches to per-form emitters (EmitType, EmitEnum, EmitElements, EmitProperties, EmitValues, EmitDiscriminator) - Lazy instance path construction: deferred concat only on error - Average 9.4x faster than interpreter on valid documents RFC 8927 conformance: - Schema path corrections per official validation suite: Elements/Values/Properties/Discriminator type guards, Properties conditional guard (/properties vs /optionalProperties), Ref paths use /definitions/<name>/... - 316/316 official json-typedef-spec validation.json cases pass (interpreter); 314/316 codegen (2 recursive schemas skipped) Verification: - json-java21-jtd: 452 tests (136 unit + 316 spec conformance) - json-java21-jtd-codegen: 398 tests (82 cross-validation + 316 spec) - Total: 850 tests, all passing
Summary of ChangesHello @simbo1905, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant performance enhancement to the JTD validation library by adding an optional bytecode generation module. This new module compiles JTD schemas directly into Java bytecode, offering a substantial speedup over the traditional interpreter. Alongside this, the core validation logic has been rigorously aligned with RFC 8927, ensuring accurate and standardized error reporting. The changes provide developers with a choice between a robust, universally compatible interpreter and a high-performance, bytecode-generated validator, all accessible through a unified functional API. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c1805a981c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| cob.aload(instSlot); | ||
| cob.checkcast(CD_JsonNumber); | ||
| cob.invokeinterface(CD_JsonNumber, "toLong", MTD_long); | ||
| int lSlot = cob.allocateLocal(TypeKind.LONG); | ||
| cob.lstore(lSlot); |
There was a problem hiding this comment.
Guard toLong overflow before validating int ranges
The generated int-type validator calls JsonNumber.toLong() immediately after checking floor() (see EmitType.emitIntCore), but it never bounds-checks the double value the way the interpreter does. When the instance is a large integer (e.g., 1e20), toLong() can throw JsonAssertionException, which aborts validation instead of returning a schema error. The interpreter explicitly rejects values outside Long range first, so codegen should mirror that guard to avoid runtime exceptions and keep behavior consistent.
Useful? React with 👍 / 👎.
| cob.ldc(":60"); | ||
| cob.ldc(":59"); | ||
| cob.invokevirtual(CD_String, "replace", MTD_String_CharSeq_CharSeq); | ||
| cob.getstatic(CD_DateTimeFormatter, "ISO_OFFSET_DATE_TIME", CD_DateTimeFormatter); | ||
| cob.invokestatic(CD_OffsetDateTime, "parse", MTD_OffsetDateTime_CharSeq_DTF); |
There was a problem hiding this comment.
Catch timestamp parse failures to avoid hard exceptions
The timestamp path compiles a regex and then calls OffsetDateTime.parse(...) without any exception handling. The regex accepts values like month/day 00, so parse can throw DateTimeParseException, which will bubble out of validation and crash instead of producing a schema error. The interpreter explicitly catches exceptions and treats them as a validation failure, so generated validators should wrap this parse and fall back to onError to preserve RFC 8927 behavior.
Useful? React with 👍 / 👎.
| try { | ||
| final var clazz = Class.forName(CODEGEN_CLASS); | ||
| final var method = clazz.getMethod(CODEGEN_METHOD, JsonValue.class); | ||
| return (JtdValidator) method.invoke(null, schema); |
There was a problem hiding this comment.
Handle UnsupportedClassVersionError in compileGenerated
On Java 21 runtimes, compileGenerated will call Class.forName on the codegen module, which is built with --release 24. This throws UnsupportedClassVersionError (a LinkageError), but the method only catches ClassNotFoundException and ReflectiveOperationException, so callers see a hard error instead of the documented UnsupportedOperationException fallback. Catching LinkageError here would keep the method behavior consistent when the newer codegen module is present but incompatible with the runtime.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This is an impressive and substantial pull request that introduces bytecode generation for JTD validation, a new functional JtdValidator API, and brings the implementation into conformance with RFC 8927 error reporting. The modular emitter architecture for codegen is well-designed, and the addition of extensive test suites, including spec conformance and cross-validation tests, is excellent.
My review focuses on a few areas for improvement: project consistency in the new module's pom.xml, a performance optimization for regex compilation in the generated code, and enhancing code clarity and maintainability by addressing a known limitation and improving code reuse.
| cob.ldc("^(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:(\\d{2}|60)(\\.\\d+)?(Z|[+-]\\d{2}:\\d{2}))$"); | ||
| cob.invokestatic(CD_Pattern, "compile", MTD_Pattern_String); |
There was a problem hiding this comment.
Compiling the regex pattern on every validation call is inefficient. The compiled Pattern should be cached. A good approach is to create a private static final Pattern field in the generated class and initialize it in a static block. This avoids repeated compilation overhead.
Additionally, the regex string is duplicated from JtdSchema.TypeSchema.RFC3339. To avoid this, you could expose the regex string as a public static final String in JtdSchema.TypeSchema and reference it here, or make the Pattern field public and load it directly using getstatic.
| <version>0.1.9</version> | ||
| </parent> | ||
|
|
||
| <artifactId>java.util.json.jtd.codegen</artifactId> |
There was a problem hiding this comment.
The artifactId java.util.json.jtd.codegen is inconsistent with the module's directory name (json-java21-jtd-codegen) and the naming convention of other modules in this project (e.g., json-java21-jtd). For consistency and easier project navigation, consider renaming the artifactId to json-java21-jtd-codegen.
| <artifactId>java.util.json.jtd.codegen</artifactId> | |
| <artifactId>json-java21-jtd-codegen</artifactId> |
| case JtdSchema.RefSchema r -> | ||
| emit(cob, r.target(), instSlot, errSlot, instPath, "/definitions/" + r.ref()); |
There was a problem hiding this comment.
This direct recursive call to emit for RefSchema will cause a StackOverflowError in the code generator for recursive JTD schemas. While the PR description and tests acknowledge this limitation, it would be beneficial to add a comment here to make this explicit for future maintainers. A more robust solution would involve generating separate methods for definitions and using invokevirtual to handle recursion in the generated code, as described in JTD_CODEGEN_SPEC.md.
| static final java.util.regex.Pattern RFC3339 = java.util.regex.Pattern.compile( | ||
| "^(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:(\\d{2}|60)(\\.\\d+)?(Z|[+-]\\d{2}:\\d{2}))$" | ||
| ); |
There was a problem hiding this comment.
To allow the codegen module to reuse this pattern and avoid duplicating the regex string, consider making RFC3339 public. This would improve maintainability by having a single source of truth for the pattern and would allow the codegen module to load the pre-compiled pattern directly.
| static final java.util.regex.Pattern RFC3339 = java.util.regex.Pattern.compile( | |
| "^(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:(\\d{2}|60)(\\.\\d+)?(Z|[+-]\\d{2}:\\d{2}))$" | |
| ); | |
| public static final java.util.regex.Pattern RFC3339 = java.util.regex.Pattern.compile( | |
| "^(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:(\\d{2}|60)(\\.\\d+)?(Z|[+-]\\d{2}:\\d{2}))$" | |
| ); |
#140) Previously, two copies of the 78KB jtd-spec-validation.json file were committed to the repository (156KB total), bloating the PR and git history. Changes: - Created JtdTestDataExtractor utility class to extract test data from existing jtd-test-suite.zip at test runtime - Updated JtdSpecConformanceTest and CodegenSpecConformanceTest to use extraction instead of classpath resources - Updated JtdSpecIT and CompilerSpecIT to use shared extractor - Deleted committed JSON files from both modules - Codegen module references parent module's ZIP file Testing: - Run: ./mvnw -pl json-java21-jtd test - All 452 tests pass (136 unit + 316 spec conformance) - Test data is automatically extracted from ZIP on first run - Reduces PR size by ~156KB (9,390 lines)
- Updated CI workflow to use Java 24 (required for JTD codegen module) - Changed distribution from temurin to oracle for Java 24 availability - Updated expected test count from 611 to 850 (includes new codegen tests) - Updated expected skipped count to 2 (recursive schemas in codegen) The json-java21-jtd-codegen module requires Java 24+ for the ClassFile API (JEP 484). Previous workflow used Java 21 which caused compilation failure: 'release version 24 not supported'. Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
- Updated CI workflow to use Java 24 (required for JTD codegen module) - Changed distribution from temurin to oracle for Java 24 availability - Updated expected test count from 611 to 850 (includes new codegen tests) - Updated expected skipped count to 2 (recursive schemas in codegen) The json-java21-jtd-codegen module requires Java 24+ for the ClassFile API (JEP 484). Previous workflow used Java 21 which caused compilation failure: 'release version 24 not supported'. Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
The actual test count is 1354, not 850. The previous value was based on incomplete information. Build logs show: - tests: 1354 - failures: 0 - errors: 0 - skipped: 0 All tests are passing successfully. Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
The actual test count is 1354, not 850. The previous value was based on incomplete information. Build logs show: - tests: 1354 - failures: 0 - errors: 0 - skipped: 0 All tests are passing successfully. Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Adds json-java21-jtd-codegen module (JDK 24+, --release 24) that compiles JTD schemas into bytecode validators targeting Java 21. Generated classfiles use the JDK 24 ClassFile API (JEP 484) and are loaded at runtime via MethodHandles.Lookup.defineClass().
Runtime API (json-java21-jtd, Java 21):
Codegen module (json-java21-jtd-codegen, JDK 24+):
RFC 8927 conformance:
Verification:
(Optional) When submitting a PR, please consider using a "deep research" tool to sanity check your proposal. Then before submission, run your draft through a strong model with a prompt such as:
If you used an AI assistant while preparing this PR, ensure it followed the contributor/agent workflow rules in
AGENTS.md.(Optional) Please then attach both the prompt and the model's review to the bottom of this template under "Augmented Intelligence Review".
What changed
Why this change is needed
How were these changes tested
Checklist
CODING_STYLE_LLM.mdconvensionsAGENTS.mdupdated if appropriate(Optional) Augmented Intelligence Review:
Both prompt and model out, asking a strong model to double-check your submission, from the perspective of a maintainer of this repo