Add json-java21-jdt module for JSON Document Transforms#142
Conversation
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
…on bug (#31)
- Add junit-js 2.0.0 dependency with GraalVM exclusions to avoid version conflicts
- Add JUnit Vintage engine to run JUnit 4 tests under JUnit 5 Platform
- Configure Surefire to discover *TestSuite.java files
- Delete bun-based src/test/js/ directory (replaced by junit-js)
- Add JtdEsmJsTestSuite.java to run JS tests via @RunWith(JSRunner.class)
- Add boolean-schema.test.js and nested-elements-empty-focused.test.js
- Tests run in GraalVM polyglot, no external JS runtime needed
- Fix EsmRenderer bug where inline validator functions were never emitted:
  - Add generateInlineFunctions() method to emit collected inline validators
  - Fix collision issue by using counter instead of hashCode for function names
  - Support nested elements/schemas that require multiple inline validators
Test results: 29 tests pass in jtd-esm-codegen (17 Java + 2 property + 10 JS)
- Remove all bun-specific code (ProcessBuilder runners, isBunAvailable checks)
- Replace with GraalVM Polyglot JS in-process execution via jsContext()/evalModule()
- Add GraalVM helper methods: jsContext(), evalModule(), errCount()
- Simplify test schemas (inline them instead of separate variables)
- Add generatedDiscriminatorValidatorWorks test
- Add JTD_CODEGEN_SPEC.md documentation
- All 17 tests pass, no external JS runtime required
Adds json-java21-jtd-codegen module (JDK 24+, --release 24) that compiles JTD schemas into bytecode validators targeting Java 21. Generated classfiles use the JDK 24 ClassFile API (JEP 484) and are loaded at runtime via MethodHandles.Lookup.defineClass().
Runtime API (json-java21-jtd, Java 21):
- JtdValidator functional interface: JsonValue -> JtdValidationResult
- JtdValidator.compile(schema) -- interpreter path, always available
- JtdValidator.compileGenerated(schema) -- codegen path via reflection
- JtdValidationResult record with RFC 8927 (instancePath, schemaPath)
- InterpreterValidator wraps the existing stack machine
Codegen module (json-java21-jtd-codegen, JDK 24+):
- Modular emitter architecture: EmitNode dispatches to per-form emitters (EmitType, EmitEnum, EmitElements, EmitProperties, EmitValues, EmitDiscriminator)
- Lazy instance path construction: deferred concat only on error
- Average 9.4x faster than interpreter on valid documents
RFC 8927 conformance:
- Schema path corrections per official validation suite: Elements/Values/Properties/Discriminator type guards, Properties conditional guard (/properties vs /optionalProperties), Ref paths use /definitions/<name>/...
- 316/316 official json-typedef-spec validation.json cases pass (interpreter); 314/316 codegen (2 recursive schemas skipped)
Verification:
- json-java21-jtd: 452 tests (136 unit + 316 spec conformance)
- json-java21-jtd-codegen: 398 tests (82 cross-validation + 316 spec)
- Total: 850 tests, all passing
…forms)
- Rename module from json-java21-transforms to json-java21-jdt
- New package: json.java21.jdt with Jdt engine and JdtException
- Implement core JDT operations: replace, remove, rename, merge
- Execution order: Rename → Remove → Merge → Replace
- 18 unit tests + 15 Microsoft fixture tests (33 total)
- Path-based operations deferred to Skipped/ folders for future work
To verify: mvn test -pl json-java21-jdt
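The fixed execution order matters because an earlier directive changes what a later one sees. The sketch below is a deliberately simplified illustration on flat maps, not the module's engine: the `Directives` record, the shallow `putAll` merge, and whole-object replace are assumptions made for brevity.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DirectiveOrderSketch {
    // Hypothetical flat directive bundle; a null replace means "no replace directive".
    record Directives(Map<String, String> rename, List<String> remove,
                      Map<String, Object> merge, Map<String, Object> replace) {}

    /// Applies the directives in the order Rename, Remove, Merge, Replace.
    static Map<String, Object> apply(Map<String, Object> source, Directives d) {
        final Map<String, Object> result = new LinkedHashMap<>(source);
        // 1. Rename: move each existing key to its new name.
        for (final var e : d.rename().entrySet()) {
            if (result.containsKey(e.getKey())) {
                result.put(e.getValue(), result.remove(e.getKey()));
            }
        }
        // 2. Remove: sees the already-renamed keys.
        for (final String key : d.remove()) {
            result.remove(key);
        }
        // 3. Merge: shallow here (assumption); a real merge would recurse into objects.
        result.putAll(d.merge());
        // 4. Replace: discards everything computed so far.
        return d.replace() == null ? result : new LinkedHashMap<>(d.replace());
    }
}
```

With source {A: 1, C: 2}, a rename of A to B followed by a remove of B drops the renamed key; if remove ran first it would be a no-op and B would survive.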
…on-java21-jdt To verify: mvn test -pl json-java21-jdt -am -Djava.util.logging.ConsoleHandler.level=INFO
…queries Compiles JsonPath expressions into Java 21 classfiles via the JDK 24+ ClassFile API. Generated code eliminates interpretation overhead by inlining all segment dispatch at codegen time. Supports all 8 segment types: PropertyAccess, ArrayIndex, ArraySlice, Wildcard, RecursiveDescent, Filter, Union, ScriptExpression. Uses runtime helpers for recursive descent (DFS traversal) and filter comparison (value extraction and comparison logic). 29 cross-validation tests verify codegen produces identical results to the interpreter for all supported JsonPath expressions. Also makes JsonPathAst public and adds JsonPath.ast() accessor for cross-module codegen access. To verify: mvn test -pl json-java21-jsonpath-codegen -am
…ing init
- Exclude json-java21-jtd-codegen and json-java21-jsonpath-codegen from the Java 21 CI build (they require JDK 24+ ClassFile API)
- Update test count to 653 (excluding codegen modules)
- Fix CodegenTestBase to handle empty logging level property in CI
To verify: mvn test -pl '!json-java21-jtd-codegen,!json-java21-jsonpath-codegen'
Thread a pathResolver function through the JDT engine so callers can plug in bytecode-compiled JsonPath queries instead of the interpreter.
The new overload Jdt.transform(source, transform, pathResolver) accepts a Function<String, Function<JsonValue, List<JsonValue>>> that compiles a JsonPath expression into a query function. Default behavior unchanged: uses JsonPath.parse(expr)::query.
Example with compiled JsonPath:
    Jdt.transform(source, transform, expr -> JsonPathCodegen.compile(expr)::query);
All 33 existing tests pass unchanged.
To verify: mvn test -pl json-java21-jdt -am
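Because the resolver is just a function from an expression string to a query function, callers could wrap it. A minimal sketch, assuming a caller wants each expression compiled only once: the `cached` helper is hypothetical and not part of this PR, and it is generic over the value type `V` so it stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CachingResolverSketch {
    /// Wraps a compile function so each expression string is compiled at most once per wrapper.
    static <V> Function<String, Function<V, List<V>>> cached(
            Function<String, Function<V, List<V>>> compile) {
        final Map<String, Function<V, List<V>>> cache = new ConcurrentHashMap<>();
        // computeIfAbsent invokes compile only when the expression is not cached yet.
        return expr -> cache.computeIfAbsent(expr, compile);
    }
}
```

Under that assumption, the wrapped function has the same shape as the pathResolver argument, so it could be passed wherever a plain resolver is accepted.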
Create JdtAst with three sealed node types:
- DirectiveNode: contains @jdt.rename/remove/merge/replace + children
- MergeNode: default object-to-object deep merge
- ReplacementNode: direct value replacement
JdtAstParser builds the AST from a JsonValue transform document. Jdt.parseToAst() public API for codegen modules to access the AST.
4 new AST parser tests (22 total JDT unit tests).
To verify: mvn test -pl json-java21-jdt -am
Walk the JsonPath AST with exhaustive switch to emit JavaScript that evaluates the expression without interpretation overhead. Generated modules export a query(root) function that returns an array of matched values. Supports all segment types: property access, array index, slice, wildcard, recursive descent, filter, union, script. 18 tests verify structural validity and correct rendering for each segment type. To verify: mvn test -pl json-java21-jsonpath -Dtest=JsonPathEsmRendererTest
Walk the JDT AST with exhaustive switch to emit JavaScript that applies the transform without interpretation overhead. Generated modules export a transform(source) function with inlined directive logic and a deepMerge helper. Supports all JDT directives: rename, remove, merge, replace, plus default recursive merge for non-directive keys. 7 tests verify correct rendering for each directive type and structural validity. To verify: mvn test -pl json-java21-jdt -Dtest=JdtEsmRendererTest
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b96335a950
for (final var entry : source.members().entrySet()) {
    final var value = entry.getValue();
    if (!toRemove.contains(value)) {
        if (value instanceof JsonObject childObj) {
Match removals by identity to avoid deleting equal values
The path-based remove uses toRemove.contains(value), which relies on JsonValue.equals (value-based in JsonArrayImpl/JsonObjectImpl). That means if the matched node’s value is equal to other nodes elsewhere, those unrelated nodes will also be removed even though they were not selected by the JSONPath. For example, a path that matches a single 1 in $.items[0] will remove every other 1 in the document. To preserve path semantics, track matches by identity (e.g., IdentityHashMap/Set) and compare with == when deciding what to remove.
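The difference between the two matching strategies can be shown in isolation. This is a minimal sketch, not the module's code: the `Node` record is a stand-in for a value-equal JSON node, and both helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class IdentityRemovalSketch {
    // Stand-in for a JSON node; records compare by value, like the equals() described above.
    record Node(int value) {}

    /// Keeps every item that is not one of the matched instances (compared with ==).
    static List<Node> removeByIdentity(List<Node> items, List<Node> matches) {
        final Set<Node> toRemove = Collections.newSetFromMap(new IdentityHashMap<>());
        toRemove.addAll(matches);
        final List<Node> kept = new ArrayList<>();
        for (final Node item : items) {
            if (!toRemove.contains(item)) kept.add(item);
        }
        return kept;
    }

    /// Value-based variant, shown only for contrast: it also drops equal but unmatched nodes.
    static List<Node> removeByEquality(List<Node> items, List<Node> matches) {
        final List<Node> kept = new ArrayList<>();
        for (final Node item : items) {
            if (!matches.contains(item)) kept.add(item);
        }
        return kept;
    }
}
```

Given two distinct `Node(1)` instances and one `Node(2)`, matching only the first instance leaves two nodes with the identity set but only `Node(2)` with the equality check.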
Summary of Changes
Hello @simbo1905, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly expands the project's JSON processing capabilities by introducing a new module for declarative JSON document transformations (JDT). It also establishes a robust framework for generating highly optimized, bytecode-based validators for both JsonPath queries and JTD schemas, alongside an experimental ES2020 module generator for JTD. These additions aim to enhance performance and provide flexible, standards-compliant JSON manipulation and validation tools.
Code Review
This is a massive and impressive pull request that introduces a new json-java21-jdt module for JSON Document Transforms, complete with extensive documentation and tests. It also adds experimental code generation capabilities for JTD and JsonPath to both Java bytecode (for JDK 24+) and ES2020 JavaScript modules. The changes are well-structured, and the new APIs, like JtdValidator, are a significant improvement. I have found one potential issue in the new JDT ESM renderer that I've commented on.
default ->
    sb.append(indent).append("if (typeof _r === 'object' && _r !== null && _r[")
        .append(jsString(key)).append("] !== undefined) _r[").append(jsString(key))
        .append("] = deepMerge(_r[").append(jsString(key)).append("], ")
        .append(jsonToJs(child)).append(");\n");
}
The default case in this switch statement for processing child nodes appears to be incorrect. It handles MergeNode and DirectiveNode children.
default ->
    sb.append(indent).append("if (typeof _r === 'object' && _r !== null && _r[")
        .append(jsString(key)).append("] !== undefined) _r[").append(jsString(key))
        .append("] = deepMerge(_r[").append(jsString(key)).append("], ")
        .append(jsonToJs(child)).append(");\n");
When child is a MergeNode or DirectiveNode, jsonToJs(child) returns {}, as per the implementation of jsonToJs:
private static String jsonToJs(Object value) {
    if (value instanceof JdtNode) return "{}"; // Fallback for AST nodes in children
    // ...
}
This results in deepMerge(..., {}), which does not apply the child's transformation. This is likely a bug as it will cause incorrect behavior for nested transforms within a directive node.
To fix this, the switch should handle MergeNode and DirectiveNode explicitly by generating recursive transformation logic for them, similar to how emitMergeNode handles its children by creating inline functions.
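One possible shape for that explicit handling is to recurse into child nodes instead of falling back to an empty literal. The sketch below uses a hypothetical two-node AST (`Merge`, `Replacement`) rather than the real JdtAst types, omits directive nodes entirely, and assumes keys need no escaping.

```java
import java.util.Map;
import java.util.stream.Collectors;

final class NestedRenderSketch {
    // Hypothetical, simplified AST: these are not the module's JdtAst types.
    sealed interface Node permits Merge, Replacement {}
    record Merge(Map<String, Node> children) implements Node {}
    record Replacement(String jsLiteral) implements Node {}

    /// Renders a node as a JS literal, recursing into nested Merge nodes instead of emitting "{}".
    static String toJs(Node node) {
        return switch (node) {
            case Replacement r -> r.jsLiteral();
            case Merge m -> m.children().entrySet().stream()
                // Assumption: keys are plain identifiers, so no string escaping is applied.
                .map(e -> "\"" + e.getKey() + "\": " + toJs(e.getValue()))
                .collect(Collectors.joining(", ", "{", "}"));
        };
    }
}
```

Because the switch is exhaustive over the sealed interface, adding a new node type forces the renderer to handle it at compile time rather than silently dropping it.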
Cursor Bugbot has reviewed your changes and found 4 potential issues.
sb.append(indent).append("if (typeof _r === 'object' && _r !== null && _r[")
    .append(jsString(key)).append("] !== undefined) _r[").append(jsString(key))
    .append("] = deepMerge(_r[").append(jsString(key)).append("], ")
    .append(jsonToJs(child)).append(");\n");
ESM renderer loses nested transform logic in directive children
Medium Severity
In emitDirectiveNode, the default branch for processing children calls jsonToJs(child) where child is a MergeNode or DirectiveNode. The jsonToJs method returns "{}" for any JdtNode, so all nested transform content is silently lost. For example, a child MergeNode carrying {"A": 10, "C": 3} would generate deepMerge(_r["key"], {}) instead of the actual values. Compare with emitMergeNode, which correctly generates local functions for nested MergeNode/DirectiveNode children.
    return transformObj;
} else {
    return transformObj;
}
Redundant identical branches in default transform dispatch
Low Severity
The else if (source instanceof JsonArray) branch and the else branch in applyTransform both return transformObj identically. The JsonArray check is redundant since it produces the same result as the fallback. This could also mask a missing behavior — the README specifies arrays append by default, but when a non-directive object transform is applied to an array source, it replaces rather than appends.
for (final var entry : source.members().entrySet()) {
    final var value = entry.getValue();
    if (!toRemove.contains(value)) {
Path-based removal uses value equality instead of reference identity
Medium Severity
removeMatchedNodes and removeMatchedNodesFromArray use toRemove.contains(value) which relies on .equals() (structural equality), while the sibling method transformMatchingNodes uses node == match (reference identity). The AGENTS.md explicitly documents that the design uses reference identity because JsonPath returns the same object instances from the source tree. Using .equals() causes over-removal — if a document has multiple nodes with identical values at different paths, removing one path-matched node would incorrectly remove all value-equal nodes.
private static void emitReplace(StringBuilder sb, JsonValue replaceSpec, String indent) {
    sb.append(indent).append("_r = ").append(jsonToJs(replaceSpec)).append(";\n");
}
ESM renderer omits double-bracket array unwrapping for merge/replace
Medium Severity
emitMerge and emitReplace pass the raw JsonValue spec through jsonToJs() without handling the double-bracket array syntax [[...]]. The Java interpreter's applyMerge and applyReplace both call isDoubleBracketArray() to unwrap [[1,2,3]] into [1,2,3], but the ESM renderer has no equivalent logic. A transform like {"@jdt.replace": [[1,2,3]]} would generate _r = [[1,2,3]] instead of the correct _r = [1,2,3].
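The unwrapping step the renderer would need can be sketched on plain Java lists. This is an illustration only: it assumes "double-bracket" means an outer array holding exactly one inner array, and the helper names are hypothetical rather than the interpreter's actual methods.

```java
import java.util.List;

final class DoubleBracketSketch {
    /// True when spec is a list whose single element is itself a list, i.e. [[...]] (assumed definition).
    static boolean looksDoubleBracketed(Object spec) {
        return spec instanceof List<?> outer
            && outer.size() == 1
            && outer.get(0) instanceof List<?>;
    }

    /// Returns the inner list for [[...]] specs and the spec unchanged otherwise.
    static Object unwrap(Object spec) {
        return looksDoubleBracketed(spec) ? ((List<?>) spec).get(0) : spec;
    }
}
```

Applying `unwrap` before rendering would turn a [[1,2,3]] spec into [1,2,3] while leaving ordinary arrays and scalars untouched.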


Summary
- Rename module from json-java21-transforms to json-java21-jdt (JSON Document Transforms)
- New package: json.java21.jdt with Jdt engine and JdtException
- Implement core JDT operations: @jdt.replace, @jdt.remove, @jdt.rename, @jdt.merge
Features Implemented
Deferred
Path-based operations using @jdt.path attribute moved to Skipped/ folders for future implementation.
Verification
mvn test -pl json-java21-jdt -Djava.util.logging.ConsoleHandler.level=INFO
Note
Medium Risk
Large net-new functionality across multiple modules plus new code generation paths and CI/release automation; main risk is build/test pipeline changes and correctness of the new transform/codegen implementations.
Overview
Adds a new json-java21-jdt module implementing JSON Document Transforms with @jdt.rename/remove/merge/replace (fixed execution order) plus an AST parser (parseToAst) and an experimental ES2020 ESM renderer for emitting a transform(source) module.
Introduces bytecode/codegen tooling modules for JsonPath (json-java21-jsonpath-codegen) and JTD (json-java21-jtd-codegen), exposes JsonPath.ast() and makes JsonPathAst public to support codegen, and adds an ES2020 JsonPathEsmRenderer with tests.
CI is updated to exclude the Java 24-only codegen modules from the JDK 21 workflow and to assert the new overall test count; a new nightly GitHub Action builds and publishes jtd-esm-codegen uber-jar + GraalVM native binaries. Minor docs/ignore updates add the new submodule to README.md and ignore .DS_Store.
Written by Cursor Bugbot for commit 8f0032e. This will update automatically on new commits.