Add json-java21-jdt module for JSON Document Transforms#138
Add json-java21-jdt module for JSON Document Transforms#138
Conversation
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
Co-authored-by: Simon Massey <simbo1905@users.noreply.github.com>
…on bug (#31) - Add junit-js 2.0.0 dependency with GraalVM exclusions to avoid version conflicts - Add JUnit Vintage engine to run JUnit 4 tests under JUnit 5 Platform - Configure Surefire to discover *TestSuite.java files - Delete bun-based src/test/js/ directory (replaced by junit-js) - Add JtdEsmJsTestSuite.java to run JS tests via @RunWith(JSRunner.class) - Add boolean-schema.test.js and nested-elements-empty-focused.test.js - Tests run in GraalVM polyglot, no external JS runtime needed - Fix EsmRenderer bug where inline validator functions were never emitted: - Add generateInlineFunctions() method to emit collected inline validators - Fix collision issue by using counter instead of hashCode for function names - Support nested elements/schemas that require multiple inline validators Test results: 29 tests pass in jtd-esm-codegen (17 Java + 2 property + 10 JS)
- Remove all bun-specific code (ProcessBuilder runners, isBunAvailable checks) - Replace with GraalVM Polyglot JS in-process execution via jsContext()/evalModule() - Add GraalVM helper methods: jsContext(), evalModule(), errCount() - Simplify test schemas (inline them instead of separate variables) - Add generatedDiscriminatorValidatorWorks test - Add JTD_CODEGEN_SPEC.md documentation - All 17 tests pass, no external JS runtime required
Adds json-java21-jtd-codegen module (JDK 24+, --release 24) that compiles JTD schemas into bytecode validators targeting Java 21. Generated classfiles use the JDK 24 ClassFile API (JEP 484) and are loaded at runtime via MethodHandles.Lookup.defineClass(). Runtime API (json-java21-jtd, Java 21): - JtdValidator functional interface: JsonValue -> JtdValidationResult - JtdValidator.compile(schema) -- interpreter path, always available - JtdValidator.compileGenerated(schema) -- codegen path via reflection - JtdValidationResult record with RFC 8927 (instancePath, schemaPath) - InterpreterValidator wraps the existing stack machine Codegen module (json-java21-jtd-codegen, JDK 24+): - Modular emitter architecture: EmitNode dispatches to per-form emitters (EmitType, EmitEnum, EmitElements, EmitProperties, EmitValues, EmitDiscriminator) - Lazy instance path construction: deferred concat only on error - Average 9.4x faster than interpreter on valid documents RFC 8927 conformance: - Schema path corrections per official validation suite: Elements/Values/Properties/Discriminator type guards, Properties conditional guard (/properties vs /optionalProperties), Ref paths use /definitions/<name>/... - 316/316 official json-typedef-spec validation.json cases pass (interpreter); 314/316 codegen (2 recursive schemas skipped) Verification: - json-java21-jtd: 452 tests (136 unit + 316 spec conformance) - json-java21-jtd-codegen: 398 tests (82 cross-validation + 316 spec) - Total: 850 tests, all passing
…forms) - Rename module from json-java21-transforms to json-java21-jdt - New package: json.java21.jdt with Jdt engine and JdtException - Implement core JDT operations: replace, remove, rename, merge - Execution order: Rename → Remove → Merge → Replace - 18 unit tests + 15 Microsoft fixture tests (33 total) - Path-based operations deferred to Skipped/ folders for future work To verify: mvn test -pl json-java21-jdt
Summary of ChangesHello @simbo1905, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the project's JSON processing capabilities by introducing a new module for JSON Document Transforms (JDT) in Java, enabling declarative manipulation of JSON structures. Concurrently, it adds an experimental code generator that transforms JSON Type Definition (JTD) schemas into highly optimized, standalone JavaScript validators, enhancing client-side data validation efficiency. Highlights
Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a80cc68323
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for (final var entry : source.members().entrySet()) { | ||
| final var value = entry.getValue(); | ||
| if (!toRemove.contains(value)) { | ||
| if (value instanceof JsonObject childObj) { |
There was a problem hiding this comment.
Avoid value-based removal in @jdt.path deletes
The path-based remove logic checks toRemove.contains(value), which relies on JsonValue.equals (value equality). That means removing a path match that happens to be, say, the number 1 will also remove any other 1 elsewhere in the object/array even if it wasn’t matched by the JsonPath. This shows up whenever there are duplicate values in different locations and can silently drop unrelated data. Consider tracking matches by identity (e.g., IdentityHashMap/==) instead of List.contains.
Useful? React with 👍 / 👎.
| // This is complex - for now, handle simple cases | ||
| LOG.fine(() -> "Rename with path '" + path + "' to '" + newName + "' - " + matches.size() + " matches"); | ||
|
|
||
| // TODO: Implement path-based rename properly | ||
| return source; |
There was a problem hiding this comment.
Do not silently no-op @jdt.rename with @jdt.path
applyRenameWithPath logs matches and then returns the original source unchanged, even though the class-level docs advertise @jdt.path support. A transform that uses @jdt.path for rename will currently appear to succeed but have no effect, which is incorrect behavior. If path-based rename isn’t implemented yet, it should fail fast with a JdtException rather than silently ignoring the directive.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request is a significant contribution, adding two new modules: json-java21-jdt for JSON Document Transforms and an experimental jtd-esm-codegen for generating JavaScript validators from JTD schemas. The code for both modules is well-structured and includes a good set of tests, including property-based tests for the codegen module.
My review focuses on a few areas for improvement:
- Performance: I've suggested a couple of optimizations in the
Jdt.javaimplementation to improve performance during node traversal by usingSets for faster lookups. - Documentation: The
README.mddocumentation for thejtd-esm-codegenmodule appears to understate its capabilities, which could be updated to avoid user confusion. - PR Scope: The pull request title and description only mention the JDT module, while a large part of the change is the new JTD codegen module. It would be clearer if the PR description reflected the full scope of changes.
Overall, this is a great addition to the project.
| private static JsonObject applyRemoveWithPath(JsonObject source, JsonObject removeObj) { | ||
| final var pathValue = removeObj.members().get(JDT_PATH); | ||
|
|
||
| if (!(pathValue instanceof JsonString pathStr)) { | ||
| throw new JdtException("@jdt.path must be a string"); | ||
| } | ||
|
|
||
| final var path = pathStr.string(); | ||
| final var jsonPath = JsonPath.parse(path); | ||
| final var matches = jsonPath.query(source); | ||
|
|
||
| LOG.fine(() -> "Remove with path '" + path + "' - " + matches.size() + " matches"); | ||
|
|
||
| // Remove matched nodes from source | ||
| // This requires traversing and rebuilding the object | ||
| return removeMatchedNodes(source, matches); | ||
| } | ||
|
|
||
| private static JsonObject removeMatchedNodes(JsonObject source, List<JsonValue> toRemove) { | ||
| final var result = new LinkedHashMap<String, JsonValue>(); | ||
|
|
||
| for (final var entry : source.members().entrySet()) { | ||
| final var value = entry.getValue(); | ||
| if (!toRemove.contains(value)) { | ||
| if (value instanceof JsonObject childObj) { | ||
| result.put(entry.getKey(), removeMatchedNodes(childObj, toRemove)); | ||
| } else if (value instanceof JsonArray childArr) { | ||
| result.put(entry.getKey(), removeMatchedNodesFromArray(childArr, toRemove)); | ||
| } else { | ||
| result.put(entry.getKey(), value); | ||
| } | ||
| } else { | ||
| LOG.finer(() -> "Removed matched node at key '" + entry.getKey() + "'"); | ||
| } | ||
| } | ||
|
|
||
| return JsonObject.of(result); | ||
| } | ||
|
|
||
| private static JsonArray removeMatchedNodesFromArray(JsonArray source, List<JsonValue> toRemove) { | ||
| final var result = new ArrayList<JsonValue>(); | ||
|
|
||
| for (final var element : source.elements()) { | ||
| if (!toRemove.contains(element)) { | ||
| if (element instanceof JsonObject childObj) { | ||
| result.add(removeMatchedNodes(childObj, toRemove)); | ||
| } else if (element instanceof JsonArray childArr) { | ||
| result.add(removeMatchedNodesFromArray(childArr, toRemove)); | ||
| } else { | ||
| result.add(element); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return JsonArray.of(result); | ||
| } |
There was a problem hiding this comment.
The use of List.contains() inside removeMatchedNodes and removeMatchedNodesFromArray can lead to poor performance (O(N*M)) on large documents. Using a Set with identity hashing for the collection of nodes to remove will improve lookup performance to O(1) on average, making the overall removal process much more efficient (O(N)).
private static JsonObject applyRemoveWithPath(JsonObject source, JsonObject removeObj) {
final var pathValue = removeObj.members().get(JDT_PATH);
if (!(pathValue instanceof JsonString pathStr)) {
throw new JdtException("@jdt.path must be a string");
}
final var path = pathStr.string();
final var jsonPath = JsonPath.parse(path);
final var matches = jsonPath.query(source);
LOG.fine(() -> "Remove with path '" + path + "' - " + matches.size() + " matches");
// For efficient O(1) lookups, convert the list of matches to a set that uses identity hashing.
final Set<JsonValue> toRemove = Collections.newSetFromMap(new IdentityHashMap<>(matches.size()));
toRemove.addAll(matches);
// Remove matched nodes from source
// This requires traversing and rebuilding the object
return removeMatchedNodes(source, toRemove);
}
private static JsonObject removeMatchedNodes(JsonObject source, Set<JsonValue> toRemove) {
final var result = new LinkedHashMap<String, JsonValue>();
for (final var entry : source.members().entrySet()) {
final var value = entry.getValue();
if (!toRemove.contains(value)) {
if (value instanceof JsonObject childObj) {
result.put(entry.getKey(), removeMatchedNodes(childObj, toRemove));
} else if (value instanceof JsonArray childArr) {
result.put(entry.getKey(), removeMatchedNodesFromArray(childArr, toRemove));
} else {
result.put(entry.getKey(), value);
}
} else {
LOG.finer(() -> "Removed matched node at key '" + entry.getKey() + "'");
}
}
return JsonObject.of(result);
}
private static JsonArray removeMatchedNodesFromArray(JsonArray source, Set<JsonValue> toRemove) {
final var result = new ArrayList<JsonValue>();
for (final var element : source.elements()) {
if (!toRemove.contains(element)) {
if (element instanceof JsonObject childObj) {
result.add(removeMatchedNodes(childObj, toRemove));
} else if (element instanceof JsonArray childArr) {
result.add(removeMatchedNodesFromArray(childArr, toRemove));
} else {
result.add(element);
}
}
}
return JsonArray.of(result);
}| private static JsonValue applyToMatches(JsonValue source, List<JsonValue> matches, | ||
| java.util.function.Function<JsonValue, JsonValue> transformer) { | ||
| if (matches.isEmpty()) { | ||
| return source; | ||
| } | ||
|
|
||
| return transformMatchingNodes(source, matches, transformer); | ||
| } | ||
|
|
||
| private static JsonValue transformMatchingNodes(JsonValue node, List<JsonValue> matches, | ||
| java.util.function.Function<JsonValue, JsonValue> transformer) { | ||
|
|
||
| // Check if this node is a match | ||
| for (final var match : matches) { | ||
| if (node == match) { | ||
| return transformer.apply(node); | ||
| } | ||
| } | ||
|
|
||
| // Recurse into children | ||
| if (node instanceof JsonObject obj) { | ||
| final var result = new LinkedHashMap<String, JsonValue>(); | ||
| for (final var entry : obj.members().entrySet()) { | ||
| result.put(entry.getKey(), transformMatchingNodes(entry.getValue(), matches, transformer)); | ||
| } | ||
| return JsonObject.of(result); | ||
| } else if (node instanceof JsonArray arr) { | ||
| final var result = new ArrayList<JsonValue>(); | ||
| for (final var element : arr.elements()) { | ||
| result.add(transformMatchingNodes(element, matches, transformer)); | ||
| } | ||
| return JsonArray.of(result); | ||
| } | ||
|
|
||
| return node; | ||
| } |
There was a problem hiding this comment.
The current implementation of transformMatchingNodes iterates over the matches list for every node in the JSON tree, which can be inefficient (O(N*M)). For better performance, you should use a Set with identity hashing for the matches collection to achieve O(1) average time complexity for lookups.
private static JsonValue applyToMatches(JsonValue source, List<JsonValue> matches,
java.util.function.Function<JsonValue, JsonValue> transformer) {
if (matches.isEmpty()) {
return source;
}
final Set<JsonValue> matchesSet = Collections.newSetFromMap(new IdentityHashMap<>(matches.size()));
matchesSet.addAll(matches);
return transformMatchingNodes(source, matchesSet, transformer);
}
private static JsonValue transformMatchingNodes(JsonValue node, Set<JsonValue> matches,
java.util.function.Function<JsonValue, JsonValue> transformer) {
// Check if this node is a match
if (matches.contains(node)) {
return transformer.apply(node);
}
// Recurse into children
if (node instanceof JsonObject obj) {
final var result = new LinkedHashMap<String, JsonValue>();
for (final var entry : obj.members().entrySet()) {
result.put(entry.getKey(), transformMatchingNodes(entry.getValue(), matches, transformer));
}
return JsonObject.of(result);
} else if (node instanceof JsonArray arr) {
final var result = new ArrayList<JsonValue>();
for (final var element : arr.elements()) {
result.add(transformMatchingNodes(element, matches, transformer));
}
return JsonArray.of(result);
}
return node;
}| This tool deliberately supports only: | ||
| - `properties` (required properties) | ||
| - `optionalProperties` | ||
| - `type` primitives (`string`, `boolean`, `timestamp`, `int8`, `int16`, `int32`, `uint8`, `uint16`, `uint32`, `float32`, `float64`) | ||
| - `enum` | ||
| - `metadata.id` (used for the output filename prefix) | ||
|
|
||
| It rejects other JTD features (`elements`, `values`, `discriminator`/`mapping`, `ref`/`definitions`) and also rejects **nested `properties`** (object schemas inside properties). |
There was a problem hiding this comment.
The implementation in the jtd-esm-codegen module appears to be more capable than what is described here. The JtdParser and EsmRenderer classes include logic for handling elements, values, discriminator, and ref, which are listed here as being rejected. It would be good to update the documentation to reflect the actual capabilities of the tool, or if these features are not fully supported, clarify their status.
…on-java21-jdt To verify: mvn test -pl json-java21-jdt -am -Djava.util.logging.ConsoleHandler.level=INFO
…queries Compiles JsonPath expressions into Java 21 classfiles via the JDK 24+ ClassFile API. Generated code eliminates interpretation overhead by inlining all segment dispatch at codegen time. Supports all 8 segment types: PropertyAccess, ArrayIndex, ArraySlice, Wildcard, RecursiveDescent, Filter, Union, ScriptExpression. Uses runtime helpers for recursive descent (DFS traversal) and filter comparison (value extraction and comparison logic). 29 cross-validation tests verify codegen produces identical results to the interpreter for all supported JsonPath expressions. Also makes JsonPathAst public and adds JsonPath.ast() accessor for cross-module codegen access. To verify: mvn test -pl json-java21-jsonpath-codegen -am
…ing init - Exclude json-java21-jtd-codegen and json-java21-jsonpath-codegen from the Java 21 CI build (they require JDK 24+ ClassFile API) - Update test count to 653 (excluding codegen modules) - Fix CodegenTestBase to handle empty logging level property in CI To verify: mvn test -pl '!json-java21-jtd-codegen,!json-java21-jsonpath-codegen'
Thread a pathResolver function through the JDT engine so callers can plug in bytecode-compiled JsonPath queries instead of the interpreter. The new overload Jdt.transform(source, transform, pathResolver) accepts a Function<String, Function<JsonValue, List<JsonValue>>> that compiles a JsonPath expression into a query function. Default behavior unchanged: uses JsonPath.parse(expr)::query. Example with compiled JsonPath: Jdt.transform(source, transform, expr -> JsonPathCodegen.compile(expr)::query); All 33 existing tests pass unchanged. To verify: mvn test -pl json-java21-jdt -am
Create JdtAst with three sealed node types: - DirectiveNode: contains @jdt.rename/remove/merge/replace + children - MergeNode: default object-to-object deep merge - ReplacementNode: direct value replacement JdtAstParser builds the AST from a JsonValue transform document. Jdt.parseToAst() public API for codegen modules to access the AST. 4 new AST parser tests (22 total JDT unit tests). To verify: mvn test -pl json-java21-jdt -am
Walk the JsonPath AST with exhaustive switch to emit JavaScript that evaluates the expression without interpretation overhead. Generated modules export a query(root) function that returns an array of matched values. Supports all segment types: property access, array index, slice, wildcard, recursive descent, filter, union, script. 18 tests verify structural validity and correct rendering for each segment type. To verify: mvn test -pl json-java21-jsonpath -Dtest=JsonPathEsmRendererTest
Walk the JDT AST with exhaustive switch to emit JavaScript that applies the transform without interpretation overhead. Generated modules export a transform(source) function with inlined directive logic and a deepMerge helper. Supports all JDT directives: rename, remove, merge, replace, plus default recursive merge for non-directive keys. 7 tests verify correct rendering for each directive type and structural validity. To verify: mvn test -pl json-java21-jdt -Dtest=JdtEsmRendererTest
|
closing due to a git-worktree tooling matter. |
Summary
json-java21-transformstojson-java21-jdt(JSON Document Transforms)json.java21.jdtwithJdtengine andJdtException@jdt.replace,@jdt.remove,@jdt.rename,@jdt.mergeFeatures Implemented
Deferred
Path-based operations using
@jdt.pathattribute moved toSkipped/folders for future implementation.Verification
mvn test -pl json-java21-jdt -Djava.util.logging.ConsoleHandler.level=INFO