-
-
Notifications
You must be signed in to change notification settings - Fork 174
Add support for Upserts #753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
…compilation error handling to helix build (HelixDB#751) <!-- greptile_comment --> <h2>Greptile Overview</h2> <h3>Greptile Summary</h3> This PR makes two key improvements: **1. Removes static globals from config generation** - Replaces process-wide `OnceLock` statics (`INTROSPECTION_DATA`, `SECONDARY_INDICES`) with instance-scoped data flow through `GeneratedSource`. This architectural change improves thread safety and eliminates potential issues with concurrent compilation of multiple instances. **2. Adds Rust compilation error handling for Docker builds** - When a Docker build fails due to Rust compilation errors (potentially from a bug in the Helix code generator), the CLI now detects this, displays a helpful error message, and offers to open a pre-filled GitHub issue with diagnostic information. - Introspection data and secondary indices now flow through `GeneratedSource` instead of global statics - New `DockerBuildError` enum distinguishes Rust compilation failures from other Docker errors - `Config::fmt_with_schema()` method replaces `Display` impl for code generation with explicit parameters - Minor optimization: spinner frames changed from `Vec` to array literal <details><summary><h3>Important Files Changed</h3></summary> File Analysis | Filename | Score | Overview | |----------|-------|----------| | helix-db/src/helixc/analyzer/mod.rs | 5/5 | Removed static `INTROSPECTION_DATA` and `SECONDARY_INDICES` OnceLocks, moving data into `GeneratedSource` struct for thread-safe, instance-scoped operation. | | helix-db/src/helixc/generator/mod.rs | 5/5 | Added `introspection_data` and `secondary_indices` fields to `Source` struct; Display impl now calls `fmt_with_schema` for proper code generation. | | helix-db/src/helix_engine/traversal_core/config.rs | 4/5 | Refactored config formatting - moved Display logic to `fmt_with_schema` method that takes introspection data as parameters instead of reading from statics. Contains outdated comment referencing removed `INTROSPECTION_DATA`. | | helix-cli/src/commands/build.rs | 5/5 | Added Rust compilation error handling during Docker builds with GitHub issue creation workflow for user bug reporting. | | helix-cli/src/docker.rs | 5/5 | Added `DockerBuildError` enum to distinguish Rust compilation failures from other Docker errors, enabling targeted error handling. | | helix-cli/src/utils.rs | 5/5 | Changed spinner frames from `vec![]` to array literal `[]` - minor performance improvement avoiding heap allocation. | </details> </details> <details><summary><h3>Sequence Diagram</h3></summary> ```mermaid sequenceDiagram participant User participant CLI as helix build participant Analyzer participant Generator participant Docker participant Browser User->>CLI: helix build instance_name CLI->>Analyzer: analyze(source) Analyzer->>Analyzer: Build introspection_data Analyzer->>Analyzer: Collect secondary_indices Analyzer-->>CLI: GeneratedSource with embedded data CLI->>Generator: fmt_with_schema(introspection_data, secondary_indices) Generator-->>CLI: Generated Rust code CLI->>Docker: build_image() alt Rust Compilation Error Docker-->>CLI: DockerBuildError::RustCompilation CLI->>CLI: handle_docker_rust_compilation_failure() CLI->>User: Display error, offer to create issue opt User accepts CLI->>Browser: Open GitHub issue with diagnostics end else Success Docker-->>CLI: Ok(()) CLI-->>User: Build successful end ``` </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment -->
Needs some cleaning and better error handling tho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (4)
-
helix-db/src/helix_engine/traversal_core/ops/util/upsert.rs, line 309 (link)style: TODO comment indicates error handling needs improvement - operations continue even after errors occur, wasting resources. Consider early returns or using
?operator to propagate errors immediately. -
helix-db/src/helix_engine/traversal_core/ops/util/upsert.rs, line 62-73 (link)style: The upsert logic checks
self.inner.next()to determine if an entity exists, but there's no clear mechanism or documentation explaining how users should pass an existing entity to the upsert operation. The tests only verify parser compilation, not actual update behavior. Consider adding documentation/examples showing the update path and runtime tests that verify both insert and update cases. -
helix-db/tests/upsert_integration_test.rs, line 232-254 (link)style: Test only verifies parsing, not runtime behavior. Add runtime tests that: (1) create an entity then upsert it to verify update, (2) upsert a non-existent entity to verify insert, (3) test property merging, (4) verify secondary indices and BM25 updates.
-
helix-db/src/helix_engine/traversal_core/ops/util/upsert.rs, line 108-147 (link)style: When updating secondary indices for changed properties, if
old.get(k)returnsNone, youcontinuewithout creating the new index. New properties are handled separately in thedifflogic (lines 149-172). Consider consolidating this logic for clarity.
14 files reviewed, 4 comments
|
@el-yawd should add some helixql tests in hql-tests and unit tests |
84d1096 to
d42b17c
Compare
Description
Related Issues
Closes #
Checklist when merging to main
rustfmthelix-cli/Cargo.tomlandhelixdb/Cargo.tomlAdditional Notes
Greptile Overview
Greptile Summary
This PR adds comprehensive support for upsert operations (
UpsertN,UpsertE,UpsertV) to HelixDB, enabling update-or-insert semantics for nodes, edges, and vectors. The implementation follows the existing patterns forAddoperations and includes changes across the grammar, parser, analyzer, code generator, and runtime engine.Key Changes
UpsertN,UpsertE, andUpsertVsyntax matchingAddN,AddE, andAddVpatternsObservations
self.inner.next()to determine if an entity exists, but there's no clear documentation or examples showing how users should pass existing entities to trigger the update path vs the insert pathCargo.tomlfiles show version was bumped to 1.1.4Recommendations
existing <- N<Person>(id)followed byupdated <- existing::UpsertN<Person>({...})?)Important Files Changed
File Analysis
UpsertN,UpsertE, andUpsertVoperations, mirroring the structure ofAddN,AddE, andAddV. Updated step and creation_stmt rules to include upsert operations.UpsertNode,UpsertEdge, andUpsertVectorexpressions, following the same pattern as Add operations with property validation and type checking.UpsertN,UpsertE, andUpsertV, following the same structure as Add operations. Includes comprehensive unit tests.Sequence Diagram
sequenceDiagram participant User participant Parser participant Analyzer participant Generator participant Runtime participant Storage User->>Parser: UpsertN<Person>({name: "Alice"}) Parser->>Parser: parse_upsert_node() Parser->>Analyzer: UpsertNode AST Analyzer->>Analyzer: Validate node type exists Analyzer->>Analyzer: Validate properties & types Analyzer->>Generator: Generate UpsertN code Generator->>Generator: Create upsert_n() call Generator->>Runtime: Execute traversal with upsert_n() Runtime->>Runtime: Check if entity in iterator alt Entity exists (iterator has item) Runtime->>Runtime: Merge properties (old + new) Runtime->>Storage: Update secondary indices Runtime->>Storage: Update BM25 index Runtime->>Storage: Update node in nodes_db Runtime-->>User: Return updated node else Entity does not exist (iterator empty) Runtime->>Runtime: Create new node with UUID Runtime->>Storage: Insert secondary indices Runtime->>Storage: Insert BM25 index Runtime->>Storage: Insert node in nodes_db Runtime-->>User: Return new node end