Optimize id()/start_id()/end_id() with direct column access#2295
Optimize id()/start_id()/end_id() with direct column access#2295jrgemignani wants to merge 1 commit intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the execution of id(), start_id(), and end_id() functions by avoiding full vertex/edge object reconstruction. Instead of calling these functions on rebuilt _agtype_build_vertex/_agtype_build_edge expressions, the implementation directly accesses the underlying graphid columns and wraps them with graphid_to_agtype().
Key Changes:
- Added hidden column export mechanism that stores id, start_id, and end_id as target entries when entities pass between clauses
- Implemented cross-clause and current-clause optimization patterns in
try_optimize_id_funcs() - Added validation logic to prevent stale Var references after WITH clauses
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| src/include/parser/cypher_transform_entity.h | Added id_var, start_id_var, end_id_var, and props_var fields to transform_entity struct to store exposed column Vars |
| src/backend/parser/cypher_transform_entity.c | Initialized the new Var fields to NULL in make_transform_entity |
| src/backend/parser/cypher_expr.c | Implemented optimization logic with try_optimize_id_funcs, extract_id_var_from_entity_expr, and find_entity_in_current_cpstate; includes unrelated whitespace changes |
| src/backend/parser/cypher_clause.c | Added export_entity_hidden_columns to export id columns as hidden target entries and update_entity_vars_from_rte to update entity Vars from RTE columns |
| regress/sql/cypher_match.sql | Added 13 test cases covering WHERE clause optimizations for both current-clause and cross-clause scenarios |
| regress/sql/cypher_with.sql | Added 12 test cases verifying id functions work correctly through WITH clauses, including chained WITH and aliasing |
| regress/expected/cypher_match.out | Updated expected output showing optimized query plans using graphid_to_agtype instead of age_id |
| regress/expected/cypher_with.out | Updated expected output with correct results for WITH clause id function tests |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
eac9cf1 to
3ffb136
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Optimize id()/start_id()/end_id() with direct graphid column access. NOTE: This PR is created with AI tools and a human. Add optimization to avoid rebuilding full vertex/edge objects when only the id, start_id, or end_id is needed. Instead of calling age_id() on a reconstructed _agtype_build_vertex/_agtype_build_edge, directly access the underlying graphid column. This enables PostgreSQL to use B-tree indexes on graphid columns for significantly improved query performance. Implementation: - Export hidden columns as raw graphid type (not agtype-wrapped) when entities pass between clauses via export_entity_hidden_columns() - Store Var references (id_var, start_id_var, end_id_var) in transform_entity - try_optimize_id_funcs() returns raw graphid Var for optimizable patterns - Add cross-type comparison operators (graphid vs int8) with btree support - Update graphid_ops operator class to include cross-type operators, enabling Index Cond scans instead of post-scan Filter evaluation - Add graphid support to agtype_volatile_wrapper() for SET/MERGE contexts Optimized patterns: MATCH (p) WHERE id(p) > 0 RETURN p MATCH ()-[e]->() WHERE id(e) > 0 RETURN e MATCH ()-[e]->() WHERE start_id(e) > 0 RETURN e MATCH ()-[e]->() WHERE end_id(e) > 0 RETURN e All regression tests passed. Added regression tests for the new graphid operators. Files changed: - src/include/parser/cypher_transform_entity.h: Add id_var, start_id_var, end_id_var fields to transform_entity struct - src/backend/parser/cypher_clause.c: Export raw graphid columns - src/backend/parser/cypher_expr.c: Add try_optimize_id_funcs(), extract_id_var_from_entity_expr(); return raw graphid Var - src/backend/utils/adt/graphid.c: Add 12 cross-type comparison functions (graphid_eq_int8, etc.) and 2 btree comparison functions (graphid_btree_cmp_int8, int8_btree_cmp_graphid) - src/backend/utils/adt/agtype.c: Add GRAPHIDOID to agtype_volatile_wrapper() - sql/age_main.sql: Add cross-type operators, functions, and update graphid_ops operator class with cross-type btree support - age--1.6.0--y.y.y.sql: Add upgrade definitions for new functions, operators, and operator class - regress/sql/graphid.sql: Add cross-type operator and index tests - regress/sql/cypher_match.sql: Add WHERE optimization tests - regress/expected/*.out: Update expected output
3ffb136 to
ae06570
Compare
NOTE: This PR was created with AI tools and a human
Optimize id()/start_id()/end_id() with direct graphid column access.
Add optimization to avoid rebuilding full vertex/edge objects when only
the id, start_id, or end_id is needed. Instead of calling age_id() on a
reconstructed _agtype_build_vertex/_agtype_build_edge, directly access
the underlying graphid column. This enables PostgreSQL to use B-tree
indexes on graphid columns for significantly improved query performance.
Implementation:
entities pass between clauses via export_entity_hidden_columns()
transform_entity
patterns
support
enabling Index Cond scans instead of post-scan Filter evaluation
contexts
Optimized patterns:
MATCH (p) WHERE id(p) > 0 RETURN p
MATCH ()-[e]->() WHERE id(e) > 0 RETURN e
MATCH ()-[e]->() WHERE start_id(e) > 0 RETURN e
MATCH ()-[e]->() WHERE end_id(e) > 0 RETURN e
All regression tests passed.
Added regression tests for the new graphid operators.
Files changed:
end_id_var fields to transform_entity struct
extract_id_var_from_entity_expr(); return raw graphid Var
(graphid_eq_int8, etc.) and 2 btree comparison functions
(graphid_btree_cmp_int8, int8_btree_cmp_graphid)
agtype_volatile_wrapper()
graphid_ops operator class with cross-type btree support
operators, and operator class