@ArnabChatterjee20k
@ArnabChatterjee20k ArnabChatterjee20k commented Dec 31, 2025

Query method added

Query::regex

Adapters supporting it

MongoDB, MariaDB/MySQL, Postgres

MongoDB and Postgres are case sensitive by default
MySQL/MariaDB are case insensitive by default

Index added and optimisation

Postgres cases

SELECT * FROM authors WHERE name ~ 'param.*';    /* Regex Query */

~ is a case-sensitive regex match, but a flag embedded in the pattern passed to the regex can make it case insensitive: WHERE username ~ '(?i)param';

Optimisation -> using pg_trgm extension -> trigram index

pg_trgm is bundled with the postgres:16 image and later, so the Dockerfile does not need to change

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_username_trgm
ON users USING gin (username gin_trgm_ops, lastname gin_trgm_ops);

-- gin_trgm_ops must be applied to each attribute in the index --

Trigram index -> An index (GIN or GiST) that stores trigrams instead of full strings

word -> author

trigrams (pg_trgm pads each word with two leading spaces and one trailing space) ->

"  a"

" au"

"aut"

"uth"

"tho"

"hor"

"or "

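As a rough illustration of how the trigrams above are derived, here is a minimal Python sketch that mimics pg_trgm's documented behaviour of lowercasing a word and padding it with two spaces in front and one behind. The helper name `trigrams` is hypothetical, and the sketch handles a single alphanumeric word only; the real extension also splits its input on non-word characters.

```python
def trigrams(word: str) -> list[str]:
    """Return the padded trigrams of a single word, in order of appearance."""
    # pg_trgm treats each word as if it had two spaces prefixed and one suffixed.
    padded = "  " + word.lower() + " "
    # Slide a 3-character window across the padded word.
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


print(trigrams("author"))
```

Because of the padding, the first window is `"  a"`, so "author" yields seven trigrams. A GIN trigram index stores these fragments, which is what lets Postgres narrow down candidate rows for a regex or LIKE pattern without scanning every row.
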
Difference in working of postgres regex engine

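MongoDB and MariaDB use PCRE, and MySQL 8's ICU engine follows similar Perl-style syntax, while Postgres uses its own POSIX-based engine,
so some patterns behave differently.
Example -> \bWork\b matches Work at word boundaries in PCRE-style engines, but in the Postgres POSIX-based engine \b is the backspace escape, so the same pattern does not match on word boundaries (Postgres uses \y for that). There are other differences as well.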
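One possible way to bridge this particular difference is to rewrite the pattern before it reaches Postgres. The sketch below is an assumption-laden illustration, not something this PR implements: `to_postgres_pattern` is a hypothetical helper, Python's `re` module stands in for a PCRE-style engine, and the rewrite covers only `\b` (it does not special-case `\b` inside bracket expressions, where it means backspace).

```python
import re

# Matches a "\b" escape that is not itself escaped: any run of doubled
# backslashes (captured in group 1) followed by a single backslash and "b".
_PCRE_WORD_BOUNDARY = re.compile(r"(?<!\\)((?:\\\\)*)\\b")


def to_postgres_pattern(pattern: str) -> str:
    """Rewrite PCRE-style \\b word boundaries as Postgres \\y (hypothetical helper)."""
    # A callable replacement avoids escape processing of the "\y" text.
    return _PCRE_WORD_BOUNDARY.sub(lambda m: m.group(1) + r"\y", pattern)


# In a PCRE-style engine, \b anchors the match to word boundaries.
print(bool(re.search(r"\bWork\b", "Hard Work pays")))
print(bool(re.search(r"\bWork\b", "Workflow")))

# The same intent expressed for Postgres, where \y is the word-boundary escape.
print(to_postgres_pattern(r"\bWork\b"))
```

An escaped backslash followed by a literal b (`\\b`) is left untouched, since it is not a word-boundary escape in either engine.
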

Summary by CodeRabbit

Release Notes

  • New Features

    • Added regex query support for pattern-based record searching across database adapters
    • Added trigram index support for optimized text indexing (availability depends on database type)
    • Added capability detection to check database support for regex and trigram features
  • Tests

    • Added comprehensive validation tests for regex queries and trigram indexes

@coderabbitai
coderabbitai bot commented Dec 31, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (2)
  • main
  • 0.69.x

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The changes introduce regex query support and trigram index capabilities across the database adapter layer. New abstract methods in the base Adapter class expose regex (PCRE/POSIX) and trigram support flags, each adapter implements these with appropriate return values, and Query/Database classes add regex queries and trigram indexes with corresponding validation logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Adapter Interface & Base SQL: src/Database/Adapter.php, src/Database/Adapter/SQL.php | Introduces abstract capability methods (getSupportForTrigramIndex, getSupportForPRCERegex, getSupportForPOSIXRegex) and adds getSupportForRegex() helper. SQL adapter adds getRegexOperator() method returning 'REGEXP'. |
| Adapter Implementations: src/Database/Adapter/MariaDB.php, src/Database/Adapter/SQLite.php | Implement new capability methods, returning false for all regex and trigram support. |
| Adapter Implementations (Postgres & Mongo): src/Database/Adapter/Postgres.php, src/Database/Adapter/Mongo.php | Postgres: enables POSIX regex and trigram support (true); adds pg_trgm extension and trigram index handling with USING gin_trgm_ops; adds getRegexOperator() returning '~'. Mongo: enables PCRE regex (true); adds TYPE_REGEX operator mapping; disables POSIX and trigram. |
| Adapter Pool: src/Database/Adapter/Pool.php | Adds three delegation methods mirroring adapter capability checks. |
| Query & Database: src/Database/Query.php, src/Database/Database.php | Query adds TYPE_REGEX constant and static regex() helper. Database adds INDEX_TRIGRAM constant and extends index validation to include trigram support flag. |
| Validators: src/Database/Validator/Index.php, src/Database/Validator/Queries.php, src/Database/Validator/Query/Filter.php | Index validator adds checkTrigramIndexes() with validation logic for trigram-only attributes and restrictions. Queries validator maps TYPE_REGEX to filter method type. Filter validator adds TYPE_REGEX support as single-value filter. |
| Integration Tests: tests/e2e/Adapter/Scopes/DocumentTests.php, tests/e2e/Adapter/Scopes/IndexTests.php | DocumentTests adds testFindRegex() with extensive regex query scenarios across adapters. IndexTests adds testTrigramIndex() and testTrigramIndexValidation() with conditional execution based on adapter support. |
| Unit Tests: tests/unit/Validator/IndexTest.php | Adds testTrigramIndexValidation() covering valid/invalid trigram index scenarios with various attribute types and configurations. |
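
The capability-flag pattern summarised above can be sketched roughly as follows. This is a Python illustration of the shape of the design, not the PHP code in this PR: the class and method names are hypothetical ports, the body of the regex helper (true when either engine flag is true) is an assumption because the summary does not spell it out, and the Postgres PCRE flag is assumed to be false.

```python
from abc import ABC, abstractmethod


class Adapter(ABC):
    """Base adapter exposing capability flags (hypothetical Python port)."""

    @abstractmethod
    def get_support_for_pcre_regex(self) -> bool: ...

    @abstractmethod
    def get_support_for_posix_regex(self) -> bool: ...

    @abstractmethod
    def get_support_for_trigram_index(self) -> bool: ...

    def get_support_for_regex(self) -> bool:
        # Assumption: regex is "supported" if either engine flavour is available.
        return self.get_support_for_pcre_regex() or self.get_support_for_posix_regex()


class PostgresAdapter(Adapter):
    def get_support_for_pcre_regex(self) -> bool:
        return False  # assumed; the summary only states the POSIX flag

    def get_support_for_posix_regex(self) -> bool:
        return True

    def get_support_for_trigram_index(self) -> bool:
        return True


class MongoAdapter(Adapter):
    def get_support_for_pcre_regex(self) -> bool:
        return True

    def get_support_for_posix_regex(self) -> bool:
        return False

    def get_support_for_trigram_index(self) -> bool:
        return False


for adapter in (PostgresAdapter(), MongoAdapter()):
    print(type(adapter).__name__, adapter.get_support_for_regex(),
          adapter.get_support_for_trigram_index())
```

Callers such as tests or validators can then branch on the flags instead of checking the concrete adapter type.
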

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Validate create index array #628: Modifies adapter capability flags and updates Index validator constructor signature with new boolean parameters.
  • Primary attributes #623: Adds new abstract methods to Adapter base class, similar interface expansion pattern.
  • Feat mongo tmp #647: Modifies adapter implementations and validator APIs to add capability flags and query/index handling.

Suggested reviewers

  • fogelito
  • abnegate

Poem

🐰 A regex rabbit hops through postgres fields,
With trigram indexes, triumphant yields!
PCRE, POSIX, each adapter now knows,
What patterns it masters, what support it shows,
Validation and queries in harmony blend,
Database capabilities verified, my friend! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Title check | ❓ Inconclusive | The title "Query regex" is vague and overly generic; it does not convey specific information about what was changed or why. | Consider a more descriptive title such as "Add regex query support across database adapters" or "Implement Query::regex method with adapter-specific implementations" to better reflect the scope and intent of the changes. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@ArnabChatterjee20k
@coderabbitai review

@coderabbitai
coderabbitai bot commented Dec 31, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ArnabChatterjee20k ArnabChatterjee20k requested review from abnegate and fogelito and removed request for abnegate December 31, 2025 09:02
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/Database/Query.php (1)

254-301: Add TYPE_REGEX to isMethod() to allow parsing regex queries

TYPE_REGEX is missing from isMethod(), so parse() / parseQuery() will currently reject queries whose method is "regex" even though the type is otherwise supported. That makes the JSON/array parsing path inconsistent with the fluent Query::regex() helper.

Consider adding it to the match:

Proposed fix
     public static function isMethod(string $value): bool
     {
         return match ($value) {
             self::TYPE_EQUAL,
@@
             self::TYPE_SELECT,
             self::TYPE_VECTOR_DOT,
             self::TYPE_VECTOR_COSINE,
-            self::TYPE_VECTOR_EUCLIDEAN => true,
+            self::TYPE_VECTOR_EUCLIDEAN,
+            self::TYPE_REGEX => true,
             default => false,
         };
     }
🧹 Nitpick comments (6)
src/Database/Adapter/Mongo.php (1)

2744-2762: Regex/trigram capability flags for Mongo are consistent (minor naming nit)

Declaring:

  • getSupportForPRCERegex(): true
  • getSupportForPOSIXRegex(): false
  • getSupportForTrigramIndex(): false

matches Mongo’s actual capabilities and aligns with the new abstract methods in Adapter and their use in Pool and other adapters. The only nit is the “PRCE” spelling, which likely wants to be “PCRE”; since this is a newly introduced API, now would be the easiest time to correct the name across Adapter, Pool, and all adapters if you want to fix it.

Also applies to: 3246-3249

tests/unit/Validator/IndexTest.php (1)

481-598: Trigram index validation test is thorough; consider using named args for clarity

The new testTrigramIndexValidation() nicely exercises all key paths in checkTrigramIndexes():

  • happy paths (single and multi string attributes),
  • non‑string and mixed‑type failures,
  • orders/lengths disallowed,
  • and the “feature disabled” case.

Given the growing boolean flag list on Index::__construct, the first validator uses a named supportForTrigramIndexes: true argument, which is clear. For \$validatorNoSupport, you currently rely on a long positional boolean chain; mirroring the named argument style there (e.g., supportForTrigramIndexes: false) would reduce the risk of future parameter‑order mistakes.

src/Database/Adapter.php (1)

1446-1478: Consider fixing the getSupportForPRCERegex name before the API solidifies

The new capability surface looks good:

  • getSupportForTrigramIndex() as an abstract flag.
  • Separate PCRE vs POSIX regex flags plus a getSupportForRegex() helper.

However, the PCRE method is consistently named getSupportForPRCERegex() (letters swapped) while the docblock correctly refers to “PCRE (Perl Compatible Regular Expressions)”. Because this is an abstract method on the core Adapter type, the typo becomes part of the public API and must be implemented everywhere.

Before this spreads further, I’d strongly recommend renaming it to getSupportForPCRERegex() and updating all implementations and call sites in this PR; doing so later will be a breaking change.

To scope the rename, you can search for the current spelling:

#!/bin/bash
rg -n "getSupportForPRCERegex" -C2
tests/e2e/Adapter/Scopes/DocumentTests.php (1)

6552-7034: testFindRegex is thorough; consider a few small cleanups

The test logic and coverage across engines/adapters look solid and consistent with the new Query::regex semantics. A few nits you might want to tidy up:

  • Several $pattern locals are assigned but never used (e.g., around Line 6713, 6727, 6741, 6753, 6767). They can be dropped to reduce noise.
  • The $verifyRegexQuery helper always uses PHP’s PCRE engine to validate results. That’s fine for the current pattern set (simple anchors, .*, $, colon, etc.), but if you later add more POSIX‑specific constructs, it may be worth documenting that assumption near the helper to avoid future mismatches.

No functional issues; these are just clarity/maintenance tweaks.

src/Database/Database.php (1)

3671-3679: Trigram index creation gating and error messaging are consistent with existing patterns

  • The new case self::INDEX_TRIGRAM branch correctly guards index creation with getSupportForTrigramIndex() and throws a clear exception when unsupported.
  • The “Unknown index type” error text now includes Database::INDEX_TRIGRAM, keeping user-facing guidance accurate.
  • Adding the trigram support flag into the IndexValidator call in createIndex() keeps validation behavior aligned across create/alter flows.

If you find yourself adding more index types later, consider building the allowed-type list for the exception message from the constants instead of hardcoding the concatenated string, but that’s optional.

Also applies to: 3721-3735

tests/e2e/Adapter/Scopes/IndexTests.php (1)

176-178: Consider documenting or using named parameter for the hardcoded false value.

The hardcoded false on line 177 is not self-documenting. While PHP 8.0+ supports named arguments, at minimum a trailing comment would clarify what this boolean controls, improving maintainability.

             $database->getAdapter()->getSupportForMultipleFulltextIndexes(),
             $database->getAdapter()->getSupportForIdenticalIndexes(),
-            false,
+            false, // $supportForIndexAlias (or appropriate param name)
             $database->getAdapter()->getSupportForTrigramIndex()
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8c1b2f and e18512d.

📒 Files selected for processing (15)
  • src/Database/Adapter.php
  • src/Database/Adapter/MariaDB.php
  • src/Database/Adapter/Mongo.php
  • src/Database/Adapter/Pool.php
  • src/Database/Adapter/Postgres.php
  • src/Database/Adapter/SQL.php
  • src/Database/Adapter/SQLite.php
  • src/Database/Database.php
  • src/Database/Query.php
  • src/Database/Validator/Index.php
  • src/Database/Validator/Queries.php
  • src/Database/Validator/Query/Filter.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/e2e/Adapter/Scopes/IndexTests.php
  • tests/unit/Validator/IndexTest.php
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: ArnabChatterjee20k
Repo: utopia-php/database PR: 613
File: src/Database/Adapter/Postgres.php:1254-1319
Timestamp: 2025-07-01T11:31:37.438Z
Learning: In PostgreSQL adapter methods like getUpsertStatement, complexity for database-specific SQL generation is acceptable when the main business logic is properly separated in the parent SQL adapter class, following the adapter pattern where each database adapter handles its own SQL syntax requirements.
📚 Learning: 2025-10-16T08:48:36.715Z
Learnt from: fogelito
Repo: utopia-php/database PR: 739
File: src/Database/Adapter/Postgres.php:154-158
Timestamp: 2025-10-16T08:48:36.715Z
Learning: For the utopia-php/database repository, no migration scripts are needed for the collation change from utf8_ci to utf8_ci_ai in the Postgres adapter because there is no existing production data.

Applied to files:

  • src/Database/Adapter/Postgres.php
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/IndexTests.php
  • tests/unit/Validator/IndexTest.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • src/Database/Database.php
📚 Learning: 2025-10-03T01:50:11.943Z
Learnt from: abnegate
Repo: utopia-php/database PR: 721
File: tests/e2e/Adapter/Scopes/AttributeTests.php:1329-1334
Timestamp: 2025-10-03T01:50:11.943Z
Learning: MongoDB has a 1024kb (1,048,576 bytes) limit for index entries. The MongoDB adapter's getMaxIndexLength() method should return this limit rather than 0.

Applied to files:

  • tests/e2e/Adapter/Scopes/IndexTests.php
📚 Learning: 2025-10-29T12:27:57.071Z
Learnt from: ArnabChatterjee20k
Repo: utopia-php/database PR: 747
File: src/Database/Adapter/Mongo.php:1449-1453
Timestamp: 2025-10-29T12:27:57.071Z
Learning: In src/Database/Adapter/Mongo.php, when getSupportForAttributes() returns false (schemaless mode), the updateDocument method intentionally uses a raw document without $set operator for replacement-style updates, as confirmed by the repository maintainer ArnabChatterjee20k.

Applied to files:

  • src/Database/Adapter/Mongo.php
📚 Learning: 2025-10-16T09:37:33.531Z
Learnt from: fogelito
Repo: utopia-php/database PR: 733
File: src/Database/Adapter/MariaDB.php:1801-1806
Timestamp: 2025-10-16T09:37:33.531Z
Learning: In the MariaDB adapter (src/Database/Adapter/MariaDB.php), only duplicate `_uid` violations should throw `DuplicateException`. All other unique constraint violations, including `PRIMARY` key collisions on the internal `_id` field, should throw `UniqueException`. This is the intended design to distinguish between user-facing document duplicates and internal/user-defined unique constraint violations.

Applied to files:

  • src/Database/Database.php
🧬 Code graph analysis (10)
src/Database/Adapter/SQLite.php (5)
src/Database/Adapter/Postgres.php (2)
  • getSupportForPRCERegex (2122-2125)
  • getSupportForPOSIXRegex (2127-2130)
src/Database/Adapter.php (2)
  • getSupportForPRCERegex (1459-1459)
  • getSupportForPOSIXRegex (1467-1467)
src/Database/Adapter/MariaDB.php (2)
  • getSupportForPRCERegex (2239-2242)
  • getSupportForPOSIXRegex (2244-2247)
src/Database/Adapter/Mongo.php (2)
  • getSupportForPRCERegex (2749-2752)
  • getSupportForPOSIXRegex (2759-2762)
src/Database/Adapter/Pool.php (2)
  • getSupportForPRCERegex (368-371)
  • getSupportForPOSIXRegex (373-376)
src/Database/Adapter/SQL.php (2)
src/Database/Query.php (1)
  • Query (8-1195)
src/Database/Adapter/Postgres.php (1)
  • getRegexOperator (2148-2151)
src/Database/Adapter.php (5)
src/Database/Adapter/Postgres.php (3)
  • getSupportForTrigramIndex (2132-2135)
  • getSupportForPRCERegex (2122-2125)
  • getSupportForPOSIXRegex (2127-2130)
src/Database/Adapter/MariaDB.php (3)
  • getSupportForTrigramIndex (2234-2237)
  • getSupportForPRCERegex (2239-2242)
  • getSupportForPOSIXRegex (2244-2247)
src/Database/Adapter/Mongo.php (3)
  • getSupportForTrigramIndex (3246-3249)
  • getSupportForPRCERegex (2749-2752)
  • getSupportForPOSIXRegex (2759-2762)
src/Database/Adapter/Pool.php (3)
  • getSupportForTrigramIndex (378-381)
  • getSupportForPRCERegex (368-371)
  • getSupportForPOSIXRegex (373-376)
src/Database/Adapter/SQLite.php (2)
  • getSupportForPRCERegex (1886-1889)
  • getSupportForPOSIXRegex (1897-1900)
src/Database/Adapter/Pool.php (6)
src/Database/Adapter/Postgres.php (3)
  • getSupportForPRCERegex (2122-2125)
  • getSupportForPOSIXRegex (2127-2130)
  • getSupportForTrigramIndex (2132-2135)
src/Database/Adapter.php (3)
  • getSupportForPRCERegex (1459-1459)
  • getSupportForPOSIXRegex (1467-1467)
  • getSupportForTrigramIndex (1451-1451)
src/Database/Adapter/MariaDB.php (3)
  • getSupportForPRCERegex (2239-2242)
  • getSupportForPOSIXRegex (2244-2247)
  • getSupportForTrigramIndex (2234-2237)
src/Database/Adapter/Mongo.php (3)
  • getSupportForPRCERegex (2749-2752)
  • getSupportForPOSIXRegex (2759-2762)
  • getSupportForTrigramIndex (3246-3249)
src/Database/Adapter/SQLite.php (2)
  • getSupportForPRCERegex (1886-1889)
  • getSupportForPOSIXRegex (1897-1900)
src/Database/Mirror.php (1)
  • delegate (88-103)
src/Database/Adapter/Postgres.php (3)
src/Database/Adapter.php (3)
  • getSupportForPRCERegex (1459-1459)
  • getSupportForPOSIXRegex (1467-1467)
  • getSupportForTrigramIndex (1451-1451)
src/Database/Database.php (1)
  • Database (37-8790)
src/Database/Adapter/SQLite.php (2)
  • getSupportForPRCERegex (1886-1889)
  • getSupportForPOSIXRegex (1897-1900)
tests/unit/Validator/IndexTest.php (2)
src/Database/Validator/Index.php (2)
  • Index (10-622)
  • getDescription (76-79)
src/Database/Validator/Query/Base.php (1)
  • getDescription (25-28)
src/Database/Adapter/Mongo.php (6)
src/Database/Query.php (1)
  • Query (8-1195)
src/Database/Adapter/Postgres.php (3)
  • getSupportForPRCERegex (2122-2125)
  • getSupportForPOSIXRegex (2127-2130)
  • getSupportForTrigramIndex (2132-2135)
src/Database/Adapter.php (3)
  • getSupportForPRCERegex (1459-1459)
  • getSupportForPOSIXRegex (1467-1467)
  • getSupportForTrigramIndex (1451-1451)
src/Database/Adapter/MariaDB.php (3)
  • getSupportForPRCERegex (2239-2242)
  • getSupportForPOSIXRegex (2244-2247)
  • getSupportForTrigramIndex (2234-2237)
src/Database/Adapter/Pool.php (3)
  • getSupportForPRCERegex (368-371)
  • getSupportForPOSIXRegex (373-376)
  • getSupportForTrigramIndex (378-381)
src/Database/Adapter/SQLite.php (2)
  • getSupportForPRCERegex (1886-1889)
  • getSupportForPOSIXRegex (1897-1900)
tests/e2e/Adapter/Scopes/DocumentTests.php (9)
src/Database/Query.php (3)
  • Query (8-1195)
  • regex (1191-1194)
  • or (798-801)
tests/e2e/Adapter/Base.php (1)
  • getDatabase (46-46)
tests/e2e/Adapter/MongoDBTest.php (1)
  • getDatabase (32-66)
src/Database/Adapter/Postgres.php (4)
  • getSupportForPRCERegex (2122-2125)
  • getSupportForPOSIXRegex (2127-2130)
  • create (139-168)
  • delete (178-186)
src/Database/Adapter/MariaDB.php (4)
  • getSupportForPRCERegex (2239-2242)
  • getSupportForPOSIXRegex (2244-2247)
  • create (31-46)
  • delete (56-67)
src/Database/Adapter/Mongo.php (4)
  • getSupportForPRCERegex (2749-2752)
  • getSupportForPOSIXRegex (2759-2762)
  • create (328-331)
  • delete (388-393)
src/Database/Adapter/Pool.php (4)
  • getSupportForPRCERegex (368-371)
  • getSupportForPOSIXRegex (373-376)
  • create (133-136)
  • delete (148-151)
src/Database/Adapter/SQLite.php (4)
  • getSupportForPRCERegex (1886-1889)
  • getSupportForPOSIXRegex (1897-1900)
  • create (116-119)
  • delete (129-132)
src/Database/Validator/Queries/Document.php (1)
  • Document (10-44)
src/Database/Database.php (5)
src/Database/Adapter/Postgres.php (2)
  • getSupportForTrigramIndex (2132-2135)
  • getSupportForObject (2222-2225)
src/Database/Adapter.php (2)
  • getSupportForTrigramIndex (1451-1451)
  • getSupportForObject (1080-1080)
src/Database/Adapter/MariaDB.php (2)
  • getSupportForTrigramIndex (2234-2237)
  • getSupportForObject (2138-2141)
src/Database/Adapter/Mongo.php (2)
  • getSupportForTrigramIndex (3246-3249)
  • getSupportForObject (2814-2817)
src/Database/Adapter/Pool.php (2)
  • getSupportForTrigramIndex (378-381)
  • getSupportForObject (603-606)
src/Database/Validator/Queries.php (3)
src/Database/Query.php (1)
  • Query (8-1195)
tests/e2e/Adapter/Base.php (1)
  • Base (24-76)
src/Database/Validator/Query/Base.php (1)
  • Base (7-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (Mirror)
🔇 Additional comments (19)
src/Database/Validator/Query/Filter.php (1)

326-344: Treating TYPE_REGEX as a single‑value filter is consistent

Adding Query::TYPE_REGEX to the group that requires exactly one value and then delegating to isValidAttributeAndValues() aligns it with other single‑value string filters (search, startsWith, etc.) and correctly reuses existing type validation.

src/Database/Query.php (1)

29-29: TYPE_REGEX constant and regex() helper are wired consistently

Exposing TYPE_REGEX, adding it to TYPES, and providing Query::regex($attribute, $pattern) (single pattern value) is consistent with the other helper factories and with the filter validator’s expectation of exactly one value.

Also applies to: 69-114, 1184-1194

src/Database/Adapter/SQL.php (1)

1771-1799: Regex operator integration into SQL adapters looks correct

Routing Query::TYPE_REGEX through getSQLOperator() and introducing a dedicated getRegexOperator() (defaulting to 'REGEXP' here and overridden in adapters like Postgres) cleanly extends the operator surface without affecting existing conditions or bindings.

Also applies to: 2289-2295

src/Database/Adapter/Mongo.php (1)

2452-2477: TYPE_REGEX mapping to $regex is correct; be aware it bypasses createSafeRegex()

Mapping Query::TYPE_REGEX to Mongo’s $regex operator plugs the new query type cleanly into buildFilter(), which will now produce:

$filter[$attribute]['$regex'] = $pattern;

Unlike the higher‑level text helpers (contains, search, notSearch, not*With), this path does not go through createSafeRegex(), so patterns are taken as raw PCRE. That’s appropriate for a low‑level regex primitive, but it means any safety/validation for user‑supplied patterns must be enforced at a higher layer.

From the adapter’s perspective the wiring looks sound.

src/Database/Adapter/SQLite.php (1)

1880-1900: Explicitly declaring SQLite has no built‑in regex support looks correct

Both capability methods returning false match SQLite’s lack of native REGEXP without user‑defined functions; this is a sensible, conservative default.

src/Database/Validator/Queries.php (1)

80-127: Routing TYPE_REGEX and TYPE_VECTOR_EUCLIDEAN through filter validators is consistent

Adding Query::TYPE_VECTOR_EUCLIDEAN and Query::TYPE_REGEX to the METHOD_TYPE_FILTER group cleanly integrates them into existing validation without special‑casing.

src/Database/Adapter/Pool.php (1)

368-381: Delegation of regex/trigram capability flags is correctly wired

The three new methods mirror the existing delegation pattern and keep the pool adapter’s capability surface in sync with concrete adapters.

src/Database/Adapter/Postgres.php (3)

155-157: LGTM!

The pg_trgm extension creation follows the established pattern for enabling PostgreSQL extensions alongside postgis and vector.


903-931: LGTM - Trigram index SQL generation correctly uses GIN with gin_trgm_ops.

The implementation correctly:

  1. Maps INDEX_TRIGRAM to 'INDEX' for GIN index creation
  2. Applies gin_trgm_ops operator class to each attribute for trigram-based similarity searches

Note: Trigram indexes intentionally don't include _tenant prefix (lines 913-916) since GIN indexes work differently than B-tree indexes for multi-tenant filtering.


2145-2151: LGTM!

The ~ operator is the correct PostgreSQL POSIX regex match operator for case-sensitive pattern matching, aligning with the PR's documentation that PostgreSQL uses POSIX-style regex.

src/Database/Validator/Index.php (3)

47-48: LGTM!

The new $supportForTrigramIndexes parameter follows the established pattern for index type support flags with a sensible default of false.


142-144: LGTM!

The trigram index validation check is correctly integrated into the validation chain.


470-506: LGTM!

The checkTrigramIndexes method correctly validates:

  1. Adapter support for trigram indexes
  2. String-only attribute type constraint
  3. No orders or lengths (as these aren't applicable to GIN trigram indexes)

The method intentionally allows multiple attributes (unlike vector/spatial/object indexes), which is appropriate since PostgreSQL GIN indexes can index multiple columns.
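
The three rules listed above could be sketched like this. It is a simplified Python illustration rather than the PHP validator from this PR: `TrigramIndex`, `check_trigram_index`, and the error messages are hypothetical, and any non-empty `orders` or `lengths` list is treated as a violation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrigramIndex:
    """Minimal stand-in for an index definition (hypothetical)."""
    attributes: list[str]
    orders: list[str] = field(default_factory=list)
    lengths: list[int] = field(default_factory=list)


def check_trigram_index(
    index: TrigramIndex,
    attribute_types: dict[str, str],
    adapter_supports_trigram: bool,
) -> Optional[str]:
    """Return an error message, or None when the index is valid."""
    # Rule 1: the adapter must support trigram indexes at all.
    if not adapter_supports_trigram:
        return "Trigram indexes are not supported"
    # Rule 2: every indexed attribute must be a string; several are allowed.
    for name in index.attributes:
        if attribute_types.get(name) != "string":
            return f"Trigram index requires string attributes, got '{name}'"
    # Rule 3: orders and lengths do not apply to GIN trigram indexes.
    if index.orders or index.lengths:
        return "Trigram indexes do not support orders or lengths"
    return None


types = {"name": "string", "bio": "string", "age": "integer"}
print(check_trigram_index(TrigramIndex(["name", "bio"]), types, True))
print(check_trigram_index(TrigramIndex(["name", "age"]), types, True))
```

Returning a message instead of raising keeps the sketch close to a validator that reports a description of the failure.
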

src/Database/Database.php (3)

79-88: New INDEX_TRIGRAM constant is well-integrated

Constant name/value aligns with existing index type conventions and keeps index kinds centrally defined; no issues here.


1631-1646: Propagating trigram support into collection-level index validation looks correct

Passing $this->adapter->getSupportForTrigramIndex() into IndexValidator during createCollection() ensures trigram indexes are validated against adapter capabilities at collection creation time, consistent with other feature flags (arrays, spatial, vectors, object, etc.). Argument ordering matches the other call sites, so this looks sound.


2778-2792: Update-attribute path now trigram-aware via IndexValidator

Extending the IndexValidator call in updateAttribute() with both getSupportForObject() and getSupportForTrigramIndex() keeps index validation consistent when attribute definitions change, preventing incompatible trigram indexes from slipping through on adapters that don’t support them. The added flags are in the same relative position as in other call sites, which helps avoid ctor-signature drift.

tests/e2e/Adapter/Scopes/IndexTests.php (3)

269-271: Same hardcoded false pattern as above.

Apply the same documentation improvement here for consistency.


652-699: Well-structured trigram index test with proper cleanup.

The test correctly:

  • Guards with early return when trigram support is unavailable
  • Tests CRUD operations for trigram indexes
  • Uses try/finally to ensure cleanup regardless of test outcome
  • Verifies index metadata (type, attributes) after creation

701-772: Thorough validation test covering key constraint scenarios.

The test comprehensively validates:

  • Type constraints (string-only attributes)
  • Multi-attribute trigram indexes
  • Mixed-type rejection
  • Orders/lengths constraints

The use of assertStringContainsString for error message verification is appropriate as it decouples the test from exact wording changes.

@fogelito
LGTM let's wait for @abnegate review.
Will have to update joins PR later on
