chore: Make semantic deps default, auto-backfill embeddings, and default search to semantic#586
chore: Make semantic deps default, auto-backfill embeddings, and default search to semantic#586phernandez wants to merge 1 commit intomainfrom
Conversation
- Move semantic deps into default install and update CI/install/docs messaging\n- Add Alembic-triggered one-time embedding backfill during migrations\n- Default search_notes mode to hybrid when semantic is enabled, with explicit text remaining FTS\n- Update CLI/ChatGPT/UI adapters and tests for new defaults Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task in 3m 35s —— View job PR Review: Make semantic deps default, auto-backfill embeddingsTasks:
Overall this is a well-structured PR. The auto-backfill approach is clever and the literate comments in Issues FoundHigh: Resource leak in
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23a35dc374
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for entity_id in entity_ids: | ||
| await search_repository.sync_entity_vectors(entity_id) |
There was a problem hiding this comment.
Prevent migration backfill from aborting on one bad entity
The new migration-time backfill calls sync_entity_vectors in a tight loop without per-entity error handling, so any single embedding failure (e.g., one malformed row, provider/auth issue, or transient embedding error) now raises out of run_migrations and can block process startup right after upgrade. This is a regression in robustness compared to the existing SearchService.reindex_vectors() path, which catches per-entity failures and continues; the migration hook should avoid turning one bad entity into a full migration failure.
Useful? React with 👍 / 👎.
Summary
fastembed,sqlite-vec,openai) into the default package installjustfile, CI workflow) and semantic docs to match default dependency inclusionbm reindex --embeddingsbehavior) when the new revision is first appliedsemantic_search_enabledbased on semantic dependency availability (fastembed+sqlite_vec)search_noteschoose default mode dynamically (hybridwhen semantic is enabled, otherwisetext/FTS), while preserving explicitsearch_type="text"as FTSValidation
uv run ruff check src/basic_memory/db.py src/basic_memory/alembic/versions/i2c3d4e5f6g7_auto_backfill_semantic_embeddings.py tests/services/test_initialization.pyuv run pytest tests/services/test_initialization.py -quv run pytest tests/services/test_initialization.py tests/repository/test_fastembed_provider.py tests/repository/test_openai_provider.py tests/mcp/test_tool_search.py -quv run ruff check src/basic_memory/config.py src/basic_memory/mcp/tools/search.py src/basic_memory/mcp/tools/chatgpt_tools.py src/basic_memory/mcp/tools/ui_sdk.py src/basic_memory/cli/commands/tool.py tests/test_config.py tests/mcp/test_tool_search.py tests/mcp/tools/test_chatgpt_tools.pyuv run pyright src/basic_memory/config.py src/basic_memory/mcp/tools/search.py src/basic_memory/mcp/tools/chatgpt_tools.py src/basic_memory/mcp/tools/ui_sdk.py src/basic_memory/cli/commands/tool.py tests/mcp/test_tool_search.py tests/mcp/tools/test_chatgpt_tools.pyuv run pytest tests/test_config.py tests/mcp/test_tool_search.py tests/mcp/tools/test_chatgpt_tools.py -q