BUG: New entities silently skip embedding generation after sqlite-vec load failure #578

@bm-clawd

Problem

If sqlite-vec fails to load during a bm watch session (e.g. with SemanticDependenciesMissingError), entities created afterwards are indexed in FTS but not in the vector store. There is no recovery mechanism: the watcher keeps running but silently skips embedding generation for every new entity.

Reproduction

  1. Start bm watch --project claw
  2. At some point, sqlite-vec fails to load (observed in logs: SemanticDependenciesMissingError: sqlite-vec package is missing)
  3. New files are created and synced, and they appear in FTS search
  4. search_vector_chunks has zero rows for these entities
  5. Vector and hybrid search cannot find them

Evidence

-- 5 of 66 entities had zero vector chunks
SELECT id, title FROM entity 
WHERE project_id = 2 
AND id NOT IN (SELECT DISTINCT entity_id FROM search_vector_chunks);
-- Returns entities created after the sqlite-vec error
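
The same check can be scripted so it is easy to rerun after a sync. The following is a minimal sketch, not Basic Memory code: the table and column names come from the query above, while find_unembedded, demo_connection, and the simplified in-memory schema are hypothetical and exist only for illustration. It uses NOT EXISTS rather than NOT IN, because NOT IN matches nothing if the subquery ever yields a NULL entity_id.

```python
import sqlite3


def find_unembedded(conn: sqlite3.Connection, project_id: int) -> list:
    """Return (id, title) for entities in the project that have no vector chunks."""
    rows = conn.execute(
        """
        SELECT e.id, e.title
        FROM entity AS e
        WHERE e.project_id = ?
          AND NOT EXISTS (
              SELECT 1 FROM search_vector_chunks AS c WHERE c.entity_id = e.id
          )
        ORDER BY e.id
        """,
        (project_id,),
    ).fetchall()
    return [(row[0], row[1]) for row in rows]


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entity (id INTEGER PRIMARY KEY, title TEXT, project_id INTEGER);
        CREATE TABLE search_vector_chunks (id INTEGER PRIMARY KEY, entity_id INTEGER);
        INSERT INTO entity VALUES
            (1, 'before failure', 2),
            (2, 'after failure', 2),
            (3, 'other project', 1);
        -- The NULL row is what would make a NOT IN query return nothing.
        INSERT INTO search_vector_chunks (entity_id) VALUES (1), (1), (NULL);
        """
    )
    return conn


if __name__ == "__main__":
    for entity_id, title in find_unembedded(demo_connection(), project_id=2):
        print(entity_id, title)
```

Against a real database, pass a connection to the project's SQLite file instead of demo_connection(); the actual schema may differ from the one assumed here.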

Workaround

Running bm reindex --embeddings -p claw regenerates all embeddings, but users won't know to run it because the failure is silent.

Suggested Fix

  1. Log a warning on every sync when embeddings fail (not just the first time)
  2. Retry sqlite-vec loading periodically
  3. Track embedding failures and expose them in bm status or bm doctor
  4. Consider: if semantic search is enabled but sqlite-vec can't load, should the watcher error out loudly rather than silently degrade?
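
To make suggestions 1 to 3 concrete, here is one possible shape for the logic, written as a standalone sketch rather than a patch against Basic Memory. Every name in it (EmbeddingGuard, load_backend, embed, retry_interval, failed_ids) is hypothetical; load_backend stands in for whatever currently raises SemanticDependenciesMissingError, and embed for the per-entity embedding step.

```python
import logging
import time
from typing import Callable, Optional, Set

logger = logging.getLogger("embedding_guard")


class EmbeddingGuard:
    """Wraps embedding so failures are logged, tracked, and retried."""

    def __init__(
        self,
        load_backend: Callable[[], None],  # hypothetical: raises if sqlite-vec cannot load
        embed: Callable[[int], None],      # hypothetical: embeds one entity by id
        retry_interval: float = 60.0,      # seconds between load attempts
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.load_backend = load_backend
        self.embed = embed
        self.retry_interval = retry_interval
        self.clock = clock
        self.failed_ids: Set[int] = set()  # could back a status/doctor report
        self._available = False
        self._last_attempt: Optional[float] = None

    def _ensure_backend(self) -> bool:
        """Try to load the backend, at most once per retry_interval."""
        if self._available:
            return True
        now = self.clock()
        if self._last_attempt is not None and now - self._last_attempt < self.retry_interval:
            return False
        self._last_attempt = now
        try:
            self.load_backend()
        except Exception as exc:
            logger.warning("vector backend unavailable: %s", exc)
            return False
        self._available = True
        return True

    def sync_entity(self, entity_id: int) -> bool:
        """Embed one entity; warn on every skip and remember the id on failure."""
        if not self._ensure_backend():
            self.failed_ids.add(entity_id)
            logger.warning(
                "skipped embedding for entity %s (%d pending)",
                entity_id, len(self.failed_ids),
            )
            return False
        try:
            self.embed(entity_id)
        except Exception as exc:
            self.failed_ids.add(entity_id)
            logger.warning("embedding failed for entity %s: %s", entity_id, exc)
            return False
        self.failed_ids.discard(entity_id)
        return True

    def retry_failed(self) -> int:
        """Re-embed skipped entities once the backend is back; return how many succeeded."""
        if not self.failed_ids or not self._ensure_backend():
            return 0
        recovered = 0
        for entity_id in sorted(self.failed_ids):
            try:
                self.embed(entity_id)
            except Exception as exc:
                logger.warning("retry failed for entity %s: %s", entity_id, exc)
                continue
            self.failed_ids.discard(entity_id)
            recovered += 1
        return recovered
```

In this sketch the watcher would call sync_entity for each synced file and retry_failed periodically, and len(guard.failed_ids) is the number a status command could report. Suggestion 4 would be a small variation: raise instead of returning False when the backend cannot be loaded.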

Environment

  • BM v0.18.3, sqlite-vec 0.1.6 (installed but intermittently fails to load)
  • macOS, Intel Mac
