BUG: Hybrid RRF fusion dilutes vector scores with weak FTS scores #577

@bm-clawd

Problem

Hybrid search (RRF fusion of FTS and vector results) produces worse results than vector-only search. The fusion step appears to dilute strong vector scores with weak FTS scores, leaving a combined relevance score that is too low to be useful.

Evidence

Query: arscontexta skill graphs tweet viral

  • Vector-only: 65.4% relevance for correct match ✅
  • FTS-only: 1.6% ❌
  • Hybrid (RRF): 3.2% — worse than vector alone ❌

The correct note (containing an entire section about arscontexta's skill graphs tweet) scores 65% on semantic similarity but gets crushed to 3% by the RRF fusion with FTS.
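
For reference, standard reciprocal rank fusion combines ranks rather than raw scores, so its output sits on a much smaller scale than a cosine similarity. The sketch below is plain Python and assumes the commonly used constant k = 60. Whether BM's hybrid search uses this exact formula or constant is an assumption, and `rrf_scores` and the note IDs are hypothetical names used only for illustration.

```python
def rrf_scores(rankings, k=60):
    """Standard reciprocal rank fusion: sum 1 / (k + rank) over every ranked list.

    `rankings` is a list of ranked ID lists, best match first.
    k=60 is the commonly used default, assumed here, not taken from BM.
    """
    fused = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


# Hypothetical rankings: "note-a" is the top hit in both result lists.
vector_ranking = ["note-a", "note-b", "note-c"]
fts_ranking = ["note-a", "note-c"]

fused = rrf_scores([vector_ranking, fts_ranking])
top_in_both = fused["note-a"]   # 1/61 + 1/61
top_in_one = 1.0 / (60 + 1)     # best possible score from a single list
print(f"{top_in_both:.4f} {top_in_one:.4f}")
```

Under these assumptions, a note ranked first in both lists fuses to 2/61 (about 3.3%), and a note ranked first in a single list gets 1/61 (about 1.6%). Those values are on the same scale as the FTS-only and hybrid figures reported above, which would suggest the low hybrid percentage reflects the RRF score range rather than a weak match. This has not been checked against BM's code.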

Expected Behavior

Hybrid should produce a score >= max(vector, FTS), not one far below the strongest component. When vector returns a strong match and FTS doesn't, the hybrid score should still reflect the vector confidence.

Impact

High. The OpenClaw plugin uses default search (not --vector), so users get the diluted hybrid scores. This makes memory_search unreliable for semantic recall.

Possible Fixes

  1. Weighted RRF: give vector results higher weight than FTS
  2. Max-score fusion instead of averaging
  3. Fall through: if vector score > threshold, use it directly
  4. Let callers specify search mode preference
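
The first three options can be sketched in a few lines of Python. This is a rough illustration, not BM's implementation: the function names, the default weights (0.7 / 0.3), the 0.5 threshold, and the assumption that vector and FTS scores are both normalized to the 0..1 range are all hypothetical.

```python
def weighted_rrf(vector_ranking, fts_ranking, w_vector=0.7, w_fts=0.3, k=60):
    """Option 1: RRF where each list's contribution is scaled by a weight."""
    fused = {}
    for weight, ranking in ((w_vector, vector_ranking), (w_fts, fts_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (k + rank)
    return fused


def max_score_fusion(vector_scores, fts_scores):
    """Option 2: keep the stronger of the two scores for each document.

    Assumes both score dicts are already on a comparable 0..1 scale.
    """
    doc_ids = set(vector_scores) | set(fts_scores)
    return {
        doc_id: max(vector_scores.get(doc_id, 0.0), fts_scores.get(doc_id, 0.0))
        for doc_id in doc_ids
    }


def fall_through(vector_scores, fused_scores, threshold=0.5):
    """Option 3: report the vector score directly when it exceeds a threshold."""
    result = dict(fused_scores)
    for doc_id, score in vector_scores.items():
        if score > threshold:
            result[doc_id] = score
    return result


# Figures from the Evidence section, expressed as fractions.
vector_scores = {"note": 0.654}
fts_scores = {"note": 0.016}
hybrid_scores = {"note": 0.032}

print(max_score_fusion(vector_scores, fts_scores))   # {'note': 0.654}
print(fall_through(vector_scores, hybrid_scores))    # {'note': 0.654}
```

Note that weighted RRF changes the ordering in favor of vector hits but still yields small absolute values, so on its own it would not restore a 65% score. Max-score fusion and the fall-through both preserve the vector confidence, at the cost of requiring the two score scales to be comparable.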

Environment

  • BM v0.18.3, SQLite backend, fastembed bge-small-en-v1.5
  • 66 entities in claw project, all with embeddings

Labels: bug
