Description
Problem
Hybrid search (RRF fusion of FTS + vector) produces worse results than vector-only search. The fusion step appears to blend the strong vector score with the weak FTS score, and the combined relevance ends up too low to be useful.
Evidence
Query: arscontexta skill graphs tweet viral
- Vector-only: 65.4% relevance for correct match ✅
- FTS-only: 1.6% ❌
- Hybrid (RRF): 3.2% — worse than vector alone ❌
The correct note (containing an entire section about arscontexta's skill graphs tweet) scores 65% on semantic similarity but gets crushed to 3% by the RRF fusion with FTS.
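For reference, here is a minimal sketch of textbook reciprocal rank fusion. It is not BM's actual code, and `rrf_scores`, the note IDs, and `k=60` are assumptions for illustration (the issue does not state which constant is used). With `k=60`, a hit ranked first in one list scores 1/61, about 1.6%, and a hit ranked first in both lists scores 2/61, about 3.3%. Those magnitudes are close to the FTS-only and hybrid figures above. If the implementation works this way, the hybrid number may be a raw RRF score on a different scale from cosine similarity, rather than a diluted similarity.

```python
def rrf_scores(rankings, k=60):
    """Textbook RRF: score(d) = sum over result lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        # Ranks are 1-based: the top hit in each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


# Hypothetical result lists; "note_a" stands in for the correct note.
vector_ranking = ["note_a", "note_b", "note_c"]
fts_ranking = ["note_a", "note_c"]

fused = rrf_scores([vector_ranking, fts_ranking])
single = rrf_scores([fts_ranking])

print(f"top hit, one list:  {single['note_a']:.4f}")  # 1/61
print(f"top hit, two lists: {fused['note_a']:.4f}")   # 2/61
```

Under this formula no document can score above `len(rankings) / (k + 1)`, however strong its vector similarity is.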
Expected Behavior
Hybrid should produce scores >= max(vector, FTS), not a score far below the strongest component. When vector returns a strong match and FTS doesn't, the hybrid score should still reflect the vector confidence.
Impact
High. The OpenClaw plugin uses default search (not --vector), so users get the diluted hybrid scores. This makes memory_search unreliable for semantic recall.
Possible Fixes
- Weighted RRF: give vector results higher weight than FTS
- Max-score fusion instead of averaging
- Fall through: if vector score > threshold, use it directly
- Let callers specify search mode preference
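The first three options could look roughly like the sketch below. All function names, weights, and the threshold are hypothetical and are not part of BM's API. The max-score and fall-through variants also assume that vector and FTS scores have already been normalized to a comparable 0 to 1 range.

```python
def weighted_rrf(vector_ranking, fts_ranking, w_vector=0.7, w_fts=0.3, k=60):
    """Weighted RRF: each list's 1 / (k + rank) term is scaled by a weight."""
    scores = {}
    for weight, ranking in ((w_vector, vector_ranking), (w_fts, fts_ranking)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return scores


def max_score_fusion(vector_scores, fts_scores):
    """Keep the best per-document score, so hybrid >= max(vector, FTS)."""
    doc_ids = set(vector_scores) | set(fts_scores)
    return {
        d: max(vector_scores.get(d, 0.0), fts_scores.get(d, 0.0))
        for d in doc_ids
    }


def fall_through(vector_scores, fused_scores, threshold=0.5):
    """Use the vector score directly whenever it exceeds the threshold."""
    result = dict(fused_scores)
    for doc_id, score in vector_scores.items():
        if score > threshold:
            result[doc_id] = score
    return result


# Illustrative values taken from the Evidence section (as fractions).
vector_scores = {"note_a": 0.654}
fts_scores = {"note_a": 0.016}
fused_scores = {"note_a": 0.032}

print(max_score_fusion(vector_scores, fts_scores))
print(fall_through(vector_scores, fused_scores))
```

Weighted RRF changes which list dominates the ordering, but its scores stay on the small `1 / (k + rank)` scale. Only the max-score and fall-through variants would bring the reported relevance back up to the vector confidence.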
Environment
- BM v0.18.3, SQLite backend, fastembed bge-small-en-v1.5
- 66 entities in claw project, all with embeddings