Commit 8747a3a

Docs/fix distance threshold documentation issue 407 (#451)
Fix distance threshold documentation inconsistencies (Part of issue [#407](#407))

Summary

Corrects documentation to accurately reflect that Redis COSINE distance thresholds use the range [0-2], not [0-1]. The code validation was already correct, but docstrings and examples were misleading.

Changes

- SemanticCache: Fixed `__init__` and `set_threshold` docstrings to specify the [0-2] range
- Message History: Corrected the notebook to state that the maximum threshold is 2.0, not 1.0
- Examples: Added inline comments clarifying the Redis COSINE distance scale in README and notebooks
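
For context on why the range is [0-2]: Redis COSINE distance is one minus the cosine similarity, and cosine similarity spans [-1, 1]. The following is a minimal pure-Python sketch of that relationship, not code from this commit; the helper name `cosine_distance` is hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Same direction: similarity 1, distance 0
identical = cosine_distance([1.0, 0.0], [1.0, 0.0])
# Perpendicular: similarity 0, distance 1
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
# Opposite direction: similarity -1, distance 2
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])

print(identical, orthogonal, opposite)
```

A threshold of 1.0 therefore only reaches vectors that are at most orthogonal to the query, which is why the old "[0-1]" wording understated the valid range.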
1 parent caa9621 commit 8747a3a

5 files changed: +14 -11 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -276,7 +276,7 @@ llmcache = SemanticCache(
     name="llmcache",
     ttl=360,
     redis_url="redis://localhost:6379",
-    distance_threshold=0.1
+    distance_threshold=0.1 # Redis COSINE distance [0-2], lower is stricter
 )
 
 # store user queries and LLM responses in the semantic cache

docs/user_guide/03_llmcache.ipynb

Lines changed: 6 additions & 4 deletions
@@ -103,7 +103,7 @@
 "llmcache = SemanticCache(\n",
 " name=\"llmcache\", # underlying search index name\n",
 " redis_url=\"redis://localhost:6379\", # redis connection url string\n",
-" distance_threshold=0.1, # semantic cache distance threshold\n",
+" distance_threshold=0.1, # semantic cache distance threshold (Redis COSINE [0-2], lower is stricter)\n",
 " vectorizer=HFTextVectorizer(\"redis/langcache-embed-v1\"), # embedding model\n",
 ")"
 ]
@@ -312,7 +312,9 @@
 "## Customize the Distance Threshold\n",
 "\n",
 "For most use cases, the right semantic similarity threshold is not a fixed quantity. Depending on the choice of embedding model,\n",
-"the properties of the input query, and even business use case -- the threshold might need to change. \n",
+"the properties of the input query, and even business use case -- the threshold might need to change.\n",
+"\n",
+"The distance threshold uses Redis COSINE distance units [0-2], where 0 means identical and 2 means completely different.\n",
 "\n",
 "Fortunately, you can seamlessly adjust the threshold at any point like below:"
 ]
@@ -323,7 +325,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Widen the semantic distance threshold\n",
+"# Widen the semantic distance threshold (allow less similar matches)\n",
 "llmcache.set_threshold(0.5)"
 ]
 },
@@ -930,4 +932,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 2
-}
+}

docs/user_guide/07_message_history.ipynb

Lines changed: 1 addition & 1 deletion
@@ -276,7 +276,7 @@
 "source": [
 "You can adjust the degree of semantic similarity needed to be included in your context.\n",
 "\n",
-"Setting a distance threshold close to 0.0 will require an exact semantic match, while a distance threshold of 1.0 will include everything."
+"Setting a distance threshold close to 0.0 will require an exact semantic match, while a distance threshold of 2.0 will include everything (Redis COSINE distance range is [0-2])."
 ]
 },
 {

docs/user_guide/08_semantic_router.ipynb

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 "route. The incoming query from a user needs to be semantically similar to one or\n",
 "more of the references in order to \"match\" on the route.\n",
 "\n",
-"Additionally, each route has a `distance_threshold` which determines the maximum distance between the query and the reference for the query to be routed to the route. This value is unique to each route."
+"Additionally, each route has a `distance_threshold` which determines the maximum distance between the query and the reference for the query to be routed to the route. This value is unique to each route and uses Redis COSINE distance units (0-2], where lower values require stricter matching."
 ]
 },
 {

redisvl/extensions/cache/llm/semantic.py

Lines changed: 5 additions & 4 deletions
@@ -61,8 +61,9 @@ def __init__(
         Args:
             name (str, optional): The name of the semantic cache search index.
                 Defaults to "llmcache".
-            distance_threshold (float, optional): Semantic threshold for the
-                cache. Defaults to 0.1.
+            distance_threshold (float, optional): Semantic distance threshold for the
+                cache in Redis COSINE units [0-2], where lower values indicate stricter
+                matching. Defaults to 0.1.
             ttl (Optional[int], optional): The time-to-live for records cached
                 in Redis. Defaults to None.
             vectorizer (Optional[BaseVectorizer], optional): The vectorizer for the cache.
@@ -80,7 +81,7 @@ def __init__(
         Raises:
             TypeError: If an invalid vectorizer is provided.
             TypeError: If the TTL value is not an int.
-            ValueError: If the threshold is not between 0 and 1.
+            ValueError: If the threshold is not between 0 and 2 (Redis COSINE distance).
             ValueError: If existing schema does not match new schema and overwrite is False.
         """
         # Call parent class with all shared parameters
@@ -243,7 +244,7 @@ def set_threshold(self, distance_threshold: float) -> None:
                 the cache.
 
         Raises:
-            ValueError: If the threshold is not between 0 and 1.
+            ValueError: If the threshold is not between 0 and 2 (Redis COSINE distance).
         """
         if not 0 <= float(distance_threshold) <= 2:
             raise ValueError(
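
The range check that closes the hunk above can be tried in isolation. The sketch below is a hypothetical standalone helper, not redisvl code: the name `validate_distance_threshold` and the error message text are assumptions (the diff truncates the real message), and only the `0 <= float(distance_threshold) <= 2` condition is taken from the source.

```python
def validate_distance_threshold(distance_threshold):
    """Return the threshold as a float, or raise if it is outside [0, 2]."""
    value = float(distance_threshold)
    # Same bounds as the check shown in the diff: both ends inclusive
    if not 0 <= value <= 2:
        # Illustrative message only; the original text is not visible in the diff
        raise ValueError(f"Distance threshold must be between 0 and 2, got {value}")
    return value


accepted = [validate_distance_threshold(t) for t in (0, 0.1, 1.5, 2)]

rejected = []
for bad in (-0.1, 2.1):
    try:
        validate_distance_threshold(bad)
    except ValueError:
        rejected.append(bad)
```

Values above 1.0 pass this check, which matches the commit's point that the code already accepted the full [0-2] range while the docstrings said otherwise.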
