import { createClient } from '@supabase/supabase-js';
import { OpenAIEmbeddings } from '@langchain/openai';
import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase';
Overall this is a solid start: it sets up Supabase with LangChain’s SupabaseVectorStore cleanly, and I like that you’ve wrapped the logic into getVectorStore and getEmbeddingsCollection. That said, a few things are worth improving or clarifying:
Embedding Model Choice
Right now you’re using text-embedding-3-small. That’s a reasonable choice for cost and latency, but retrieval quality may fall short on harder queries or in production workloads.
For better quality, consider text-embedding-3-large. It produces 3072-dimensional vectors by default (versus 1536 for the small model) and generally retrieves more accurately in RAG-style setups. Keep in mind that the vector column in your table has to match the model’s dimension, so switching models means re-embedding existing rows. If you make the model configurable via an env variable, you can switch without code changes.
new OpenAIEmbeddings({
  modelName: process.env.EMBEDDING_MODEL || 'text-embedding-3-large',
})
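One thing to watch once the model is configurable: the table’s vector column has to match the model’s output size. Here is a minimal sketch of resolving the model from env and checking the dimension up front. resolveEmbeddingConfig, assertColumnMatches, and the lookup table are hypothetical helpers, not part of this PR; the dimensions are the documented defaults for these two models.

```typescript
// Hypothetical helpers for illustration; none of these names exist in the PR.
type EmbeddingConfig = { model: string; dimensions: number };

// Default output sizes of the two OpenAI models discussed in this review.
const KNOWN_DIMENSIONS: Record<string, number> = {
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
};

// Picks the model from an env-like object, falling back to the large model.
function resolveEmbeddingConfig(
  env: Record<string, string | undefined>,
): EmbeddingConfig {
  const model = env.EMBEDDING_MODEL || 'text-embedding-3-large';
  const dimensions = KNOWN_DIMENSIONS[model];
  if (dimensions === undefined) {
    throw new Error(`Unknown embedding model "${model}"; add its dimension first.`);
  }
  return { model, dimensions };
}

// Fails fast when the configured model cannot fit the existing vector column.
function assertColumnMatches(config: EmbeddingConfig, columnDimensions: number): void {
  if (config.dimensions !== columnDimensions) {
    throw new Error(
      `${config.model} produces ${config.dimensions} dimensions, ` +
        `but the column stores ${columnDimensions}.`,
    );
  }
}
```

On the server you could call resolveEmbeddingConfig(process.env) and pass the resulting model name into the OpenAIEmbeddings constructor.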
Use of Public (Anon) Key
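Right now you’re connecting with NEXT_PUBLIC_SUPABASE_ANON_KEY. The anon key is designed to be public, so using it in the browser is fine as long as Row Level Security policies limit what it can do. It’s a poor fit for server-side embedding or insert/update operations, though, because those writes would only succeed if your RLS policies allow anonymous callers to write.
If this code is running in an API route or server function, use the service role key there instead. Note that the service role key bypasses Row Level Security entirely, so it must live in a server-only env variable (no NEXT_PUBLIC_ prefix) and never reach the frontend bundle.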
Suggestion: separate client setup between server (service role key) and client (anon key).
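A rough sketch of that split, as a pure function so it is easy to test. resolveSupabaseCredentials and the SUPABASE_SERVICE_ROLE_KEY variable name are assumptions for illustration; only the two NEXT_PUBLIC_ names are inferred from this PR.

```typescript
// Hypothetical helper; the service-role variable name is an assumption.
type Runtime = 'server' | 'browser';
type SupabaseCredentials = { url: string; key: string };

function resolveSupabaseCredentials(
  runtime: Runtime,
  env: Record<string, string | undefined>,
): SupabaseCredentials {
  const url = env.NEXT_PUBLIC_SUPABASE_URL;
  // Server code gets the privileged key; browser code only ever sees the anon key.
  const key =
    runtime === 'server'
      ? env.SUPABASE_SERVICE_ROLE_KEY
      : env.NEXT_PUBLIC_SUPABASE_ANON_KEY;
  if (!url || !key) {
    throw new Error(`Missing Supabase env vars for the ${runtime} runtime.`);
  }
  return { url, key };
}
```

The returned url and key would then go into createClient. On the server you can pass process.env directly; in browser code Next.js only inlines literal process.env.NEXT_PUBLIC_* references, so build the env object explicitly there.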
LangChain Support / Capabilities
LangChain’s SupabaseVectorStore integration supports more than storing vectors: it also does similarity search, and you can narrow results by passing metadata filters.
You’ve set filter: {} as default, but you might want to allow dynamic filtering (e.g., by userId, category, or document type).
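For example, a small builder could turn optional request parameters into a metadata filter object. This is only a sketch: DocumentFilter and buildMetadataFilter are hypothetical names, and it assumes your documents are stored with userId, category, and documentType keys in their metadata.

```typescript
// Hypothetical types; the metadata keys are assumptions about how documents are stored.
type DocumentFilter = {
  userId?: string;
  category?: string;
  documentType?: string;
};

// Drops unset fields so an empty input yields {} (the current default behaviour).
function buildMetadataFilter(filter: DocumentFilter): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(filter)) {
    if (value !== undefined && value !== '') {
      result[key] = value;
    }
  }
  return result;
}
```

The result could be passed as the filter argument of similaritySearch(query, k, filter) instead of fixing filter: {} at construction time.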
It’s also worth noting that LangChain supports other vector stores (Pinecone, Weaviate, Chroma, etc.). If you anticipate switching later, maybe abstract this in your own wrapper so you’re not tightly coupled to Supabase.
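To make the wrapper idea concrete, here is one possible shape. All names (RetrievedChunk, DocumentSearch, InMemorySearch) are hypothetical and come from neither this PR nor LangChain; the in-memory class is just a stand-in that filters by substring, which is handy in tests.

```typescript
// Hypothetical interface; callers depend on this instead of on SupabaseVectorStore.
interface RetrievedChunk {
  content: string;
  metadata: Record<string, unknown>;
}

interface DocumentSearch {
  search(query: string, k: number): Promise<RetrievedChunk[]>;
}

// Stand-in implementation: case-insensitive substring match, no embeddings involved.
class InMemorySearch implements DocumentSearch {
  private readonly chunks: RetrievedChunk[];

  constructor(chunks: RetrievedChunk[]) {
    this.chunks = chunks;
  }

  async search(query: string, k: number): Promise<RetrievedChunk[]> {
    const needle = query.toLowerCase();
    return this.chunks
      .filter((chunk) => chunk.content.toLowerCase().includes(needle))
      .slice(0, k);
  }
}
```

A Supabase-backed implementation would hold the SupabaseVectorStore and delegate search to its similaritySearch, mapping each returned document’s pageContent to content. Swapping to another store later would then only touch that one class.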
Environment Variables
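You’re throwing an error if env vars aren’t set, which is great. But consider aligning the names with what the Supabase docs recommend (SUPABASE_URL, SUPABASE_ANON_KEY) and with where each value is used: in Next.js, anything prefixed with NEXT_PUBLIC_ is inlined into the browser bundle, so server-only values should use unprefixed names. Defining separate server-side and client-side keys would also make it clearer which one each code path uses.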
Database Table
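You’ve hardcoded tableName: 'documents'. Is this table already set up with the pgvector extension enabled and a vector column whose dimension matches the embedding model? SupabaseVectorStore also calls a Postgres function for similarity search (match_documents by default, configurable via queryName), so that has to exist too. It would be worth documenting the setup in a README so other devs don’t get stuck.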
Code Style / DX
Both getVectorStore and getEmbeddingsCollection return promises, but the second one doesn’t do any asynchronous work: client.from('documents') is synchronous, and no request is sent until you actually run a query on it. You probably don’t need async on getEmbeddingsCollection.
You could also export a singleton instance of SupabaseVectorStore instead of creating a new one every time, depending on usage patterns.
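If you go the singleton route, one lightweight option is to memoize the factory rather than export an eagerly created instance. A minimal sketch, where memoizeAsync is a hypothetical helper and not something LangChain or Supabase provides:

```typescript
// Hypothetical helper: caches the first promise so the factory runs at most once.
function memoizeAsync<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = factory().catch((error) => {
        // Clear the cache on failure so a later call can retry.
        cached = undefined;
        throw error;
      });
    }
    return cached;
  };
}
```

Wrapping the existing function, e.g. memoizeAsync(getVectorStore), would give every caller the same store while still deferring creation until first use. Whether that is appropriate depends on usage patterns, such as whether different callers need different filters.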