The RAG Vector Query Tool
The low-level building block: pass an embedding vector and get the top-k nearest matches from a vector store, with optional metadata filtering — the retrieval primitive custom agents and pipelines build on, on infrastructure you control.
Custom RAG needs a retrieval primitive you control
High-level search tools are great until you need to build something bespoke — your own embedding model, your own chunking, your own metadata schema. For that you need direct, predictable access to nearest-neighbor search.
Black-box search is limiting
Pre-packaged search hides the knobs custom pipelines need.
Custom embeddings need raw access
If you bring your own vectors, you need to query them directly.
Metadata matters
Real pipelines filter by repo, tenant, or document type, not just similarity.
Determinism is required
Builders need predictable, inspectable retrieval to debug their stack.
Direct nearest-neighbor search
Primitive
Vector in, neighbors out
The core of any RAG stack.
Pass a query embedding and the path to a vector store; the tool returns the top-k nearest matches by cosine similarity. It’s the deterministic retrieval primitive everything else can be built on.
- Cosine-similarity ranking
- Caller-supplied embeddings
- Explicit top_k
- Points at any store by path
Top-k nearest
Filtering
Metadata-aware retrieval
Similarity plus exact-match filters.
Optional filters narrow results to exact metadata matches — a repo, a tenant, a document type — so retrieval respects the structure of your data, not just its semantics.
Exact-match narrowing
Governance
Your store, your control
On-prem by construction.
The tool queries a vector store you own at a path you specify, so the whole retrieval layer runs inside your perimeter with no external dependency.
Self-hosted store
Parameters
The rag_vector_query tool accepts these inputs when an agent calls it. Required inputs are flagged.
Where the primitive pays back
Custom RAG pipelines
Build bespoke retrieval with your own embeddings and chunking.
Bring-your-own-vectors
Query a store you populated with a model of your choice.
Scoped retrieval
Filter by repo, tenant, or type alongside similarity.
Evaluation harnesses
Use deterministic retrieval to benchmark a RAG stack.
Agent tooling
Give a custom agent direct access to nearest-neighbor search.
Debugging retrieval
Inspect exactly what a query returns, step by step.
Assigned to agents, orchestrated as networks
On VDF AI, an industry’s use cases map to agents, and you assign tools like this one to those agents. Compose multiple agents into a governed, on-premise network.
What changes after you assign it
Questions about the RAG Vector Query tool
What is the RAG vector query tool?
It is the low-level retrieval primitive: you supply a query embedding, a path to a vector store, and a top_k, and it returns the nearest matches by cosine similarity. Higher-level search tools are built on this kind of operation.
When should I use this instead of federated search?
Use federated or per-source search for ready-made retrieval. Use this when you’re building custom RAG — your own embeddings, chunking, or metadata schema — and need direct, deterministic access to nearest-neighbor search.
Can I filter results?
Yes. The optional filters object narrows results to exact metadata matches, such as a specific repo or tenant, in addition to similarity ranking.
Where does the vector store live?
You point the tool at a store you own via db_path, so the entire retrieval layer runs on your own infrastructure with no external dependency.
Who typically uses it?
Builders and custom agents that need fine-grained control over retrieval, often as the foundation beneath the higher-level semantic search tools.
Assign RAG Vector Query to these agents
These VDF AI agents can be assigned this tool. Open an agent to see the full toolkit it can run.
Tools that work well alongside this one
Where this tool delivers value
Build custom RAG on a primitive you control
See the RAG vector query tool power a bespoke retrieval pipeline — on infrastructure you control.