Best for: security-sensitive teams, minimal-footprint installs, and anyone who wants fewer moving parts.
Goal: run NeuralMind’s full vector retrieval (embedding and search) with zero ChromaDB.
Requires v0.21.0+.
NeuralMind’s default vector store is ChromaDB. It works well, but it pulls in a
large transitive dependency tree (a web server, a Kubernetes client,
OpenTelemetry, gRPC, …) and has carried the recurring CVE-2026-45829
advisory. The opt-in turbovec backend replaces it entirely:
| Concern | Default (chroma) | turbovec (ChromaDB-free) |
|---|---|---|
| ANN search | ChromaDB HNSW | Google TurboQuant compressed index |
| Embeddings | ChromaDB’s MiniLM | bundled OnnxMiniLMEmbedder (byte-identical vectors) |
| Metadata store | ChromaDB | local SQLite (stdlib) |
| Vector size on disk/in RAM | 1× | ~8–16× smaller |
| Advisory surface | CVE-2026-45829 | gone on this path |
| Retrieval quality | baseline | at/above parity (fact recall 0.744 → 0.800) |
Embeddings are byte-identical to the default backend's (same all-MiniLM-L6-v2
model; verified cosine 1.0), and retrieval quality holds at or above parity, so
you give up nothing in answer quality. You shed dependencies and shrink the index.
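The two checks mentioned above (byte identity and cosine similarity) are not the same test, and a small sketch makes the difference concrete. The vectors below are hypothetical 4-dimensional stand-ins, not real model output, and the float32 layout is an assumption about how embeddings are stored. This is not NeuralMind's own verification code.

```python
import math
import struct


def float32_bytes(vector):
    """Pack a vector as little-endian float32 (assumed storage layout)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compare_embeddings(reference, candidate):
    """Report both the strict (bytes) and the loose (cosine) comparison."""
    return {
        "byte_identical": float32_bytes(reference) == float32_bytes(candidate),
        "cosine": cosine(reference, candidate),
    }


# Hypothetical stand-ins; real all-MiniLM-L6-v2 vectors have 384 dimensions.
reference = [0.12, -0.48, 0.33, 0.79]
same = list(reference)
scaled = [2 * x for x in reference]  # same direction, different bytes

print(compare_embeddings(reference, same))
print(compare_embeddings(reference, scaled))
```

A scaled copy still has cosine similarity of about 1.0 but fails the byte comparison, which is why byte identity is the stronger of the two guarantees.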
pip install "neuralmind[turbovec]"
# pulls: turbovec, onnxruntime, tokenizers, numpy — no ChromaDB needed
Drop a neuralmind-backend.yaml at your repo root:
backend: turbovec
That’s it — neuralmind build and every query now use the ChromaDB-free path.
Other accepted values: graph/chroma (default), in_memory (offline tests).
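If you provision repositories from a script, the one-line config can be written and read back with the standard library alone. The sketch below assumes the file contains a single top-level `backend:` key, as shown above. The helper names `write_backend_config` and `read_backend` are hypothetical and are not part of NeuralMind, whose own loader may parse the file differently.

```python
import tempfile
from pathlib import Path

CONFIG_NAME = "neuralmind-backend.yaml"  # file name taken from the text above


def write_backend_config(repo_root, backend="turbovec"):
    """Write a one-line backend config at the repo root and return its path."""
    path = Path(repo_root) / CONFIG_NAME
    path.write_text(f"backend: {backend}\n", encoding="utf-8")
    return path


def read_backend(repo_root):
    """Return the value of the top-level `backend` key, or None if absent."""
    path = Path(repo_root) / CONFIG_NAME
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "backend":
            return value.strip()
    return None


# Demonstrate against a throwaway directory rather than a real repository.
with tempfile.TemporaryDirectory() as repo:
    print(read_backend(repo))      # no config file yet
    write_backend_config(repo)
    print(read_backend(repo))      # value read back from the file
```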
neuralmind build .
neuralmind query "where is JWT verification handled?"
The index lands in <project>/.neuralmind/ (a TurboVec index.tvim plus a SQLite
store.sqlite) and is roughly 8–16× smaller than the equivalent ChromaDB collection.
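To see the footprint on your own project, you can total the files under `.neuralmind/` after a build. This is a minimal sketch using only the standard library. The function name `index_footprint` is hypothetical, and the demo writes placeholder files with arbitrary sizes rather than a real index.

```python
import tempfile
from pathlib import Path


def index_footprint(project_root):
    """Return {relative file name: size in bytes} for files under .neuralmind/."""
    index_dir = Path(project_root) / ".neuralmind"
    if not index_dir.is_dir():
        return {}
    return {
        str(p.relative_to(index_dir)): p.stat().st_size
        for p in sorted(index_dir.rglob("*"))
        if p.is_file()
    }


# Demo with placeholder files; the sizes are arbitrary, not measurements.
with tempfile.TemporaryDirectory() as project:
    index_dir = Path(project) / ".neuralmind"
    index_dir.mkdir()
    (index_dir / "index.tvim").write_bytes(b"\0" * 1024)
    (index_dir / "store.sqlite").write_bytes(b"\0" * 2048)

    sizes = index_footprint(project)
    for name, size in sizes.items():
        print(f"{name}: {size} bytes")
    print(f"total: {sum(sizes.values())} bytes")
```

Running the same function against a project built with the default backend would need a different directory, since the location of the ChromaDB collection is not stated here.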
The embedder needs the all-MiniLM-L6-v2 ONNX model (~90 MB), which only has to be
fetched once. Pre-stage it and point an env var at it so nothing is downloaded at runtime:
export NEURALMIND_ONNX_MODEL_DIR=/opt/models/all-MiniLM-L6-v2/onnx
It also transparently reuses an existing ~/.cache/chroma/... model if you have
one. Full offline walkthrough: air-gapped.
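On an air-gapped host it can help to confirm the model directory is staged before the first build. The sketch below is a hypothetical preflight check, not part of NeuralMind. It assumes the staged directory contains at least one `.onnx` file, and the exact file name the embedder expects is not specified here.

```python
import os
import tempfile
from pathlib import Path

ENV_VAR = "NEURALMIND_ONNX_MODEL_DIR"  # variable name taken from the text above


def check_model_dir(environ=os.environ):
    """Return (ok, message) describing whether the model directory looks staged."""
    value = environ.get(ENV_VAR)
    if not value:
        return False, f"{ENV_VAR} is not set"
    model_dir = Path(value)
    if not model_dir.is_dir():
        return False, f"{model_dir} is not a directory"
    # Assumption: a staged directory holds at least one .onnx file.
    if not any(model_dir.glob("*.onnx")):
        return False, f"no .onnx file found in {model_dir}"
    return True, f"model directory looks staged: {model_dir}"


# Demo against a throwaway directory with a placeholder model file.
with tempfile.TemporaryDirectory() as staged:
    print(check_model_dir({}))                      # variable missing
    print(check_model_dir({ENV_VAR: staged}))       # directory is empty
    (Path(staged) / "model.onnx").write_bytes(b"")  # placeholder, not a real model
    print(check_model_dir({ENV_VAR: staged}))       # passes the check
```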
This isn’t a claim — it’s gated. The backend parity job
(python -m evals.parity.run) and the faithfulness eval run on every PR; the
embedder’s byte-for-byte equivalence to ChromaDB is checked directly. See
Benchmarks for the numbers and how to reproduce them.
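To run the same gate locally before opening a PR, a thin wrapper around the parity command is enough. This sketch assumes the job signals failure through a non-zero exit code, which is conventional but not confirmed here. The `run_gate` helper is hypothetical, and running the parity command itself requires a NeuralMind checkout.

```python
import subprocess
import sys


def run_gate(command):
    """Run a gate command and return its exit code (0 is treated as a pass)."""
    completed = subprocess.run(command, check=False)
    return completed.returncode


# The parity command from the text above, run with the current interpreter.
PARITY_COMMAND = [sys.executable, "-m", "evals.parity.run"]

# Usage from the root of a NeuralMind checkout (left commented out so the
# snippet stays harmless elsewhere):
#
#     exit_code = run_gate(PARITY_COMMAND)
#     print("parity gate passed" if exit_code == 0 else "parity gate failed")
```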
ChromaDB stays the default for now; turbovec is the opt-in path while it
bakes. Flipping the default (with a one-time reindex migration) and dropping
ChromaDB from the core dependencies are the staged next steps — see issue #204.