Open-source semantic code intelligence for AI agents. Built for cost efficiency, security, and compliance.
Every figure below is produced by code in the repo and gated in CI, so it can't silently regress. Where a number is a real-repo extrapolation or an estimate, it says so. Reproduce them all with python -m tests.benchmark.run and the evals/ harnesses — details on the Benchmarks page.
import neuralmind no longer requires ChromaDB, and the default backend is now auto: it prefers the ChromaDB-free turbovec path when its deps are installed, else falls back to chroma. Plain installs are unchanged; neuralmind doctor shows the resolved backend.
What we don't claim: the CI numbers come from a tiny fixture (they prove the mechanism and catch regressions, not a real-repo ceiling); TurboQuant is an approximate index whose compression win matters at scale; 40–70× is a range, not a fixed guarantee. Point it at your own repo to see your ratio.
Reach for NeuralMind if you drive an AI coding agent (Claude Code, Cursor, Cline, Continue, or your own MCP/OpenRouter stack) over a non-trivial codebase, you want one shared memory across those agents, you're watching LLM spend on repo-level questions, or you work in a regulated/air-gapped setting that rules out SaaS code search.
You probably don't need it if your repo is under ~5K tokens (just paste it in), you don't use an agent, or you only want inline completions (use Copilot/Cursor directly).
Set expectations honestly: the 40–70× token reduction is a real-repo range. CI measures a conservative 6.2× on a tiny fixture, so prove it on your own code. On a plain install the resolved backend is still ChromaDB: the default is auto, which picks the ChromaDB-free turbovec path only when its optional deps are installed. It's a fast-moving, single-maintainer beta with a lot of surface area, so pin a version for CI. The compressed backend is approximate, with parity gated on the reference fixture rather than measured at large-repo scale. If those trade-offs are fine, the upside is real and the receipts are in the repo.
A staged plan — the first increment (v0.13.0) has shipped. After the v0.10→v0.12 ergonomics + diagnostics work, the current arc is about proving and future-proofing the core. The spine is simple: measure, then change, then measure again.
neuralmind eval now reports a faithfulness delta (expected-fact recall vs a matched-budget naive baseline), grounding, and contradiction checks over the gold dataset + polyglot TypeScript + Go fixtures. The onboarding-lift eval is the remaining increment; the LLM-as-judge stays strictly opt-in.
Committable, opt-in team/shared memory is approved in principle: a reviewed team baseline that travels in your git repo (no SaaS, no exfiltration), overlaid by each developer's private personal layer. Its day-one onboarding lift will be measured by the v0.13 harness before the design is locked. We sign off on data, not assertion.
NeuralMind now has a single learning signal. The old learned_patterns cooccurrence reranker is removed, and neuralmind learn becomes an exit-0 deprecation no-op. The Hebbian synapse layer — which already learns continuously from your queries, edits, and tool calls, and lets unused edges decay — is now the only thing that adapts retrieval to how you actually use the codebase. One system instead of two means automatic learning instead of a manual step, and decay instead of staleness. This is a removal, not a regression: warm-path recall is synapse-driven exactly as it was in v0.24.
A 2×2 on the benchmark fixture measured top-k hit rate with the reranker on vs. off, crossed with synapses on vs. off. The reranker moved the number by 0.0 points in both rows (71.7% → 71.7% cold, 83.3% → 83.3% warm), while the synapse layer alone moved it by +11.6 points. The reranker was also runtime-inert on the warm path — the synapse boost re-sort discarded its ordering anyway — required the manual neuralmind learn step to populate, and went stale between runs. The learning that matters is the synapse layer's, and that is the signal NeuralMind keeps.
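The arithmetic behind that 2×2 can be checked in a few lines. This is only a sketch over the four hit rates quoted above; it assumes "cold" means synapses off and "warm" means synapses on, and the function names are illustrative rather than part of NeuralMind.

```python
# Top-k hit rates (percent) from the 2x2 quoted above.
# Keys are (synapses_on, reranker_on); "cold" is assumed to mean synapses off.
hit_rate = {
    (False, False): 71.7,  # cold, reranker off
    (False, True): 71.7,   # cold, reranker on
    (True, False): 83.3,   # warm, reranker off
    (True, True): 83.3,    # warm, reranker on
}


def reranker_effect(synapses_on: bool) -> float:
    """Change in hit rate from switching the reranker on, synapses held fixed."""
    return round(hit_rate[(synapses_on, True)] - hit_rate[(synapses_on, False)], 1)


def synapse_effect(reranker_on: bool) -> float:
    """Change in hit rate from switching synapses on, reranker held fixed."""
    return round(hit_rate[(True, reranker_on)] - hit_rate[(False, reranker_on)], 1)


print(reranker_effect(False), reranker_effect(True))  # 0.0 0.0
print(synapse_effect(False))  # 11.6
```

Holding one factor fixed while toggling the other is what lets the two effects be read off separately.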
(+X.XX boost) labels from the reranker. Synapse-recall labels are unchanged and still appear.
.neuralmind/learned_patterns.json is no longer read or written. An existing one is simply ignored and can be deleted.
neuralmind learn prints a deprecation notice and exits 0, so any script or CI step that calls it keeps working. NeuralMind(enable_reranking=...) is accepted and ignored for backward compatibility.
Two learning mechanisms that nominally do the same job mean two places for behavior to drift, two things to document, and two things to keep honest. The A/B settled which one earns its keep. Automatic beats manual: the reranker only improved with a neuralmind learn step you had to remember to run, while the synapse layer learns as you work. And decay beats staleness: the reranker's JSON captured a snapshot that aged between runs, whereas the synapse layer continuously reinforces what's used and decays what isn't, so recall tracks current usage instead of a stale batch. To see what's been learned, use neuralmind stats or neuralmind memory inspect.
The learned synapse layer is now namespace-aware (PRD 4). Branch experiments, your personal long-term memory, an imported team baseline, and throwaway session scratch live in separate namespaces inside the same store — branch:<name>, personal, shared, ephemeral — so a weekend spike on a feature branch can't pollute what the agent learned about main. Reads stay smart by default: a transparent merged view prioritizes recent branch-local context while retaining long-term priors, with the weighting published as three constants. Existing learned memory migrates in place, losslessly — everything you've taught NeuralMind becomes the personal namespace, and on the default branch behavior is byte-identical to v0.23.
NEURALMIND_NAMESPACE env var → memory_namespace: pinned in neuralmind-backend.yaml → branch:<name> on a non-default git branch (best-effort stdlib git rev-parse, 3s timeout) → personal. Detached HEAD, missing git, or a non-repo all degrade safely: a weird git state never fails a memory write.
namespace joins the primary keys of all three tables, so the store rebuilds each table inside a single IMMEDIATE transaction that rolls back wholesale on any failure. A mandatory stdlib-only test opens a real pre-namespace database and proves every edge, transition, and activation survives with identical weights under personal.
neighbors, next_likely, edges, and transitions merge the active namespace at 1.0× (W_BRANCH), personal at 0.8× (W_PERSONAL), and shared at 0.5× (W_SHARED): explicit module-level constants with the formula documented beside them. Explicit namespaces=[...] (or --namespace) reads one namespace at raw weights.
neuralmind memory {inspect,reset,export,import}. Inspect contribution by namespace (also folded into stats); clear one namespace without touching the index or any other namespace; export a namespace as a portable, versioned JSON bundle reusing the IR's IRSynapse shape; import validates format + version and merges idempotently (weights merge by MAX). This is the PRD 8 team-memory on-ramp.
shared is sticky (a team baseline shouldn't evaporate because one developer changed focus); personal / branch:* keep the existing rates with long-term potentiation intact; ephemeral fades fast with no LTP floor and is cleared outright at session boundaries (SessionStart hook, daemon shutdown).
namespace_contribution map showing how much boost energy arrived through each namespace's edges, computed only on traced queries, so the untraced hot path pays nothing.
A learned usage memory is only trustworthy if the wrong lessons can be contained and removed. Namespaces give branch experiments a place to live and die, give a team a clean channel to ship baseline knowledge to a new teammate (shared at 0.5×: informative, never louder than your own experience), and give throwaway exploration a memory that forgets itself. And because the migration is the riskiest part of touching a memory store, it's a single transaction with rollback-on-error and a no-data-loss test, the pattern future schema bumps follow.
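The resolution order and the merged-view weights described above can be sketched as follows. This is an illustration only: the function names, the default-branch name "main", and the weighted-sum merge formula are assumptions, since the text gives the multipliers but not the formula documented beside them.

```python
from __future__ import annotations

import subprocess

# Multipliers quoted above; the way they are combined below is an assumption.
W_BRANCH, W_PERSONAL, W_SHARED = 1.0, 0.8, 0.5


def current_branch(cwd: str = ".") -> str | None:
    """Best-effort branch lookup; any git problem degrades to None."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            cwd=cwd, capture_output=True, text=True, timeout=3,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # missing git, bad cwd, or timeout
    name = out.stdout.strip()
    # git prints "HEAD" for a detached HEAD; treat that as "no branch".
    if out.returncode != 0 or not name or name == "HEAD":
        return None
    return name


def resolve_namespace(env: dict, pinned: str | None, branch: str | None,
                      default_branch: str = "main") -> str:
    """Env var, then pinned config value, then non-default branch, then personal."""
    if env.get("NEURALMIND_NAMESPACE"):
        return env["NEURALMIND_NAMESPACE"]
    if pinned:
        return pinned
    if branch and branch != default_branch:
        return f"branch:{branch}"
    return "personal"


def merged_weight(weights: dict[str, float], active: str) -> float:
    """Hypothetical merge: sum each namespace's edge weight times its multiplier."""
    total = 0.0
    for namespace, weight in weights.items():
        if namespace == active:
            total += W_BRANCH * weight
        elif namespace == "personal":
            total += W_PERSONAL * weight
        elif namespace == "shared":
            total += W_SHARED * weight
    return total


print(resolve_namespace({}, None, "spike"))  # branch:spike
print(resolve_namespace({}, None, None))     # personal
```

Because the active namespace is matched first, personal is read at 1.0× when it is the active namespace, which fits the claim that default-branch behavior does not change.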
Full v0.24.0 release notes → · Branch-isolated memory walkthrough →
Four future-proofing foundations land together. PRD 1: NeuralMind now builds a canonical, versioned intermediate representation (IR) of your code graph and validates it on every build — a new neuralmind validate command checks that contract and reports schema problems, decoupling retrieval/memory/UI/MCP from any one graph producer's field names. It's a Phase-1, hidden-adapter rollout: the embedder still reads graph.json exactly as before, so nothing about retrieval changes. PRD 2: a new neuralmind benchmark --quality mode proves retrieval finds the right code — precision@k / recall@k / MRR / answerability over golden suites — and fails CI on a regression. PRD 3: neuralmind query --trace shows why a result came back — per-layer candidates, cluster scoring with vector-vs-synapse attribution, synapse boosts, and final hits. PRD 5: an experimental neuralmind daemon holds project state warm so repeat queries skip cold backend init, with the CLI auto-preferring it and falling back to direct mode.
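For readers unfamiliar with the retrieval metrics named above, here is a minimal stdlib sketch of precision@k, recall@k, and MRR using their standard definitions. It is not the code in neuralmind/quality.py, and answerability is left out because its definition is not given here.

```python
def precision_at_k(ranked: list, relevant: set, k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / k


def recall_at_k(ranked: list, relevant: set, k: int) -> float:
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: list, relevant: set) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is returned."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list) -> float:
    """Average reciprocal rank over (ranked, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


ranked = ["a.py", "b.py", "c.py", "d.py"]
relevant = {"b.py", "d.py"}
print(recall_at_k(ranked, relevant, 3))   # 0.5
print(reciprocal_rank(ranked, relevant))  # 0.5
```

A regression gate then only needs to compare each metric against a saved baseline and exit non-zero when the drop exceeds a chosen threshold.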
neuralmind/ir.py). Every build adapts the loaded graph.json into an IndexIR: canonical IRNode / IREdge / IRCluster / IRSynapse entities stamped with an ir_version, the source backend, the producer's schema version, language inference, and a per-build coverage signal (coarse for tree-sitter/graphify; a future SCIP/LSP backend reports precise). Written to .neuralmind/index_ir.json.
neuralmind validate [path] [--write] [--json]. Validates the IR with no vector backend (no ChromaDB/turbovec needed). It reports errors (dangling/missing edge endpoints, duplicate ids, unsupported IR version) and warnings (orphaned nodes, unknown node kinds, unknown edge relations). --write (re)materializes the IR, the in-place migration path for a project that predates it, no rebuild required; --json emits a machine-readable summary for CI.
from_graph_json / to_graph_json round-trip any graph.json back to a dict equal on every field the stack consumes; non-standard producer fields are preserved verbatim so upgrades don't silently drop information.
build and stats. build prints an IR: v1 (valid) line; stats (and build_stats) carry an ir block with the contract version, counts, coverage, and last validation result. Public API: from neuralmind import IndexIR, from_graph_json, validate_ir, validate_project.
neuralmind/quality.py), PRD 2. Pure, stdlib-only precision@k / recall@k / MRR / answerability, with a regression-threshold gate and baseline-delta comparison.
neuralmind benchmark --quality. Scores those metrics over golden query suites spanning three repos (Python / TypeScript / Go, 30 labelled queries), reports per-suite MRR / answerability / recall@k / precision@k, and exits non-zero on a regression. --suite scopes to one language; --baseline reports deltas vs a saved run; --json for dashboards. A contributor/CI self-test like neuralmind eval.
neuralmind query --trace), PRD 3. A per-layer trace of the retrieval path: vector candidate pool, cluster scoring with vector-vs-synapse attribution, individual synapse boosts, final ranked hits (flagging synapse-recalled ones), and the token budget. Pure/stdlib neuralmind/trace.py; bounded so it can ride along in a CLI/MCP payload, and redact-able to basenames for bug reports. Zero-overhead off by default; the daemon's /query honors it too.
neuralmind daemon), PRD 5, experimental. A per-user localhost service with a project registry (each project's index initialized once, reused warm), per-project locks + background build jobs (no index/synapse corruption), and a transport-agnostic dispatch() API the CLI client speaks. neuralmind query/stats auto-prefer it and fall back to direct mode (NEURALMIND_NO_DAEMON=1 to force direct); a stale discovery file is cleaned up so a crashed daemon never wedges the CLI.
Retrieval no longer has to assume one producer's exact JSON shape: the IR is the single contract, and the graphify/tree-sitter reader is just the first adapter behind it. validate turns "why is retrieval weird?" debugging into a one-line schema check, surfacing dangling edges and orphaned nodes before they cost a query. And the quality harness closes the loop the token-reduction benchmark left open: it proves the context NeuralMind selects is relevant, not just cheap, so a ranking or synapse-recall change that quietly drops the right files fails CI instead of shipping. The embedder still consumes graph.json today; future phases dual-write, then make the IR the read path.
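The kind of structural check that validate performs can be illustrated with a small sketch. It covers only three of the conditions listed above (duplicate ids and dangling endpoints as errors, orphaned nodes as warnings), and the dict field names id, source, and target are assumptions, not the actual IR schema.

```python
def validate_graph(nodes: list, edges: list) -> tuple:
    """Return (errors, warnings) for a node/edge list.

    Hypothetical shapes: nodes are {"id": ...}, edges are
    {"source": ..., "target": ...}.
    """
    errors, warnings = [], []

    seen = set()
    for node in nodes:
        if node["id"] in seen:
            errors.append(f"duplicate id: {node['id']}")
        seen.add(node["id"])

    connected = set()
    for edge in edges:
        for endpoint in (edge["source"], edge["target"]):
            if endpoint in seen:
                connected.add(endpoint)
            else:
                errors.append(f"dangling endpoint: {endpoint}")

    # Nodes that no edge touches are suspicious but not fatal.
    for node_id in sorted(seen - connected):
        warnings.append(f"orphaned node: {node_id}")

    return errors, warnings


nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}, {"id": "a"}]
edges = [{"source": "a", "target": "b"}, {"source": "a", "target": "x"}]
errors, warnings = validate_graph(nodes, edges)
print(errors)    # ['duplicate id: a', 'dangling endpoint: x']
print(warnings)  # ['orphaned node: c']
```

Splitting findings into errors and warnings is what lets a CI step fail on a broken contract while still tolerating a graph that is merely sparse.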
import neuralmind no longer requires ChromaDB, and the default backend is now auto. v0.21.0 made a complete ChromaDB-free retrieval path (the turbovec backend). v0.22.0 starts switching everyone onto it — safely: the package imports without ChromaDB, and auto prefers turbovec when its deps are installed, else falls back to chroma. Nothing breaks for a plain install, and the old ChromaDB index is never deleted.
import neuralmind is ChromaDB-free. GraphEmbedder is now exposed lazily via PEP 562 module-level __getattr__ (so from neuralmind import GraphEmbedder still works), and the chromadb telemetry patch moved next to the only import chromadb, so import neuralmind succeeds with ChromaDB entirely absent.
auto. An unset config (or backend: auto) resolves to turbovec when its stack (turbovec + onnxruntime + tokenizers) is importable, else chroma. An explicit backend: in neuralmind-backend.yaml always wins, so pinning chroma (backend: chroma) is a one-liner.
auto lands on turbovec for a project that still has a legacy ChromaDB index and no turbovec index yet, the build path reindexes from graph.json and prints a one-line notice. The old ChromaDB index is left untouched as a fallback, so nothing is deleted.
neuralmind doctor shows the resolved backend. A new Backend check reports the configured value, what it resolves to, and whether the turbovec stack is installed.
A plain pip install neuralmind (no [turbovec] extra) resolves auto → chroma, so existing installs and CI are unchanged; users who added the extra get the ChromaDB-free path by default, letting it bake before v0.23 makes it universal. The flip is deliberately soft: turbovec is still an optional extra and chromadb is still a core dependency. This release removes only the import-time requirement and makes turbovec the preferred default; folding the turbovec deps into core and dropping chromadb are the staged next steps (v0.23).
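The resolution rule reads naturally as a small function. This is a sketch of the described behavior, not NeuralMind's own code: resolve_backend and its importable parameter are hypothetical names, and only the three module names and the auto/turbovec/chroma values come from the text above.

```python
import importlib.util

# The optional stack that "auto" looks for, as listed above.
TURBOVEC_STACK = ("turbovec", "onnxruntime", "tokenizers")


def _importable(module: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module) is not None


def resolve_backend(configured=None, importable=_importable) -> str:
    """An explicit value wins; unset or "auto" prefers turbovec, else chroma."""
    if configured and configured != "auto":
        return configured
    if all(importable(module) for module in TURBOVEC_STACK):
        return "turbovec"
    return "chroma"


print(resolve_backend("chroma"))
print(resolve_backend("auto", importable=lambda module: False))  # chroma
```

Passing the importability check in as a parameter keeps the rule testable without installing or uninstalling anything.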
NeuralMind can now embed and search with zero ChromaDB. v0.20.0 moved vector search to TurboVec (Google Research's TurboQuant compressed index); v0.21.0 moves the last ChromaDB-only job — embedding — into a bundled module, so the opt-in turbovec backend is fully self-contained.
OnnxMiniLMEmbedder. A ChromaDB-free all-MiniLM-L6-v2 embedder on just onnxruntime + tokenizers + numpy, producing vectors byte-identical to ChromaDB's DefaultEmbeddingFunction (verified: cosine 1.0, max diff 0.0).
turbovec backend is ChromaDB-free end to end. Vectors → TurboVec IdMapIndex (8–16× smaller); text + metadata → local SQLite; embeddings → the bundled embedder. Enable with backend: turbovec in neuralmind-backend.yaml.
turbovec never constructs the Chroma stack.
ChromaDB drags in a large dependency tree and the recurring CVE-2026-45829 advisory surface. This release is the foundation for retiring it entirely: making the package import ChromaDB-free, flipping the default to turbovec with a migration, and dropping the dependency are the staged next steps (v0.22–v0.23).
The static code graph is commoditizing; the learned synapse layer is the moat. v0.20.0 turns that moat into a number: neuralmind eval --onboarding measures whether an agent that inherits a committed team memory retrieves better on its first queries than a cold agent with none.
neuralmind eval --onboarding. A cold-vs-onboarded A/B over the gold queries, sharing one built index and differing only in synapse recall (off vs on over a committed team baseline). Scored by the same offline judge as the faithfulness eval; the headline is the top-k module hit-rate lift (the slice associative recall re-ranks within), with fact-recall and grounding reported as honest secondaries (at a fixed budget fact-recall is budget-traded and full-context grounding saturates).
evals/onboarding/seed_history.json is a regenerable record of co-edit sessions replayed through the real reinforcement path, with no binary synapses.db committed.
AST-derived code graphs are table stakes now. NeuralMind's durable edge is usage memory: the synapse layer that learns what your team edits together. This release makes that edge measurable and regression-gated, formalising the self-benchmark's Phase-3 synapse A/B into a committed-baseline eval.
Distribution is half the moat. The static code graph is commoditizing; what NeuralMind has that a static graph can't copy is usage memory (the learned synapse layer) plus being the memory an agent actually reaches for. v0.19.0 makes the latter frictionless: one command registers NeuralMind's MCP server with every agent you have installed.
neuralmind install-mcp. Auto-detects installed MCP clients (Claude Code, Cursor, Cline, Claude Desktop) and merges a NeuralMind entry into each client's config. --all does every detected client; --client targets one; --print emits the snippet to paste by hand.
{"mcpServers": {…}} shape (just in different files); the command preserves your other servers, re-running is a no-op, and a changed entry is updated in place. Pure standard library (neuralmind/mcp_install.py), 11 tests.
{"command": "neuralmind-mcp"}: the MCP tools take a project_path per call, so there's nothing to bake in.
AST-derived code graphs are becoming table stakes. NeuralMind's durable edge is usage memory + distribution. The self-benchmark already measures the learned uplift directly (Phase 3, synapse-recall A/B: top-k hit rate 71.7% → 83.3%, +11.7 points with recall on, at a neutral token budget). v0.19.0 invests in the other half: making NeuralMind trivial to plug into every agent, so it sees more usage and learns more.
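The merge behavior described above (keep other servers, no-op on re-run, update a changed entry in place) can be sketched with the standard library. This is not the code in neuralmind/mcp_install.py; the function name and the "neuralmind" server key are assumptions, while the mcpServers shape and the entry body come from the text.

```python
import json

# Entry body quoted above; the server key "neuralmind" below is an assumption.
ENTRY = {"command": "neuralmind-mcp"}


def merge_entry(config: dict, name: str = "neuralmind", entry: dict = ENTRY) -> bool:
    """Merge one server entry into an MCP client config in place.

    Returns True if the config changed, False if it was already up to date.
    """
    servers = config.setdefault("mcpServers", {})
    if servers.get(name) == entry:
        return False  # re-running is a no-op
    servers[name] = dict(entry)  # add, or update a changed entry in place
    return True


config = {"mcpServers": {"other": {"command": "other-mcp"}}}
first = merge_entry(config)
second = merge_entry(config)
print(first, second)  # True False
print(json.dumps(config, indent=2))
```

Returning whether anything changed lets a caller skip rewriting the client's file when the entry is already current.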
Stay fresh without rebuilding. Picking up a code change used to mean re-running neuralmind build — a whole-repo re-parse. v0.18.0 re-parses just the file you edited and re-embeds only its nodes, leaving everything else byte-for-byte untouched.
graphgen.update_files(). Splices a re-parse of the changed files back into an existing graph.json. Unchanged files keep their nodes, edges, and community numbers byte-for-byte. The latter is the trick, since renumbering communities would change every node's content hash and force a full re-embed. The edited file's outgoing edges are re-resolved; edges into a removed/renamed symbol are pruned (no dangling edges).
NeuralMind.update_files(paths). Writes the graph, deletes embeddings for removed symbols, and re-embeds, which, thanks to content hashing, only touches the edited file's nodes. On the reference fixture, editing one file re-embeds 2 nodes and skips 135.
neuralmind watch --reindex. Wires it to the watcher: every debounced batch of edits is re-indexed automatically, so a query right after a save already reflects the new code. Opt-in, since re-embedding needs the retrieval stack in the watch process.
The built-in backend was already file-by-file and the embedder already content-hashed; v0.18.0 closes the loop so the parse is incremental too and it happens automatically as you work. The graph contract is unchanged: retrieval, synapses, the graph view, and MCP tools just see a fresher graph.
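Why content hashing makes the re-embed incremental, and why a renumbered community would defeat it, shows up in a small sketch. The hashing scheme (SHA-256 over the node's sorted JSON) and the node fields are assumptions made for illustration; NeuralMind's actual hash inputs are not described here.

```python
import hashlib
import json


def content_hash(node: dict) -> str:
    """Stable hash of a node's content (assumed: SHA-256 of sorted JSON)."""
    payload = json.dumps(node, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_reembed(nodes: dict, stored_hashes: dict) -> tuple:
    """Split node ids into (to_embed, skipped, removed) against stored hashes."""
    to_embed, skipped = [], []
    for node_id, node in nodes.items():
        if stored_hashes.get(node_id) == content_hash(node):
            skipped.append(node_id)
        else:
            to_embed.append(node_id)
    removed = [node_id for node_id in stored_hashes if node_id not in nodes]
    return to_embed, skipped, removed


old = {
    "a.f": {"body": "def f(): ...", "community": 0},
    "b.g": {"body": "def g(): ...", "community": 1},
    "b.h": {"body": "def h(): ...", "community": 1},
}
stored = {node_id: content_hash(node) for node_id, node in old.items()}

# Edit one symbol and delete another; everything else is untouched.
new = {
    "a.f": {"body": "def f(): return 1", "community": 0},
    "b.g": {"body": "def g(): ...", "community": 1},
}
print(plan_reembed(new, stored))  # (['a.f'], ['b.g'], ['b.h'])
```

In this sketch the community number is part of the hashed content, so renumbering communities would change every hash and push every node into the re-embed list.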
Compiler-accurate edges, where you have them. The built-in backend resolves calls/inherits edges heuristically — by name — which is wrong when two classes both define, say, handle(). If your repo has been indexed by a SCIP tool (scip-python, scip-typescript, scip-go), v0.17.0 can fold that compiler-accurate resolution in.
neuralmind/precision.py). With NEURALMIND_PRECISION=1 and a *.scip index present, neuralmind build replaces the heuristic calls/inherits edges for the files the index covers with SCIP-resolved ones. Off by default, and a no-op otherwise.
contains, imports_from, rationale, communities, is untouched, behind the same graph.json seam.
run() → B.handle (wrong), SCIP links run() → A.handle (right) and drops the wrong edge, and the pass is a strict no-op when disabled.
Tree-sitter gives breadth (66+ languages, no build); SCIP/LSP give compiler-grade precision per language. v0.17.0 is the proven hybrid: tree-sitter by default, exact edges where a SCIP index exists, both behind the same seam, so the retrieval stack is untouched. No runtime change unless you opt in.
The built-in backend is no longer Python-only. v0.16.0 takes the bundled tree-sitter backend multi-language: neuralmind build . now indexes Python, TypeScript, and Go out of the box, with no graphify install — and a mixed-language repo is indexed in one pass.
neuralmind/graphgen.py dispatches each file by suffix (_SUFFIX_LANG → _EXTRACTORS) to a language extractor. TypeScript (.ts/.tsx) and Go (.go) join Python, each mapping its grammar's node types onto the same code/rationale/document nodes and contains/imports_from/inherits/calls edges.
/** … */ JSDoc, and Go // … doc comments all become the same rationale layer; struct fields and module constants become queryable code symbols.
tree-sitter-typescript and tree-sitter-go are now core dependencies, so the languages work with no extra install; a missing optional grammar is skipped per-file rather than failing the build.
The seam, not the parser, is the durable asset. Adding a language is now registering a pair of functions, with the parity gate guaranteeing each new grammar holds up before it ships. The retrieval stack downstream of graph.json is untouched; it simply sees more of the world. No runtime change for existing Python or graphify installs.
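The suffix-to-extractor dispatch can be pictured as two lookup tables. The table names _SUFFIX_LANG and _EXTRACTORS come from the text above; their contents, the stub extractors, and extract_file are assumptions made for this sketch, not the real graphgen code.

```python
from pathlib import Path


def _stub_extractor(language: str):
    """Build a placeholder extractor; real ones would walk a tree-sitter parse."""
    def extract(source: str) -> dict:
        return {"language": language, "chars": len(source)}
    return extract


# Suffix -> language, then language -> extractor (contents are illustrative).
_SUFFIX_LANG = {".py": "python", ".ts": "typescript", ".tsx": "typescript", ".go": "go"}
_EXTRACTORS = {lang: _stub_extractor(lang) for lang in ("python", "typescript", "go")}


def extract_file(path: str, source: str):
    """Dispatch one file to its extractor; unknown files are skipped, not fatal."""
    language = _SUFFIX_LANG.get(Path(path).suffix.lower())
    extractor = _EXTRACTORS.get(language)
    if extractor is None:
        return None  # skipped per-file rather than failing the build
    return extractor(source)


print(extract_file("web/app.tsx", "export const x = 1;"))
print(extract_file("README.md", "# docs"))  # None
```

With this shape, supporting another language means adding table entries rather than touching the dispatch logic.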
The biggest barrier to trying NeuralMind on your own repo is gone. Until now you had to install a second, external tool (graphify) and run it before neuralmind build would do anything. v0.15.0 ships a built-in tree-sitter graph backend, so pip install neuralmind && neuralmind build . just works.
neuralmind/graphgen.py). A pure-Python parser that produces a graphify-compatible graph.json: symbol-level code nodes (incl. module constants & class fields), contains/imports_from/inherits/calls edges, a full-body rationale layer, document nodes from markdown, and balanced per-file communities. tree-sitter + tree-sitter-python are now bundled core deps.
--force never clobbers a real graphify build, and an empty/non-code project still gets the honest "no graph" guidance instead of a silent 0-node "success."
evals/parity/run.py) builds the reference fixture with both backends, runs the faithfulness eval and derives the self-benchmark reduction on each, and fails the build if the built-in backend drifts outside tolerance of graphify (within 25% reduction, within 10 points faithfulness). An early cut failed this gate (it retrieved worse than naive truncation), and the backend was improved until it cleared the bar on the merits. On the reference fixture the built-in now beats graphify on fact recall (0.717 vs 0.555), grounding (1.000 vs 0.917), and token reduction (6.66× vs 6.08×). It's the safety net every future backend change must clear.
graph.json (embedder, context selector, communities, synapses, graph view, MCP tools) is unchanged. Only the graph producer changed, behind a generated_by / schema_version stamp. That's what lets multi-language extractors and an optional LSP/SCIP precision pass slot in next.
Removing the external-tool dependency cuts the #1 onboarding drop-off and a single-maintainer bus-factor risk, and unblocks a cleaner fresh-install on more platforms. The static code graph is commoditizing across the industry; making the producer pluggable, and gating every swap on measured retrieval quality, is how NeuralMind keeps the part that's actually unique (the learned synapse layer) honest. No runtime change for existing graphify-based installs.
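A tolerance gate of the kind described can be sketched in a few lines. Several things here are assumptions rather than facts about evals/parity/run.py: the gate is treated as one-sided (only a worse result fails), "10 points" is read as 0.10 on a 0–1 scale, fact recall stands in for the faithfulness score, and all names are illustrative.

```python
def parity_ok(candidate: dict, reference: dict,
              max_reduction_drop: float = 0.25,
              max_faithfulness_drop: float = 0.10) -> bool:
    """One-sided gate: fail only if the candidate is worse beyond tolerance.

    candidate / reference: {"reduction": x-fold token reduction,
                            "faithfulness": score on a 0-1 scale}.
    """
    reduction_floor = reference["reduction"] * (1.0 - max_reduction_drop)
    faithfulness_floor = reference["faithfulness"] - max_faithfulness_drop
    return (candidate["reduction"] >= reduction_floor
            and candidate["faithfulness"] >= faithfulness_floor)


# Reference-fixture figures quoted above (fact recall used as faithfulness).
graphify = {"reduction": 6.08, "faithfulness": 0.555}
builtin = {"reduction": 6.66, "faithfulness": 0.717}

print(parity_ok(builtin, graphify))  # True
print(parity_ok({"reduction": 4.0, "faithfulness": 0.717}, graphify))  # False
```

A CI wrapper would turn a False result into a non-zero exit code so the build fails.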
The measurement foundation became a command. neuralmind eval turns "does NeuralMind's memory make an agent's answers better, not just shorter?" into a reproducible number — and, like everything here, the default judge is 100% local.
neuralmind eval. Runs a with-NeuralMind vs naive A/B on the reference fixture and reports a faithfulness delta, grounding rate, contradiction rate, and a per-query breakdown. --json for machines; --selfcheck validates the gold set + offline scorer with no heavy deps. A contributor/CI quality gate: run it from a source checkout (the evals/ gold set isn't bundled in the pip wheel).
NEURALMIND_EVAL_LLM_JUDGE and is never the default or the gate.
NeuralMind has always led with a token-reduction number. v0.14.0 adds the harder, more honest one: quality. It's a self-test against the bundled reference fixture today (like neuralmind benchmark), and the fitness function the rest of the roadmap (graph-backend decoupling, team memory) is validated against. No runtime change to your install.
A foundation release: the scaffolding to prove the memory helps, not just claim it. v0.13.0 doesn't change what your install does at runtime — it builds the machinery to measure whether NeuralMind makes an agent's answers better (faithfulness), and to measure retrieval quality beyond Python. Plus a documentation process so the docs stop drifting.
NEURALMIND_EVAL_LLM_JUDGE mode is clearly labelled as leaving the machine and is never the default or the gate.
DOCUMENTATION-PROCESS.md, a PR-template checklist, and a CONTRIBUTING link, so every user-facing change ships its docs in the same PR and the landing-page version stops going stale.
NeuralMind's headline claims (cheaper, and the memory makes answers better) deserve to be measured, not asserted. v0.13.0 is the first brick: an honest, offline, reproducible way to score answer faithfulness, and fixtures that extend quality coverage past Python. Nothing here is a runtime feature; all of it is the fitness function the rest of the eval-first roadmap builds on. No migration, no new dependencies, no behavior change for existing installs.
One command tells you exactly what's wired up and what isn't. NeuralMind has accumulated moving parts — a code graph, a semantic index, a synapse memory db, Claude Code hooks, an MCP server, a query-memory consent flag. When one isn't in place, the old symptom was a stack trace or a silent no-op. v0.12.0 makes the setup legible.
neuralmind doctor. Inspects the install and reports each piece with a status (ok / warn / fail) and the exact command to fix it. Read-only: it never builds or mutates state. Exits non-zero when a check fails (graph/index missing) so you can gate a CI step or an agent's provisioning on it.
--json output. A stable, machine-readable snapshot of every check, for scripting and agent consumption. An agent provisioning its own environment can run neuralmind doctor --json as a pre-flight check before issuing queries.
AttributeError. It now raises a clear, actionable message naming the two commands to run (graphify update, then neuralmind build) and points you at neuralmind doctor.
Most NeuralMind support friction is setup friction: a piece isn't installed, and the failure doesn't say which one. doctor turns "it doesn't work" into a checklist with fixes, for the human dropped into a fresh clone and for the agent that has to verify its own environment before relying on the tool. No new dependencies, no behavior change for working installs.
The brain-like layer now learns what comes next, not just what goes together. The Hebbian co-activation signal that's powered NeuralMind since v0.4.0 has always been symmetric: nodes that fire together wire together, no ordering. v0.11.0 adds a parallel directional signal — after touching file A, you typically touch file B with probability p.
synapse_transitions(from_node, to_node, weight, count) table tracks ordered observations. Same Hebbian + decay machinery as the undirected synapses, but with directionality preserved. The file watcher records transitions automatically every time it flushes a batch, sorted by edit timestamp so re-touched files appear at their latest position, not their first.
next_likely(node) API. Returns the probability distribution over what typically follows a given file or node. Probabilities normalize to 1.0 across all outgoing transitions. Callable from Python (mind.synapses.next_likely(path)), the CLI (neuralmind next . path/to/file.py), or via the new neuralmind_next_likely MCP tool from Cursor / Cline / Continue / any MCP client.
TRANSITION_DECAY_RATE = 0.01 vs DECAY_RATE = 0.02 for the undirected edges) because sequential signals are rarer per session: you might edit ten files together but only get nine ordered pairs between them, so they need to accumulate for longer before being trusted.
The existing undirected synapse graph answers association: "given this file, what other files belong to the same thought?" Directional transitions split that signal into before (prefetch this file's skeleton when the implementation file is loaded) and after (remind the agent to update this when the implementation changes). Same source data (every activate_files call writes to both the undirected and directional tables), two views.
No migration. The new table is created on first connection. Existing co-activation edges in synapses.db are untouched. Disable by ignoring the new API — the watcher still records transitions in the background, but if nothing reads them, they just sit there decaying like any other edge.
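The two mechanics described for directional transitions, ordering a batch by latest edit time and normalizing outgoing weights into probabilities, can be sketched without any database. The function names and data shapes below are illustrative assumptions, not the synapse store's API.

```python
def ordered_pairs(edits: list) -> list:
    """Turn (path, timestamp) edits into consecutive (from, to) transitions.

    A file touched more than once is placed at its latest timestamp.
    """
    latest = {}
    for path, timestamp in edits:
        latest[path] = max(timestamp, latest.get(path, timestamp))
    order = sorted(latest, key=latest.get)
    return list(zip(order, order[1:]))


def next_likely(transitions: dict, node: str) -> dict:
    """Normalize a node's outgoing transition weights so they sum to 1.0."""
    outgoing = transitions.get(node, {})
    total = sum(outgoing.values())
    if total <= 0:
        return {}
    return {target: weight / total for target, weight in outgoing.items()}


batch = [("a.py", 1), ("b.py", 2), ("a.py", 3), ("c.py", 4)]
print(ordered_pairs(batch))  # [('b.py', 'a.py'), ('a.py', 'c.py')]

weights = {"a.py": {"b.py": 3.0, "c.py": 1.0}}
print(next_likely(weights, "a.py"))  # {'b.py': 0.75, 'c.py': 0.25}
```

Ten distinct files in one batch yield nine consecutive pairs, which matches the sparsity argument given for the slower transition decay rate.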
Three friction-removing changes to the PostToolUse Bash compressor — the surface AI coding agents see on every shell call. Built from real session feedback where the previous footer forced a second query round-trip just to recover dropped middle content.
[Full output: 4298 bytes...] message with a categorized breakdown: dropped 23 lines (12 info, 8 debug, 3 other); repeated: 5× 'Gamma API error 503'. Agents can judge whether the dropped middle was log noise or a buried error without re-running the command.
neuralmind last: new CLI that prints the raw pre-compression output. The PostToolUse hook stashes every Bash result to <project>/.neuralmind/last_output.json (single-slot, 2 MB cap, atomic temp-file + rename writes). Turns NEURALMIND_BYPASS=1 from a re-run-from-scratch escape hatch into a free lookup, which is meaningful on npm test (~28s) and non-deterministic network calls.
NEURALMIND_BASH_SMALL (default 500 chars) pass through verbatim with just exit-code framing, even on non-zero exit.
The PostToolUse compressor is the most-visible surface NeuralMind exposes: the agent sees it on every Bash call. v0.10.0 converts the common "I need to see the dropped middle" friction from a re-run loop into a single cache lookup, and turns the footer text itself into actionable messaging: concrete benefit visible, concrete affordance, clear escape hatch.
No migration. Same graph.json, same synapses.db, same hooks. The new cache file is additive; disable via NEURALMIND_OUTPUT_CACHE=0.
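The single-slot cache with an atomic temp-file + rename write can be sketched with the standard library. The file path, the 2 MB cap, and the NEURALMIND_OUTPUT_CACHE=0 switch come from the text above; the JSON layout, the choice to truncate at the cap, and the function name are assumptions.

```python
import json
import os
import tempfile

MAX_BYTES = 2 * 1024 * 1024  # 2 MB cap quoted above


def stash_output(project_dir: str, command: str, output: str):
    """Write the latest command output to a single-slot cache file atomically."""
    if os.environ.get("NEURALMIND_OUTPUT_CACHE") == "0":
        return None  # cache disabled

    cache_dir = os.path.join(project_dir, ".neuralmind")
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, "last_output.json")

    # Assumed policy: truncate to the cap, dropping any split trailing character.
    capped = output.encode("utf-8")[:MAX_BYTES].decode("utf-8", errors="ignore")

    # Write to a temp file in the same directory, then rename over the target,
    # so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump({"command": command, "output": capped}, handle)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

Creating the temp file in the same directory as the target matters because os.replace is only atomic within a single filesystem.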
Phase 3 of the release arc. Turn the v0.6.0 → v0.7.0 → v0.8.0 foundation into something a CTO, security team, or regulated-industry operator can actually adopt.
ghcr.io/dfrostar/neuralmind:vX.Y.Z and :latest, multi-platform (linux/amd64 + linux/arm64), non-root runtime, transitive deps pre-wheeled. :latest only moves for stable tags; pre-release tags like v1.0.0-rc1 are excluded from the floating tag.
neuralmind-vX.Y.Z.sbom.json, ingestible by Grype, Trivy, Dependency-Track, and most enterprise SCA scanners.
No production code changes: pure CI + docs. No migration. Same graph.json, same synapses.db, same hooks.
neuralmind watch and neuralmind serve are first-class production processes now. The synapse store accumulates 24/7 whether you're at the keyboard or not, and the graph view is always listening on 127.0.0.1:8765.
scripts/systemd/neuralmind-{watch,serve}.service (user-scope units, hardened), scripts/launchd/com.neuralmind.{watch,serve}.plist (macOS user agents with RunAtLoad + KeepAlive), Windows Task Scheduler section in the Scheduling Guide.
/healthz endpoint on neuralmind serve: unauthenticated, returns {"status":"ok","version":"..."}. Designed for Docker HEALTHCHECK and systemd ExecStartPost probes so a fresh container can be checked without threading a session token.
HEALTHCHECK) with install + verify + uninstall + troubleshooting.
The canvas still requires the per-session auth token by default; pass --no-auth in the templates or read the tokenized URL from the service logs.
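A probe against /healthz can be as small as the sketch below. It assumes only what is stated above: the endpoint is unauthenticated, listens on 127.0.0.1:8765, and returns a JSON body with "status" set to "ok". The helper names `is_healthy` and `probe` are hypothetical.

```python
import json
import urllib.request


def is_healthy(body):
    """Return True when a /healthz response body reports status "ok"."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def probe(url="http://127.0.0.1:8765/healthz", timeout=2.0):
    """Fetch the endpoint; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200 and is_healthy(response.read())
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False


if __name__ == "__main__":
    # Exit code 0 means healthy, 1 means unhealthy, the shape health checks expect.
    raise SystemExit(0 if probe() else 1)
```

Run as a script, this exits non-zero while the server is down, so it could stand in for a curl call inside a container health check or a service readiness probe.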
Distribution release, not a features release. The brain is the same brain. What changed is how many ways you can install it.
pip install neuralmind graphifyy (the default; works in any venv)
pipx install neuralmind (global CLI, isolated env; neuralmind on PATH everywhere without activation)
uv pip install neuralmind graphifyy (~10× faster than pip, same wheel)
docker build -t neuralmind:dev . (multi-stage Dockerfile in the repo root, non-root runtime, transitive deps pre-wheeled in the builder; GHCR auto-publish lands in a later release, so build locally for now)
All five paths deliver the same package: the neuralmind CLI, the neuralmind-mcp server (for Claude Code, Cursor, Cline, Continue, and any MCP client), and the live graph view from v0.6.0. Smoke test is identical: neuralmind --help works everywhere; python -c "import neuralmind" works for pip / uv / source paths.
graph-view, hebbian-learning, force-directed-graph, …) so PyPI search ranking finally matches what we ship.
logrotate/copytruncate is fixed.
/api/queries test coverage (#116): the replay-last-query route is now in the automated regression suite.
No migration. Same graph.json, same synapses.db, same hooks. Upgrade is whatever your install path's --upgrade equivalent is.
The pitch flipped. v0.5.4 made the brain inspectable. v0.6.0 makes it legible. neuralmind serve now streams a live activity feed: synapse + file events pulse across the canvas in real time as the agent and the codebase interact. The graph stops being a static map and becomes a window into the hippocampus learning your codebase.
/api/events SSE stream subscribed to the in-process event bus. Affected nodes pulse on the canvas; sidebar log shows the most recent ~80 events.
A separate neuralmind watch daemon, or a hook-driven Claude Code session in another process, feeds the same live feed via <project>/.neuralmind/events.jsonl. The in-process bus stays the primary path; the JSONL is a deliberately boring side channel. Opt out with NEURALMIND_EVENT_LOG=0.
Cmd/Ctrl-K and / jump-to-search from anywhere; Esc clears and blurs.
The cross-process JSONL bridge means that if you run Claude Code, Cursor, OpenClaw, and Hermes-Agent against the same project, they all reinforce the same synapse store, and the v0.6.0 graph view shows the union of their activity. Pre-v0.6.0, the synapse store was shared but the experience wasn't: several tools talking to a black box. Now the brain is visibly one brain. See the multi-agent walkthrough.
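For readers who want to consume the /api/events stream outside the browser, the sketch below parses standard Server-Sent Events framing (data: lines terminated by a blank line). The payload being JSON, and the field names in the usage example, are assumptions; the actual event schema is not documented here.

```python
import json
from collections import deque


def parse_sse(lines):
    """Yield one decoded JSON payload per SSE event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment / keep-alive line
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)


# Usage: keep only the most recent 80 events, like the sidebar log.
sample = [
    ": keep-alive\n",
    'data: {"kind": "synapse", "node": "auth.py"}\n',  # hypothetical fields
    "\n",
]
recent = deque(parse_sse(sample), maxlen=80)
```

A bounded deque is one simple way to get "most recent ~80" behaviour, since older entries fall off automatically.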
No migration: same graph.json, same synapses.db, same hooks. Upgrade is pip install --upgrade neuralmind. Then neuralmind serve and save a file — that's the demo and the verification in one motion.
NeuralMind exists to solve a fundamental problem: AI agents waste tokens loading raw source code when they only need small, semantic context.
Our mission is to make semantic code intelligence accessible, affordable, and trustworthy—without data exfiltration, vendor lock-in, or compliance headaches.
Phase 1 — Smart Retrieval: Instead of loading entire files, NeuralMind uses a 4-layer semantic index to surface only the ~800 tokens of code your question actually needs.
Phase 2 — Output Compression: PostToolUse hooks shrink Read, Bash, and Grep output by 88–91% before agents see it.
Phase 3 — Brain-like Memory (v0.4.0): A second brain runs alongside the LLM — a persistent weighted graph that learns associations between code nodes from how the agent and the codebase actually interact. Stronger connections form between code that gets used together, weaker ones decay, and the agent's prompts trigger spreading-activation recall over the learned graph.
Phase 4 — Graph View (v0.5.4, made live in v0.6.0): neuralmind serve renders the whole system as an Obsidian-style force-directed graph in the browser. Structural edges, the learned synapse overlay, backlinks, a semantic quick-switcher, and one-click open-in-editor. v0.6.0 adds a live activity feed — synapse and file events pulse on the canvas in real time, with a cross-process JSONL bridge that lets Claude Code, Cursor, OpenClaw, and Hermes-Agent all feed the same canvas. The brain stops being a black box — you can finally watch what it's learning, live.
Result: 40–70× per-query token reduction (50K+ tokens of raw source compressed into ~800 tokens of structured context) and 40–70% lower bills on real codebases. Retrieval also gets sharper the longer the system runs on a codebase.
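The headline ratio is plain division, and it is worth running against your own numbers. The snippet below uses the example figures from the paragraph above (50,000 raw tokens, 800 context tokens); the function name `reduction_ratio` is just for illustration.

```python
def reduction_ratio(raw_tokens, context_tokens):
    """How many times smaller the retrieved context is than the raw source."""
    return raw_tokens / context_tokens


ratio = reduction_ratio(50_000, 800)
print(f"{ratio:.1f}x")  # 62.5x

# Raw-source sizes that map to the ends of the 40-70x range at ~800 tokens:
low_end = 40 * 800   # 32000 tokens
high_end = 70 * 800  # 56000 tokens
```

In other words, at roughly 800 tokens of context, the quoted range corresponds to questions that would otherwise pull in about 32K to 56K tokens of raw source.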
NeuralMind doesn't load code randomly. It uses a 4-layer index that progressively surfaces context:
The agent gets exactly what it needs, in order, without bloat.
NeuralMind learns from how you actually use the codebase. Over time, the brain-like synapse layer (described below) improves retrieval quality based on which code you query, edit, and run tools over together — automatically and continuously, with no external training and no manual step.
The newest layer is an associative memory inspired by how brains actually learn. NeuralMind tracks weighted "synapses" between code nodes, and applies three classic neuroscience principles:
The synapse store is local SQLite, project-scoped, and inspectable. It exports a markdown summary that Claude Code's auto-memory system loads natively, so the learned associations show up in every session — no MCP tool call required.
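The behaviour described on this page (connections that strengthen with co-use, weaken over time, and drive spreading-activation recall) can be illustrated with a small in-memory model. This is a toy sketch, not the project's synapse store: the class name `SynapseSketch`, the update rules, and every parameter value are assumptions chosen for readability.

```python
from collections import defaultdict


class SynapseSketch:
    """In-memory toy model of weighted associations between code nodes."""

    def __init__(self, learning_rate=0.2, decay_rate=0.05):
        self.learning_rate = learning_rate  # assumed value
        self.decay_rate = decay_rate        # assumed value
        self.weights = defaultdict(float)   # (node_a, node_b) -> weight in [0, 1)

    def reinforce(self, nodes):
        """Strengthen every pair of nodes that were used together."""
        unique = sorted(set(nodes))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                w = self.weights[(a, b)]
                # Move the weight a fraction of the way toward 1.0.
                self.weights[(a, b)] = w + self.learning_rate * (1.0 - w)

    def decay(self):
        """Weaken all associations by a fixed fraction."""
        for key in list(self.weights):
            self.weights[key] *= 1.0 - self.decay_rate

    def recall(self, seeds, hops=2, threshold=0.05):
        """Spread activation outward from seed nodes along weighted edges."""
        activation = {seed: 1.0 for seed in seeds}
        frontier = dict(activation)
        for _ in range(hops):
            next_frontier = {}
            for node, energy in frontier.items():
                for (a, b), w in self.weights.items():
                    if node not in (a, b):
                        continue
                    other = b if node == a else a
                    spread = energy * w
                    if spread >= threshold and spread > activation.get(other, 0.0):
                        activation[other] = spread
                        next_frontier[other] = spread
            frontier = next_frontier
        return sorted(activation.items(), key=lambda item: -item[1])
```

With this model, files that are repeatedly touched together end up ranked directly after the seed node in `recall`, while nodes reachable only through weak or multi-hop links rank lower or fall below the threshold.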
neuralmind serve (v0.5.4, made live in v0.6.0)
The agent-facing brain is now also a human-facing tool. A stdlib HTTP server renders a force-directed graph of the entire codebase in the browser: structural edges (calls and imports) drawn together with the learned synapse overlay, where edge thickness encodes weight. Each node has Obsidian-style backlinks and outgoing-links panels, a "synaptic neighbours" list with weights and activation counts, and a one-click "open in editor" button (smart support for VS Code, Cursor, vim, sublime, JetBrains). A semantic quick-switcher lets you type a phrase and jump straight to the matching node. Zero CDN dependencies; per-session access token bound to 127.0.0.1.
v0.6.0 adds a live activity feed. The server now exposes a long-lived /api/events SSE stream subscribed to an in-process event bus. Every SynapseStore.reinforce() call and every coalesced file-edit batch publishes an event; affected nodes pulse on the canvas with short animated radial rings, and a sidebar log shows the most recent ~80 events. A cross-process JSONL bridge (<project>/.neuralmind/events.jsonl) lets a separate neuralmind watch daemon, a Claude Code session, an OpenClaw call, or any other process feed the same canvas. One canvas, every agent. Opt out with NEURALMIND_EVENT_LOG=0.
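The cross-process bridge is, at heart, an append-only JSONL file that one process writes and another tails. The sketch below shows that pattern in isolation. The helper names and the event fields are hypothetical; only the file location (<project>/.neuralmind/events.jsonl) comes from the description above.

```python
import json
from pathlib import Path


def append_event(log_path, event):
    """Append one event as a single JSON line."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    line = json.dumps(event, separators=(",", ":")) + "\n"
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(line)


def read_new_events(log_path, offset=0):
    """Return (events, new_offset) for complete lines written after `offset`."""
    log_path = Path(log_path)
    if not log_path.exists():
        return [], offset
    with log_path.open("rb") as handle:
        handle.seek(offset)
        chunk = handle.read()

    # Everything before the last newline is complete; the remainder may be a
    # line another process is still writing, so leave it for the next poll.
    *complete, _partial = chunk.split(b"\n")
    events = []
    consumed = 0
    for raw in complete:
        consumed += len(raw) + 1  # +1 for the newline removed by split
        if raw.strip():
            events.append(json.loads(raw))
    return events, offset + consumed
```

A reader would call `read_new_events` in a polling loop, passing back the returned offset each time, and forward whatever it gets to its own event bus.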
Why it matters: pre-v0.6.0 the graph view was inspectable but static — you had to refresh to see new state. With the live feed, you can sit there in real time and watch the hippocampus learn your codebase. "The agent has a learning memory" stops being a claim and becomes a visual.
Every query is logged with full provenance: which code was retrieved, why, which embeddings were used, and the code state (git commit). The log is exportable for NIST AI RMF, SOC 2, GDPR, and HIPAA.
NeuralMind is MIT licensed and fully open source. No hidden business model, no vendor lock-in, no surprise rate limits.
The latest release leans into the distribution moat: neuralmind install-mcp --all auto-detects your installed agents (Claude Code, Cursor, Cline, Claude Desktop) and registers NeuralMind's MCP server with each — a non-destructive, idempotent merge. The learned synapse layer's uplift is already measured by the self-benchmark's Phase-3 A/B (top-k hit rate 71.7% → 83.3% with recall on). See the v0.19.0 release notes or the summary above.
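A "non-destructive, idempotent merge" has a precise meaning that a few lines of code can pin down: existing entries survive, and running the merge twice changes nothing the second time. The sketch below assumes the common MCP client layout with a top-level "mcpServers" object; the function name and the exact entry written by install-mcp are assumptions, not the tool's actual implementation.

```python
import copy


def merge_mcp_server(config, name, entry):
    """Return (new_config, changed) with `entry` registered under mcpServers[name].

    Other servers and unrelated top-level keys are preserved untouched.
    """
    merged = copy.deepcopy(config) if config else {}
    servers = merged.setdefault("mcpServers", {})
    if servers.get(name) == entry:
        return merged, False  # already registered: nothing to do
    servers[name] = copy.deepcopy(entry)
    return merged, True


# Usage with a hypothetical existing client config.
existing = {"theme": "dark", "mcpServers": {"other": {"command": "other-mcp"}}}
entry = {"command": "neuralmind-mcp"}  # server binary named on this page
first, changed_first = merge_mcp_server(existing, "neuralmind", entry)
second, changed_second = merge_mcp_server(first, "neuralmind", entry)
```

Returning a `changed` flag lets a caller skip rewriting the config file when nothing differs, which keeps repeated runs from touching files unnecessarily.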
This is an independent, open-source project. No relationship to NeuralMind.ai (a different company). We chose the name because it reflects our philosophy: a "neural" index that learns your codebase.
Your code stays local. Zero cloud calls, zero telemetry, zero data exfiltration.
Open source, MIT licensed. Every decision is auditable, every result is explainable.
Built for regulated industries. NIST AI RMF, SOC 2, GDPR, HIPAA friendly.
Works with your tools. Claude Code, Cursor, ChatGPT, local LLMs—not locked in.
Smart context reduces per-query tokens 40–70×. Lower costs, better answers.
Built in public. Issues, discussions, and contributions welcome.
Ready to reduce your per-query token costs by 40–70×?