Deep dive into NeuralMind's 4-layer progressive disclosure system and technical architecture.
NeuralMind is designed to solve a fundamental problem in AI-assisted coding: context window limitations. When working with AI coding assistants, loading an entire codebase can consume 50,000+ tokens, leaving little room for meaningful conversation.
NeuralMind achieves 40-70x token reduction through intelligent, query-aware context selection using a 4-layer progressive disclosure architecture.
┌───────────────────────────────────────────────────────────────────┐
│                       TRADITIONAL APPROACH                        │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   Full Codebase  ────────────────────────────────►  AI Context    │
│   (50,000+ tokens)                                 (50,000 tokens)│
│                                                                   │
│   Problems:                                                       │
│   • Exceeds context windows                                       │
│   • Most content irrelevant to query                              │
│   • Expensive (token costs)                                       │
│   • Slow processing                                               │
│   • Dilutes important information                                 │
└───────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────┐
│                        NEURALMIND APPROACH                        │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   Full Codebase ───► Knowledge ───► Progressive ───► AI Context   │
│   (50,000+ tokens)     Graph        Disclosure      (1,000 tokens)│
│                                                                   │
│   Benefits:                                                       │
│   • 40-70x token reduction                                        │
│   • Query-relevant content only                                   │
│   • Cost effective                                                │
│   • Fast response                                                 │
│   • Focused, relevant context                                     │
└───────────────────────────────────────────────────────────────────┘
Load information incrementally, starting with the most essential and adding detail as needed.
Essential (Always)       ──► Identity + Summary (~600 tokens)
Query-Relevant (Dynamic) ──► Modules + Search (~400-900 tokens)
                             ────────────────────────────────
                             Total: ~1000-1500 tokens
Use embeddings and semantic search rather than keyword matching to find relevant code.
Leverage code structure and relationships (communities/clusters) to load logically related code together.
Only re-process changed nodes to minimize rebuild time.
Strict token limits per layer ensure consistent, predictable context sizes.
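These principles can be sketched as a simple assembly step: clip each layer's text to its cap, then concatenate. This is an illustrative sketch only; the function names and the rough 4-characters-per-token estimate are assumptions, not NeuralMind's actual code.

```python
# Illustrative per-layer caps, following the budgets described in this doc.
LAYER_CAPS = {"l0": 100, "l1": 500, "l2": 500, "l3": 500}

def truncate_to_tokens(text: str, cap: int) -> str:
    # Crude token estimate: ~4 characters per token.
    return text[: cap * 4]

def assemble_context(layers: dict) -> str:
    # Clip each provided layer to its cap, then join in L0..L3 order.
    parts = [
        truncate_to_tokens(layers[name], cap)
        for name, cap in LAYER_CAPS.items()
        if name in layers
    ]
    return "\n\n".join(parts)
```

Because every layer is clipped before concatenation, total context size stays bounded regardless of how much raw material a layer produces.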
┌───────────────────────────────────────────────────────────────────┐
│                        LAYER ARCHITECTURE                         │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │  L0: IDENTITY LAYER (~100 tokens)                         │   │
│   │                                                           │   │
│   │  • Project name                                           │   │
│   │  • Brief description                                      │   │
│   │  • Key facts (language, framework, purpose)               │   │
│   │                                                           │   │
│   │  Source: mempalace.yaml > CLAUDE.md > README.md           │   │
│   │  Loading: ALWAYS                                          │   │
│   └───────────────────────────────────────────────────────────┘   │
│                                 │                                 │
│                                 ▼                                 │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │  L1: SUMMARY LAYER (~500 tokens)                          │   │
│   │                                                           │   │
│   │  • High-level architecture overview                       │   │
│   │  • Main components and their roles                        │   │
│   │  • Code cluster summaries                                 │   │
│   │  • Key patterns and conventions                           │   │
│   │                                                           │   │
│   │  Source: graph.json communities + descriptions            │   │
│   │  Loading: ALWAYS                                          │   │
│   └───────────────────────────────────────────────────────────┘   │
│                                 │                                 │
│                                 ▼                                 │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │  L2: ON-DEMAND LAYER (~200-500 tokens)                    │   │
│   │                                                           │   │
│   │  • Specific modules relevant to query                     │   │
│   │  • Community/cluster details                              │   │
│   │  • Function signatures and docstrings                     │   │
│   │  • Class hierarchies                                      │   │
│   │                                                           │   │
│   │  Source: Semantic search → community expansion            │   │
│   │  Loading: PER QUERY                                       │   │
│   └───────────────────────────────────────────────────────────┘   │
│                                 │                                 │
│                                 ▼                                 │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │  L3: SEARCH LAYER (~200-500 tokens)                       │   │
│   │                                                           │   │
│   │  • Semantic search results                                │   │
│   │  • Relevant code snippets                                 │   │
│   │  • Direct matches to query terms                          │   │
│   │  • Related entities                                       │   │
│   │                                                           │   │
│   │  Source: ChromaDB vector search                           │   │
│   │  Loading: PER QUERY                                       │   │
│   └───────────────────────────────────────────────────────────┘   │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
Purpose: Establish basic project context that AI needs for any interaction.
Token Budget: ~100 tokens
Content:
Source Priority:
1. mempalace.yaml - Structured project metadata
2. CLAUDE.md - AI-specific project description
3. README.md - Standard project documentation

Example Output:
# Project: MyApp
MyApp is a full-stack task management application built with React and Node.js.
It provides real-time collaboration features and integrates with popular
productivity tools.
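The source-priority fallback can be sketched as follows; `load_identity_source` is a hypothetical name, but the lookup order matches the priority list above.

```python
from pathlib import Path

# Hypothetical sketch of the L0 source-priority fallback: try each file
# in order and use the first one that exists.
SOURCES = ["mempalace.yaml", "CLAUDE.md", "README.md"]

def load_identity_source(project_root: str):
    for name in SOURCES:
        path = Path(project_root) / name
        if path.is_file():
            return name, path.read_text(encoding="utf-8")
    return None, ""
```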
Purpose: Provide architectural overview and main component understanding.
Token Budget: ~500 tokens
Content:
Source: Generated from graph.json community analysis
Example Output:
## Architecture Overview
### Core Components
- **Frontend** (React 18): Single-page application with TypeScript
- **Backend** (Node.js/Express): REST API with WebSocket support
- **Database** (PostgreSQL): Relational data with Prisma ORM
### Main Modules
1. **User Management** (users/) - Authentication, profiles, permissions
2. **Task Engine** (tasks/) - CRUD, scheduling, notifications
3. **Collaboration** (collab/) - Real-time sync, comments, sharing
4. **API Layer** (api/) - Routes, middleware, validation
Purpose: Load specific modules and code clusters relevant to the current query.
Token Budget: ~200-500 tokens (variable based on relevance)
Content:
Selection Process:
Example Output (for query "How does authentication work?"):
## Authentication Module (users/auth/)
### Key Components
**authenticate_user(credentials)** → User | None
Validates credentials against database, returns user on success.
**generate_jwt(user)** → str
Creates JWT token with user claims and 24h expiry.
**AuthMiddleware**
Express middleware that validates JWT and attaches user to request.
### Dependencies
- bcrypt for password hashing
- jsonwebtoken for JWT operations
- Redis for token blacklisting
Purpose: Provide direct semantic search results for specific query terms.
Token Budget: ~200-500 tokens (variable based on results)
Content:
Search Strategy:
Example Output (for query "How does authentication work?"):
## Search Results
**1. authenticate_user** (function) - Score: 0.92
`users/auth/handlers.py:45`
Main authentication handler that validates credentials.
**2. verify_jwt** (function) - Score: 0.88
`users/auth/jwt.py:23`
Verifies and decodes JWT tokens.
**3. hash_password** (function) - Score: 0.81
`users/auth/crypto.py:12`
Securely hashes passwords using bcrypt.
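The scores in these results come from vector similarity. ChromaDB's HNSW index computes this at scale; a naive cosine-similarity ranking over a handful of embeddings illustrates the idea (all names and vectors here are illustrative):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_nodes(query_vec, node_vecs, top_k=3):
    # node_vecs: mapping of node id -> embedding vector.
    scored = sorted(
        ((cosine(query_vec, vec), node_id) for node_id, vec in node_vecs.items()),
        reverse=True,
    )
    return [node_id for _, node_id in scored[:top_k]]
```

Real embeddings have hundreds of dimensions, but the ranking logic is identical.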
┌────────────────────────────────────────────────────────────────────────────┐
│                                 DATA FLOW                                  │
└────────────────────────────────────────────────────────────────────────────┘

BUILD PHASE

┌─────────────┐                                       ┌──────────────┐
│             │           graphify update             │              │
│  Codebase   │ ────────────────────────────────────► │  graph.json  │
│             │          (Parse, Analyze)             │              │
│  .py  .js   │                                       │ • Nodes      │
│  .ts  .java │                                       │ • Edges      │
└─────────────┘                                       │ • Communities│
                                                      └──────┬───────┘
                                                             │
                                                             ▼
                                                      ┌──────────────┐
                                                      │              │
                neuralmind build                      │   ChromaDB   │
      ───────────────────────────────────────────────►│              │
                (Embed, Index)                        │ • Vectors    │
                                                      │ • Metadata   │
                                                      └──────────────┘

QUERY PHASE

┌─────────────┐                                       ┌──────────────┐
│             │           neuralmind query            │              │
│    User     │ ────────────────────────────────────► │  NeuralMind  │
│    Query    │         "How does auth work?"         │              │
└─────────────┘                                       └──────┬───────┘
                                                             │
        ┌──────────────────────┬──────────────────────┬──────┘
        │                      │                      │
        ▼                      ▼                      ▼
 ┌─────────────┐        ┌─────────────┐        ┌─────────────┐
 │             │        │             │        │             │
 │   L0 + L1   │        │     L2      │        │     L3      │
 │  Identity   │        │  Community  │        │   Vector    │
 │  Summary    │        │  Expansion  │        │   Search    │
 │             │        │             │        │             │
 └──────┬──────┘        └──────┬──────┘        └──────┬──────┘
        │                      │                      │
        └──────────────────────┼──────────────────────┘
                               │
                               ▼
                        ┌─────────────┐
                        │             │
                        │   Context   │
                        │  Selector   │
                        │             │
                        │   Merge     │
                        │   Dedupe    │
                        │   Format    │
                        └──────┬──────┘
                               │
                               ▼
                        ┌─────────────┐
                        │             │
                        │  Optimized  │
                        │   Context   │
                        │             │
                        │  ~1000 tok  │
                        └─────────────┘
┌────────────────────────────────────────────────────────────────────────────┐
│                             COMPONENT DIAGRAM                              │
└────────────────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────────────┐
│                                NeuralMind                                │
│                                (core.py)                                 │
│                                                                          │
│   ┌──────────────────────────────────────────────────────────────────┐   │
│   │                            Public API                            │   │
│   │                                                                  │   │
│   │  • build()     - Generate/update embeddings                      │   │
│   │  • wakeup()    - Get minimal context                             │   │
│   │  • query()     - Get query-optimized context                     │   │
│   │  • search()    - Direct semantic search                          │   │
│   │  • benchmark() - Performance testing                             │   │
│   │  • get_stats() - Index statistics                                │   │
│   └──────────────────────────────────────────────────────────────────┘   │
│                                  │                                       │
│                  ┌───────────────┴───────────────┐                       │
│                  │                               │                       │
│                  ▼                               ▼                       │
│   ┌─────────────────────────────┐ ┌─────────────────────────────┐        │
│   │        GraphEmbedder        │ │       ContextSelector       │        │
│   │        (embedder.py)        │ │    (context_selector.py)    │        │
│   │                             │ │                             │        │
│   │  • load_graph()             │ │  • get_l0_identity()        │        │
│   │  • embed_nodes()            │ │  • get_l1_summary()         │        │
│   │  • search()                 │ │  • get_l2_context()         │        │
│   │  • get_node()               │ │  • get_l3_search()          │        │
│   │                             │ │  • get_context()            │        │
│   └──────────────┬──────────────┘ └──────────────┬──────────────┘        │
│                  │                               │                       │
│                  ▼                               │                       │
│   ┌─────────────────────────────┐                │                       │
│   │          ChromaDB           │◄───────────────┘                       │
│   │      (Vector Database)      │                                        │
│   │                             │                                        │
│   │  • Collections              │                                        │
│   │  • Embeddings               │                                        │
│   │  • Metadata                 │                                        │
│   │  • Similarity Search        │                                        │
│   └─────────────────────────────┘                                        │
└──────────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────┐
│                      TOKEN BUDGET ALLOCATION                      │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   WAKE-UP CONTEXT (Starting a conversation)                       │
│   ├── L0: Identity    100 tokens  ████                            │
│   └── L1: Summary     500 tokens  ████████████████████            │
│                       ───────────                                 │
│                       600 tokens total                            │
│                                                                   │
│   QUERY CONTEXT (Specific questions)                              │
│   ├── L0: Identity    100 tokens  ████                            │
│   ├── L1: Summary     500 tokens  ████████████████████            │
│   ├── L2: On-Demand   400 tokens  ████████████████                │
│   └── L3: Search      500 tokens  ████████████████████            │
│                       ───────────                                 │
│                       1500 tokens max                             │
│                                                                   │
│   COMPARISON                                                      │
│   ├── Full Codebase     50,000 tokens  ████████████████████████   │
│   ├── NeuralMind Query   1,000 tokens  █                          │
│   └── Reduction             50x                                   │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
L2 and L3 budgets are dynamic, scaled by the estimated complexity of the query:
# Budget allocation logic (simplified)
def allocate_budget(query_complexity: float) -> TokenBudget:
    base_l2 = 200
    base_l3 = 200

    # Scale based on complexity (0.0 - 1.0), capped at 500 tokens per layer
    l2_budget = min(base_l2 + int(300 * query_complexity), 500)
    l3_budget = min(base_l3 + int(300 * query_complexity), 500)

    return TokenBudget(
        l0=100,
        l1=500,
        l2=l2_budget,
        l3=l3_budget,
        total=100 + 500 + l2_budget + l3_budget,
    )
NeuralMind uses ChromaDB's default embedding function, typically all-MiniLM-L6-v2 (or similar).

┌───────────────────────────────────────────────────────────────────┐
│                         EMBEDDING TARGETS                         │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   For each node in graph.json:                                    │
│                                                                   │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │  Embedding Text = Concatenation of:                       │   │
│   │                                                           │   │
│   │  1. Node name     "authenticate_user"                     │   │
│   │  2. Node type     "function"                              │   │
│   │  3. Description   "Validates user credentials..."         │   │
│   │  4. File path     "users/auth/handlers.py"                │   │
│   │  5. Docstring     "Args: credentials (dict)..."           │   │
│   │                                                           │   │
│   └───────────────────────────────────────────────────────────┘   │
│                                                                   │
│   Metadata stored:                                                │
│   • node_id                                                       │
│   • node_type                                                     │
│   • community_id                                                  │
│   • file_path                                                     │
│   • content_hash (for incremental updates)                        │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
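Assembling the embedding text from those five fields might look like this; the dict keys are assumptions about the graph.json node schema, not a documented interface.

```python
# Field order mirrors the "Embedding Text = Concatenation of" list above.
# Keys are assumed names for illustration.
EMBED_FIELDS = ["name", "type", "description", "file_path", "docstring"]

def build_embedding_text(node: dict) -> str:
    # Concatenate the fields in order, skipping any that are missing or empty.
    parts = [str(node[f]) for f in EMBED_FIELDS if node.get(f)]
    return "\n".join(parts)
```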
# Incremental embedding logic
def should_embed(node: dict, existing_hash: str) -> bool:
    current_hash = hash_node_content(node)
    return current_hash != existing_hash

# Only embed changed nodes; skipped nodes are counted for stats
skip_count = 0
for node in graph['nodes']:
    if should_embed(node, get_stored_hash(node['id'])):
        embed_and_store(node)
    else:
        skip_count += 1
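`hash_node_content` is not shown above; one plausible SHA-256 implementation (an assumption, not NeuralMind's actual code) hashes a canonical JSON serialization of the node, so dict key order never triggers a spurious re-embed.

```python
import hashlib
import json

def hash_node_content(node: dict) -> str:
    # SHA-256 over a canonical JSON serialization: sorted keys make the
    # digest independent of dict ordering, so only real content changes
    # produce a new hash.
    canonical = json.dumps(node, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```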
NeuralMind leverages community structure from the knowledge graph to understand logical code groupings.
┌───────────────────────────────────────────────────────────────────┐
│                        COMMUNITY STRUCTURE                        │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   Community: A cluster of closely related code entities           │
│                                                                   │
│   ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐   │
│   │   Community 1   │  │   Community 2   │  │   Community 3   │   │
│   │ (Authentication)│  │  (Task Engine)  │  │   (API Layer)   │   │
│   │                 │  │                 │  │                 │   │
│   │ • login()       │  │ • create_task() │  │ • /api/tasks    │   │
│   │ • logout()      │  │ • update_task() │  │ • /api/users    │   │
│   │ • verify_jwt()  │  │ • delete_task() │  │ • middleware    │   │
│   │ • User model    │  │ • Task model    │  │ • validators    │   │
│   └────────┬────────┘  └────────┬────────┘  └────────┬────────┘   │
│            │                    │                    │            │
│            └────────────────────┼────────────────────┘            │
│                                 │                                 │
│                   Cross-community relationships                   │
│                   (imports, calls, dependencies)                  │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
When a query matches entities in a community, NeuralMind can expand to load related context:
from typing import List

def expand_context(matched_nodes: List[Node], budget: int) -> List[Node]:
    # Get communities of matched nodes
    communities = {n.community for n in matched_nodes}
    expanded = list(matched_nodes)
    remaining_budget = budget - sum(estimate_tokens(n) for n in matched_nodes)

    # Add other nodes from the same communities while budget remains
    for community_id in communities:
        for node in get_community_nodes(community_id):
            if node not in expanded:
                node_tokens = estimate_tokens(node)
                if node_tokens <= remaining_budget:
                    expanded.append(node)
                    remaining_budget -= node_tokens
    return expanded
Build-phase optimizations:

| Optimization | Description | Impact |
|---|---|---|
| Incremental Updates | Only embed changed nodes | 10-100x faster rebuilds |
| Content Hashing | SHA-256 hash of node content | Accurate change detection |
| Batch Embedding | Process nodes in batches | Reduced API overhead |
| Parallel Processing | Multi-threaded for large graphs | 2-4x faster initial build |
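Batch embedding needs little more than chunking the node list; since ChromaDB's `collection.add` accepts parallel lists of ids, documents, and metadatas, each chunk can become a single call. A generic chunking helper:

```python
def batches(items: list, size: int):
    # Yield consecutive chunks of at most `size` items; the final
    # chunk may be shorter.
    for i in range(0, len(items), size):
        yield items[i:i + size]
```

Usage might look like `for batch in batches(changed_nodes, 100): ...` with one `collection.add` call per batch (the batch size of 100 is an assumption, not a documented default).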
Query-phase optimizations:

| Optimization | Description | Impact |
|---|---|---|
| Vector Indexing | ChromaDB HNSW index | Sub-linear search time |
| Layer Caching | Cache L0/L1 per session | Instant wake-up |
| Result Caching | Cache recent query results | Instant repeat queries |
| Early Termination | Stop search at confidence threshold | Faster for clear queries |
┌───────────────────────────────────────────────────────────────────┐
│                           MEMORY USAGE                            │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│   Component          Typical Size                                 │
│   ─────────────────────────────────────────────────────           │
│   ChromaDB Index     10-50 MB (depends on codebase size)          │
│   Loaded Graph       5-20 MB                                      │
│   Embedding Model    ~100 MB (shared across instances)            │
│   Query Cache        1-5 MB                                       │
│                                                                   │
│   Total per project: ~20-80 MB                                    │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘