# Claude Engram
Persistent memory for AI coding assistants. Auto-tracks mistakes, decisions, and context. Retrieves the right memory at the right time using hybrid search (keyword + vector + reranking). Works with any MCP-compatible tool.
## Benchmarks

### Retrieval benchmarks
Retrieval-only (recall@k). These measure whether the right memory is found in the top results — not end-to-end QA with answer generation and judge scoring, which is what the published LongMemEval leaderboard measures. MemPalace comparison uses the same retrieval-only methodology (their raw mode, no LLM reranking, top_k=10).
| Benchmark | Claude Engram | MemPalace (raw) |
|---|---|---|
| LongMemEval Recall@5 (500 questions) | 0.966 | 0.966 |
| LongMemEval Recall@10 | 0.982 | 0.982 |
| LongMemEval NDCG@10 | 0.889 | 0.889 |
| ConvoMem (250 items, 5 categories) | 0.960 | 0.929 |
| LoCoMo R@10 (1,986 questions, top_k=10) | 0.649 | 0.603 |
| Speed | 43ms/query | ~600ms/query |
| Dependencies | AllMiniLM (optional) | ChromaDB |
Reproduce: `python tests/bench_longmemeval.py`, `bench_locomo.py`, `bench_convomem.py`
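Recall@k and NDCG@k in the table above are the standard retrieval metrics. A minimal self-contained sketch of how they are computed (this is the textbook definition, not the benchmark harness itself):

```python
import math

def recall_at_k(relevant: set, retrieved: list, k: int) -> float:
    """Fraction of relevant items that appear in the top-k results."""
    return len(relevant & set(retrieved[:k])) / len(relevant)

def ndcg_at_k(relevant: set, retrieved: list, k: int) -> float:
    """Binary-relevance NDCG: discounted gain of hits vs. the ideal ordering."""
    dcg = sum(1 / math.log2(i + 2) for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    ideal = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal

# One question whose single gold memory is retrieved at rank 3:
print(recall_at_k({"m42"}, ["m7", "m9", "m42", "m1"], 5))               # 1.0
print(round(ndcg_at_k({"m42"}, ["m7", "m9", "m42", "m1"], 10), 3))      # 0.5
```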
### Integration benchmarks
These test what the product actually does — not just search retrieval.
| Benchmark | What it tests | Result |
|---|---|---|
| Decision Capture (220 prompts) | Auto-detect decisions from user prompts | 97.8% precision, 36.7% recall |
| Injection Relevance (50 memories, 15 cases) | Right memories surface before edits | 14/15 passed, 100% cross-domain isolation |
| Compaction Survival (6 scenarios) | Rules/mistakes survive context compression | 6/6 passed |
| Error Auto-Capture (53 payloads) | Extract errors, reject noise, deduplicate | 100% recall, 97% precision |
| Multi-Project Scoping (11 cases) | Sub-project isolation + workspace inheritance | 11/11 passed |
| Edit Loop Detection (12 scenarios) | Detect spirals vs iterative improvement | 12/12 passed |
Reproduce: `python tests/bench_integration.py` (runs from the `tests/` directory)
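The Edit Loop Detection row above can be pictured as a sliding count of edits to the same file. A hypothetical sketch, not the shipped detector (the threshold of 3 edits comes from the feature list; the window size is an assumption):

```python
from collections import deque

def make_loop_detector(threshold: int = 3, window: int = 10):
    """Flag a file once it has been edited `threshold`+ times
    within the last `window` edit events (window size is illustrative)."""
    recent = deque(maxlen=window)

    def record_edit(path: str) -> bool:
        recent.append(path)
        return recent.count(path) >= threshold

    return record_edit

edit = make_loop_detector()
edit("a.py"); edit("b.py"); edit("a.py")
print(edit("a.py"))  # True: third edit to a.py looks like a spiral, not iteration
```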
## Comparison with MemPalace
Different approaches. MemPalace is a conversation archive with a spatial palace structure, knowledge graph, AAAK compression, and specialist agents. Claude Engram is live-capture: hooks into the coding lifecycle to auto-track mistakes, decisions, and context as you work. Comparable retrieval, different strengths.
## Compatibility
| Platform | What Works | Auto-Capture |
|---|---|---|
| Claude Code (CLI, desktop, VS Code, JetBrains) | Everything | Yes — 10 hook events |
| Cursor | MCP tools (memory, search, scope, etc.) | No hooks |
| Windsurf | MCP tools | No hooks |
| Continue.dev | MCP tools | No hooks |
| Zed | MCP tools | No hooks |
| Any MCP client | MCP tools | No hooks |
| Python code | MemoryStore SDK directly | N/A |
With Claude Code, hooks auto-capture mistakes, decisions, edits, test results, and session state. With other tools, you use the MCP tools manually — the memory system, hybrid search, archiving, and scoring all work the same.
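Hybrid search in this sense fuses a keyword score with a vector-similarity score before reranking. A toy standalone sketch of the idea (the real pipeline uses AllMiniLM embeddings; the fusion weight and two-dimensional vectors here are invented for illustration):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def keyword_score(query: str, text: str) -> float:
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q)

def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Blend keyword overlap and vector similarity, then sort by the fused score."""
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in docs
    ]
    return [text for score, text in sorted(scored, reverse=True)]

docs = [("use postgres for storage", [0.9, 0.1]), ("fix flaky test in ci", [0.1, 0.9])]
print(hybrid_rank("postgres storage decision", [0.8, 0.2], docs))
```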
## Features
- Hybrid search — keyword + AllMiniLM vector + reranking. No ChromaDB dependency.
- Auto-tracks mistakes from any failed tool. Warns before editing the same file.
- Auto-captures decisions from prompts ("let's use X") via semantic + regex scoring.
- Detects edit loops when the same file is edited 3+ times.
- Survives compaction — auto-checkpoint before, re-inject rules/mistakes after.
- Tiered storage — hot (fast) + archive (cold, searchable, restorable). Rules and mistakes never archive.
- Scored injection — top 3 memories by file match, tags, recency, importance before every edit.
- Multi-project — memories scoped per sub-project. Workspace rules cascade down.
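The scored-injection step above (file match, tags, recency, importance, top 3) can be sketched as a weighted sum. The weights and field names below are assumptions for illustration, not the shipped formula:

```python
import time

def injection_score(memory: dict, current_file: str, current_tags: set, now: float) -> float:
    """Toy blend of the four signals named in the feature list; weights are illustrative."""
    file_match = 1.0 if memory.get("file") == current_file else 0.0
    tag_overlap = len(current_tags & set(memory.get("tags", []))) / max(len(current_tags), 1)
    age_days = (now - memory["created"]) / 86400
    recency = 1.0 / (1.0 + age_days)            # newer memories score higher
    importance = memory.get("importance", 0.5)  # assumed 0..1 scale
    return 0.4 * file_match + 0.2 * tag_overlap + 0.2 * recency + 0.2 * importance

def top_memories(memories, current_file, current_tags, k=3):
    """Return the k highest-scoring memories for the file about to be edited."""
    now = time.time()
    return sorted(
        memories,
        key=lambda m: injection_score(m, current_file, current_tags, now),
        reverse=True,
    )[:k]
```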
## Install

```shell
git clone https://github.com/20alexl/claude-engram.git
cd claude-engram
python -m venv venv
source venv/bin/activate      # or venv\Scripts\activate on Windows
pip install -e .              # Core
pip install -e ".[semantic]"  # + AllMiniLM for vector search and decision capture
python install.py             # Configure hooks + MCP server
```
### Per-Project Setup

```shell
python install.py --setup /path/to/your/project
```

Or copy `.mcp.json` and `CLAUDE.md` to your project root.
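For reference, a minimal `.mcp.json` registering a stdio MCP server looks roughly like this; the server name and module entry point below are assumptions, so check what `install.py` actually writes:

```json
{
  "mcpServers": {
    "claude-engram": {
      "command": "python",
      "args": ["-m", "claude_engram.server"]
    }
  }
}
```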
### Mid-Project Adoption

Already deep in a project? Install normally, then tell your AI to dump what it knows:

```
Save everything you know about this project:
- memory(add_rule) for each project convention
- memory(remember) for key facts about the architecture
- work(log_decision) for decisions we've made and why
```
### Ollama (Optional)

Only needed for `scout_search`, `scout_analyze`, and LLM-based convention checking. Everything else works without it.

```shell
ollama pull gemma3:4b                   # or gemma3:12b for better semantic search
export CLAUDE_ENGRAM_MODEL="gemma3:4b"  # Linux/Mac
```
## Configuration

| Variable | Default | Description |
|---|---|---|
| `CLAUDE_ENGRAM_MODEL` | `gemma3:12b` | Ollama model |
| `CLAUDE_ENGRAM_OLLAMA_URL` | `http://localhost:11434` | Ollama endpoint |
| `CLAUDE_ENGRAM_ARCHIVE_DAYS` | `14` | Days until inactive memories archive |
| `CLAUDE_ENGRAM_SCORER_TIMEOUT` | `1800` | AllMiniLM server idle timeout (seconds) |
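All four settings are plain environment variables. For example, to point at a remote Ollama host and archive inactive memories sooner (the host address is illustrative):

```shell
export CLAUDE_ENGRAM_MODEL="gemma3:4b"
export CLAUDE_ENGRAM_OLLAMA_URL="http://192.168.1.50:11434"
export CLAUDE_ENGRAM_ARCHIVE_DAYS=7
```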
## Documentation
Library Book — design, internals, full usage guide, API reference, gotchas.
## License
MIT

