Local memory, shared learnings, and context routing for Hermes, Claude Code, Codex, Cursor, Gemini CLI, Aider, Cline, and any MCP client.
Dhee is the information layer through which your agents collaborate. When one agent creates a reusable learning, Dhee captures it as a candidate; once promoted, every connected agent can use it.
#1 on LongMemEval retrieval — R@1 94.8% · R@5 99.4% · R@10 99.8% on the full 500-question set. Reproduce it →
What is Dhee · Shared Agent Learning · DheeFS · Quick Start · Repo-Shared Context · Benchmarks · How It Works · vs Alternatives · Integrations
Dhee is the local information layer through which your agents collaborate. It runs on your machine, uses SQLite, plugs into Hermes, Claude Code, Codex, and any MCP client, and does four jobs the model can't do for itself:
- 🧠 Remembers. Doc chunks, decisions, what worked, what failed, user preferences. Ebbinghaus decay pushes stale knowledge out of the hot path; frequently-used memory gets promoted. Per-turn context stays bounded and relevant instead of becoming another giant prompt file.
- 🔁 Routes. A 10 MB `git log` becomes a compact digest with a pointer. Raw output only re-enters context when the model explicitly expands it. On heavy tool-output calls, this is where the 90%+ token reduction comes from.
- 🌱 Shares learnings. Hermes memory, session traces, and agent-created skills flow into Dhee as auditable learning candidates. Only promoted learnings appear as "Learned Playbooks" for Claude Code, Codex, Hermes, and any Dhee-enabled agent. No separate middleman agent.
- ⚙️ Self-tunes. Dhee watches which digests the model expands and which retrieval depths are useful, then tunes router policy per tool, per intent, per file type. The goal is not a bigger prompt; it is a smaller, better one.
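The decay-and-promotion behavior can be pictured as a simple retention curve. This is a hypothetical sketch, not Dhee's actual internals: the `half_life_days` default and the "use stretches the half-life" promotion rule are illustrative assumptions.

```python
import math
import time

def decay_score(base_relevance: float, last_used_ts: float,
                use_count: int, half_life_days: float = 7.0) -> float:
    """Ebbinghaus-style retention: the score falls off exponentially with
    time since last use, but frequent use stretches the half-life, which
    is what keeps hot memories in the retrieval path. (Illustrative only.)"""
    age_days = (time.time() - last_used_ts) / 86400
    # Each prior use lengthens the effective half-life ("promotion").
    effective_half_life = half_life_days * (1 + math.log1p(use_count))
    retention = 0.5 ** (age_days / effective_half_life)
    return base_relevance * retention

# A memory used 20 times decays far more slowly than a one-off note.
fresh = decay_score(1.0, time.time(), use_count=0)
stale = decay_score(1.0, time.time() - 30 * 86400, use_count=0)
hot = decay_score(1.0, time.time() - 30 * 86400, use_count=20)
assert fresh > hot > stale
```

The practical effect: a 30-day-old memory that was never reused drops out of the hot path, while one used constantly stays competitive with fresh entries.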
- Every Claude Code / Cursor / Codex / Gemini CLI / Aider / Cline user who has ever hit a context limit or a $200 token bill.
- Hermes users who already have a self-evolving agent and want those learnings to make Claude Code and Codex smarter too.
- Any team with a 2,000-line `CLAUDE.md`, a Skills library, an `AGENTS.md`, or a prompt library that's "too big for context." Stop pruning. Dhee handles delivery.
- Anyone who wants their team to share context through git — the same way they share code.
Hermes can evolve its own skills and memories. Claude Code has native hooks. Codex has MCP config, AGENTS.md, and a persisted session stream. Dhee is the information layer underneath them: it turns separate agent histories into shared, gated context.
Hermes MemoryProvider
├─ MEMORY.md / USER.md writes
├─ agent-created skills
├─ session summaries and outcomes
└─ self-evolution traces
│
▼
Dhee Learning Exchange
│
├─ candidate -> review / evidence / score
├─ promoted -> injected as Learned Playbooks
└─ rejected -> auditable, never injected
│
▼
Claude Code · Codex · Hermes · any MCP client
What this means in practice:
- Your existing Hermes progress is not stranded inside Hermes. `dhee install` detects Hermes when present, installs Dhee as a Hermes `MemoryProvider` at `~/.hermes/plugins/memory/dhee`, and imports local Hermes memory files, session summaries, and agent-created skills into Dhee.
- Claude Code and Codex do not need to launch Hermes to benefit. They receive promoted Hermes/Dhee learnings through normal Dhee context and MCP tools.
- New Claude Code and Codex outcomes can become Dhee learning candidates too. After promotion, Hermes can read them back through the same provider.
- Candidate learnings are never auto-injected. Trusted Hermes `MEMORY.md` / `USER.md` imports may be promoted during install; Hermes `SOUL.md`, session traces, and agent-created skills stay candidates until explicitly approved or promoted by policy.
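The gate above boils down to a status check before injection. A minimal sketch, assuming a simple status field (the `Learning` shape and field names are illustrative, not Dhee's schema):

```python
from dataclasses import dataclass

@dataclass
class Learning:
    text: str
    status: str   # "candidate" | "promoted" | "rejected" | "archived"
    source: str   # e.g. "hermes:MEMORY.md", "claude-code:session" (illustrative)

def injectable(learnings: list[Learning]) -> list[Learning]:
    """Only promoted learnings ever reach an agent's context."""
    return [l for l in learnings if l.status == "promoted"]

def auditable(learnings: list[Learning]) -> list[Learning]:
    """Everything else stays visible for review but is never injected."""
    return [l for l in learnings if l.status != "promoted"]

pool = [
    Learning("run git blame before editing hot files", "promoted", "hermes:MEMORY.md"),
    Learning("unreviewed session trace", "candidate", "hermes:session"),
    Learning("bad advice", "rejected", "claude-code:session"),
]
```

Only the first entry would appear as a Learned Playbook; the other two remain queryable for audit but never enter context.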
This is the product contract: with Dhee, a learning proven in one agent can become a promoted playbook for every connected agent.
- Hermes native: Dhee integrates as a Hermes `MemoryProvider`, the first-class Hermes memory-plugin surface. Hermes allows one active external memory provider, so V1 replaces Honcho/Mem0/etc. while `memory.provider: dhee` is active.
- Claude Code native: Dhee uses Claude Code hooks, MCP, and router enforcement. This is the strongest integration surface.
- Codex native: Codex does not expose Claude-style pre-tool hooks here. Dhee uses the closest native Codex surfaces: `~/.codex/config.toml`, the global `~/.codex/AGENTS.md`, MCP server instructions, and Codex session-stream auto-sync.
- Promotion gate: Imported Hermes skills and session traces are candidates by default. Rejected or archived learnings remain auditable but are excluded from retrieval.
Agents already understand files and shell verbs. DheeFS exposes Dhee's memory, router, handoff, artifacts, shared tasks, and learning exchange as one virtual context space:
```bash
dhee shell "ls /learnings"
dhee shell "cat /handoff/latest.md"
dhee shell "grep parser /learnings/promoted"
dhee shell "cat /router/ptr/R-abc123"
```

The first version is a virtual shell, not FUSE. It intentionally supports a small approved command set: `ls`, `cat`, `grep`, `why`, `promote`, `reject`, `broadcast`, `provision`, and `snapshot`. The same surface is available through MCP as `dhee_shell(command)` and through Python:
```python
from dhee import ContextWorkspace

result = ContextWorkspace(repo=".").execute("provision 'fix parser bug'")
print(result.stdout)
```

External systems such as Slack, Gmail, and Notion are future context sources under `/sources`, not generic remote action backends. They can sync and search evidence into Dhee artifacts, learnings, and handoffs without making the core install depend on SaaS SDKs.
| Path | Contents |
|---|---|
| `/learnings` | candidates, promoted, rejected, archived |
| `/handoff` | latest repo/session continuity |
| `/router/ptr` | raw pointer lookup when explicitly requested |
| `/artifacts` | host-parsed files and chunks |
| `/repo` | `.dhee/context` decisions and conventions |
| `/agents` | Hermes, Claude Code, Codex views |
| `/shared` | inbox, broadcasts, shared task results |
| `/sources` | optional future Slack/Gmail/Notion context mounts |
One command. No venv. No config. No pasting into settings.json.
```bash
curl -fsSL https://raw.githubusercontent.com/Sankhya-AI/Dhee/main/install.sh | sh
```

The installer creates `~/.dhee/`, installs the `dhee` package, and auto-wires Claude Code, Codex, and Hermes when detected. Open your agent in any project — cognition is on.
Other install paths
```bash
# Via pip
pip install dhee
dhee install    # configure supported agent harnesses

# From source
git clone https://github.com/Sankhya-AI/Dhee.git
cd Dhee && ./scripts/bootstrap_dev_env.sh
source .venv-dhee/bin/activate
dhee install
```

After install, Dhee auto-ingests project docs (`CLAUDE.md`, `AGENTS.md`, `SKILL.md`, etc.) on the first session. Run `dhee ingest` any time to re-chunk.
```bash
dhee install                            # configure local agent harnesses
dhee hermes status                      # see whether Hermes is detected and Dhee-backed
dhee hermes sync --dry-run              # preview Hermes memories/skills before import
dhee learn search --include-candidates  # inspect candidates and promotions
dhee link /path/to/repo                 # share context with teammates through this repo
dhee context refresh                    # refresh repo context after pull/checkout
dhee handoff                            # compact continuity for current repo/session
dhee key set openai                     # store a provider key locally (encrypted)
dhee router report                      # token-savings stats + replay projection
dhee router tune                        # re-tune retrieval policy from usage
```

Most "team memory" tools need a server. Dhee uses the one your team already trusts: git.
```bash
dhee link /path/to/repo
```

Dhee creates a tracked folder inside your repo:

```
<repo>/.dhee/
  config.json
  context/manifest.json
  context/entries.jsonl
```
Commit it. Teammates who pull the repo and have Dhee installed get the same shared context — decisions, conventions, what-not-to-do — surfaced into their agent automatically.
Shared context is append-only and git-friendly. If two developers edit overlapping context concurrently, Dhee keeps both versions and reports a conflict instead of silently dropping one developer's work. The installed pre-push hook blocks unresolved conflicts from leaving the laptop:
```bash
dhee context check --repo /path/to/repo
```

No hosted service. No org account. Your repo is the team brain.
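The conflict check can be pictured as a toy scan over the append-only `entries.jsonl`: two entries are in conflict when they share a key and neither descends from the other. This sketch assumes each entry carries an `id`, a `key`, and an optional `parent`; the field names are illustrative, not Dhee's on-disk schema.

```python
import json
from collections import defaultdict

def check_conflicts(entries_jsonl: str):
    """Find context keys with two concurrent 'head' versions — entries
    that no later entry cites as its parent. Both versions are kept and
    reported instead of one silently winning."""
    by_key = defaultdict(list)
    for line in entries_jsonl.splitlines():
        if line.strip():
            e = json.loads(line)
            by_key[e["key"]].append(e)
    conflicts = []
    for key, versions in by_key.items():
        heads = [v for v in versions
                 if not any(w.get("parent") == v["id"] for w in versions)]
        if len(heads) > 1:  # two developers wrote concurrently
            conflicts.append((key, heads))
    return conflicts
```

A linear edit chain (each entry citing the previous one as `parent`) produces no conflicts; a fork produces one conflict listing both heads, which is what a pre-push hook can refuse to ship.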
#1 on LongMemEval recall. R@1 94.8%, R@5 99.4%, R@10 99.8% — full 500 questions, no held-out split, no cherry-picking.
| System | R@1 | R@3 | R@5 | R@10 |
|---|---|---|---|---|
| Dhee | 94.8% | 99.0% | 99.4% | 99.8% |
| MemPalace (raw) | — | — | 96.6% | — |
| MemPalace (hybrid v4, held-out 450q) | — | — | 98.4% | — |
| agentmemory | — | — | 95.2% | 98.6% |
Stack: NVIDIA llama-nemotron-embed-vl-1b-v2 embedder + llama-3.2-nv-rerankqa-1b-v2 reranker, top-k 10.
Proof is in-tree, not screenshots. Exact command, metrics, and per-question output live under benchmarks/longmemeval/. Recompute R@k yourself — any mismatch is a bug you can open.
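Recomputing R@k from per-question output is a short loop: count the questions whose gold evidence appears in the top-k retrieved chunks. This sketch assumes JSONL rows with `retrieved_ids` (ranked) and `gold_ids` fields; check the real schema under `benchmarks/longmemeval/` before relying on these names.

```python
import json

def recall_at_k(per_question_jsonl: str, k: int) -> float:
    """R@k = fraction of questions with at least one gold chunk in the
    top-k retrieved results. Field names are illustrative assumptions."""
    hits = total = 0
    for line in per_question_jsonl.splitlines():
        if not line.strip():
            continue
        q = json.loads(line)
        total += 1
        if any(rid in q["gold_ids"] for rid in q["retrieved_ids"][:k]):
            hits += 1
    return hits / total
```

Run it over the 500-question file for k in {1, 3, 5, 10} and compare against the table above.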
┌──────────────────────────────┐
│ Your fat context │
│ CLAUDE.md · AGENTS.md · │
│ SKILL.md · prompts · docs · │
│ sessions · tool output │
└──────────────┬─────────────────┘
│ ingest once
▼
┌────────────────────────────────────────────────────┐
│ Dhee · local SQLite brain │
│ │
│ doc chunks · short-term · long-term · insights · │
│ beliefs · policies · intentions · episodes · edits │
└─────────────────────┬───────────────────────────────┘
│
┌──────────────┴───────────────┐
▼ ▼
Session start Each user prompt
(full assembly) (matching slice only)
│ │
└──────────────┬───────────────┘
▼
┌────────────────────────────┐
│ Token-budgeted XML │
│ <dhee v="1"> │
│ <doc src="CLAUDE.md"…/> │
│ <i>What worked last…</i> │
│ </dhee> │
└────────────────────────────┘
│
Model sees only what it
needs, when it needs it.
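The assembly step amounts to a greedy knapsack over scored blocks: take the highest-scoring chunks until the token budget is spent, then emit the compact envelope. A minimal sketch; the chars/4 token estimate, the block tuple shape, and the greedy policy are assumptions for illustration, not Dhee's actual assembler.

```python
def assemble_context(blocks, budget_tokens: int = 300) -> str:
    """Greedily pack the best-scoring blocks under a token budget and
    render the <dhee v="1"> envelope the model actually sees.
    blocks: (tag, src, text, score) tuples — an assumed shape."""
    est = lambda s: len(s) // 4  # crude chars-per-token heuristic
    out, used = [], 0
    for tag, src, text, score in sorted(blocks, key=lambda b: -b[3]):
        cost = est(text)
        if used + cost > budget_tokens:
            continue  # block doesn't fit; try cheaper, lower-scored ones
        used += cost
        attr = f' src="{src}"' if src else ""
        out.append(f"  <{tag}{attr}>{text}</{tag}>")
    return '<dhee v="1">\n' + "\n".join(out) + "\n</dhee>"
```

A 1,000-token doc chunk simply never enters a 300-token envelope; a high-scoring one-line insight always does. That is the "matching slice only" behavior per prompt.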
On the tool-use side, the router digests raw output at source — never letting raw Read, Bash, or subagent results into context unless the model asks.
Every interface — hooks, MCP, Python, CLI — exposes the same four operations.
```python
from dhee import Dhee

d = Dhee()
d.remember("User prefers FastAPI over Flask")
d.recall("what framework does this project use?")
d.context("fixing the auth bug")
d.checkpoint("Fixed auth bug", what_worked="git blame first", outcome_score=1.0)
```

| Operation | LLM calls | Cost |
|---|---|---|
| `remember` / `recall` / `context` | 0 | ~$0.0002 |
| `checkpoint` | 1 per ~10 memories | ~$0.001 |
| Typical 20-turn Opus session | ~1 | ~$0.004 |
Dhee overhead: $0.004/session. Token savings on the same 20-turn session: **$0.50+**. >100× ROI.
Four MCP tools replace Read / Bash / Agent on heavy calls:
- `dhee_read(file_path, offset?, limit?)` — symbols, head, tail, kind, token estimate + pointer.
- `dhee_bash(command)` — output digested by class (git log, pytest, grep, listing, generic).
- `dhee_agent(text)` — file refs, headings, bullets, error signals from any subagent return.
- `dhee_expand_result(ptr)` — only called when the digest genuinely isn't enough.
A 10 MB `git log --oneline -50000` becomes a ~200-token digest. This is where the serious savings live.
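The mechanics behind that reduction can be sketched as digest-plus-pointer: park the raw output behind a pointer, hand the model a head/tail summary, and only dereference on request. A toy version, assuming an in-memory pointer store (the real router digests by output class and persists pointers):

```python
import hashlib

POINTERS: dict[str, str] = {}  # ptr -> full raw output, expanded only on request

def digest(raw: str, head_lines: int = 5, tail_lines: int = 3) -> str:
    """Store the raw output behind a pointer and return a compact digest.
    The model sees only the digest unless it calls expand(ptr)."""
    ptr = "R-" + hashlib.sha1(raw.encode()).hexdigest()[:8]
    POINTERS[ptr] = raw
    lines = raw.splitlines()
    if len(lines) <= head_lines + tail_lines:
        return raw  # already small; no point digesting
    elided = len(lines) - head_lines - tail_lines
    body = lines[:head_lines] + [f"… {elided} lines elided …"] + lines[-tail_lines:]
    return "\n".join(body) + f"\n[full output: {ptr}]"

def expand(ptr: str) -> str:
    """The escape hatch: dereference the pointer to the full raw output."""
    return POINTERS[ptr]
```

A thousand-line log collapses to roughly a dozen lines plus a pointer; nothing is lost, because `expand(ptr)` recovers the original byte-for-byte.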
Most memory layers are static: you write rules, they retrieve. Dhee watches what happens and tunes itself.
- Intent classification. Every `Read`/`Bash`/`Agent` call is bucketed (source, test, config, doc, data, build). Each bucket gets its own retrieval depth.
- Expansion ledger. Every `dhee_expand_result(ptr)` is logged with `(tool, intent, depth)`.
- Policy tuning. `dhee router tune` reads the ledger and atomically rewrites `~/.dhee/router_policy.json` — deeper for what gets expanded, shallower for what doesn't.
Frontend-heavy teams get deeper JS/TS digests. Data teams get richer CSV/JSONL summaries. You don't pick — Dhee picks, based on what you actually expand.
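The tuning loop above is small: read the expansion ledger, bump depth for buckets the model keeps expanding, and rewrite the policy file atomically. A hypothetical sketch; the ledger shape, the 1–5 depth scale, and the three-expansions threshold are all illustrative assumptions.

```python
import json
import os
import tempfile
from collections import Counter

def tune_policy(ledger: list[dict], policy_path: str) -> dict:
    """Deepen digests for (tool, intent) buckets that get expanded
    repeatedly; leave quiet buckets alone. Atomic rewrite via a temp
    file + os.replace so a crash never leaves a half-written policy."""
    try:
        with open(policy_path) as f:
            policy = json.load(f)
    except FileNotFoundError:
        policy = {}
    expands = Counter((e["tool"], e["intent"]) for e in ledger)
    for (tool, intent), n in expands.items():
        key = f"{tool}:{intent}"
        depth = policy.get(key, 1)
        # Assumed rule: >= 3 expansions means the digest was too shallow.
        policy[key] = min(depth + 1, 5) if n >= 3 else depth
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(policy_path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(policy, f, indent=2)
    os.replace(tmp, policy_path)  # atomic rename on POSIX
    return policy
```

The atomic-rename pattern matters here: the policy file is read on every routed call, so a partially written JSON file would break routing until the next tune.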
| | Dhee | CLAUDE.md | Mem0 | Letta | MemPalace | agentmemory |
|---|---|---|---|---|---|---|
| Tokens / turn | ~300 | 2,000+ | varies | ~1K+ | varies | ~1,900 |
| LongMemEval R@5 | 99.4% | — | — | — | 96.6% | 95.2% |
| Self-tuning retrieval | Yes | No | No | No | No | No |
| Hermes → Claude/Codex learning exchange | Yes | No | No | No | No | No |
| Auto-digest tool output | Yes | No | No | No | No | No |
| Git-shared team context | Yes | Manual | No | No | No | No |
| Works across MCP agents | Yes | No | Partial | No | Yes | Yes |
| External DB required | No (SQLite) | No | Qdrant/pgvector | Postgres+vector | No | No |
| License | MIT | — | Apache-2 | Apache-2 | MIT | MIT |
Dhee combines token reduction, reproducible recall benchmarks, self-tuning retrieval policy, git-shared team context, and promoted cross-agent learning in one local-first collaboration layer.
```bash
dhee install                 # detects Hermes and enables Dhee when present
dhee hermes status
dhee hermes sync --dry-run
```

Dhee installs as the Hermes memory provider, mirrors Hermes memory writes, imports local Hermes memory files, and checkpoints Hermes sessions into Dhee learning candidates. Curated `MEMORY.md` / `USER.md` imports can be promoted on install; `SOUL.md`, session traces, and agent-created skills stay gated. Promoted playbooks flow back into Hermes through the provider and out to Claude Code/Codex through Dhee context.
```bash
pip install dhee && dhee install
```

Six lifecycle hooks fire at the right moments. Claude Code gets Dhee handoff, shared tasks, inbox broadcasts, learned playbooks, and router enforcement for heavy Read/Bash/Grep calls.
```bash
pip install dhee && dhee install --harness codex
dhee harness status --harness codex
```

Dhee writes `~/.codex/config.toml`, manages a global `~/.codex/AGENTS.md` block, advertises context-first MCP instructions, and tails Codex session logs on Dhee calls. Codex does not currently expose Claude-style pre-tool hooks, so this is the strongest truthful native integration available.
```json
{
  "mcpServers": {
    "dhee": { "command": "dhee-mcp" }
  }
}
```

```bash
dhee remember "User prefers Python"
dhee recall "programming language"
dhee ingest CLAUDE.md AGENTS.md
dhee checkpoint "Fixed auth" --what-worked "checked logs"
```

```bash
pip install dhee[openai,mcp]   # cheapest embeddings
pip install dhee[nvidia,mcp]   # current SOTA stack
pip install dhee[gemini,mcp]
pip install dhee[ollama,mcp]   # local, no API costs
```

| | Public Dhee (this repo, MIT) | Dhee Enterprise (private) |
|---|---|---|
| Local memory + router | ✅ | ✅ |
| Self-tuning retrieval | ✅ | ✅ |
| Hermes → Claude Code/Codex learning exchange | ✅ | ✅ |
| Git-shared repo context | ✅ | ✅ |
| Claude Code / Codex / MCP | ✅ | ✅ |
| Org / team management | — | ✅ |
| Repo Brain code-intelligence | — | ✅ |
| Owner dashboard, billing, licensing | — | ✅ |
| Sentry-derived security telemetry | — | ✅ |
Public Dhee is the local collaboration layer — lightweight, trustworthy, and complete on its own. The commercial layer is closed-source and lives in Sankhya-AI/dhee-enterprise.
What problem does Dhee solve?
Large agent projects accumulate a fat CLAUDE.md, AGENTS.md, skills library, and tool output that get re-injected every turn. Dhee chunks, indexes, and decays that knowledge, and digests fat tool output at the source — so only the relevant ~300 tokens reach the model.
How is Dhee different from Mem0, Letta, MemPalace, agentmemory?
Dhee is built around four pieces most tools treat separately: reproducible LongMemEval results, a self-tuning retrieval/router policy, source-side digests for heavy Read/Bash/subagent output, and git-shared team context instead of a server.
Does Dhee work with Claude Code, Cursor, Codex, Gemini CLI, Aider?
Yes. Native Claude Code hooks, closest-native Codex config/AGENTS/session-stream sync, a Hermes MemoryProvider, an MCP server for every other host, plus a Python SDK and CLI. One install, every agent.
Does Hermes make Claude Code and Codex smarter?
Yes, through Dhee's learning exchange after promotion. Dhee can install as Hermes' memory provider, import Hermes memory/session/skill artifacts, and expose promoted learnings to Claude Code, Codex, and any MCP client as Learned Playbooks. Claude/Codex do not have to run Hermes to benefit.
Does Claude Code or Codex evolve Hermes back?
Yes, after promotion. Claude Code hooks, Codex session-stream sync, MCP memory tools, and learning submissions create Dhee learning candidates. Promoted personal/repo/workspace playbooks are retrieved by Hermes through the Dhee provider.
How does the team-context sharing actually work?
`dhee link /path/to/repo` writes a `.dhee/` directory inside your repo. Commit it. Teammates pull, install Dhee, and their agent surfaces the same shared decisions and conventions. Append-only with conflict detection — no overwrites, no server, no account.
Is Dhee production-ready? What storage?
SQLite by default. No Postgres, no Qdrant, no pgvector, no infra. The regression suite and reproducible benchmarks live in-tree. MIT, works offline with Ollama or online with OpenAI / NVIDIA NIM / Gemini.
Where are the benchmarks and can I reproduce them?
`benchmarks/longmemeval/` — full command, per-question JSONL, `metrics.json`. Clone, run, recompute R@k. Any mismatch is an issue you can open.
```bash
git clone https://github.com/Sankhya-AI/Dhee.git
cd Dhee && ./scripts/bootstrap_dev_env.sh
source .venv-dhee/bin/activate
pytest
```

For the same full-suite path CI expects, including the local Rust acceleration extension and async test plugin:

```bash
./scripts/verify_full_suite.sh
```
Your fat skills stay fat. Your token bill stays thin. Promoted learnings travel with every agent.
GitHub ·
PyPI ·
Issues ·
Sankhya AI
MIT License — built by Sankhya AI Labs.
Topics: ai-agents · agent-memory · llm-memory · developer-brain · claude-code · claude-code-hooks · claudemd · agentsmd · mcp · mcp-server · model-context-protocol · context-router · context-engineering · context-compression · token-optimization · llm-tools · vector-memory · sqlite · longmemeval · retrieval-augmented-generation · rag · mem0-alternative · letta-alternative · mempalace-alternative · cursor · codex · gemini-cli · aider · cline · goose

