Honcho is memory infrastructure for building stateful agents that understand changing people, agents, groups, projects, and ideas over time.
Store messages and events, let Honcho reason in the background, then query peer representations, session context, search results, or natural-language insights from any model or framework. Use it managed at api.honcho.dev or self-host the FastAPI server yourself.
Using Honcho as your memory system can earn your agents higher retention and more trust, and help you build data moats to out-compete incumbents.
Honcho has defined the Pareto Frontier of Agent Memory. Watch the video, check out our evals page, and read the blog post for more detail.
The Honcho project is split across several repositories; this one hosts the core service logic, implemented as a FastAPI server. Client SDKs for Python and TypeScript live in the sdks/ directory.
| I want to... | Path | Get started |
|---|---|---|
| Give my coding agent persistent memory | Claude Code, OpenCode, OpenClaw, Hermes, or any MCP client | Integrations |
| Add memory to my product | Python or TypeScript SDK | Quickstart |
| Self-host Honcho | Docker / local development | Self-hosting |
| Capability | What it means |
|---|---|
| Reasoning-first memory | Extracts conclusions from conversations and events, not just matching chunks. |
| Peer-centric model | Tracks users, agents, groups, projects, and ideas as entities that change over time. |
| Multi-peer perspective | Models what one peer knows about another when configured. |
| Managed or self-hosted | Use api.honcho.dev or run the FastAPI server yourself. |
| Agent-tool integrations | MCP, Claude Code, OpenCode, OpenClaw, Hermes, Cursor-compatible clients. |
Concretely: workspaces hold peers, peers participate in sessions, messages live on sessions, and Honcho builds a per-peer representation that you query through the Chat Endpoint or directly.
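The hierarchy described above can be pictured with a small in-memory sketch. This is not Honcho's implementation or its SDK: the class names `Workspace`, `Session`, and `Message`, and all of their fields and methods, are hypothetical and exist only to mirror the sentence above.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    peer_id: str  # the peer who sent the message
    content: str


@dataclass
class Session:
    id: str
    peer_ids: set[str] = field(default_factory=set)
    messages: list[Message] = field(default_factory=list)


@dataclass
class Workspace:
    id: str
    peers: set[str] = field(default_factory=set)
    sessions: dict[str, Session] = field(default_factory=dict)

    def add_message(self, session_id: str, peer_id: str, content: str) -> None:
        # Workspaces hold peers; peers participate in sessions;
        # messages live on sessions.
        self.peers.add(peer_id)
        session = self.sessions.setdefault(session_id, Session(session_id))
        session.peer_ids.add(peer_id)
        session.messages.append(Message(peer_id, content))

    def messages_from(self, peer_id: str) -> list[Message]:
        # Everything one peer has said across all sessions: the raw
        # material a per-peer representation would be derived from.
        return [
            m
            for s in self.sessions.values()
            for m in s.messages
            if m.peer_id == peer_id
        ]
```

The point of the sketch is only the shape of the relationships: a peer is not tied to a single session, so anything derived per peer has to aggregate across every session that peer took part in.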
Get an API key at app.honcho.dev. When you sign up, you'll be prompted to join an organization, which gets its own dedicated Honcho instance and $100 in free credits. Or self-host and run against http://localhost:8000.
pip install honcho-ai
# or: uv add honcho-ai
# or: poetry add honcho-ai
import os
from honcho import Honcho
# Managed service uses api.honcho.dev by default. For self-hosted, pass
# base_url="http://localhost:8000" or set HONCHO_URL.
honcho = Honcho(
workspace_id="my-app-testing",
api_key=os.environ["HONCHO_API_KEY"],
)
# 1. Store: peers and messages on a session
alice = honcho.peer("alice")
tutor = honcho.peer("tutor")
session = honcho.session("session-1")
session.add_messages([
alice.message("Hey there — can you help me with my math homework?"),
tutor.message("Absolutely. Send me your first problem!"),
])
# 2. Reason: happens asynchronously in the background.
# 3. Query: ask Honcho what it knows, or pull prompt-ready context.
answer = alice.chat("What learning styles does the user respond to best?")
context = session.context(summary=True, tokens=10_000)
# 4. Inject: hand the context to your model of choice.
from openai import OpenAI
client = OpenAI()
completion = client.chat.completions.create(
model=os.environ.get("OPENAI_MODEL", "gpt-4o-mini"),
messages=context.to_openai(assistant=tutor),
)
npm install @honcho-ai/sdk
# or: bun add @honcho-ai/sdk
import { Honcho } from "@honcho-ai/sdk";
import OpenAI from "openai";
const honcho = new Honcho({
workspaceId: "my-app-testing",
apiKey: process.env.HONCHO_API_KEY,
});
const alice = await honcho.peer("alice");
const tutor = await honcho.peer("tutor");
const session = await honcho.session("session-1");
await session.addMessages([
alice.message("Hey there — can you help me with my math homework?"),
tutor.message("Absolutely. Send me your first problem!"),
]);
const answer = await alice.chat(
"What learning styles does the user respond to best?",
);
const context = await session.context({ summary: true, tokens: 10_000 });
const openai = new OpenAI();
const completion = await openai.chat.completions.create({
model: process.env.OPENAI_MODEL ?? "gpt-4o-mini",
messages: context.toOpenAI({ assistant: tutor }),
});
Note: background reasoning is asynchronous. Newly added messages may take a moment to be reflected in chat/representation responses; for low-latency reads, use the representation endpoint.
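Because reasoning runs in the background, code that writes messages and immediately reads derived results may want to wait briefly. Below is a minimal, generic polling sketch using only the standard library. The helper `wait_until` and its parameters are hypothetical and not part of the Honcho SDK; the `is_done` callable is supplied by the caller.

```python
import time
from typing import Callable


def wait_until(
    is_done: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
    max_interval: float = 5.0,
) -> bool:
    """Poll is_done() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout. The sleep
    between polls doubles each round, capped at `max_interval`.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_done():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval, remaining))
        interval = min(interval * 2, max_interval)
```

In practice `is_done` could wrap a call such as `honcho.queue_status(...)`. The shape of that call's return value is not shown in this README, so how to interpret it is left to the caller here.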
| Need | API |
|---|---|
| Save interaction history | session.add_messages(...) |
| Ask what Honcho knows about a peer | peer.chat(...) |
| Get prompt-ready context | session.context(...).to_openai(...) / .to_anthropic(...) |
| Hybrid search (BM25 + vector) | peer.search(...), session.search(...), honcho.search(...) |
| Low-latency static representations | peer.representation(...), session.representation(...) |
| Import documents | session.upload_file(...) |
| Inspect background processing | honcho.queue_status(...) |
See the full SDK Reference and API Reference.
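The hybrid search row in the table above pairs BM25 with vector search. This README does not say how Honcho merges the two result lists, so the sketch below uses reciprocal rank fusion purely as an illustration of one common way to combine rankings. The function name and the constant `k=60` are assumptions, not Honcho's method.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # keyword ranking (made-up IDs)
vector_hits = ["b", "c", "d"]  # embedding ranking (made-up IDs)
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Rank-based fusion avoids having to normalize BM25 scores and cosine similarities onto a shared scale, which is why it is a frequent default for hybrid retrieval.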
Two ways, depending on how deep you want to go:
Plugin (richer integration — recommended for Claude Code users):
/plugin marketplace add plastic-labs/claude-honcho
/plugin install honcho@honcho
Raw MCP (works in any MCP client — Cursor, Cline, Windsurf, etc.):
claude mcp add honcho \
--transport http \
--url "https://mcp.honcho.dev" \
--header "Authorization: Bearer hch-your-key-here" \
--header "X-Honcho-User-Name: YourName"
Details: Claude Code guide · MCP guide.
opencode plugin "@honcho-ai/opencode-honcho" --global
Details: OpenCode guide.
openclaw plugins install @honcho-ai/openclaw-honcho
openclaw honcho setup
openclaw gateway --force
openclaw honcho setup prompts for your API key, writes the config, and optionally migrates legacy MEMORY.md / USER.md / IDENTITY.md files into Honcho (non-destructive — originals are never deleted). Details: OpenClaw guide.
hermes memory setup # select "honcho", point at api.honcho.dev or your local server
Details: Hermes guide.
For wiring the Honcho SDK into an existing application, install the integration skill — it explores your codebase, asks about integration preferences, generates the SDK setup, and verifies it works:
npx skills add plastic-labs/honcho
Then invoke /honcho-integration in Claude Code (or /honcho-dev:integrate via the plugin marketplace). Details: agentic development guide.
The same claude mcp add form (or its client-specific equivalent) works in any MCP-compatible client. See MCP guide.
Honcho organises everything around peers — humans and AI agents alike are first-class entities. The peer model enables:
Peers exchange messages within sessions; Honcho reasons over those messages to build a representation of each peer that you can query.
What you query out of Honcho:
Internal storage (Collections & Documents)
Internally, Honcho stores peer-related observations in collections of vector-embedded documents. Collections are keyed by (observer, observed) peer pairs — the same mechanism powers self-representation (observer == observed) and cross-peer modelling (peer X's understanding of peer Y). These primitives are not exposed directly; the Conclusions API is the public surface.
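The (observer, observed) keying can be sketched with a plain dictionary. This is an illustration only: `ObservationStore` and its methods are hypothetical, embeddings are omitted entirely, and nothing here reflects Honcho's actual schema.

```python
from collections import defaultdict


class ObservationStore:
    """Toy store of observations keyed by (observer, observed) peer pairs."""

    def __init__(self) -> None:
        self._collections: dict[tuple[str, str], list[str]] = defaultdict(list)

    def add(self, observer: str, observed: str, text: str) -> None:
        self._collections[(observer, observed)].append(text)

    def get(self, observer: str, observed: str) -> list[str]:
        # Return a copy so callers cannot mutate internal state.
        return list(self._collections.get((observer, observed), []))

    def self_representation(self, peer: str) -> list[str]:
        # Self-representation is just the observer == observed case.
        return self.get(peer, peer)


store = ObservationStore()
store.add("alice", "alice", "prefers worked examples")
store.add("tutor", "alice", "struggles with fractions")
```

Because the key is an ordered pair, `("tutor", "alice")` and `("alice", "tutor")` are separate collections, which is what lets one mechanism cover both self-representation and cross-peer modelling.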
Honcho's evals span LongMemEval, LoCoMo, and other long-conversation benchmarks. See the evals page, the research blog post, and the Pareto-frontier announcement video for methodology and reproducible results.
Honcho is open source under AGPL-3.0. You can run the full server locally with Docker, then point the SDKs at http://localhost:8000.
git clone https://github.com/plastic-labs/honcho.git
cd honcho
cp docker-compose.yml.example docker-compose.yml
cp .env.template .env # fill in LLM_GEMINI_API_KEY / LLM_ANTHROPIC_API_KEY / LLM_OPENAI_API_KEY
docker compose up
Then point the SDKs at it:
honcho = Honcho(workspace_id="my-app-testing", base_url="http://localhost:8000")
# or: export HONCHO_URL=http://localhost:8000
Local development without Docker
Below is a guide on setting up a local environment for running the Honcho Server without Docker.
Honcho is developed using Python and uv.
The minimum Python version is 3.10.
The minimum uv version is 0.5.0.
Once the dependencies are installed, run the following steps to set up the local project.
```bash git clone https://github.com/plastic-
$ claude mcp add honcho \
-- python -m otcore.mcp_server <graph>