
TencentDB Agent Memory = symbolic short-term memory + layered long-term memory.
- Symbolic short-term memory offloads heavy tool logs and condenses them into compact Mermaid symbols, cutting token usage and improving task success.
- Layered long-term memory distills fragmented conversations into structured personas and scenes, instead of flat vector piles.
When integrated with OpenClaw, it cuts token usage by up to 61.38%, improves pass rate by 51.52% (relative), and raises PersonaMem accuracy from 48% to 76%.
| Memory Capability | Benchmark | OpenClaw Success | With Plugin Success | Success Δ (relative) | OpenClaw Tokens | With Plugin Tokens | Token Δ (relative) |
|---|---|---|---|---|---|---|---|
| Short-term | WideSearch | 33% | 50% | +51.52% | 221.31M | 85.64M | −61.38% |
| Short-term | SWE-bench | 58.4% | 64.2% | +9.93% | 3474.1M | 2375.4M | −33.09% |
| Short-term | AA-LCR | 44.0% | 47.5% | +7.95% | 112.0M | 77.3M | −30.98% |
| Long-term | PersonaMem | 48% | 76% | +59% | — | — | — |
These results are measured over continuous long-horizon sessions, not isolated turns. For example, SWE-bench runs 50 consecutive tasks per session to simulate the context-accumulation pressure of real-world long-horizon agents.
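The relative deltas in the table appear consistent with the usual relative-change formula, (with plugin − baseline) / baseline, for the rows checked below; other rows may have been computed from unrounded figures. A minimal sketch, where the function name `relative_delta` is ours and not part of the project:

```python
def relative_delta(baseline: float, with_plugin: float) -> float:
    """Relative change from baseline to with_plugin, in percent."""
    return (with_plugin - baseline) / baseline * 100.0


# WideSearch success rate: 33% -> 50%
print(f"{relative_delta(33, 50):+.2f}%")       # +51.52%
# AA-LCR token usage: 112.0M -> 77.3M
print(f"{relative_delta(112.0, 77.3):+.2f}%")  # -30.98%
```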
Memory is not about hoarding everything in the AI — it is about sparing humans from having to repeat themselves.
In practice, we constantly re-explain the same SOPs, project background, tool conventions, and output formats to the Agent. Such information should not require repetition, nor should it be indiscriminately dumped into the context.
TencentDB Agent Memory helps the Agent learn your workflows, retain task context, and reuse past experience. We reject both brute-force history accumulation and irreversible lossy summarization. Instead, we design memory as a layered system: symbolic memory for in-task information overload, and memory layering for cross-session experience.
Let the Agent remember what should be remembered, so people can focus on judgment, creation, and work that truly matters.
Our architecture rests on two pillars: memory layering and symbolic memory. Together they ensure Agents do not merely "remember more", but "reason better".
Traditional memory systems shred data into fragments and dump them into a flat vector store. Recall degenerates into a blind search across disconnected fragments, with no macro-level guidance.
Whether it is long-term knowledge, short-term tasks, or future skill capabilities, memory should never be flat — both its formation and its recall must be hierarchical. TencentDB Agent Memory adopts layering as its unified architectural paradigm:
Short-term symbolic memory follows the same pattern: the bottom layer keeps the offloaded raw text (refs/*.md); the middle layer extracts step-level summaries (jsonl); the top layer condenses state into a lightweight Mermaid canvas. The Agent only needs to attend to the top-layer structure in context, and drills down to the lower layers via node_id when an error occurs.
Heterogeneous storage and progressive disclosure. A dual-layer storage strategy underpins this architecture. The bottom layer (facts, logs, traces) is persisted in databases for robust full-text retrieval; the top layer (personas, scenes, canvases) is stored as human-readable Markdown files for high information density and white-box inspection. Lower layers preserve evidence; upper layers preserve structure.
Full traceability and lossless recovery. Compression often sacrifices traceability. TencentDB Agent Memory avoids irreversible compression by maintaining a deterministic path from high-level abstractions back to ground-truth evidence. Whether it is an offloaded error log or a distilled user preference, the system guarantees a complete drill-down path: "top-layer symbol (Persona / canvas) → mid-layer index (Scenario / jsonl) → bottom-layer raw text (L0 Conversation / refs)".
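The drill-down path can be pictured as three maps linked by ids. The sketch below is illustrative only: the class names, field names, and ids are hypothetical, and the real system persists these layers in databases and Markdown files rather than in Python dictionaries.

```python
from dataclasses import dataclass


@dataclass
class PersonaTrait:              # top layer: distilled symbol
    text: str
    scene_ids: list[str]         # pointers into the mid layer


@dataclass
class Scene:                     # mid layer: index over evidence
    summary: str
    conversation_ids: list[str]  # pointers into the bottom layer


# Bottom layer: raw L0 conversation text, kept as-is.
conversations = {
    "conv-1": "user: I prefer TypeScript for tool scripts.",
    "conv-2": "user: Please keep the CLI helpers in TypeScript too.",
}
scenes = {
    "scene-tooling": Scene("Language choices for tooling", ["conv-1", "conv-2"]),
}
persona = [PersonaTrait("Prefers TypeScript for scripts", ["scene-tooling"])]


def drill_down(trait: PersonaTrait) -> list[str]:
    """Follow top -> mid -> bottom pointers and return the raw evidence."""
    evidence = []
    for scene_id in trait.scene_ids:
        for conv_id in scenes[scene_id].conversation_ids:
            evidence.append(conversations[conv_id])
    return evidence


for line in drill_down(persona[0]):
    print(line)
```

Because every upper-layer entry stores ids rather than copies, the abstraction can stay small while the evidence behind it remains reachable.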

In long tasks, the largest token consumers are verbose intermediate logs (search results, code, error traces). To address this, we combine context offloading with symbolic memory:
node_id tracing. The Agent reasons over the symbol graph; to verify a detail, it greps for the node_id and instantly retrieves the full raw text, cutting token cost while preserving full traceability.

```mermaid
graph LR
    Log["Verbose Logs<br/>(hundreds of thousands of tokens)"] -->|"1. Offload full text"| FS[("External FS<br/>(refs/*.md)")]
    Log -->|"2. Extract relations"| MMD["Mermaid Canvas<br/>(with node_id)"]
    MMD -->|"3. Light injection"| Agent(("Agent Context<br/>(a few hundred tokens)"))
    Agent -. "4. Recall via node_id" .-> FS
    style Log fill:#f1f5f9,stroke:#94a3b8,stroke-dasharray: 5 5,color:#475569
    style FS fill:#f8fafc,stroke:#cbd5e1,stroke-width:2px,color:#334155
    style MMD fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#1e3a8a
    style Agent fill:#fffbeb,stroke:#f59e0b,stroke-width:2px,color:#92400e
```
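The four steps in the diagram can be sketched in a few lines of Python. This is a simplified illustration, not the plugin's implementation: the one-file-per-node layout (`refs/<node_id>.md`), the function names, and the sample logs are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def offload(refs_dir: Path, node_id: str, raw_log: str) -> Path:
    """Step 1: write the full log to refs/<node_id>.md, outside the context."""
    refs_dir.mkdir(parents=True, exist_ok=True)
    path = refs_dir / f"{node_id}.md"
    path.write_text(raw_log, encoding="utf-8")
    return path


def canvas(steps: list[tuple[str, str]]) -> str:
    """Steps 2-3: build a compact Mermaid graph whose nodes carry the node_ids."""
    lines = ["graph LR"]
    for node_id, label in steps:
        lines.append(f'    {node_id}["{label}"]')
    for (src, _), (dst, _) in zip(steps, steps[1:]):
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


def recall(refs_dir: Path, node_id: str) -> str:
    """Step 4: fetch the raw text for one node only when it is needed."""
    return (refs_dir / f"{node_id}.md").read_text(encoding="utf-8")


refs = Path(tempfile.mkdtemp()) / "refs"
offload(refs, "n1", "search results\n" + "result row\n" * 1000)
offload(refs, "n2", "Traceback (most recent call last):\n  ValueError: bad id")
graph = canvas([("n1", "search: 1000 rows"), ("n2", "parse: ValueError")])
print(graph)                                 # what the Agent keeps in context
print(recall(refs, "n2").splitlines()[-1])   # drill down on the failing step
```

The canvas stays a few lines long no matter how large the offloaded logs grow, which is where the token savings come from.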
This section covers standalone local mode only: the Memory Gateway runs locally (or inside the container), uses local SQLite + BM25 by default, and does not depend on any cloud Memory service.
By default, `TDAI_GATEWAY_API_KEY` is not set, so the Gateway does not enforce a fixed Bearer secret. The v2 SDK still sends non-empty `apiKey` and `serviceId` headers as part of the protocol; for standalone mode, use `apiKey = "local"` and `serviceId = "default"`.
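For readers unfamiliar with BM25, the sketch below shows how this kind of lexical ranking scores stored memories against a query. It is a generic Okapi BM25 with whitespace tokenization, written for illustration; it is not the Gateway's actual implementation, and the sample memories are made up.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


memories = [
    "user prefers typescript for tool scripts",
    "project deploys with docker on arm64",
    "weekly report uses a markdown table",
]
scores = bm25_scores("typescript scripts", memories)
best = max(range(len(memories)), key=scores.__getitem__)
print(memories[best])
```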
The images are published on Docker Hub with multi-arch tags, so Docker automatically selects the correct architecture (amd64 / arm64). No local image build is required:
```bash
docker pull agentmemory/hermes-memory:1.0.0-beta
docker pull agentmemory/openclaw-memory:1.0.0-beta
```
Defaults to a local SQLite + sqlite-vec backend.
```jsonc
// ~/.openclaw/openclaw.json
{
  "memory-tencentdb": {
    "enabled": true
  }
}
```
Once enabled, TencentDB Agent Memory automatically handles conversation capture, memory extraction, scene aggregation, persona generation, and recall before the next turn.
```json
{
  "memory-tencentdb": {
    "config": {
      "offload": {
        "enabled": true
      }
    }
  }
}
```
Add the slots field so OpenClaw routes context-offload requests to this plugin:
```json
{
  "plugins": {
    "slots": {
      "contextEngine": "memory-tencentdb"
    }
  }
}
```
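Before restarting OpenClaw, it can help to confirm that the three settings above agree with each other. The following is a small sanity-check sketch, not an official tool: it assumes the fragments are merged side by side at the top level of `openclaw.json` exactly as shown, and the function name `check_offload_config` is hypothetical.

```python
import json

# The three fragments above, merged into one document. Placing them side by
# side at the top level of openclaw.json is an assumption of this sketch.
config_text = """
{
  "memory-tencentdb": {
    "enabled": true,
    "config": {"offload": {"enabled": true}}
  },
  "plugins": {"slots": {"contextEngine": "memory-tencentdb"}}
}
"""


def check_offload_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the settings line up."""
    problems = []
    plugin = config.get("memory-tencentdb", {})
    if not plugin.get("enabled"):
        problems.append("memory-tencentdb is not enabled")
    if not plugin.get("config", {}).get("offload", {}).get("enabled"):
        problems.append("offload is not enabled")
    slot = config.get("plugins", {}).get("slots", {}).get("contextEngine")
    if slot != "memory-tencentdb":
        problems.append(f"contextEngine slot is {slot!r}")
    return problems


print(check_offload_config(json.loads(config_text)) or "config looks consistent")
```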
For the best results, run the patch script below. It hooks after-tool-call messages so they can be offloaded and recovered correctly:
```bash
docker run -d --name hermes-memory \
  --restart unless-stopped \
  -v hermes-memory-data:/opt/data \
  -e MODEL_API_KEY="<LLM_API_KEY>" \
  -e MODEL_BASE_URL="https://api.lkeap.cloud.tencent.com/v1" \
  -e MODEL_NAME="deepseek-v3.2" \
  -e MODEL_PROVIDER="custom" \
  agentmemory/hermes-memory:1.0.0-beta

# Enter Hermes chat
docker exec -it hermes-memory hermes

# Check the in-container Memory Gateway
docker exec hermes-memory curl -s http://127.0.0.1:8420/health
```
If you reuse an existing `hermes-memory-data` volume, the container keeps using `/opt/data/config.yaml` and `/opt/data/.env` from that volume. When changing `MODEL_API_KEY`, `MODEL_BASE_URL`, `MODEL_NAME`, or `MODEL_PROVIDER`, add `-e HERMES_CONFIG_OVERWRITE=1` to regenerate the config. Removing the volume also works, but it deletes local memory data.
```bash
docker run --rm -it --name openclaw-memory \
  -p 18789:18789 \
  -v openclaw-memory-data:/home/node/.openclaw \
  -v openclaw-memory-store:/opt/data \
  -e MODEL_API_KEY="<LLM_API_KEY>" \
  -e MODEL_BASE_URL="<LLM_BASE_URL>" \
  -e MODEL_NAME="<MODEL_ID>" \
  -e OPENCLAW_CONFIG_OVERWRITE=1 \
  agentmemory/openclaw-memory:1.0.0-beta \
  openclaw-memory-tui
```
Both OpenClaw and Hermes standalone containers run their own in-container Memory Gateway at 127.0.0.1:8420. Normal usage does not require publishing port 8420 to the host. Only add a non-conflicting mapping such as -p 8421:8420 when you want to debug the in-container Gateway from the host.
If you only want to run the Memory Gateway and call it from SDKs or your own Agent:
```bash
git clone <this-repo>
cd tdai-memory-openclaw-plugin
npm install

TDAI_GATEWAY_CONFIG="$PWD/tdai-gateway.standalone.yaml" \
TDAI_GATEWAY_HOST=127.0.0.1 \
TDAI_GATEWAY_PORT=8420 \
TDAI_DATA_DIR="$HOME/.memory-tencentdb/memory-tdai" \
TDAI_LLM_API_KEY="<LLM_API_KEY>" \
TDAI_LLM_BASE_URL="https://api.openai.com/v1" \
TDAI_LLM_MODEL="gpt-4o" \
node --import tsx/esm src/gateway/server.ts
```
Verify the service:
```bash
curl http://127.0.0.1:8420/health
```
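When the Gateway is started from a script, it may take a moment before it accepts connections. A small polling helper such as the one below can wait for it. This is a sketch using only the Python standard library; it assumes `/health` answers with HTTP 200 once the Gateway is ready, and the helper name `wait_for_health` is ours.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, attempts: int = 5, delay: float = 0.5) -> bool:
    """Poll `url` until it answers with HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Gateway not up yet (or returned an error); retry.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    ok = wait_for_health("http://127.0.0.1:8420/health")
    print("gateway ready" if ok else "gateway not reachable")
```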
By default, TDAI_GATEWAY_API_KEY is not configured, so standalone mode does not enforce a fixed shared secret. If you expose the Gateway to a network, set TDAI_GATEWAY_API_KEY explicitly and use the same value in clients.
This option is intended for users who want to build their own Agent Memory adapter: run only the Memory Gateway, then use the v2 SDK inside your Agent framework to write conversations, recall memories, and expose memory tools. This repository provides two adapter references:
Full SDK API and local package build references: TypeScript SDK docs / Python SDK docs.
Install:
```bash
npm install @tencentdb-agent-memory/memory-sdk-ts
```
Example:
```typescript
import { MemoryClient } from "@tencentdb-agent-memory/memory-sdk-ts";

const client = new MemoryClient({
  endpoint: "http://127.0.0.1:8420",
  apiKey: "local",      // standalone default; if TDAI_GATEWAY_API_KEY is set, use that value
  serviceId: "default", // standalone default space
});

await client.addConversation({
  session_id: "quickstart-session",
  messages: [
    { role: "user", content: "I prefer TypeScript for tool scripts." },
    { role: "assistant", content: "Got it, I will remember that preference." },
  ],
});

const conversations = await client.queryConversation({
  session_id: "quickstart-session",
  limit: 10,
});
console.log(conversations.messages);

// L1 memories are extracted asynchronously. After enough turns or after the
// pipeline has run, you can search structured memories:
const memories = await client.searchAtomic({ query: "TypeScript preference", limit: 5 });
console.log(memories.items);
```
Install:
```bash
pip install tencentdb-agent-memory-sdk-python
```
Example:
```python
from tencentdb_agent_memory import MemoryClient

client = MemoryClient(
    endpoint="http://127.0.0.1:8420",
    api_key="local",       # standalone default; if TDAI_GATEWAY_API_KEY is set, use that value
    service_id="default",  # standalone default space
)

client.add_convers
```

```bash
$ claude mcp add TencentDB-Agent-Memory \
    -- python -m otcore.mcp_server <graph>
```