🤖

Using a coding agent? Run this to install the Hindsight docs skill:

npx skills add https://github.com/vectorize-io/hindsight --skill hindsight-docs

Claude Code

Biomimetic long-term memory for Claude Code using Hindsight. Automatically captures conversations, recalls relevant context, and exposes knowledge tools and a subagent-creation skill so Claude can read and write its own memory.

View Changelog →

Quick Start

# 1. Add the Hindsight marketplace and install the plugin
claude plugin marketplace add vectorize-io/hindsight
claude plugin install hindsight-memory

# 2. Configure your LLM provider for memory extraction
# Option A: OpenAI (auto-detected)
export OPENAI_API_KEY="sk-your-key"

# Option B: Anthropic (auto-detected)
export ANTHROPIC_API_KEY="your-key"

# Option C: No API key needed (uses Claude Code's own model — personal/local use only)
export HINDSIGHT_LLM_PROVIDER=claude-code

# Option D: Connect to an external Hindsight server instead of running locally
mkdir -p ~/.hindsight
echo '{"hindsightApiUrl": "https://your-hindsight-server.com"}' > ~/.hindsight/claude-code.json

# 3. Start Claude Code — the plugin activates automatically
claude

That's it! The plugin will automatically start capturing and recalling memories.

Features

Auto-recall — on every user prompt, queries Hindsight for relevant memories and injects them as context (invisible to the chat transcript, visible to Claude)
Auto-retain — after every response (or every N turns), extracts and retains conversation content to Hindsight for long-term storage
Knowledge tools — Claude can read, write, and search its own memory via MCP tools (knowledge pages, recall, ingest)
Subagent skill — /hindsight-memory:create-agent scaffolds a subagent backed by an isolated memory bank
Dynamic bank IDs — per-agent, per-project, or per-session memory isolation (so each subagent gets its own brain)
Daemon management — can auto-start/stop hindsight-embed locally or connect to an external Hindsight server
Channel-agnostic — works with Claude Code Channels (Telegram, Discord, Slack) or interactive sessions

Architecture

The plugin combines hooks (for automatic recall/retain) with an MCP server (for explicit knowledge tools) and a skill (for subagent creation):

Component	Trigger	Purpose
`session_start.py`	`SessionStart` hook	Health check — verify Hindsight is reachable
`recall.py`	`UserPromptSubmit` hook	Auto-recall — query memories, inject as `additionalContext`
`retain.py`	`Stop` hook	Auto-retain — extract transcript, POST to Hindsight (async)
`session_end.py`	`SessionEnd` hook	Cleanup — stop auto-managed daemon if started
`mcp_server.py`	MCP server	Exposes `agent_knowledge_*` tools — list/get/create/update/delete pages, recall, ingest
`create-agent`	Skill	Scaffolds a subagent file under `~/.claude/agents/` and seeds its bank

Python dependencies (mcp) are bootstrapped on first run into a private venv under ${CLAUDE_PLUGIN_DATA}/venv — no global pip install, isolated to the plugin, survives plugin updates.

Connection Modes

1. External API (recommended for production)

Connect to a running Hindsight server (cloud or self-hosted). No local LLM needed — the server handles fact extraction.

{
  "hindsightApiUrl": "https://your-hindsight-server.com",
  "hindsightApiToken": "your-token"
}

2. Local Daemon (auto-managed)

The plugin automatically starts and stops hindsight-embed via uvx. Requires an LLM provider API key for local fact extraction.

Set an LLM provider:

export OPENAI_API_KEY="sk-your-key"
# or
export ANTHROPIC_API_KEY="your-key"
# or
export HINDSIGHT_LLM_PROVIDER=claude-code # No API key needed

The model is selected automatically by the Hindsight API. To override, set HINDSIGHT_LLM_MODEL.

3. Existing Local Server

If you already have hindsight-embed running, leave hindsightApiUrl empty and set apiPort to match your server's port. The plugin will detect it automatically.

Configuration

All settings live in ~/.hindsight/claude-code.json. Every setting can also be overridden via environment variables. The plugin ships with sensible defaults — you only need to configure what you want to change.

Loading order (later entries win):

Built-in defaults (hardcoded in the plugin)
Plugin settings.json (ships with the plugin, at CLAUDE_PLUGIN_ROOT/settings.json)
User config (~/.hindsight/claude-code.json — recommended for your overrides)
Environment variables

Connection & Daemon

These settings control how the plugin connects to the Hindsight API.

Setting	Env Var	Default	Description
`hindsightApiUrl`	`HINDSIGHT_API_URL`	`""` (empty)	URL of an external Hindsight API server. When empty, the plugin uses a local daemon instead.
`hindsightApiToken`	`HINDSIGHT_API_TOKEN`	`null`	Authentication token for the external API. Only needed when `hindsightApiUrl` is set.
`apiPort`	`HINDSIGHT_API_PORT`	`9077`	Port used by the local `hindsight-embed` daemon. Change this if you run multiple instances or have a port conflict.
`daemonIdleTimeout`	`HINDSIGHT_DAEMON_IDLE_TIMEOUT`	`0`	Seconds of inactivity before the local daemon shuts itself down. `0` means the daemon stays running until the session ends.
`embedVersion`	`HINDSIGHT_EMBED_VERSION`	`"latest"`	Which version of `hindsight-embed` to install via `uvx`. Pin to a specific version (e.g. `"0.5.2"`) for reproducibility.
`embedPackagePath`	`HINDSIGHT_EMBED_PACKAGE_PATH`	`null`	Local filesystem path to a `hindsight-embed` checkout. When set, the plugin runs from this path instead of installing via `uvx`. Useful for development.

LLM Provider (local daemon only)

These settings configure which LLM the local daemon uses for fact extraction. They are ignored when connecting to an external API (the server uses its own LLM configuration).

Setting	Env Var	Default	Description
`llmProvider`	`HINDSIGHT_LLM_PROVIDER`	auto-detect	Which LLM provider to use. Supported values: `openai`, `anthropic`, `gemini`, `groq`, `ollama`, `openai-codex`, `claude-code`. When omitted, the plugin auto-detects by checking for API key env vars in order: `OPENAI_API_KEY` → `ANTHROPIC_API_KEY` → `GEMINI_API_KEY` → `GROQ_API_KEY`.
`llmModel`	`HINDSIGHT_LLM_MODEL`	provider default	Override the default model for the chosen provider (e.g. `"gpt-4o"`, `"claude-sonnet-4-20250514"`). When omitted, the Hindsight API picks a sensible default for each provider.
`llmApiKeyEnv`	—	provider standard	Name of the environment variable that holds the API key. Normally auto-detected (e.g. `OPENAI_API_KEY` for the `openai` provider). Set this only if your key is in a non-standard env var.

Memory Bank

A bank is an isolated memory store — like a separate "brain." These settings control which bank the plugin reads from and writes to.

Setting	Env Var	Default	Description
`bankId`	`HINDSIGHT_BANK_ID`	`"claude_code"`	The bank ID to use when `dynamicBankId` is `false`. All sessions share this single bank.
`bankMission`	`HINDSIGHT_BANK_MISSION`	generic assistant prompt	A short description of the agent's identity and purpose. Sent to Hindsight when creating or updating the bank, and used during recall to contextualize results.
`retainMission`	—	extraction prompt	Instructions for the fact extraction LLM — tells it what to extract from conversations (e.g. "Extract technical decisions and user preferences").
`dynamicBankId`	`HINDSIGHT_DYNAMIC_BANK_ID`	`false`	When `true`, the plugin derives a unique bank ID from context fields (see `dynamicBankGranularity`), giving each combination its own isolated memory.
`dynamicBankGranularity`	—	`["agent", "project"]`	Which context fields to combine when building a dynamic bank ID. Available fields: `agent` (agent name), `project` (working directory), `session` (session ID), `channel` (channel ID), `user` (user ID).
`bankIdPrefix`	—	`""`	A string prepended to all bank IDs — both static and dynamic. Useful for namespacing (e.g. `"prod"` or `"staging"`).
`agentName`	`HINDSIGHT_AGENT_NAME`	`"claude-code"`	Name used for the `agent` field in dynamic bank ID derivation.
`resolveWorktrees`	—	`true`	When deriving the `project` field, resolve git worktrees to the main repository's basename so that all worktrees of the same repo share one bank. Set to `false` to use the literal working directory basename instead (each worktree gets its own bank).
`directoryBankMap`	—	`{}`	Explicit `{ "/path/to/dir": "bank-id" }` mapping. When the current working directory matches an entry, that bank is used directly — overrides both static and dynamic resolution. `bankIdPrefix` still applies on top.

Worktrees and explicit mapping

By default, all git worktrees of the same repository share one bank. For example, if you keep a main checkout at /home/me/myproject and a worktree at /home/me/myproject-wt1, both resolve to project = "myproject" — so dynamic bank IDs like claude-code::myproject are stable across all worktrees. This avoids fragmenting memory across short-lived branches.

If you want each worktree to have its own bank, set "resolveWorktrees": false.

For full control, use directoryBankMap to pin specific directories to specific bank IDs:

{
  "directoryBankMap": {
    "/home/me/work/client-a": "client-a-memories",
    "/home/me/personal/blog": "blog"
  }
}

When cwd matches one of the keys, that bank is used immediately — no static or dynamic resolution runs. Directories not listed fall through to the normal logic.

Auto-Recall

Auto-recall runs on every user prompt. It queries Hindsight for relevant memories and injects them into Claude's context as invisible additionalContext (the user doesn't see them in the chat transcript).

Setting	Env Var	Default	Description
`autoRecall`	`HINDSIGHT_AUTO_RECALL`	`true`	Master switch for auto-recall. Set to `false` to disable memory retrieval entirely.
`recallBudget`	`HINDSIGHT_RECALL_BUDGET`	`"mid"`	Controls how hard Hindsight searches for memories. `"low"` = fast, fewer strategies; `"mid"` = balanced; `"high"` = thorough, slower. Affects latency directly.
`recallMaxTokens`	`HINDSIGHT_RECALL_MAX_TOKENS`	`1024`	Maximum number of tokens in the recalled memory block. Lower values reduce context usage but may truncate relevant memories.
`recallTypes`	—	`["world", "experience"]`	Which memory types to retrieve. `"world"` = general facts; `"experience"` = personal experiences; `"observation"` = raw observations.
`recallContextTurns`	`HINDSIGHT_RECALL_CONTEXT_TURNS`	`1`	How many prior conversation turns to include when composing the recall query. `1` = only the latest user message; higher values give more context but may dilute the query.
`recallMaxQueryChars`	`HINDSIGHT_RECALL_MAX_QUERY_CHARS`	`800`	Maximum character length of the query sent to Hindsight. Longer queries are truncated.
`recallRoles`	—	`["user", "assistant"]`	Which message roles to include when building the recall query from prior turns.
`recallPromptPreamble`	—	built-in string	Text placed above the recalled memories in the injected context block. Customize this to change how Claude interprets the memories.

Auto-Retain

Auto-retain runs after Claude responds. It extracts the conversation transcript and sends it to Hindsight for long-term storage and fact extraction.

Setting	Env Var	Default	Description
`autoRetain`	`HINDSIGHT_AUTO_RETAIN`	`true`	Master switch for auto-retain. Set to `false` to disable memory storage entirely.
`retainMode`	`HINDSIGHT_RETAIN_MODE`	`"full-session"`	Retention strategy. `"full-session"` sends the full conversation transcript (with chunking).
`retainEveryNTurns`	—	`10`	How often to retain. `1` = every turn; `10` = every 10th turn. Higher values reduce API calls but delay memory capture. Values > 1 enable chunked retention with a sliding window.
`retainOverlapTurns`	—	`2`	When chunked retention fires, this many extra turns from the previous chunk are included for continuity. Total window size = `retainEveryNTurns + retainOverlapTurns`.
`retainRoles`	—	`["user", "assistant"]`	Which message roles to include in the retained transcript.
`retainToolCalls`	—	`true`	Whether to include tool calls (function invocations and results) in the retained transcript. Captures structured actions like file reads, searches, and code edits.
`retainTags`	—	`["{session_id}"]`	Tags attached to the retained document. Supports `{session_id}` placeholder which is replaced with the current session ID at runtime.
`retainMetadata`	—	`{}`	Arbitrary key-value metadata attached to the retained document.
`retainContext`	—	`"claude-code"`	A label attached to retained memories identifying their source. Useful when multiple integrations write to the same bank.

Knowledge Tools

The plugin runs an MCP server that exposes agent_knowledge_* tools, letting Claude explicitly read, write, and search its memory bank — as opposed to the hook-driven recall/retain which is fully automatic. None of the tools accept a bank_id parameter; the server resolves the active bank from the same config as the hooks, so tools and hooks always agree on which bank they touch.

Tool	Purpose
`agent_knowledge_get_current_bank`	Reports the bank ID the current session is bound to.
`agent_knowledge_list_pages`	Lists all knowledge pages (mental models) with IDs and names.
`agent_knowledge_get_page`	Reads the full synthesized content of a specific page.
`agent_knowledge_create_page`	Creates a new knowledge page with a `source_query` that re-synthesizes content after each consolidation.
`agent_knowledge_update_page`	Updates a page's name or `source_query`.
`agent_knowledge_delete_page`	Permanently deletes a knowledge page.
`agent_knowledge_recall`	Searches across retained conversations and documents for specific facts.
`agent_knowledge_ingest`	Uploads raw text content into the bank as a document.
`agent_knowledge_ingest_file`	Reads a file from disk and ingests its full content.

Setting	Env Var	Default	Description
`enableKnowledgeTools`	`HINDSIGHT_ENABLE_KNOWLEDGE_TOOLS`	`false`	Master switch for the MCP knowledge tools. When `false`, the MCP server exits at startup and none of the `agent_knowledge_*` tools are exposed.

Subagents with Memory

The /hindsight-memory:create-agent skill creates a Claude Code subagent backed by its own Hindsight bank. Each subagent's prompt instructs it to use the knowledge tools above for recall and ingestion, so Claude treats persistent memory as a first-class capability of that agent.

Two modes:

From a directory — /hindsight-memory:create-agent <name> from <path> ingests every text file under <path> (.md, .txt, .html, .json, .csv, .xml), creates a starter set of knowledge pages, and writes the subagent file to ~/.claude/agents/<name>.md. If the directory contains a bank-template.json listing exact mental_models, those are used verbatim instead of auto-generated.
Interactive — /hindsight-memory:create-agent (no arguments) prompts for the agent name and purpose, then writes the subagent file with no ingestion.

For per-subagent memory isolation, enable dynamic bank IDs:

{
  "dynamicBankId": true,
  "dynamicBankGranularity": ["agent", "project"],
  "enableKnowledgeTools": true
}

With ["agent", "project"], each subagent gets a unique bank per repository — <agent-name>::<project-basename>. Switch to ["agent"] for a single bank per subagent across all projects, or ["agent", "session"] for ephemeral per-session banks.

Debug

Setting	Env Var	Default	Description
`debug`	`HINDSIGHT_DEBUG`	`false`	Enable verbose logging to stderr. All log lines are prefixed with `[Hindsight]`. Useful for diagnosing connection issues, recall/retain behavior, and bank ID derivation.

Claude Code Channels

With Claude Code Channels, Claude Code can operate as a persistent background agent connected to Telegram, Discord, Slack, and other messaging platforms. This plugin gives Channel-based agents the same long-term memory that hindsight-openclaw provides for Openclaw agents.

For Channel agents, enable dynamic bank IDs for per-channel/per-user memory isolation:

{
  "dynamicBankId": true,
  "dynamicBankGranularity": ["agent", "channel", "user"]
}

And set channel context via environment variables:

export HINDSIGHT_CHANNEL_ID="telegram-group-12345"
export HINDSIGHT_USER_ID="user-67890"

Troubleshooting

Plugin not activating: Check Claude Code logs for [Hindsight] messages. Enable "debug": true in ~/.hindsight/claude-code.json.

Recall returning no memories: Verify the Hindsight server is reachable (curl http://localhost:9077/health). Memories need at least one retain cycle before they're available.

Daemon not starting: Ensure an LLM API key is set (or use HINDSIGHT_LLM_PROVIDER=claude-code). Review daemon logs at ~/.hindsight/profiles/claude-code.log.

High latency on recall: The recall hook has a 12-second timeout. Use recallBudget: "low" or reduce recallMaxTokens for faster responses.

Knowledge tools missing from Claude: Make sure enableKnowledgeTools is true in ~/.hindsight/claude-code.json and restart Claude Code. The first launch after enabling them takes ~20–30 seconds to bootstrap the Python venv at ${CLAUDE_PLUGIN_DATA}/venv; subsequent launches are instant.

Quick Start​

Features​

Architecture​

Connection Modes​

1. External API (recommended for production)​

2. Local Daemon (auto-managed)​

3. Existing Local Server​

Configuration​

Connection & Daemon​

LLM Provider (local daemon only)​

Memory Bank​

Worktrees and explicit mapping​

Auto-Recall​

Auto-Retain​

Knowledge Tools​

Subagents with Memory​

Debug​

Claude Code Channels​

Troubleshooting​