Guide: Run Hindsight as a Local MCP Server

If you want to run Hindsight as a local MCP server, the good news is that the setup is smaller than it sounds. Hindsight can run locally with an embedded PostgreSQL database, expose an MCP endpoint on your machine, and give tools like Claude Code, Claude Desktop, Cursor, and Windsurf persistent memory without a separate external database.
This is a strong pattern for personal use, development, and privacy-focused setups. Instead of exposing memory over a public endpoint or managing a full hosted stack first, you start one local process and connect your MCP client to localhost.
This guide walks through the local server startup flow, explains single-bank and multi-bank modes, and shows how to verify that the MCP tools are working before you trust them in real work. Keep the docs home and the quickstart guide open while you work.
Quick answer
- Start
hindsight-local-mcpwith your preferred LLM provider.- Connect your MCP client to
http://localhost:8888/mcp/.- Choose multi-bank mode or pin a single-bank URL.
- Restart the client so the memory tools appear.
- Test retain, recall, and reflect with a short memory check.
Prerequisites
Before you start, make sure you have:
- a machine where you can run the local server
- an LLM provider key, unless you are using a local model like Ollama
- an MCP-compatible client, such as Claude Code, Claude Desktop, Cursor, or Windsurf
If you would rather skip local infrastructure entirely, Hindsight Cloud is the easier managed alternative.
Step 1: Start the local MCP server
The simplest local startup command is:
HINDSIGHT_API_LLM_API_KEY=sk-... uvx --from hindsight-api hindsight-local-mcp
If you want to use Ollama instead of a cloud provider:
HINDSIGHT_API_LLM_PROVIDER=ollama \
HINDSIGHT_API_LLM_MODEL=llama3.2 \
uvx --from hindsight-api hindsight-local-mcp
That starts the full Hindsight API locally and exposes the MCP endpoint at:
http://localhost:8888/mcp/
Behind the scenes, Hindsight also starts an embedded PostgreSQL database so you do not need to provision a separate database service just to test or use memory locally.
What the local server actually includes
The local MCP server is not a thin demo wrapper. It runs the full memory API locally, including:
- retain
- recall
- reflect
- mental model tools
- document tools
- bank management tools in multi-bank mode
This is why it works well for both development and day-to-day single-user setups.
For retrieval details, see Hindsight's recall API. For storage details, see Hindsight's retain API.
Step 2: Choose multi-bank or single-bank mode
Hindsight supports both patterns locally.
Multi-bank mode
Use the root MCP path:
http://localhost:8888/mcp/
This exposes all tools, including bank management. It is the right choice when you want to work with multiple banks or let your agent specify the bank per request.
Example for Claude Code:
claude mcp add --transport http hindsight http://localhost:8888/mcp/
Single-bank mode
Use a bank-pinned URL:
http://localhost:8888/mcp/my-bank/
This removes the need to pass a bank_id in each tool call and keeps the whole client pinned to one memory bank.
Example:
claude mcp add --transport http hindsight http://localhost:8888/mcp/my-bank/
If you know one client should always use one bank, this mode is simpler.
Step 3: Connect your MCP client
Any MCP-compatible client can point at the local endpoint.
For clients that use JSON config, add an HTTP MCP server pointing to the local URL.
Typical examples:
- Claude Desktop config file
- Cursor MCP settings
- Windsurf MCP settings
- Claude Code CLI registration
If you are comparing client patterns, the Claude Code integration and Adding Memory to Codex with Hindsight are useful related reads.
Step 4: Verify that the tools are live
A simple check is to ask the client:
What memory tools do you have available?
You should see retain, recall, and reflect at minimum.
Then run a short memory test:
- tell the agent to remember a preference or fact
- open a fresh session or new thread
- ask for that fact back
If memory is working, recall should surface the stored context without you manually repeating it.
A second test is to ask for synthesis rather than literal recall. That checks reflect, not just search.
Common issues
Slow first startup
The first run may need time to initialize the database and local models. Later starts are faster.
Port 8888 is already in use
Change the port:
HINDSIGHT_API_LLM_API_KEY=sk-... HINDSIGHT_API_PORT=9000 uvx --from hindsight-api hindsight-local-mcp
Then point your MCP client at http://localhost:9000/mcp/.
I stored something, but cannot recall it immediately
Retain is asynchronous. Give the pipeline a few seconds and try again.
The tools never appear in the client
Restart the client fully. Many MCP clients only refresh the tool list on startup.
I want more logs
Turn on debug logging:
HINDSIGHT_API_LLM_API_KEY=sk-... HINDSIGHT_API_LOG_LEVEL=debug uvx --from hindsight-api hindsight-local-mcp
When local MCP is the right fit
Local MCP is a great choice when:
- you want memory on your own machine
- you are testing memory behavior before broader rollout
- you want minimal infrastructure
- you want to avoid exposing a public endpoint
If you eventually want multi-device access or easier OAuth-based client setup, move to Hindsight Cloud.
FAQ
Do I need Docker for local MCP?
No. The local MCP server can run directly through the Hindsight tooling.
Is local MCP only for Claude?
No. Any MCP-compatible client can use it.
Should I use single-bank or multi-bank mode?
Use single-bank when one client should always use one bank. Use multi-bank when you need flexibility.
Does the data survive restarts?
Yes, the embedded PostgreSQL data persists locally across restarts.
Next Steps
- Start with Hindsight Cloud if you want a hosted alternative
- Read the full Hindsight docs
- Follow the quickstart guide
- Review Hindsight's recall API
- Review Hindsight's retain API
- Compare coding workflows in Team Shared Memory for AI Coding Agents
