Skip to main content

Cline Persistent Memory: Lifecycle Hooks Instead of MCP

· 10 min read
Ben Bartholomew
Hindsight Team

Cline Persistent Memory with Hindsight

Cline is one of the most-used AI coding agents in VS Code. It reads files, runs commands, writes code, and iterates on a task until it's done. What it doesn't do is remember anything once a task ends. The next task opens cold, with no recollection of decisions you made, conventions you've established, or fragile areas of the codebase you've found the hard way.

This post is a walkthrough of the new Hindsight + Cline integration. It uses Cline's lifecycle hooks to add automatic recall before each task and automatic retain when a task ends, with no MCP server in the loop, and the model doesn't have to decide to call a memory tool. Memory just happens.

TL;DR

  • Cline has no persistent memory built in. Tasks restart from zero each time.
  • The Hindsight integration installs four lifecycle hook scripts (TaskStart, UserPromptSubmit, TaskComplete, TaskCancel) plus a small Python lib. pip install hindsight-cline, one install command, and the hooks themselves run on stdlib Python 3 — no runtime dependencies.
  • Recall is deterministic. Because it runs on hooks, memory is injected automatically. There's no MCP tool the model can forget to use.
  • Recalled memories appear inside Cline as a <hindsight_memories> block, scoped to the current task description and your in-progress prompt.
  • Hindsight Cloud means no local daemon. Memory is stored server-side and follows you across machines. Sign up free.
  • Platform note: Cline hooks run on macOS and Linux only, with no Windows support.

The Problem: Cline Has No Memory Between Tasks

Cline is fast and capable inside a single task. It can read your whole project, edit dozens of files, run tests, and converge on a solution. But when the task ends, everything it learned is gone. The next task starts with the model's training data plus whatever files Cline reads, with nothing you taught it before.

For one-off tasks that's fine. For an agent you use every day on the same codebase, it's a problem. You re-explain the same conventions, re-warn it about the same pitfalls, and re-state the same architectural decisions every time. The agent never gets to know your codebase the way a teammate would.

Hindsight closes that gap by giving Cline a persistent memory bank, and the lifecycle-hook integration wires it in without any in-task tool calls.

How Hindsight Adds Memory to Cline

Cline supports lifecycle hooks: small executable scripts it runs at key moments. The Hindsight integration installs four of them and routes each event to a Hindsight API call:

Cline hookWhat Hindsight does
TaskStartRecall context for the new task description; inject it.
UserPromptSubmitRecall memories for your message; record the prompt for retain later.
TaskCompleteRetain the task's accumulated transcript and final summary.
TaskCancelRetain the partial transcript of a cancelled task.

Because it runs on hooks, memory is deterministic. There's no MCP tool the model can forget to call and no extra latency from a tool-use round-trip. The recall and retain logic runs at well-defined points in Cline's task lifecycle.

Task starts ─ TaskStart ─────────► recall(task description) → inject memories
You send a message ─ UserPromptSubmit ─► recall(prompt) → inject memories
(and append the prompt to the task transcript)
Task completes ─ TaskComplete ──► retain(accumulated transcript + summary)
Task cancelled ─ TaskCancel ────► retain(partial transcript)

One Cline-specific detail worth knowing: Cline doesn't hand hooks a transcript. Each hook gets the task ID and the current event payload, not the running conversation. The integration accumulates each task's prompts in ~/.hindsight/cline/state/ as it goes, and the end-of-task hook reads that back to retain the full transcript at once. The model never sees this bookkeeping; it just sees memories show up in context when relevant.

Installing

The installer is a small CLI that copies the four hook files (plus their shared lib and a settings.json) into Cline's hooks directory. Install it with pip:

pip install hindsight-cline

Then, from your project directory:

hindsight-cline install \
--api-url https://api.hindsight.vectorize.io \
--api-token YOUR_KEY

That installs to .clinerules/hooks/; commit it to share with your team. To install globally (apply to every project), add --global:

hindsight-cline install --global \
--api-url https://api.hindsight.vectorize.io \
--api-token YOUR_KEY

This drops hooks into ~/Documents/Cline/Rules/Hooks/ instead. (hindsight-cline uninstall removes them.)

Final step, enable hooks in Cline: Settings → Features → Hooks (toggle on).

Cline hooks run on macOS and Linux only. They use Python 3 (any modern system Python works, with no pip install needed at runtime).

The fastest path is Hindsight Cloud: no daemon to keep alive, memory syncs across machines, and the extraction work happens server-side. That matters more for Cline than for a server-side agent. Cline lives in VS Code, which most developers use across a laptop, a desktop, and sometimes a remote dev box; Cloud means the same memory bank shows up everywhere without copying files around. Because extraction runs server-side, you also don't have to thread an LLM API key into the hook environment (the retain hook would otherwise need one to call the extraction model), and there's no hindsight-api process you have to remember to start before opening VS Code. The installer's --api-url and --api-token flags configure it in one step. Your connection settings land in ~/.hindsight/cline.json, which is stable across reinstalls:

{
"hindsightApiUrl": "https://api.hindsight.vectorize.io",
"hindsightApiToken": "hsk_your_token"
}

Create an account and grab an API key at hindsight.vectorize.io.

Self-hosting works exactly the same way: start the API locally and point the installer at it:

pip install hindsight-all
export HINDSIGHT_API_LLM_API_KEY=your-openai-key
hindsight-api # http://localhost:8888

Then re-run the installer with --api-url http://localhost:8888.

What Gets Recalled

TaskStart and UserPromptSubmit both run a Hindsight recall and return a <hindsight_memories> block as context that Cline injects before the model sees your prompt:

<hindsight_memories>
Relevant memories from past conversations. Only use memories that are directly useful to continue this task; ignore the rest:
Current time - 2026-06-09 13:42

- Project uses asyncpg, not SQLAlchemy; switched after Redis cache stampede in March [world]
- Tests live under tests/integration/ and run via `make test-int`, not pytest directly [world]
- The `auth_v2` module is being deprecated; new code should target `identity/` [experience]
</hindsight_memories>

Cline sees this block; it doesn't appear in your editor output. The result: Cline starts every task with relevant past context already in scope, without you having to provide it.

You can tune how much context to pull with recallBudget ("low" / "mid" / "high") and recallMaxTokens.

Before and after

Without persistent memory, a new task in Cline starts cold. You type "fix the broken auth tests" and Cline reads the test file, makes reasonable guesses about which auth module is in scope, and may try patterns you already rejected.

With Hindsight, the same task opens with recalled context: that auth_v2 is deprecated, that the test runner is make test-int, that the recent fix-stack landed in identity/. Cline picks the right module and the right test command on the first turn instead of the third.

Per-Project Memory

By default all Cline tasks share a single bank (cline). To give each project its own isolated bank, switch to dynamic bank IDs in ~/.hindsight/cline.json:

{
"dynamicBankId": true,
"dynamicBankGranularity": ["agent", "project"]
}

Bank IDs are derived from the workspace path (agent::project), so a task in ~/projects/api writes to a different bank than one in ~/projects/frontend. Switching folders automatically switches memory context.

Valid granularity fields are agent, project, session, and user. Adding user (sourced from the HINDSIGHT_USER_ID env var) is useful if multiple people share a machine but should not share recall.

Team Shared Memory

Individual persistent memory is useful. Shared memory across a team is transformative.

When everyone on a team points their Cline config at the same Hindsight bank, context accumulated by one developer becomes available to all. A bug discovered on Monday surfaces in recall on Tuesday, regardless of who's asking. Architecture decisions made in one task inform the next, without requiring anyone to update a shared doc.

To configure team shared memory, set a fixed bankId in each developer's config and point them at the same Hindsight Cloud endpoint:

{
"hindsightApiUrl": "https://api.hindsight.vectorize.io",
"hindsightApiToken": "hsk_your_token",
"bankId": "my-team-project"
}

See Shared Memory for AI Coding Agents for a full team setup guide.

Key Configuration Options

Settings live in ~/.hindsight/cline.json (personal overrides) or the installed settings.json (defaults). Every setting can also be set via HINDSIGHT_* environment variables.

SettingDefaultWhat it does
bankIdclineMemory bank for this integration.
autoRecalltrueInject memories before tasks/prompts.
autoRetaintrueRetain the task transcript when it ends.
recallBudgetmidRecall depth: low (fast) / mid / high (thorough).
recallTypes["world","experience"]Memory categories to recall.
retainMissiongenericSteers fact extraction; tell it what to focus on.
dynamicBankIdfalsePer-project bank isolation.
debugfalseLog activity to stderr.

A focused retainMission makes the extracted memories meaningfully better:

{
"retainMission": "Extract technical decisions, code patterns, debugging solutions, user preferences, project context, and architectural choices. Ignore routine greetings and transient operational details."
}

Pitfalls

Hooks not firing. The installer copies the files in, but the toggle is still off by default. Go to Settings → Features → Hooks in Cline and turn it on. A quick way to verify hooks run: enable debug: true and watch stderr for [Hindsight] lines.

No memories recalled in the first task. Recall only returns results after something has been retained. Complete one real task first; the second one starts seeing recalled context.

Nothing happening on Windows. Cline's hook runner is macOS/Linux only; there is currently no Windows path. (If you're on Windows and want persistent memory for a coding agent, the Hermes + Hindsight setup is a good alternative.)

Smoke-testing a hook without Cline. You can pipe a synthetic event into a hook script directly to make sure it works end-to-end:

echo '{"hookName":"UserPromptSubmit","prompt":"how do we authenticate?","taskId":"t1","workspaceRoots":["/tmp/x"]}' \
| .clinerules/hooks/UserPromptSubmit
# → {"cancel": false, "contextModification": "<hindsight_memories>…", "errorMessage": ""}

Tradeoffs

Recall adds latency. Every prompt triggers a Hindsight query before Cline sees it. With Hindsight Cloud and a fast connection that's typically under 300ms, imperceptible in interactive use. Drop recallBudget to "low", or set autoRecall: false, if you need to skip it.

Retain runs at task end, not mid-task. Memories from the task you're in become available after it completes. If you cancel a task you'd otherwise want to recall from, the TaskCancel hook still retains the partial transcript, but you have to actually cancel to trigger it.

Extraction quality depends on conversation quality. Hindsight extracts facts from what's in the transcript. If a task is all file edits and no narration, there's little for the extractor to work with. A few sentences explaining what you decided and why go a long way.

Recap

Cline defaultWith Hindsight
Memory across tasksNoneAutomatic
Memory setupManual .clinerules / docsExtracted from task transcripts
Recall mechanismFiles Cline reads each taskSemantic search, injected per task/prompt
Per-project isolationNoOptional via dynamicBankId
Team shared memoryNoShared bank via Hindsight Cloud
Model tool-calling neededn/aNo (lifecycle hooks)

Next Steps