What's new in Hindsight 0.4.20
Hindsight 0.4.20 adds Claude Code and LangGraph integrations, a NemoClaw setup CLI, independent versioning for integration packages, and reflect improvements including fact type filters, mental model exclusion, and a wall-clock timeout—plus a batch of reliability fixes.
- Claude Code Integration: Give Claude Code persistent long-term memory via a hook-based plugin.
- LangGraph Integration: Use Hindsight memory in LangGraph agents with tools, nodes, or the BaseStore API.
- NemoClaw Integration: One-command Hindsight setup inside NemoClaw sandboxes.
- Independent Integration Versioning: Integrations now have their own version numbers and release lifecycle.
- Reflect Improvements: Fact type filters, mental model exclusion, and a wall-clock timeout.
Claude Code Integration
hindsight-memory is a new plugin that gives Claude Code persistent long-term memory. It works in interactive sessions and through Claude Code Channels (Telegram, Discord, Slack).
Install from the marketplace:
# Add the Hindsight marketplace and install the plugin
claude plugin marketplace add vectorize-io/hindsight
claude plugin install hindsight-memory
# Configure your LLM provider for memory extraction
# Option A: OpenAI (auto-detected)
export OPENAI_API_KEY="sk-your-key"
# Option B: Anthropic (auto-detected)
export ANTHROPIC_API_KEY="your-key"
# Option C: No API key needed (uses Claude Code's own model)
export HINDSIGHT_LLM_PROVIDER=claude-code
# Start Claude Code — the plugin activates automatically
claude
The plugin uses four Claude Code hooks to manage memory automatically:
- SessionStart — verifies Hindsight is reachable and optionally starts a local daemon.
- UserPromptSubmit — auto-recalls relevant memories and injects them as context.
- Stop — auto-retains the conversation transcript to long-term memory.
- SessionEnd — cleanup and optional daemon shutdown.
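The recall path can be sketched as a tiny hook script: read the prompt event, look up relevant memories, and emit them as extra context. This is a simplified illustration only; the event shape, the fake_recall stand-in, and the additionalContext field are assumptions for the sketch, not the plugin's actual protocol.

```python
import json

def fake_recall(query: str) -> list[str]:
    # Stand-in for a real Hindsight recall call (assumption for this sketch).
    memories = {"theme": ["User prefers dark mode."]}
    return [m for key, hits in memories.items() if key in query for m in hits]

def handle_user_prompt_submit(event: dict) -> dict:
    # Look up memories relevant to the submitted prompt...
    hits = fake_recall(event.get("prompt", ""))
    # ...and hand them back as extra context for the model to see.
    return {"additionalContext": "\n".join(hits)} if hits else {}

print(json.dumps(handle_user_prompt_submit({"prompt": "what theme do I like?"})))
```

The real plugin does the equivalent work against the Hindsight API on every UserPromptSubmit event.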
Three connection modes are supported:
- External API — point hindsightApiUrl and hindsightApiToken at a remote Hindsight server.
- Local daemon — leave hindsightApiUrl empty and the plugin auto-starts a local instance via uvx hindsight-embed@latest. Just provide an LLM API key.
- Existing local server — set apiPort to match a server you already run.
Dynamic bank IDs let you isolate memories per agent, project, session, channel, or user—controlled by the dynamicBankId and dynamicBankGranularity settings. The plugin ships as pure Python with zero external dependencies.
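One way to picture the dynamic bank ID logic is as a pure function of the granularity setting. The granularity names and context fields below are illustrative assumptions, not the plugin's exact configuration values:

```python
def resolve_bank_id(granularity: str, ctx: dict) -> str:
    """Derive a memory-bank ID from a granularity setting (illustrative)."""
    parts = {
        "agent": [ctx.get("agent", "default")],
        "project": [ctx.get("agent", "default"), ctx.get("project", "unknown")],
        "session": [ctx.get("agent", "default"), ctx.get("session", "unknown")],
    }[granularity]
    return "-".join(parts)

ctx = {"agent": "claude", "project": "hindsight", "session": "abc123"}
print(resolve_bank_id("project", ctx))  # claude-hindsight
```

Coarser granularities share one bank across more conversations; finer ones keep memories isolated.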
See the Claude Code integration documentation for the full configuration reference, and the Claude Code + Telegram + Hindsight blog post for a walkthrough of using it with Claude Code Channels.
LangGraph Integration
hindsight-langgraph adds Hindsight memory to LangGraph agents with three integration patterns.
pip install hindsight-langgraph
Tools — works with both LangChain and LangGraph:
from hindsight_client import Hindsight
from hindsight_langgraph import create_hindsight_tools
from langchain_openai import ChatOpenAI
client = Hindsight(base_url="http://localhost:8888")
tools = create_hindsight_tools(client=client, bank_id="user-123")
model = ChatOpenAI(model="gpt-4o").bind_tools(tools)
Memory nodes — dedicated recall and retain nodes for LangGraph graphs:
from hindsight_client import Hindsight
from hindsight_langgraph import create_recall_node, create_retain_node
client = Hindsight(base_url="http://localhost:8888")
recall = create_recall_node(client=client, bank_id="user-123")
retain = create_retain_node(client=client, bank_id="user-123")
builder.add_node("recall", recall)
builder.add_node("retain", retain)
BaseStore — LangGraph-native checkpoint memory backed by Hindsight:
from hindsight_client import Hindsight
from hindsight_langgraph import HindsightStore
client = Hindsight(base_url="http://localhost:8888")
store = HindsightStore(client=client)
graph = builder.compile(checkpointer=checkpointer, store=store)
await store.aput(("user", "123", "prefs"), "theme", {"value": "dark mode"})
results = await store.asearch(("user", "123", "prefs"), query="theme preference")
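To make the namespace-tuple semantics of aput and asearch concrete, here is a toy in-memory analogue; plain substring matching stands in for Hindsight's actual semantic search:

```python
import asyncio

class ToyStore:
    """In-memory stand-in for a namespaced async key/value store (toy)."""

    def __init__(self):
        self._data: dict[tuple, dict[str, dict]] = {}

    async def aput(self, namespace: tuple, key: str, value: dict) -> None:
        self._data.setdefault(namespace, {})[key] = value

    async def asearch(self, namespace: tuple, *, query: str) -> list[dict]:
        # Substring match stands in for semantic recall.
        return [
            {"key": k, "value": v}
            for k, v in self._data.get(namespace, {}).items()
            if query.lower() in (k + " " + str(v)).lower()
        ]

async def demo():
    store = ToyStore()
    await store.aput(("user", "123", "prefs"), "theme", {"value": "dark mode"})
    return await store.asearch(("user", "123", "prefs"), query="theme")

results = asyncio.run(demo())
print(results)  # [{'key': 'theme', 'value': {'value': 'dark mode'}}]
```

With HindsightStore, writes and searches scoped to the same namespace tuple behave analogously, but recall is semantic rather than literal.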
Multi-tenant routing is built in—pass bank_id_from_config="user_id" to any node or tool factory, and the bank ID is resolved dynamically from the LangGraph configurable dict at runtime.
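The runtime resolution can be pictured as a simple lookup into LangGraph's configurable dict. A minimal sketch, with the error and fallback behavior as assumptions:

```python
def resolve_bank_id(config: dict, bank_id_from_config: str, default=None) -> str:
    """Pull the bank ID out of a LangGraph-style config dict at call time."""
    bank_id = config.get("configurable", {}).get(bank_id_from_config, default)
    if bank_id is None:
        raise ValueError(f"missing {bank_id_from_config!r} in configurable")
    return bank_id

config = {"configurable": {"user_id": "user-123", "thread_id": "t-1"}}
print(resolve_bank_id(config, "user_id"))  # user-123
```

Because the lookup happens per invocation, one compiled graph can serve many tenants, each routed to its own memory bank.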
See the LangGraph integration documentation for the full API reference.
NemoClaw Integration
hindsight-nemoclaw is a new integration that automates the five-step process of adding Hindsight memory to a NemoClaw sandbox:
npx @vectorize-io/hindsight-nemoclaw setup \
--sandbox my-assistant \
--api-url https://api.hindsight.vectorize.io \
--api-token <your-api-key> \
--bank-prefix my-sandbox
The CLI handles plugin installation, configuration, network policy updates (so the sandbox can reach the Hindsight API), and gateway restart. Use --dry-run to preview changes before applying.
See the NemoClaw integration documentation for the full setup guide.
Independent Integration Versioning
Integration packages—LiteLLM, Pydantic AI, CrewAI, AI SDK, Chat, OpenClaw, LangGraph, and NemoClaw—now have their own version numbers and release lifecycle, decoupled from the core Hindsight server.
Previously, every integration was bumped and published together with each core release, even when nothing changed. This made it hard to tell whether a new version of an integration contained actual changes or was just a no-op bump.
Starting with 0.4.20, each integration is versioned and released independently. What this means for you:
- Version numbers reflect real changes: when you see a new version of hindsight-langgraph or hindsight-litellm on PyPI/npm, it contains actual updates to that package.
- Faster integration fixes: a bug fix or feature in one integration ships as soon as it's ready, without waiting for the next core release.
- Per-integration changelogs: each integration now has its own changelog page at /changelog/integrations/<name> so you can track what changed in the packages you use.
Reflect Improvements
Three additions make reflect more controllable and predictable.
Fact type filters — fact_types restricts which fact types the reflection agent retrieves. Pass a subset of ["world", "experience", "observation"] to limit the search scope and save LLM tokens:
{
"query": "What does the user prefer?",
"fact_types": ["experience", "observation"]
}
Mental model exclusion — exclude_mental_models skips mental model search entirely, and exclude_mental_model_ids excludes specific models by ID. These filters also work in mental model triggers, so you can persist them on auto-refreshing models. The reflect agent enforces filters at the tool level—if the LLM tries to call a disabled tool, it receives an error rather than silently returning results from excluded sources.
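For example, a reflect request that skips one specific mental model while still searching the rest might look like this (the model ID is hypothetical):

```json
{
  "query": "What does the user prefer?",
  "exclude_mental_models": false,
  "exclude_mental_model_ids": ["mm-user-preferences"]
}
```

Setting exclude_mental_models to true instead would skip mental model search entirely.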
Wall-clock timeout — reflect operations now enforce a configurable timeout (default: 300 seconds). Previously, a slow LLM provider or high iteration count could cause a reflect call to hang for up to 40 minutes. When the timeout fires, the API returns HTTP 504.
HINDSIGHT_API_REFLECT_WALL_TIMEOUT=300 # seconds
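Conceptually, the timeout wraps the whole reflect loop in a wall-clock deadline. A toy sketch of the behavior described above, with made-up function names and the 504 mapping taken from the text:

```python
import asyncio

async def slow_reflect() -> dict:
    # Stands in for a reflect loop stuck on a slow LLM provider.
    await asyncio.sleep(10)
    return {"answer": "..."}

async def reflect_with_deadline(timeout_s: float) -> tuple:
    try:
        return 200, await asyncio.wait_for(slow_reflect(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Wall-clock budget exhausted: surface HTTP 504 to the caller.
        return 504, {"error": "reflect timed out"}

status, body = asyncio.run(reflect_with_deadline(0.05))
print(status)  # 504
```

The deadline applies to the whole operation regardless of how many agent iterations run inside it.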
Other Updates
Improvements
- The hindsight-api package is now runnable directly via uvx hindsight-api thanks to new script entry points.
- The default MiniMax model has been upgraded from M2.5 to M2.7.
- The OperationValidator extension now receives richer context when validating operations.
- OpenAI-compatible client initialization now supports passing query parameters for broader provider compatibility.
Bug Fixes
- Fixed a memory leak in entity resolution where _pending_stats and _pending_cooccurrences dicts could grow unbounded when exceptions occurred between fact extraction and flush.
- Fixed startup crash and silent retain failures when the PostgreSQL pg_trgm extension is unavailable. The migration now handles missing pg_trgm gracefully, and entity resolution falls back to a full table scan with a warning.
- Fixed context overflow in reflect by disabling source facts in observation search results.
- Markdown code fences are now stripped from LLM outputs across all providers, not just local models.
- Empty recall queries now return a clear 400 error instead of failing with a SQL parameter gap.
- File retain requests now include authentication headers so uploads work in authenticated deployments.
- Fixed MCP tool calls failing when MCP_AUTH_TOKEN and TENANT_API_KEY differ.
- Fixed claude-agent-sdk installation on Linux/Docker environments.
- LiteLLM integration now falls back to the last user message when no explicit hindsight_query is provided.
- Fixed non-atomic async operation creation that could produce inconsistent operation records.
- Fixed orphaned parent operations when a batch retain child fails via unhandled exception.
- Fixed failures for non-ASCII entity names by ensuring entity IDs are set correctly.
- Fixed LLM facts labeled "assistant" being stored with the wrong fact type instead of "experience".
Feedback and Community
Hindsight 0.4.20 is a drop-in replacement for 0.4.x with no breaking changes.
Share your feedback.
For detailed changes, see the full changelog.
