
Changelog

User-facing changes only. Internal maintenance and infrastructure updates are omitted.

0.5.4

Features

  • Delta mental model refresh now scopes recall to memories created since the last refresh, making updates faster and more accurate. (@nicoloboschi, e90cfa4a)
  • OpenAI-compatible embedding providers now support configurable batch sizes for better throughput. (@r266-tech, 30700de6)
  • Embedded daemon now includes a liveness check that auto-recovers from crashes. (@nicoloboschi, 59f9a2bf)
  • Disable daemon idle timeout by default so embedded instances stay alive between requests. (@nicoloboschi, f5dfe59b)
  • Add {user_id} template variable for retainTags in the Claude Code integration. (@soichisumi, 9181c9a2)
  • New decommission-workers and worker-status admin CLI commands for managing worker fleets. (@nicoloboschi, c8b898bd)

Bug Fixes

  • Fix duplicate memory units caused by chunk index scrambling during concurrent upserts. (@nicoloboschi, 511ca723)
  • Prevent directive content from leaking into reflect answers on empty banks. (@nicoloboschi, 3d877b05)
  • Honor the reflect_mission identity framing in the prompt builder so agent personality works correctly. (@nicoloboschi, a3b0d265)
  • Allow reflect-specific LLM configuration when the default LLM provider is disabled. (@zwcf5200, afd00c03)
  • Preserve document created_at timestamp across upsert and add UI edit flow for documents. (@nicoloboschi, 10785666)
  • Set ensure_ascii=False in json.dumps for LLM prompts so non-Latin text is preserved. (@harryplusplus, d05b49a2)
  • Route update_bank through the config resolver with generic config_updates. (@nicoloboschi, abbd3619)
  • Workers now scan for active schemas before claiming tasks, preventing work on decommissioned tenants. (@cdbartholomew, 7126bf8a)
  • Pass DeferOperation through MemoryEngine.execute_task so extensions can requeue work. (@cdbartholomew, 858f0b3a)
  • Downgrade LLM verify_connection failure to a warning instead of crashing on startup. (@nicoloboschi, 9901aa1e)
  • Fix items_count in list_operations response to populate from result_metadata. (@nicoloboschi, 41710ba1)
  • Align AI SDK ReflectBasedOn types with the OpenAPI spec. (@nicoloboschi, 3d6b3805)
  • Fix database migration path by merging divergent Alembic heads for v0.5.3. (@grimmjoww, 487e2a5e)
  • Lower OpenCode retainEveryNTurns default from 10 to 3 for more frequent memory saves. (@DK09876, 902704df)
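The ensure_ascii fix above matters whenever prompts contain non-Latin text: with Python's default ensure_ascii=True, json.dumps escapes such characters into \uXXXX sequences, which inflates token counts and obscures the original text from the model. A minimal illustration of the difference:

```python
import json

data = {"name": "café", "greeting": "こんにちは"}

# Default behavior: non-ASCII characters are escaped to \uXXXX sequences.
escaped = json.dumps(data)

# With ensure_ascii=False, the original characters are preserved verbatim.
preserved = json.dumps(data, ensure_ascii=False)

print(escaped)    # contains "caf\u00e9" and escaped Japanese
print(preserved)  # {"name": "café", "greeting": "こんにちは"}
```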

0.5.3

Features

  • Add a setting to limit how many memories can be consolidated per round, helping control consolidation workload and cost. (@nicoloboschi, ca561aca)
  • Add integration with the OpenAI Agents SDK. (@DK09876, b8da88c8)
  • Improve mental model refresh and updates with structured operations and cleaner observation handling on upsert. (@nicoloboschi, 8b80959b)
  • Allow extensions to requeue work by throwing a DeferOperation exception from worker operations. (@nicoloboschi, f8904797)
  • Make recall budget mapping configurable per memory bank. (@nicoloboschi, 576c44d2)
  • Control plane now shows failed consolidation counts with a drilldown to investigate issues. (@nicoloboschi, e1e5f36c)
  • Add mental-model staleness signals and a refreshed UI/experience for reviewing model history and snapshots. (@nicoloboschi, 654e4c0c)
  • Replace the embedded Paperclip library with the Paperclip plugin for more flexible integrations. (@benfrank241, c571fac7)
  • JSON logs can now include tenant information and support a configurable allowlist for what gets logged. (@nicoloboschi, 3bedc1ce)
  • CLI now supports named connection profiles (via -p/--profile) for easier switching between environments. (@nicoloboschi, 70d60e96)
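The DeferOperation mechanism above follows a common worker pattern: an extension raises a dedicated exception, and the worker loop catches it and requeues the task instead of recording a failure. A generic sketch of the pattern (class and function names here are illustrative, not Hindsight's actual API):

```python
class DeferOperation(Exception):
    """Raised by an operation to ask the worker to requeue the task."""

def flaky_operation(task):
    # A hypothetical extension defers work it cannot complete yet,
    # e.g. because an upstream dependency is not ready.
    if not task.get("dependency_ready"):
        raise DeferOperation("dependency not ready")
    return "done"

def run_task(task, queue):
    try:
        return flaky_operation(task)
    except DeferOperation:
        queue.append(task)  # requeue instead of marking the task failed
        return "deferred"

queue = []
print(run_task({"dependency_ready": False}, queue))  # deferred
print(run_task({"dependency_ready": True}, queue))   # done
```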

Improvements

  • Reduce the default number of retries for LLM calls to fail faster when providers are erroring or unavailable. (@nicoloboschi, b52b483c)
  • Make reranker failures easier to diagnose and add a configurable timeout for the TEI reranker. (@octo-patch, 69383af8)

Bug Fixes

  • Fix crashes when using Jina MLX on Metal GPUs by serializing GPU operations. (@lkttle, 2e74a324)
  • Fix database migration path so upgrades from v0.4.22 to v0.5.x work correctly. (@nicoloboschi, 5437cc02)
  • Prevent orphaned observations if a source memory is deleted during consolidation. (@nicoloboschi, f9042e37)
  • Fix Ollama requests by explicitly disabling "think" mode in the native call payload. (@karl-8888, 7d4fd1aa)
  • Fix file retain uploads and prevent orphaned retained files. (@christerence, 9e30ae25)
  • Fix file retain API to correctly accept and map a provided "timestamp" field. (@christerence, 13f3052e)
  • Improve fairness across tenants when workers claim tasks, reducing the chance of noisy tenants starving others. (@cdbartholomew, a5e53721)
  • Ensure the mental model max_tokens setting is respected during refresh/reflect operations. (@nicoloboschi, 568e3c30)
  • Fix control-plane links by properly encoding bank IDs in URLs end-to-end. (@nicoloboschi, cbaec36f)
  • Make task submission idempotent when a payload is already set, preventing duplicate/failed submissions. (@nicoloboschi, 088dfecb)
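The idempotent-submission fix above reflects a standard guard: if a task's payload is already set, a repeat submission becomes a no-op rather than creating a duplicate or failing. A generic sketch of the idea (not Hindsight's actual implementation):

```python
tasks = {}  # task_id -> payload

def submit_task(task_id, payload):
    """Set the payload at most once; repeat submissions are no-ops."""
    if tasks.get(task_id) is not None:
        return False  # already submitted; nothing to do
    tasks[task_id] = payload
    return True

print(submit_task("t1", {"doc": "a"}))  # True: first submission wins
print(submit_task("t1", {"doc": "b"}))  # False: duplicate is ignored
print(tasks["t1"])                      # {'doc': 'a'}
```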

0.5.2

Features

  • Added a co-occurrence graph view for exploring entity relationships in the control plane. (@nicoloboschi, f64c5d20)
  • Added recall controls to the mental model trigger API/CLI so you can tune what gets recalled during runs. (@nicoloboschi, f2fc8f9f)
  • Async operations now expose task payload details and associated document IDs for better observability and debugging. (@nicoloboschi, 870bf4a3)

Improvements

  • Revamped the control plane bank statistics view for clearer insights. (@nicoloboschi, 34365c32)
  • Clients now send an identifying User-Agent header on all HTTP requests for easier server-side diagnostics. (@nicoloboschi, 9372462e)

Bug Fixes

  • Fixed consolidation retry budget handling so retries are correctly applied at the LLM call site. (@r266-tech, dee58139)
  • Fixed a crash during retain when embeddings and extracted facts counts didn’t match. (@nicoloboschi, dbd1d1a7)
  • Improved embedded mode cleanup stability by adding a timeout when acquiring the cleanup lock (prevents hangs). (@r266-tech, 6b5aa3af)
  • OpenClaw plugins now reliably register agent hooks on every entry invocation. (@nicoloboschi, 1be5ff33)
  • TypeScript SDK now re-exports BankTemplate types from the package root for simpler imports. (@mrkhachaturov, 581bbf3f)
  • Bank template configuration validation was aligned with configurable fields to prevent invalid/ignored settings. (@mrkhachaturov, 099f4c92)

0.5.1

Breaking Changes

  • OpenClaw now reads configuration from plugin config instead of environment variables. (e22ae05f)

Features

  • Added SiliconFlow as a supported reranker provider. (d0b2ab9a)
  • Added an interactive OpenClaw setup wizard with Cloud / API / Embedded modes. (87322396)
  • Added a config-aware CLI to backfill OpenClaw history. (72fd3d59)
  • Added OpenClaw session pattern filtering to ignore or treat sessions as stateless. (5a61ac50)
  • Added a Cloudflare OAuth proxy integration option for self-hosted Hindsight. (aad07a14)
  • Expanded the CLI to cover all OpenAPI endpoints and request-body parameters. (c05c491d)
  • Added a default bank template environment variable (HINDSIGHT_API_DEFAULT_BANK_TEMPLATE). (fc941d5c)
  • Added a daemon lifecycle package (@vectorize-io/hindsight-all) to simplify running the all-in-one daemon. (576016f5)
  • Added recallTags and recallTagsMatch configuration options to control which tagged memories are recalled. (b57e337f)

Improvements

  • Improved OpenClaw reliability with more resilient startup behavior and richer retain metadata. (1f1716bd)

Bug Fixes

  • OpenClaw setup wizard now prompts for the token value (not the env var name). (9679d813)
  • Fixed embedded mode daemon start/stop race that could terminate healthy daemons. (e5724fcb)
  • Fixed reranker initialization issues to show real import errors and avoid a Transformers 5.x race in jina-mlx. (f82f58fa)
  • Fixed worker consolidation slot accounting to respect the configured maximum concurrency. (2d74007d)
  • Improved CLI API error output by including the HTTP response body. (93300b91)
  • Fixed CLI memory listing showing "[UNKNOWN]" for fact types. (2635bbb4)
  • Fixed recall ranking so RRF ordering is preserved when the reranker is configured as a passthrough. (4f9cf15c)
  • Fixed retain chunk insertion to be idempotent and avoid repeated retries on integrity errors. (2d95f78b)
  • Fixed retain ANN seed temp table creation to run inside a transaction for better reliability. (3fc87e76)
  • Fixed LLM requests to use the correct max token parameter for reasoning models and Azure OpenAI. (7b2263ba)
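The RRF ordering fix above concerns reciprocal rank fusion, which merges multiple ranked result lists by giving each item a score of 1/(k + rank) per list it appears in; a passthrough reranker should leave that fused ordering untouched. A self-contained sketch of RRF scoring (the k=60 constant is the conventional default from the RRF literature, not necessarily what Hindsight uses):

```python
def rrf_merge(rankings, k=60):
    """Fuse ranked lists: each item scores the sum of 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, best first.
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["a", "b", "c"]   # e.g. from embedding search
keyword_hits = ["a", "c", "d"]  # e.g. from full-text search
print(rrf_merge([vector_hits, keyword_hits]))  # ['a', 'c', 'b', 'd']
```

Items that rank well in several lists ("a", and "c" here) float to the top even when no single list agrees on the full order.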

0.5.0

Breaking Changes

  • Removed BFS and MPFP graph retrieval strategies. LinkExpansionRetriever is now the sole graph retrieval algorithm, offering simpler, faster, and more accurate results. (ea834bc7)
  • Dropped the hindsight-hermes integration package. (cf0537ba)

Features

  • Built-in llama.cpp LLM provider for fully local inference without external API calls. (f74b577e)
  • Retain update_mode='append' for concatenating new content onto an existing document instead of replacing it. (3c633e5e)
  • OpenRouter support for LLM, embeddings, and reranking. (e5944b63)
  • Bank template import/export with Template Hub — export a bank's configuration, mental models, and directives as a reusable manifest, then import into other banks. (30a319a6)
  • Constellation view in the Control Plane — interactive, zoomable canvas visualization of entity relationship graphs with heat-gradient coloring and dark mode support. (36783df3)
  • Added detail parameter to list/get mental model endpoints for controlling response verbosity. (8d1bfbbd)
  • Added AutoGen integration (hindsight-autogen) for persistent long-term memory in AutoGen agents. (a757765a)
  • Added Paperclip integration (@vectorize-io/hindsight-paperclip) with Express middleware and process adapter modes for stateless agent memory. (81441ee9)
  • Added OpenCode persistent memory plugin for the OpenCode editor. (e1c6220f)
  • OpenClaw JSONL-backed retain queue for external API resilience — buffers retain calls locally when the API is unreachable. (087545cc)
  • OpenClaw now supports bankId for static bank configurations. (0e81d1a2)
  • Added Google embeddings and reranker provider support. (07de798c)
  • Added persistent volume support in Helm chart for local model cache. (cefa7554)
  • MCP server now includes a sync_retain tool and validates UUID inputs. (48185a4b)
  • Recall combined scoring now includes proof_count boost for better ranking. (26794aab)

Improvements

  • 3-phase retain pipeline restructures memory ingestion into pre-resolve, insert, and post-link phases, dramatically improving throughput under concurrent load by removing slow reads from write transactions. (914ba796)
  • Recall entity graph expansion now caps per-entity fanout and includes a timeout fallback, preventing slow queries on banks with high-fanout entities. (57f15445)
  • Fact serialization in think-prompt now includes occurred_end and mentioned_at for richer temporal context. (37348c85)
  • Consolidation observation quality improved with structured processing rules. (6f173b10)

Bug Fixes

  • LiteLLM SDK embeddings encoding_format is now configurable instead of hardcoded. (cece2c90)
  • Fixed out-of-range content_index crash in recall result mapping. (9790d904)
  • Experience fact types are now preserved correctly during normalization. (9cfdd464)
  • Clear memories endpoint no longer deletes the bank profile. (26a64cc0)
  • Embedding daemon clears stale processes on the port before starting. (7d6c570a)
  • Per-bank vector index migration now respects vector extension configuration. (4fd7c5d1)
  • Timeline group sort uses numeric date comparison instead of locale string comparison. (f3f2c6b0)
  • Resolved 25 test regressions from the streaming retain pipeline. (7415ebff)
  • MCP server now auto-coerces string-encoded JSON in tool arguments. (443c94c8)
  • Entity labels structure is now validated on PATCH to prevent invalid configurations. (7e23f8e1)
  • Fixed bank_id metric label to be opt-in, preventing OTel memory leak. (cf4bd598)
  • Fixed max_tokens handling for OpenAI-compatible endpoints with custom base URLs. (cd99eef4)
  • Fixed event_date AttributeError when date is None in fact extraction. (6cb309f7)
  • Query analyzer now handles dateparser internal crashes gracefully. (e0e65c44)
  • Embedding profile .env overwrite skipped when config has no Hindsight keys. (9e2890ba)
  • Windows compatibility fix for hindsight-embed. (f9fe6953)
  • Addressed critical and high severity security vulnerabilities in dependencies. (ee4510a7)

0.4.22

Features

  • API now supports passing custom LLM request parameters via the HINDSIGHT_API_LLM_EXTRA_BODY configuration. (ecaa1ad1)
  • Document metadata is now exposed through the API and control plane. (627ec5d5)
  • Added a /code-review skill for automated code quality checks against project standards. (bdb33c58)
  • ZeroEntropy reranker now supports a configurable base URL. (a915584e)
  • Codex can now retain structured tool calls from rollout files. (3461398b)

Improvements

  • Embeddings via the LiteLLM SDK can now optionally specify output dimensions. (f841bcb9)
  • API responses now include an X-Ignored-Params header to warn when unknown request parameters were ignored. (cef42d81)
  • OpenClaw CLI startup is faster by deferring heavy initialization until the service starts. (41025c3b)

Bug Fixes

  • Mental model triggers now support the full config schema, including tag matching and tag group filters. (2c32ffad)
  • Cohere reranking via Azure endpoints now works reliably (avoids 404 errors). (84985ee9)
  • Claude Code provider no longer defers to built-in tools, preventing MCP tool handling issues. (fa82efc8)
  • Recall endpoint now returns metadata correctly instead of dropping it from the response. (4768bf39)
  • Gemini 3.1+ tool calls now read thought signatures correctly. (1b5c262a)
  • First-person agent memories are now correctly classified as "experience" facts. (00961156)
  • Codex upgrades now preserve and merge new settings instead of skipping them. (b104bad0)
  • LlamaIndex integration fixes improve document ID handling, memory API behavior, and ReAct tracing. (d93dfea8)

0.4.21

Features

  • Added audit logging for feature usage tracking, including request duration in audit entries. (083295dc)
  • Added Hindsight memory integration for the OpenAI Codex CLI. (0b17a67c)
  • Added an MCP hook to filter tool visibility per user. (f8285b7b)
  • Added a per-bank limit setting to cap the number of observations stored per scope. (b32767ca)
  • Added native Windows support so Hindsight can run without Docker. (c5700ff5)
  • Added a 'none' LLM provider to support chunk-only storage without LLM calls. (9e5a066d)
  • Added a setup command/skill to register hooks more reliably. (22ca6a8d)
  • Hermes now supports file-based configuration. (0ff36548)
  • Added a LiteLLM-based provider to support Bedrock and many additional LLM providers. (db70fdbe)
  • Added support for Strands Agents SDK integration with Hindsight memory tools. (7fe773c0)
  • Added LlamaIndex integration. (2d787c4f)
  • Added AG2 framework integration. (73123870)
  • Added support for Ark and Volcano LLM providers. (417fac61)
  • Retain now supports delta mode to skip LLM processing for unchanged chunks on upsert. (fd88c0ef)
  • Claude Code integration can now retain full sessions with document upsert and configurable tags, and records tool calls as structured JSON. (2d31b67d)
  • MCP retain tool now supports selecting a retain strategy via a parameter. (4285e944)

Improvements

  • OpenClaw logging is now configurable and can emit structured output. (d441ab81)
  • Made inclusion of source facts in search observations configurable. (5095d5e3)
  • Integrations no longer use hardcoded default models, relying on configured defaults instead. (58e68f3e)

Bug Fixes

  • Improved MCP server compatibility by handling Claude Code GET probes and allowing stateless HTTP mode to be configured. (d8050387)
  • Per-bank vector index creation now respects the configured vector extension setting. (6488c9bc)
  • Verbose retain extraction now correctly includes the retain mission context. (d2965e64)
  • Codex integration no longer crashes on startup when the API quota is exhausted (HTTP 429). (111e8c70)
  • OpenAI embeddings client now correctly parses query parameters included in base_url. (a209ef1a)
  • Fixed tool_choice handling for Codex/Claude Code when forcing specific tool calls. (585ac76f)
  • OpenClaw auto-recall now supports a configurable timeout to prevent hangs. (cd4d449f)
  • Fixed control plane UI issues affecting recall and data viewing. (6bb83f46)
  • Recall responses now include associated metadata. (0bcbf849)
  • Python client update_bank_config() now exposes all configurable fields. (7c18723f)
  • API OpenAPI schema now correctly includes Pydantic v2 ValidationError fields. (939cb40a)
  • JSON-string tags are now coerced to lists for MemoryItem and MCP tools to prevent tagging errors. (c5273f5f)
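The tag-coercion fix above guards against clients sending tags as a JSON-encoded string rather than a list. The defensive pattern is common; sketched here generically (not Hindsight's exact code):

```python
import json

def coerce_tags(tags):
    """Accept a list of tags, a JSON-encoded list, or a bare string."""
    if isinstance(tags, str):
        try:
            parsed = json.loads(tags)
        except json.JSONDecodeError:
            return [tags]  # a plain string becomes a single tag
        if isinstance(parsed, list):
            return parsed
        return [tags]  # valid JSON but not a list: treat as one tag
    return list(tags)

print(coerce_tags(["a", "b"]))    # ['a', 'b']
print(coerce_tags('["a", "b"]'))  # ['a', 'b']
print(coerce_tags("urgent"))      # ['urgent']
```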

0.4.20

Features

  • Add a one-command setup CLI package for the NemoClaw integration. (d284de28)
  • Add a LangGraph integration for using Hindsight memory within LangGraph agents. (b4320254)
  • Add reflect filters to exclude specific fact types and mental model content during reflection. (ea662d06)
  • Introduce independent versioning for integrations so they can be released separately from the core server. (31f1c53c)
  • Add a Claude Code integration plugin. (f4390bdc)

Improvements

  • Add a wall-clock timeout to reflect operations so they don’t run indefinitely. (8ce06e3e)
  • Provide richer context when validating operations via the OperationValidator extension. (2eb1019d)
  • Make the hindsight-api package runnable directly via uvx by adding script entry points. (97f7a365)
  • Support passing query parameters during OpenAI-compatible client initialization for broader provider compatibility. (20e17f28)
  • Upgrade the default MiniMax model from M2.5 to M2.7. (1f1462a5)

Bug Fixes

  • Prevent context overflow during observation search by disabling source facts in results. (8e2e2d5b)
  • Fix Claude Code integration session startup by pre-starting the daemon in the background. (26944e25)
  • Fix Claude Code integration installation and configuration experience so setup is more reliable. (35b2cbb6)
  • Fix a memory leak in entity resolution that could grow over time under load. (e6333719)
  • Avoid crashes and retain failures when the Postgres pg_trgm extension is unavailable by handling detection/fallback correctly. (365fa3ce)
  • Strip Markdown code fences from model outputs across all LLM providers for more consistent parsing. (2f2db2a6)
  • Return a clear 400 error for empty recall queries and fix a SQL parameterization issue. (5cdc714a)
  • Ensure file retain requests include authentication headers so uploads work in authenticated deployments. (78aa7c53)
  • Fix MCP tool calls when MCP_AUTH_TOKEN and TENANT_API_KEY differ. (8364b9c5)
  • Allow claude-agent-sdk to install correctly on Linux/Docker environments. (3f31cbf5)
  • In LiteLLM mode, fall back to the last user message when no explicit hindsight query is provided. (5e8952c5)
  • Fix non-atomic async operation creation to prevent inconsistent operation records. (94cf89b5)
  • Prevent orphaned parent operations when a batch retain child fails unexpectedly. (43942455)
  • Fix failures for non-ASCII entity names by ensuring entity IDs are set correctly. (438ce98b)
  • Correctly store LLM facts labeled as "assistant" as "experience" in the database. (446c75f3)
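Stripping Markdown code fences (the fix above) typically means removing a leading ```lang line and a trailing ``` so the inner payload can be parsed as JSON. One minimal way to do it, sketched here as an assumption rather than Hindsight's actual implementation:

```python
import re

def strip_code_fences(text):
    """Remove a surrounding Markdown code fence, if one is present."""
    match = re.match(r"^```[\w-]*\n(.*?)\n?```\s*$", text.strip(), re.DOTALL)
    return match.group(1) if match else text

fenced = '```json\n{"ok": true}\n```'
print(strip_code_fences(fenced))          # {"ok": true}
print(strip_code_fences('{"ok": true}'))  # unfenced input is unchanged
```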

0.4.19

Features

  • TypeScript client now works in Deno environments. (72c25c97)
  • Added Agno integration to use Hindsight as a memory toolkit. (8c378b98)
  • Added Hermes Agent integration (hindsight-hermes) for persistent memory. (ef90842f)
  • Expanded retain behavior with new verbatim and chunks extraction modes and named retain strategies. (e4f8a157)

Improvements

  • Improved local reranker performance/efficiency with FP16 and bucketed batching, plus compatibility with Transformers 5.x. (e7da7d0e)

Bug Fixes

  • Prevented silent memory loss when consolidation fails (failed consolidations are tracked and can be recovered). (28dac7c7)
  • Fixed Docker control-plane startup to respect the configured control-plane hostname. (8a64dc8d)
  • Database cleanup migration now removes orphaned observation memory units to avoid inconsistent memory state. (f09ad9de)
  • Deleting a document now also deletes linked memory units to prevent leftover/stale memory entries. (f27bd953)
  • Fixed MCP middleware to send an Accept header, preventing 406 response errors in some setups. (836fd81e)
  • Improved compatibility with Gemini tool-calling by preserving thought signature metadata to avoid failures on gemini-3.1-flash-lite-preview. (21f9f46c)

0.4.18

Features

  • Add compound tag filtering using tag groups. (5de793ee)
  • Publish new slim Python packages (hindsight-api-slim and hindsight-all-slim) for smaller installs. (15ea23d5)
  • Add MiniMax as a supported LLM provider. (2344484f)
  • Add Jina MLX reranker provider optimized for Apple Silicon. (1caf5ec9)

Improvements

  • Allow configuring maximum recall query tokens via an environment variable. (66dedb8d)
  • Improve retrieval performance by switching to per-bank HNSW indexes. (43b3efc4)

Bug Fixes

  • Prevent reranking failures by truncating long documents that exceed LiteLLM reranker context limits. (eeb938fc)
  • Ensure recalled memories are injected as system context for OpenClaw. (b17f338e)
  • Ensure embedded profiles are registered in CLI metadata when the daemon starts. (06b0f74a)
  • Cancel in-flight async operations when a bank is deleted to avoid dangling work. (0560f626)

0.4.17

Features

  • Added a manual retry option for failed asynchronous operations. (dcaacbe4)
  • You can now change/update tags on an existing document. (1b4ad7f4)
  • Added history tracking and a diff view for mental model changes. (e2baca8b)
  • Added observation history tracking with a UI diff view to review changes over time. (576473b6)
  • File uploads can now choose a parser per request, with configurable fallback chains. (99220d05)
  • Added an extension hook that runs after file-to-Markdown conversion completes. (1d17dea2)

Improvements

  • Operations view now supports filtering by operation type and has more reliable auto-refresh behavior. (f7a60f89)
  • Added token limits for “source facts” used during consolidation and recall to better control context usage. (5d05962d)
  • Improved bank selector usability by truncating very long bank names in the dropdown. (1e40cd22)

Bug Fixes

  • Fixed webhook schema issues affecting multi-tenant retain webhooks. (32a4882a)
  • Fixed file ingestion failures by stripping null bytes from parsed file content before retaining. (cd3a6a22)
  • Fixed tool selection handling for OpenAI-compatible providers when using named tool_choice. (1cdfb7c2)
  • Improved consolidation behavior to prioritize a bank’s mission over an ephemeral-state heuristic. (00ccf0b2)
  • Fixed database migrations to correctly handle mental model embedding dimension changes. (7accac94)
  • Fixed file upload failures caused by an Iris parser httpx read timeout. (fa3501d4)
  • Improved reliability of running migrations by serializing Alembic upgrades within the process. (f88b50a4)
  • Fixed Google Cloud Storage authentication when using Workload Identity Federation credentials. (d2504ac5)
  • Fixed the bank selector to refresh the bank list when the dropdown is opened. (0ad8c2d0)

0.4.16

Features

  • Added Webhooks with consolidation.completed and retain.completed events. (abbf874d)

Improvements

  • Improved OpenClaw recall/retention controls. (d425e93c)
  • Improved search/reranking quality by switching combined scoring to multiplicative boosts. (aa8e5475)
  • Improved performance of observation recall by 40x on large banks. (ad2cf72a)
  • Improved server shutdown behavior by capping graceful shutdown time and allowing a forced kill on a second Ctrl+C. (4c058b4b)

Bug Fixes

  • Fixed an async deadlock risk by running database schema migrations in a background thread during startup. (e0a2ac63)
  • Fixed webhook delivery/outbox processing so transactions don’t silently roll back due to using the wrong database schema name. (75b95106)
  • Fixed observation results to correctly resolve and return related chunks using source_memory_ids. (cb6d1c46)
  • Fixed MCP bank-level tool filtering compatibility with FastMCP 3.x. (f17406fd)
  • Fixed crashes when an LLM returns invalid JSON across all retries (now handled cleanly instead of raising a TypeError). (66423b85)
  • Fixed observations without source dates to preserve missing (None) temporal fields instead of incorrectly populating them. (891c33b1)

0.4.15

Features

  • Added observation_scopes to control the granularity/visibility of observations. (55af4681)
  • List documents API now supports filtering by tags (and fixes the q parameter description). (1d70abfe)
  • Added PydanticAI integration for persistent agent memory. (cab5a40f)
  • Added richer entity label support (optional labels, free-form values, multi-value fields, and UI polish). (9b96becc)
  • Added support for timestamp="unset" so content can be retained without a date. (f903948a)
  • OpenClaw can now automatically retain the last n+2 turns every n turns (default n=10). (ad1660b3)
  • Added configurable Gemini/Vertex AI safety settings for LLM calls. (73ef99e7)
  • Added extension hooks to customize root routing and error headers. (e407f4bc)

Improvements

  • Improved recall performance by fetching all recall chunks in a single query. (61bf428b)
  • Improved recall/retain performance and scalability for large memory banks. (7942f181)

Bug Fixes

  • Fixed the TypeScript SDK to send null (not undefined) when includeEntities is false. (15f4b876)
  • Prevented reflect from failing with context_length_exceeded on large memory banks. (77defd96)
  • Fixed a consolidation deadlock caused by retrying after zombie processing tasks. (c2876490)
  • Fixed observations count in the control plane that always showed 0. (eaeaa1f2)
  • Fixed ZeroEntropy rerank endpoint URL and ensured the MCP retain async_processing parameter is handled correctly. (f6f1a7d8)
  • Fixed JSON serialization issues and logging-related exception propagation when using the claude_code LLM provider. (ecb833f4)
  • Added bank-scoped request validation to prevent cross-bank/invalid bank operations. (5270aa5a)

0.4.14

Features

  • Add Chat SDK integration to give chatbots persistent memory. (fed987f9)
  • Allow configuring which MCP tools are exposed per memory bank, and expand the MCP tool set with additional tools and parameters. (3ffec650)
  • Enable the bank configuration API by default. (4d030707)
  • Support filtering graph-based memory retrieval by tags. (0bb5ca4c)
  • Add batch observations consolidation to process multiple observations more efficiently. (0aa7c2b3)
  • Add OpenClaw options to toggle autoRecall and exclude specific providers. (3f9eb27c)
  • Add a ZeroEntropy reranker provider option. (17259675)

Improvements

  • Increase customization options for reflect, retain, and consolidation behavior. (2a322732)
  • Include source document metadata in fact extraction results. (87219b73)

Bug Fixes

  • Raise a clear error when embedding dimensions exceed pgvector HNSW limits (instead of failing later at runtime). (8cd65b98)
  • Fix multi-tenant schema isolation issues in storage and the bank config API. (b180b3ad)
  • Ensure LiteLLM embedding calls use the correct float encoding format to prevent embedding failures. (58f2de70)
  • Improve recall performance by reducing memory usage during retrieval. (9f0c031d)
  • Handle observation regeneration correctly when underlying memories are deleted. (ac9a94ad)
  • Fix reflect retrieval to correctly populate dependencies and enforce full hierarchical retrieval. (8b1a4658)
  • Fix OpenClaw health checks by passing the auth token to the health endpoint. (40b02645)

0.4.13

Features

  • Switched the default OpenAI LLM to gpt-4o-mini. (325b5cc1)
  • Observation recall now includes the source facts behind recalled observations. (5569d4ad)
  • Added CrewAI integration to enable persistent memory. (41db2960)

Bug Fixes

  • Fixed npx hindsight-control-plane failing to run. (0758827d)
  • Improved MCP compatibility by aligning the local MCP implementation with the server and removing the deprecated stateless parameter. (ea8163c5)
  • Fixed Docker startup failures when using named Docker volumes. (ac739487)
  • Prevented reranker crashes when an upstream provider returns an error. (58c4d657)
  • Improved accuracy of fact temporal ordering by reducing per-fact time offsets. (c3ef1555)
  • Client timeout settings are now properly respected. (dcaa9f14)
  • Fixed documents not being tracked when fact extraction returns zero facts. (f78278ea)

0.4.12

Features

  • Accept and ingest PDFs, images, and common Office documents as inputs. (224b7b74)
  • Add the Iris file parser for improved document parsing support. (7eafba66)
  • Add async Retain support via provider Batch APIs (e.g., OpenAI and Groq) for higher-throughput ingestion. (40d42c58)
  • Allow Recall to return chunks only (no memories) by setting max_tokens=0. (7dad9da0)
  • Add a Go client SDK for the Hindsight API. (2a47389f)
  • Add support for the pgvectorscale (DiskANN) vector index backend. (95c42204)
  • Add support for Azure pg_diskann vector indexing. (476726c2)

Improvements

  • Improve reliability of async batch Retain when ingesting large payloads. (aefb3fcf)
  • Improve AI SDK tooling to make it easier to work with Hindsight programmatically. (d06a0259)

Bug Fixes

  • Ensure document tags are preserved when using the async Retain flow. (b4b5c44a)
  • Fix OpenClaw ingestion failures for very large content (E2BIG). (6bad6673)
  • Harden OpenClaw behavior (safer shell usage, better HTTP mode handling, and more reliable initialization), including per-user banks support. (c4610130)
  • Improve Python client async API consistency and reduce connection drop issues via keepalive timeout fixes. (8114ef44)

0.4.11

Features

  • Added support for LiteLLM SDK as an embeddings and reranking provider. (e408b7e)
  • Expanded Postgres search support with additional text/vector extensions, including TimescaleDB pg_textsearch and vchord/pgvector options. (d871c30)
  • Added hierarchical configuration scopes (system, tenant, bank) for more flexible multi-tenant setup and overrides. (8d731f2)
  • Added reverse proxy/base-path support for running Hindsight behind a proxy. (93ddd41)
  • Added MCP tools to create, read, update, and delete mental models. (f641b30)
  • Added a "docs" skill for agents/tools to access documentation-oriented capabilities. (dd1e098)
  • Added an OpenClaw configuration option to skip recall/retain for specific providers. (fb7be3e)

Improvements

  • Improved LiteLLM gateway model configuration for more reliable provider/model selection. (7d95a00)
  • Exposed actual LLM token usage in retain results to improve cost/usage visibility. (83ca669)
  • Added user-initiated attribution to request context to improve async task and usage attribution. (90be7c6)
  • Added OpenTelemetry tracing for improved request traceability and observability. (69dec8e)
  • Helm chart: split TEI embedding and reranker into separate deployments for independent scaling and rollout. (43f9a8b)
  • Helm chart: added PodDisruptionBudgets and per-component affinity controls for more resilient scheduling. (9943957)

Bug Fixes

  • Fixed a recursion issue in memory retention that could cause failures or runaway memory usage. (4f11210)
  • Fixed Reflect API serialization/schema issues for "based_on" so reflections are returned and stored correctly. (f9a8a8e)
  • Improved MCP server compatibility by allowing extra tool arguments when appropriate and fixing bank ID resolution priority. (7ee229b)
  • Added missing trust_code environment configuration support. (60574ee)
  • Hardened the MCP server with fixes to routing/validation and more accurate usage metering. (e798979)
  • Fixed the slim Docker image to include tiktoken to prevent runtime tokenization errors. (6eec83b)
  • Fixed MCP operations not being tracked correctly for usage metering. (888b50d)
  • Helm chart: fixed GKE deployments overriding the configured HINDSIGHT_API_PORT. (03f47e2)

0.4.10

Features

  • Provided a slimmer Docker distribution to reduce image size and speed up pulls. (f648178)
  • Added Markdown support in Reflect and Mental Models content. (c4ef090)
  • Added built-in Supabase tenant extension for running Hindsight with Supabase-backed multi-tenancy. (e99ee0f)
  • Added TenantExtension authentication support to the MCP endpoint. (fedfb49)

Improvements

  • Improved MCP tool availability/routing based on the endpoint being used. (d90588b)

Bug Fixes

  • Stopped logging database usernames and passwords to prevent credential leaks in logs. (c568094)
  • Fixed OpenClaw sessions wiping memory on each new session. (981cf60)
  • Fixed hindsight-embed profiles not loading correctly. (0430588)
  • Fixed tagged directives so they correctly apply to tagged mental models. (278718d)
  • Fixed a cast error that could cause failures at runtime. (093ecff)

Other

  • Added a docker-compose example to simplify local deployment and testing. (5179d5f)

0.4.9

Features

  • New AI SDK integration. (7e339e1)
  • Add a Python SDK for running Hindsight in embedded mode (HindsightEmbedded). (d3302c9)
  • Add streaming support to the hindsight-litellm wrappers. (665877b)
  • Add OpenClaw support for connecting to an external Hindsight API and using dynamic per-channel memory banks. (6b34692)

Improvements

  • Improve the mental models experience in the control plane UI. (7097716)
  • Reduce noisy Hugging Face logging output. (34d9188)

Bug Fixes

  • Improve recall endpoint reliability by handling timeouts correctly and rejecting overly long queries. (dd621a6)
  • Improve /reflect behavior with Claude Code and Codex providers. (a43d208)
  • Fix OpenClaw shell argument escaping for more reliable command execution. (63e2964)

0.4.8

Features

  • Added profile support for hindsight-embed, enabling separate embedding configurations/workspaces. (6c7f057)
  • Added support for additional LLM backends, including OpenAI Codex and Claude Code. (539190b)

Improvements

  • Enhanced OpenClaw and hindsight-embed parameter/config options for easier configuration and better defaults. (749478d)
  • Added OpenClaw plugin configuration options to select LLM provider and model. (8564135)
  • Server now prints its version during startup to simplify debugging and support requests. (1499ce5)
  • Improved tracing/debuggability by propagating request context through asynchronous background tasks. (44d9125)
  • Added stronger validation and context for mental model create/refresh operations to prevent invalid requests. (35127d5)

Bug Fixes

  • Improved embedding CLI experience with richer logs and isolated profiles to avoid cross-contamination between runs. (794a743)
  • Operation validation now runs correctly in the worker process, preventing invalid background operations from slipping through. (96f0e54)
  • Fixed unreliable behavior when using a custom PostgreSQL schema. (3825506)

0.4.7

Features

  • Add extension hooks to validate and customize mental model operations. (9c3fda7)
  • Add support for using an external embedding API provider in OpenClaw plugin (with additional OpenClaw compatibility fixes). (4b57b82)

Improvements

  • Speed up container startup by preloading the tiktoken encoding during Docker image builds. (039944c)

Bug Fixes

  • Prevent PostgreSQL insert failures by stripping null bytes from text fields before saving. (ef9d3a1)
  • Fix worker schema selection so it uses the correct default database schema. (d788a55)
  • Honor an already-set HINDSIGHT_API_DATABASE_URL instead of overwriting it in the hindsight-embed workflow. (f0cb192)
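PostgreSQL text columns reject NUL (0x00) bytes, which is what the null-byte fix above addresses. A minimal sketch of that kind of sanitization before insert; not the actual Hindsight code.

```python
# PostgreSQL rejects NUL bytes in text values, so strip them
# before saving. Sketch of the cleanup described above.
def strip_null_bytes(value: str) -> str:
    return value.replace("\x00", "")

dirty = "log line\x00 with embedded NUL"
print(strip_null_bytes(dirty))  # log line with embedded NUL
```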

0.4.6

Improvements

  • Improved OpenClaw configuration setup to make embedding integration easier to configure. (27498f9)

Bug Fixes

  • Fixed OpenClaw embedding version binding/versioning to prevent mismatches when using the embed integration. (1163b1f)

0.4.5

Bug Fixes

  • Fixed occasional failures when retaining memories asynchronously with timestamps. (cbb8fc6)

0.4.4

Bug Fixes

  • Fixed async "retain" operations failing when a timestamp is provided. (35f0984)
  • Corrected the OpenClaw daemon integration name to "openclaw" (previously "openclawd"). (b364bc3)

0.4.3

Features

  • Add Vertex AI as a supported LLM provider. (c2ac7d0, 49ae55a)
  • Add Bearer token authentication for MCP and propagate tenant authentication across MCP requests. (0da77ce)

Improvements

  • CLI: add a --wait flag for consolidate and a --date filter for listing documents. (ff20bf9)

Bug Fixes

  • Fix worker polling deadlocks to prevent background processing from stalling. (f4f86e3)
  • Improve reliability of Docker builds by retrying ML model downloads. (ecc590c)
  • Fix tenant authentication handling for internal background tasks and ensure the control-plane forwards required auth to the dataplane. (03bf13e)
  • Ensure tenant database migrations run at startup and workers use the correct tenant schema context. (657fe02)
  • Fix control-plane graph endpoint errors when upstream data is missing. (751f99a)

Other

  • Rename the default bot/user identity from "moltbot" to "openclaw". (728ce13)

0.4.2

Features

  • Added Clawdbot/Moltbot/OpenClaw integration. (12e9a3d)

Improvements

  • Added additional configuration options to control LLM retry behavior. (3f211f0)
  • Added real-time logs showing a detailed timing breakdown during consolidation runs. (8781c9f)

Bug Fixes

  • Fixed hindsight-embed crashing on macOS. (c16ccc2)

0.4.1

Features

  • Added support for setting a non-default PostgreSQL schema as the default schema. (2b72e1f)

Improvements

  • Improved memory consolidation performance (benchmarking and optimizations). (b43ef98)

Bug Fixes

  • Fixed the /version endpoint returning an incorrect version. (cfcc23c)
  • Fixed mental model search failing due to UUID type mismatch after text-ID migration. (94cc0a1)
  • Added safer PyTorch device detection to prevent crashes in some environments. (67c4788)
  • Fixed Python packages exposing an incorrect version value. (fccbdfe)

0.4.0

This release introduces Observations, Mental Models, the new agentic Reflect, and Directives; read the announcement for details.

Features

  • Added support for providing a custom prompt for memory extraction. (3172e99)
  • Expanded the LiteLLM integration with async retain/reflect support, cleaner API, and support for tags/mission (including passing API keys correctly). (1d4879a)
  • Added a new worker service to run background tasks at scale. (4c79240)
  • MCP retain now supports timestamps. (b378f68)
  • Added support for installing skills via npx add-skill. (ec22317)

Improvements

  • CLI retain-files now accepts more file types. (1eeced3)

Bug Fixes

  • Fixed a macOS crash in the embed daemon caused by an XPC connection issue. (e5fc6ee)
  • Fixed occasional extraction in the wrong language. (87d4a36)
  • Fixed PyTorch model initialization issues that could cause startup failures (meta tensor/init problems). (ddaa5f5)

Features

  • Add memory tags so you can label and filter memories during recall/reflect. (20c8f8b)
  • Allow choosing different AI providers/models per operation. (e6709d5)
  • Add Cohere support for embeddings and reranking. (4de0730)
  • Add configurable embedding dimensions and OpenAI embeddings support. (70de23e)
  • Support custom base URLs for OpenAI-style embeddings and Cohere endpoints. (fa53917)
  • Add LiteLLM gateway support for routing LLM/embedding requests. (d47c8a2)
  • Add multilingual content support to improve handling and retrieval across languages. (c65c6a9)
  • Add delete memory bank capability. (4b82d2d)
  • Add backup/restore tooling for memory banks. (67b273d)
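Memory tags let recall narrow results to labeled memories. The sketch below shows the general filtering idea (keep memories carrying at least one requested tag); the data shapes are assumptions for illustration, not Hindsight's internal representation.

```python
# Illustrative tag filtering during recall: keep memories that
# carry at least one of the requested tags. Data shapes are
# assumptions for this sketch.
def filter_by_tags(memories: list[dict], tags: set[str]) -> list[dict]:
    return [m for m in memories if tags & set(m.get("tags", []))]

mems = [
    {"text": "standup notes", "tags": ["work"]},
    {"text": "trip ideas", "tags": ["personal"]},
]
print(filter_by_tags(mems, {"work"}))
```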

Improvements

  • Add retention modes to control how memories are extracted and stored. (fb31a35)
  • Add optional offline database migrations to support restricted/air-gapped deployments. (233bd2e)
  • Add database connection configuration options for more flexible deployments. (33fac2c)
  • Load .env automatically on startup to simplify configuration. (c06d9b4)
  • Expose an operation ID from retain requests so async/background processing can be tracked. (1dacd0e)
  • Add per-request LLM token usage metrics for monitoring and cost tracking. (29a542d)
  • Add LLM call latency metrics for performance monitoring. (5e1f13e)
  • Include tenant in metrics labels for better multi-tenant observability. (1ffc2a4)
  • Add async processing option to MCP retain tool for background retention workflows. (37fc7fb)
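Automatic `.env` loading, mentioned above, amounts to reading KEY=VALUE lines into the process environment without overwriting values already set. A minimal sketch of that behavior (real deployments typically use python-dotenv or equivalent; `EXAMPLE_KEY` is a made-up variable):

```python
# Minimal .env loading sketch: KEY=VALUE lines, '#' comments,
# existing environment variables are not overwritten.
import os

def load_env(text: str) -> None:
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

load_env("EXAMPLE_KEY=hello\n# a comment\n")
print(os.environ["EXAMPLE_KEY"])
```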

Bug Fixes

  • Fix extension loading in multi-worker deployments so all workers load extensions correctly. (f5f3fca)
  • Improve recall performance by batching recall queries. (5991308)
  • Improve retrieval quality and stability for large memory banks (graph/MPFP retrieval fixes). (6232e69)
  • Fix entities list being limited to 100 entities. (26bf571)
  • Fix UI only showing the first 1000 memories. (67c1a42)
  • Fix duplicated causal relationships and improve token usage during processing. (49e233c)
  • Improve causal link detection accuracy. (2a00df0)
  • Make retain max completion tokens configurable to prevent truncation issues. (7715a51)
  • Fix Python SDK not sending the Authorization header, preventing authenticated requests. (39e3f7c)
  • Fix stats endpoint missing tenant authentication in multi-tenant setups. (d6ff191)
  • Fix embedding dimension handling for tenant schemas in multi-tenant databases. (6fe9314)
  • Fix Groq free-tier compatibility so requests work correctly. (d899d18)
  • Fix security vulnerability (qs / CVE-2025-15284). (b3becb6)
  • Restore MCP tools for listing and creating memory banks. (9fd5679)
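The recall-batching fix above follows a generic pattern: group many queries into fixed-size batches to cut round trips. A sketch of that pattern, not Hindsight's internal implementation:

```python
# Generic batching pattern: split a list of queries into
# fixed-size groups so each group can be sent in one request.
from typing import Iterator

def batched(items: list, size: int) -> Iterator[list]:
    for start in range(0, len(items), size):
        yield items[start:start + size]

queries = [f"q{i}" for i in range(7)]
print([len(b) for b in batched(queries, 3)])  # [3, 3, 1]
```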

0.2.0

Features

  • Add additional model provider support, including Anthropic Claude and LM Studio. (787ed60)
  • Add multi-bank access and new MCP tools for interacting with multiple memory banks via MCP. (6b5f593)
  • Allow supplying custom entities when retaining memories via the retain endpoint. (dd59bc8)
  • Enhance the /reflect endpoint with max_tokens control and optional structured output responses. (d49e820)

Improvements

  • Improve local LLM support for reasoning-capable models and streamline Docker startup for local deployments. (eea0f27)
  • Support operation validator extensions and return proper HTTP errors when validation fails. (ce45d30)
  • Add configurable observation thresholds to control when observations are created/updated. (54e2df0)
  • Improve graph visualization in the control plane for exploring memory relationships. (1a62069)

Bug Fixes

  • Fix MCP server lifecycle handling so MCP lifespan is correctly tied to the FastAPI app lifespan. (6b78f7d)

0.1.15

Features

  • Add the ability to delete documents from the web UI. (f7ff32d)

Improvements

  • Improve the API health check endpoint and update the generated client APIs/types accordingly. (e06a612)

0.1.14

Bug Fixes

  • Fixed the embedded "get-skill" installer so installing skills works correctly. (0b352d1)

0.1.13

Improvements

  • Improve reliability by surfacing task handler failures so retries can occur when processing fails. (904ea4d)
  • Revamp the hindsight-embed component architecture, including a new daemon/client model and CLI updates for embedding workflows. (e6511e7)

Bug Fixes

  • Fix memory retention so timestamps are correctly taken into account. (234d426)

0.1.12

Features

  • Added an extensions system for plugging in new operations/skills (including built-in tenant support). (2a0c490)
  • Introduced the hindsight-embed tool and a native agentic skill for embedding/agent workflows. (da44a5e)

Improvements

  • Improved reliability when parsing LLM JSON by retrying on parse errors and adding clearer diagnostics. (a831a7b)

Bug Fixes

  • Fixed structured-output support for Ollama-based LLM providers. (32bca12)
  • Adjusted LLM validation to cap max completion tokens at 100 to prevent validation failures. (b94b5cf)

0.1.11

Bug Fixes

  • Fixed the standalone Docker image and control plane standalone build process so standalone deployments build correctly. (2948cb6)

0.1.10

This release contains internal maintenance and infrastructure changes only.

0.1.9

Features

  • Simplified local MCP installation and added a standalone UI option for easier setup. (1c6acc3)

Bug Fixes

  • Fixed the standalone Docker image so it builds and starts reliably. (b52eb90)
  • Improved Docker runtime reliability by adding required system utilities (procps). (ae80876)

0.1.8

Bug Fixes

  • Fix bank list responses when a bank has no name. (04f01ab)
  • Fix failures when retaining memories asynchronously. (63f5138)
  • Fix a race condition in the bank selector when switching banks. (e468a4e)

0.1.7

This release contains internal maintenance and infrastructure changes only.

0.1.6

Features

  • Added support for the Gemini 3 Pro and GPT-5.2 models. (bb1f9cb)
  • Added a local MCP server option for running/connecting to Hindsight via MCP without a separate remote service. (7dd6853)

Improvements

  • Updated the Postgres/pg0 dependency to a newer 0.11.x series for improved compatibility and stability. (47be07f)

0.1.5

Features

  • Added LiteLLM integration so Hindsight can capture and manage memories from LiteLLM-based LLM calls. (dfccbf2)
  • Added an optional graph-based retriever (MPFP) to improve recall by leveraging relationships between memories. (7445cef)

Improvements

  • Switched the embedded Postgres layer to pg0-embedded for a smoother local/standalone experience. (94c2b85)

Bug Fixes

  • Fixed repeated retries on 400 errors from the LLM, preventing unnecessary request loops and failures. (70983f5)
  • Fixed recall trace visualization in the control plane so search/recall debugging displays correctly. (922164e)
  • Fixed the CLI installer to make installation more reliable. (158a6aa)
  • Updated Next.js to patch security vulnerabilities (CVE-2025-55184, CVE-2025-55183). (f018cc5)

0.1.3

Improvements

  • Improved CLI and UI branding/polish, including new banner/logo assets and updated interface styling. (fa554b8)

0.1.2

Bug Fixes

  • Fixed the standalone Docker image so it builds/runs correctly. (1056a20)

Integration Changelogs

Integration  | Package                           | Description
LiteLLM      | hindsight-litellm                 | Universal LLM memory via LiteLLM (100+ providers)
Pydantic AI  | hindsight-pydantic-ai             | Persistent memory tools for Pydantic AI agents
CrewAI       | hindsight-crewai                  | Persistent memory for CrewAI agents
AI SDK       | @vectorize-io/hindsight-ai-sdk    | Memory integration for Vercel AI SDK
Chat SDK     | @vectorize-io/hindsight-chat      | Memory integration for Vercel Chat SDK
OpenClaw     | @vectorize-io/hindsight-openclaw  | Hindsight memory plugin for OpenClaw