Changelog

User-facing changes only. Internal maintenance and infrastructure updates are omitted.

0.8.1

Features

  • Add an API configuration flag to skip storing raw document text.·@nicoloboschi@nicoloboschi·1296e9fc1
  • Add an Oh-My-OpenAgent (OMO) integration for connecting Hindsight memory to OMO workflows.·@dcbouius@dcbouius·6dc56498c
  • Add a Cline integration that hooks into lifecycle events to save and retrieve Hindsight memory.·@benfrank241@benfrank241·66e58a23a
  • Add a Haystack integration to use Hindsight as a memory component in Haystack pipelines.·@DK09876@DK09876·394d66e60
  • Add a Cursor CLI integration for capturing and using Hindsight memory from Cursor workflows.·@Korayem@Korayem·dbfe83a2a

Improvements

  • Introduce an official Obsidian plugin to sync and use Hindsight memory from Obsidian.·@benfrank241@benfrank241·b0f86f9c0
  • Package the Roo Code integration as an installable PyPI CLI (hindsight-roo-code).·@benfrank241@benfrank241·644e37ac1

Bug Fixes

  • Fix vector index behavior to avoid forcing vchordrq.probes settings on vchord indexes that don't support it.·@nicoloboschi@nicoloboschi·109e1bd95
  • Ensure database maintenance routines are installed correctly when using the public schema.·@nicoloboschi@nicoloboschi·95d77233b

0.8.0

Features

  • Add periodic background maintenance to reconcile consolidation state and enforce retention across tenants.·@nicoloboschi@nicoloboschi·e77461762
  • Add durable progress snapshots for long-running consolidation and batch retain operations, so status can be resumed/inspected reliably.·@nicoloboschi@nicoloboschi·c94935bfa
  • Enable LLM request tracing by default (with per-bank support) and retain traces for 1 day for easier debugging/observability.·@nicoloboschi@nicoloboschi·f4a3329ce
  • Add semantic deduplication of near-duplicate observations during consolidation (enabled by default, except on Oracle).·@nicoloboschi@nicoloboschi·8aa31edd4
  • Store mental-model and observation history in dedicated database tables for better scalability and reliability.·@nicoloboschi@nicoloboschi·7e1145c08
  • Add a local ONNX embeddings provider so embeddings can be generated without a hosted embeddings API.·@shoveller@shoveller·b5a324b77
  • Add whole-bank export/import for cross-instance migration.·@nicoloboschi@nicoloboschi·602c9f55e
  • Add document export/import between banks without re-running the LLM (reusing previously extracted results).·@nicoloboschi@nicoloboschi·1d6d73bce
  • Add provider-agnostic prompt-prefix caching (default on) to reduce repeated LLM work for retain, consolidation, and reflect.·@cdbartholomew@cdbartholomew·d7a3aa526
  • Add global support for sending extra provider request fields via HINDSIGHT_API_LLM_EXTRA_BODY.·@nicoloboschi@nicoloboschi·01296d8d5
  • Honor HINDSIGHT_API_LLM_STRICT_SCHEMA across all JSON-schema capable LLM providers for stricter structured outputs.·@nicoloboschi@nicoloboschi·a3d3d42b3
  • Add Superagent safety middleware integration for safer agent execution.·@DK09876@DK09876·b70830218

Improvements

  • Improve client SDK trace visibility by exposing reflect tool/LLM call details in the Python and TypeScript wrappers.·@nicoloboschi@nicoloboschi·0db70bb88
  • Allow tuning recall behavior with configurable per-strategy retrieval boosts via environment configuration.·@nicoloboschi@nicoloboschi·01134047d
  • Allow tuning semantic recall sensitivity with a configurable minimum similarity threshold (HINDSIGHT_API_SEMANTIC_MIN_SIMILARITY).·@zwcf5200@zwcf5200·aa024a5cd
  • Avoid an unnecessary extra LLM call during reflect when a fresh mental model can be used directly.·@nicoloboschi@nicoloboschi·c255d3552
  • Speed up and reduce load from bank statistics queries via caching and more efficient computation.·@cdbartholomew@cdbartholomew·7683f2900
  • Improve visibility when audit logs or LLM requests are disabled by showing a clear "not enabled" splash in the control plane UI.·@nicoloboschi@nicoloboschi·283419280
  • Improve OpenCode integration observability with clearer debug logging, endpoint reporting, and surfaced errors.·@nicoloboschi@nicoloboschi·796a9eff9
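
As a rough illustration of what a minimum similarity threshold does during semantic recall, the sketch below drops candidates whose cosine similarity to the query falls under the threshold. Only the variable name `HINDSIGHT_API_SEMANTIC_MIN_SIMILARITY` comes from the entry above; the helper names, the plain-float format, and the `0.3` fallback are assumptions made for the example, not Hindsight's actual implementation.

```python
import math
import os


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_min_similarity(query_vec, candidates, min_similarity):
    """Keep (id, vector) candidates that meet the threshold, best first."""
    kept = []
    for cand_id, vec in candidates:
        score = cosine_similarity(query_vec, vec)
        if score >= min_similarity:
            kept.append((cand_id, score))
    return sorted(kept, key=lambda item: item[1], reverse=True)


def load_min_similarity(default=0.3):
    """Hypothetical loader: assumes the variable holds a plain float."""
    raw = os.environ.get("HINDSIGHT_API_SEMANTIC_MIN_SIMILARITY")
    return float(raw) if raw else default
```

Raising the threshold trades recall for precision: fewer loosely related memories come back, at the risk of missing paraphrased matches.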

Bug Fixes

  • Fix consolidation runs failing or truncating output by enforcing an explicit output token budget.·@xmh1011@xmh1011·c2524473e
  • Reduce slow/expensive temporal recall paths by limiting how many entries are scanned per fact type.·@nicoloboschi@nicoloboschi·4c33a4e55
  • Fix pgroonga BM25 search failures by escaping query text properly.·@xmh1011@xmh1011·6b8fc53d7
  • Preserve correct RRF (fusion) source ranks in retrieval traces/results.·@zwcf5200@zwcf5200·f62500193
  • Prevent the API from hanging indefinitely at startup by failing fast when model initialization blocks.·@nicoloboschi@nicoloboschi·4f5003480
  • Fix recall and mental-model history support when using the Oracle database backend.·@DK09876@DK09876·23710f4a8
  • Validate embedding vector dimensions before writing to pgvector to prevent storage/query errors.·@ai-ag2026@ai-ag2026·06c88e043
  • Improve retain concurrency safety by rechecking freshness before extraction and serializing concurrent writes to the same document.·@nicoloboschi@nicoloboschi·23168ebf6
  • Fix incorrect fact attribution during retain caused by bank routing key leakage.·@nicoloboschi@nicoloboschi·d695611ad
  • Fix mental model creation failures by ensuring the bank exists before inserting the model.·@r266-tech@r266-tech·56b4271d9
  • Fix external Postgres setups using vchord by adding required catalogs to the session search_path.·@nicoloboschi@nicoloboschi·049901802
  • Improve Docker standalone diagnostics by providing a clear error when pg0 bind-mount permissions prevent startup.·@nicoloboschi@nicoloboschi·75a7c19d6
  • Improve compatibility with LLM servers that mishandle tool_choice="required" by falling back to a safer tool-choice mode.·@nicoloboschi@nicoloboschi·e30f8af14
  • Fix Bedrock IAM authentication with the LiteLLM reranker by making the reranker API key optional.·@r266-tech@r266-tech·3c8ca47dd
  • Expose retain outcome metadata so users can see what happened during a retain operation.·@xmh1011@xmh1011·831f0efa1
  • Make bank config PATCH persist correctly even for banks that have never retained data yet.·@nicoloboschi@nicoloboschi·a809547aa
  • Prevent fact extraction prompt formatting from breaking extraction by removing Markdown bold markers.·@Oxygen56@Oxygen56·50b7eda2a
  • Improve recall stability/performance by gating VectorChord BM25 usage and applying per-source candidate caps.·@nicoloboschi@nicoloboschi·70d98c7a2
  • Fix control plane UI localization issues for operations and graph legends.·@MapleEve@MapleEve·a14ce623c
  • Improve transfer import reporting by including mental_model_history counts in CLI summaries.·@r266-tech@r266-tech·3346363d2
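
For readers unfamiliar with the RRF entry above, here is a minimal sketch of reciprocal rank fusion that also records each item's rank in every source, which is the kind of information a retrieval trace would carry. The function name, the source names, and the return shape are illustrative assumptions; `k=60` is the commonly used constant, not necessarily the value Hindsight uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists from several retrieval sources.

    rankings maps a source name to its IDs in best-first order.
    Returns (id, fused_score, source_ranks) tuples, best first.
    """
    scores = defaultdict(float)
    source_ranks = defaultdict(dict)
    for source, ids in rankings.items():
        for rank, item_id in enumerate(ids, start=1):
            # Each source contributes 1 / (k + rank) for the items it returned.
            scores[item_id] += 1.0 / (k + rank)
            # Keep the per-source rank so it can be reported alongside the result.
            source_ranks[item_id][source] = rank
    fused = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(item_id, score, source_ranks[item_id]) for item_id, score in fused]


fused = reciprocal_rank_fusion(
    {"semantic": ["m1", "m2", "m3"], "bm25": ["m2", "m4"]}
)
```

In this example `m2` comes first because both sources returned it, and its recorded ranks are 2 for `semantic` and 1 for `bm25`.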

0.7.2

Improvements

  • Improved graph maintenance performance for vector search seeding (faster ANN maintenance work).·@nicoloboschi@nicoloboschi·99c7367f
  • Improved vector search correctness/tuning by using cosine opclass for vchord ANN and applying backend-specific tuning settings.·@isac322@isac322·e4686b92
  • Reduced consolidation retry backoff to recover faster from transient failures.·@nicoloboschi@nicoloboschi·f49b85c0

Bug Fixes

  • Fixed a control-plane i18n redirect loop by pinning the Next.js version.·@nicoloboschi@nicoloboschi·201f5d7c
  • Limited native ML thread pools to available CPU cores to prevent resource overuse and instability.·@nicoloboschi@nicoloboschi·0a2ee845
  • Fixed PostgreSQL upgrade issues for v0.7.x deployments.·@nicoloboschi@nicoloboschi·b7f267b0
  • Fixed backup/restore to include previously missing tables.·@nicoloboschi@nicoloboschi·867b7b4a
  • Made memory consolidation more reliable under concurrency by improving scope-aware locking/dispatch and atomic submission behavior.·@nicoloboschi@nicoloboschi·08ce8176
  • Fixed retention of oversized documents to prevent incorrect chunk indexing and loss of full document content.·@nicoloboschi@nicoloboschi·364ccf17
  • Improved retain/recall robustness for unusual Unicode inputs (special-token literals and lone surrogates).·@nicoloboschi@nicoloboschi·4bc7013e
  • Fixed a race condition when inserting memory links during retain operations.·@nicoloboschi@nicoloboschi·8cf0dcbf
  • Fixed Windows embedded daemon launch behavior to avoid opening an extra terminal tab.·@nicoloboschi@nicoloboschi·4280ac3f
  • Improved embedded Postgres stop/restart reliability by updating the embedded DB dependency.·@nicoloboschi@nicoloboschi·8d9000a8
  • Fixed CLI memory retain timestamp handling and corrected supported fact-type values.·@slayoffer@slayoffer·df73c792
  • Fixed directive listing/reflect to honor tag groups so filtering behaves as expected.·@nicoloboschi@nicoloboschi·fb554664
  • Fixed daemon networking to honor an explicitly configured host and port.·@Sanderhoff-alt@Sanderhoff-alt·77380211
  • Isolated the Claude Code provider subprocess from user plugins to prevent interference and improve reliability.·@nicoloboschi@nicoloboschi·bcae23d9
  • Improved daemon startup reliability by waiting for health before reclaiming its port.·@s09x@s09x·5e547f71
  • Made worker retry behavior respect the configured maximum retries.·@aaronwestphal@aaronwestphal·85f6769e
  • Prevented retain from silently dropping memories when fact extraction fails.·@nicoloboschi@nicoloboschi·6e734e1a
  • Fixed control-plane auth redirects to validate return targets and honor basePath (improving security and correct routing).·@nicoloboschi@nicoloboschi·9571a341
  • Preserved raw reranker scores for providers that already return calibrated [0,1] scores.·@nicoloboschi@nicoloboschi·18b9c596
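
One possible reading of the reranker score entry above, sketched under assumptions: scores that already lie in [0,1] are passed through untouched, and anything else (for example raw logits) is mapped to [0,1] by rank. The function name and the exact rank formula are hypothetical and are not taken from the Hindsight code.

```python
def normalize_rerank_scores(scores):
    """Return scores in [0, 1], leaving already-calibrated scores unchanged."""
    if not scores:
        return []
    # Assumption: "calibrated" is approximated as "every score is within [0, 1]".
    if all(0.0 <= s <= 1.0 for s in scores):
        return list(scores)
    n = len(scores)
    # Indices sorted from best to worst score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    normalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Best item gets 1.0, then scores step down evenly by rank.
        normalized[idx] = 1.0 - rank / n
    return normalized
```

Rank-based normalization discards the distance between scores, which is why passing calibrated scores through unchanged keeps more information.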

0.7.1

Features

  • Allow configuring LLM "reasoning effort" via environment variable.·@s9rkn@s9rkn·1890d2b7
  • Add per-provider HTTP timeout environment variables for reranker requests.·@nicoloboschi@nicoloboschi·a510b07a
  • Add additional Chinese locale variants in the control plane UI.·@MapleEve@MapleEve·617939d8

Improvements

  • Schedule consolidation work using bank priorities so higher-priority banks are processed first.·@nicoloboschi@nicoloboschi·cf637799
  • Expose "graph_maintenance" in the control plane operations type filter dropdown.·@nicoloboschi@nicoloboschi·691cb539

Bug Fixes

  • Make consolidation more reliable by retrying indefinitely with backoff and preventing duplicate work per bank.·@nicoloboschi@nicoloboschi·0d2ba56f
  • Fixed memory corruption when retaining large documents that exceeded a single LLM call, which could cause some extracted memories to be lost. Re-retain affected documents after upgrading.·@nicoloboschi@nicoloboschi·74525cc0
  • Improve Codex OAuth embeddings stability by adding token refresh support and follow-up fixes.·@DK09876@DK09876·ffa6fbf2
  • Fix recall recency scoring to anchor timestamps to the query time for more consistent results.·@Sanderhoff-alt@Sanderhoff-alt·2de19e57
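
To make the recency fix above concrete, the sketch below scores every candidate against one explicit `query_time`, so all memories in a single recall are measured from the same anchor. The exponential decay and the 30-day half-life are assumptions chosen for illustration; the entry does not say which decay function Hindsight uses.

```python
from datetime import datetime, timedelta, timezone


def recency_score(event_time, query_time, half_life_days=30.0):
    """Score in (0, 1] that halves every half_life_days before query_time."""
    age_seconds = (query_time - event_time).total_seconds()
    # Events dated after the query are treated as brand new rather than boosted.
    age_days = max(age_seconds, 0.0) / 86400.0
    return 0.5 ** (age_days / half_life_days)


query_time = datetime(2025, 1, 31, tzinfo=timezone.utc)
one_month_old = recency_score(query_time - timedelta(days=30), query_time)
```

Because the anchor is passed in rather than read from the clock inside the function, repeating the same query with the same `query_time` gives the same scores.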

0.7.0

Improvements

  • Added configurable BM25 language/tokenization and optional PGroonga backend for better multilingual search tuning.·@nicoloboschi@nicoloboschi·cb04cb79
  • Added an environment variable to enable API access logging (Uvicorn access logs).·@nicoloboschi@nicoloboschi·25387083
  • Control plane now exposes a UI action to clear the mental model.·@nicoloboschi@nicoloboschi·6e9b741b
  • Improved observation consolidation quality, including better handling of temporal reasoning.·@nicoloboschi@nicoloboschi·d1ef9da9
  • Control plane now supports internationalization with 8 locales.·@nicoloboschi@nicoloboschi·486c3a8b

Bug Fixes

  • Fixed stale outgoing links after deletions by recomputing links asynchronously.·@nicoloboschi@nicoloboschi·cc3ba4a3
  • Fixed Codex configuration to ignore inherited base URLs and automatically refresh OAuth access tokens.·@benfrank241@benfrank241·795c081d
  • Stopped sending the temperature parameter to Anthropic requests to avoid API incompatibilities.·@nicoloboschi@nicoloboschi·dabbf9ff
  • Prevented duplicate webhook batch deliveries for retain events.·@xmh1011@xmh1011·6348f424
  • Improved reliability for large retain requests by splitting oversized single items in batch retains.·@nicoloboschi@nicoloboschi·c3b2b154
  • Fixed Ollama support by adding an Ollama Cloud provider and correcting authentication for cloud endpoints.·@nicoloboschi@nicoloboschi·cb037290
  • Ensured disabled tools are not included in the agent system prompt.·@nicoloboschi@nicoloboschi·2582b45a
  • Avoided retries when an embeddings provider returns invalid vector dimensions.·@ai-ag2026@ai-ag2026·e1e1a5e0
  • Prevented prompt formatting errors by escaping literal braces in all user-supplied prompt fields.·@nicoloboschi@nicoloboschi·67ae2a41
  • Fixed issues with mental-model baseline refresh and capped history size to prevent database JSON overflow.·@xmh1011@xmh1011·44b34c89
  • Enabled gzip handling to keep graph responses parseable when compressed.·@nicoloboschi@nicoloboschi·31d1e172
  • Added per-operation LLM concurrency caps to prevent overload and improve stability under parallel workloads.·@nicoloboschi@nicoloboschi·daf2348b
  • Preserved tag group/triggers correctly when updating tags.·@xmh1011@xmh1011·9c161e4e
  • Improved reranking robustness by detecting pre-normalized scores and applying rank-based normalization when needed.·@xuli500177@xuli500177·dcf5588e
  • Fixed entity resolution so user-defined label entities are not fuzzy-matched incorrectly.·@nicoloboschi@nicoloboschi·46dd2dfd
  • Improved worker resilience by handling stale pending schema routines.·@xmh1011@xmh1011·592f01bb
  • Fixed embedded UI startup on Windows by resolving the correct npx path before launch.·@tuancookiez-hub@tuancookiez-hub·2e5186a6
  • Improved control-plane session security by verifying signed session cookies instead of just checking presence.·@nicoloboschi@nicoloboschi·878ef957
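
The brace-escaping fix above addresses a general Python pitfall: when user text is spliced into a template that is later passed through `str.format`, any literal `{` or `}` in that text is parsed as a placeholder. A minimal sketch, with a hypothetical template and helper names that are not taken from the Hindsight code:

```python
def escape_braces(text):
    """Double literal braces so str.format treats them as plain characters."""
    return text.replace("{", "{{").replace("}", "}}")


def build_template(user_instructions):
    # The user text becomes part of the template itself, so it must be escaped
    # before the remaining placeholder is filled in.
    return "Instructions: " + escape_braces(user_instructions) + "\nDocument: {document}"


user_text = 'Return JSON like {"name": "..."}'
prompt = build_template(user_text).format(document="Alice moved to Paris.")
```

Without the escaping step, formatting the same template raises an error because `"name"` is interpreted as a field name.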

0.6.2

Features

  • Added a configurable MCP request timeout to the Claude Code integration so long-running recall/reflect calls no longer time out at the default.·@rsaulo@rsaulo·55ef7067

Improvements

  • Renamed the Agent SDK agent_knowledge_recall max_results parameter to max_tokens and raised the default from 10 to 1024 so default-config users get useful recall results.·@r266-tech@r266-tech·d9dd1499
  • Applied the same max_results → max_tokens rename to the Claude Code MCP plugin so both integrations stay in sync.·@offendingcommit@offendingcommit·909a4fd4

Security

  • Bumped vulnerable dependencies across the stack to address GitHub-reported CVEs — including litellm to ≥1.83.14, urllib3 to 2.7.0, and additional high/critical pip and npm packages.·@dcbouius@dcbouius·fd05bdab

Bug Fixes

  • Fixed the CLI “memory retain” timestamp/Event Date option so the provided timestamp is actually sent to the API.·@benfrank241@benfrank241·51ea9aa2
  • Repaired a database migration issue affecting mental model subtype data at the current schema head.·@benfrank241@benfrank241·debbd919
  • Improved migration robustness by handling transient database OID errors during the embedding-dimension migration.·@dcbouius@dcbouius·9dfbfb4b
  • Fixed Claude Code knowledge get_page calls to request detail=content and avoid tool-result spillover so full page content returns reliably.·@cdbartholomew@cdbartholomew·b2a693ab
  • Applied the same agent_knowledge_get_page detail=content fix to the Agent SDK so SDK callers also receive full page content.·@r266-tech@r266-tech·b593d40f
  • Aligned the Paperclip integration with Paperclip's actual event payload shape, restoring correct fact extraction from incoming events.·@amirhmoradi@amirhmoradi·be908d5b

0.6.1

Features

  • Added AlloyDB ScaNN vector index support, providing a high-performance vector index option for AlloyDB deployments·@can1357@can1357·e4422a9b
  • Added map-type entity labels so entity extraction can produce structured groups (e.g. address fields) instead of flat strings·@nicoloboschi@nicoloboschi·2b725bc4
  • Added z.ai (智谱) as a first-class LLM provider with a free-tier-friendly default model·@Burgunthy@Burgunthy·4c75cd9e
  • Added a litellmrouter provider that lets the API call multiple LLM endpoints with automatic fallback chains·@nicoloboschi@nicoloboschi·98f33cbf
  • Added an optional read-only database backend for recall queries, letting recall traffic go to a replica while writes stay on the primary·@cdbartholomew@cdbartholomew·cf9b1f59
  • Added optional access-key login to the Control Plane for protecting the admin UI without a full SSO setup·@ariel-ai-bot@ariel-ai-bot·c0ff87ea
  • Added a --strategy flag to memory retain-files in the CLI for selecting how files are split before retention·@nicoloboschi@nicoloboschi·375747f5
  • Bank dropdown in the Control Plane now shows per-bank memory counts to make bank selection easier·@nicoloboschi@nicoloboschi·5a0cb4a5

Improvements

  • Worker progress-stats fanout is now scoped to schemas with pending work, eliminating wasted polling on idle tenants·@nicoloboschi@nicoloboschi·39d31ad2

Bug Fixes

  • Fixed fact extraction reliability by removing multiplicative retry layers that could turn one transient LLM failure into many·@nicoloboschi@nicoloboschi·a22e8bdd
  • Fixed reflect to read document metadata from the original retain parameters so reflect outputs include the right document context·@nepenth@nepenth·f2a2f9fe
  • Failed batch_retain operations now propagate the child error_message to the parent, so failures surface a meaningful reason instead of a generic error·@cdbartholomew@cdbartholomew·7b82d05b
  • Fixed macOS daemon mode by replacing os.fork() with subprocess.Popen, restoring compatibility with PyTorch MPS·@nicoloboschi@nicoloboschi·8d77976a
  • Hardened Control Plane access-key authentication and improved login UX (clearer errors, redirect handling)·@nicoloboschi@nicoloboschi·b628716f
  • Reduced retain memory pressure by clearing content references after use, helping long-running workers avoid OOM on large batches·@nicoloboschi@nicoloboschi·c9145805
  • Fixed Docker image to chmod 755 /home/hindsight so containers run cleanly with --user UID:GID overrides·@nicoloboschi@nicoloboschi·a5cef602
  • Hard-pinned meta packages (hindsight-api, hindsight-all, hindsight-all-slim) to matching hindsight-api-slim, preventing stale slim installs after upgrade·@nicoloboschi@nicoloboschi·a86d5381
  • Worker now probes pg_proc before calling the optional schemas_with_pending_work() function, so older databases without it don't crash the worker·@nicoloboschi@nicoloboschi·a1c1b7de
  • TypeScript client now derives CLIENT_VERSION via tsup define so the published package version is always reported correctly·@nicoloboschi@nicoloboschi·ab8cc3e6
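
The macOS daemon fix above swaps `os.fork()` for `subprocess.Popen`. The difference is that a forked child inherits a copy of the parent's in-process state, while `Popen` starts a fresh process. A minimal sketch of the second approach, where the helper name and the placeholder command are assumptions and not the actual daemon launcher:

```python
import subprocess
import sys


def spawn_background(args):
    """Start a child in a fresh process instead of forking the current one."""
    return subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the parent's session (POSIX only)
    )


if __name__ == "__main__":
    # Placeholder command; a real launcher would start the API server here.
    proc = spawn_background([sys.executable, "-c", "print('daemon placeholder')"])
    print("child pid:", proc.pid)
    proc.wait()
```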

0.6.0

Features

  • Added Dify integration providing Hindsight memory tools inside Dify workflows.·@benfrank241@benfrank241·bc23750b
  • Added n8n community node package for using Hindsight memory in n8n automations.·@benfrank241@benfrank241·c1eaf711
  • Added SmolAgents integration providing Hindsight memory tools for SmolAgents-based agents.·@benfrank241@benfrank241·8314de5e
  • Added AWS Bedrock AgentCore Runtime integration (hindsight-agentcore).·@benfrank241@benfrank241·c91696f5
  • Added an Oracle Database backend option for enterprise storage.·@DK09876@DK09876·50f559c9
  • Added the ability to cancel long-running async operations.·@nicoloboschi@nicoloboschi·7f30dcc7
  • TypeScript client now supports AbortSignal across all methods for request cancellation.·@harryplusplus@harryplusplus·8367930c

Improvements

  • Python client now exposes retain_async via retain()/aretain() for async memory retention flows.·@harryplusplus@harryplusplus·daae8223
  • Anthropic provider now supports environment-configurable max retries and default headers.·@TuftyBruno@TuftyBruno·fa4bf700
  • Memories timeseries stats can now be grouped by different time fields (time_field toggle).·@aliu-ronin@aliu-ronin·cf1a97ab
  • Improved database performance for observation reads by reducing overhead and making reads backend-aware.·@nicoloboschi@nicoloboschi·d8ec2d7f

Bug Fixes

  • Fixed search ranking by correcting BM25 score direction for the vchord backend.·@liling@liling·b322b0c5
  • Recall now correctly inherits observation entities through source_memory_ids, improving entity continuity across related memories.·@youchi1@youchi1·8507095a
  • Entity co-occurrence timestamps now use the event date instead of the ingestion time, improving timeline accuracy.·@aliu-ronin@aliu-ronin·fc624cbf
  • Daemon mode now respects configured host settings (CLI --host and HINDSIGHT_API_HOST).·@nicoloboschi@nicoloboschi·3d3aa76b
  • Webhooks are now processed tenant-aware to prevent cross-tenant routing issues.·@cdbartholomew@cdbartholomew·b9069c28
  • API per-document graph and counts now include observations, improving completeness of document-level views.·@youchi1@youchi1·7cc2daf4
  • Improved compatibility with OpenAI-compatible providers by hardening JSON response parsing and handling null content.·@voarsh2@voarsh2·bc14e5c4
  • Fixed Windows worker startup by handling platforms that don't implement add_signal_handler.·@nicoloboschi@nicoloboschi·08b56fdc
  • Retain/batch retain reliability improved by fixing transactional atomicity, recovery checkpoint scoping, and a cascade deadlock risk.·@cdbartholomew@cdbartholomew·f4ca3038
  • TypeScript client now exposes previously missing recall/reflect parameters.·@nicoloboschi@nicoloboschi·641b3912
  • MCP recall tool now exposes tag_groups so recalls can be filtered/grouped by tag groups.·@nicoloboschi@nicoloboschi·b948b574
  • Embed/daemon startup reliability improved (script path resolution, correct extras when spawning API, and safer subprocess output handling).·@nicoloboschi@nicoloboschi·0f15f76a
  • Configuration logging now redacts database URLs to avoid leaking credentials in logs.·@xmh1011@xmh1011·2bada2db
  • Fixed support for OpenAI-compatible embedding dimensions by allowing provider-specific parameters.·@zwcf5200@zwcf5200·324b4b0a
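
As an illustration of the credential redaction mentioned above, the following sketch masks the password portion of a database URL before it is logged. The function name and the `***` mask are assumptions; the entry does not describe how Hindsight formats the redacted value.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_database_url(url):
    """Replace the password in a URL with '***'; other parts are kept."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    user = parts.username or ""
    # Rebuild the network location with the password masked.
    netloc = f"{user}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


safe = redact_database_url("postgresql://hindsight:s3cret@db.example.com:5432/hindsight")
```

This simple version does not handle bracketed IPv6 hosts; it is meant only to show the idea.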

0.5.6

Bug Fixes

  • Reverted the JSON schema simplification introduced in 0.5.5 (5b1c3486). The change inlined $ref/$defs into a large blob that caused some LLMs (notably GPT-4o-mini in soft-enforcement mode) to echo the schema structure instead of producing valid responses, resulting in 0 facts extracted during retain. If you are on 0.5.5, upgrade to 0.5.6 immediately.

0.5.5

Warning

0.5.5 contains a regression that can cause 0 facts to be extracted during retain. The JSON schema simplification change (5b1c3486) was reverted in 0.5.6. Please upgrade to 0.5.6.

Features

  • Added a new LlamaParse file parser, using the LlamaIndex Cloud parsing API to convert documents (PDF, DOCX, etc.) to markdown before retain.·@nicoloboschi@nicoloboschi·91106f30
  • Added a new Mental Models list view and tag filtering from mental models.·@nicoloboschi@nicoloboschi·8fbe85f0
  • Added Pipecat voice AI pipeline integration so voice agents can use Hindsight long-term memory.·@benfrank241@benfrank241·f7cc9ad6
  • Added an option to force IPv4 for Gemini embeddings to improve compatibility in restricted network environments.·@connorblack@connorblack·6fb8ac97
  • Added support to control Cohere embedding output dimensions via configuration.·@nicoloboschi@nicoloboschi·a7514e18
  • Added a new Document Chunks API plus document reprocessing and a richer document detail experience in the UI.·8eb6e0e5b
  • Added an exclude_parents filter for list operations to better control what results are returned.·@nicoloboschi@nicoloboschi·8f6e0e5b
  • Operation responses now include retry information (retry count and next retry time).·@cdbartholomew@cdbartholomew·45f47a91
  • Retain results now include processed content token counts for better usage/throughput visibility.·@cdbartholomew@cdbartholomew·9c9d7917

Improvements

  • Added full Windows support for the embedded runtime and improved how the local API is discovered/launched.·@nicoloboschi@nicoloboschi·4ba54d8c
  • Operation status reporting now includes processing and cancelled states across the API and UI.·@nicoloboschi@nicoloboschi·80982da5
  • Workers can reserve per-operation slots when claiming tasks, improving fairness and throughput under load.·@nicoloboschi@nicoloboschi·c81e62ae
  • PostgreSQL connections now support a configurable statement timeout to prevent runaway queries.·@nicoloboschi@nicoloboschi·bdb3a55d
  • Improved LLM interoperability by simplifying JSON schemas and avoiding problematic tool-choice defaults. Reverted in 0.5.6 — caused 0 facts extracted with some LLMs.·@nicoloboschi@nicoloboschi·5b1c3486
  • Worker scheduling no longer allows child tasks to block parent execution.·@nicoloboschi@nicoloboschi·a49d19cd

Bug Fixes

  • Fixed DeepSeek compatibility issues (including tool-calling quirks) and added it as a first-class LLM provider.·@nicoloboschi@nicoloboschi·461b00d4
  • Fixed Amazon Bedrock Converse compatibility by adjusting how causal relations are represented.·@nicoloboschi@nicoloboschi·4bc772d8
  • GET /banks/{bank_id}/profile no longer creates a bank as a side effect.·@cdbartholomew@cdbartholomew·99a89789
  • Reduced memory fan-out during consolidation recall to prevent excessive retrieval and improve stability.·@nicoloboschi@nicoloboschi·4ba2fffe
  • External PostgreSQL connection strings now accept asyncpg-style URLs.·@nicoloboschi@nicoloboschi·db7f4921
  • Timeseries stats buckets now return timezone-aware ISO timestamps.·@aliu-ronin@aliu-ronin·cd1ab497
  • Fixed conversation storage when using streaming LLM responses via LiteLLM.·@DK09876@DK09876·ac5181f5
  • Removed the MCP delete_memory tool to close an authorization-bypass vulnerability.·@nicoloboschi@nicoloboschi·90674aef
  • First-time UI launch no longer requires manual confirmation to install the control plane.·@bwjoke@bwjoke·33aacf5c
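
The timezone-aware timestamp fix above matters because a naive ISO string carries no offset, leaving clients to guess the timezone. A short sketch of the difference using the standard library; the helper name and the choice to treat naive values as UTC are assumptions for the example:

```python
from datetime import datetime, timezone


def bucket_timestamp(dt):
    """Return an ISO 8601 string with an explicit UTC offset."""
    if dt.tzinfo is None:
        # Assumption for this sketch: naive values are already in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat()


naive = datetime(2025, 3, 1, 12, 0, 0)
naive_text = naive.isoformat()        # no offset in the output
aware_text = bucket_timestamp(naive)  # offset is part of the output
```

Here `naive_text` is `2025-03-01T12:00:00`, while `aware_text` is `2025-03-01T12:00:00+00:00`.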

0.5.4

Features

  • Delta mental model refresh now scopes recall to memories created since the last refresh, making updates faster and more accurate.·@nicoloboschi@nicoloboschi·e90cfa4a
  • OpenAI-compatible embedding providers now support configurable batch sizes for better throughput.·@r266-tech@r266-tech·30700de6
  • Embedded daemon now includes a liveness check that auto-recovers from crashes.·@nicoloboschi@nicoloboschi·59f9a2bf
  • Disable daemon idle timeout by default so embedded instances stay alive between requests.·@nicoloboschi@nicoloboschi·f5dfe59b
  • Add {user_id} template variable for retainTags in the Claude Code integration.·@soichisumi@soichisumi·9181c9a2
  • New decommission-workers and worker-status admin CLI commands for managing worker fleets.·@nicoloboschi@nicoloboschi·c8b898bd

Bug Fixes

  • Fix duplicate memory units caused by chunk index scrambling during concurrent upserts.·@nicoloboschi@nicoloboschi·511ca723
  • Prevent directive content from leaking into reflect answers on empty banks.·@nicoloboschi@nicoloboschi·3d877b05
  • Honor the reflect_mission identity framing in the prompt builder so agent personality works correctly.·@nicoloboschi@nicoloboschi·a3b0d265
  • Allow reflect-specific LLM configuration when the default LLM provider is disabled.·@zwcf5200@zwcf5200·afd00c03
  • Preserve document created_at timestamp across upsert and add UI edit flow for documents.·@nicoloboschi@nicoloboschi·10785666
  • Pass ensure_ascii=False to json.dumps for LLM prompts so non-Latin text is preserved.·@harryplusplus@harryplusplus·d05b49a2
  • Route update_bank through the config resolver with generic config_updates.·@nicoloboschi@nicoloboschi·abbd3619
  • Workers now scan for active schemas before claiming tasks, preventing work on decommissioned tenants.·@cdbartholomew@cdbartholomew·7126bf8a
  • Pass DeferOperation through MemoryEngine.execute_task so extensions can requeue work.·@cdbartholomew@cdbartholomew·858f0b3a
  • Downgrade LLM verify_connection failure to a warning instead of crashing on startup.·@nicoloboschi@nicoloboschi·9901aa1e
  • Fix items_count in list_operations response to populate from result_metadata.·@nicoloboschi@nicoloboschi·41710ba1
  • Align AI SDK ReflectBasedOn types with the OpenAPI spec.·@nicoloboschi@nicoloboschi·3d6b3805
  • Fix database migration path by merging divergent Alembic heads for v0.5.3.·@grimmjoww@grimmjoww·487e2a5e
  • Lower OpenCode retainEveryNTurns default from 10 to 3 for more frequent memory saves.·@DK09876@DK09876·902704df
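
The `ensure_ascii` entry above refers to standard `json.dumps` behavior: by default, non-ASCII characters are written as `\uXXXX` escapes, which makes non-Latin text harder to read once it is embedded in a prompt. A minimal demonstration with an illustrative payload:

```python
import json

payload = {"text": "東京に引っ越した"}

# Default: every non-ASCII character becomes a \uXXXX escape sequence.
escaped = json.dumps(payload)

# With ensure_ascii=False the original characters are kept as-is.
preserved = json.dumps(payload, ensure_ascii=False)
```

Both strings decode back to the same object; only the text form that the model sees differs.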

0.5.3

Features

  • Add a setting to limit how many memories can be consolidated per round, helping control consolidation workload and cost.·@nicoloboschi@nicoloboschi·ca561aca
  • Add integration with the OpenAI Agents SDK.·@DK09876@DK09876·b8da88c8
  • Improve mental model refresh and updates with structured operations and cleaner observation handling on upsert.·@nicoloboschi@nicoloboschi·8b80959b
  • Allow extensions to requeue work by throwing a DeferOperation exception from worker operations.·@nicoloboschi@nicoloboschi·f8904797
  • Make recall budget mapping configurable per memory bank.·@nicoloboschi@nicoloboschi·576c44d2
  • Control plane now shows failed consolidation counts with a drilldown to investigate issues.·@nicoloboschi@nicoloboschi·e1e5f36c
  • Add mental-model staleness signals and a refreshed UI/experience for reviewing model history and snapshots.·@nicoloboschi@nicoloboschi·654e4c0c
  • Replace the embedded Paperclip library with the Paperclip plugin for more flexible integrations.·@benfrank241@benfrank241·c571fac7
  • JSON logs can now include tenant information and support a configurable allowlist for what gets logged.·@nicoloboschi@nicoloboschi·3bedc1ce
  • CLI now supports named connection profiles (via -p/--profile) for easier switching between environments.·@nicoloboschi@nicoloboschi·70d60e96

Improvements

  • Reduce the default number of retries for LLM calls to fail faster when providers are erroring or unavailable.·@nicoloboschi@nicoloboschi·b52b483c
  • Make reranker failures easier to diagnose and add a configurable timeout for the TEI reranker.·@octo-patch@octo-patch·69383af8

Bug Fixes

  • Fix crashes when using Jina MLX on Metal GPUs by serializing GPU operations.·@lkttle@lkttle·2e74a324
  • Fix database migration path so upgrades from v0.4.22 to v0.5.x work correctly.·@nicoloboschi@nicoloboschi·5437cc02
  • Prevent orphaned observations if a source memory is deleted during consolidation.·@nicoloboschi@nicoloboschi·f9042e37
  • Fix Ollama requests by explicitly disabling "think" mode in the native call payload.·@karl-8888@karl-8888·7d4fd1aa
  • Fix file retain uploads and prevent orphaned retained files.·@christerence@christerence·9e30ae25
  • Fix file retain API to correctly accept and map a provided "timestamp" field.·@christerence@christerence·13f3052e
  • Improve fairness across tenants when workers claim tasks, reducing the chance of noisy tenants starving others.·@cdbartholomew@cdbartholomew·a5e53721
  • Ensure the mental model max_tokens setting is respected during refresh/reflect operations.·@nicoloboschi@nicoloboschi·568e3c30
  • Fix control-plane links by properly encoding bank IDs in URLs end-to-end.·@nicoloboschi@nicoloboschi·cbaec36f
  • Make task submission idempotent when a payload is already set, preventing duplicate/failed submissions.·@nicoloboschi@nicoloboschi·088dfecb

0.5.2

Features

  • Added a co-occurrence graph view for exploring entity relationships in the control plane.·@nicoloboschi@nicoloboschi·f64c5d20
  • Added recall controls to the mental model trigger API/CLI so you can tune what gets recalled during runs.·@nicoloboschi@nicoloboschi·f2fc8f9f
  • Async operations now expose task payload details and associated document IDs for better observability and debugging.·@nicoloboschi@nicoloboschi·870bf4a3

Improvements

  • Revamped the control plane bank statistics view for clearer insights.·@nicoloboschi@nicoloboschi·34365c32
  • Clients now send an identifying User-Agent header on all HTTP requests for easier server-side diagnostics.·@nicoloboschi@nicoloboschi·9372462e

Bug Fixes

  • Fixed consolidation retry budget handling so retries are correctly applied at the LLM call site.·@r266-tech·dee58139
  • Fixed a crash during retain when embeddings and extracted facts counts didn’t match.·@nicoloboschi·dbd1d1a7
  • Improved embedded mode cleanup stability by adding a timeout when acquiring the cleanup lock (prevents hangs).·@r266-tech·6b5aa3af
  • OpenClaw plugins now reliably register agent hooks on every entry invocation.·@nicoloboschi·1be5ff33
  • TypeScript SDK now re-exports BankTemplate types from the package root for simpler imports.·@mrkhachaturov·581bbf3f
  • Bank template configuration validation was aligned with configurable fields to prevent invalid/ignored settings.·@mrkhachaturov·099f4c92

0.5.1

Breaking Changes

  • OpenClaw now reads configuration from plugin config instead of environment variables. (e22ae05f)

Features

  • Added SiliconFlow as a supported reranker provider. (d0b2ab9a)
  • Added an interactive OpenClaw setup wizard with Cloud / API / Embedded modes. (87322396)
  • Added a config-aware CLI to backfill OpenClaw history. (72fd3d59)
  • Added OpenClaw session pattern filtering to ignore or treat sessions as stateless. (5a61ac50)
  • Added a Cloudflare OAuth proxy integration option for self-hosted Hindsight. (aad07a14)
  • Expanded the CLI to cover all OpenAPI endpoints and request-body parameters. (c05c491d)
  • Added a default bank template environment variable (HINDSIGHT_API_DEFAULT_BANK_TEMPLATE). (fc941d5c)
  • Added a daemon lifecycle package (@vectorize-io/hindsight-all) to simplify running the all-in-one daemon. (576016f5)
  • Added recallTags and recallTagsMatch configuration options to control which tagged memories are recalled. (b57e337f)

Improvements

  • Improved OpenClaw reliability with more resilient startup behavior and richer retain metadata. (1f1716bd)

Bug Fixes

  • OpenClaw setup wizard now prompts for the token value (not the env var name). (9679d813)
  • Fixed embedded mode daemon start/stop race that could terminate healthy daemons. (e5724fcb)
  • Fixed reranker initialization issues to show real import errors and avoid a Transformers 5.x race in jina-mlx. (f82f58fa)
  • Fixed worker consolidation slot accounting to respect the configured maximum concurrency. (2d74007d)
  • Improved CLI API error output by including the HTTP response body. (93300b91)
  • Fixed CLI memory listing showing "[UNKNOWN]" for fact types. (2635bbb4)
  • Fixed recall ranking so RRF ordering is preserved when the reranker is configured as a passthrough. (4f9cf15c)
  • Fixed retain chunk insertion to be idempotent and avoid repeated retries on integrity errors. (2d95f78b)
  • Fixed retain ANN seed temp table creation to run inside a transaction for better reliability. (3fc87e76)
  • Fixed LLM requests to use the correct max token parameter for reasoning models and Azure OpenAI. (7b2263ba)
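
For readers unfamiliar with the RRF ordering mentioned in the recall ranking fix above, the following is a minimal sketch of reciprocal rank fusion followed by a passthrough reranker. The function names, the sample memory IDs, and the constant k = 60 are illustrative assumptions, not Hindsight's actual implementation.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of IDs with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every item it contains.
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


def passthrough_rerank(candidates: list[str]) -> list[str]:
    """A passthrough reranker returns candidates in the order it received them."""
    return list(candidates)


# Hypothetical results from two retrieval strategies.
semantic = ["m1", "m2", "m3"]
keyword = ["m2", "m4", "m1"]

fused = rrf_fuse([semantic, keyword])
final = passthrough_rerank(fused)
print(final)
```

With a passthrough reranker, the final list should be identical to the fused list, which is the property the fix restores.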

0.5.0

Breaking Changes

  • Removed BFS and MPFP graph retrieval strategies. LinkExpansionRetriever is now the sole graph retrieval algorithm, which is simpler, faster, and more accurate. (ea834bc7)
  • Dropped the hindsight-hermes integration package. (cf0537ba)

Features

  • Built-in llama.cpp LLM provider for fully local inference without external API calls. (f74b577e)
  • Added update_mode='append' to retain, which concatenates new content onto an existing document instead of replacing it. (3c633e5e)
  • OpenRouter support for LLM, embeddings, and reranking. (e5944b63)
  • Bank template import/export with Template Hub — export a bank's configuration, mental models, and directives as a reusable manifest, then import into other banks. (30a319a6)
  • Constellation view in the Control Plane — interactive, zoomable canvas visualization of entity relationship graphs with heat-gradient coloring and dark mode support. (36783df3)
  • Added detail parameter to list/get mental model endpoints for controlling response verbosity. (8d1bfbbd)
  • Added AutoGen integration (hindsight-autogen) for persistent long-term memory in AutoGen agents. (a757765a)
  • Added Paperclip integration (@vectorize-io/hindsight-paperclip) with Express middleware and process adapter modes for stateless agent memory. (81441ee9)
  • Added OpenCode persistent memory plugin for the OpenCode editor. (e1c6220f)
  • OpenClaw JSONL-backed retain queue for external API resilience — buffers retain calls locally when the API is unreachable. (087545cc)
  • OpenClaw now supports bankId for static bank configurations. (0e81d1a2)
  • Added Google embeddings and reranker provider support. (07de798c)
  • Added persistent volume support in Helm chart for local model cache. (cefa7554)
  • MCP server now includes a sync_retain tool and validates UUID inputs. (48185a4b)
  • Recall combined scoring now includes proof_count boost for better ranking. (26794aab)

Improvements

  • 3-phase retain pipeline restructures memory ingestion into pre-resolve, insert, and post-link phases, dramatically improving throughput under concurrent load by removing slow reads from write transactions. (914ba796)
  • Recall entity graph expansion now caps per-entity fanout and includes a timeout fallback, preventing slow queries on banks with high-fanout entities. (57f15445)
  • Fact serialization in think-prompt now includes occurred_end and mentioned_at for richer temporal context. (37348c85)
  • Consolidation observation quality improved with structured processing rules. (6f173b10)

Bug Fixes

  • LiteLLM SDK embeddings encoding_format is now configurable instead of hardcoded. (cece2c90)
  • Fixed out-of-range content_index crash in recall result mapping. (9790d904)
  • Experience fact types are now preserved correctly during normalization. (9cfdd464)
  • Clear memories endpoint no longer deletes the bank profile. (26a64cc0)
  • Embedding daemon clears stale processes on the port before starting. (7d6c570a)
  • Per-bank vector index migration now respects vector extension configuration. (4fd7c5d1)
  • Timeline group sort uses numeric date comparison instead of locale string comparison. (f3f2c6b0)
  • Resolved 25 test regressions from the streaming retain pipeline. (7415ebff)
  • MCP server now auto-coerces string-encoded JSON in tool arguments. (443c94c8)
  • Entity labels structure is now validated on PATCH to prevent invalid configurations. (7e23f8e1)
  • Fixed bank_id metric label to be opt-in, preventing OTel memory leak. (cf4bd598)
  • Fixed max_tokens handling for OpenAI-compatible endpoints with custom base URLs. (cd99eef4)
  • Fixed event_date AttributeError when date is None in fact extraction. (6cb309f7)
  • Query analyzer now handles dateparser internal crashes gracefully. (e0e65c44)
  • The embedding profile .env file is no longer overwritten when the config has no Hindsight keys. (9e2890ba)
  • Windows compatibility fix for hindsight-embed. (f9fe6953)
  • Addressed critical and high severity security vulnerabilities in dependencies. (ee4510a7)
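
The timeline sort fix above addresses a common pitfall: date labels compared as strings do not sort chronologically. The sketch below reproduces the problem and the numeric fix in Python. The sample labels and the month/day/year format are assumptions chosen for illustration, and the control plane's own code is not shown here.

```python
from datetime import datetime

# Hypothetical timeline group labels in month/day/year form.
labels = ["10/2/2024", "9/15/2024", "1/30/2025"]


def to_ordinal(label: str) -> int:
    """Convert a date label to a day number so it can be compared numerically."""
    return datetime.strptime(label, "%m/%d/%Y").toordinal()


# String comparison orders character by character, so "1/30/2025" comes first.
by_string = sorted(labels)

# Numeric comparison orders by the actual date.
by_date = sorted(labels, key=to_ordinal)

print(by_string)
print(by_date)
```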

0.4.22

Features

  • API now supports passing custom LLM request parameters via the HINDSIGHT_API_LLM_EXTRA_BODY configuration. (ecaa1ad1)
  • Document metadata is now exposed through the API and control plane. (627ec5d5)
  • Added a /code-review skill for automated code quality checks against project standards. (bdb33c58)
  • ZeroEntropy reranker now supports a configurable base URL. (a915584e)
  • Codex can now retain structured tool calls from rollout files. (3461398b)

Improvements

  • Embeddings via the LiteLLM SDK can now optionally specify output dimensions. (f841bcb9)
  • API responses now include an X-Ignored-Params header to warn when unknown request parameters were ignored. (cef42d81)
  • OpenClaw CLI startup is faster by deferring heavy initialization until the service starts. (41025c3b)

Bug Fixes

  • Mental model triggers now support the full config schema, including tag matching and tag group filters. (2c32ffad)
  • Cohere reranking via Azure endpoints now works reliably (avoids 404 errors). (84985ee9)
  • Claude Code provider no longer defers to built-in tools, preventing MCP tool handling issues. (fa82efc8)
  • Recall endpoint now returns metadata correctly instead of dropping it from the response. (4768bf39)
  • Gemini 3.1+ tool calls now read thought signatures correctly. (1b5c262a)
  • First-person agent memories are now correctly classified as "experience" facts. (00961156)
  • Codex upgrades now preserve and merge new settings instead of skipping them. (b104bad0)
  • LlamaIndex integration fixes improve document ID handling, memory API behavior, and ReAct tracing. (d93dfea8)

0.4.21

Features

  • Added audit logging for feature usage tracking, including request duration in audit entries. (083295dc)
  • Added Hindsight memory integration for the OpenAI Codex CLI. (0b17a67c)
  • Added an MCP hook to filter tool visibility per user. (f8285b7b)
  • Added a per-bank limit setting to cap the number of observations stored per scope. (b32767ca)
  • Added native Windows support so Hindsight can run without Docker. (c5700ff5)
  • Added a 'none' LLM provider to support chunk-only storage without LLM calls. (9e5a066d)
  • Added a setup command/skill to register hooks more reliably. (22ca6a8d)
  • Hermes now supports file-based configuration. (0ff36548)
  • Added a LiteLLM-based provider to support Bedrock and many additional LLM providers. (db70fdbe)
  • Added support for Strands Agents SDK integration with Hindsight memory tools. (7fe773c0)
  • Added LlamaIndex integration. (2d787c4f)
  • Added AG2 framework integration. (73123870)
  • Added support for Ark and Volcano LLM providers. (417fac61)
  • Retain now supports delta mode to skip LLM processing for unchanged chunks on upsert. (fd88c0ef)
  • Claude Code integration can now retain full sessions with document upsert and configurable tags, and records tool calls as structured JSON. (2d31b67d)
  • MCP retain tool now supports selecting a retain strategy via a parameter. (4285e944)

Improvements

  • OpenClaw logging is now configurable and can emit structured output. (d441ab81)
  • Made inclusion of source facts in search observations configurable. (5095d5e3)
  • Integrations no longer use hardcoded default models, relying on configured defaults instead. (58e68f3e)

Bug Fixes

  • Improved MCP server compatibility by handling Claude Code GET probes and allowing stateless HTTP mode to be configured. (d8050387)
  • Per-bank vector index creation now respects the configured vector extension setting. (6488c9bc)
  • Verbose retain extraction now correctly includes the retain mission context. (d2965e64)
  • Codex integration no longer crashes on startup when the API quota is exhausted (HTTP 429). (111e8c70)
  • OpenAI embeddings client now correctly parses query parameters included in base_url. (a209ef1a)
  • Fixed tool_choice handling for Codex/Claude Code when forcing specific tool calls. (585ac76f)
  • OpenClaw auto-recall now supports a configurable timeout to prevent hangs. (cd4d449f)
  • Fixed control plane UI issues affecting recall and data viewing. (6bb83f46)
  • Recall responses now include associated metadata. (0bcbf849)
  • Python client update_bank_config() now exposes all configurable fields. (7c18723f)
  • The API's OpenAPI schema now correctly includes Pydantic v2 ValidationError fields. (939cb40a)
  • JSON-string tags are now coerced to lists for MemoryItem and MCP tools to prevent tagging errors. (c5273f5f)
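
The tag coercion fix above can be pictured with a small helper like the one below. This is a sketch under assumptions: the name coerce_tags and its fallback rules (a non-JSON string becomes a single tag) are hypothetical and are not taken from the Hindsight codebase.

```python
import json


def coerce_tags(value) -> list[str]:
    """Return tags as a list of strings, accepting a list or a JSON-encoded string."""
    if value is None:
        return []
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except json.JSONDecodeError:
            # Not JSON at all: treat the whole string as one tag (assumed behavior).
            return [value]
        # A JSON scalar such as "x" is wrapped so the result is always a list.
        value = parsed if isinstance(parsed, list) else [parsed]
    return [str(tag) for tag in value]


print(coerce_tags('["project-a", "urgent"]'))
print(coerce_tags(["project-a"]))
```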

0.4.20

Features

  • Add a one-command setup CLI package for the NemoClaw integration. (d284de28)
  • Add a LangGraph integration for using Hindsight memory within LangGraph agents. (b4320254)
  • Add reflect filters to exclude specific fact types and mental model content during reflection. (ea662d06)
  • Introduce independent versioning for integrations so they can be released separately from the core server. (31f1c53c)
  • Add a Claude Code integration plugin. (f4390bdc)

Improvements

  • Add a wall-clock timeout to reflect operations so they don’t run indefinitely. (8ce06e3e)
  • Provide richer context when validating operations via the OperationValidator extension. (2eb1019d)
  • Make the hindsight-api package runnable directly via uvx by adding script entry points. (97f7a365)
  • Support passing query parameters during OpenAI-compatible client initialization for broader provider compatibility. (20e17f28)
  • Upgrade the default MiniMax model from M2.5 to M2.7. (1f1462a5)

Bug Fixes

  • Prevent context overflow during observation search by disabling source facts in results. (8e2e2d5b)
  • Fix Claude Code integration session startup by pre-starting the daemon in the background. (26944e25)
  • Fix Claude Code integration installation and configuration experience so setup is more reliable. (35b2cbb6)
  • Fix a memory leak in entity resolution that could grow over time under load. (e6333719)
  • Avoid crashes and retain failures when the Postgres pg_trgm extension is unavailable by handling detection/fallback correctly. (365fa3ce)
  • Strip Markdown code fences from model outputs across all LLM providers for more consistent parsing. (2f2db2a6)
  • Return a clear 400 error for empty recall queries and fix a SQL parameterization issue. (5cdc714a)
  • Ensure file retain requests include authentication headers so uploads work in authenticated deployments. (78aa7c53)
  • Fix MCP tool calls when MCP_AUTH_TOKEN and TENANT_API_KEY differ. (8364b9c5)
  • Allow claude-agent-sdk to install correctly on Linux/Docker environments. (3f31cbf5)
  • In LiteLLM mode, fall back to the last user message when no explicit hindsight query is provided. (5e8952c5)
  • Fix non-atomic async operation creation to prevent inconsistent operation records. (94cf89b5)
  • Prevent orphaned parent operations when a batch retain child fails unexpectedly. (43942455)
  • Fix failures for non-ASCII entity names by ensuring entity IDs are set correctly. (438ce98b)
  • Correctly store LLM facts labeled as "assistant" as "experience" in the database. (446c75f3)
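
As an illustration of the code fence stripping fix above, the following sketch removes a surrounding Markdown fence from a model response before parsing it as JSON. The helper name and the regular expression are assumptions made for this example; the actual parsing logic in Hindsight may differ.

```python
import json
import re

FENCE = "`" * 3  # Built programmatically to keep this example easy to embed.

# Optional language tag after the opening fence, then the body, then the closing fence.
_FENCED = re.compile(r"^\s*`{3}[\w-]*[ \t]*\n(.*?)\n?[ \t]*`{3}\s*$", re.DOTALL)


def strip_code_fences(text: str) -> str:
    """Return the content inside a surrounding Markdown fence, or the trimmed text."""
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text.strip()


# Hypothetical model output wrapped in a fence with a language tag.
raw = FENCE + 'json\n{"facts": 2}\n' + FENCE
parsed = json.loads(strip_code_fences(raw))
print(parsed)
```

Unfenced responses pass through unchanged apart from whitespace trimming, so the same helper can be applied to every provider's output.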

0.4.19

Features

  • TypeScript client now works in Deno environments. (72c25c97)
  • Added Agno integration to use Hindsight as a memory toolkit. (8c378b98)
  • Added Hermes Agent integration (hindsight-hermes) for persistent memory. (ef90842f)
  • Expanded retain behavior with new verbatim and chunks extraction modes and named retain strategies. (e4f8a157)

Improvements

  • Improved local reranker performance/efficiency with FP16 and bucketed batching, plus compatibility with Transformers 5.x. (e7da7d0e)

Bug Fixes

  • Prevented silent memory loss when consolidation fails (failed consolidations are tracked and can be recovered). (28dac7c7)
  • Fixed Docker control-plane startup to respect the configured control-plane hostname. (8a64dc8d)
  • Database cleanup migration now removes orphaned observation memory units to avoid inconsistent memory state. (f09ad9de)
  • Deleting a document now also deletes linked memory units to prevent leftover/stale memory entries. (f27bd953)
  • Fixed MCP middleware to send an Accept header, preventing 406 response errors in some setups. (836fd81e)
  • Improved compatibility with Gemini tool-calling by preserving thought signature metadata to avoid failures on gemini-3.1-flash-lite-preview. (21f9f46c)

0.4.18

Features

  • Add compound tag filtering using tag groups. (5de793ee)
  • Publish new slim Python packages (hindsight-api-slim and hindsight-all-slim) for smaller installs. (15ea23d5)
  • Add MiniMax as a supported LLM provider. (2344484f)
  • Add Jina MLX reranker provider optimized for Apple Silicon. (1caf5ec9)

Improvements

  • Allow configuring maximum recall query tokens via an environment variable. (66dedb8d)
  • Improve retrieval performance by switching to per-bank HNSW indexes. (43b3efc4)

Bug Fixes

  • Prevent reranking failures by truncating long documents that exceed LiteLLM reranker context limits. (eeb938fc)
  • Ensure recalled memories are injected as system context for OpenClaw. (b17f338e)
  • Ensure embedded profiles are registered in CLI metadata when the daemon starts. (06b0f74a)
  • Cancel in-flight async operations when a bank is deleted to avoid dangling work. (0560f626)

0.4.17

Features

  • Added a manual retry option for failed asynchronous operations. (dcaacbe4)
  • You can now change/update tags on an existing document. (1b4ad7f4)
  • Added history tracking and a diff view for mental model changes. (e2baca8b)
  • Added observation history tracking with a UI diff view to review changes over time. (576473b6)
  • File uploads can now choose a parser per request, with configurable fallback chains. (99220d05)
  • Added an extension hook that runs after file-to-Markdown conversion completes. (1d17dea2)

Improvements

  • Operations view now supports filtering by operation type and has more reliable auto-refresh behavior. (f7a60f89)
  • Added token limits for “source facts” used during consolidation and recall to better control context usage. (5d05962d)
  • Improved bank selector usability by truncating very long bank names in the dropdown. (1e40cd22)

Bug Fixes

  • Fixed webhook schema issues affecting multi-tenant retain webhooks. (32a4882a)
  • Fixed file ingestion failures by stripping null bytes from parsed file content before retaining. (cd3a6a22)
  • Fixed tool selection handling for OpenAI-compatible providers when using named tool_choice. (1cdfb7c2)
  • Improved consolidation behavior to prioritize a bank’s mission over an ephemeral-state heuristic. (00ccf0b2)
  • Fixed database migrations to correctly handle mental model embedding dimension changes. (7accac94)
  • Fixed file upload failures caused by an Iris parser httpx read timeout. (fa3501d4)
  • Improved reliability of running migrations by serializing Alembic upgrades within the process. (f88b50a4)
  • Fixed Google Cloud Storage authentication when using Workload Identity Federation credentials. (d2504ac5)
  • Fixed the bank selector to refresh the bank list when the dropdown is opened. (0ad8c2d0)

0.4.16

Features

  • Added Webhooks with consolidation.completed and retain.completed events. (abbf874d)

Improvements

  • Improved OpenClaw recall/retention controls. (d425e93c)
  • Improved search/reranking quality by switching combined scoring to multiplicative boosts. (aa8e5475)
  • Improved performance of observation recall by 40x on large banks. (ad2cf72a)
  • Improved server shutdown behavior by capping graceful shutdown time and allowing a forced kill on a second Ctrl+C. (4c058b4b)

Bug Fixes

  • Fixed an async deadlock risk by running database schema migrations in a background thread during startup. (e0a2ac63)
  • Fixed webhook delivery/outbox processing so transactions don’t silently roll back due to using the wrong database schema name. (75b95106)
  • Fixed observation results to correctly resolve and return related chunks using source_memory_ids. (cb6d1c46)
  • Fixed MCP bank-level tool filtering compatibility with FastMCP 3.x. (f17406fd)
  • Fixed crashes when an LLM returns invalid JSON across all retries (now handled cleanly instead of raising a TypeError). (66423b85)
  • Fixed observations without source dates to preserve missing (None) temporal fields instead of incorrectly populating them. (891c33b1)

0.4.15

Features

  • Added observation_scopes to control the granularity/visibility of observations. (55af4681)
  • List documents API now supports filtering by tags (and fixes the q parameter description). (1d70abfe)
  • Added PydanticAI integration for persistent agent memory. (cab5a40f)
  • Added richer entity label support (optional labels, free-form values, multi-value fields, and UI polish). (9b96becc)
  • Added support for timestamp="unset" so content can be retained without a date. (f903948a)
  • OpenClaw can now automatically retain the last n+2 turns every n turns (default n=10). (ad1660b3)
  • Added configurable Gemini/Vertex AI safety settings for LLM calls. (73ef99e7)
  • Added extension hooks to customize root routing and error headers. (e407f4bc)
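
The OpenClaw auto-retain entry above describes overlapping windows: every n turns, the last n+2 turns are retained, so each batch carries two turns of earlier context. The sketch below shows one way to read that rule. The function name and the exact slicing are assumptions, not the plugin's actual code.

```python
def retain_windows(turns: list[str], n: int = 10) -> list[list[str]]:
    """Every n turns, collect the last n + 2 turns (fewer if history is shorter)."""
    windows = []
    for count in range(n, len(turns) + 1, n):
        # Reach back two extra turns so consecutive windows overlap.
        start = max(0, count - (n + 2))
        windows.append(turns[start:count])
    return windows


# Hypothetical conversation with seven turns and a small n for readability.
turns = [f"t{i}" for i in range(1, 8)]
windows = retain_windows(turns, n=3)
print(windows)
```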

Improvements

  • Improved recall performance by fetching all recall chunks in a single query. (61bf428b)
  • Improved recall/retain performance and scalability for large memory banks. (7942f181)

Bug Fixes

  • Fixed the TypeScript SDK to send null (not undefined) when includeEntities is false. (15f4b876)
  • Prevented reflect from failing with context_length_exceeded on large memory banks. (77defd96)
  • Fixed a consolidation deadlock caused by retrying after zombie processing tasks. (c2876490)
  • Fixed observations count in the control plane that always showed 0. (eaeaa1f2)
  • Fixed ZeroEntropy rerank endpoint URL and ensured the MCP retain async_processing parameter is handled correctly. (f6f1a7d8)
  • Fixed JSON serialization issues and logging-related exception propagation when using the claude_code LLM provider. (ecb833f4)
  • Added bank-scoped request validation to prevent cross-bank/invalid bank operations. (5270aa5a)

0.4.14

Features

  • Add Chat SDK integration to give chatbots persistent memory. (fed987f9)
  • Allow configuring which MCP tools are exposed per memory bank, and expand the MCP tool set with additional tools and parameters. (3ffec650)
  • Enable the bank configuration API by default. (4d030707)
  • Support filtering graph-based memory retrieval by tags. (0bb5ca4c)
  • Add batch observations consolidation to process multiple observations more efficiently. (0aa7c2b3)
  • Add OpenClaw options to toggle autoRecall and exclude specific providers. (3f9eb27c)
  • Add a ZeroEntropy reranker provider option. (17259675)

Improvements

  • Increase customization options for reflect, retain, and consolidation behavior. (2a322732)
  • Include source document metadata in fact extraction results. (87219b73)

Bug Fixes

  • Raise a clear error when embedding dimensions exceed pgvector HNSW limits (instead of failing later at runtime). (8cd65b98)
  • Fix multi-tenant schema isolation issues in storage and the bank config API. (b180b3ad)
  • Ensure LiteLLM embedding calls use the correct float encoding format to prevent embedding failures. (58f2de70)
  • Improve recall performance by reducing memory usage during retrieval. (9f0c031d)
  • Handle observation regeneration correctly when underlying memories are deleted. (ac9a94ad)
  • Fix reflect retrieval to correctly populate dependencies and enforce full hierarchical retrieval. (8b1a4658)
  • Fix OpenClaw health checks by passing the auth token to the health endpoint. (40b02645)

0.4.13

Features

  • Switched the default OpenAI LLM to gpt-4o-mini. (325b5cc1)
  • Observation recall now includes the source facts behind recalled observations. (5569d4ad)
  • Added CrewAI integration to enable persistent memory. (41db2960)

Bug Fixes

  • Fixed npx hindsight-control-plane failing to run. (0758827d)
  • Improved MCP compatibility by aligning the local MCP implementation with the server and removing the deprecated stateless parameter. (ea8163c5)
  • Fixed Docker startup failures when using named Docker volumes. (ac739487)
  • Prevented reranker crashes when an upstream provider returns an error. (58c4d657)
  • Improved accuracy of fact temporal ordering by reducing per-fact time offsets. (c3ef1555)
  • Client timeout settings are now properly respected. (dcaa9f14)
  • Fixed documents not being tracked when fact extraction returns zero facts. (f78278ea)

0.4.12

Features

  • Accept and ingest PDFs, images, and common Office documents as inputs. (224b7b74)
  • Add the Iris file parser for improved document parsing support. (7eafba66)
  • Add async Retain support via provider Batch APIs (e.g., OpenAI and Groq) for higher-throughput ingestion. (40d42c58)
  • Allow Recall to return chunks only (no memories) by setting max_tokens=0. (7dad9da0)
  • Add a Go client SDK for the Hindsight API. (2a47389f)
  • Add support for the pgvectorscale (DiskANN) vector index backend. (95c42204)
  • Add support for Azure pg_diskann vector indexing. (476726c2)

Improvements

  • Improve reliability of async batch Retain when ingesting large payloads. (aefb3fcf)
  • Improve AI SDK tooling to make it easier to work with Hindsight programmatically. (d06a0259)

Bug Fixes

  • Ensure document tags are preserved when using the async Retain flow. (b4b5c44a)
  • Fix OpenClaw ingestion failures for very large content (E2BIG). (6bad6673)
  • Harden OpenClaw behavior (safer shell usage, better HTTP mode handling, and more reliable initialization), including per-user banks support. (c4610130)
  • Improve Python client async API consistency and reduce connection drop issues via keepalive timeout fixes. (8114ef44)

0.4.11

Features

  • Added support for LiteLLM SDK as an embeddings and reranking provider. (e408b7e)
  • Expanded Postgres search support with additional text/vector extensions, including TimescaleDB pg_textsearch and vchord/pgvector options. (d871c30)
  • Added hierarchical configuration scopes (system, tenant, bank) for more flexible multi-tenant setup and overrides. (8d731f2)
  • Added reverse proxy/base-path support for running Hindsight behind a proxy. (93ddd41)
  • Added MCP tools to create, read, update, and delete mental models. (f641b30)
  • Added a "docs" skill for agents/tools to access documentation-oriented capabilities. (dd1e098)
  • Added an OpenClaw configuration option to skip recall/retain for specific providers. (fb7be3e)

Improvements

  • Improved LiteLLM gateway model configuration for more reliable provider/model selection. (7d95a00)
  • Exposed actual LLM token usage in retain results to improve cost/usage visibility. (83ca669)
  • Added user-initiated attribution to request context to improve async task and usage attribution. (90be7c6)
  • Added OpenTelemetry tracing for improved request traceability and observability. (69dec8e)
  • Helm chart: split TEI embedding and reranker into separate deployments for independent scaling and rollout. (43f9a8b)
  • Helm chart: added PodDisruptionBudgets and per-component affinity controls for more resilient scheduling. (9943957)

Bug Fixes

  • Fixed a recursion issue in memory retention that could cause failures or runaway memory usage. (4f11210)
  • Fixed Reflect API serialization/schema issues for "based_on" so reflections are returned and stored correctly. (f9a8a8e)
  • Improved MCP server compatibility by allowing extra tool arguments when appropriate and fixing bank ID resolution priority. (7ee229b)
  • Added missing trust_code environment configuration support. (60574ee)
  • Hardened the MCP server with fixes to routing/validation and more accurate usage metering. (e798979)
  • Fixed the slim Docker image to include tiktoken to prevent runtime tokenization errors. (6eec83b)
  • Fixed MCP operations not being tracked correctly for usage metering. (888b50d)
  • Helm chart: fixed GKE deployments overriding the configured HINDSIGHT_API_PORT. (03f47e2)

0.4.10

Features

  • Provided a slimmer Docker distribution to reduce image size and speed up pulls. (f648178)
  • Added Markdown support in Reflect and Mental Models content. (c4ef090)
  • Added built-in Supabase tenant extension for running Hindsight with Supabase-backed multi-tenancy. (e99ee0f)
  • Added TenantExtension authentication support to the MCP endpoint. (fedfb49)

Improvements

  • Improved MCP tool availability/routing based on the endpoint being used. (d90588b)

Bug Fixes

  • Stopped logging database usernames and passwords to prevent credential leaks in logs. (c568094)
  • Fixed OpenClaw sessions wiping memory on each new session. (981cf60)
  • Fixed hindsight-embed profiles not loading correctly. (0430588)
  • Fixed tagged directives so they correctly apply to tagged mental models. (278718d)
  • Fixed a cast error that could cause failures at runtime. (093ecff)

Other

  • Added a docker-compose example to simplify local deployment and testing. (5179d5f)

0.4.9

Features

  • New AI SDK integration. (7e339e1)
  • Add a Python SDK for running Hindsight in embedded mode (HindsightEmbedded). (d3302c9)
  • Add streaming support to the hindsight-litellm wrappers. (665877b)
  • Add OpenClaw support for connecting to an external Hindsight API and using dynamic per-channel memory banks. (6b34692)

Improvements

  • Improve the mental models experience in the control plane UI. (7097716)
  • Reduce noisy Hugging Face logging output. (34d9188)

Bug Fixes

  • Improve recall endpoint reliability by handling timeouts correctly and rejecting overly long queries. (dd621a6)
  • Improve /reflect behavior with Claude Code and Codex providers. (a43d208)
  • Fix OpenClaw shell argument escaping for more reliable command execution. (63e2964)

0.4.8

Features

  • Added profile support for hindsight-embed, enabling separate embedding configurations/workspaces. (6c7f057)
  • Added support for additional LLM backends, including OpenAI Codex and Claude Code. (539190b)

Improvements

  • Enhanced OpenClaw and hindsight-embed parameter/config options for easier configuration and better defaults. (749478d)
  • Added OpenClaw plugin configuration options to select LLM provider and model. (8564135)
  • Server now prints its version during startup to simplify debugging and support requests. (1499ce5)
  • Improved tracing/debuggability by propagating request context through asynchronous background tasks. (44d9125)
  • Added stronger validation and context for mental model create/refresh operations to prevent invalid requests. (35127d5)

Bug Fixes

  • Improved embedding CLI experience with richer logs and isolated profiles to avoid cross-contamination between runs. (794a743)
  • Operation validation now runs correctly in the worker process, preventing invalid background operations from slipping through. (96f0e54)
  • Fixed unreliable behavior when using a custom PostgreSQL schema. (3825506)

0.4.7

Features

  • Add extension hooks to validate and customize mental model operations. (9c3fda7)
  • Add support for using an external embedding API provider in OpenClaw plugin (with additional OpenClaw compatibility fixes). (4b57b82)

Improvements

  • Speed up container startup by preloading the tiktoken encoding during Docker image builds. (039944c)

Bug Fixes

  • Prevent PostgreSQL insert failures by stripping null bytes from text fields before saving. (ef9d3a1)
  • Fix worker schema selection so it uses the correct default database schema. (d788a55)
  • Honor an already-set HINDSIGHT_API_DATABASE_URL instead of overwriting it in the hindsight-embed workflow. (f0cb192)
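
The null byte fix above reflects a PostgreSQL constraint: text values cannot contain the NUL character (0x00). A minimal sketch of sanitizing data before an insert follows; the helper name and the recursive handling of nested values are assumptions for illustration.

```python
def strip_null_bytes(value):
    """Recursively remove NUL characters from strings in nested data."""
    if isinstance(value, str):
        return value.replace("\x00", "")
    if isinstance(value, dict):
        return {key: strip_null_bytes(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_null_bytes(item) for item in value]
    # Numbers, booleans, None, and other types pass through unchanged.
    return value


# Hypothetical record parsed from a file that contained stray NUL bytes.
record = {"text": "hello\x00 world", "tags": ["a\x00", "b"], "count": 3}
clean = strip_null_bytes(record)
print(clean)
```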

0.4.6

Improvements

  • Improved OpenClaw configuration setup to make embedding integration easier to configure. (27498f9)

Bug Fixes

  • Fixed OpenClaw embedding version binding/versioning to prevent mismatches when using the embed integration. (1163b1f)

0.4.5

Bug Fixes

  • Fixed occasional failures when retaining memories asynchronously with timestamps. (cbb8fc6)

0.4.4

Bug Fixes

  • Fixed async “retain” operations failing when a timestamp is provided. (35f0984)
  • Corrected the OpenClaw daemon integration name to “openclaw” (previously “openclawd”). (b364bc3)

0.4.3

Features

  • Add Vertex AI as a supported LLM provider. (c2ac7d0, 49ae55a)
  • Add Bearer token authentication for MCP and propagate tenant authentication across MCP requests. (0da77ce)

Improvements

  • CLI: add a --wait flag for consolidate and a --date filter for listing documents. (ff20bf9)

Bug Fixes

  • Fix worker polling deadlocks to prevent background processing from stalling. (f4f86e3)
  • Improve reliability of Docker builds by retrying ML model downloads. (ecc590c)
  • Fix tenant authentication handling for internal background tasks and ensure the control-plane forwards required auth to the dataplane. (03bf13e)
  • Ensure tenant database migrations run at startup and workers use the correct tenant schema context. (657fe02)
  • Fix control-plane graph endpoint errors when upstream data is missing. (751f99a)

Other

  • Rename the default bot/user identity from "moltbot" to "openclaw". (728ce13)

0.4.2

Features

  • Added Clawdbot/Moltbot/OpenClaw integration. (12e9a3d)

Improvements

  • Added additional configuration options to control LLM retry behavior. (3f211f0)
  • Added real-time logs showing a detailed timing breakdown during consolidation runs. (8781c9f)

Bug Fixes

  • Fixed hindsight-embed crashing on macOS. (c16ccc2)

0.4.1

Features

  • Added support for configuring a non-default PostgreSQL schema as the default schema. (2b72e1f)

Improvements

  • Improved memory consolidation performance (benchmarking and optimizations). (b43ef98)

Bug Fixes

  • Fixed the /version endpoint returning an incorrect version. (cfcc23c)
  • Fixed mental model search failing due to UUID type mismatch after text-ID migration. (94cc0a1)
  • Added safer PyTorch device detection to prevent crashes on some environments. (67c4788)
  • Fixed Python packages exposing an incorrect version value. (fccbdfe)

0.4.0

This release introduces Observations, Mental Models, a new Agentic Reflect, and Directives. Read the announcement for details.

Features

  • Added support for providing a custom prompt for memory extraction. (3172e99)
  • Expanded the LiteLLM integration with async retain/reflect support, cleaner API, and support for tags/mission (including passing API keys correctly). (1d4879a)
  • Added a new worker service to run background tasks at scale. (4c79240)
  • MCP retain now supports timestamps. (b378f68)
  • Added support for installing skills via npx add-skill. (ec22317)

Improvements

  • CLI retain-files now accepts more file types. (1eeced3)

Bug Fixes

  • Fixed a macOS crash in the embed daemon caused by an XPC connection issue. (e5fc6ee)
  • Fixed occasional extraction in the wrong language. (87d4a36)
  • Fixed PyTorch model initialization issues that could cause startup failures (meta tensor/init problems). (ddaa5f5)

Features

  • Add memory tags so you can label and filter memories during recall/reflect. (20c8f8b)
  • Allow choosing different AI providers/models per operation. (e6709d5)
  • Add Cohere support for embeddings and reranking. (4de0730)
  • Add configurable embedding dimensions and OpenAI embeddings support. (70de23e)
  • Support custom base URLs for OpenAI-style embeddings and Cohere endpoints. (fa53917)
  • Add LiteLLM gateway support for routing LLM/embedding requests. (d47c8a2)
  • Add multilingual content support to improve handling and retrieval across languages. (c65c6a9)
  • Add delete memory bank capability. (4b82d2d)
  • Add backup/restore tooling for memory banks. (67b273d)

Improvements

  • Add retention modes to control how memories are extracted and stored. (fb31a35)
  • Add optional offline database migrations to support restricted/air-gapped deployments. (233bd2e)
  • Add database connection configuration options for more flexible deployments. (33fac2c)
  • Load .env automatically on startup to simplify configuration. (c06d9b4)
  • Expose an operation ID from retain requests so async/background processing can be tracked. (1dacd0e)
  • Add per-request LLM token usage metrics for monitoring and cost tracking. (29a542d)
  • Add LLM call latency metrics for performance monitoring. (5e1f13e)
  • Include tenant in metrics labels for better multi-tenant observability. (1ffc2a4)
  • Add async processing option to MCP retain tool for background retention workflows. (37fc7fb)

Bug Fixes

  • Fix extension loading in multi-worker deployments so all workers load extensions correctly. (f5f3fca)
  • Improve recall performance by batching recall queries. (5991308)
  • Improve retrieval quality and stability for large memory banks (graph/MPFP retrieval fixes). (6232e69)
  • Fix entities list being limited to 100 entities. (26bf571)
  • Fix UI only showing the first 1000 memories. (67c1a42)
  • Fix duplicated causal relationships and improve token usage during processing. (49e233c)
  • Improve causal link detection accuracy. (2a00df0)
  • Make retain max completion tokens configurable to prevent truncation issues. (7715a51)
  • Fix Python SDK not sending the Authorization header, preventing authenticated requests. (39e3f7c)
  • Fix stats endpoint missing tenant authentication in multi-tenant setups. (d6ff191)
  • Fix embedding dimension handling for tenant schemas in multi-tenant databases. (6fe9314)
  • Fix Groq free-tier compatibility so requests work correctly. (d899d18)
  • Fix security vulnerability (qs / CVE-2025-15284). (b3becb6)
  • Restore MCP tools for listing and creating memory banks. (9fd5679)

0.2.0

Features

  • Add additional model provider support, including Anthropic Claude and LM Studio. (787ed60)
  • Add multi-bank access and new MCP tools for interacting with multiple memory banks via MCP. (6b5f593)
  • Allow supplying custom entities when retaining memories via the retain endpoint. (dd59bc8)
  • Enhance the /reflect endpoint with max_tokens control and optional structured output responses. (d49e820)

Improvements

  • Improve local LLM support for reasoning-capable models and streamline Docker startup for local deployments. (eea0f27)
  • Support operation validator extensions and return proper HTTP errors when validation fails. (ce45d30)
  • Add configurable observation thresholds to control when observations are created/updated. (54e2df0)
  • Improve graph visualization in the control plane for exploring memory relationships. (1a62069)

Bug Fixes

  • Fix MCP server lifecycle handling so MCP lifespan is correctly tied to the FastAPI app lifespan. (6b78f7d)

0.1.15

Features

  • Add the ability to delete documents from the web UI. (f7ff32d)

Improvements

  • Improve the API health check endpoint and update the generated client APIs/types accordingly. (e06a612)

0.1.14

Bug Fixes

  • Fix the embedded “get-skill” installer so installing skills works correctly. (0b352d1)

0.1.13

Improvements

  • Improve reliability by surfacing task handler failures so retries can occur when processing fails. (904ea4d)
  • Revamp the hindsight-embed component architecture, including a new daemon/client model and CLI updates for embedding workflows. (e6511e7)

Bug Fixes

  • Fix memory retention so timestamps are correctly taken into account. (234d426)

0.1.12

Features

  • Added an extensions system for plugging in new operations/skills (including built-in tenant support). (2a0c490)
  • Introduced the hindsight-embed tool and a native agentic skill for embedding/agent workflows. (da44a5e)

Improvements

  • Improved reliability when parsing LLM JSON by retrying on parse errors and adding clearer diagnostics. (a831a7b)
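
The retry-on-parse-error behavior described above can be sketched as follows. This is an illustrative assumption about the general pattern, not the project's code: `parse_json_with_retry` and the `call_llm` callable are hypothetical names.

```python
import json
from typing import Any, Callable, Optional


def parse_json_with_retry(call_llm: Callable[[], str], max_attempts: int = 3) -> Any:
    """Call the model again whenever its output is not valid JSON.

    `call_llm` is a hypothetical zero-argument callable returning raw model text.
    """
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_llm()
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            # Keep the most recent error so the final message is diagnostic.
            last_error = exc
    raise ValueError(
        f"LLM returned invalid JSON after {max_attempts} attempts: {last_error}"
    )


# Usage: the first response is malformed, the second one parses.
responses = iter(["not json", '{"ok": true}'])
result = parse_json_with_retry(lambda: next(responses))
```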

Bug Fixes

  • Fixed structured-output support for Ollama-based LLM providers. (32bca12)
  • Adjusted LLM validation to cap max completion tokens at 100 to prevent validation failures. (b94b5cf)

0.1.11

Bug Fixes

  • Fixed the standalone Docker image and control plane standalone build process so standalone deployments build correctly. (2948cb6)

0.1.10

This release contains internal maintenance and infrastructure changes only.

0.1.9

Features

  • Simplified local MCP installation and added a standalone UI option for easier setup. (1c6acc3)

Bug Fixes

  • Fixed the standalone Docker image so it builds and starts reliably. (b52eb90)
  • Improved Docker runtime reliability by adding required system utilities (procps). (ae80876)

0.1.8

Bug Fixes

  • Fix bank list responses when a bank has no name. (04f01ab)
  • Fix failures when retaining memories asynchronously. (63f5138)
  • Fix a race condition in the bank selector when switching banks. (e468a4e)

0.1.7

This release contains internal maintenance and infrastructure changes only.

0.1.6

Features

  • Added support for the Gemini 3 Pro and GPT-5.2 models. (bb1f9cb)
  • Added a local MCP server option for running/connecting to Hindsight via MCP without a separate remote service. (7dd6853)

Improvements

  • Updated the Postgres/pg0 dependency to a newer 0.11.x series for improved compatibility and stability. (47be07f)

0.1.5

Features

  • Added LiteLLM integration so Hindsight can capture and manage memories from LiteLLM-based LLM calls. (dfccbf2)
  • Added an optional graph-based retriever (MPFP) to improve recall by leveraging relationships between memories. (7445cef)

Improvements

  • Switched the embedded Postgres layer to pg0-embedded for a smoother local/standalone experience. (94c2b85)

Bug Fixes

  • Fixed repeated retries on 400 errors from the LLM, preventing unnecessary request loops and failures. (70983f5)
  • Fixed recall trace visualization in the control plane so search/recall debugging displays correctly. (922164e)
  • Fixed the CLI installer to make installation more reliable. (158a6aa)
  • Updated Next.js to patch security vulnerabilities (CVE-2025-55184, CVE-2025-55183). (f018cc5)
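
The 400-error fix above reflects a common rule: a malformed request fails the same way every time, so retrying it only wastes calls. Below is a minimal sketch of that rule, assuming a hypothetical `LLMHTTPError` exception that carries the HTTP status. It is not Hindsight's actual retry code, and backoff between attempts is omitted for brevity.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class LLMHTTPError(Exception):
    """Hypothetical error carrying the HTTP status returned by an LLM provider."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"LLM request failed with HTTP {status_code}")
        self.status_code = status_code


def call_with_retry(send: Callable[[], T], max_attempts: int = 3) -> T:
    """Retry transient failures, but surface HTTP 400 immediately."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except LLMHTTPError as exc:
            # A 400 means the request itself is invalid; retrying cannot help.
            if exc.status_code == 400 or attempt == max_attempts:
                raise
    raise RuntimeError("max_attempts must be at least 1")


# Usage: a 400 is raised after exactly one call.
calls = []


def bad_request() -> str:
    calls.append(1)
    raise LLMHTTPError(400)


try:
    call_with_retry(bad_request)
except LLMHTTPError:
    pass
```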

0.1.3

Improvements

  • Improved CLI and UI branding/polish, including new banner/logo assets and updated interface styling. (fa554b8)

0.1.2

Bug Fixes

  • Fixed the standalone Docker image so it builds/runs correctly. (1056a20)

Integration Changelogs

Integration  Package                           Description
LiteLLM      hindsight-litellm                 Universal LLM memory via LiteLLM (100+ providers)
Pydantic AI  hindsight-pydantic-ai             Persistent memory tools for Pydantic AI agents
CrewAI       hindsight-crewai                  Persistent memory for CrewAI agents
AI SDK       @vectorize-io/hindsight-ai-sdk    Memory integration for Vercel AI SDK
Chat SDK     @vectorize-io/hindsight-chat      Memory integration for Vercel Chat SDK
OpenClaw     @vectorize-io/hindsight-openclaw  Hindsight memory plugin for OpenClaw