
What's new in Hindsight 0.5.1

Nicolò Boschi
Hindsight Team

Hindsight 0.5.1 is a polish release focused on the CLI and reliability fixes across retain, recall, and the embedded daemon. It also introduces a Cloudflare OAuth proxy for self-hosted deployments, SiliconFlow as a reranker provider, a default bank template applied at bank creation, and a new @vectorize-io/hindsight-all daemon lifecycle package.

CLI Coverage

The hindsight CLI has been expanded to cover every endpoint in the OpenAPI spec, including every request-body parameter — no more dropping down to curl for operations the CLI didn't implement yet. Webhook, audit, operation, and memory-history subcommands are now fully documented.

Error output is also easier to debug: API errors now surface the HTTP response body, and the memory list command correctly labels fact types instead of showing [UNKNOWN] for every row.

Cloudflare OAuth Proxy

Self-hosted Hindsight deployments often want OAuth in front of the API without running a full identity provider. The new Cloudflare OAuth proxy is a Cloudflare Worker that sits in front of Hindsight, handles the OAuth handshake, and forwards authenticated requests downstream. It ships with tests, CI, and security hardening, and is intended as a drop-in option for teams already using Cloudflare as their edge.

Default Bank Template

A new HINDSIGHT_API_DEFAULT_BANK_TEMPLATE environment variable applies a template manifest to every newly created bank. This makes it easy to standardize bank configuration — retain mission, observations, mental models, directives — across a deployment without touching application code or post-provisioning hooks.

Pair it with the Bank Template Hub introduced in 0.5.0 to export a known-good bank once and have every new bank come up with the same baseline.

SiliconFlow Reranker

SiliconFlow is now a supported reranker provider alongside the existing options, giving self-hosted and cost-sensitive deployments another hosted reranker to choose from. Configure it like any other provider via HINDSIGHT_API_RERANKER_PROVIDER.

Daemon Lifecycle Package

@vectorize-io/hindsight-all is a new npm package that manages the lifecycle of the all-in-one daemon (start, stop, health checks) so Node-based applications can boot and shut down Hindsight locally without shelling out. It is useful for local development, embedded deployments, and test harnesses that need a fresh Hindsight instance per run.

Reliability Fixes

A cluster of fixes addresses edge cases spotted in production:

  • Recall RRF ranking. When the reranker is configured as a passthrough, RRF ordering is now preserved end-to-end instead of being silently replaced.
  • Async query embeddings. Recall now uses the async batch embeddings path for the query, removing a blocking call on the event loop.
  • Idempotent retain chunk inserts. Chunk insertion no longer retries on integrity errors, eliminating a class of duplicate-insert loops under concurrent retain.
  • ANN seeds inside a transaction. The retain ANN seeds temp table is now created inside its transaction, fixing an intermittent "relation does not exist" under concurrent load.
  • Consolidation slot accounting. The worker now reserves consolidation slots within the configured max_slots instead of occasionally exceeding it.
  • Embedded daemon lifecycle. Daemon start and stop are now serialized so that starting a new instance can no longer kill a healthy one mid-request.
  • macOS local embeddings/reranker. The macOS FORCE_CPU default for local embeddings and reranker has been restored after a regression caused GPU initialization to fail on some Apple Silicon setups.
  • Reranker error surfacing. Reranker initialization now surfaces the real import error instead of a generic failure, and works around a Transformers 5.x race in the jina-mlx path.
  • Reasoning models & Azure OpenAI. LLM calls now send max_completion_tokens (instead of max_tokens) for reasoning models and Azure OpenAI, fixing request failures for recent OpenAI reasoning-family models.
  • Billing for reflect sub-recalls. Internal recall calls made by reflect are now marked as internal so they don't double-bill against the caller's usage.
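
To make the first fix concrete, here is a minimal sketch of reciprocal rank fusion (RRF) followed by a passthrough reranker. It is not Hindsight's implementation: the function names, the memory IDs, and the constant k=60 (a commonly used default for RRF) are assumptions chosen for illustration. The point it shows is that a passthrough reranker should hand back the fused order untouched.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of IDs with reciprocal rank fusion.

    Each item scores 1 / (k + rank) in every list it appears in,
    and the per-list scores are summed. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


def passthrough_rerank(candidates):
    """Hypothetical passthrough reranker: returns candidates in the order received."""
    return list(candidates)


# Two hypothetical retrieval strategies returning memory IDs, best first.
semantic_hits = ["m1", "m2", "m3"]
keyword_hits = ["m2", "m4", "m1"]

fused = rrf_fuse([semantic_hits, keyword_hits])
final = passthrough_rerank(fused)

print(fused)  # ['m2', 'm1', 'm4', 'm3']
print(final == fused)  # True: the RRF order survives the passthrough step
```

Here m2 ranks first because it sits near the top of both lists, while m3 and m4 each appear in only one. Any step after fusion that re-sorts by a different score would lose that ordering, which is the class of bug the fix describes.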

Documentation

  • The mental model tag filtering docs now clarify that tags filter the source memories used during refresh.
  • Audit logging is clearly documented as off by default.
  • The retain update_mode parameter is fully documented.
  • The CLI reference gains docs for the webhook, audit, operation, and memory-history subcommands.

Feedback and Community

Hindsight 0.5.1 is a drop-in replacement for 0.5.0 with no breaking changes to the core API.

For detailed changes, see the full changelog.