What's new in Hindsight 0.4.11
Hindsight 0.4.11 focuses on production-ready deployments with improved flexibility and observability.
- Hierarchical Configuration: Customize operational settings per memory bank.
- LiteLLM SDK Integration: Direct API access without proxy server.
- Expanded Database Support: TimescaleDB pg_textsearch and additional Postgres extensions.
- OpenTelemetry Tracing: Request-level observability with ready-to-use Grafana stack.
- MCP Mental Models: Full lifecycle management via Model Context Protocol.
- Documentation Skill: Build documentation-aware assistants.
Hierarchical Configuration
You can now customize operational settings per memory bank. Configure retention behavior, extraction modes, and custom instructions for each bank independently.
# Update retention settings for a specific bank
curl -X PATCH http://localhost:8888/v1/default/banks/my-bank/config \
  -H "Content-Type: application/json" \
  -d '{
    "updates": {
      "retain_chunk_size": 1000,
      "retain_extraction_mode": "custom",
      "retain_custom_instructions": "Keep specific details about incidents, ignore complaints."
    }
  }'
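If you would rather script this than use curl, the same request can be built with Python's standard library. This is a minimal sketch: the URL, header, and payload are copied from the curl example, while the helper name `build_bank_config_request` is made up for illustration. It builds the request without sending it, so it runs even without a Hindsight server.

```python
import json
import urllib.request

# Endpoint pieces copied from the curl example; adjust host, port, and bank name.
BASE_URL = "http://localhost:8888"
BANK_ID = "my-bank"


def build_bank_config_request(base_url: str, bank_id: str, updates: dict) -> urllib.request.Request:
    """Build (but do not send) the PATCH request for a bank's configuration.

    Hypothetical helper for illustration; not part of any Hindsight client.
    """
    body = json.dumps({"updates": updates}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/default/banks/{bank_id}/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


request = build_bank_config_request(
    BASE_URL,
    BANK_ID,
    {
        "retain_chunk_size": 1000,
        "retain_extraction_mode": "custom",
        "retain_custom_instructions": "Keep specific details about incidents, ignore complaints.",
    },
)

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Keeping the build step separate from the send step makes the payload easy to inspect or log before it reaches the server.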
Configuration cascades from system defaults (env vars) → tenant overrides → bank-specific settings, with each more specific level taking precedence over the one before it. The bank config API is enabled by default and can be disabled with HINDSIGHT_API_ENABLE_BANK_CONFIG_API=false.
Type-safe access prevents accidentally using global defaults when bank overrides exist. See the Configuration Guide for details on hierarchical configuration.
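To make the cascade concrete, here is a minimal sketch of how three layers could resolve into one effective configuration. It is not Hindsight's implementation. The key names come from the example above, but the default and tenant values and the `resolve_config` helper are invented for illustration.

```python
from collections import ChainMap

# Hypothetical values; only the key names appear in the release notes.
system_defaults = {"retain_chunk_size": 3000, "retain_extraction_mode": "example-default"}
tenant_overrides = {"retain_chunk_size": 2000}
bank_settings = {"retain_chunk_size": 1000, "retain_extraction_mode": "custom"}


def resolve_config(system: dict, tenant: dict, bank: dict) -> dict:
    """Return the effective config: bank beats tenant, tenant beats system."""
    # ChainMap looks keys up left to right, so the most specific layer goes first.
    return dict(ChainMap(bank, tenant, system))


effective = resolve_config(system_defaults, tenant_overrides, bank_settings)
print(effective["retain_chunk_size"], effective["retain_extraction_mode"])
```

A bank with no settings of its own falls through to the tenant value, and then to the system default for any key the tenant does not set.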
LiteLLM SDK Integration
Hindsight already supported LiteLLM via proxy mode (routing requests through a LiteLLM proxy server). Now you can use LiteLLM directly via the Python SDK for embeddings and reranking—no proxy server needed.
This means simpler setup, lower latency, and fewer infrastructure components while still getting LiteLLM's benefits: unified observability, model fallback, and multi-provider support.
# Before: Required running a separate LiteLLM proxy server
export HINDSIGHT_API_EMBEDDINGS_PROVIDER=litellm
export HINDSIGHT_API_EMBEDDINGS_LITELLM_API_BASE=http://localhost:4000
# Now: Direct SDK access, no proxy needed
export HINDSIGHT_API_EMBEDDINGS_PROVIDER=litellm-sdk
export HINDSIGHT_API_EMBEDDINGS_LITELLM_SDK_API_KEY=your-api-key
export HINDSIGHT_API_EMBEDDINGS_LITELLM_SDK_MODEL=cohere/embed-english-v3.0
The same applies to reranking with HINDSIGHT_API_RERANKER_PROVIDER=litellm-sdk. Use proxy mode when you need centralized rate limiting and caching; use SDK mode for simpler deployments.
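When switching between the two modes, it can help to check that the variables for the chosen provider are all set. The sketch below is a hypothetical pre-flight helper, not part of Hindsight. The variable names are taken from the shell examples above, and treating each of them as required is an assumption.

```python
# Variable names come from the shell examples; which ones are strictly
# required is an assumption made for this sketch.
REQUIRED_BY_PROVIDER = {
    "litellm": ["HINDSIGHT_API_EMBEDDINGS_LITELLM_API_BASE"],
    "litellm-sdk": [
        "HINDSIGHT_API_EMBEDDINGS_LITELLM_SDK_API_KEY",
        "HINDSIGHT_API_EMBEDDINGS_LITELLM_SDK_MODEL",
    ],
}


def missing_embedding_settings(env: dict) -> list:
    """Return the variables that are unset or empty for the selected provider."""
    provider = env.get("HINDSIGHT_API_EMBEDDINGS_PROVIDER", "")
    required = REQUIRED_BY_PROVIDER.get(provider, [])
    return [name for name in required if not env.get(name)]


# Example: SDK mode selected, but the model variable was forgotten.
sample_env = {
    "HINDSIGHT_API_EMBEDDINGS_PROVIDER": "litellm-sdk",
    "HINDSIGHT_API_EMBEDDINGS_LITELLM_SDK_API_KEY": "your-api-key",
}
print(missing_embedding_settings(sample_env))
```

Passing `dict(os.environ)` instead of `sample_env` would run the same check against the real environment.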
Expanded Database Support
PostgreSQL search support now includes:
- TimescaleDB pg_textsearch for better full-text search in time-series workloads (docker-compose example)
- vchord and pgvector for flexible vector storage options (docker-compose example)
- Better support for external Postgres instances with custom configurations
This gives you more deployment options whether you're running in the cloud, on-prem, or in specialized environments.
OpenTelemetry Tracing
Hindsight now emits OpenTelemetry traces for all operations, providing request-level observability across distributed systems. Together with actual LLM token usage (not estimates) and improved user-initiated attribution in the request context, these traces give you visibility into both costs and performance.
For local development, run ./scripts/dev/start-monitoring.sh to launch a ready-to-use Grafana LGTM stack (Loki, Grafana, Tempo, Mimir) with pre-configured dashboards—traces, metrics, and logs in a single container.
Async background operations are also properly attributed, making it easier to track usage and debug issues in production.
MCP Mental Models
The Model Context Protocol server now supports full mental model lifecycle management. Agents using Hindsight via MCP can create, read, update, and delete mental models—not just query them.
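MCP clients invoke server tools with a JSON-RPC 2.0 `tools/call` request, so a create operation might look roughly like the sketch below. The envelope shape follows the MCP specification. The tool name `create_mental_model` and its arguments are hypothetical, so list the server's tools to find the real names and schemas.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
create = tool_call("create_mental_model", {"bank_id": "my-bank", "name": "on-call runbook"})
print(json.dumps(create, indent=2))
```

In practice an MCP client library builds this envelope for you. The sketch only shows what travels over the wire.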
Documentation Skill
A new "docs" skill enables documentation-oriented capabilities, making it easier to build documentation-aware assistants that can access and reason over your documentation.
Reverse Proxy Support
Running Hindsight behind a reverse proxy or at a non-root path? Configure your base path and Hindsight handles routing correctly, making it easier to integrate with existing infrastructure.
See the nginx docker-compose example for a ready-to-use setup.
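As a rough illustration of serving at a non-root path, the sketch below maps an internal route to the path a client behind the proxy would request. Both helpers and the `/hindsight` prefix are hypothetical. They show the general idea, not Hindsight's routing code.

```python
def normalize_base_path(base_path: str) -> str:
    """Return '' for the root, else a path with one leading and no trailing slash."""
    stripped = base_path.strip("/")
    return f"/{stripped}" if stripped else ""


def external_path(base_path: str, route: str) -> str:
    """Path a client behind the proxy would request for an internal route."""
    return normalize_base_path(base_path) + "/" + route.lstrip("/")


# Hypothetical prefix; the route is the one used in the bank config example.
print(external_path("/hindsight/", "/v1/default/banks/my-bank/config"))
```

Normalizing the prefix once avoids doubled or missing slashes when the proxy and the application disagree about trailing slashes.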
Other Updates
- Helm chart improvements: Split TEI deployments for embeddings and reranking, PodDisruptionBudgets, per-component affinity controls, and fixed GKE port configuration.
- Slim Docker image: Now includes tiktoken to prevent download errors.
Feedback and Community
Hindsight 0.4.11 is a drop-in replacement for 0.4.x with no breaking changes.
For detailed changes, see the full changelog.
