What's new in Hindsight 0.4.16
Hindsight 0.4.16 introduces a webhook system for event-driven memory pipelines, improved search quality, and several reliability fixes.
- Webhooks: React to memory events in real time with a new webhook system.
- Search Quality: Improved recall ranking with multiplicative scoring boosts.
Webhooks
Hindsight now supports webhooks, letting you react to memory events in real time without polling the API.
Webhooks are registered per memory bank and fire automatically when matching events occur. Two event types are supported:
retain.completed — fired once per document after a retain operation finishes (both synchronous and async). When retaining a batch of N documents, N separate events fire.
{
  "event": "retain.completed",
  "bank_id": "my-bank",
  "operation_id": "a1b2c3d4e5f6",
  "status": "completed",
  "timestamp": "2026-03-05T12:00:01Z",
  "data": {
    "document_id": "doc-abc123",
    "tags": ["meeting", "q1-2026"]
  }
}
consolidation.completed — fired after Hindsight finishes consolidating new memories into observations.
{
  "event": "consolidation.completed",
  "bank_id": "my-bank",
  "operation_id": "a1b2c3d4e5f6",
  "status": "completed",
  "timestamp": "2026-03-05T12:00:00Z",
  "data": {
    "observations_created": 3,
    "observations_updated": 1,
    "observations_deleted": null,
    "error_message": null
  }
}
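A minimal receiver can parse the payload and dispatch on the event field. The sketch below uses only the fields shown in the payloads above; the handler return strings are illustrative, not part of the webhook contract.

```python
import json


def handle_webhook(raw_body: str) -> str:
    """Parse a Hindsight webhook payload and dispatch on its event type."""
    payload = json.loads(raw_body)
    event = payload["event"]
    data = payload["data"]
    if event == "retain.completed":
        # One event fires per document, even for batch retains.
        return f"retained {data['document_id']} (tags: {', '.join(data['tags'])})"
    if event == "consolidation.completed":
        created = data["observations_created"]
        updated = data["observations_updated"]
        return f"consolidation: {created} created, {updated} updated"
    # Ignore event types this receiver doesn't know about.
    return f"ignored event {event}"
```

Ignoring unknown event types (rather than erroring) keeps the receiver forward-compatible if new event types are added later.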
Delivery and retries: failed deliveries (non-2xx response or timeout after 30 seconds) are retried with exponential backoff—at 5 seconds, 5 minutes, 30 minutes, 2 hours, and 5 hours—before being marked as permanently failed. Delivery tasks are queued in the same database transaction as the triggering operation, so events survive server crashes and will be retried on restart. Use the operation_id field to deduplicate if your endpoint receives the same event more than once.
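Because redelivery is possible, the receiving endpoint should be idempotent. A minimal sketch, assuming an in-memory set as the dedup store (a real endpoint would use something durable) and assuming that batch retain events share an operation_id but differ in document_id, so the key includes both:

```python
import json

# Assumed in-memory dedup store; use a durable store in production.
_seen: set[tuple] = set()


def process_once(raw_body: str) -> bool:
    """Return True if the event was processed, False if it was a duplicate.

    Keys on (operation_id, event, document_id); document_id is None for
    events like consolidation.completed that carry no per-document data.
    """
    payload = json.loads(raw_body)
    key = (
        payload["operation_id"],
        payload["event"],
        payload.get("data", {}).get("document_id"),
    )
    if key in _seen:
        return False
    _seen.add(key)
    # ... real processing goes here ...
    return True
```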
The control plane UI includes a new webhook management section where you can register endpoints, view delivery history, inspect payloads, and replay failed deliveries.
See the webhooks documentation for the full reference.
Search Quality
Different reranker models output scores in very different ranges and distributions—a cross-encoder might return values tightly clustered near zero, while another model spreads scores across a wide range. The previous normalization treated all models the same, which caused reranker scores to either dominate or be drowned out when combined with semantic and BM25 signals depending on which model was configured.
Recall ranking now applies temporal proximity and recency as multiplicative boosts on top of the reranker score, rather than adding them as separate score components. This sidesteps the normalization problem: instead of summing values that may be on incompatible scales, the boosts scale the reranker score proportionally. The final ranking is much less sensitive to the absolute score range of the configured reranker model.
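The difference is easy to see with numbers. In the sketch below the boost values and score ranges are illustrative, not Hindsight's actual factors: a flat additive recency bonus flips the ranking for a reranker that scores near zero but barely registers for one with a wide range, while a multiplicative boost scales both proportionally.

```python
def additive(reranker_score: float, recency_bonus: float) -> float:
    return reranker_score + recency_bonus


def multiplicative(reranker_score: float, recency_boost: float) -> float:
    return reranker_score * recency_boost


# Two rerankers scoring the same pair of documents, on different scales.
# Doc "a" is more relevant but older; doc "b" is less relevant but recent.
tight = {"a": 0.008, "b": 0.004}   # cross-encoder clustered near zero
wide = {"a": 8.0, "b": 4.0}        # model with a wide score range

# Additive: a flat 0.3 recency bonus flips the order for the tight-range
# model but not for the wide one -- the ranking depends on score scale.
assert additive(tight["a"], 0.0) < additive(tight["b"], 0.3)
assert additive(wide["a"], 0.0) > additive(wide["b"], 0.3)

# Multiplicative: a 1.3x recency boost preserves the order on both
# scales, because relative score differences survive scaling.
assert multiplicative(tight["a"], 1.0) > multiplicative(tight["b"], 1.3)
assert multiplicative(wide["a"], 1.0) > multiplicative(wide["b"], 1.3)
```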
No configuration is required. The change applies automatically to all recall requests regardless of which reranker is configured.
Other Updates
Bug Fixes
- Fixed an async deadlock risk during startup by running database schema migrations in a background thread instead of the event loop.
- Fixed webhook delivery so transactions no longer silently roll back due to an incorrect database schema name in the outbox processor.
- Fixed observation recall to correctly resolve and return related chunks using source_memory_ids.
- Fixed MCP bank-level tool filtering to be compatible with FastMCP 3.x.
- Fixed a crash when an LLM returns invalid JSON across all retries; the error is now handled cleanly instead of raising a TypeError.
- Fixed observations without source dates to keep their temporal fields as None instead of incorrectly populating them.
Feedback and Community
Hindsight 0.4.16 is a drop-in replacement for 0.4.x with no breaking changes.
Share your feedback.
For detailed changes, see the full changelog.
