What's new in Hindsight 0.5.4
Hindsight 0.5.4 makes delta mental model refreshes dramatically faster and more accurate, hardens embedded mode with daemon crash recovery, and includes a wave of reflect and retain reliability fixes from the community.
- Smarter Delta Refreshes: Delta mode now only recalls new memories, producing faster and more focused updates.
- Embedded Daemon Recovery: Liveness checks auto-restart crashed daemons.
- Reflect Reliability: Directive leak fix, mission identity framing, and soft-fail on startup.
- Retain & Worker Fixes: Duplicate memory prevention, document timestamp preservation, and worker schema scanning.
Smarter Delta Refreshes
Delta mental model refreshes previously ran a full recall across every memory in the bank — identical to a full refresh — then asked a second LLM to diff the result against the existing document. This was slow, expensive, and caused content duplication when the LLM was overwhelmed by hundreds of observations.
In 0.5.4, delta mode scopes the agentic recall loop to memories created or updated since `last_refreshed_at`. The existing document already captures prior knowledge; the reflect agent only retrieves what's genuinely new. The delta prompt has been rewritten to integrate new facts into the existing structure — merging overlapping topics, preserving concrete examples, and avoiding the wholesale section replacements that caused bloat.
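The scoping step can be pictured as a simple timestamp filter in front of the recall loop. This is a minimal sketch, assuming memory records carry an `updated_at` datetime; the function and field names are illustrative, not Hindsight's actual API:

```python
from datetime import datetime, timezone

def delta_refresh_scope(memories, last_refreshed_at):
    """Return only memories created or updated since the last refresh.

    `memories` is any iterable of dicts with an `updated_at` datetime.
    Both names are illustrative, not Hindsight's actual schema.
    """
    return [m for m in memories if m["updated_at"] > last_refreshed_at]

bank = [
    {"text": "old fact", "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"text": "new fact", "updated_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]
fresh = delta_refresh_scope(bank, datetime(2024, 3, 1, tzinfo=timezone.utc))
# Only "new fact" reaches the agentic recall loop; when `fresh` is
# empty, the delta LLM call can be skipped entirely.
```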
The reflect agent also receives context about the mental model's name and topic during refresh, helping it stay on-topic and discard tangential observations that retrieval may return.
The result: delta refreshes that are cheaper (fewer recall queries, smaller LLM context), faster (skip the delta LLM call entirely when nothing changed), and produce cleaner documents.
Embedded Daemon Recovery
The embedded mode daemon (`hindsight-all`, `hindsight-embed`) now includes a liveness check that detects when the background process has crashed or become unresponsive. On detection, the daemon is automatically restarted with the same configuration. The idle timeout is also disabled by default, so embedded instances stay alive between requests without manual keepalive.
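The restart behavior amounts to "check, then relaunch with the same configuration". A minimal sketch of that loop step, assuming a zero-arg `start` callable that relaunches the daemon (illustrative names; the real check also covers unresponsiveness, not just process exit):

```python
import subprocess

def ensure_alive(proc, start):
    """Return `proc` if it is still running, else a fresh replacement.

    `start` is a zero-arg callable that relaunches the daemon with the
    same configuration. Names are illustrative, not Hindsight's API.
    """
    if proc is None or proc.poll() is not None:  # never started or dead
        return start()
    return proc

# Usage: call periodically from a liveness timer.
start = lambda: subprocess.Popen(["sleep", "60"])
proc = start()
proc = ensure_alive(proc, start)   # still running: unchanged
proc.terminate(); proc.wait()
proc = ensure_alive(proc, start)   # crashed: restarted
```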
Reflect Reliability
Three fixes improve reflect quality and operational robustness:
- Directive leak prevention. On empty banks with no memories, directive content could leak into the reflect answer text. Fixed — directives now stay in the system prompt where they belong.
- Mission identity framing. The `reflect_mission` bank setting now correctly shapes the agent's identity in the prompt builder, so agents with custom personalities actually use them.
- Soft-fail on startup. If the LLM provider is unreachable at startup, the server now logs a warning and continues instead of crashing. This prevents cascading restarts when a provider has a brief outage.
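The soft-fail behavior is a small but important pattern: probe the provider, but treat failure as a warning rather than a fatal error. A minimal sketch, assuming a zero-arg `ping` callable that raises when the provider is unreachable (illustrative, not Hindsight's actual probe):

```python
import logging

log = logging.getLogger("hindsight")

def check_provider(ping):
    """Probe the LLM provider at startup without crashing on failure.

    `ping` is any zero-arg callable that raises when the provider is
    unreachable (an illustrative stand-in for the real health probe).
    Returns True if the provider answered, False otherwise.
    """
    try:
        ping()
        return True
    except Exception as exc:
        # Soft-fail: log a warning and keep serving, so a brief provider
        # outage does not trigger cascading restarts of the server.
        log.warning("LLM provider unreachable at startup: %s", exc)
        return False
```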
Retain & Worker Fixes
- Duplicate memory prevention. Fixed chunk index scrambling during concurrent document upserts that could create duplicate memory units.
- Document timestamp preservation. The
created_attimestamp is now preserved across document upserts instead of being reset. - Non-Latin text in prompts. LLM prompts now use
ensure_ascii=Falseso Chinese, Japanese, Korean, and other non-Latin text is sent correctly. - Worker schema scanning. Workers now verify schemas are active before claiming tasks, preventing work on decommissioned tenants.
- DeferOperation passthrough. Extensions can now throw `DeferOperation` through `MemoryEngine.execute_task` for proper requeue handling.
- Configurable embedding batch size. OpenAI-compatible embedding providers support a configurable batch size for better throughput on large ingests.
- Alembic head merge. Divergent migration heads from v0.5.3 are merged so upgrades work cleanly.
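The `ensure_ascii` fix above is easy to see in isolation with the standard library's `json` module; the payload below is just an illustrative example, not Hindsight's prompt format:

```python
import json

payload = {"memory": "记忆", "note": "メモ"}

# Default: non-ASCII characters are escaped into \uXXXX sequences,
# which bloats prompts and obscures the original text.
escaped = json.dumps(payload)

# With ensure_ascii=False the original characters are kept verbatim,
# which is what 0.5.4 now does when serializing LLM prompts.
verbatim = json.dumps(payload, ensure_ascii=False)

print(escaped)   # {"memory": "\u8bb0\u5fc6", "note": "\u30e1\u30e2"}
print(verbatim)  # {"memory": "记忆", "note": "メモ"}
```

Both strings parse back to the same object; only the wire form differs.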
Feedback and Community
Hindsight 0.5.4 is a drop-in replacement for 0.5.3 with no breaking changes to the core API.
Share your feedback, and see the full changelog for detailed changes.
