Memory Banks

Memory banks are isolated containers that store all memory-related data for a specific context or use case.

What is a Memory Bank?

A memory bank is a complete, isolated storage unit containing:

  • Memories — Facts and information retained from conversations
  • Documents — Files and content indexed for retrieval
  • Entities — People, places, concepts extracted from memories
  • Relationships — Connections between entities in the knowledge graph
  • Directives — Hard rules the agent must follow during reflect operations

Banks are completely isolated from each other — memories stored in one bank are not visible to another.

You don't need to pre-create a bank. Hindsight will automatically create it with default settings when you first use it.

Prerequisites

Make sure you've completed the Quick Start to install the client and start the server.

Creating a Memory Bank

client.create_bank(bank_id="my-bank")

Bank Configuration

Each memory bank is configured independently, with separate settings for each operation (retain, observations, reflect, recall). Configuration can be set via the bank config API, the Control Plane UI, or server-wide environment variables.

retain_mission

A plain-language description of what this bank should pay attention to during extraction. The mission is injected into the extraction prompt alongside the built-in rules — it steers focus without replacing the extraction logic.

e.g. Always include technical decisions, API design choices, and architectural trade-offs.
Ignore meeting logistics, greetings, and social exchanges.

Works alongside any extraction mode. Leave blank for general-purpose extraction.

retain_extraction_mode

Controls how aggressively facts are extracted:

| Mode | Description |
| --- | --- |
| concise (default) | Selective: only facts worth remembering long-term |
| verbose | Captures more detail per fact; slower and uses more tokens |
| custom | Write your own extraction rules via retain_custom_instructions |

retain_custom_instructions

Only active when retain_extraction_mode is custom. Replaces the built-in extraction rules entirely with your own instructions.

retain_chunk_size

Maximum number of characters per chunk when splitting content for fact extraction. Larger chunks mean fewer LLM calls but may reduce extraction quality on long inputs; smaller chunks improve granularity at the cost of more calls.

Default: 3000

See Retain configuration for environment variable names and defaults.
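
To make the trade-off concrete, the sketch below estimates how many chunks (and so roughly how many extraction calls) an input produces at different chunk sizes. It assumes naive fixed-width splitting by character count; the real splitter may respect sentence or paragraph boundaries, and estimate_chunks is a hypothetical helper, not part of the Hindsight client.

```python
import math


def estimate_chunks(content_length: int, chunk_size: int = 3000) -> int:
    """Estimate the chunk count under naive fixed-width splitting (an assumption)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(content_length / chunk_size)


# Compare three candidate chunk sizes for a 10,000-character transcript.
for size in (1500, 3000, 6000):
    print(size, estimate_chunks(10_000, size))
```

Halving the chunk size roughly doubles the number of calls, which is the cost side of the finer granularity described above.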

entity_labels

Defines a controlled vocabulary of key:value classification labels that are extracted at retain time and stored as entities. Because labels become entities, they automatically link memories in the knowledge graph (two memories labeled pedagogy:scaffolding are linked) and improve semantic and BM25 retrieval. When tag: true is set on a group, its labels can also be used to filter memories via the standard tags/tags_match API.

Each entry in entity_labels is a label group — one classification dimension:

{
  "entity_labels": [
    {
      "key": "engagement",
      "description": "Student engagement level during the session",
      "type": "value",
      "optional": true,
      "values": [
        { "value": "active", "description": "Student is actively participating" },
        { "value": "passive", "description": "Student is listening but not participating" }
      ]
    },
    {
      "key": "pedagogy",
      "description": "Teaching strategies used",
      "type": "multi-values",
      "values": [
        { "value": "scaffolding", "description": "Breaking complex tasks into smaller steps" },
        { "value": "direct_instruction", "description": "Explicit explanation by the teacher" },
        { "value": "socratic_questioning", "description": "Guiding through questions rather than answers" }
      ]
    }
  ]
}

| Field | Default | Description |
| --- | --- | --- |
| key | | Label group identifier. Becomes the prefix in key:value entities (or key:field:value for "map"). |
| description | "" | Shown to the LLM to guide label assignment. |
| type | "value" | "value" → pick one enum value; "multi-values" → pick multiple; "text" → free-form string; "map" → structured group with named fields. |
| values | [] | Allowed values for "value" and "multi-values" types. Ignored for "text" and "map". |
| fields | {} | Field definitions for "map" types. Each field is itself typed ("text", "value", "multi-values", or nested "map"). Ignored for non-map types. |
| optional | true | When true, the LLM may skip the label if not applicable. When false, the LLM must always assign a value. Has no effect on "multi-values" groups (always optional). |
| tag | false | When true, extracted key:value labels are also written as tags on the memory unit, enabling filtering via tags/tags_match in recall/reflect. |

Enum groups (type: "value" or type: "multi-values"): the LLM picks from the predefined values list; anything outside the list is silently dropped. Vocabulary stays stable and graph links stay tight. Use "multi-values" when a fact can belong to several values at once.
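
The vocabulary filtering described above can be pictured with a small sketch. It is only an illustration of the documented behaviour (values outside the list are dropped, the rest become key:value entities); labels_to_entities is a hypothetical helper, and keeping at most one value for a "value" group is an assumption about how a single-choice group would be enforced.

```python
def labels_to_entities(group: dict, proposed: list[str]) -> list[str]:
    """Turn LLM-proposed values for an enum group into key:value entity strings."""
    allowed = {v["value"] for v in group.get("values", [])}
    # Anything outside the predefined vocabulary is silently dropped.
    kept = [p for p in proposed if p in allowed]
    if group.get("type", "value") == "value":
        kept = kept[:1]  # single-choice group: keep at most one value (assumption)
    return [f"{group['key']}:{value}" for value in kept]


pedagogy = {
    "key": "pedagogy",
    "type": "multi-values",
    "values": [
        {"value": "scaffolding"},
        {"value": "direct_instruction"},
        {"value": "socratic_questioning"},
    ],
}

# "gamification" is not in the vocabulary, so only one entity survives.
print(labels_to_entities(pedagogy, ["scaffolding", "gamification"]))
```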

Free-text groups (type: "text"): the LLM writes any string. Use the description field to provide examples and guidance. Graph clustering is less reliable than with enum groups because the model may phrase the same concept differently across sessions.

{
  "key": "topic",
  "description": "Specific subject being discussed. Examples: algebra, quadratic equations, geometry.",
  "type": "text",
  "optional": true,
  "values": []
}

Map groups (type: "map"): defines a structured entity type with named fields. Each field is itself typed ("text", "value", "multi-values", or nested "map") so you can describe rich entities like a person with name, role, and organization. Each extracted field is stored as a flat key:field:value entity string (e.g. person:name:Alice), reusing the existing entity storage with no schema changes — so map fields participate in the knowledge graph and retrieval the same way single-value labels do.

{
  "key": "person",
  "description": "A person mentioned in the text",
  "type": "map",
  "fields": {
    "name": { "type": "text", "description": "Full name of the person" },
    "role": { "type": "text", "description": "Job title or role" },
    "organization": { "type": "text", "description": "Company or organization" }
  }
}
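
As a rough picture of the flattening, the sketch below turns one extracted person into key:field:value strings. flatten_map_label is a hypothetical helper; it handles only flat text fields, and skipping empty fields is an assumption rather than documented behaviour.

```python
def flatten_map_label(key: str, extracted: dict[str, str]) -> list[str]:
    """Flatten one extracted map value into key:field:value entity strings."""
    # Empty fields are skipped here (an assumption); nested maps are out of scope.
    return [f"{key}:{field}:{value}" for field, value in extracted.items() if value]


person = {"name": "Alice", "role": "CTO", "organization": ""}
print(flatten_map_label("person", person))
```

Each resulting string is an ordinary entity, which is why map fields link memories in the graph the same way single-value labels do.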

entities_allow_free_form

By default, entity labels are extracted alongside regular named entities (people, places, concepts). Set to false to disable free-form extraction so only label entities are stored:

{
  "entity_labels": [...],
  "entities_allow_free_form": false
}

enable_observations

Toggles observation consolidation on or off. When false, no consolidation runs for this bank — neither automatic nor manual. Defaults to true when the observations feature is enabled on the server.

enable_auto_consolidation

Controls whether consolidation runs automatically after retain, delete, and update operations. When false, consolidation only runs when explicitly triggered via the consolidate endpoint. Defaults to true.

This is useful when you want full control over consolidation timing — for example, batching many retains before consolidating, or running targeted consolidation for specific scopes only.

observations_mission

Defines what this bank should synthesise into durable observations. Replaces the built-in consolidation rules entirely — leave blank to use the server default.

e.g. Observations are stable facts about people and projects.
Always include preferences, skills, and recurring patterns.
Ignore one-off events and ephemeral state.

consolidation_llm_batch_size

Number of facts sent to the LLM in a single consolidation call. Higher values reduce LLM calls and improve throughput at the cost of larger prompts. Set to 1 to disable batching. Leave unset to use the server default (8).
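
The effect of the batch size on call count can be sketched as follows. This is a plain illustration of batching, not the server's consolidation code; batch_facts is a hypothetical helper, and the default of 8 is taken from the server default quoted above.

```python
from typing import Iterator


def batch_facts(facts: list[str], batch_size: int = 8) -> Iterator[list[str]]:
    """Yield consecutive batches; each batch would become one LLM call."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(facts), batch_size):
        yield facts[start:start + batch_size]


facts = [f"fact-{i}" for i in range(20)]
default_batches = [len(batch) for batch in batch_facts(facts)]
unbatched_calls = len(list(batch_facts(facts, batch_size=1)))
print(default_batches, unbatched_calls)
```

With 20 facts, the default batch size needs three calls, while a batch size of 1 needs twenty.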

consolidation_source_facts_max_tokens

Total token budget for source facts included with observations in the consolidation prompt. Source facts give the LLM evidence to compare new facts against existing observations. -1 = unlimited. Leave unset to use the server default (-1).

consolidation_source_facts_max_tokens_per_observation

Per-observation token cap for source facts in the consolidation prompt. Each observation independently gets at most this many tokens of source facts, preventing a single observation with many source facts from consuming the entire budget. -1 = unlimited. Leave unset to use the server default (256).
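
One way to picture how the two budgets could interact is sketched below. The allocation order (first come, first served across observations) and the function name allocate_source_fact_tokens are assumptions made for illustration; the server may distribute the total budget differently.

```python
def allocate_source_fact_tokens(
    requested: list[int],
    total_budget: int = -1,
    per_observation_cap: int = 256,
) -> list[int]:
    """Grant each observation some source-fact tokens under both caps (-1 = unlimited)."""
    remaining = None if total_budget == -1 else total_budget
    grants = []
    for tokens in requested:
        # Per-observation cap first, so one large observation cannot take everything.
        grant = tokens if per_observation_cap == -1 else min(tokens, per_observation_cap)
        if remaining is not None:
            # Then the shared total budget, consumed in order (assumption).
            grant = min(grant, remaining)
            remaining -= grant
        grants.append(grant)
    return grants


print(allocate_source_fact_tokens([1000, 100, 300]))
print(allocate_source_fact_tokens([1000, 100, 300], total_budget=400))
```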

See Observations configuration for environment variable names and defaults.

reflect_mission

A first-person narrative that provides identity and framing context for reflect. The agent uses this to ground its reasoning and apply a consistent perspective.

e.g. You are a senior engineering assistant.
Always ground answers in documented decisions and rationale.
Ignore speculation. Be direct and precise.

disposition_skepticism

How skeptical vs trusting the bank is when evaluating claims during reflect. Scale 1–5.

client.create_bank(bank_id="architect-bank")
client.update_bank_config(
    "architect-bank",
    reflect_mission="You're a senior software architect - keep track of system designs, "
                    "technology decisions, and architectural patterns. Prefer simplicity over cutting-edge.",
    disposition_skepticism=4,  # Questions new technologies
    disposition_literalism=4,  # Focuses on concrete specs
    disposition_empathy=2,     # Prioritizes technical facts
)

| Value | Behaviour |
| --- | --- |
| 1 | Trusting: accepts information at face value |
| 3 (default) | Balanced |
| 5 | Skeptical: questions and doubts claims |

disposition_literalism

How literally to interpret information during reflect. Scale 1–5.

| Value | Behaviour |
| --- | --- |
| 1 | Flexible: reads between the lines, considers context |
| 3 (default) | Balanced |
| 5 | Literal: takes things exactly as stated |

disposition_empathy

How much to weight emotional context when reasoning during reflect. Scale 1–5.

| Value | Behaviour |
| --- | --- |
| 1 | Detached: focuses on facts and logic |
| 3 (default) | Balanced |
| 5 | Empathetic: considers emotional context |

Note: Disposition traits and reflect_mission only affect the reflect operation. retain_mission and observations_mission are separate per-operation settings.
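
Since all three traits share the same 1–5 scale and a default of 3, a client-side check before calling update_bank_config might look like the sketch below. validate_dispositions is a hypothetical helper, not part of the Hindsight client, and the server may validate these values differently.

```python
DISPOSITION_TRAITS = (
    "disposition_skepticism",
    "disposition_literalism",
    "disposition_empathy",
)


def validate_dispositions(config: dict) -> dict:
    """Return all three traits, filling the documented default (3) where unset."""
    checked = {}
    for trait in DISPOSITION_TRAITS:
        value = config.get(trait, 3)
        # Reject booleans explicitly, since bool is a subclass of int in Python.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{trait} must be an integer from 1 to 5, got {value!r}")
        checked[trait] = value
    return checked


print(validate_dispositions({"disposition_skepticism": 4, "disposition_empathy": 2}))
```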

mcp_enabled_tools

An allowlist of MCP tool names that are enabled for this bank. When set, only the listed tools can be invoked; any tool not in the list returns an error (tools still appear in the MCP tools list for protocol compatibility). Set to null (or omit) to allow all tools.

["recall", "reflect"]

Available tool names: retain, recall, reflect, list_banks, create_bank, list_mental_models, get_mental_model, create_mental_model, update_mental_model, delete_mental_model, refresh_mental_model, list_directives, create_directive, delete_directive, list_memories, get_memory, list_documents, get_document, delete_document, list_operations, get_operation, cancel_operation, list_tags, get_bank, get_bank_stats, update_bank, delete_bank, clear_memories.
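
The allowlist semantics can be summarised in a few lines. This is a sketch of the rule as described (null allows everything, otherwise only listed tools run); is_tool_enabled is a hypothetical name, and treating an empty list as "no tools enabled" is an assumption.

```python
from typing import Optional


def is_tool_enabled(tool: str, allowlist: Optional[list[str]]) -> bool:
    """Return True if the tool may be invoked for this bank."""
    if allowlist is None:
        return True  # null or omitted: all tools are allowed
    return tool in allowlist  # an empty list blocks every tool (assumption)


read_only = ["recall", "reflect"]
print(is_tool_enabled("recall", read_only), is_tool_enabled("retain", read_only))
```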

llm_gemini_safety_settings

Controls content filtering thresholds for Gemini and VertexAI providers. Accepts a list of safety setting objects in the Google AI safety settings format. When null (default), Gemini's built-in safety defaults are used.

[
  {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
  {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"}
]

Only applies when HINDSIGHT_API_LLM_PROVIDER is gemini or vertexai.

recall_budget_function

Selects how the recall request's budget parameter (low / mid / high) maps to the internal thinking_budget integer used by every retrieval method (semantic, BM25, graph, temporal). Two functions are supported:

| Function | Behaviour |
| --- | --- |
| fixed (default) | thinking_budget = recall_budget_fixed_<level>; independent of max_tokens. Preserves legacy behavior. |
| adaptive | thinking_budget = round(max_tokens * recall_budget_adaptive_<level>), clamped to [recall_budget_min, recall_budget_max]. Retrieval breadth scales with the requested output size. |

{
  "recall_budget_function": "adaptive",
  "recall_budget_adaptive_low": 0.05,
  "recall_budget_adaptive_mid": 0.1,
  "recall_budget_adaptive_high": 0.3,
  "recall_budget_min": 30,
  "recall_budget_max": 1500
}

recall_budget_fixed_low / recall_budget_fixed_mid / recall_budget_fixed_high

When recall_budget_function is fixed (the default), these positive integers are used directly as the per-method retrieval limit for each budget level. Defaults: 100 / 300 / 1000 — exactly matching the legacy hardcoded mapping.

recall_budget_adaptive_low / recall_budget_adaptive_mid / recall_budget_adaptive_high

When recall_budget_function is adaptive, these positive ratios multiply the request's max_tokens to derive the per-method retrieval limit. Defaults: 0.025 / 0.075 / 0.25 — chosen to roughly match the fixed defaults at max_tokens = 4096.

recall_budget_min / recall_budget_max

Floor and ceiling applied to the result of the adaptive function (after the ratio multiplication). Both must be positive integers and min ≤ max. Defaults: 20 / 2000.

See Recall budget mapping for environment variable names and full defaults.


Updating Configuration

Bank configuration fields (retain mission, extraction mode, observations mission, disposition traits, etc.) are managed via a separate config API, not the create_bank call. This lets you change operational settings independently of the bank's identity.

Setting Configuration Overrides

client.update_bank_config(
    "my-bank",
    retain_mission="Always include technical decisions, API design choices, and architectural trade-offs. Ignore meeting logistics and social exchanges.",
    retain_extraction_mode="verbose",
    observations_mission="Observations are stable facts about people and projects. Always include preferences, skills, and recurring patterns. Ignore one-off events.",
    disposition_skepticism=4,
    disposition_literalism=4,
    disposition_empathy=2,
)

You can update any subset of fields — only the keys you provide are changed.

Reading the Current Configuration

# Returns resolved config (server defaults merged with bank overrides) and the raw overrides
data = client.get_bank_config("my-bank")
# data["config"] — full resolved configuration
# data["overrides"] — only fields overridden at the bank level

The response distinguishes:

  • config — the fully resolved configuration (server defaults merged with bank overrides)
  • overrides — only the fields explicitly overridden for this bank

Resetting to Defaults

# Remove all bank-level overrides, reverting to server defaults
client.reset_bank_config("my-bank")

This removes all bank-level overrides. The bank reverts to server-wide defaults (set via environment variables).
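
The update, read, and reset semantics described in this section can be modelled with two dictionaries. The class below is a toy in-memory model for reasoning about the behaviour, not the client or server implementation; BankConfigModel and its method names are hypothetical.

```python
class BankConfigModel:
    """Toy model: resolved config = server defaults merged with bank overrides."""

    def __init__(self, server_defaults: dict):
        self.server_defaults = dict(server_defaults)
        self.overrides: dict = {}

    def update(self, **fields) -> None:
        # Partial update: only the keys provided are changed.
        self.overrides.update(fields)

    def get(self) -> dict:
        resolved = {**self.server_defaults, **self.overrides}
        return {"config": resolved, "overrides": dict(self.overrides)}

    def reset(self) -> None:
        # Drop every bank-level override, reverting to server defaults.
        self.overrides.clear()


bank = BankConfigModel({"retain_extraction_mode": "concise", "retain_chunk_size": 3000})
bank.update(retain_extraction_mode="verbose")
after_update = bank.get()
bank.reset()
after_reset = bank.get()
print(after_update, after_reset)
```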

You can also update configuration directly from the Control Plane UI — navigate to a bank and open the Configuration tab.


Directives

Directives are hard rules that the agent must follow during reflect operations. Unlike disposition traits, which influence how the agent reasons, directives are explicit instructions that are always enforced.

Note: Directives only affect the reflect operation. They are injected into prompts and the agent is required to comply with them in all responses.

When to Use Directives

Use directives for rules that must never be violated:

  • Language/style constraints: "Always respond in formal English"
  • Privacy rules: "Never share personal data with third parties"
  • Domain constraints: "Prefer conservative investment recommendations"
  • Behavioral guardrails: "Always cite sources when making claims"

Creating Directives

# Create a directive (hard rule for reflect)
directive = client.create_directive(
    bank_id=BANK_ID,
    name="Formal Language",
    content="Always respond in formal English, avoiding slang and colloquialisms."
)

print(f"Created directive: {directive.id}")

Listing Directives

# List all directives in a bank
directives = client.list_directives(bank_id=BANK_ID)

for d in directives.items:
    print(f"- {d.name}: {d.content[:50]}...")

Updating Directives

# Update a directive (e.g., disable without deleting)
# directive_id is the id of an existing directive, e.g. directive.id from create_directive
updated = client.update_directive(
    bank_id=BANK_ID,
    directive_id=directive_id,
    is_active=False
)

print(f"Directive active: {updated.is_active}")

Deleting Directives

# Delete a directive
client.delete_directive(
    bank_id=BANK_ID,
    directive_id=directive_id
)

Directives vs Disposition

| Aspect | Directives | Disposition |
| --- | --- | --- |
| Nature | Hard rules, must be followed | Soft influence on reasoning style |
| Enforcement | Strict: responses are rejected if violated | Flexible: shapes interpretation |
| Use case | Compliance, guardrails, constraints | Personality, character, tone |
| Example | "Never recommend specific stocks" | High skepticism: questions claims |

Document export & import

Move documents — and the facts already extracted from them — between banks without re-running the LLM. Useful for testing a different embedding model, or copying data between banks/instances without paying for re-extraction. The archive carries documents, raw chunks, and extracted facts (entities by canonical name, causal links) — but no embeddings or database ids. On import, facts are re-embedded with the target bank's model and entities/links are recomputed against it, so imported documents are integrated with whatever already exists there.

Export documents

GET /v1/default/banks/{bank_id}/document-transfer — synchronous; streams a ZIP archive.

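# whole bank
curl -H "Authorization: Bearer $API_KEY" \
  "$HINDSIGHT_URL/v1/default/banks/my-bank/document-transfer" -o my-bank.zip

# specific documents (document_id is repeatable)
curl -H "Authorization: Bearer $API_KEY" \
  "$HINDSIGHT_URL/v1/default/banks/my-bank/document-transfer?document_id=doc-1&document_id=doc-2" -o subset.zip

# whole bank, including consolidated observations (cannot be combined with document_id)
curl -H "Authorization: Bearer $API_KEY" \
  "$HINDSIGHT_URL/v1/default/banks/my-bank/document-transfer?include_observations=true" -o my-bank-with-observations.zip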

| Query param | Description |
| --- | --- |
| document_id | Repeatable. Export only these documents; omit for the whole bank. |
| include_observations | Also export consolidated observations (default false). Only valid for a whole-bank export; combining it with document_id returns 400. |
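
When building export requests from code, the parameter rules above can be checked before the request is sent. The sketch below only constructs the URL using the standard library; export_url is a hypothetical helper and the base URL is a placeholder.

```python
from urllib.parse import quote, urlencode


def export_url(base_url, bank_id, document_ids=(), include_observations=False):
    """Build the document-transfer export URL, rejecting the invalid combination."""
    if include_observations and document_ids:
        # The server answers 400 for this combination, so fail early instead.
        raise ValueError("include_observations is only valid for a whole-bank export")
    # document_id is repeatable, so it is emitted once per document.
    params = [("document_id", doc_id) for doc_id in document_ids]
    if include_observations:
        params.append(("include_observations", "true"))
    url = f"{base_url.rstrip('/')}/v1/default/banks/{quote(bank_id, safe='')}/document-transfer"
    return f"{url}?{urlencode(params)}" if params else url


base = "https://hindsight.example.com"  # placeholder instance URL
print(export_url(base, "my-bank", document_ids=["doc-1", "doc-2"]))
```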

Import documents

POST /v1/default/banks/{bank_id}/document-transfer — multipart upload (file = the ZIP). Runs as a background operation (re-embedding + entity resolution can take a while), so it returns 202 with an operation_id; poll the bank's operations endpoint for status and the result counts in result_metadata.

curl -H "Authorization: Bearer $API_KEY" -F "file=@my-bank.zip" \
  "$HINDSIGHT_URL/v1/default/banks/other-bank/document-transfer?on_conflict=replace"
# -> {"operation_id": "…", "status": "pending"}

curl -H "Authorization: Bearer $API_KEY" \
  "$HINDSIGHT_URL/v1/default/banks/other-bank/operations/$OPERATION_ID"
# -> {"status":"completed","result_metadata":{"documents_imported":3,"facts_imported":42,"observations_imported":5,...}}
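
Polling for completion can be wrapped in a small loop. In the sketch below, fetch_status stands in for whatever performs the GET on the operations endpoint and returns the parsed JSON; wait_for_operation is a hypothetical helper, and the "failed" and "cancelled" statuses are assumptions (only "pending" and "completed" appear in the responses above).

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("completed", "failed", "cancelled")  # last two are assumed


def wait_for_operation(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the operation reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch_status()
        if operation.get("status") in TERMINAL_STATUSES:
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish in time")
        sleep(interval)


# Simulated server responses, so the sketch runs without a live instance.
responses = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "completed", "result_metadata": {"documents_imported": 3}},
])
result = wait_for_operation(lambda: next(responses), interval=0, sleep=lambda _: None)
print(result["status"], result["result_metadata"])
```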

on_conflict controls what happens when a document id already exists in the target bank:

| Mode | Behavior |
| --- | --- |
| skip (default) | Leave the existing document untouched. |
| replace | Delete the existing document's data and re-import. |
| new-id | Import a copy under a freshly generated id. |
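
The three modes can be illustrated against a plain dictionary standing in for the target bank. This is a toy model of the documented behaviour only; import_document is a hypothetical function, and using a UUID for the fresh id is an assumption about the id format.

```python
import uuid
from typing import Optional


def import_document(bank: dict, doc_id: str, payload: dict,
                    on_conflict: str = "skip") -> Optional[str]:
    """Import one document; return the id it was stored under, or None if skipped."""
    if on_conflict not in ("skip", "replace", "new-id"):
        raise ValueError(f"unknown on_conflict mode: {on_conflict!r}")
    if doc_id not in bank:
        bank[doc_id] = payload  # no conflict: plain import
        return doc_id
    if on_conflict == "skip":
        return None  # existing document left untouched
    if on_conflict == "replace":
        bank[doc_id] = payload  # existing data dropped, then re-imported
        return doc_id
    new_id = str(uuid.uuid4())  # "new-id": store a copy under a fresh id
    bank[new_id] = payload
    return new_id


target = {"doc-1": {"version": "old"}}
skipped = import_document(target, "doc-1", {"version": "new"})
copied = import_document(target, "doc-1", {"version": "new"}, on_conflict="new-id")
print(skipped, target["doc-1"], copied in target)
```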

Observations

Consolidated observations are excluded by default; the target bank regenerates them from the imported facts during consolidation. Pass include_observations=true to carry them over instead: they are restored without any LLM calls, and their source references are remapped to the imported facts (which are marked as consolidated so the target won't re-consolidate them).

Because an observation can be derived from facts spanning several documents, include_observations is only supported on a whole-bank export (omit document_id); combining it with a document subset returns 400.

Imported observations are inserted as-is — no merge

They are not merged or deduplicated against observations already in the target bank (consolidation merges related observations; import does not). Prefer importing observations into a fresh/empty bank, or omit include_observations and let the target consolidate the imported facts itself.

Enabling / disabling

Both endpoints are gated by server-level flags (default true). A disabled endpoint returns 404, and /version reports the state under features.document_export_api / features.document_import_api (the control plane hides the buttons accordingly).

| Variable | Gates |
| --- | --- |
| HINDSIGHT_API_ENABLE_DOCUMENT_EXPORT_API | GET …/document-transfer |
| HINDSIGHT_API_ENABLE_DOCUMENT_IMPORT_API | POST …/document-transfer |

Migrating a bank to a new instance

To move a bank to an instance configured with a different embedding model, vector extension, or text-search backend — which can't be changed in place on a populated bank — export the whole bank and import it into the new instance, where every embedding and index is re-derived from the stored text with no LLM re-extraction. This carries documents, facts, observations, bank config, mental models, directives, and webhooks (never embeddings).

Use the hindsight-admin export-bank / import-bank commands and follow the blue-green runbook in Admin CLI → Migrating a bank to a new instance.