Recall Memories

Retrieve memories using multi-strategy recall.

How Recall Works

Learn about the four retrieval strategies (semantic, keyword, graph, temporal) and RRF fusion in the Recall Architecture guide.
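
For intuition, reciprocal rank fusion (RRF) can be sketched in a few lines. This is a generic illustration of the technique, not Hindsight's implementation: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the sample memory IDs are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with reciprocal rank fusion.

    rankings: dict mapping strategy name -> list of item IDs, best first.
    Each item scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical per-strategy rankings (IDs are made up for illustration).
rankings = {
    "semantic": ["m1", "m2", "m3"],
    "keyword": ["m2", "m1"],
    "graph": ["m4", "m2"],
    "temporal": ["m3", "m2"],
}
fused = rrf_fuse(rankings)
print(fused)
```

An item that ranks reasonably well in several lists (here `m2`) can outrank an item that tops only one list, which is the main appeal of rank-based fusion: no score normalization across strategies is needed.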

Prerequisites

Make sure you've completed the Quick Start to install the client and start the server.

Basic Recall

# `client` is the client created in the Quick Start
response = client.recall(bank_id="my-bank", query="What does Alice do?")
for r in response.results:
    print(f"- {r.text}")

Recall Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | required | Natural language query |
| types | list | all | Filter: world, experience, opinion |
| budget | string | "mid" | Budget level: low, mid, high |
| max_tokens | int | 4096 | Token budget for results |
| trace | bool | false | Enable trace output for debugging |
| include_entities | bool | false | Include entity observations |
| max_entity_tokens | int | 500 | Token budget for entity observations |
| tags | list | None | Filter memories by tags (see Tag Filtering) |
| tags_match | string | "any" | How to match tags: any, all, any_strict, all_strict |

response = client.recall(
    bank_id="my-bank",
    query="What does Alice do?",
    types=["world", "experience"],
    budget="high",
    max_tokens=8000,
    trace=True,
)

# Access results
for r in response.results:
    print(f"- {r.text}")

Filter by Fact Type

Recall specific memory types:

# Only world facts (objective information)
world_facts = client.recall(
    bank_id="my-bank",
    query="Where does Alice work?",
    types=["world"],
)

# Only experience (conversations and events)
experience = client.recall(
    bank_id="my-bank",
    query="What have I recommended?",
    types=["experience"],
)

About Opinions

Opinions are beliefs formed during reflect operations. Unlike world facts and experience, opinions are subjective interpretations and may not represent objective truth. Depending on your use case:

  • Exclude opinions (types=["world", "experience"]) when you need factual, verifiable information
  • Include opinions when you want the agent's perspective or formed beliefs
  • Use opinions alone (types=["opinion"]) only when specifically asking about the agent's views
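
The three options above map directly onto the `types` argument. The sketch below wraps them in a hypothetical helper, `recall_by_stance`. The helper name and its stance labels are illustrative assumptions, not part of the Hindsight API, and `client` is expected to be the client created in the Quick Start.

```python
# Hypothetical mapping; the stance labels are illustrative only.
TYPES_BY_STANCE = {
    "facts_only": ["world", "experience"],  # exclude opinions
    # Assumes these three are the only fact types, i.e. same as the default.
    "with_opinions": ["world", "experience", "opinion"],
    "opinions_only": ["opinion"],
}


def recall_by_stance(client, bank_id, query, stance="facts_only"):
    """Call client.recall with the `types` filter for the chosen stance."""
    if stance not in TYPES_BY_STANCE:
        raise ValueError(f"unknown stance: {stance!r}")
    return client.recall(bank_id=bank_id, query=query, types=TYPES_BY_STANCE[stance])


# Example call (requires a running server and a real client):
# response = recall_by_stance(
#     client, "my-bank", "What do you think of Alice's proposal?",
#     stance="opinions_only",
# )
```

Defaulting to `facts_only` is a deliberate choice in this sketch: subjective beliefs only enter the context when the caller asks for them.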

Token Budget Management

Hindsight is built for AI agents, not humans. Traditional retrieval systems return "top-k" results, but agents don't think in terms of result counts—they think in tokens. An agent's context window is measured in tokens, and that's exactly how Hindsight measures results.

The max_tokens parameter lets you control how much of your agent's context budget to spend on memories:

# Fill up to 4K tokens of context with relevant memories
results = client.recall(bank_id="my-bank", query="What do I know about Alice?", max_tokens=4096)

# Smaller budget for quick lookups
results = client.recall(bank_id="my-bank", query="Alice's email", max_tokens=500)

This design means you never have to guess whether 10 results or 50 results will fit your context. Just specify the token budget and Hindsight returns as many relevant memories as will fit.
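
To make the contrast with top-k concrete, here is a minimal sketch of budget-based selection: walk the ranked memories and keep each one that still fits. It illustrates the idea only and is not Hindsight's implementation. `pack_to_budget` is a hypothetical name, the sample memories are made up, and the whitespace word count is a crude stand-in for a real tokenizer.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def pack_to_budget(ranked_texts, max_tokens):
    """Keep ranked texts, best first, as long as they fit in max_tokens."""
    selected, used = [], 0
    for text in ranked_texts:
        cost = count_tokens(text)
        if used + cost > max_tokens:
            continue  # does not fit; a shorter, lower-ranked item still might
        selected.append(text)
        used += cost
    return selected, used


# Made-up memories, already ordered by relevance.
ranked = [
    "Alice works at Google as a software engineer",             # 8 words
    "Alice's email is alice@example.com",                       # 4 words
    "Alice and Bob met at a conference in Berlin last spring",  # 11 words
    "Alice likes hiking",                                       # 3 words
]
selected, used = pack_to_budget(ranked, max_tokens=16)
print(len(selected), used)
```

With a 16-token budget the third memory is skipped because it would overflow, while the shorter fourth one still fits. Whether a real system skips or stops at the first overflow is a design choice this sketch does not settle.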

Beyond the core memory results, you can optionally retrieve additional context—each with its own token budget:

| Option | Parameters | Description |
| --- | --- | --- |
| Chunks | include_chunks, max_chunk_tokens | Raw text chunks that generated the memories |
| Entity Observations | include_entities, max_entity_tokens | Related observations about entities mentioned in results |

This gives your agent richer context while maintaining precise control over total token consumption.
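
As a rough illustration, the sketch below collects the optional context flags into one set of keyword arguments and adds up the worst-case token spend. It assumes that the Python client accepts the parameter names listed above as keyword arguments and that the budgets are independent and additive. `build_recall_kwargs` and the 1000-token chunk budget are hypothetical.

```python
def build_recall_kwargs(max_tokens=4096, entity_tokens=None, chunk_tokens=None):
    """Return (kwargs, worst_case_total) for a recall call.

    Pass a token count to enable an optional section, or None to leave it off.
    """
    kwargs = {"max_tokens": max_tokens}
    total = max_tokens
    if entity_tokens is not None:
        kwargs.update(include_entities=True, max_entity_tokens=entity_tokens)
        total += entity_tokens
    if chunk_tokens is not None:
        kwargs.update(include_chunks=True, max_chunk_tokens=chunk_tokens)
        total += chunk_tokens
    return kwargs, total


kwargs, total = build_recall_kwargs(max_tokens=4096, entity_tokens=500, chunk_tokens=1000)
print(total)

# With a real client (assumed keyword names, see the note above):
# response = client.recall(bank_id="my-bank", query="What does Alice do?", **kwargs)
```

Keeping the sum in one place makes it easy to check the total against the agent's remaining context window before issuing the call.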

Budget Levels

The budget parameter controls graph traversal depth:

  • "low": fast, shallow retrieval; good for simple lookups
  • "mid": balanced; the default for most queries
  • "high": deep exploration; finds indirect connections

# Quick lookup
results = client.recall(bank_id="my-bank", query="Alice's email", budget="low")

# Deep exploration
results = client.recall(bank_id="my-bank", query="How are Alice and Bob connected?", budget="high")

Filter by Tags

Tags enable visibility scoping—filter memories based on tags assigned during retain. This is essential for multi-user agents where each user should only see their own memories.

Basic Tag Filtering

# Filter recall to only memories tagged for a specific user
response = client.recall(
    bank_id="my-bank",
    query="What feedback did the user give?",
    tags=["user:alice"],
    tags_match="any",  # OR matching, includes untagged (default)
)

Tag Match Modes

The tags_match parameter controls how tags are matched:

| Mode | Behavior | Untagged Memories |
| --- | --- | --- |
| any | OR: memory has ANY of the specified tags | Included |
| all | AND: memory has ALL of the specified tags | Included |
| any_strict | OR: memory has ANY of the specified tags | Excluded |
| all_strict | AND: memory has ALL of the specified tags | Excluded |
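
The four modes can be read as a single predicate over a memory's tags. The function below is a sketch of that reading, written from the table alone. It is not Hindsight's matching code, and `tag_matches` is a hypothetical name.

```python
VALID_MODES = {"any", "all", "any_strict", "all_strict"}


def tag_matches(memory_tags, filter_tags, mode="any"):
    """Return True if a memory with memory_tags passes the tag filter."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown tags_match mode: {mode!r}")
    memory_tags, filter_tags = set(memory_tags), set(filter_tags)
    if not memory_tags:
        # Untagged memories pass only in the non-strict modes.
        return not mode.endswith("_strict")
    if mode.startswith("any"):
        return bool(memory_tags & filter_tags)  # at least one tag in common
    return filter_tags <= memory_tags  # every requested tag is present


print(tag_matches([], ["user:alice"], "any"))          # untagged, non-strict
print(tag_matches([], ["user:alice"], "any_strict"))   # untagged, strict
```

Under this reading, the strict and non-strict variants differ only in how they treat untagged memories; for tagged memories, `any` behaves like `any_strict` and `all` like `all_strict`.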

Strict modes are useful when you want to ensure only tagged memories are returned:

# Strict mode: only return memories that have matching tags (exclude untagged)
response = client.recall(
    bank_id="my-bank",
    query="What did the user say?",
    tags=["user:alice"],
    tags_match="any_strict",  # OR matching, excludes untagged memories
)

AND matching requires all specified tags to be present:

# AND matching: require ALL specified tags to be present
response = client.recall(
    bank_id="my-bank",
    query="What bugs were reported?",
    tags=["user:alice", "bug-report"],
    tags_match="all_strict",  # Memory must have BOTH tags
)

Use Cases

| Scenario | Tags | Mode | Result |
| --- | --- | --- | --- |
| User A's memories only | ["user:alice"] | any_strict | Only memories tagged user:alice |
| Support + feedback | ["support", "feedback"] | any | Memories with either tag + untagged |
| Multi-user room | ["user:alice", "room:general"] | all_strict | Only memories with both tags |
| Global + user-specific | ["user:alice"] | any | Alice's memories + shared (untagged) |