# Pipecat

Persistent long-term memory for Pipecat voice AI pipelines via Hindsight. A single `FrameProcessor` slots between your user context aggregator and LLM service, recalling relevant memories before each turn and retaining conversation content after it.
## Quick Start

```bash
# 1. Start Hindsight (self-hosted)
pip install hindsight-all
export HINDSIGHT_API_LLM_API_KEY=your-openai-key
hindsight-api

# 2. Install the integration
pip install hindsight-pipecat
```
```python
from pipecat.pipeline.pipeline import Pipeline

from hindsight_pipecat import HindsightMemoryService

memory = HindsightMemoryService(
    bank_id="user-123",
    hindsight_api_url="http://localhost:8888",
)

pipeline = Pipeline([
    transport.input(),
    stt_service,
    user_aggregator,
    memory,               # ← add between user_aggregator and LLM
    llm_service,
    assistant_aggregator,
    tts_service,
    transport.output(),
])
```
Or with Hindsight Cloud:

```python
memory = HindsightMemoryService(
    bank_id="user-123",
    hindsight_api_url="https://api.hindsight.vectorize.io",
    api_key="hsk_your_token_here",
)
```
## How It Works

```
New turn starts
└─ OpenAILLMContextFrame arrives
   ├─ Retain previous complete turn (user+assistant) — fire-and-forget
   └─ Recall relevant memories for current user query
      └─ Inject as <hindsight_memories> system message
         └─ Forward enriched context to LLM
```
On each `OpenAILLMContextFrame`:

- **Retain** — any new complete user+assistant turn pairs are sent to Hindsight asynchronously (non-blocking)
- **Recall** — the latest user message is used as the search query; results are injected as a system message before the LLM sees the context
- **Forward** — the enriched context frame is pushed downstream
Memory accumulates across calls. By the third or fourth turn, recall starts surfacing useful context that the pipeline didn't have to re-establish.
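The retain and recall steps above can be sketched in plain Python. This is an illustrative approximation, not the real processor: `inject_memories` and `complete_turn_pairs` are hypothetical helpers, though the `<hindsight_memories>` wrapper and the default `memory_prefix` mirror what this README documents.

```python
DEFAULT_PREFIX = "Relevant memories from past conversations:\n"

def inject_memories(messages, memories, memory_prefix=DEFAULT_PREFIX):
    """Prepend recalled memories as a system message so the LLM
    sees them before the current user turn."""
    if not memories:
        return list(messages)
    memory_block = memory_prefix + "\n".join(f"- {m}" for m in memories)
    system_msg = {
        "role": "system",
        "content": f"<hindsight_memories>\n{memory_block}\n</hindsight_memories>",
    }
    return [system_msg] + list(messages)

def complete_turn_pairs(messages):
    """Collect finished user+assistant pairs; these are what gets
    sent to Hindsight asynchronously for retention."""
    pairs, pending_user = [], None
    for msg in messages:
        if msg["role"] == "user":
            pending_user = msg
        elif msg["role"] == "assistant" and pending_user is not None:
            pairs.append((pending_user, msg))
            pending_user = None
    return pairs
```

Note that a trailing user message with no assistant reply yet is deliberately not retained; it only becomes a retained pair once the turn completes.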
## Configuration

```python
HindsightMemoryService(
    bank_id="user-123",        # Required: memory bank to use
    hindsight_api_url="...",   # Hindsight API URL
    api_key="hsk_...",         # API key (Hindsight Cloud)
    recall_budget="mid",       # "low", "mid", or "high"
    recall_max_tokens=4096,    # Max tokens for recall results
    enable_recall=True,        # Inject memories before the LLM
    enable_retain=True,        # Store turns after each exchange
    memory_prefix="Relevant memories from past conversations:\n",
)
```
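For example, the two `enable_*` flags can be combined to build a retain-only service that ingests conversation turns without ever injecting memories — a sketch using only the constructor arguments documented above:

```python
from hindsight_pipecat import HindsightMemoryService

# Retain-only: store completed turns in Hindsight, but skip the
# recall/inject step entirely (useful for pure ingestion pipelines).
ingest_only = HindsightMemoryService(
    bank_id="user-123",
    hindsight_api_url="http://localhost:8888",
    enable_recall=False,  # never inject memories before the LLM
    enable_retain=True,   # still send user+assistant pairs to Hindsight
)
```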
### Global Configuration

```python
from hindsight_pipecat import configure

configure(
    hindsight_api_url="http://localhost:8888",
    api_key="hsk_...",
    recall_budget="mid",
)

# Now create services without repeating connection details
memory = HindsightMemoryService(bank_id="user-123")
```
## Compatibility

Tested with Pipecat v0.0.108. The processor accepts both the new `LLMContextFrame` and the deprecated `OpenAILLMContextFrame`, so it works across recent Pipecat versions.
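One way such dual-frame handling can be structured is a single `isinstance` dispatch over both frame types. The sketch below uses placeholder dataclasses rather than Pipecat's real frame classes, purely to illustrate the shape of the check:

```python
from dataclasses import dataclass

# Illustrative stand-ins for Pipecat's context frames (not the real classes).
@dataclass
class LLMContextFrame:
    context: dict

@dataclass
class OpenAILLMContextFrame:  # deprecated upstream, still accepted
    context: dict

def extract_context(frame):
    """Return the LLM context from either frame type, or None for
    unrelated frames, which the processor passes through untouched."""
    if isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
        return frame.context
    return None
```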
## Manual Testing

The `examples/` directory includes an interactive text-based chat simulator for testing memory recall/retain without requiring Daily, Deepgram, or Cartesia API keys:

```bash
python examples/interactive_chat.py --bank demo-user
```

`examples/basic_pipeline.py` shows the full voice pipeline with Daily + Deepgram + OpenAI + Cartesia.
## Prerequisites

A running Hindsight instance:

**Self-hosted:**

```bash
pip install hindsight-all
export HINDSIGHT_API_LLM_API_KEY=your-api-key
hindsight-api   # starts on http://localhost:8888
```

**Hindsight Cloud:** Sign up — no self-hosting required.