Skip to main content

Vercel AI SDK

The @vectorize-io/hindsight-ai-sdk package integrates Hindsight memory with the Vercel AI SDK. It provides five ready-to-use tools for retaining, recalling, and reflecting on long-term memories.

Installation

npm install @vectorize-io/hindsight-ai-sdk @vectorize-io/hindsight-client ai

Setup

Create a Hindsight client and pass it to createHindsightTools along with a bankId. The bankId identifies the memory store for this session—typically a user ID.

import { HindsightClient } from '@vectorize-io/hindsight-client';
import { createHindsightTools } from '@vectorize-io/hindsight-ai-sdk';

const client = new HindsightClient({ baseUrl: 'http://localhost:8888' });

const tools = createHindsightTools({
client,
bankId: 'user-123',
});
Per-request bank IDs

In multi-user applications, create tools inside your request handler so each request closes over the correct bankId. See the Next.js example below.

Usage

With generateText

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const { text } = await generateText({
model: openai('gpt-4o'),
tools,
maxSteps: 5,
system: 'You are a helpful assistant with long-term memory.',
prompt: 'Remember that I prefer dark mode and large fonts.',
});

With streamText

import { streamText } from 'ai';

const result = streamText({
model: openai('gpt-4o'),
tools,
maxSteps: 5,
system: 'You are a helpful assistant with long-term memory.',
prompt: 'What are my display preferences?',
});

for await (const chunk of result.textStream) {
process.stdout.write(chunk);
}

With ToolLoopAgent

import { generateText, ToolLoopAgent, stepCountIs } from 'ai';
import { openai } from '@ai-sdk/openai';
import { HindsightClient } from '@vectorize-io/hindsight-client';
import { createHindsightTools } from '@vectorize-io/hindsight-ai-sdk';

const client = new HindsightClient({ baseUrl: process.env.HINDSIGHT_API_URL! });

const agent = new ToolLoopAgent({
model: openai('gpt-4o'),
tools: createHindsightTools({ client, bankId: 'user-123' }),
stopWhen: stepCountIs(10),
system: 'You are a helpful assistant with long-term memory.',
});

const result = await agent.generate({
prompt: 'Remember that my favorite editor is Neovim',
});

In a Next.js Route Handler

// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { HindsightClient } from '@vectorize-io/hindsight-client';
import { createHindsightTools } from '@vectorize-io/hindsight-ai-sdk';

const hindsightClient = new HindsightClient({
baseUrl: process.env.HINDSIGHT_API_URL!,
});

export async function POST(req: Request) {
const { messages, userId } = await req.json();

// Tools are created per-request, closing over the current user's bankId
const tools = createHindsightTools({
client: hindsightClient,
bankId: userId,
});

return streamText({
model: openai('gpt-4o'),
tools,
maxSteps: 5,
system: 'You are a helpful assistant with long-term memory.',
messages,
}).toDataStreamResponse();
}

Tools Reference

Five tools are registered. The bankId is fixed at creation time—the agent cannot change it.

ToolWhat the agent providesWhat the constructor controls
retaincontent, documentId, timestamp, contextasync, tags, metadata
recallquery, queryTimestampbudget, types, maxTokens, includeEntities, includeChunks
reflectquery, contextbudget
getMentalModelmentalModelId
getDocumentdocumentId

Why this split? Semantic inputs (what to remember, what to search for) belong to the agent. Infrastructure concerns (cost budget, tagging strategy, async mode) belong to the application.


Constructor Options

All options except client and bankId are optional. Each tool's options are grouped under the tool name.

const tools = createHindsightTools({
client,
bankId: userId,

retain: {
async: true, // fire-and-forget (default: false)
tags: ['env:prod', 'app:support'], // always attached to every retained memory
metadata: { version: '2.0' }, // always attached to every retained memory
},

recall: {
budget: 'high', // processing depth: low | mid | high (default: 'mid')
types: ['experience', 'world'], // restrict to these fact types (default: all)
maxTokens: 2048, // cap token budget (default: API default)
includeEntities: true, // include entity observations (default: false)
includeChunks: true, // include raw source chunks (default: false)
},

reflect: {
budget: 'mid', // processing depth (default: 'mid')
},

});

retain

OptionTypeDefaultDescription
asyncbooleanfalseFire-and-forget — do not wait for ingestion to complete
tagsstring[]Tags attached to every retained memory
metadataRecord<string, string>Metadata attached to every retained memory
descriptionstringbuilt-inOverride the tool description shown to the model

recall

OptionTypeDefaultDescription
budget'low' | 'mid' | 'high''mid'Controls retrieval depth and latency
types('world' | 'experience' | 'observation')[]allRestrict results to these fact types
maxTokensnumberAPI defaultCap the total tokens returned
includeEntitiesbooleanfalseInclude entity observations in results
includeChunksbooleanfalseInclude raw source chunks in results
descriptionstringbuilt-inOverride the tool description shown to the model

reflect

OptionTypeDefaultDescription
budget'low' | 'mid' | 'high''mid'Controls synthesis depth and latency
maxTokensnumberAPI defaultMaximum tokens for the response
descriptionstringbuilt-inOverride the tool description shown to the model