Vercel AI SDK - Honcho

Integrate Honcho with the Vercel AI SDK to build AI apps that remember users across sessions. The Vercel AI SDK is an open-source TypeScript toolkit for building AI-powered apps with a unified API across providers. This guide shows you how to wrap any generateText or streamText call with Honcho’s memory middleware and reasoning tools.

The full package source and examples are available on GitHub.

What We’re Building

We’ll wire Honcho into a Vercel AI SDK app so the model receives context from past conversations and can query what it knows about the user mid-generation. Here’s how the pieces fit together:

Vercel AI SDK handles model calls and streaming
Honcho stores messages and retrieves user context before each generation
Your model provider can be Anthropic, OpenAI, Google, etc.

The key benefit: you don’t manually manage conversation history across sessions. Honcho handles persistence and context injection — the model always has a rich picture of who it’s talking to. (New to Honcho’s primitives? See peers and sessions.)

Setup

Install the package:

npm install @honcho-ai/vercel-ai-sdk

Get your API key at app.honcho.dev, then set the following environment variables (for example, in a `.env` file):

HONCHO_API_KEY=your-api-key
HONCHO_WORKSPACE_ID=your-workspace-id
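Because the provider reads these values from the environment, a missing variable may only surface when the first request is made. The following is a minimal sketch of a startup check. `missingEnv` is a hypothetical helper, not part of the package, and it assumes only the two variable names shown above.

```typescript
// Hypothetical startup check; not part of @honcho-ai/vercel-ai-sdk.
const REQUIRED_VARS = ['HONCHO_API_KEY', 'HONCHO_WORKSPACE_ID'] as const;

// Returns the names of required variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_VARS,
): string[] {
  return required.filter((name) => !env[name]?.trim());
}

// Example with an explicit object; in an app you would pass process.env.
const missing = missingEnv({ HONCHO_API_KEY: 'your-api-key' });
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```

Running a check like this before calling createHoncho() turns a configuration mistake into an immediate, readable message.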

Use the Skill

The package ships a Skill that can walk an agent through wiring Honcho into your Vercel AI SDK app automatically — it greps for your generateText / streamText call sites, asks where userId / sessionId come from, and applies the integration in place.

npx skills add plastic-labs/vercel-ai-sdk

Then invoke /honcho-vercel-ai-sdk.

Alternative: manual symlink from npm package

If you’ve already installed @honcho-ai/vercel-ai-sdk via npm, you can symlink the skill directly. Example shown is for Claude Code:

mkdir -p ~/.claude/skills/honcho-vercel-ai-sdk
ln -sf "$(pwd)/node_modules/@honcho-ai/vercel-ai-sdk/skills/honcho-vercel-ai-sdk/SKILL.md" \
       ~/.claude/skills/honcho-vercel-ai-sdk/SKILL.md

Restart the session, then invoke /honcho-vercel-ai-sdk.

Create a Provider Instance

createHoncho() is the entry point. It reads your API key and workspace from environment variables and returns a provider object with middleware(), tools(), and send().

import { createHoncho } from '@honcho-ai/vercel-ai-sdk';

const honcho = createHoncho();

You can set a stable defaultAssistantId on the provider to identify the AI peer across all calls:

const honcho = createHoncho({
  defaultAssistantId: 'my-assistant',
});

Add Middleware

honcho.middleware() is compatible with wrapLanguageModel. Two things happen on each call:

Before generation — Honcho fetches the user’s representation, peer card, session summary, and recent messages and injects them into the system prompt
After generation — the user message and assistant response are stored back in Honcho with correct peer attribution

import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const honcho = createHoncho();

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: honcho.middleware({
    userId: 'user-abc',
    sessionId: 'session-123',
  }),
});

const { text } = await generateText({
  model,
  prompt: 'What should I focus on today?',
});

Pass userId and sessionId per request — no session handles to construct. Both default to lazily generated IDs if omitted, which is fine for local scripts but not for multi-user server traffic.
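For server traffic, one way to avoid the generated fallbacks is to resolve both IDs from the incoming request and fail loudly when either is absent. The sketch below is illustrative only: `ChatRequest`, its field names, and `resolveHonchoIds` are hypothetical and would need adapting to your own auth and routing layer.

```typescript
// Hypothetical request shape; adapt to your own auth and routing layer.
interface ChatRequest {
  authUserId?: string; // stable ID from your auth system
  conversationId?: string; // ID of the conversation thread
}

interface HonchoIds {
  userId: string;
  sessionId: string;
}

// Throws instead of letting the IDs fall back to generated defaults.
function resolveHonchoIds(req: ChatRequest): HonchoIds {
  const userId = req.authUserId?.trim();
  const sessionId = req.conversationId?.trim();
  if (!userId) throw new Error('Missing authenticated user ID');
  if (!sessionId) throw new Error('Missing conversation ID');
  return { userId, sessionId };
}

const ids = resolveHonchoIds({
  authUserId: 'user-abc',
  conversationId: 'session-123',
});
console.log(ids);
```

The returned object has the same two fields that honcho.middleware() takes in the example above, so it can be passed through as-is.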

Add Tools

honcho.tools() gives the model six tools it can call mid-generation to query or update what it knows about the user:

Tool	What it does
`honcho_chat`	Dialectic reasoning — ask natural-language questions about the user; answers synthesized from full interaction history
`honcho_context`	Short summary of recent context within the session
`honcho_search`	Semantic search over stored conversation messages
`honcho_search_conclusions`	Query derived conclusions: personality traits, preferences, behavioral patterns
`honcho_get_representation`	Full synthesized profile of the user
`honcho_save_conclusion`	Persist an observation about the user for future sessions

Pass the same userId and sessionId to honcho.tools() so tool calls bind to the same peers as the middleware:

import { generateText, stepCountIs } from 'ai';

const { text } = await generateText({
  model,
  tools: honcho.tools({
    userId: 'user-abc',
    sessionId: 'session-123',
  }),
  stopWhen: stepCountIs(3),
  prompt: 'Based on our conversations, what do I care about most?',
});
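If the model should read memory but not write to it, one option is to pass only a subset of the tools. The sketch below assumes that honcho.tools() returns a plain object keyed by the tool names in the table above, which is how the AI SDK represents tool sets in general. That return shape is an assumption here, and `pickTools` is a hypothetical helper.

```typescript
// Hypothetical helper: keep only the named entries of a tool set.
// Works on any plain object keyed by tool name.
function pickTools<T extends Record<string, unknown>>(
  tools: T,
  names: readonly string[],
): Record<string, T[keyof T]> {
  const allowed = new Set(names);
  return Object.fromEntries(
    Object.entries(tools).filter(([name]) => allowed.has(name)),
  ) as Record<string, T[keyof T]>;
}

// Stand-in for the object returned by honcho.tools({ userId, sessionId }).
const allTools = {
  honcho_chat: {},
  honcho_search: {},
  honcho_save_conclusion: {},
};

const readOnly = pickTools(allTools, ['honcho_chat', 'honcho_search']);
console.log(Object.keys(readOnly)); // ['honcho_chat', 'honcho_search']
```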

Complete Example

Here’s a full working example combining middleware and tools. Want a runnable end-to-end version? See the Full Script.

import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText, stepCountIs } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const honcho = createHoncho({
  defaultAssistantId: 'assistant',
});

const userId = 'user-abc';
const sessionId = 'session-123';

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: honcho.middleware({ userId, sessionId }),
});

const { text } = await generateText({
  model,
  tools: honcho.tools({ userId, sessionId }),
  stopWhen: stepCountIs(3),
  prompt: 'What should we work on today?',
});

console.log(text);

Streaming

streamText works the same way — middleware handles persistence after the stream completes:

import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, streamText, stepCountIs } from 'ai';
import { openai } from '@ai-sdk/openai';

const honcho = createHoncho();

const userId = 'user-abc';
const sessionId = 'session-456';

const model = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: honcho.middleware({ userId, sessionId }),
});

const result = streamText({
  model,
  tools: honcho.tools({ userId, sessionId }),
  // Allow follow-up steps so the model can answer after a tool call.
  stopWhen: stepCountIs(3),
  prompt: 'What should we work on today?',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
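Because persistence happens once the stream completes, it is worth making sure the stream is read to the end even when chunks are forwarded elsewhere. The helper below is a sketch that works on any `AsyncIterable<string>`, which is how the loop above consumes result.textStream. `collectStream` and `fakeStream` are hypothetical names, and the fake generator stands in for a model response.

```typescript
// Hypothetical helper: forward each chunk and return the full text once the
// stream has been read to the end.
async function collectStream(
  stream: AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    onChunk(chunk);
    full += chunk;
  }
  return full;
}

// Stand-in for result.textStream.
async function* fakeStream(): AsyncGenerator<string> {
  yield 'Hello, ';
  yield 'world';
}

collectStream(fakeStream(), (chunk) => console.log('chunk:', chunk)).then(
  (text) => console.log('full:', text),
);
```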

Using with `messages`

If your app already manages conversation history and passes a messages array directly, set injectHistory: false to prevent Honcho from prepending duplicate history:

honcho.middleware({
  userId,
  sessionId,
  injectHistory: false, // don't prepend history — we're passing messages directly
})

Honcho still injects the user’s representation and peer card into the system prompt, and still persists messages after generation. With injectHistory: false you must pass a messages array — without either messages or prompt, the Vercel AI SDK throws Invalid prompt: prompt or messages must be defined.
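For apps that keep their own history, the messages array is usually built from a per-session store. A minimal in-memory sketch follows. The `ChatMessage` type is a local simplification of the SDK's message shape (a role plus string content), and `HistoryStore` is a hypothetical class, not something the package provides.

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Hypothetical in-memory history keyed by sessionId.
class HistoryStore {
  private sessions = new Map<string, ChatMessage[]>();

  append(sessionId: string, message: ChatMessage): void {
    const history = this.sessions.get(sessionId) ?? [];
    history.push(message);
    this.sessions.set(sessionId, history);
  }

  // Returns a copy of the most recent `limit` messages, oldest first.
  recent(sessionId: string, limit = 20): ChatMessage[] {
    if (limit <= 0) return [];
    return (this.sessions.get(sessionId) ?? []).slice(-limit);
  }
}

const store = new HistoryStore();
store.append('session-123', { role: 'user', content: 'I mostly work in TypeScript.' });
store.append('session-123', { role: 'assistant', content: 'Noted.' });
store.append('session-123', { role: 'user', content: 'What should I focus on today?' });

const messages = store.recent('session-123', 2);
console.log(messages.map((m) => m.role)); // ['assistant', 'user']
```

An array like this would take the place of prompt in the generateText call, alongside injectHistory: false on the middleware.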

Verifying the Integration

1. Isolate Honcho’s Contribution

Let’s confirm the memory is actually coming from Honcho and not your app’s existing conversation history. There are two ways to check: a token comparison in code, and the dashboard UI.

Token delta (developer check). On a session with a few prior turns, run the same prompt twice: once with injectHistory: false and once without. Compare result.usage.inputTokens:

const baseline = await generateText({
  model: wrapLanguageModel({
    model: anthropic('claude-sonnet-4-6'),
    middleware: honcho.middleware({ userId, sessionId, injectHistory: false }),
  }),
  prompt: 'What do you know about my preferences?',
});

const injected = await generateText({
  model: wrapLanguageModel({
    model: anthropic('claude-sonnet-4-6'),
    middleware: honcho.middleware({ userId, sessionId }),
  }),
  prompt: 'What do you know about my preferences?',
});

console.log(injected.usage.inputTokens - baseline.usage.inputTokens);
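The subtraction above assumes both token counts are present. If a provider does not report usage, the result would be NaN. A small guard, sketched here with a local `UsageLike` type and a hypothetical `inputTokenDelta` helper, makes that case explicit. Whether inputTokens can be undefined depends on the SDK version and provider, so treat that as an assumption.

```typescript
// Minimal local shape; the SDK's usage object has more fields.
interface UsageLike {
  inputTokens?: number;
}

// Returns injected minus baseline, or undefined if either count is missing.
function inputTokenDelta(
  baseline: UsageLike,
  injected: UsageLike,
): number | undefined {
  if (baseline.inputTokens === undefined || injected.inputTokens === undefined) {
    return undefined;
  }
  return injected.inputTokens - baseline.inputTokens;
}

console.log(inputTokenDelta({ inputTokens: 120 }, { inputTokens: 480 })); // 360
console.log(inputTokenDelta({}, { inputTokens: 480 })); // undefined
```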

A positive delta is Honcho’s representation, peer card, and session summary being injected into the system prompt. Expect ~0 on a fresh peer: the deriver runs asynchronously after messages persist, so injected context only populates after a few prior turns.

Dashboard (UI check). Open app.honcho.dev/explore, select your workspace, and confirm your peer and session appear under the Peers and Sessions tables.

With Honcho’s contribution isolated, the rest of this section shows what the integration feels like in practice.

2. First turn

Send any message. The model responds normally, but nothing has been stored for this user yet, so context injection returns empty on the first turn.

3. Build memory across turns

Have a multi-turn conversation and share something about yourself:

I prefer concise answers and I mostly work in TypeScript.

After a few turns, ask:

What do you know about my preferences?

If the model references TypeScript and concise answers without being told again in this session, memory is working.

4. Cross-session recall

Start a new session (new sessionId) with the same userId. Ask:

Call your honcho_search tool with the query 'TypeScript' and quote the exact verbatim message that contained TypeScript. Do not paraphrase.

If the search returns a message from the prior session word-for-word, peer-scoped retrieval is crossing session boundaries. honcho_search queries the user’s messages across all their sessions and doesn’t depend on the deriver, so it works regardless of how short the prior session was. To confirm the tool actually fired, inspect result.steps[i].toolCalls:

const toolFires = result.steps?.flatMap((step, i) =>
  (step.toolCalls ?? []).map((tc) => ({ step: i, tool: tc.toolName, input: tc.input }))
) ?? [];
console.log(toolFires);
// [{ step: 0, tool: "honcho_search", input: { query: "TypeScript", limit: 10 } }]

When the model takes more than one step (call a tool, see the result, then answer), the top-level result.toolCalls is empty because it reflects only the final step, so check inside each step.
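For an automated check rather than a logged array, the same traversal can be wrapped in a helper and exercised against stand-in data. The `StepLike` and `ToolCallLike` types below are local simplifications that mirror only the fields used above, and `stepsCallingTool` is a hypothetical name.

```typescript
// Minimal local shapes mirroring the fields used above.
interface ToolCallLike {
  toolName: string;
  input?: unknown;
}
interface StepLike {
  toolCalls?: ToolCallLike[];
}

// Hypothetical helper: indices of the steps in which the named tool was called.
function stepsCallingTool(
  steps: StepLike[] | undefined,
  toolName: string,
): number[] {
  return (steps ?? []).flatMap((step, i) =>
    (step.toolCalls ?? []).some((tc) => tc.toolName === toolName) ? [i] : [],
  );
}

// Stand-in for result.steps from a two-step run: tool call, then answer.
const fakeSteps: StepLike[] = [
  { toolCalls: [{ toolName: 'honcho_search', input: { query: 'TypeScript' } }] },
  { toolCalls: [] },
];

console.log(stepsCallingTool(fakeSteps, 'honcho_search')); // [0]
console.log(stepsCallingTool(fakeSteps, 'honcho_chat')); // []
```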

Full Script

honcho_vercel_chat.ts

/**
 * Multi-turn chat with Honcho memory + Vercel AI SDK.
 *
 * Prerequisites:
 * 1. Install dependencies:
 *    npm install @honcho-ai/vercel-ai-sdk ai @ai-sdk/anthropic dotenv
 * 2. Set environment variables in `.env`:
 *    HONCHO_API_KEY=your-honcho-api-key
 *    HONCHO_WORKSPACE_ID=your-workspace-id
 *    ANTHROPIC_API_KEY=your-anthropic-api-key
 * 3. Run with: npx tsx honcho_vercel_chat.ts
 *
 * Pass a stable userId from your auth system and a sessionId for the conversation
 * thread; Honcho handles persistence and context injection on every turn.
 */

import 'dotenv/config';
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText, stepCountIs } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import * as readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';

const honcho = createHoncho({
  defaultAssistantId: 'assistant',
});

const userId = process.env.USER_ID ?? 'demo-user';
const sessionId = process.env.SESSION_ID ?? `session-${Date.now()}`;

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: honcho.middleware({ userId, sessionId }),
});

async function chat(prompt: string): Promise<string> {
  const { text } = await generateText({
    model,
    tools: honcho.tools({ userId, sessionId }),
    stopWhen: stepCountIs(3),
    prompt,
  });
  return text;
}

async function main() {
  const rl = readline.createInterface({ input, output });
  console.log(`Honcho session: ${sessionId} (user: ${userId})`);
  console.log('Type a message, or "exit" to quit.\n');

  while (true) {
    const userMessage = (await rl.question('you > ')).trim();
    if (!userMessage || userMessage === 'exit') break;
    const reply = await chat(userMessage);
    console.log(`bot > ${reply}\n`);
  }

  rl.close();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Next Steps

GitHub Repository

Source, tests, and full API reference for @honcho-ai/vercel-ai-sdk.

Honcho Architecture

Learn about peers, sessions, and dialectic reasoning.

Self-Hosting Guide

Run Honcho locally with your Vercel AI SDK app.

Vercel AI SDK Docs

wrapLanguageModel, middleware, and tool use reference.
