# SDK and API Compatibility Guide Source: https://honcho.dev/docs/changelog/compatibility-guide Compatibility guide for Honcho's SDKs and API This guide helps you match the right SDK version to your Honcho API version. Newer SDK patch versions are always backward-compatible within the same major version — install the latest patch for your range. ## Current Versions **TypeScript SDK latest:** v2.1.1 ```bash theme={null} npm install @honcho-ai/sdk ``` **Python SDK latest:** v2.1.1 ```bash theme={null} pip install honcho-ai ``` ## Version Compatibility Table | Honcho API Version | TypeScript SDK | Python SDK | | ------------------ | -------------- | ---------- | | v3.0.6 (Current) | v2.1.1 | v2.1.1 | | v3.0.5 | v2.1.0 | v2.1.0 | | v3.0.4 | v2.1.0 | v2.1.0 | | v3.0.3 | v2.1.0 | v2.1.0 | | v3.0.2 | v2.0.0+ | v2.0.0+ | | v3.0.1 | v2.0.0+ | v2.0.0+ | | v3.0.0 | v2.0.0+ | v2.0.0+ | | v2.5.1 | v1.6.0 | v1.6.0 | | v2.5.0 | v1.6.0 | v1.6.0 | | v2.4.3 | v1.5.0 | v1.5.0 | | v2.4.2 | v1.5.0 | v1.5.0 | | v2.4.1 | v1.5.0 | v1.5.0 | | v2.4.0 | v1.5.0 | v1.5.0 | | v2.3.3 | v1.4.1 | v1.4.1 | | v2.3.2 | v1.4.0 | v1.4.0 | | v2.3.1 | v1.4.0 | v1.4.0 | | v2.3.0 | v1.4.0 | v1.4.0 | | v2.2.0 | v1.3.0 | v1.3.0 | | v2.1.1 | v1.2.1 | v1.2.2 | | v2.1.0 | v1.2.1 | v1.2.2 | | v2.0.5 | v1.1.0 | v1.1.0 | | v2.0.4 | v1.1.0 | v1.1.0 | # Changelog Source: https://honcho.dev/docs/changelog/introduction Welcome to the Honcho changelog! This section documents all notable changes to the Honcho API and SDKs. Each release is documented with: * **Added**: New features and capabilities * **Changed**: Modifications to existing functionality * **Deprecated**: Features that will be removed in future versions * **Removed**: Features that have been removed * **Fixed**: Bug fixes and corrections * **Security**: Security-related improvements ## Version Format Honcho follows [Semantic Versioning](https://semver.org/): * **MAJOR** version for incompatible API changes * **MINOR** version for backwards-compatible functionality additions * **PATCH** version for backwards-compatible bug fixes ### Honcho API and SDK Changelogs ### Changed * Tightened transaction scopes across search, agent tools, queue manager, and webhook delivery to minimize DB connection hold time during external operations (#525) * Search operations refactored to two-phase pattern — external work (embeddings, LLM calls) completes before opening a transaction (#525) * Agent tool executor performs external operations before acquiring DB sessions (#525) * Queue manager transaction scope reduced to only the critical section (#525) * Webhook delivery no longer holds a DB session parameter (#525) ### Fixed * Session leakage in non-session-scoped dialectic chat calls (#526) ### Added * Health check endpoint (`/health`) for container orchestration and load balancer probes (#510) ### Fixed * Explicit rollback on all transactions to force connections closed ### Added * JSONB metadata validation enforces 100 key limit and max depth of 5 (#419) ### Changed * Schemas refactored from single `schemas.py` into `schemas/api.py`, `schemas/configuration.py`, and `schemas/internal.py` with backwards-compatible re-exports (#419) ### Fixed * Missing `deleted_at` filter on `RepresentationManager._query_documents_recent()` and `._query_documents_most_derived()` allowed soft-deleted documents to leak into the deriver's working representation (#456) * `CleanupStaleItemsCompletedEvent` emitted spuriously when no queue item was actually deleted (#454) * Empty JSON file uploads caused unhandled errors; now returns normalized error
responses (#434) * Memory leak: `_observation_locks` switched to `WeakValueDictionary` to prevent unbounded growth (#419) * SQL injection in `dependencies.py`: parameterized `set_config` calls to prevent injection via request context (#419) * NUL byte crashes: string inputs (message content, queries, peer cards) now stripped at schema level (#419) * Filter recursion depth capped at 5 to prevent stack overflow (#419) * Dedup-skipped observations now correctly reflected in created counts (#477) * External vector store support for message search — routes queries through configured external vector store with oversampling and deduplication to handle chunked embeddings (#479) * Dialectic agent no longer holds a DB connection during LLM calls — embeddings are pre-computed before tool execution, DB sessions isolated in `extract_preferences`, `query_documents` no longer accepts a DB session parameter (#477) ### Added * Consolidated session context into a single DB session with 40/60 token budget allocation between summary and messages * Observation validation via `ObservationInput` Pydantic schema with partial-success support and batch embedding with per-observation fallback * Peer card hard cap of 40 facts with case-insensitive deduplication and whitespace normalization * Safe integer coercion (`_safe_int`) for all LLM tool inputs to handle non-integer values like `"Infinity"` * Embedding pre-computation and reuse across multiple search calls in dialectic and representation flows * Peer existence validation in dialectic chat endpoints — raises ResourceNotFoundException instead of silently failing * Logging filter to suppress noisy `GET /metrics` access logs * Oolong long-context aggregation benchmark (synth and real variants, 1K–4M token context windows) * MolecularBench fact quality evaluation (ambiguity, decontextuality, minimality scoring) * CoverageBench information recall evaluation (gold fact extraction, coverage matching, QA verification) * LoCoMo summary-as-context baseline evaluation * Webhook delivery tests, dependency lifecycle tests, queue cleanup tests, summarizer fallback tests * Parallel test execution via pytest-xdist with worker-specific databases * `test_reasoning_levels.py` script for LoCoMo dataset testing across reasoning levels ### Changed * Workspace deletion is now async — returns 202 Accepted, validates no active sessions (409 Conflict), cascade-deletes in background * Redis caching layer now stores plain dicts instead of ORM objects, with v2-prefixed keys, resilient `safe_cache_set`/`safe_cache_delete` helpers, and deferred post-commit cache invalidation * All `get_or_create_*` CRUD operations now use savepoints (`db.begin_nested()`) instead of commit/rollback for race condition prevention * Reconciler vector sync uses direct ORM mutation instead of batch parameterized UPDATE statements * Summarizer enforces hard word limit in prompt and creates fallback text for empty summaries with `summary_tokens = 0` * Blocked Gemini responses (SAFETY, RECITATION, PROHIBITED\_CONTENT, BLOCKLIST) now raise `LLMError` to trigger retry/backup-provider logic * Gemini client explicitly sets `max_output_tokens` from `max_tokens` parameter * All deriver and metrics collector logging replaced with structured `logging.getLogger(__name__)` calls * Dreamer specialist prompts updated to enforce durable-facts-only peer cards with max 40 entries and deduplication * `GetOrCreateResult` changed from `NamedTuple` to `dataclass` with `async post_commit()` method * FastAPI upgraded from 0.111.0
to 0.131.0; added pyarrow dependency * Queue status filtering to only show user-facing tasks (representation, summary, dream); excludes internal infrastructure tasks ### Fixed * JWT timestamp bug — `JWTParams.t` was evaluated once at class definition time instead of per-instance * Session cache invalidation on deletion was missing * `get_peer_card()` now properly propagates `ResourceNotFoundException` instead of swallowing it * `set_peer_card()` ensures peer exists via `get_or_create_peers()` before updating * Backup provider failover with proper tool input type safety * Removed `setup_admin_jwt()` from server startup * Sentry coroutine detection switched from `asyncio.iscoroutinefunction` to `inspect.iscoroutinefunction` ### Removed * `explicit.py` and `obex.py` benchmarks replaced by `coverage.py` and `molecular.py` * Claude Code review automation workflow (`.github/workflows/claude.yml`) * Coverage reporting from default pytest configuration ### Added * Documentation for reasoning\_level and Claude Code plugin ### Changed * Gave dreaming sub-agents better prompting around peer card creation and tweaked overall prompts ### Fixed * Added message-search fallback for memory search tool, necessary in fresh sessions * Made FLUSH\_ENABLED a config value * Removed N+1 query in search\_messages ### Fixed * Token counting in Explicit Agent Loop * Backwards compatibility of queue items ### Added * Agentic Dreamer for intelligent memory consolidation using LLM agents * Agentic Dialectic for query answering using LLM agents with tool use * Reasoning levels configuration for dialectic (`minimal`, `low`, `medium`, `high`, `max`) * Prometheus token tracking for deriver and dialectic operations * n8n integration * Cloud Events for auditable telemetry * External Vector Store support for turbopuffer and lancedb with reconciliation flow ### Changed * API route renaming for consistency * Dreamer and dialectic now respect peer card configuration settings * Observations renamed to Conclusions across API and SDKs * Deriver to buffer representation tasks to normalize workloads * Local Representation tasks to create singular QueueItems * getContext endpoint to use `search_query` rather than forcing `last_user_message` ### Fixed * Dream scheduling bugs * Summary creation when start\_message\_id > end\_message\_id * Cashews upgrade to prevent NoScriptError * Memory leak in `accumulate_metric` call ### Removed * Peer card configuration from message configuration; peer cards no longer created/updated in deriver process ### Fixed * Backwards compatibility for `message_ids` field in documents to handle legacy tuple format ### Added * Message level configurations * CRUD operations for observations * Comprehensive test cases for harness * Peer level get\_context * Set Peer Card Method * Manual dreaming trigger endpoint ### Changed * Configurations to support more flags for fine-grained control of the deriver, peer cards, summaries, etc.
* Working Representations to support more fine-grained parameters ### Fixed * File uploads to match `MessageCreate` structure * Cache invalidation strategy ### Added * Redis caching to improve DB IO * Backup LLM provider to avoid failures when a provider is down ### Changed * QueueItems to use standardized columns * Improved Deduplication logic for Representation Tasks * More fine-grained metrics for representation, summary, and peer card tasks * DB constraint to follow standard naming conventions ### Fixed * Langfuse tracing to have readable waterfalls * Alembic Migrations to match models.py * message\_in\_seq correctly included in webhook payload ### Changed * Alembic to always use a session pooler * Statement timeout during alembic operations to 5 min ### Added * Alembic migration validation test suite ### Fixed * Alembic migrations to batch changes * Batch message creation sequence number ### Changed * Logging infrastructure to remove noisy messages * Sentry integration is centralized ### Added * Unified `Representation` class * vLLM client support * Periodic queue cleanup logic * WIP Dreaming Feature * LongMemEval to Test Bench * Prometheus Client for better Metrics * Performance metrics instrumentation * Error reporting to deriver * Workspace Delete Method * Multi-db option in test harness ### Changed * Working Representations are queried on the fly rather than cached in metadata * EmbeddingStore to RepresentationFactory * Summary Response Model to use public\_id of message for cutoff * Semantics across codebase to reference resources based on `observer` and `observed` * Prompts for Deriver & Dialectic to reference peer\_id and add examples * `Get Context` route returns peer card and representation in addition to messages and summaries * Refactored logger.info calls to logger.debug where applicable ### Fixed * Gemini client to use async methods ### Changed * Deriver Rollup Queue processes interleaved messages for more context ### Fixed * Dialectic Streaming to follow SSE conventions * Sentry tracing in the deriver ### Added * Get peer cards endpoint (`GET /v2/peers/{peer_id}/card`) for retrieving targeted peer context information ### Changed * Replaced Mirascope dependency with small client implementation for better control * Optimized deriver performance by using joins on messages table instead of storing token count in queue payload * Database scope optimization for various operations * Batch representation task processing for \~10x speed improvement in practice ### Fixed * Separated clean and claim work units in queue manager to prevent race conditions * Skip locked ActiveQueueSession rows on delete operations * Langfuse SDK integration updates for compatibility * Added configurable maximum message size to prevent token overflow in deriver * Various minor bugfixes ### Fixed * Added max message count to deriver in order to not overflow token limits ### Added * `getSummaries` endpoint to get all available summaries for a session directly * Peer Card feature to improve context for deriver and dialectic ### Changed * Session Peer limit to be based on observers instead, renamed config value to `SESSION_OBSERVERS_LIMIT` * `Messages` can take a custom timestamp for the `created_at` field, defaulting to the current time * `get_context` endpoint returns detailed `Summary` object rather than just summary content * Working representations use a FIFO queue structure to maintain facts rather than a full rewrite * Optimized deriver enqueue by prefetching message sequence numbers (eliminates N+1
queries) ### Fixed * Deriver uses `get_context` internally to prevent context window limit errors * Embedding store will truncate context when querying documents to prevent embedding token limit errors * Queue manager to schedule work based on available work units rather than total number of workers * Queue manager to use atomic db transactions rather than long lived transaction for the worker lifecycle * Timestamp formats unified to ISO 8601 across the codebase * Internal get\_context method's cutoff value is exclusive now ### Added * Arbitrary filters now available on all search endpoints * Search combines full-text and semantic using reciprocal rank fusion * Webhook support (currently only supports queue\_empty and test events, more to come) * Small test harness and custom test format for evaluating Honcho output quality * Added MCP server and documentation for it ### Changed * Search has 10 results by default, max 100 results * Queue structure generalized to handle more event types * Summarizer now exhaustive by default and tuned for performance ### Fixed * Resolve race condition for peers that leave a session while sending messages * Added explicit rollback to solve integrity error in queue * Re-introduced Sentry tracing to deriver * Better integrity logic in get\_or\_create API methods ### Fixed * Summarizer module to ignore empty summaries and pass appropriate one to get\_context * Structured Outputs calls with OpenAI provider to pass strict=True to Pydantic Schema ### Added * Test harness for custom Honcho evaluations * Better support for session- and peer-aware dialectic queries * Langfuse settings * Added recent history to dialectic prompt, dynamically sized based on new context window size setting ### Fixed * Summary queue logic * Formatting of logs * Filtering by session * Peer targeting in queries ### Changed * Made query expansion in dialectic off by default * Overhauled logging * Refactor summarization for performance and code clarity * Refactor queue payloads for clarity ### Added * File uploads * Brand new "ROTE" deriver system * Updated dialectic system * Local working representations * Better logging for deriver/dialectic * Deriver Queue Status no longer has redundant data ### Fixed * Document insertion * Session-scoped and peer-targeted dialectic queries work now * Minor bugs ### Removed * Peer-level messages ### Changed * Dialectic chat endpoint takes a single query * Rearranged configuration values (LLM, Deriver, Dialectic, History->Summary) ### Fixed * Groq API client to use the Async library ### Fixed * Migration/provision scripts did not have correct database connection arguments, causing timeouts ### Fixed * Bug that caused a runtime error when Sentry flags are enabled ### Fixed * Database initialization was misconfigured and led to provision\_db script failing: switch to consistent working configuration with transaction pooler ### Added * Ergonomic SDKs for Python and TypeScript (uses Stainless underneath) * Deriver Queue Status endpoint * Complex arbitrary filters on workspace/session/peer/message * Message embedding table for full semantic search ### Changed * Overhauled documentation * BasedPyright typing for entire project * Resource filtering expanded to include logical operators ### Fixed * Various bugs * Use new config arrangement everywhere * Remove hardcoded responses ### Added * Ability to get a peer's working representation * Metadata to all data primitives (Workspaces, Peers, Sessions, Messages) * Internal metadata to store Honcho's state no longer exposed in API *
Batch message operations and enhanced message querying with token and message count limits * Search and summary functionalities scoped by workspace, peer, and session * Session context retrieval with summaries and token allocation * HNSW Index for Documents Table * Centralized Configuration via Environment Variables or config.toml file ### Changed * New architecture centered around the concept of a "peer" replaces the former "app"/"user"/"session" paradigm * Workspaces replace "apps" as top-level namespace * Peers replace "users" * Sessions no longer nested beneath peers and no longer limited to a single user-assistant model. A session exists independently of any one peer and peers can be added to and removed from sessions. * Dialectic API is now part of the Peer, not the Session * Dialectic API now allows queries to be scoped to a session or "targeted" to a fellow peer * Database schema migrated to adopt workspace/peer/session naming and structure * Authentication and JWT scopes updated to workspace/peer/session hierarchy * Queue processing now works on 'work units' instead of sessions * Message token counting updated with tiktoken integration and fallback heuristic * Queue and message processing updated to handle sender/target and task types for multi-peer scenarios ### Fixed * Improved error handling and validation for batch message operations and metadata * Database Sessions to be more atomic to reduce idle-in-transaction time ### Removed * Metamessages removed in favor of metadata * Collections and Documents no longer exposed in the API, solely internal * Obsolete tests for apps, users, collections, documents, and metamessages *** ### Added * Normalize resources to remove joins and increase query performance * Query tracing for debugging ### Changed * `/list` endpoints to not require a request body * `metamessage_type` to `label` with backwards compatibility * Database Provisioning to rely on alembic * Database Session Manager to explicitly rollback transactions before closing the connection ### Fixed * Alembic Migrations to include initial database migrations * Sentry Middleware to not report Honcho Exceptions ### Added * JWT based API authentication * Configurable logging * Consolidated LLM Inference via `ModelClient` class * Dynamic logging configurable via environment variables ### Changed * Deriver & Dialectic API to use Hybrid Memory Architecture * Metamessages are not strictly tied to a message * Database provisioning is a separate script instead of happening on startup * Consolidated `session/chat` and `session/chat/stream` endpoints ## Previous Releases For a complete history of all releases, see our [GitHub Releases](https://github.com/plastic-labs/honcho/tags) page.
[Python SDK](https://pypi.org/project/honcho-ai/) ### Fixed * Broadened HTTP retry logic to cover `httpx.NetworkError` and `httpx.RemoteProtocolError` in addition to `httpx.TimeoutException` and `httpx.ConnectError`, improving resilience against transient network failures ### Added * `created_at` property on `Peer` and `Session` objects * `is_active` property on `Session` objects * `get_message(message_id)` method on `Session` (sync and async) to fetch a single message by ID * `page`, `size`, and `reverse` pagination parameters on all list methods ### Changed * **Breaking**: `peer()` and `session()` now always make a get-or-create API call — no more lazy initialization * Response configuration models now tolerate unknown fields from newer servers for forward compatibility ### Fixed * Sync and async `Session.get_metadata()`, `get_configuration()`, and `refresh()` now refresh cached `created_at` and `is_active` values along with metadata and configuration * `honcho.__version__` now derives from package metadata, with a source-checkout fallback, so it stays aligned with released package versions ### Changed * All input models now reject unknown fields via strict Pydantic validation (`extra="forbid"`). Previously, misspelled or extraneous fields were silently ignored. Now a `ValidationError` is raised with the unrecognized field name. ### Added * `set_peer_card` method ### Changed * `card` is now `get_card` with `card` kept for backwards compatibility and marked as deprecated ### Added * `ConclusionScope` object for CRUD operations on conclusions (renamed from observations) * Representation configuration support ### Changed * Observations renamed to Conclusions across the SDK * Major SDK refactoring and cleanup * Simplified method signatures throughout * Representation endpoints now return `string` instead of old Representation object ### Removed * Standalone types module (now uses honcho-core types) * Representation object ### Added * Metadata and configuration fields to Workspace, Peer, Session, and Message objects * Session Clone methods * Peer level get\_context method * `ObservationScope` object to perform CRUD operations on observations * Representation object for WorkingRepresentations ### Changed * Methods that take IDs can all optionally take an object of the same type ### Added * Delete workspace method ### Changed * message\_id of `Summary` model is a string nanoid * Get Context can return Peer Card & Peer Representation ### Added * Get Peer Card method * Update Message metadata method * Session level deriver status methods * Delete session message ### Fixed * Dialectic Stream returns Iterators * Type warnings ### Changed * Pagination class to match core implementation ### Added * getSummaries API returning structured summaries * Webhook support ### Changed * Messages can take an optional `created_at` value, defaulting to the current time (UTC ISO 8601) ### Added * Filter parameter to various endpoints ### Fixed * Honcho util import paths ### Added * Get/poll deriver queue status endpoints added to workspace * Added endpoint to upload files as messages ### Removed * Removed peer messages in accordance with Honcho 2.1.0 ### Changed * Updated chat endpoint to use singular `query` in accordance with Honcho 2.1.0 ### Fixed * Properly handle AsyncClient [TypeScript SDK](https://www.npmjs.com/package/@honcho-ai/sdk) ### Fixed * Broadened fetch error retry logic to catch all `TypeError` network failures (connection resets, DNS errors, etc.)
instead of only those with `'fetch'` in the message, improving resilience across runtimes (Node, Bun, browsers) ### Added * `createdAt` property on `Peer` and `Session` wrapper objects * `isActive` property on `Session` wrapper objects * `getMessage(messageId)` method on `Session` to fetch a single message by ID * `Peer.representation()`, `Session.representation()`, and `Session.context()` now accept `Message` objects for `searchQuery` * `page`, `size`, and `reverse` pagination controls on all list methods ### Changed * **Breaking**: `searchQuery` removed from top-level `context()` options — use `representationOptions.searchQuery` instead: ```typescript theme={null} // Before (v2.0.x) await session.context({ searchQuery: "..." }); // After (v2.1.0) await session.context({ representationOptions: { searchQuery: "..." } }); ``` * List methods (`peers()`, `sessions()`, `messages()`, `workspaces()`) support both the new options object and the legacy raw-filter form * Representation search options now accept strings and content-like objects, including `Message` instances, while rejecting whitespace-only or invalid runtime inputs * **Breaking**: `peer()` and `session()` now always make a get-or-create API call — no more lazy initialization. If you relied on constructing SDK objects without triggering a network request, note that every `peer()` and `session()` call now hits the API: ```typescript theme={null} // Before (v2.0.x) — no API call const session = honcho.session("my-session"); // After (v2.1.0) — makes a get-or-create API call const session = await honcho.session("my-session"); ``` * Response configuration models now tolerate unknown fields from newer servers for forward compatibility * Moved `@types/node` from `dependencies` to `devDependencies` ### Fixed * `uploadFile()` now rejects unsupported top-level binary/object inputs and only validates inputs the serializer can actually upload * `uploadFile()` now serializes message configuration using API field names, matching `addMessages()` * Session fetch methods now refresh cached `createdAt` and `isActive` values alongside metadata and configuration ### Changed * Client constructor now rejects unknown options via `.strict()` Zod validation. Previously, misspelled options (e.g., `baseUrl` instead of `baseURL`) were silently ignored, causing the SDK to fall back to defaults. Now a `ZodError` is thrown with the unrecognized key name. * All input schemas now use `.strict()` validation to reject unknown fields. * `FileUploadSchema.configuration` now uses `MessageConfigurationSchema` instead of open record type. ### Fixed * README example used `baseUrl` instead of `baseURL`. 
### Added * `setPeerCard` method ### Changed * `card` is now `getCard` with `card` kept for backwards compatibility and marked as deprecated ### Added * `ConclusionScope` object for CRUD operations on conclusions (renamed from observations) * Representation configuration support ### Changed * Observations renamed to Conclusions across the SDK * Major SDK refactoring and cleanup * Simplified method signatures throughout * Representation endpoints now return `string` instead of old Representation object ### Fixed * Pagination `this` binding issue ### Removed * Representation object * Stainless "core" SDK -- this SDK is now standalone ### Added * Metadata and configuration fields to Workspace, Peer, Session, and Message objects * Session Clone methods * Peer level get\_context method * `ObservationScope` object to perform CRUD operations on observations * Representation object for WorkingRepresentations ### Changed * Methods that take IDs can all optionally take an object of the same type ### Added * Delete workspace method ### Changed * message\_id of `Summary` model is a string nanoid * Get Context can return Peer Card & Peer Representation ### Added * Get Peer Card method * Update Message metadata method * Session level deriver status methods * Delete session message ### Fixed * Dialectic Stream returns Iterators * Type warnings ### Changed * Pagination class to match core implementation ### Added * getSummaries API returning structured summaries * Webhook support ### Changed * Messages can take an optional `created_at` value, defaulting to the current time (UTC ISO 8601) ### Added * Linting via Biome * Filter parameter added to various endpoints ### Fixed * Order of parameters in `getSessions` endpoint ### Added * Get/poll deriver queue status endpoints added to workspace * Added endpoint to upload files as messages ### Removed * Removed peer messages in accordance with Honcho 2.1.0 ### Changed * Updated chat endpoint to use singular `query` in accordance with Honcho 2.1.0 ### Fixed * Create default workspace on Honcho client instantiation * Simplified Honcho client import path ## Getting Help If you encounter issues using the Honcho API or its SDKs: 1. Open an issue on [GitHub](https://github.com/plastic-labs/honcho/issues) 2. Join our [Discord community](http://discord.gg/honcho) for support # Create Conclusions Source: https://honcho.dev/docs/v3/api-reference/endpoint/conclusions/create-conclusions post /v3/workspaces/{workspace_id}/conclusions Create one or more Conclusions. Conclusions are logical certainties derived from interactions between Peers. They form the basis of a Peer's Representation. # Delete Conclusion Source: https://honcho.dev/docs/v3/api-reference/endpoint/conclusions/delete-conclusion delete /v3/workspaces/{workspace_id}/conclusions/{conclusion_id} Delete a single Conclusion by ID. This action cannot be undone. # List Conclusions Source: https://honcho.dev/docs/v3/api-reference/endpoint/conclusions/list-conclusions post /v3/workspaces/{workspace_id}/conclusions/list List Conclusions using optional filters, ordered by recency unless `reverse` is true. Results are paginated. # Query Conclusions Source: https://honcho.dev/docs/v3/api-reference/endpoint/conclusions/query-conclusions post /v3/workspaces/{workspace_id}/conclusions/query Query Conclusions using semantic search. Use `top_k` to control the number of results returned.
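As a quick illustration, a semantic query against this endpoint might look like the following curl sketch. Only the path and `top_k` are documented above; the `query` field name, the base URL variable, and bearer-token auth are assumptions to verify against the autogenerated endpoint page:

```bash theme={null}
# Hypothetical sketch: field names and base URL are assumptions; top_k is documented.
curl -X POST "$HONCHO_BASE_URL/v3/workspaces/my-workspace/conclusions/query" \
  -H "Authorization: Bearer $HONCHO_JWT" \
  -H "Content-Type: application/json" \
  -d '{"query": "what does this peer enjoy on weekends?", "top_k": 5}'
```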
# Create Key Source: https://honcho.dev/docs/v3/api-reference/endpoint/keys/create-key post /v3/keys Create a new Key # Create Messages For Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/messages/create-messages-for-session post /v3/workspaces/{workspace_id}/sessions/{session_id}/messages Add new message(s) to a session. # Create Messages With File Source: https://honcho.dev/docs/v3/api-reference/endpoint/messages/create-messages-with-file post /v3/workspaces/{workspace_id}/sessions/{session_id}/messages/upload Create messages from uploaded files. Files are converted to text and split into multiple messages. # Get Message Source: https://honcho.dev/docs/v3/api-reference/endpoint/messages/get-message get /v3/workspaces/{workspace_id}/sessions/{session_id}/messages/{message_id} Get a single message by ID from a Session. # Get Messages Source: https://honcho.dev/docs/v3/api-reference/endpoint/messages/get-messages post /v3/workspaces/{workspace_id}/sessions/{session_id}/messages/list Get all messages for a Session with optional filters. Results are paginated. # Update Message Source: https://honcho.dev/docs/v3/api-reference/endpoint/messages/update-message put /v3/workspaces/{workspace_id}/sessions/{session_id}/messages/{message_id} Update the metadata of a message. This will overwrite any existing metadata for the message. # Query a Peer's representation using natural language Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/chat post /v3/workspaces/{workspace_id}/peers/{peer_id}/chat Query a Peer's representation using natural language. Performs agentic search and reasoning to comprehensively answer the query based on all latent knowledge gathered about the peer from their messages and conclusions. # Get Or Create Peer Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/get-or-create-peer post /v3/workspaces/{workspace_id}/peers Get a Peer by ID or create a new Peer with the given ID. If peer_id is provided as a query parameter, it uses that (must match JWT workspace_id). Otherwise, it uses the peer_id from the JWT. # Get Peer Card Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/get-peer-card get /v3/workspaces/{workspace_id}/peers/{peer_id}/card Get a peer card for a specific peer relationship. Returns the peer card that the observer peer has for the target peer if it exists. If no target is specified, returns the observer's own peer card. # Get Peer Context Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/get-peer-context get /v3/workspaces/{workspace_id}/peers/{peer_id}/context Get context for a peer, including their representation and peer card. This endpoint returns a curated subset of the representation and peer card for a peer. If a target is specified, returns the context for the target from the observer peer's perspective. If no target is specified, returns the peer's own context (self-observation). This is useful for getting all the context needed about a peer without making multiple API calls. # Get Peers Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/get-peers post /v3/workspaces/{workspace_id}/peers/list Get all Peers for a Workspace, paginated with optional filters. # Get Representation Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/get-representation post /v3/workspaces/{workspace_id}/peers/{peer_id}/representation Get a curated subset of a Peer's Representation. A Representation is always a subset of the total knowledge about the Peer. 
The subset can be scoped and filtered in various ways. If a session_id is provided in the body, we get the Representation of the Peer scoped to that Session. If a target is provided, we get the Representation of the target from the perspective of the Peer. If no target is provided, we get the omniscient Honcho Representation of the Peer. # Get Sessions For Peer Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/get-sessions-for-peer post /v3/workspaces/{workspace_id}/peers/{peer_id}/sessions Get all Sessions for a Peer, paginated with optional filters. # Search Peer Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/search-peer post /v3/workspaces/{workspace_id}/peers/{peer_id}/search Search a Peer's messages, optionally filtered by various criteria. # Set Peer Card Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/set-peer-card put /v3/workspaces/{workspace_id}/peers/{peer_id}/card Set a peer card for a specific peer relationship. Sets the peer card that the observer peer has for the target peer. If no target is specified, sets the observer's own peer card. # Update Peer Source: https://honcho.dev/docs/v3/api-reference/endpoint/peers/update-peer put /v3/workspaces/{workspace_id}/peers/{peer_id} Update a Peer's metadata and/or configuration. # Add Peers To Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/add-peers-to-session post /v3/workspaces/{workspace_id}/sessions/{session_id}/peers Add Peers to a Session. If a Peer does not yet exist, it will be created automatically. # Clone Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/clone-session post /v3/workspaces/{workspace_id}/sessions/{session_id}/clone Clone a Session, optionally up to a specific message ID. # Delete Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/delete-session delete /v3/workspaces/{workspace_id}/sessions/{session_id} Delete a Session and all associated messages. The Session is marked as inactive immediately and returns 202 Accepted. The actual deletion of all related data happens asynchronously via the queue with retry support. This action cannot be undone. # Get Or Create Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/get-or-create-session post /v3/workspaces/{workspace_id}/sessions Get a Session by ID or create a new Session with the given ID. If Session ID is provided as a parameter, it verifies the Session is in the Workspace. Otherwise, it uses the session_id from the JWT for verification. # Get Peer Config Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/get-peer-config get /v3/workspaces/{workspace_id}/sessions/{session_id}/peers/{peer_id}/config Get the configuration for a Peer in a Session. # Get Session Context Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/get-session-context get /v3/workspaces/{workspace_id}/sessions/{session_id}/context Produce a context object from the Session. The caller provides an optional token limit which the entire context must fit into. If not provided, the context will be exhaustive (within configured max tokens). To do this, we allocate 40% of the token limit to the summary, and 60% to recent messages -- as many as can fit. Note that the summary will usually take up less space than this. If the caller does not want a summary, we allocate all the tokens to recent messages. 
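For a concrete sense of the allocation: with a 10,000-token limit, up to 4,000 tokens are reserved for the summary and 6,000 for recent messages, and the summary usually uses less than its share. A minimal sketch, assuming a `tokens` query parameter; verify the parameter name against the endpoint schema:

```bash theme={null}
# Hypothetical sketch: the `tokens` parameter name is an assumption.
# With tokens=10000: summary budget = 4000 (40%), message budget = 6000 (60%).
curl "$HONCHO_BASE_URL/v3/workspaces/my-workspace/sessions/my-session/context?tokens=10000" \
  -H "Authorization: Bearer $HONCHO_JWT"
```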
# Get Session Peers Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/get-session-peers get /v3/workspaces/{workspace_id}/sessions/{session_id}/peers Get all Peers in a Session. Results are paginated. # Get Session Summaries Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/get-session-summaries get /v3/workspaces/{workspace_id}/sessions/{session_id}/summaries Get available summaries for a Session. Returns both short and long summaries if available, including metadata like the message ID they cover up to, creation timestamp, and token count. # Get Sessions Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/get-sessions post /v3/workspaces/{workspace_id}/sessions/list Get all Sessions for a Workspace, paginated with optional filters. # Remove Peers From Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/remove-peers-from-session delete /v3/workspaces/{workspace_id}/sessions/{session_id}/peers Remove Peers by ID from a Session. # Search Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/search-session post /v3/workspaces/{workspace_id}/sessions/{session_id}/search Search a Session with optional filters. Use `limit` to control the number of results returned. # Set Peer Config Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/set-peer-config put /v3/workspaces/{workspace_id}/sessions/{session_id}/peers/{peer_id}/config Set the configuration for a Peer in a Session. # Set Session Peers Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/set-session-peers put /v3/workspaces/{workspace_id}/sessions/{session_id}/peers Set the Peers in a Session. If a Peer does not yet exist, it will be created automatically. This will fully replace the current set of Peers in the Session. # Update Session Source: https://honcho.dev/docs/v3/api-reference/endpoint/sessions/update-session put /v3/workspaces/{workspace_id}/sessions/{session_id} Update a Session's metadata and/or configuration. # Delete Webhook Endpoint Source: https://honcho.dev/docs/v3/api-reference/endpoint/webhooks/delete-webhook-endpoint delete /v3/workspaces/{workspace_id}/webhooks/{endpoint_id} Delete a specific webhook endpoint. # Get Or Create Webhook Endpoint Source: https://honcho.dev/docs/v3/api-reference/endpoint/webhooks/get-or-create-webhook-endpoint post /v3/workspaces/{workspace_id}/webhooks Get or create a webhook endpoint URL. # List Webhook Endpoints Source: https://honcho.dev/docs/v3/api-reference/endpoint/webhooks/list-webhook-endpoints get /v3/workspaces/{workspace_id}/webhooks List all webhook endpoints, optionally filtered by workspace. # Test Emit Source: https://honcho.dev/docs/v3/api-reference/endpoint/webhooks/test-emit get /v3/workspaces/{workspace_id}/webhooks/test Test publishing a webhook event. # Delete Workspace Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/delete-workspace delete /v3/workspaces/{workspace_id} Delete a Workspace. This accepts the deletion request and processes it in the background, permanently deleting all peers, messages, conclusions, and other resources associated with the workspace. Returns 409 Conflict if the workspace contains active sessions. Delete all sessions first, then delete the workspace. This action cannot be undone. # Get All Workspaces Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/get-all-workspaces post /v3/workspaces/list Get all Workspaces, paginated with optional filters. 
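List endpoints such as this one are POST requests that take optional filters plus pagination controls. A hedged sketch; the body shape (`filters`, `page`, `size`) is assumed from the SDK's pagination parameters and is not confirmed for the raw API:

```bash theme={null}
# Hypothetical sketch: body shape assumed; verify against the endpoint schema.
curl -X POST "$HONCHO_BASE_URL/v3/workspaces/list" \
  -H "Authorization: Bearer $HONCHO_JWT" \
  -H "Content-Type: application/json" \
  -d '{"filters": {"metadata": {"env": "prod"}}, "page": 1, "size": 50}'
```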
# Get Or Create Workspace Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/get-or-create-workspace post /v3/workspaces Get a Workspace by ID or create a new Workspace with the given ID. If workspace_id is provided as a query parameter, it uses that (must match JWT workspace_id). Otherwise, it uses the workspace_id from the JWT. # Get Queue Status Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/get-queue-status get /v3/workspaces/{workspace_id}/queue/status Get the processing queue status for a Workspace, optionally scoped to an observer, sender, and/or session. Only tracks user-facing task types (representation, summary, dream). Internal infrastructure tasks (reconciler, webhook, deletion) are excluded. Note: completed counts reflect items since the last periodic queue cleanup, not lifetime totals. # Schedule Dream Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/schedule-dream post /v3/workspaces/{workspace_id}/schedule_dream Manually schedule a dream task for a specific collection. This endpoint bypasses all automatic dream conditions (document threshold, minimum hours between dreams) and schedules the dream task for a future execution. Currently this endpoint only supports scheduling immediate dreams. In the future, users may pass a cron-style expression to schedule dreams at specific times. # Search Workspace Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/search-workspace post /v3/workspaces/{workspace_id}/search Search messages in a Workspace using optional filters. Use `limit` to control the number of results returned. # Update Workspace Source: https://honcho.dev/docs/v3/api-reference/endpoint/workspaces/update-workspace put /v3/workspaces/{workspace_id} Update Workspace metadata and/or configuration. # Introduction Source: https://honcho.dev/docs/v3/api-reference/introduction This section documents all available API endpoints in the Honcho Server. Each endpoint provides CRUD operations for our core primitives. For information about these primitives, see [Architecture](/docs/v3/documentation/core-concepts/architecture). We strongly recommend using our official SDKs instead of calling these APIs directly. The SDKs provide better error handling, type safety, and developer experience. ## Recommended approach Use our official SDKs for the best development experience: * [Python SDK](https://pypi.org/project/honcho-ai/) * [TypeScript SDK](https://www.npmjs.com/package/@honcho-ai/sdk) ## When to use this API reference This reference is primarily useful for: * Debugging SDK behavior * Building integrations in unsupported languages * Understanding the underlying data structures The endpoint pages are autogenerated and include interactive examples for testing. # Configuration Guide Source: https://honcho.dev/docs/v3/contributing/configuration Complete reference for configuring Honcho providers, features, and infrastructure Most users only need the setup from the [Self-Hosting Guide](./self-hosting#llm-setup). This page is the full reference for customizing providers, tuning features, and hardening your deployment. Honcho loads configuration in this priority order (highest wins): 1. **Environment variables** (always take precedence) 2. **`.env` file** 3. **`config.toml` file** 4. **Built-in defaults** Use `.env` for secrets and overrides, `config.toml` for base settings. Or use environment variables exclusively — whatever fits your deployment.
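For example, since environment variables always win, a pool size set in both places resolves to the environment value. A minimal sketch using the `DB_POOL_SIZE` to `[db].POOL_SIZE` mapping described below:

```bash theme={null}
# config.toml sets the base value:
#   [db]
#   POOL_SIZE = 10
# The environment variable overrides it at startup:
export DB_POOL_SIZE=25   # effective pool size: 25
```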
Copy the examples to get started: ```bash theme={null} cp .env.template .env cp config.toml.example config.toml ``` ### Environment Variable Naming All config values map to environment variables: * `{SECTION}_{KEY}` for top-level section settings (e.g., `DB_CONNECTION_URI` → `[db].CONNECTION_URI`) * `{KEY}` for app-level settings (e.g., `LOG_LEVEL` → `[app].LOG_LEVEL`) * Use `__` inside `{KEY}` for nested settings (e.g., `DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT`, `DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL`) ## LLM Configuration The [Self-Hosting Guide](./self-hosting#llm-setup) covers the basic setup: either the built-in OpenAI defaults or one OpenAI-compatible endpoint/model for all features. This section covers recommended model tiers, using multiple providers, and per-feature tuning. All Honcho agents (deriver, dialectic, dream) require tool calling. Your models must support the OpenAI tool calling format. ### Choosing Models Model choice matters more for tool-use reliability than raw intelligence: | Tier | Example models | Use case | Notes | | ---------- | ------------------------------- | --------------------------------------- | ----------------------------------------- | | **Light** | Gemini 2.5 Flash, GLM-4.7-Flash | Deriver, summary, dialectic minimal/low | High throughput, cheap, reliable tool use | | **Medium** | Claude Haiku 4.5, Grok 4.1 Fast | Dialectic medium/high | Good reasoning + tool use balance | | **Heavy** | Claude Sonnet 4, GLM-5 | Dream, dialectic max | Best quality for rare/complex tasks | You can mix providers freely — for example, use Gemini for the deriver and Claude for dreaming. ### Provider Types | Transport value | What it connects to | API key env var | | --------------- | ------------------------------------------------------------------------------------------------- | ----------------------- | | `openai` | OpenAI or any OpenAI-compatible endpoint (OpenRouter, Together, Fireworks, LiteLLM, vLLM, Ollama) | `LLM_OPENAI_API_KEY` | | `anthropic` | Anthropic Claude (direct) | `LLM_ANTHROPIC_API_KEY` | | `gemini` | Google Gemini (direct) | `LLM_GEMINI_API_KEY` | For OpenAI-compatible proxies (OpenRouter, vLLM, Ollama, etc.), use `transport = "openai"` and set `MODEL_CONFIG__OVERRIDES__BASE_URL` on each feature to point at your endpoint. ### Tiered Model Setup Once you're past initial setup, you can assign different models per feature for better cost/quality tradeoffs. This example uses OpenRouter with light/medium/heavy tiers: ```bash theme={null} LLM_OPENAI_API_KEY=sk-or-v1-... 
# All features route through OpenRouter via overrides.base_url # (You can set this on each feature's MODEL_CONFIG) # Light tier — high throughput, cheap DERIVER_MODEL_CONFIG__TRANSPORT=openai DERIVER_MODEL_CONFIG__MODEL=google/gemini-2.5-flash-lite DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1 SUMMARY_MODEL_CONFIG__TRANSPORT=openai SUMMARY_MODEL_CONFIG__MODEL=google/gemini-2.5-flash DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT=openai DIALECTIC_LEVELS__minimal__MODEL_CONFIG__MODEL=google/gemini-2.5-flash-lite DIALECTIC_LEVELS__low__MODEL_CONFIG__TRANSPORT=openai DIALECTIC_LEVELS__low__MODEL_CONFIG__MODEL=google/gemini-2.5-flash-lite # Medium tier — better reasoning DIALECTIC_LEVELS__medium__MODEL_CONFIG__TRANSPORT=openai DIALECTIC_LEVELS__medium__MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5 DIALECTIC_LEVELS__high__MODEL_CONFIG__TRANSPORT=openai DIALECTIC_LEVELS__high__MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5 DIALECTIC_LEVELS__max__MODEL_CONFIG__TRANSPORT=openai DIALECTIC_LEVELS__max__MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5 # Heavy tier — best quality for complex tasks DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT=openai DREAM_DEDUCTION_MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5 DREAM_INDUCTION_MODEL_CONFIG__TRANSPORT=openai DREAM_INDUCTION_MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5 ``` ### Direct Vendor Keys Instead of an OpenAI-compatible proxy, you can use vendor APIs directly. Each transport picks up its own `LLM_{TRANSPORT}_API_KEY`. If you keep the built-in defaults, only `LLM_OPENAI_API_KEY` is required: ```bash theme={null} LLM_OPENAI_API_KEY=... # Built-in model defaults # - deriver: openai / gpt-5.4-mini # - dialectic (all levels): openai / gpt-5.4-mini # - summary: openai / gpt-5.4-mini # - dream specialists: openai / gpt-5.4-mini # - embeddings: openai / text-embedding-3-small ``` To use Gemini or Anthropic directly, override the features you want to move: ```bash theme={null} LLM_GEMINI_API_KEY=... DERIVER_MODEL_CONFIG__TRANSPORT=gemini DERIVER_MODEL_CONFIG__MODEL=gemini-2.5-flash LLM_ANTHROPIC_API_KEY=... DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT=anthropic DREAM_DEDUCTION_MODEL_CONFIG__MODEL=claude-haiku-4-5 ``` ### Self-Hosted (vLLM / Ollama) Use `transport = "openai"` and set `MODEL_CONFIG__OVERRIDES__BASE_URL` on each feature: ```bash theme={null} # vLLM LLM_OPENAI_API_KEY=not-needed DERIVER_MODEL_CONFIG__TRANSPORT=openai DERIVER_MODEL_CONFIG__MODEL=your-model-name DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=http://localhost:8000/v1 # Ollama LLM_OPENAI_API_KEY=ollama DERIVER_MODEL_CONFIG__TRANSPORT=openai DERIVER_MODEL_CONFIG__MODEL=llama3.3:70b DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=http://localhost:11434/v1 ``` Set `MODEL_CONFIG__TRANSPORT`, `MODEL_CONFIG__MODEL`, and `MODEL_CONFIG__OVERRIDES__BASE_URL` for each feature the same way. The same overrides are available in `config.toml`: ```toml theme={null} [deriver.model_config] transport = "openai" model = "my-local-model" [deriver.model_config.overrides] base_url = "http://localhost:8000/v1" api_key_env = "DERIVER_LOCAL_API_KEY" ``` ### Thinking Budget Built-in defaults do not set `MODEL_CONFIG__THINKING_BUDGET_TOKENS` or `MODEL_CONFIG__THINKING_EFFORT`. Add one only when your chosen model supports it. Use `MODEL_CONFIG__THINKING_EFFORT` for OpenAI reasoning models: ```bash theme={null} DERIVER_MODEL_CONFIG__THINKING_EFFORT=minimal DIALECTIC_LEVELS__max__MODEL_CONFIG__THINKING_EFFORT=medium ``` Use `MODEL_CONFIG__THINKING_BUDGET_TOKENS` for Anthropic and Gemini models. 
Set it to `0` or omit it for providers that don't support extended thinking: ```bash theme={null} SUMMARY_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024 DREAM_DEDUCTION_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024 ``` ### Provider-Specific Parameters Each model config supports an `overrides.provider_params` dict for passing arbitrary parameters to the underlying provider SDK. Use this for vendor-specific features that aren't part of the standard config: ```toml theme={null} [deriver.model_config.overrides.provider_params] # These are passed directly to the provider SDK verbosity = "low" ``` ### Changing Transport When changing a feature's `transport`, always specify `model` explicitly. Partial overrides that change transport without model will keep the previous model name, which may not be valid for the new provider. ### General LLM Settings ```bash theme={null} LLM_DEFAULT_MAX_TOKENS=2500 # Tool output limits (to prevent token explosion) LLM_MAX_TOOL_OUTPUT_CHARS=10000 # ~2500 tokens at 4 chars/token LLM_MAX_MESSAGE_CONTENT_CHARS=2000 # Max chars per message in tool results ``` ### Embedding Configuration Embeddings use their own nested model config, separate from the main text-generation LLM settings. ```bash theme={null} # Embedding vector settings EMBEDDING_VECTOR_DIMENSIONS=1536 EMBEDDING_MAX_INPUT_TOKENS=8192 EMBEDDING_MAX_TOKENS_PER_REQUEST=300000 # Embedding transport/model selection EMBEDDING_MODEL_CONFIG__TRANSPORT=openai # openai, gemini EMBEDDING_MODEL_CONFIG__MODEL=text-embedding-3-small # Optional endpoint overrides EMBEDDING_MODEL_CONFIG__OVERRIDES__BASE_URL=http://localhost:8000/v1 EMBEDDING_MODEL_CONFIG__OVERRIDES__API_KEY_ENV=EMBEDDING_CUSTOM_API_KEY ``` Current constraint: * `EMBEDDING_VECTOR_DIMENSIONS` can be changed for fully migrated external vector stores, but pgvector and dual-write mode still require `1536` until the schema migration lands. ### Feature-Specific Model Configuration Each feature can use a different provider and model. Below are all the tuning knobs. **Dialectic API:** The Dialectic API provides theory-of-mind informed responses. 
It uses a tiered reasoning system with five levels: ```bash theme={null} # Global dialectic settings DIALECTIC_MAX_OUTPUT_TOKENS=8192 DIALECTIC_MAX_INPUT_TOKENS=100000 DIALECTIC_HISTORY_TOKEN_LIMIT=8192 DIALECTIC_SESSION_HISTORY_MAX_TOKENS=4096 ``` **Per-Level Configuration:** Each reasoning level has its own provider, model, and settings: ```toml theme={null} # config.toml example [dialectic.levels.minimal] MAX_TOOL_ITERATIONS = 1 MAX_OUTPUT_TOKENS = 250 TOOL_CHOICE = "any" [dialectic.levels.minimal.model_config] transport = "openai" model = "gpt-5.4-mini" [dialectic.levels.low] MAX_TOOL_ITERATIONS = 5 TOOL_CHOICE = "any" [dialectic.levels.low.model_config] transport = "openai" model = "gpt-5.4-mini" [dialectic.levels.medium] MAX_TOOL_ITERATIONS = 2 [dialectic.levels.medium.model_config] transport = "openai" model = "gpt-5.4-mini" [dialectic.levels.high] MAX_TOOL_ITERATIONS = 4 [dialectic.levels.high.model_config] transport = "openai" model = "gpt-5.4-mini" [dialectic.levels.max] MAX_TOOL_ITERATIONS = 10 [dialectic.levels.max.model_config] transport = "openai" model = "gpt-5.4-mini" ``` Environment variables for nested levels use double underscores: ```bash theme={null} DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT=openai DIALECTIC_LEVELS__minimal__MODEL_CONFIG__MODEL=gpt-5.4-mini DIALECTIC_LEVELS__minimal__MAX_TOOL_ITERATIONS=1 DIALECTIC_LEVELS__minimal__MAX_OUTPUT_TOKENS=250 DIALECTIC_LEVELS__minimal__TOOL_CHOICE=any ``` **Deriver (Theory of Mind):** The Deriver extracts facts from messages and builds theory-of-mind representations of peers. ```bash theme={null} DERIVER_ENABLED=true # LLM settings DERIVER_MODEL_CONFIG__TRANSPORT=openai DERIVER_MODEL_CONFIG__MODEL=gpt-5.4-mini DERIVER_MAX_INPUT_TOKENS=25000 DERIVER_MAX_CUSTOM_INSTRUCTIONS_TOKENS=2000 # DERIVER_MODEL_CONFIG__THINKING_EFFORT=minimal # DERIVER_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024 # DERIVER_MODEL_CONFIG__TEMPERATURE=0.7 # Optional temperature override # Backup model (optional) # DERIVER_MODEL_CONFIG__FALLBACK__MODEL=claude-haiku-4-5 # DERIVER_MODEL_CONFIG__FALLBACK__TRANSPORT=anthropic # Worker settings DERIVER_WORKERS=1 # Increase for higher throughput DERIVER_POLLING_SLEEP_INTERVAL_SECONDS=1.0 DERIVER_STALE_SESSION_TIMEOUT_MINUTES=5 # Queue management DERIVER_QUEUE_ERROR_RETENTION_SECONDS=2592000 # 30 days # Observation settings DERIVER_DEDUPLICATE=true DERIVER_LOG_OBSERVATIONS=false DERIVER_WORKING_REPRESENTATION_MAX_OBSERVATIONS=100 DERIVER_REPRESENTATION_BATCH_MAX_TOKENS=1024 ``` **Peer Card:** ```bash theme={null} PEER_CARD_ENABLED=true ``` **Summary Generation:** Session summaries provide compressed context for long conversations — short summaries (frequent) and long summaries (comprehensive). ```bash theme={null} SUMMARY_ENABLED=true SUMMARY_MODEL_CONFIG__TRANSPORT=openai SUMMARY_MODEL_CONFIG__MODEL=gpt-5.4-mini SUMMARY_MAX_TOKENS_SHORT=1000 SUMMARY_MAX_TOKENS_LONG=4000 # SUMMARY_MODEL_CONFIG__THINKING_EFFORT=minimal # SUMMARY_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024 SUMMARY_MESSAGES_PER_SHORT_SUMMARY=20 SUMMARY_MESSAGES_PER_LONG_SUMMARY=60 ``` **Dream Processing:** Dream processing consolidates and refines peer representations during idle periods. 
```bash theme={null} DREAM_ENABLED=true DREAM_DOCUMENT_THRESHOLD=50 DREAM_IDLE_TIMEOUT_MINUTES=60 DREAM_MIN_HOURS_BETWEEN_DREAMS=8 DREAM_ENABLED_TYPES=["omni"] DREAM_MAX_TOOL_ITERATIONS=20 DREAM_HISTORY_TOKEN_LIMIT=16384 # Specialist model configs (each is independent) DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT=openai DREAM_DEDUCTION_MODEL_CONFIG__MODEL=gpt-5.4-mini DREAM_INDUCTION_MODEL_CONFIG__TRANSPORT=openai DREAM_INDUCTION_MODEL_CONFIG__MODEL=gpt-5.4-mini ``` **Surprisal-Based Sampling (Advanced):** Optional subsystem for identifying unusual observations during dreaming: ```bash theme={null} DREAM_SURPRISAL__ENABLED=false DREAM_SURPRISAL__TREE_TYPE=kdtree DREAM_SURPRISAL__TREE_K=5 DREAM_SURPRISAL__SAMPLING_STRATEGY=recent DREAM_SURPRISAL__SAMPLE_SIZE=200 DREAM_SURPRISAL__TOP_PERCENT_SURPRISAL=0.10 DREAM_SURPRISAL__MIN_HIGH_SURPRISAL_FOR_REPLACE=10 DREAM_SURPRISAL__INCLUDE_LEVELS=["explicit", "deductive"] ``` ## Core Configuration ### Application Settings ```bash theme={null} LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL SESSION_OBSERVERS_LIMIT=10 GET_CONTEXT_MAX_TOKENS=100000 MAX_MESSAGE_SIZE=25000 MAX_FILE_SIZE=5242880 # 5MB EMBED_MESSAGES=true EMBEDDING_MAX_INPUT_TOKENS=8192 EMBEDDING_MAX_TOKENS_PER_REQUEST=300000 NAMESPACE=honcho ``` **Optional Integrations:** ```bash theme={null} LANGFUSE_HOST=https://cloud.langfuse.com LANGFUSE_PUBLIC_KEY=your-langfuse-public-key COLLECT_METRICS_LOCAL=false LOCAL_METRICS_FILE=metrics.jsonl REASONING_TRACES_FILE=traces.jsonl ``` ### Database ```bash theme={null} # Connection (required) DB_CONNECTION_URI=postgresql+psycopg://postgres:postgres@localhost:5432/postgres # Pool settings DB_SCHEMA=public DB_POOL_PRE_PING=true DB_POOL_SIZE=10 DB_MAX_OVERFLOW=20 DB_POOL_TIMEOUT=30 DB_POOL_RECYCLE=300 DB_POOL_USE_LIFO=true DB_SQL_DEBUG=false ``` ### Authentication ```bash theme={null} AUTH_USE_AUTH=false # Set to true to require JWT tokens AUTH_JWT_SECRET=your-super-secret-jwt-key # Required when auth is enabled ``` Generate a secret: `python scripts/generate_jwt_secret.py` ### Cache (Redis) Redis caching is optional. Honcho works without it but benefits from caching in high-traffic scenarios. 
```bash theme={null}
CACHE_ENABLED=false
CACHE_URL=redis://localhost:6379/0?suppress=true
CACHE_NAMESPACE=honcho
CACHE_DEFAULT_TTL_SECONDS=300
CACHE_DEFAULT_LOCK_TTL_SECONDS=5 # Cache stampede prevention
```

### Webhooks

```bash theme={null}
WEBHOOK_SECRET=your-webhook-signing-secret
WEBHOOK_MAX_WORKSPACE_LIMIT=10
```

### Vector Store

```bash theme={null}
VECTOR_STORE_TYPE=pgvector # Options: pgvector, turbopuffer, lancedb
VECTOR_STORE_MIGRATED=false
VECTOR_STORE_NAMESPACE=honcho
VECTOR_STORE_DIMENSIONS=1536

# Turbopuffer-specific
VECTOR_STORE_TURBOPUFFER_API_KEY=your-turbopuffer-api-key
VECTOR_STORE_TURBOPUFFER_REGION=us-east-1

# LanceDB-specific
VECTOR_STORE_LANCEDB_PATH=./lancedb_data
```

## Monitoring

### Prometheus Metrics

Honcho exposes `/metrics` endpoints for scraping:

* **API process**: Port 8000
* **Deriver process**: Port 9090

```bash theme={null}
METRICS_ENABLED=false
METRICS_NAMESPACE=honcho
```

### CloudEvents Telemetry

```bash theme={null}
TELEMETRY_ENABLED=false
TELEMETRY_ENDPOINT=https://telemetry.honcho.dev/v1/events
TELEMETRY_HEADERS='{"Authorization": "Bearer your-token"}'
TELEMETRY_BATCH_SIZE=100
TELEMETRY_FLUSH_INTERVAL_SECONDS=1.0
TELEMETRY_MAX_RETRIES=3
TELEMETRY_MAX_BUFFER_SIZE=10000
```

### Sentry

```bash theme={null}
SENTRY_ENABLED=false
SENTRY_DSN=https://your-sentry-dsn@sentry.io/project-id
SENTRY_ENVIRONMENT=production
SENTRY_TRACES_SAMPLE_RATE=0.1
SENTRY_PROFILES_SAMPLE_RATE=0.1
```

## Reference config.toml

A complete `config.toml` with all defaults. Copy it and modify what you need:

```toml theme={null}
[app]
LOG_LEVEL = "INFO"
SESSION_OBSERVERS_LIMIT = 10
EMBED_MESSAGES = true
NAMESPACE = "honcho"

[db]
CONNECTION_URI = "postgresql+psycopg://postgres:postgres@localhost:5432/postgres"
POOL_SIZE = 10
MAX_OVERFLOW = 20

[auth]
USE_AUTH = false

[cache]
ENABLED = false
URL = "redis://localhost:6379/0?suppress=true"
DEFAULT_TTL_SECONDS = 300

[deriver]
ENABLED = true
WORKERS = 1

[deriver.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[peer_card]
ENABLED = true

[dialectic]
MAX_OUTPUT_TOKENS = 8192

[dialectic.levels.minimal]
MAX_TOOL_ITERATIONS = 1
MAX_OUTPUT_TOKENS = 250
TOOL_CHOICE = "any"

[dialectic.levels.minimal.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.low]
MAX_TOOL_ITERATIONS = 5
TOOL_CHOICE = "any"

[dialectic.levels.low.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.medium]
MAX_TOOL_ITERATIONS = 2

[dialectic.levels.medium.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.high]
MAX_TOOL_ITERATIONS = 4

[dialectic.levels.high.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.max]
MAX_TOOL_ITERATIONS = 10

[dialectic.levels.max.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[summary]
ENABLED = true
MAX_TOKENS_SHORT = 1000
MAX_TOKENS_LONG = 4000

[summary.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dream]
ENABLED = true

[dream.deduction_model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dream.induction_model_config]
transport = "openai"
model = "gpt-5.4-mini"

[webhook]
MAX_WORKSPACE_LIMIT = 10

[metrics]
ENABLED = false

[telemetry]
ENABLED = false

[vector_store]
TYPE = "pgvector"

[sentry]
ENABLED = false
```

## Database Migrations

```bash theme={null}
uv run alembic current                                    # Check status
uv run alembic upgrade head                               # Upgrade to latest
uv run alembic downgrade <revision>                       # Downgrade to a specific revision
uv run alembic revision --autogenerate -m "Description"   # Create new migration
```

## Troubleshooting
1. **Database connection errors** — Ensure `DB_CONNECTION_URI` uses the `postgresql+psycopg://` prefix. Verify the database is running and the pgvector extension is installed.
2. **Authentication issues** — Generate and set `AUTH_JWT_SECRET` when `AUTH_USE_AUTH=true`. Use `python scripts/generate_jwt_secret.py`.
3. **LLM provider errors** — Verify API keys are set. Check that model names match your provider's format. Ensure models support tool calling.
4. **Deriver not processing** — Check logs. Increase `DERIVER_WORKERS` for throughput. Verify database and LLM connectivity.
5. **Dialectic level issues** — Unset level fields inherit from the built-in defaults. For Anthropic, `THINKING_BUDGET_TOKENS` must be >= 1024 when enabled. For providers without budgeted thinking, omit it or set it to `0`. `MAX_OUTPUT_TOKENS` must exceed `THINKING_BUDGET_TOKENS`.
6. **Vector store issues** — For Turbopuffer, set the API key. Check that `VECTOR_STORE_DIMENSIONS` matches your embedding model.

# Contributing Guidelines

Source: https://honcho.dev/docs/v3/contributing/guidelines

Thank you for your interest in contributing to Honcho! This guide outlines the process for contributing to the project and our development conventions.

## Getting Started

Before you start contributing, please:

1. **Set up your development environment** - Follow the [Local Development guide](https://github.com/plastic-labs/honcho/blob/main/CONTRIBUTING.md#local-development) in the Honcho repository to get Honcho running locally.
2. **Join our community** - Feel free to join us in our [Discord](http://discord.gg/honcho) to discuss your changes, get help, or ask questions.
3. **Review existing issues** - Check the [issues tab](https://github.com/plastic-labs/honcho/issues) to see what's already being worked on or to find something to contribute to.

## Contribution Workflow

### 1. Fork and Clone

1. Fork the repository on GitHub
2. Clone your fork locally:
   ```bash theme={null}
   git clone https://github.com/YOUR_USERNAME/honcho.git
   cd honcho
   ```
3. Add the upstream repository as a remote:
   ```bash theme={null}
   git remote add upstream https://github.com/plastic-labs/honcho.git
   ```

### 2. Create a Branch

Create a new branch for your feature or bug fix:

```bash theme={null}
git checkout -b feature/your-feature-name
# or
git checkout -b fix/your-bug-fix-name
```

**Branch naming conventions:**

* `feature/description` - for new features
* `fix/description` - for bug fixes
* `docs/description` - for documentation updates
* `refactor/description` - for code refactoring
* `test/description` - for adding or updating tests

### 3. Make Your Changes

* Write clean, readable code that follows our coding standards (see below)
* Add tests for new functionality
* Update documentation as needed
* Make sure your changes don't break existing functionality

### 4. Commit Your Changes

We follow conventional commit standards.
Format your commit messages as: ``` type(scope): description [optional body] [optional footer] ``` **Types:** * `feat`: A new feature * `fix`: A bug fix * `docs`: Documentation only changes * `style`: Changes that do not affect the meaning of the code * `refactor`: A code change that neither fixes a bug nor adds a feature * `test`: Adding missing tests or correcting existing tests * `chore`: Changes to the build process or auxiliary tools **Examples:** ```bash theme={null} git commit -m "feat(api): add new dialectic endpoint for user insights" git commit -m "fix(db): resolve connection pool timeout issue" git commit -m "docs(readme): update installation instructions" ``` ### 5. Submit a Pull Request 1. Push your branch to your fork: ```bash theme={null} git push origin your-branch-name ``` 2. Create a pull request on GitHub from your branch to the `main` branch 3. Fill out the pull request template with: * A clear description of what changes you've made * The motivation for the changes * Any relevant issue numbers (use "Closes #123" to auto-close issues) * Screenshots or examples if applicable ## Coding Standards ### Python Code Style * Follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) style guidelines * Use [Black](https://black.readthedocs.io/) for code formatting (we may add this to CI in the future) * Use type hints where possible * Write docstrings for functions and classes using Google style docstrings ### Code Organization * Keep functions focused and single-purpose * Use meaningful variable and function names * Add comments for complex logic * Follow existing patterns in the codebase ### Testing * Write unit tests for new functionality * Ensure existing tests pass before submitting * Use descriptive test names that explain what is being tested * Mock external dependencies appropriately ### Documentation * Update relevant documentation for new features * Include examples in docstrings where helpful * Keep README and other docs up to date with changes ## Review Process 1. **Automated checks** - Your PR will run through automated checks including tests and linting 2. **Project maintainer review** - A project maintainer will review your code for: * Code quality and adherence to standards * Functionality and correctness * Test coverage * Documentation completeness 3. **Discussion and iteration** - You may be asked to make changes or clarifications 4. **Approval and merge** - Once approved, your PR will be merged into `main` ## Types of Contributions We welcome various types of contributions: * **Bug fixes** - Help us squash bugs and improve stability * **New features** - Add functionality that benefits the community * **Documentation** - Improve or expand our documentation * **Tests** - Increase test coverage and reliability * **Performance improvements** - Help make Honcho faster and more efficient * **Examples and tutorials** - Help other developers use Honcho ## Issue Reporting When reporting bugs or requesting features: 1. Check if the issue already exists 2. Use the appropriate issue template 3. Provide clear reproduction steps for bugs 4. Include relevant environment information 5. 
Be specific about expected vs actual behavior ## Questions and Support * **General questions** - Join our [Discord](http://discord.gg/honcho) * **Bug reports** - Use GitHub issues * **Feature requests** - Use GitHub issues with the feature request template * **Security issues** - Please email us privately rather than opening a public issue ## License By contributing to Honcho, you agree that your contributions will be licensed under the same [AGPL-3.0 License](./license) that covers the project. Thank you for helping make Honcho better! 🫡 # License Source: https://honcho.dev/docs/v3/contributing/license Honcho is licensed under the AGPL-3.0 License. This is copied below for convenience and also present in the [GitHub Repository](https://github.com/plastic-labs/honcho) ``` GNU AFFERO GENERAL PUBLIC LICENSE Version 3, 19 November 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU Affero General Public License is a free, copyleft license for software and other kinds of works, specifically designed to ensure cooperation with the community in the case of network server software. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, our General Public Licenses are intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. Developers that use our General Public Licenses protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License which gives you legal permission to copy, distribute and/or modify the software. A secondary benefit of defending all users' freedom is that improvements made in alternate versions of the program, if they receive widespread use, become available for other developers to incorporate. Many developers of free software are heartened and encouraged by the resulting cooperation. However, in the case of software used on network servers, this result may fail to come about. The GNU General Public License permits making a modified version and letting the public access it on a server without ever releasing its source code to the public. The GNU Affero General Public License is designed specifically to ensure that, in such cases, the modified source code becomes available to the community. It requires the operator of a network server to provide the source code of the modified version running there to the users of that server. Therefore, public use of a modified version, on a publicly accessible server, gives the public access to the source code of the modified version. An older license, called the Affero General Public License and published by Affero, was designed to accomplish similar goals. This is a different license, not a version of the Affero GPL, but Affero has released a new version of the Affero GPL which permits relicensing under this license. The precise terms and conditions for copying, distribution and modification follow. 
TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU Affero General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. 
For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. 
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. 
If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. 
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. 
Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". 
A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. 
Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Remote Network Interaction; Use with the GNU General Public License. Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software. This Corresponding Source shall include the Corresponding Source for any work covered by version 3 of the GNU General Public License that is incorporated pursuant to the following paragraph. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the work with which it is combined will remain governed by version 3 of the GNU General Public License. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU Affero General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU Affero General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU Affero General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU Affero General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.

If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year>  <name of author>

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

If your software can interact with users remotely through a computer network, you should also make sure that it provides a way for users to get its source. For example, if your program is a web application, its interface could display a "Source" link that leads users to an archive of the code. There are many ways you could offer source, and different solutions will be better for different programs; see section 13 for the specific requirements.

You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU AGPL, see <https://www.gnu.org/licenses/>.
``` # Local Environment Setup Source: https://honcho.dev/docs/v3/contributing/self-hosting Set up a local environment to run Honcho for development, testing, or self-hosting This guide helps you set up a local environment to run Honcho for development, testing, or self-hosting. ## Overview By the end of this guide, you'll have: * A local Honcho server running on your machine * A PostgreSQL database with pgvector extension * Basic configuration to connect your applications * A working environment for development or testing ## Prerequisites Before you begin, ensure you have the following installed: ### Required Software * **uv** - Python package manager: `curl -LsSf https://astral.sh/uv/install.sh | sh` or `brew install uv` * **Git** - [Download from git-scm.com](https://git-scm.com/downloads) * **Docker** (required for Docker setup, not needed for manual setup) - [Download from docker.com](https://www.docker.com/products/docker-desktop/) ### Database Options You'll need a PostgreSQL database with the pgvector extension. Choose one: * **Local PostgreSQL** - Install locally or use Docker * **Supabase** - Free cloud PostgreSQL with pgvector * **Railway** - Simple cloud PostgreSQL hosting * **Your own PostgreSQL server** ## LLM Setup Honcho uses LLMs for memory extraction, summarization, dialectic chat, and dreaming. The server will **fail to start** without a provider configured. If you keep the built-in defaults, you only need one API key: all text-generation features default to `openai / gpt-5.4-mini`, and embeddings default to `openai / text-embedding-3-small`. Any OpenAI-compatible endpoint works too — OpenRouter, Together, Fireworks, Ollama, vLLM, or LiteLLM. Models must support tool calling (function calling). After copying `.env.template` to `.env`, the default setup is: ```bash theme={null} # Required for the built-in defaults LLM_OPENAI_API_KEY=sk-... ``` If you want a different model or an OpenAI-compatible proxy, uncomment and edit the relevant `*_MODEL_CONFIG__TRANSPORT`, `*_MODEL_CONFIG__MODEL`, and `*_MODEL_CONFIG__OVERRIDES__BASE_URL` lines in the Deriver, Dialectic, Summary, and Dream sections. For example: ```bash theme={null} LLM_OPENAI_API_KEY=sk-or-v1-... DERIVER_MODEL_CONFIG__TRANSPORT=openai DERIVER_MODEL_CONFIG__MODEL=google/gemini-2.5-flash DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1 ``` For recommended model tiers per feature, using multiple providers, or direct vendor API keys, see the [Configuration Guide](./configuration#llm-configuration). **Community quick-start**: [elkimek/honcho-self-hosted](https://github.com/elkimek/honcho-self-hosted) provides a one-command installer with pre-configured model tiers, interactive provider setup, and Hermes Agent integration. ## Docker Setup (Recommended) Docker Compose handles the database, Redis, and Honcho server. The compose file **builds the image from source** (there is no pre-built image on Docker Hub). This requires Docker with BuildKit enabled — see [Troubleshooting](./troubleshooting#docker-build-fails-with-permission-errors) if the build fails. The compose file is production-oriented by default (ports bound to `127.0.0.1`, restart policies, caching enabled). For development, uncomment the source mounts and monitoring services inside the file. ### 1. Clone the Repository ```bash theme={null} git clone https://github.com/plastic-labs/honcho.git cd honcho ``` ### 2. 
Set Up Environment Variables Copy the example environment file and configure it: ```bash theme={null} cp .env.template .env ``` Edit `.env` and configure your LLM provider — see [LLM Setup](#llm-setup) above. The database connection is set in the compose file. Auth is disabled by default (`AUTH_USE_AUTH=false`). ### 3. Start the Services ```bash theme={null} cp docker-compose.yml.example docker-compose.yml docker compose up -d --build ``` The first build takes a few minutes (compiling from source). Subsequent starts are fast. This starts four services: **api** (port 8000), **deriver** (background worker), **database** (PostgreSQL with pgvector, port 5432), and **redis** (port 6379). All ports are bound to `127.0.0.1`. Redis caching is enabled by default. For development, uncomment the source mount and monitoring sections inside `docker-compose.yml` to enable live reload, Prometheus, and Grafana. ### 4. Verify Migrations run automatically on startup. ```bash theme={null} # Check all containers are running docker compose ps # Health check (confirms the process is up) curl http://localhost:8000/health # Check the deriver is processing (look for "polling" or "processing" in logs) docker compose logs deriver --tail 20 ``` For a full end-to-end test, see [Verify Your Setup](#verify-your-setup) below. ## Manual Setup For more control over your environment, you can set up everything manually. ### 1. Clone and Install Dependencies ```bash theme={null} git clone https://github.com/plastic-labs/honcho.git cd honcho # Install dependencies using uv (this will also set up Python if needed) uv sync # Activate the virtual environment source .venv/bin/activate # On Windows: .venv\Scripts\activate ``` ### 2. Set Up PostgreSQL #### Option A: Local PostgreSQL Installation Install PostgreSQL and pgvector on your system: **macOS (using Homebrew):** ```bash theme={null} brew install postgresql brew install pgvector ``` **Ubuntu/Debian:** ```bash theme={null} sudo apt update sudo apt install postgresql postgresql-contrib # Install pgvector extension (see pgvector docs for your version) ``` **Windows:** Download from [postgresql.org](https://www.postgresql.org/download/windows/) #### Option B: Docker PostgreSQL ```bash theme={null} docker run --name honcho-db \ -e POSTGRES_USER=postgres \ -e POSTGRES_PASSWORD=postgres \ -p 5432:5432 \ -d pgvector/pgvector:pg15 ``` ### 3. Enable Extensions Connect to PostgreSQL and enable pgvector: ```bash theme={null} # Connect to PostgreSQL psql -U postgres # Enable the pgvector extension on the default database CREATE EXTENSION IF NOT EXISTS vector; \q ``` ### 4. Configure Environment Create a `.env` file with your settings: ```bash theme={null} cp .env.template .env ``` Edit `.env` — configure your LLM provider (see [LLM Setup](#llm-setup) above) and set the database connection: ```bash theme={null} DB_CONNECTION_URI=postgresql+psycopg://postgres:postgres@localhost:5432/postgres AUTH_USE_AUTH=false LOG_LEVEL=DEBUG ``` ### 5. Run Database Migrations ```bash theme={null} # Run migrations to create tables uv run alembic upgrade head ``` ### 6. Start the Server ```bash theme={null} # Start the development server uv run fastapi dev src/main.py ``` The server will be available at `http://localhost:8000`. ### 7. Start the Background Worker (Deriver) In a **separate terminal**, start the deriver background worker: ```bash theme={null} uv run python -m src.deriver ``` The deriver is essential for Honcho's core functionality. 
It processes incoming messages to extract observations, build peer representations, generate session summaries, and run dream consolidation. Without it, messages will be stored but no memory or reasoning will occur. ## Cloud Database Setup If you prefer to use a managed PostgreSQL service: ### Supabase (Recommended) 1. **Create a Supabase project** at [supabase.com](https://supabase.com) 2. **Enable pgvector extension** in the SQL editor: ```sql theme={null} CREATE EXTENSION IF NOT EXISTS vector; ``` 3. **Get your connection string** from Settings > Database 4. **Update your `.env` file** with the connection string ### Railway 1. **Create a Railway project** at [railway.app](https://railway.app) 2. **Add a PostgreSQL service** 3. **Enable pgvector** in the PostgreSQL console 4. **Get your connection string** from the service variables 5. **Update your `.env` file** ## Verify Your Setup Once your Honcho server is running, verify everything is working: ### 1. Health Check ```bash theme={null} curl http://localhost:8000/health # {"status":"ok"} ``` Note: `/health` only confirms the process is running. It does not check database or LLM connectivity. ### 2. Smoke Test (database + API) This confirms the database connection, migrations, and API are all working: ```bash theme={null} # Create a workspace curl -s -X POST http://localhost:8000/v3/workspaces \ -H "Content-Type: application/json" \ -d '{"name": "test"}' | python3 -m json.tool ``` If you get back a workspace object with an `id`, your database is connected and migrations ran correctly. ### 3. API Documentation Visit `http://localhost:8000/docs` to see the interactive API documentation. ### 4. Test with SDK ```python theme={null} from honcho import Honcho client = Honcho( base_url="http://localhost:8000", workspace_id="test" ) peer = client.peer("test-user") print(f"Created peer: {peer.id}") ``` ## Connect Your Application Now that Honcho is running locally, you can connect your applications: ### Update SDK Configuration ```python theme={null} # Python SDK from honcho import Honcho client = Honcho( base_url="http://localhost:8000", ) ``` ```typescript theme={null} // TypeScript SDK import { Honcho } from '@honcho-ai/sdk'; const client = new Honcho({ baseUrl: 'http://localhost:8000', }); ``` ### Next Steps * **Configure Honcho**: Visit the [Configuration Guide](./configuration) for model tiers, provider options, and tuning * **Explore the API**: Check out the [API Reference](../api-reference/introduction) * **Try the SDKs**: See our [guides](../guides) for examples * **Join the community**: [Discord](https://discord.gg/honcho) ## Troubleshooting Running into issues? See the [Troubleshooting Guide](./troubleshooting) for detailed solutions to common problems including: * Startup failures (missing API keys, database issues) * Runtime errors ("An unexpected error occurred" on every request) * Deriver not processing messages * Database connection and migration issues * Docker and Redis problems **Quick checks:** * Verify the server is running: `curl http://localhost:8000/health` * Check logs: `docker compose logs api` (Docker) or check terminal output (manual setup) * Ensure migrations ran: `uv run alembic upgrade head` ## Production Considerations The default compose file is already production-oriented — ports bound to `127.0.0.1`, restart policies, caching enabled. ### Security * Set `AUTH_USE_AUTH=true` and generate a JWT secret with `python scripts/generate_jwt_secret.py` * Use HTTPS via a reverse proxy in front of Honcho. 
Example with Caddy (automatic TLS):

```
honcho.example.com {
  reverse_proxy localhost:8000
}
```

Or with nginx:

```nginx theme={null}
server {
    listen 443 ssl;
    server_name honcho.example.com;

    ssl_certificate /etc/letsencrypt/live/honcho.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/honcho.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

* Secure your database with strong credentials and restrict network access
* The production compose binds PostgreSQL and Redis to `127.0.0.1` only — they are not accessible from the network

### Scaling the Deriver

* Increase `DERIVER_WORKERS` (default: 1) for higher message throughput
* You can also run multiple deriver processes across machines — they coordinate via the database queue
* Monitor deriver logs for processing backlog

### Caching

* The production compose enables Redis caching by default (`CACHE_ENABLED=true`)
* For the development compose, enable manually: `CACHE_ENABLED=true`
* Configure `CACHE_URL` to point to your Redis instance (or use a managed Redis service)

### Database Migrations

* Always run `uv run alembic upgrade head` after updating Honcho before starting the server
* Check current migration status with `uv run alembic current`

### LLM Providers

* Ensure your API keys are configured (see [LLM Setup](#llm-setup))
* For alternative providers or per-feature model overrides, see the [Configuration Guide](./configuration#llm-configuration)

### Monitoring

* Enable Prometheus metrics with `METRICS_ENABLED=true`. The API exposes `/metrics` on port 8000, the deriver on port 9090 (internal to its container — not published to the host by default).
* Enable Sentry error tracking with `SENTRY_ENABLED=true`
* The development compose includes Prometheus (host port 9090) and Grafana (host port 3000) for scraping and dashboards. Uncomment those services to enable them.

### Backups

* Set up regular PostgreSQL backups:
  ```bash theme={null}
  # One-off backup
  docker compose exec database pg_dump -U postgres postgres > backup-$(date +%Y%m%d).sql

  # Restore
  cat backup.sql | docker compose exec -T database psql -U postgres postgres
  ```
* Back up your `.env` or `config.toml` configuration files

# Troubleshooting

Source: https://honcho.dev/docs/v3/contributing/troubleshooting

Common issues and solutions when self-hosting Honcho

This page covers common issues you may encounter when self-hosting Honcho, what causes them, and how to fix them.

## Startup Failures

### Server won't start: "Missing client for ..."

```
ValueError: Missing client for Deriver: google
```

**Cause:** The server validates at startup that all configured LLM providers have API keys. If a provider is referenced in your configuration but the corresponding API key isn't set, the server refuses to start.

**Fix:** Set the API key for every provider your configuration references. With the built-in defaults, only `LLM_OPENAI_API_KEY` is needed; if you have routed components to other providers, set each corresponding key:

```bash theme={null}
LLM_OPENAI_API_KEY=...    # Built-in default for all components, including embeddings (EMBED_MESSAGES=true)
LLM_GEMINI_API_KEY=...    # Only if you route components (e.g., the deriver) to Gemini
LLM_ANTHROPIC_API_KEY=... # Only if you route components (e.g., dialectic levels or dream) to Anthropic
```

See the [LLM Setup](/docs/v3/contributing/self-hosting#llm-setup) section for provider configuration. You can change which providers are used in your `.env` or `config.toml` (see [Configuration Guide](./configuration#llm-configuration)).
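To see at a glance which providers your environment actually references, and therefore which keys you need, you can grep your `.env`. This is just a quick local check, not a Honcho command:

```bash theme={null}
# List every transport and API key line in your environment file,
# skipping commented-out entries
grep -E 'TRANSPORT|API_KEY' .env | grep -v '^#'
```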
### Server won't start: "JWT\_SECRET must be set" ``` ValueError: JWT_SECRET must be set if USE_AUTH is true ``` **Cause:** You enabled authentication (`AUTH_USE_AUTH=true`) but didn't provide a JWT secret. **Fix:** Generate a secret and set it: ```bash theme={null} python scripts/generate_jwt_secret.py # Then set the output as: AUTH_JWT_SECRET= ``` Or disable authentication for local development: `AUTH_USE_AUTH=false` ## Runtime Errors ### API returns "An unexpected error occurred" on every request **Cause:** This is almost always a database issue. The health endpoint (`/health`) will return `{"status": "ok"}` even when the database is unreachable because it doesn't check the database connection. The actual error appears in the server logs. **Common causes and fixes:** 1. **Database is unreachable** — Check that PostgreSQL is running and the `DB_CONNECTION_URI` is correct 2. **Migrations haven't been run** — The server starts successfully without tables, but every API call will fail. Run: ```bash theme={null} uv run alembic upgrade head ``` In Docker: ```bash theme={null} docker compose exec api uv run alembic upgrade head ``` 3. **pgvector extension not installed** — The `vector` extension must be enabled in your database: ```sql theme={null} CREATE EXTENSION IF NOT EXISTS vector; ``` **How to diagnose:** Check the server logs for the actual error. Look for: * `sqlalchemy.exc.OperationalError` — database connection issue * `sqlalchemy.exc.ProgrammingError` with "relation does not exist" — migrations not run * `psycopg.OperationalError` — connection refused or authentication failed ### Health check passes but API calls fail The `/health` endpoint is a lightweight check that confirms the server process is running. It does **not** verify: * Database connectivity * That migrations have been run * That LLM providers are reachable To verify full functionality, try creating a workspace: ```bash theme={null} curl -X POST http://localhost:8000/v3/workspaces \ -H "Content-Type: application/json" \ -d '{"name": "test"}' ``` If this succeeds, your database connection and migrations are working. ### Deriver not processing messages Messages are stored but no observations, summaries, or representations are being generated. **Common causes:** 1. **Deriver isn't running** — In manual setup, the deriver is a separate process: ```bash theme={null} uv run python -m src.deriver ``` In Docker, it starts automatically via `docker compose up`. 2. **Deriver can't reach the database** — Check deriver logs for connection errors. The deriver uses the same `DB_CONNECTION_URI` as the API server. 3. **Missing LLM API key for deriver provider** — By default the deriver uses Google Gemini (`LLM_GEMINI_API_KEY`). Check deriver logs for API errors. 4. **Processing backlog** — With `DERIVER_WORKERS=1` (default), high message volume can cause a backlog. Increase workers: ```bash theme={null} DERIVER_WORKERS=4 ``` 5. **Representation Batch Max** — By default the deriver is set to buffer its operations until there are enough tokens for a given representation in a session. This is set via the `REPRESENTATION_BATCH_MAX_TOKENS` environment variable. If you aren't seeing tasks continue it may be that the batch size is set too high or enough data hasn't flowed into to the session yet. See [token batching](/docs/v3/documentation/core-concepts/reasoning#token-batching) for more details ## Alternative Provider Issues ### OpenRouter / custom provider not working If calls to an OpenAI-compatible proxy fail: 1. 
**Verify the endpoint and key are set.** Use `transport = "openai"` with a base URL override: ```bash theme={null} LLM_OPENAI_API_KEY=sk-or-v1-... DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1 ``` 2. **Check model names match the provider's format.** OpenRouter uses `vendor/model` format (e.g., `anthropic/claude-haiku-4-5`), not the raw model ID. 3. **Ensure your model supports tool calling.** The deriver, dialectic, and dream agents require tool use. Check the provider's model page for tool calling support. 4. **Check server logs for the actual error.** API errors from the upstream provider will appear in Honcho's logs with the HTTP status code and message body. ### vLLM / Ollama not responding 1. **Verify the model server is running** and accessible from the Honcho process (or container): ```bash theme={null} curl http://localhost:8000/v1/models # vLLM curl http://localhost:11434/v1/models # Ollama ``` 2. **In Docker**, `localhost` inside a container doesn't reach the host. Use `host.docker.internal` (macOS/Windows) or the host's network IP: ```bash theme={null} DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=http://host.docker.internal:8000/v1 ``` 3. **Structured output failures** — vLLM's structured output support is limited to certain response formats. If you see JSON parsing errors, check the deriver/dream logs for the raw response. ### Thinking budget errors with non-Anthropic providers If you see errors like `thinking budget not supported`, `invalid parameter`, or silent failures where agents produce no output, one of your per-component `*_MODEL_CONFIG__THINKING_BUDGET_TOKENS` overrides is likely set to a value > 0 with a provider that doesn't support Anthropic-style extended thinking. The built-in defaults do not set thinking budgets, so this only applies if you added those overrides yourself. **Fix:** Set `*_MODEL_CONFIG__THINKING_BUDGET_TOKENS=0` for every component when using models that don't support thinking: ```bash theme={null} DERIVER_MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 SUMMARY_MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DREAM_DEDUCTION_MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DREAM_INDUCTION_MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DIALECTIC_LEVELS__minimal__MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DIALECTIC_LEVELS__low__MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DIALECTIC_LEVELS__medium__MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DIALECTIC_LEVELS__high__MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 DIALECTIC_LEVELS__max__MODEL_CONFIG__THINKING_BUDGET_TOKENS=0 ``` For OpenAI reasoning models, use `*_MODEL_CONFIG__THINKING_EFFORT` instead of `*_MODEL_CONFIG__THINKING_BUDGET_TOKENS`. ## Database Issues ### Connection string format The connection URI **must** use the `postgresql+psycopg` prefix: ```bash theme={null} # Correct DB_CONNECTION_URI=postgresql+psycopg://postgres:postgres@localhost:5432/postgres # Wrong - will fail DB_CONNECTION_URI=postgresql://postgres:postgres@localhost:5432/postgres DB_CONNECTION_URI=postgres://postgres:postgres@localhost:5432/postgres ``` ### Checking migration status ```bash theme={null} # See current migration version uv run alembic current # See migration history uv run alembic history # Upgrade to latest uv run alembic upgrade head ``` ## Cache & Redis ### Redis is optional Redis is used for caching when `CACHE_ENABLED=true` (default: `false`). If Redis is unreachable, Honcho **gracefully falls back to in-memory caching** and logs a warning. 
This means: * The server and deriver will still start and function normally * Performance may be reduced under high load without Redis * You do not need Redis for local development or testing ### Redis connection issues If you see Redis connection warnings in logs but `CACHE_ENABLED=false`, they can be safely ignored. If you want caching: ```bash theme={null} # Start Redis via Docker docker run -d -p 6379:6379 redis:latest # Configure Honcho CACHE_ENABLED=true CACHE_URL=redis://localhost:6379/0 ``` ## Docker Issues ### Docker build fails with permission errors The Honcho Dockerfile uses BuildKit mount syntax and creates a non-root `app` user. Common build failures: **1. BuildKit not enabled** The Dockerfile uses `RUN --mount=type=cache` which requires Docker BuildKit. If you see syntax errors during build: ```bash theme={null} # Ensure BuildKit is enabled DOCKER_BUILDKIT=1 docker compose build ``` Or add to your Docker daemon config (`/etc/docker/daemon.json`): ```json theme={null} { "features": { "buildkit": true } } ``` **2. Permission denied during build or at runtime (Linux)** On Linux, AppArmor or SELinux can block Docker build operations and volume mounts. Symptoms include permission denied errors during `COPY`, `RUN`, or when the container tries to access mounted volumes. ```bash theme={null} # Check if AppArmor is blocking Docker sudo aa-status | grep docker # Temporarily test without AppArmor (for diagnosis only) docker compose down sudo aa-remove-unknown docker compose up -d ``` For SELinux, add `:z` to volume mounts in `docker-compose.yml`: ```yaml theme={null} volumes: - .:/app:z ``` **3. Volume mount UID mismatch** The Dockerfile creates a non-root `app` user, but `docker-compose.yml.example` mounts `.:/app` which overlays the container filesystem with host-owned files. The `app` user inside the container may not have permission to read them. If you see permission errors at runtime (not build time), you can either: * Run without the source mount (remove `- .:/app` from volumes — the image already contains the code) * Or fix ownership: `sudo chown -R 100:101 .` (matches the `app` user inside the container) ### Containers start but API fails 1. Check container status: `docker compose ps` 2. Check API logs: `docker compose logs api` 3. Check database logs: `docker compose logs database` 4. Ensure migrations ran: `docker compose exec api uv run alembic upgrade head` ### Port conflicts If port 8000 is already in use: ```bash theme={null} # Check what's using the port lsof -i :8000 # Or change the port mapping in docker-compose.yml ports: - "8001:8000" # Map to a different host port ``` ### Rebuilding after code changes ```bash theme={null} docker compose build --no-cache docker compose up -d ``` ## Getting Help If your issue isn't covered here: * **Check the logs** — most issues are diagnosed from server or deriver logs * **GitHub Issues** — [Report bugs](https://github.com/plastic-labs/honcho/issues) * **Discord** — [Join our community](https://discord.gg/plasticlabs) * **Configuration** — See the [Configuration Guide](./configuration) for all available settings # Architecture & Intuition Source: https://honcho.dev/docs/v3/documentation/core-concepts/architecture Understanding Honcho's core concepts and data model. Honcho is memory infrastructure that continuously [*reasons*](/docs/v3/documentation/core-concepts/reasoning) about data to build rich representations of peers (users, agents, or any entity) over time. 
This document explains the data model, system components, and how data flows through Honcho. ## Data Model Honcho has a hierarchical data model centered around the entities below. ```mermaid theme={null} graph LR W[Workspaces] -->|have| P[Peers] W -->|have| S[Sessions] S -->|have| SM[Messages] P <-.->|many-to-many| S style W fill:#B6DBFF,stroke:#333,color:#000 style P fill:#B6DBFF,stroke:#333,color:#000 style S fill:#B6DBFF,stroke:#333,color:#000 style SM fill:#B6DBFF,stroke:#333,color:#000 ``` * A Workspace has Peers & Sessions * A Peer can be in multiple Sessions and can send Messages in a Session * A Session can have many Peers and stores Messages sent by its Peers ### Workspaces Workspaces are the top-level containers in Honcho. They provide complete isolation between different applications or environments, essentially serving as a namespace to keep different workloads separate. You might use separate workspaces for development, staging, and production environments, or to isolate different product lines. They also enable multi-tenant SaaS applications where each customer gets their own isolated workspace with complete data separation. Authentication is scoped to the workspace level, and configuration settings can be applied workspace-wide to control behavior across all peers and sessions within that workspace. *** ### Peers Peers are the most important entity in Honcho--everything revolves around building and maintaining their [*representations*](/docs/v3/documentation/core-concepts/representation). A peer represents any individual user, agent, or entity in a workspace. Treating humans and agents the same way lets you build arbitrary combinations for multi-agent or group chat scenarios. Each peer has a unique identifier within a workspace and is a container for reasoning across all their sessions. This cross-session context means conclusions drawn about a peer in one session can inform interactions in completely different sessions. Peers can be configured to control whether Honcho reasons about them. You can use peers for any entity that persists over time--individual users in chatbot applications, AI agents interacting with users or other agents, customer profiles in support systems, student profiles in educational platforms, or even NPCs in role-playing games. *** ### Sessions Sessions represent interaction threads or contexts between peers. A session can involve multiple peers and provides temporal boundaries for when a set of interactions starts and ends. This lets you scope context and memory to specific interactions while still maintaining longer-term peer representations that span sessions. Use sessions to scope things like support tickets, meeting transcripts, learning sessions, or conversations. You can also use single-peer sessions as a way to import external data--create a session with just one peer and structure emails, documents, or files as messages to enrich that peer's representation. Session-level configuration gives you fine-grained control over perspective-taking behavior. You can configure whether a peer should form representations of other peers in the session, and whether other peers should form representations of them. *** ### Messages Messages are the fundamental units of interaction within sessions. While they typically represent back-and-forth communication between peers, you can also use messages to ingest any information that provides context--emails, documents, files, user actions, system notifications, or rich media content. 
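To make this concrete, here is a minimal sketch of the import pattern described above, using the SDK calls that appear throughout these docs (`peer`, `session`, `add_peers`, `add_messages`); the workspace, peer, and session names are illustrative:

```python theme={null}
from honcho import Honcho

honcho = Honcho(workspace_id="my-app")  # illustrative workspace name

user = honcho.peer("user_abc123")

# A single-peer session used purely to import external data
imports = honcho.session("user_abc123-email-import")
imports.add_peers([user])

# External content becomes ordinary messages attributed to the peer,
# so it flows through the same background reasoning as conversation
imports.add_messages([
    user.message("Email to landlord: confirming a move-out date of June 30..."),
    user.message("Email to movers: requesting a quote for a cross-town move..."),
])
```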
Every message is attributed to a specific peer and ordered chronologically within its session. When messages are created, they trigger automatic background reasoning that updates peer representations. Messages support rich metadata and structured data through JSONB fields, making them flexible enough to capture whatever information matters for your use case. ## Data Flow Understanding how data moves through Honcho helps clarify the architecture. When you create messages, they're immediately written to PostgreSQL and reasoning tasks are added to background queues. Background workers then generate logical conclusions, summaries, and new insights that improve representations. These conclusions and insights get stored in vector collections for retrieval. This async approach ensures fast writes while still providing rich reasoning capabilities. When you need context from Honcho, you query through the "Chat" endpoint or "Get Context" endpoint. Honcho retrieves relevant conclusions from vector storage along with recent messages, then assembles everything into coherent context ready to inject into agent prompts. Honcho Architecture The diagram above shows how agents write messages to Honcho, which triggers reasoning that updates peer representations. Agents can then query representations to get additional context for their next response. Black arrows represent read/write of regular data (messages, storage), while red arrows represent read/write of reasoned-over data (logic, peer representations). ## Configuration & Extensibility Honcho is designed to be flexible. Settings cascade hierarchically from workspace to peer to session, so you can set defaults at the workspace level and override them for specific peers or sessions. Feature flags let you enable or disable reasoning modes, perspective tracking, and other capabilities. You can bring your own LLM provider--OpenAI, Anthropic, or custom endpoints--and metadata fields let you extend any primitive with custom JSON data. Batch operations let you create up to 100 messages in a single API call for efficient bulk ingestion. ## Design Principles Honcho's architecture follows a few core principles. Everything revolves around building representations of peers (peer-centric). Memory isn't just storage--it's continual learning (reasoning-first). Long-lived operations happen in the background so they don't block user interactions (async by default). The system works with any LLM provider (provider-agnostic) and is built for isolation and scalability from the ground up (multi-tenant). Users and agents are both represented as peers, which enables flexible scenarios you couldn't easily model with a traditional user-assistant paradigm (unified paradigm). ## Next Steps Sign up for the Honcho platform and start building Get started with your first integration Learn how Honcho reasons about messages to build memory Understand what peer representations are and how they work # Design Patterns Source: https://honcho.dev/docs/v3/documentation/core-concepts/design-patterns Design your workspace, peers, and sessions for common application patterns If you're using a coding agent (Claude Code, OpenCode, Cursor, etc.), the **`/honcho-integration` skill** walks you through these decisions interactively. It explores your codebase, interviews you about peers and sessions, and generates the integration code. The patterns below are the same ones the skill uses.
## Quick Reference | Decision | Recommendation | | --------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- | | How many workspaces? | One per application. Separate per-agent if you need hard data isolation. | | Who should be a peer? | Any entity you want Honcho to reason about — users, agents, NPCs, students, customers. | | How should I scope sessions? | Flexible -- per-conversation, per-channel, per-scene, etc. See [Session Design](#session-design) below. | | Should I set `observe_me: false`? | Yes, for any peer you don't need Honcho to build a representation of — typically assistants or bots with deterministic behavior. | | Do I need `observe_others`? | Only when different peers need distinct views of the same participant (e.g., games, multi-agent). Most apps can leave it at the default (false). | ## Workspace Design Workspaces are the top-level container. Everything inside a workspace (peers, sessions, messages, and all reasoning) is fully isolated from other workspaces. **One workspace per application** is the most common pattern. Use separate workspaces when you need hard isolation: | Pattern | When to use | | ---------------- | ------------------------------------------------------------------------- | | Single workspace | Most applications. One product, one environment. | | Per-tenant | Multi-tenant SaaS where each customer's data must be completely isolated. | If you are using the SDK, it will create a workspace called `default` if no name is specified for `workspace_id`. *** ## Peer Design A peer is any entity that participates in a session. Observation settings control which ones Honcho reasons about. **What makes a good peer?** * It participates in sessions (a user, an agent, a character, an NPC) * It persists across sessions * It changes over time (preferences shift, knowledge grows), or it produces messages you want Honcho to see **Naming conventions** Give peers stable, unique identifiers scoped to your application: ```python Python theme={null} # Prefix with the source platform for multi-channel apps peer = honcho.peer("discord_491827364") peer = honcho.peer("slack_U04ABCDEF") # Use your own user IDs for backend integrations peer = honcho.peer("user_abc123") # Use descriptive names for agents/assistants peer = honcho.peer("assistant") peer = honcho.peer("dungeon-master") ``` ```typescript TypeScript theme={null} // Prefix with the source platform for multi-channel apps const peer = await honcho.peer("discord_491827364"); const peer = await honcho.peer("slack_U04ABCDEF"); // Use your own user IDs for backend integrations const peer = await honcho.peer("user_abc123"); // Use descriptive names for agents/assistants const peer = await honcho.peer("assistant"); const peer = await honcho.peer("dungeon-master"); ``` If your Peer represents an entity that may go by multiple different names, such as nicknames, indicate that in the Peer Card: ```python Python theme={null} peer = honcho.peer("user_abc123") peer.set_card([ "Name: Alice. Also known as 'Ali' and 'A'.", "College student, prefers casual tone.", ]) ``` ```typescript TypeScript theme={null} const peer = await honcho.peer("user_abc123"); await peer.setCard([ "Name: Alice. Also known as 'Ali' and 'A'.", "College student, prefers casual tone.", ]); ``` **When to disable reasoning** Not every peer needs a representation. Set `observe_me: false` on peers that behave deterministically.
```python Python theme={null} from honcho.api_types import PeerConfig # The assistant doesn't need a representation assistant = honcho.peer("assistant", configuration=PeerConfig(observe_me=False)) # The user does--this is who you want to understand user = honcho.peer("user-123", configuration=PeerConfig(observe_me=True)) ``` ```typescript TypeScript theme={null} const assistant = await honcho.peer("assistant", { configuration: { observeMe: false } }); const user = await honcho.peer("user-123", { configuration: { observeMe: true } }); ``` *** ## Session Design Sessions define the temporal boundaries of an interaction. How you scope sessions directly affects how summaries are generated and how context is retrieved. **Common session patterns** | Pattern | Session scoped to | Example | | ---------------- | ---------------------------- | ----------------------------------------------- | | Per-conversation | Each new chat thread | ChatGPT-style UI where each thread is a session | | Per-channel | A persistent channel or room | Discord channel, Slack thread | | Per-interaction | A bounded task or encounter | A support ticket, a game encounter | | Per-import | A batch of external data | Importing emails or documents for a single peer | **When to create new sessions vs reuse** * **New session** when the context resets (new conversation, new day, new topic) * **Reuse session** when context should accumulate (ongoing channel, persistent thread) *** ## Application Patterns ### AI Companions An assistant that remembers the user across sessions and platforms. The [Honcho plugin for OpenClaw](/docs/v3/guides/integrations/openclaw) is a production example--one assistant with memory across WhatsApp, Telegram, Discord, and Slack. ```python Python theme={null} from honcho import Honcho from honcho.api_types import PeerConfig, SessionPeerConfig honcho = Honcho(workspace_id="my-companion-app") owner = honcho.peer("owner") agent = honcho.peer("agent-main", configuration=PeerConfig(observe_me=False)) # Session key = thread + platform → separate histories, shared user memory session = honcho.session("general-discord") session.add_peers([ (owner, SessionPeerConfig(observe_me=True, observe_others=False)), (agent, SessionPeerConfig(observe_me=True, observe_others=True)), ]) session.add_messages([ owner.message("I've been stressed about the move to Portland next month"), agent.message("Moving is a big deal. What's weighing on you the most?"), owner.message("Honestly just leaving my friend group behind"), ]) # Query from any session or platform response = owner.chat("What's going on in this user's life right now?") ``` ```typescript TypeScript theme={null} const honcho = new Honcho({ workspaceId: "my-companion-app" }); const owner = await honcho.peer("owner"); const agent = await honcho.peer("agent-main", { configuration: { observeMe: false } }); const session = await honcho.session("general-discord"); await session.addPeers([ ["owner", { observeMe: true, observeOthers: false }], ["agent-main", { observeMe: true, observeOthers: true }], ]); await session.addMessages([ owner.message("I've been stressed about the move to Portland next month"), agent.message("Moving is a big deal. 
What's weighing on you the most?"), owner.message("Honestly just leaving my friend group behind"), ]); const response = await owner.chat("What's going on in this user's life right now?"); ``` **Key decisions (from the [OpenClaw plugin](/docs/v3/guides/integrations/openclaw)):** * **Session key = thread + platform** — `general-discord` and `general-telegram` are separate sessions but share a single owner representation, so Honcho learns from every channel * **Dynamic agent peers** — each agent gets its own peer (`agent-{id}`), resolved via a workspace-level map. Renaming an agent recovers the peer by metadata lookup * **Subagent hierarchy** — when a primary agent spawns a subagent, the parent joins the child's session as a silent observer (`observe_me: false, observe_others: true`), giving Honcho visibility into the full agent tree * **Asymmetric observation** — both owner and agent are observed, but with different scopes: owner has `observe_others: false` (default view), while the agent has `observe_others: true` so it can build its own representation of the owner. Subagents get lighter context (peer card only, no session summary) See the [OpenClaw integration guide](/docs/v3/guides/integrations/openclaw) for the full plugin setup. *** ### Coding Agents Coding agents survive terminal restarts, editor switches, and project hops. The [Honcho plugin for Claude Code](/docs/v3/guides/integrations/claude-code) is a production example of this pattern. ```python Python theme={null} from honcho import Honcho from honcho.api_types import PeerConfig honcho = Honcho(workspace_id="claude_code") # Developer is observed; agent is not developer = honcho.peer("user") agent = honcho.peer("claude", configuration=PeerConfig(observe_me=False)) # Session per project directory -- stable across restarts session = honcho.session("user-honcho-repo") session.add_peers([developer, agent]) session.add_messages([ developer.message("refactor the auth module to use dependency injection"), agent.message("I'll extract the auth dependencies into a provider pattern..."), developer.message("actually let's keep it simpler, just pass the config directly"), ]) # In a future session, query what Honcho learned context = developer.chat("What are this developer's preferences for code architecture?") # Honcho knows: prefers simplicity, reverses decisions when simpler approach exists ``` ```typescript TypeScript theme={null} const honcho = new Honcho({ workspaceId: "claude_code" }); const developer = await honcho.peer("user"); const agent = await honcho.peer("claude", { configuration: { observeMe: false } }); const session = await honcho.session("user-honcho-repo"); await session.addPeers([developer, agent]); await session.addMessages([ developer.message("refactor the auth module to use dependency injection"), agent.message("I'll extract the auth dependencies into a provider pattern..."), developer.message("actually let's keep it simpler, just pass the config directly"), ]); const context = await developer.chat("What are this developer's preferences for code architecture?"); ``` **Key decisions (from the Claude Code plugin):** * **One workspace per tool** -- Claude Code and Cursor each get their own workspace, with optional cross-linking for read access * **Asymmetric peers** -- developer is observed (memory formation), agent is not observed but still stores messages so Honcho sees both sides * **Session-per-directory** by default -- each project accumulates its own memory. 
Prefix with peer name (`user-honcho-repo`) so multiple developers on the same workspace don't collide. Alternative strategies: `git-branch` (session switches on branch change) or `chat-instance` (clean slate each time) * **Filter what you store** -- user messages go in real-time; agent messages are filtered to skip trivial tool output and keep substantive explanations * **Import external data** with single-peer sessions to ingest READMEs, architecture docs, or commit history into a developer's representation See the [Claude Code integration guide](/docs/v3/guides/integrations/claude-code) for the full plugin setup. *** ### Games Games introduce multi-peer scenarios where **information asymmetry matters**. An NPC should only know what it has witnessed, not the full game state. ```python Python theme={null} from honcho import Honcho from honcho.api_types import SessionPeerConfig honcho = Honcho(workspace_id="my-rpg") # Every character is a peer player = honcho.peer("player-one") merchant = honcho.peer("merchant-grim") thief = honcho.peer("thief-shadow") # Scene 1: Player talks to the merchant tavern = honcho.session("tavern-scene") tavern.add_peers([player, merchant]) # Enable the merchant to build its own representation of the player tavern.set_peer_configuration(merchant, SessionPeerConfig(observe_others=True)) tavern.add_messages([ player.message("I'm looking for a rare gemstone. Money is no object."), merchant.message("I may know of one... but it won't come cheap."), ]) # Scene 2: Player talks to the thief (merchant isn't here) alley = honcho.session("dark-alley") alley.add_peers([player, thief]) alley.set_peer_configuration(thief, SessionPeerConfig(observe_others=True)) alley.add_messages([ player.message("I need that gemstone stolen from the merchant. Quietly."), thief.message("Consider it done. Half up front."), ]) # The merchant's view of the player: wealthy buyer seeking a gemstone merchant_view = merchant.chat("What do I know about this player?", target="player-one") # The thief's view: someone willing to steal from the merchant thief_view = thief.chat("What do I know about this player?", target="player-one") # Honcho's global view: knows both sides of the story full_view = player.chat("What is this player up to?") ``` ```typescript TypeScript theme={null} const honcho = new Honcho({ workspaceId: "my-rpg" }); const player = await honcho.peer("player-one"); const merchant = await honcho.peer("merchant-grim"); const thief = await honcho.peer("thief-shadow"); const tavern = await honcho.session("tavern-scene"); await tavern.addPeers([player, merchant]); await tavern.setPeerConfiguration(merchant, { observeOthers: true }); await tavern.addMessages([ player.message("I'm looking for a rare gemstone. Money is no object."), merchant.message("I may know of one... but it won't come cheap."), ]); const alley = await honcho.session("dark-alley"); await alley.addPeers([player, thief]); await alley.setPeerConfiguration(thief, { observeOthers: true }); await alley.addMessages([ player.message("I need that gemstone stolen from the merchant. Quietly."), thief.message("Consider it done. 
Half up front."), ]); const merchantView = await merchant.chat("What do I know about this player?", { target: "player-one" }); const thiefView = await thief.chat("What do I know about this player?", { target: "player-one" }); const fullView = await player.chat("What is this player up to?"); ``` **Key decisions:** * Every character (player, NPC) is a peer * `observe_others: true` lets NPCs build their own representations of the player based only on what they've witnessed * Session-per-scene or session-per-encounter so context scopes to specific interactions * Use `target` when querying to get a specific NPC's perspective rather than Honcho's omniscient view * See [Representation Scopes](/docs/v3/documentation/features/advanced/representation-scopes) for the full details *** ## Common Mistakes * **Leaving `observe_me` on for assistants** -- Wastes reasoning compute on a peer you control. Deterministic behavior doesn't need to be modeled. * **Not storing messages** -- Honcho reasons about messages asynchronously. If you don't call `add_messages()`, there's nothing to reason about — no messages means no memory. See [Storing Data](/docs/v3/documentation/features/storing-data) for details. * **Creating a new workspace per user** -- Use peers within a single workspace instead. Workspaces are for isolation between applications, not between users. * **Too many tiny sessions** -- Summaries and `session.context()` are scoped to a single session. If you split a continuous conversation across many sessions, context is fragmented and each session is too short to summarize. Reuse a session when context should flow continuously. * **Blocking on processing** -- Messages are processed asynchronously in the background. Don't poll or wait for reasoning to complete before continuing your application flow. ## Next Steps Retrieve formatted context from sessions for your LLM Query Honcho about your peers with natural language Fine-tune what gets reasoned about and how Directional representations for multi-peer scenarios # Honcho Reasoning Source: https://honcho.dev/docs/v3/documentation/core-concepts/reasoning Honcho is a memory system that *reasons*. You can read more on the philosophy behind the approach [here](https://blog.plasticlabs.ai/blog/Memory-as-Reasoning), but practically speaking, the system runs inference on data in the background to produce the highest quality context for simulating statefulness. This document explains why reasoning is necessary and how Honcho implements it. If you'd like to experience this methodology first-hand, try out [Honcho Chat](https://honcho.chat)--an interface to your personal memory. Read more [here](https://blog.plasticlabs.ai/blog/Introducing-Honcho-Chat)! ## Why Reasoning? Traditional RAG systems treat memory as static storage--they retrieve what was explicitly said when semantically similar queries appear. Other solutions decide for you what's important to store, whether through structured facts in databases or predefined knowledge graphs. Honcho takes a different approach: we extract all latent information by reasoning about everything, so it's there when you need it. Our job is to produce the most robust reasoning possible--it's your job as a developer to decide what's relevant for your use case. We extract this latent information through formal logic. Formal logical reasoning is AI-native--LLMs perform the rigorous, compute-intensive thinking that humans struggle with, instantly and consistently.
This unlocks insights that are only accessible by *rigorously thinking* about your data, generating new understanding that goes beyond simple recall. ## Formal Logic Framework Honcho's memory system is powered by custom models trained to perform formal logical reasoning. The system extracts what was explicitly stated, draws certain conclusions from those, identifies patterns across multiple conclusions, and infers the simplest explanations for behavior. Why formal logic specifically? LLMs are uniquely well-suited for this reasoning task--it's well-represented in the pretraining data. LLMs can maintain consistent reasoning across thousands of conclusions without cognitive fatigue or belief resistance--which is extremely hard for humans to do reliably. The outputs are also composable, meaning logical conclusions can be stored, retrieved, and combined programmatically for dynamic context assembly. Here's an example of a data structure the reasoning models generate: ```json theme={null} { "explicit": [ { "content": "premise 1" }, ... { "content": "premise n" } ], "deductive": [ { "premises": [ "premise 1", ... "premise n" ], "conclusion": "conclusion 1" }, ... ] } ``` The explicit reasoning model ([Neuromancer XR](https://blog.plasticlabs.ai/research/Introducing-Neuromancer-XR)) outputs its "thinking" followed by things that were explicitly stated, which serve as premises to scaffold deductive conclusions. It's on top of this reasoning foundation that further reasoning is scaffolded. Currently that includes peer cards (key biographical information about the peer), consolidation (identifying redundant or contradictory information), induction (pattern recognition across multiple messages), and abduction (inferring the simplest explanations for observed behavior). The reasoning that Honcho does is something we're constantly iterating on and improving. Our goal is simple--provide the richest, most relevant context in the fastest, cheapest way possible in order to simulate statefulness in whatever setting you need. ## How It Works When you write messages to Honcho, they're stored immediately without blocking and enqueued for background processing. Reasoning asynchronously ensures fast writes while still providing rich reasoning capabilities, and session-based queues maintain chronological consistency so reasoning tasks affecting the same peer representation are always processed in order. The reasoning outputs--conclusions, summaries, peer cards--are stored as part of peer representations, indexed in vector collections for retrieval. Diagram for reasoning in Honcho The diagram above shows how agents write messages to Honcho, which triggers reasoning that updates peer representations. Agents can then query representations to get additional context for their next response. ### Token Batching Rather than running inference on every individual message, Honcho accumulates messages in the queue and processes them as a batch once the total token count of pending messages for a given peer representation crosses a threshold--roughly **1,000 tokens** at the current batch size. This keeps ingestion costs down, since Honcho charges based on reasoning passes, and ensures each pass has a meaningful amount of context to work with. At \~1,000 tokens the batch comfortably fits in the context window of any modern LLM, so no content is lost.
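To illustrate the batching decision, here is a rough sketch (not Honcho's internal implementation; the threshold value is assumed from the figure above):

```python theme={null}
# Illustrative sketch only -- not Honcho's actual queue code.
BATCH_THRESHOLD_TOKENS = 1_000  # assumed from the description above

def should_run_reasoning_pass(pending_messages: list[str], count_tokens) -> bool:
    """Return True once the pending batch for a peer representation
    has accumulated enough tokens to justify a reasoning pass."""
    total = sum(count_tokens(m) for m in pending_messages)
    return total >= BATCH_THRESHOLD_TOKENS
```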
If a user sends several short messages in a row (e.g., "yes", "ok", "sounds good"), those messages sit in the queue until enough content has accumulated. Once the threshold is met, the full batch is processed together in a single reasoning call. This batching only applies to **representation** tasks (conclusion extraction). Summary and dream tasks have their own scheduling logic and are not subject to the token threshold. ## Balances & Design Choices Off-the-shelf LLMs can perform formal logical reasoning, but they aren't optimized for it. Honcho uses custom models trained specifically for logical rigor (following formal reasoning rules rather than plausible-sounding text), structured output (consistent JSON schema with premises and conclusions), and efficiency (smaller, faster models tuned for this specific task). This allows Honcho to reason more reliably and at lower cost than general-purpose frontier LLMs. The approach balances quality with practical constraints. Custom models are smaller and cheaper to run, scaffolded conclusions are more token-efficient than raw conversation history, and we batch where appropriate to optimize update frequency. Honcho's reasoning capabilities are actively being improved. Current areas of development include enhanced inductive and abductive reasoning, multi-hop and temporal reasoning, and expanded file types and modalities. The system is designed to be extensible--new reasoning capabilities can be added without breaking existing functionality. If you find that the data you're uploading to Honcho isn't being reasoned over to your liking, we'd love to improve it for you and ingest your data for free--reach out via [Discord](https://discord.gg/honcho) or [email](mailto:support@plasticlabs.ai)! ## Next Steps Without exhaustive reasoning, you're stuck with surface-level retrieval or someone else's opinion on what matters. You can't effectively simulate statefulness if you're not reasoning about everything in the present--coherence plummets, trust falls, and users churn. Don't leave key information on the table. Use Honcho to give your agents the context they need to reconstruct the past as comprehensively as possible and maintain coherence--for your use case. Sign up for the Honcho platform and start building Get started with your first integration See how reasoning fits into Honcho's overall architecture Learn how reasoning produces peer representations # Peer Representations Source: https://honcho.dev/docs/v3/documentation/core-concepts/representation A representation is the collection of reasoning Honcho has done about a peer over time. It's the continual learning Honcho accumulates about a peer from every message that's been written. Representations evolve dynamically as new messages come in, with Honcho reasoning about them in the background. When you write messages to Honcho, the reasoning models extract premises, draw conclusions, and scaffold further conclusions on top of them. All of that reasoning gets stored as the peer's representation. Think of it as Honcho's understanding of who that peer is, what they care about, and how they behave, built through formal logic rather than simple storage. ## What's in a Representation? A peer representation is made up of several types of artifacts that Honcho generates through [*reasoning*](/docs/v3/documentation/core-concepts/reasoning): **Conclusions** are insights derived through formal logic. Deductive conclusions are things Honcho can be certain about based on extracted premises.
Inductive conclusions identify patterns across multiple messages. Abductive conclusions infer the simplest explanations for observed behavior. For example, if a user frequently mentions work deadlines and rarely mentions hobbies, Honcho might inductively conclude they're time-constrained or career-focused. **Summaries** capture the essence of sessions. Short summaries are generated every 20 messages by default, and long summaries every 60 messages. These help compress conversation history into dense, queryable context. **Peer cards** contain key biographical information. They essentially cache the most basic information about a peer (name, occupation, interests) to ensure the model never loses its grounding. These enable continuous improvement. Each new message refines conclusions, updates summaries, and keeps peer cards current—building a more accurate representation over time. ## Observation & Perspective-Taking Honcho can build different representations based on what each peer observes. This enables sophisticated multi-peer scenarios where understanding is relative to what was actually witnessed. There are two observation modes controlled by [configuration](/docs/v3/documentation/features/advanced/configuration): **Honcho observing peers** (`observe_me`): When enabled (default), Honcho forms a representation of the peer based on all messages they've sent across all sessions. This is Honcho's understanding of that peer, built from everything they've said and done in your system. Set `observe_me: false` if you don't want Honcho to reason about that peer at all. **Peers observing others** (`observe_others`): When enabled at the session level, a peer will form representations of other peers in that session based only on messages they've observed. If Alice and Bob are in a session together and Alice has `observe_others: true`, Alice will form a representation of Bob based solely on what Bob said in sessions Alice participated in. Alice's representation of Bob will be completely different from Charlie's representation of Bob if they've observed different interactions. In the diagram below, assume `observe_me` isn't turned off (again, default behavior) and `observe_others` is turned on for both peers in a session that contains the peers Alice and Bob. The shared session that Alice and Bob have informs their respective representations of each other. Alice has a small set of conclusions that pertain to Bob, and Bob has a small set of conclusions that pertain to Alice. Honcho can observe the totality of each peer's interactions, forming representations of the peers themselves, and enable peers to store conclusions about peers they interact with based only on what they witness in shared sessions. Why would you want peers observing others? So you can simulate stateful *perspectives*. If Bob participates with Alice in sessions 1 and 2, while Charlie participates with Alice in session 3, Bob's representation of Alice will be built from sessions 1 and 2, while Charlie's representation will only include what happened in session 3. Bob can reference shared history, inside jokes, or past conflicts that Charlie knows nothing about. Without perspective-based segmentation, all agents are omniscient--the simulation breaks down, trust falls apart, and users churn. ## Why Representations Work Statefulness is simulated through reconstruction of the past. Traditional systems reconstruct by retrieving stored facts, querying semantically similar items, and hoping the LLM does the rest. 
Honcho reconstructs through reasoning about the past exhaustively, leaving much less to chance. Reasoning can surface insights never explicitly stated. If a user mentions they're saving for a house in one session and complains about subscription costs in another, Honcho can conclude they're budget-conscious without anyone saying it. Reasoning handles contradictions gracefully--when new information conflicts with old conclusions, it reconciles them instead of just accumulating more data. And reasoning enables prediction under uncertainty, inferring what's likely true based on patterns even when data is incomplete. Humans reconstruct the past from imperfect recollections, then act on those reconstructions as if they were complete. Representations enable agents to do the same with far greater fidelity. Reasoning produces an exhaustive, explicit record of what can be concluded about a peer--giving agents complete recollection that humans can only pretend to have. That's what makes truly stateful agents possible. ## Next Steps Sign up for the Honcho platform and start building See representations in action with a working example Understand how representations fit into Honcho's architecture Chat with Honcho about your users # Dreaming Source: https://honcho.dev/docs/v3/documentation/features/advanced/dreaming How Honcho continuously improves memory through autonomous consolidation Dreaming is an experimental feature under active development. The scheduling heuristics, specialist behavior, and configuration options described here are subject to change as we iterate on the approach. Honcho's reasoning system extracts conclusions from every message as it arrives. Over time, this produces a large body of knowledge--some of which is redundant, outdated, or missing higher-order patterns that only become visible across many interactions. **Dreaming** is the process that addresses this: an autonomous, periodic consolidation cycle that refines the peer representation by reasoning over existing conclusions. Think of it like sleep for a memory system. The "waking" reasoning process captures what happened. The dreaming process reflects on what it all means. ## What Dreaming Does A dream cycle runs two specialized agents in sequence: ### 1. Deduction The deduction specialist performs logical inference over existing conclusions. It autonomously explores the observation space and looks for: * **Knowledge updates**: When the same fact has changed over time (e.g., "works at Company A" followed later by "works at Company B"), it deletes the outdated conclusion and creates a new one reflecting the current state. * **Logical implications**: Conclusions that follow necessarily from existing premises but weren't captured during real-time processing. * **Contradictions**: Conflicting conclusions that need resolution. * **Peer card updates**: Key biographical facts (name, location, occupation) that should be recorded on the peer card for quick access. ### 2. Induction The induction specialist identifies patterns across multiple conclusions. It looks for: * **Behavioral tendencies**: Recurring behaviors observed across different contexts (e.g., "tends to reschedule meetings when stressed"). * **Preferences**: Consistent choices that indicate underlying preferences. * **Personality traits**: Stable characteristics inferred from multiple data points. * **Correlations**: Relationships between different aspects of behavior. 
Inductive conclusions require evidence from at least two source conclusions--patterns need more than a single data point. Each pattern is assigned a confidence level based on the number of supporting observations. ## When Dreams Are Scheduled Dreams are triggered automatically based on a set of heuristics designed to balance freshness with efficiency: ### Conditions All of the following must be true for a dream to be scheduled: 1. **Document threshold**: At least 50 new conclusions have been created since the last dream for that peer representation. 2. **Minimum cooldown**: At least 8 hours have passed since the last dream for that peer representation. 3. **Dreaming is enabled**: The workspace and/or session configuration has `dream.enabled` set to `true` (the default). ### Idle timeout When the threshold conditions are met, a dream is **not** immediately executed. Instead, a timer is set (default: 60 minutes) that waits for user inactivity. If new messages arrive during the waiting period, the pending dream is cancelled and the timer resets. This prevents dreaming while the user is actively interacting, ensuring the system consolidates only after the conversation has settled. Once the idle timeout expires without interruption, the dream task is enqueued for processing. ### Manual scheduling You can also trigger a dream explicitly via the API: ```python Python theme={null} honcho.workspaces.schedule_dream( observer="user-peer-name", observed="user-peer-name", ) ``` ```typescript TypeScript theme={null} await honcho.workspaces.scheduleDream({ observer: "user-peer-name", observed: "user-peer-name", }); ``` Manual dreams bypass the threshold and cooldown checks, but are still subject to deduplication--if a dream is already pending or in progress for the same peer representation, the request is a no-op. ## Scope Dreams operate at the **peer representation** level--specifically, a (workspace, observer, observed) tuple. This means: * A dream consolidates conclusions for a specific observer's view of a specific observed peer. * In the common case of self-observation (where the observer and observed are the same peer), the dream consolidates that peer's own representation. * Dreams do not span across workspaces or across different peer pairs. ## Deduplication and Safety The system includes several safeguards to prevent wasted work: * **No concurrent dreams**: If a dream is already being processed for a given peer representation, a new one will not be enqueued. * **No duplicate pending dreams**: If a dream is already queued and waiting, a second enqueue request is skipped. * **Cancellation on new activity**: When new messages arrive for a peer, any pending (not yet started) dream for that peer is cancelled. This ensures the dream always runs on the most up-to-date set of conclusions. ## Configuration Dreams can be enabled or disabled at the workspace or session level: ```python Python theme={null} # Disable dreams for a workspace honcho.set_configuration({ "dream": {"enabled": False} }) # Disable dreams for a specific session session = honcho.session("my-session", config={ "dream": {"enabled": False} }) ``` ```typescript TypeScript theme={null} // Disable dreams for a workspace await honcho.setConfiguration({ dream: { enabled: false } }); // Disable dreams for a specific session const session = await honcho.session("my-session", { config: { dream: { enabled: false } } }); ``` Dreaming is automatically disabled if reasoning itself is disabled, since there would be no conclusions to consolidate. 
Learn how Honcho reasons over messages to produce conclusions Full configuration reference for reasoning, summaries, and dreams Monitor dream tasks alongside other background processing API reference for manually triggering dreams # File Uploads Source: https://honcho.dev/docs/v3/documentation/features/advanced/file-uploads Upload PDFs, text files, and JSON documents to create messages in Honcho Honcho's file upload feature allows you to convert documents into messages automatically. Upload PDFs, text files, or JSON documents, and Honcho will extract the text content, split it into appropriately sized chunks, and create messages that become part of your peer's representation or session context. This feature is perfect for ingesting documents, reports, research papers, or any text-based content that you want your AI agents to understand and reference. ## How It Works When you upload a file, Honcho: 1. **Extracts text** from the file using specialized processors based on file type 2. **Creates messages** with the extracted content split into chunks that fit within message limits (messages are limited to 50,000 characters) 3. **Queues processing** for background analysis and insight derivation like any other message The file content becomes part of the peer's representation, making it available for natural language queries and context retrieval. ## Supported File Types Honcho currently supports the following file types with more to come: * **PDF files** (`application/pdf`) - Text extraction with page numbers * **Text files** (`text/*`) - Plain text, markdown, code files, etc. * **JSON files** (`application/json`) - Structured data converted to readable format Files are processed in memory and not stored on disk. Only the extracted text content is preserved in Honcho's message system. ## Basic Usage ```python Python theme={null} from honcho import Honcho # Initialize client honcho = Honcho() # Create session and peer session = honcho.session("research-session") user = honcho.peer("researcher") # Upload a PDF to a session with open("research_paper.pdf", "rb") as file: messages = session.upload_file( file=file, peer_id=user.id, ) print(f"Created {len(messages)} messages from the PDF") ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; import fs from "fs"; (async () => { // Initialize client const honcho = new Honcho({}); // Create session and peer const session = await honcho.session("research-session"); const user = await honcho.peer("researcher"); // Upload a PDF to a session const fileStream = fs.createReadStream("research_paper.pdf"); const messages = await session.uploadFile(fileStream, user.id); console.log(`Created ${messages.length} messages from the PDF`); })(); ``` ## Upload Parameters The upload methods accept the following parameters: | Parameter | Type | Required | Description | | --------- | ------ | -------- | ------------------------------------ | | `file` | File | Yes | File to upload | | `peer_id` | String | Yes | ID of the peer creating the messages | ## File Processing Details ### Text Extraction **PDF Files**: Text is extracted page by page with page numbers preserved: ``` [Page 1] Introduction This document provides... [Page 2] Methodology Our approach involves... ``` **Text Files**: Content is decoded using UTF-8, UTF-16, or Latin-1 encoding as needed. **JSON Files**: Structured data is converted to string format. ### Chunking Strategy Large files are automatically split into chunks of \~49,500 characters. 
The system seeks to break at natural boundaries if present: 1. Paragraph breaks (`\n\n`) 2. Line breaks (`\n`) 3. Sentence endings (`. `) 4. Word boundaries (` `) Each chunk becomes a separate message, maintaining the original document structure. ## Querying Uploaded Content Once files are uploaded, you can query the content using Honcho's natural language interface: ```python Python theme={null} # Query what was learned from the uploaded documents response = user.chat("What are the key findings from the research papers I uploaded?") print(response) # Ask about specific documents response = user.chat("What does the quarterly report say about revenue growth?") print(response) # Get context from the uploaded documents for LLM integration # Define the assistant peer used to format the context assistant = honcho.peer("assistant") context = session.context(tokens=3000) messages = context.to_openai(assistant=assistant) ``` ```typescript TypeScript theme={null} (async () => { // Query what was learned from the uploaded documents const response = await user.chat("What are the key findings from the research papers I uploaded?"); console.log(response); // Ask about specific documents const response2 = await user.chat("What does the quarterly report say about revenue growth?"); console.log(response2); // Get context from the uploaded documents for LLM integration // Define the assistant peer used to format the context const assistant = await honcho.peer("assistant"); const context = await session.context({ tokens: 3000 }); const messages = context.toOpenAI(assistant); })(); ``` ## Error Handling ### Unsupported File Types Files with unsupported content types will raise an exception: ```python theme={null} try: messages = session.upload_file( file=open("image.jpg", "rb"), peer_id=user.id ) except Exception as e: print(f"Upload failed: {e}") # Error: "Could not process file image.jpg: Unsupported file type: image/jpeg" ``` ### Missing Required Fields Session uploads require a `peer_id` parameter: ```python theme={null} # This will fail for session uploads try: messages = session.upload_file(file=file) # Missing peer_id except ValueError as e: print(f"Validation error: {e}") ``` ## Complete Example: Document Analysis Assistant Here's a complete example of building a document analysis assistant: ```python Python theme={null} from honcho import Honcho # Initialize honcho = Honcho() session = honcho.session("document-analysis") user = honcho.peer("analyst") assistant = honcho.peer("analysis-bot") def upload_document(file_path, description): """Upload a document and add it to the session""" with open(file_path, "rb") as file: messages = session.upload_file( file=file, peer_id=user.id, ) return messages def analyze_documents(): """Get AI analysis of uploaded documents""" context = session.context(tokens=4000) messages = context.to_openai(assistant=assistant) # Add analysis request messages.append({ "role": "user", "content": "Please analyze all the documents I've uploaded and provide a comprehensive summary of the key findings, trends, and recommendations."
## Complete Example: Document Analysis Assistant

Here's a complete example of building a document analysis assistant:

```python Python theme={null}
from honcho import Honcho

# Initialize
honcho = Honcho()
session = honcho.session("document-analysis")
user = honcho.peer("analyst")
assistant = honcho.peer("analysis-bot")

def upload_document(file_path, description):
    """Upload a document and add it to the session"""
    # description is for local bookkeeping only; it isn't sent to Honcho
    with open(file_path, "rb") as file:
        messages = session.upload_file(
            file=file,
            peer_id=user.id,
        )
    return messages

def analyze_documents():
    """Get AI analysis of uploaded documents"""
    context = session.context(tokens=4000)
    messages = context.to_openai(assistant=assistant)

    # Add analysis request
    messages.append({
        "role": "user",
        "content": "Please analyze all the documents I've uploaded and provide a comprehensive summary of the key findings, trends, and recommendations."
    })

    # Call OpenAI (or your preferred LLM)
    # response = openai.chat.completions.create(model="gpt-4", messages=messages)
    # return response.choices[0].message.content

    return "Analysis would be generated here"

# Upload multiple documents
documents = [
    ("quarterly_report.pdf", "Q3 2024 Quarterly Financial Report"),
    ("market_research.pdf", "Market Analysis and Competitive Landscape"),
    ("product_roadmap.pdf", "Product Development Roadmap 2024-2025")
]

for file_path, description in documents:
    messages = upload_document(file_path, description)
    print(f"Uploaded {file_path}: {len(messages)} messages created")

# Get AI analysis
analysis = analyze_documents()
print("Document Analysis:", analysis)
```

```typescript TypeScript theme={null}
import { Honcho } from "@honcho-ai/sdk";
import fs from "fs";

(async () => {
  // Initialize
  const honcho = new Honcho({});
  const session = await honcho.session("document-analysis");
  const user = await honcho.peer("analyst");
  const assistant = await honcho.peer("analysis-bot");

  async function uploadDocument(filePath: string, description: string) {
    // description is for local bookkeeping only; it isn't sent to Honcho
    const fileStream = fs.createReadStream(filePath);
    const messages = await session.uploadFile(fileStream, user.id);
    return messages;
  }

  async function analyzeDocuments() {
    const context = await session.context({ tokens: 4000 });
    const messages = context.toOpenAI(assistant);

    // Add analysis request
    messages.push({
      role: "user",
      content: "Please analyze all the documents I've uploaded and provide a comprehensive summary of the key findings, trends, and recommendations."
    });

    // Call OpenAI (or your preferred LLM)
    // const response = await openai.chat.completions.create({ model: "gpt-4", messages });
    // return response.choices[0].message.content;

    return "Analysis would be generated here";
  }

  // Upload multiple documents
  const documents = [
    ["quarterly_report.pdf", "Q3 2024 Quarterly Financial Report"],
    ["market_research.pdf", "Market Analysis and Competitive Landscape"],
    ["product_roadmap.pdf", "Product Development Roadmap 2024-2025"]
  ];

  for (const [filePath, description] of documents) {
    const messages = await uploadDocument(filePath, description);
    console.log(`Uploaded ${filePath}: ${messages.length} messages created`);
  }

  // Get AI analysis
  const analysis = await analyzeDocuments();
  console.log("Document Analysis:", analysis);
})();
```

## Best Practices

* **Always wrap uploads in try-catch blocks** for robust error handling
* **Validate file types** before upload to avoid processing errors
* **Handle large files gracefully** with progress indicators
* **Implement retry logic** for network failures

# Advanced Features

Source: https://honcho.dev/docs/v3/documentation/features/advanced/overview

Advanced configuration and monitoring options for Honcho

Advanced features give you fine-grained control over Honcho's behavior and implementation.
## Reasoning & Memory * [Configuration](/docs/v3/documentation/features/advanced/reasoning-configuration) - Configure reasoning models and behavior * [Summarizer](/docs/v3/documentation/features/advanced/summarizer) - Automatic session summarization * [Peer Card](/docs/v3/documentation/features/advanced/peer-card) - Quick-reference profile of stable biographical facts about a peer * [Representation Scopes](/docs/v3/documentation/features/advanced/representation-scopes) - Directional representations for multi-peer scenarios * [Dreaming](/docs/v3/documentation/features/advanced/dreaming) - Autonomous memory consolidation and self-improvement * [Queue Status](/docs/v3/documentation/features/advanced/queue-status) - Monitor background processing and reasoning tasks ## Querying & Data * [Search](/docs/v3/documentation/features/advanced/search) - Search across peers, sessions, and messages * [Filters](/docs/v3/documentation/features/advanced/using-filters) - Filter queries with advanced parameters * [Streaming Responses](/docs/v3/documentation/features/advanced/streaming-response) - Stream dialectic responses in real-time * [File Uploads](/docs/v3/documentation/features/advanced/file-uploads) - Ingest files into peer memory # Peer Card Source: https://honcho.dev/docs/v3/documentation/features/advanced/peer-card A quick-reference profile of stable biographical facts about a peer A **peer card** is a list of stable, biographical facts about a peer--name, occupation, preferences, standing instructions--that acts as a quick-reference profile. While the full [representation](/docs/v3/documentation/core-concepts/representation) contains all of Honcho's reasoning (conclusions, summaries, semantic search results), the peer card captures the grounding facts that should never be forgotten. Think of it as the front of a contact card: the information an agent needs at a glance to know who it's talking to. ## What Goes in a Peer Card Peer cards are designed for **durable, biographical information**--things that remain true across sessions and contexts. Each fact is stored as a single string in a list. | Category | Examples | | ------------- | -------------------------------------------------------- | | Identity | `"Name: Alice"`, `"Age: 28"`, `"Location: Portland, OR"` | | Occupation | `"Works as a senior engineer at Acme Corp"` | | Relationships | `"Has a dog named Max"`, `"Married to Bob"` | | Instructions | `"INSTRUCTION: Always address as Dr. Chen"` | | Preferences | `"PREFERENCE: Prefers concise responses"` | | Traits | `"TRAIT: Detail-oriented, prefers data over anecdotes"` | Peer cards are **not** for transient information like current mood, recent conversation topics, or reasoning traces. Those belong in conclusions and summaries. ## How Peer Cards Are Created Peer cards are populated through two paths: **1. Automatic (via Dreaming)** When [dreaming](/docs/v3/documentation/features/advanced/dreaming) runs, the deduction and induction specialists extract stable biographical facts from existing conclusions and write them to the peer card. This happens without any manual intervention--Honcho identifies facts like names, occupations, and preferences from conversation history and records them automatically. **2. Manual (via SDK or API)** You can set a peer card directly. This is useful for bootstrapping a peer with known information before any conversation has occurred, or for correcting facts that Honcho hasn't yet discovered. 
```python Python theme={null} from honcho import Honcho honcho = Honcho() peer = honcho.peer("user-123") # Set the peer card peer.set_card([ "Name: Alice. Also known as 'Ali'.", "College student at MIT, studying computer science.", "PREFERENCE: Prefers casual tone.", "INSTRUCTION: Never mention her ex-boyfriend.", ]) # Retrieve the peer card card = peer.get_card() print(card) # ["Name: Alice. Also known as 'Ali'.", "College student at MIT...", ...] ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; const honcho = new Honcho({}); const peer = await honcho.peer("user-123"); // Set the peer card await peer.setCard([ "Name: Alice. Also known as 'Ali'.", "College student at MIT, studying computer science.", "PREFERENCE: Prefers casual tone.", "INSTRUCTION: Never mention her ex-boyfriend.", ]); // Retrieve the peer card const card = await peer.getCard(); console.log(card); // ["Name: Alice. Also known as 'Ali'.", "College student at MIT...", ...] ``` ## Directional Peer Cards Peer cards follow the same observer-observed model as [representations](/docs/v3/documentation/features/advanced/representation-scopes). When `observe_others` is enabled, a peer can have a **different** card for each peer it observes. For example, if Alice and Bob are in a session together and Alice has `observe_others: true`, Alice will build her own peer card for Bob--separate from Honcho's peer card for Bob. You can read and write these directional cards using the `target` parameter. ```python Python theme={null} alice = honcho.peer("alice") # Get Alice's own peer card (Honcho's view of Alice) alice_card = alice.get_card() # Get Alice's card for Bob (Alice's view of Bob) alice_bob_card = alice.get_card(target="bob") # Set Alice's card for Bob alice.set_card( ["Bob mentioned he's allergic to peanuts."], target="bob" ) ``` ```typescript TypeScript theme={null} const alice = await honcho.peer("alice"); // Get Alice's own peer card (Honcho's view of Alice) const aliceCard = await alice.getCard(); // Get Alice's card for Bob (Alice's view of Bob) const aliceBobCard = await alice.getCard({ target: "bob" }); // Set Alice's card for Bob await alice.setCard( ["Bob mentioned he's allergic to peanuts."], { target: "bob" } ); ``` ## Where Peer Cards Are Used ### In the Dialectic (Chat Endpoint) When you call [`peer.chat()`](/docs/v3/documentation/features/chat), Honcho automatically injects the relevant peer cards into the system prompt. The dialectic agent sees both the observer's own card and the observed peer's card, giving it immediate grounding without needing to search memory. ### In Context Retrieval The [`session.context()`](/docs/v3/documentation/features/get-context) method includes the peer card when you specify a `peer_target`: ```python Python theme={null} context = session.context( tokens=2000, peer_target="user-123" ) # Access the peer card alongside the representation print(context.peer_card) # List of peer card facts print(context.peer_representation) # Full representation text ``` ```typescript TypeScript theme={null} const context = await session.context({ tokens: 2000, peerTarget: "user-123" }); console.log(context.peerCard); // Array of peer card facts console.log(context.peerRepresentation); // Full representation text ``` ### During Dreaming The [dreaming](/docs/v3/documentation/features/advanced/dreaming) process reads the current peer card before consolidation, then updates it with any new stable facts discovered during the deduction and induction phases. 
## Limits | Constraint | Value | | ---------------------- | ----------------------------------- | | Maximum facts per card | **40** | | Data type | `list[str]` (each fact is a string) | When the dreaming process or a manual update pushes the card beyond 40 facts, it is automatically truncated to the first 40 entries. Keep facts concise and deduplicated to stay within the limit. If you manually set a peer card, it **replaces** the entire card--it does not merge with existing facts. Make sure to include all facts you want to keep when calling `set_card()`. ## Configuration Peer card behavior is controlled through the [configuration hierarchy](/docs/v3/documentation/features/advanced/reasoning-configuration). You can independently toggle whether agents **use** existing peer cards and whether they **create/update** them. ```python Python theme={null} # Disable peer card updates but still use existing cards during reasoning session = honcho.session("my-session", config={ "peer_card": {"create": False, "use": True} }) # Disable peer cards entirely session = honcho.session("no-cards", config={ "peer_card": {"create": False, "use": False} }) ``` ```typescript TypeScript theme={null} // Disable peer card updates but still use existing cards during reasoning const session = await honcho.session("my-session", { config: { peer_card: { create: false, use: true } } }); // Disable peer cards entirely const noCards = await honcho.session("no-cards", { config: { peer_card: { create: false, use: false } } }); ``` | Field | Type | Default | Description | | -------- | ------ | ------- | ----------------------------------------------------------- | | `use` | `bool` | `true` | Whether agents read the peer card during reasoning and chat | | `create` | `bool` | `true` | Whether agents can create or update peer cards | Configuration can be set at the workspace, session, or message level. See [Reasoning Configuration](/docs/v3/documentation/features/advanced/reasoning-configuration) for the full hierarchy. ## Best Practices 1. **Bootstrap with known facts.** If you already know the user's name or preferences at signup, set the peer card immediately. This gives the agent grounding from the very first interaction instead of waiting for dreaming to discover it. 2. **Use structured prefixes.** Prefixing facts with `INSTRUCTION:`, `PREFERENCE:`, or `TRAIT:` makes it easier for the agent to distinguish categories and act on them appropriately. 3. **Keep facts atomic.** Each string should contain one fact. Avoid combining multiple pieces of information into a single entry--`"Name: Alice"` and `"Location: Portland"` are better than `"Alice lives in Portland and works at Acme"`. 4. **Let dreaming handle updates.** For most applications, you don't need to manually manage the peer card after bootstrapping. The dreaming process will discover and record new facts as conversations progress. 5. **Use `set_card` for corrections.** If the automatic system has recorded something incorrect, manually set the card with the corrected facts. Remember this replaces the entire card. 
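For that last point, a minimal correction flow reads the current card, patches the wrong entry, and writes the full list back; the replacement semantics make the read step essential. A sketch, assuming `get_card()` returns nothing when no card has been set, with `"Location:"` as a hypothetical prefix:

```python
peer = honcho.peer("user-123")

# Read the current card first (assumed empty if no card exists yet)
card = peer.get_card() or []

# Patch the incorrect fact, keeping everything else
corrected = [fact for fact in card if not fact.startswith("Location:")]
corrected.append("Location: Seattle, WA")

# Writing replaces the entire card, so pass the full corrected list
peer.set_card(corrected)
```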
# Queue Status

Source: https://honcho.dev/docs/v3/documentation/features/advanced/queue-status

Learn how to check the status of Honcho's reasoning

Whenever messages are stored in Honcho, background processes kick off to [reason](/docs/v3/documentation/core-concepts/reasoning) about the conversation and generate insights. Reasoning is an asynchronous process and will not immediately generate insights for the latest message you've sent. This is by design: we want to reason efficiently over batches of messages rather than assessing each message in a vacuum.

Honcho provides several utilities to check the status of the queue.

```python Python theme={null}
from honcho import Honcho

honcho = Honcho()

status = honcho.queue_status()
```

```typescript TypeScript theme={null}
import { Honcho } from '@honcho-ai/sdk';

const honcho = new Honcho({});

const status = await honcho.queueStatus();
```

Output types

```python Python theme={null}
class QueueStatus(BaseModel):
    completed_work_units: int
    """Completed work units"""

    in_progress_work_units: int
    """Work units currently being processed"""

    pending_work_units: int
    """Work units waiting to be processed"""

    total_work_units: int
    """Total work units"""

    sessions: Optional[Dict[str, Sessions]] = None
    """Per-session status when not filtered by session"""
```

```typescript TypeScript theme={null}
Promise<{
  totalWorkUnits: number
  completedWorkUnits: number
  inProgressWorkUnits: number
  pendingWorkUnits: number
  sessions?: Record<string, Sessions>
}>
```

Whenever a message is sent it will generate several tasks. These could be tasks such as generating insights, cleaning up a representation, summarizing a conversation, etc. These tasks are defined based on who is sending the message, what session the message is in, and potentially who is observing the message. We call the combination of these parameters a `work_unit`. This has a few different implications:

* Tasks within the same work\_unit are processed sequentially, but multiple work\_units will be processed in parallel
* If local representations are turned on in a Session, then a message will generate an additional work unit for every peer that has `observe_others=True`

### Tracked task types

The queue status endpoint reports on the following task types:

| Task Type          | Description                                                                                 |
| ------------------ | ------------------------------------------------------------------------------------------- |
| **representation** | Memory formation — the deriver processes messages and extracts observations about peers     |
| **summary**        | Session summarization — creates short and long summaries at configurable message intervals  |
| **dream**          | Memory consolidation — explores and consolidates observations to improve memory quality     |

Internal infrastructure tasks (such as webhook delivery, resource deletion, and vector reconciliation) are **not** included in queue status counts.

**Completed counts are not lifetime totals.** Honcho periodically cleans up processed queue items to keep the queue table lean. As a result, `completed_work_units` reflects items completed since the last cleanup cycle, not the total number of items ever processed.
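As a quick observability sketch, you can log the counts from the model above. Field names follow the Python output type shown earlier; the per-session entries are left opaque here:

```python
from honcho import Honcho

honcho = Honcho()

status = honcho.queue_status()
print(f"Total work units:   {status.total_work_units}")
print(f"Pending:            {status.pending_work_units}")
print(f"In progress:        {status.in_progress_work_units}")
print(f"Completed (recent): {status.completed_work_units}")

# When not filtered by session, a per-session breakdown may be present
if status.sessions:
    for session_id in status.sessions:
        print(f"Session {session_id} has queued work")
```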
The `queue_status` method can take additional parameters to scope the status to a specific work unit:

```python Python theme={null}
def queue_status(
    self,
    observer_id: str | None = None,
    sender_id: str | None = None,
    session_id: str | None = None,
) -> QueueStatus:
```

```typescript TypeScript theme={null}
export const QueueStatusOptionsSchema = z.object({
  observerId: z.string().optional(),
  senderId: z.string().optional(),
  sessionId: z.string().optional(),
  timeoutMs: z
    .number()
    .positive('Timeout must be a positive number')
    .optional(),
})
```

Additionally, there are queue status methods available on the session objects in each of the SDKs.

**Do not wait for the queue to be empty.** The queue is a continuous processing system—new messages may arrive at any time, and "completion" is not a meaningful state. Design your application to work without assuming the queue will ever be fully drained. Use `queueStatus()` for observability and debugging, not for synchronization.

Below are the function signatures for the session level queue status method:

```python Python theme={null}
@validate_call
def queue_status(
    self,
    observer_id: str | None = None,
    sender_id: str | None = None,
) -> QueueStatus:
```

```typescript TypeScript theme={null}
async queueStatus(
  options?: Omit<QueueStatusOptions, 'sessionId'>
): Promise<{
  totalWorkUnits: number
  completedWorkUnits: number
  inProgressWorkUnits: number
  pendingWorkUnits: number
  sessions?: Record<string, Sessions>
}>
```

# Reasoning Configuration

Source: https://honcho.dev/docs/v3/documentation/features/advanced/reasoning-configuration

Customize how Honcho reasons over peers, sessions, and messages

Honcho's reasoning can be configured at multiple levels to control how it processes messages, generates conclusions, creates summaries, and builds peer representations. Configuration follows a hierarchy: **message > session > workspace > global defaults**. Settings at lower levels override those at higher levels, giving you fine-grained control over behavior.

## Configuration Hierarchy

Honcho uses a hierarchical configuration system where more specific settings override more general ones:

1. **Global Defaults**: Built-in system defaults
2. **Workspace Configuration**: Settings that apply to all sessions in a workspace
3. **Session Configuration**: Settings that apply to all messages in a session
4. **Message Configuration**: Settings that apply to a specific message

Separately, you can configure the reasoning status of a peer. This overrides defaults and workspace configuration, but not session or message configuration.

All configuration fields are optional. If not specified, the value is inherited from the next level up in the hierarchy.

## Configuration Options

### Reasoning Configuration

Controls whether the system should reason over messages.

| Field     | Type   | Description                                                                                          |
| --------- | ------ | ---------------------------------------------------------------------------------------------------- |
| `enabled` | `bool` | Whether to enable reasoning functionality. When disabled, no facts or representations are generated. |
```python Python theme={null}
from honcho import Honcho

honcho = Honcho()

# Disable reasoning at session level
session = honcho.session("private-session", config={
    "reasoning": {"enabled": False}
})
```

```typescript TypeScript theme={null}
import { Honcho } from "@honcho-ai/sdk";

const honcho = new Honcho({});

// Disable reasoning at session level
const session = await honcho.session("private-session", {
  config: { reasoning: { enabled: false } }
});
```

### Peer Card Configuration

Controls how peer cards (containing key biographical information) are generated and used.

| Field    | Type   | Description                                                         |
| -------- | ------ | ------------------------------------------------------------------- |
| `use`    | `bool` | Whether to use peer cards during the reasoning process.             |
| `create` | `bool` | Whether to generate and update peer cards based on message content. |

```python Python theme={null}
# Disable peer card generation but still use existing cards
session = honcho.session("my-session", config={
    "peer_card": {"create": False, "use": True}
})
```

```typescript TypeScript theme={null}
// Disable peer card generation but still use existing cards
const session = await honcho.session("my-session", {
  config: { peer_card: { create: false, use: true } }
});
```

### Summary Configuration

Controls automatic conversation summarization. Available at workspace and session levels only.

| Field                        | Type   | Description                                                                                            |
| ---------------------------- | ------ | ------------------------------------------------------------------------------------------------------ |
| `enabled`                    | `bool` | Whether to enable summary functionality.                                                               |
| `messages_per_short_summary` | `int`  | Number of messages between short summaries. Must be ≥ 10.                                              |
| `messages_per_long_summary`  | `int`  | Number of messages between long summaries. Must be ≥ 20 and greater than `messages_per_short_summary`. |

```python Python theme={null}
# Customize summary frequency
session = honcho.session("verbose-session", config={
    "summary": {
        "enabled": True,
        "messages_per_short_summary": 15,
        "messages_per_long_summary": 45
    }
})
```

```typescript TypeScript theme={null}
// Customize summary frequency
const session = await honcho.session("verbose-session", {
  config: {
    summary: {
      enabled: true,
      messages_per_short_summary: 15,
      messages_per_long_summary: 45
    }
  }
});
```

### Dream Configuration

Controls the "dreaming" process that consolidates and refines representations. Available at workspace and session levels only.

| Field     | Type   | Description                                                                             |
| --------- | ------ | --------------------------------------------------------------------------------------- |
| `enabled` | `bool` | Whether to enable dream functionality. Automatically disabled if reasoning is disabled. |

```python Python theme={null}
# Disable dreams for a workspace
honcho.set_configuration({
    "dream": {
        "enabled": False
    }
})
```

```typescript TypeScript theme={null}
// Disable dreams for a workspace
await honcho.setConfiguration({
  dream: { enabled: false }
});
```

***

## Peer Configuration

By default, all peers are "observed" by Honcho: Honcho reasons over the messages a peer sends and builds a representation of them. In most cases, this is why you use Honcho! Sometimes, however, an application has a peer that should not be observed -- for example, an assistant or game NPC whose messages your application will never need reasoning over. You can disable observation of such a peer by setting the `observe_me` flag in its configuration to `false`.
If the peer has a session-level configuration, it will override this configuration. If the flag is not set, or is set to `true`, the peer will be observed. For session-level observation controls and local representations (where peers build separate models of each other), see [Representation Scopes](/docs/v3/documentation/features/advanced/representation-scopes). ```python Python theme={null} from honcho import Honcho # Initialize client honcho = Honcho() # Create peer with configuration peer = honcho.peer("my-peer", configuration={"observe_me": False}) # Change peer's configuration peer.set_configuration({"observe_me": True}) # Note: creating the same peer again will also replace the configuration peer = honcho.peer("my-peer", configuration={"observe_me": False}) ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; (async () => { // Initialize client const honcho = new Honcho({}); // Create peer with configuration const peer = await honcho.peer("my-peer", { configuration: { observeMe: false } }); // Change peer's configuration await peer.setConfiguration({ observeMe: true }); // Note: creating the same peer again will also replace the configuration await honcho.peer("my-peer", { configuration: { observeMe: false } }); })(); ``` ## Session Configuration Sessions support the full configuration schema. You can disable reasoning entirely for a session, customize summary behavior, or adjust peer card settings. ```python Python theme={null} from honcho import Honcho # Initialize client honcho = Honcho() # Create session with reasoning disabled session = honcho.session("my-session", configuration={ "reasoning": {"enabled": False} }) # Create session with custom summary settings session = honcho.session("detailed-session", configuration={ "summary": { "messages_per_short_summary": 10, "messages_per_long_summary": 30 } }) ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; (async () => { // Initialize client const honcho = new Honcho({}); // Create session with reasoning disabled const session = await honcho.session("my-session", { configuration: { reasoning: { enabled: false } } }); // Create session with custom summary settings const detailedSession = await honcho.session("detailed-session", { configuration: { summary: { messages_per_short_summary: 10, messages_per_long_summary: 30 } } }); })(); ``` ## Message Configuration Individual messages can override session and workspace configuration for fine-grained control. This is useful for excluding specific messages from processing or adjusting behavior on a per-message basis. 
```python Python theme={null} from honcho import Honcho honcho = Honcho() session = honcho.session("my-session") user = honcho.peer("user") # Create a message that skips the reasoning process session.add_messages([ user.message("This message won't be analyzed", configuration={ "reasoning": {"enabled": False} }) ]) # Create a message with custom peer card settings session.add_messages([ user.message("Use existing card but don't update it", configuration={ "peer_card": {"use": True, "create": False} }) ]) ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; (async () => { const honcho = new Honcho({}); const session = await honcho.session("my-session"); const user = await honcho.peer("user"); // Create a message that skips the reasoning process await session.addMessages([ user.message("This message won't be analyzed", { configuration: { reasoning: { enabled: false } } }) ]); })(); ``` ## Full Configuration Schema Reference ### Workspace & Session Configuration ```json theme={null} { "reasoning": { "enabled": true }, "peer_card": { "use": true, "create": true }, "summary": { "enabled": true, "messages_per_short_summary": 20, "messages_per_long_summary": 60 }, "dream": { "enabled": true } } ``` ### Message Configuration ```json theme={null} { "reasoning": { "enabled": true }, "peer_card": { "use": true, "create": true } } ``` Message configuration only supports reasoning and `peer_card` settings. Summary and dream configurations are session/workspace-level only. # Representation Scopes Source: https://honcho.dev/docs/v3/documentation/features/advanced/representation-scopes Advanced configuration and querying for representations Assuming reasoning is enabled, you can control the perspectives representations are built from. This page covers: 1. **Default Behavior** — Honcho reasons over every message written to a peer 2. **Observer-Observed Model** — How peers build representations of other peers 3. **Querying with Target** — Accessing perspective-specific representations 4. **Use Cases** — When to use directional representations ## Default: Reasoning On When `observe_me=true` (the default), Honcho forms one representation per peer, reasoning over every message written to that peer across all sessions. You can retrieve a subset of conclusions from a peer's representation using `representation()`: ```python theme={null} # Retrieve conclusions from Honcho's representation of Alice (across all sessions) alice_rep = session.representation("alice") # Or via chat response = alice.chat("What are Alice's main interests?", session_id=session.id) ``` This is sufficient for most applications—Honcho reasons over every message written to the peer, storing conclusions that any part of your system can retrieve. ## Observer-Observed Representations When you enable `observe_others=true` at the session level, peers begin forming **directional representations** of other peers they interact with. These representations are scoped to what that observer has actually witnessed. 
### How It Works Each peer has **one representation**, but that representation can contain reasoning about: * **Itself** (when Honcho observes the peer with `observe_me=true`) * **Other peers** (when the peer observes others with `observe_others=true`) These are stored as separate (observer, observed) pairs in Honcho's internal collections: | Observer | Observed | What This Represents | | -------- | -------- | ----------------------------------------------------------------------- | | alice | alice | Honcho's representation of Alice (across all sessions) | | alice | bob | Alice's representation of Bob (from sessions Alice participated in) | | alice | charlie | Alice's representation of Charlie (from sessions Alice participated in) | ### Information Segmentation This enables sophisticated scenarios where different agents have different knowledge based on what they've actually witnessed. **Example**: Bob and Charlie tell different things to Alice in separate sessions. ``` Session 1 (Alice + Bob): Bob → "I had pancakes for breakfast." Session 2 (Alice + Charlie): Charlie → "I had pancakes for breakfast. Bob is lying about his breakfast." ``` With `observe_others=true` enabled on Alice: * **Alice's representation of Bob** only includes Session 1 (she heard Bob say he had pancakes) * **Alice's representation of Charlie** only includes Session 2 (she heard Charlie's claim about Bob lying) * **Honcho's representation of Alice** reasons over both sessions ## Querying with Target The `target` parameter controls which representation you retrieve: | Query | Returns | | ------------------------------------------- | ---------------------------------------------------------------------------------------- | | `representation("alice")` | Conclusions from Honcho's representation of Alice (across all sessions) | | `representation("alice", target="bob")` | Conclusions from Alice's representation of Bob (from sessions Alice participated in) | | `representation("alice", target="charlie")` | Conclusions from Alice's representation of Charlie (from sessions Alice participated in) | ### Code Examples ```python Python theme={null} from honcho import Honcho from honcho.api_types import SessionPeerConfig honcho = Honcho() session = honcho.session("game-session") alice = honcho.peer("alice") bob = honcho.peer("bob") charlie = honcho.peer("charlie") # Add peers to session session.add_peers([alice, bob, charlie]) # Enable Alice to form representations of others session.set_peer_configuration(alice, SessionPeerConfig(observe_others=True)) # Add messages session.add_messages([ bob.message("I had pancakes for breakfast."), charlie.message("I prefer waffles.") ]) # Different sessions with different participants session2 = honcho.session("game-session-2") session2.add_peers([alice, charlie]) session2.set_peer_configuration(alice, SessionPeerConfig(observe_others=True)) session2.add_messages([ charlie.message("I didn't have breakfast. 
I lied to Bob.") ]) # Retrieve conclusions from different perspectives honcho_view = session.representation("alice") # Across all sessions bob_view = session.representation("alice", target="bob") # Alice's view of Bob charlie_view = session2.representation("alice", target="charlie") # Alice's view of Charlie ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; const honcho = new Honcho({}); const session = await honcho.session("game-session"); const alice = await honcho.peer("alice"); const bob = await honcho.peer("bob"); const charlie = await honcho.peer("charlie"); await session.addPeers([alice, bob, charlie]); await session.setPeerConfiguration(alice, { observeOthers: true }); await session.addMessages([ bob.message("I had pancakes for breakfast."), charlie.message("I prefer waffles.") ]); const session2 = await honcho.session("game-session-2"); await session2.addPeers([alice, charlie]); await session2.setPeerConfiguration(alice, { observeOthers: true }); await session2.addMessages([ charlie.message("I didn't have breakfast. I lied to Bob.") ]); // Retrieve conclusions from different perspectives const honchoView = await session.representation("alice"); // Across all sessions const bobView = await session.representation("alice", { target: "bob" }); // Alice's view of Bob const charlieView = await session2.representation("alice", { target: "charlie" }); // Alice's view of Charlie ``` ### Chat Endpoint with Target The `target` parameter also works with the chat endpoint: ```python Python theme={null} # Query using conclusions from Honcho's representation (across all sessions) honcho_answer = alice.chat( "What did Bob say about breakfast?", session_id=session.id ) # Query using conclusions from Alice's representation of Bob (from Alice's sessions only) alice_answer = alice.chat( "What did Bob say about breakfast?", session_id=session.id, target="bob" ) ``` ```typescript TypeScript theme={null} // Query using conclusions from Honcho's representation (across all sessions) const honchoAnswer = await alice.chat( "What did Bob say about breakfast?", { sessionId: session.id } ); // Query using conclusions from Alice's representation of Bob (from Alice's sessions only) const aliceAnswer = await alice.chat( "What did Bob say about breakfast?", { sessionId: session.id, target: "bob" } ); ``` The `target` parameter only returns meaningful results if the observer peer has `observe_others=true` and has actually participated in sessions with the observed peer. Otherwise, the representation will be empty or non-existent. ## When to Use Directional Representations ### Use Cases Where This Matters 1. **Multi-agent games**: NPCs should only know what they've witnessed, not omniscient game state 2. **Information asymmetry scenarios**: Different agents have access to different information 3. **Perspective-dependent agents**: Agent behavior depends on their unique understanding of other agents 4. **Privacy-segmented systems**: Users should only see representations based on their interactions ### Use Cases Where Default Is Sufficient 1. **Single-user applications**: Only one user, so perspective doesn't matter 2. **Centralized knowledge systems**: All agents should share the same understanding 3. **Simple chatbots**: No multi-agent interaction or information segmentation needed Most applications don't need directional representations. Start with the default Honcho-observes-all behavior and only enable `observe_others` when you need information segmentation between agents. 
## Architecture: How It's Stored Under the hood, Honcho stores representations as (observer, observed) pairs in internal collections: * **Collection**: A unique (observer, observed, workspace) tuple containing documents * **Documents**: Individual conclusions and artifacts (deductive, inductive, abductive conclusions, summaries, peer cards) with session scoping When you retrieve with `target`, Honcho fetches documents from the specific (observer, observed) collection. When you retrieve without `target`, it fetches from the (peer, peer) collection—the peer's self-representation. This architecture enables: * **Efficient querying**: Each perspective is isolated and can be queried independently * **Session filtering**: Within a collection, documents can be filtered by session * **Scalability**: Adding more observers doesn't degrade query performance ## Semantic Search Parameters Both `representation()` and `chat()` support semantic filtering to retrieve a subset of relevant conclusions. You can optionally filter by session to retrieve only conclusions from specific session context: | Parameter | Type | Description | | ----------------------- | ------- | ----------------------------------------- | | `search_query` | `str` | Semantic query to filter conclusions | | `search_top_k` | `int` | Number of results to include (1–100) | | `search_max_distance` | `float` | Maximum semantic distance (0.0–1.0) | | `include_most_frequent` | `bool` | Include most frequent conclusions | | `max_conclusions` | `int` | Cap on total conclusions returned (1–100) | ```python Python theme={null} # Retrieve conclusions about billing from Alice's representation of Bob alice_view_billing = session.representation( "alice", target="bob", search_query="billing issues", search_top_k=10, include_most_frequent=True ) ``` ```typescript TypeScript theme={null} const aliceViewBilling = await session.representation("alice", { target: "bob", searchQuery: "billing issues", searchTopK: 10, includeMostFrequent: true }); ``` ## When Representations Update Directional representations update automatically through the reasoning pipeline when: 1. A message is created in a session 2. The message sender has `observe_me=true` (or session-level equivalent) 3. Other peers in the session have `observe_others=true` The pipeline respects scoping—Honcho's representations reason over messages across all sessions, while directional representations only reason over messages from sessions where the observer was an active participant. ### Peer Join Order Matters Reasoning tasks are scheduled at the time a message is created, based on which peers are in the session **at that moment**. Honcho does not retroactively schedule reasoning for peers that join later. This means: * If Peer C joins a session **after** messages from Peer A and Peer B have already been sent, Peer C will **not** receive reasoning tasks for those earlier messages—even if Peer C has `observe_others=true`. * Peer C will only begin observing new messages sent after they join the session. * Similarly, if a peer leaves a session, they stop being included as an observer for any messages sent after their departure. There is no retroactive reasoning. If your application needs an observer peer to reason about prior conversation history, add the peer to the session **before** messages are sent. Alternatively, use `peer.chat()` to include conversation history in the agent's context whether or not those messages were previously reasoned over. Conclusions are cached for fast retrieval. 
Use `representation()` to retrieve stored conclusions for dashboards and analytics. Use `peer.chat()` when you need query-specific reasoning with natural language. # Search Source: https://honcho.dev/docs/v3/documentation/features/advanced/search Learn how to search across workspaces, sessions, and peers to find relevant conversations and content Honcho's search functionality allows you to find relevant messages and conversations across different scopes - from entire workspaces down to specific peers or sessions. ## Search Scopes ### Workspace Search Search across all content in your workspace - sessions, peers, and messages: ```python Python theme={null} from honcho import Honcho # Initialize client honcho = Honcho() # Search across entire workspace results = honcho.search("budget planning") # Iterate through all results for result in results: print(f"Found: {result}") ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; (async () => { // Initialize client const honcho = new Honcho({}); // Search across entire workspace const results = await honcho.search("budget planning"); // Iterate through all results for (const result of results) { console.log(`Found: ${result}`); } })(); ``` ### Session Search Search within a specific session's conversation history: ```python Python theme={null} # Create or get a session session = honcho.session("team-meeting-jan") # Search within this session only results = session.search("action items") # Process results for result in results: print(f"Session result: {result}") ``` ```typescript TypeScript theme={null} (async () => { // Create or get a session const session = await honcho.session("team-meeting-jan"); // Search within this session only const results = await session.search("action items"); // Process results for (const result of results) { console.log(`Session result: ${result}`); } })(); ``` ### Peer Search Search across all content associated with a specific peer: ```python Python theme={null} # Create or get a peer alice = honcho.peer("alice") # Search across all of Alice's messages and interactions results = alice.search("programming") # View results for result in results: print(f"Alice's content: {result}") ``` ```typescript TypeScript theme={null} import { Message } from "@honcho-ai/sdk"; (async () => { // Create or get a peer const alice = await honcho.peer("alice"); // Search across all of Alice's messages and interactions const results: Message[] = await alice.search("programming"); // View results for (const result of results) { console.log(`Alice's content: ${result.content}`); } })(); ``` ## Filters and Limits ### Get a specific number of results You can specify the number of results you want to return by passing the `limit` parameter to the search method. The default is 10 results, with a maximum of 100. ```python Python theme={null} results = honcho.search("budget planning", limit=20) ``` ```typescript TypeScript theme={null} (async () => { const results = await honcho.search("budget planning", { limit: 20 }); })(); ``` ### Get messages from a Peer in a specific Session Combine Peer-level search with a `session_id` filter to get messages from a Peer in a specific Session. 
```python Python theme={null} my_peer = honcho.peer("my-peer") my_session = honcho.session("team-meeting-jan") results = my_peer.search("budget planning", filters={"session_id": my_session.id}) ``` ```typescript TypeScript theme={null} (async () => { const myPeer = await honcho.peer("my-peer"); const mySession = await honcho.session("team-meeting-jan"); const results = await myPeer.search("budget planning", { filters: { session_id: mySession.id } }); })(); ``` Search returns an object containing an `items` array of message objects: ```json theme={null} { "items": [ { "id": "", "content": "", "peer_id": "", "session_id": "", "metadata": {}, "created_at": "2023-11-07T05:31:56Z", "workspace_id": "", "token_count": 123 } ] } ``` ### Filter results by time range ```python Python theme={null} results = honcho.search("budget planning", filters={"created_at": {"gte": "2024-01-01", "lte": "2024-01-31"}}) ``` ```typescript TypeScript theme={null} (async () => { const results = await honcho.search("budget planning", { filters: { created_at: { gte: "2024-01-01", lte: "2024-01-31" } } }); })(); ``` ### Filter results by metadata ```python Python theme={null} results = honcho.search("budget planning", filters={"metadata": {"key": "value"}}) ``` ```typescript TypeScript theme={null} (async () => { const results = await honcho.search("budget planning", { filters: { metadata: { key: "value" } } }); })(); ``` ### Best Practices ### Handle Empty Results Gracefully ```python Python theme={null} # Always check for empty results results = honcho.search("very specific query") result_list = list(results) if result_list: print(f"Found {len(result_list)} results") for result in result_list: print(f"- {result}") else: print("No results found - try a broader search") ``` ```typescript TypeScript theme={null} import { Message } from "@honcho-ai/sdk"; (async () => { // Always check for empty results const results: Message[] = await honcho.search("very specific query"); if (results.length > 0) { console.log(`Found ${results.length} results`); for (const result of results) { console.log(`- ${result.content}`); } } else { console.log("No results found - try a broader search"); } })(); ``` ## Conclusion Honcho's search functionality provides powerful discovery capabilities across your conversational data. By understanding how to: * Choose the appropriate search scope (workspace, session, or peer) * Handle paginated results effectively * Combine search with context building You can build applications that provide intelligent insights and context-aware responses based on historical conversations and interactions. # Streaming Responses Source: https://honcho.dev/docs/v3/documentation/features/advanced/streaming-response Using streaming responses with Honcho SDKs When working with AI-generated content, streaming the response as it's generated can significantly improve the user experience. Honcho provides streaming functionality in its SDKs that allows your application to display content as it's being generated, rather than waiting for the complete response. ## When to Use Streaming Streaming is particularly useful for: * Real-time chat interfaces * Long-form content generation * Applications where perceived speed is important * Interactive agent experiences * Reducing time-to-first-word in user interactions ## Streaming with the Chat Endpoint One of the primary use cases for streaming in Honcho is with the [chat endpoint](/docs/v3/documentation/features/chat). 
This allows you to stream the AI's reasoning about a user in real-time.

### Prerequisites

```python Python theme={null}
from honcho import Honcho

# Initialize client (using the default workspace)
honcho = Honcho()

# Create or get peers
user = honcho.peer("demo-user")
assistant = honcho.peer("assistant")

# Create a new session
session = honcho.session("demo-session")

# Add peers to the session
session.add_peers([user, assistant])

# Store some messages for context (optional)
session.add_messages([
    user.message("Hello, I'm testing the streaming functionality")
])
```

```typescript TypeScript theme={null}
import { Honcho } from '@honcho-ai/sdk';

(async () => {
  // Initialize client (using the default workspace)
  const honcho = new Honcho({});

  // Create or get peers
  const user = await honcho.peer('demo-user');
  const assistant = await honcho.peer('assistant');

  // Create a new session
  const session = await honcho.session('demo-session');

  // Add peers to the session
  await session.addPeers([user, assistant]);

  // Store some messages for context (optional)
  await session.addMessages([
    user.message("Hello, I'm testing the streaming functionality")
  ]);
})();
```

## Streaming from the Chat Endpoint

```python Python theme={null}
import time

# Basic streaming example
response_stream = user.chat("What can you tell me about this user?", stream=True)

for chunk in response_stream.iter_text():
    print(chunk, end="", flush=True)  # Print each chunk as it arrives
    time.sleep(0.01)  # Optional delay for demonstration
```

```typescript TypeScript theme={null}
(async () => {
  // Basic streaming example
  const responseStream = await user.chat("What can you tell me about this user?", { stream: true });

  // Process the stream
  for await (const chunk of responseStream.iter_text()) {
    process.stdout.write(chunk); // Write to console without newlines
  }
})();
```

## Working with Streaming Data

When working with streaming responses, consider these patterns:

1. **Progressive Rendering** - Update your UI as chunks arrive instead of waiting for the full response
2. **Buffered Processing** - Accumulate chunks until a logical break, like a sentence or paragraph (see the sketch below)
3. **Token Counting** - Monitor token usage in real-time for applications with token limits
4. **Error Handling** - Implement appropriate error handling for interrupted streams
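As an example of the second pattern, here is a minimal sketch that buffers chunks and flushes on sentence boundaries. It assumes the `iter_text()` stream shown above; `handle_sentence` is a hypothetical callback standing in for your own rendering or logging logic:

```python
buffer = ""

response_stream = user.chat("What can you tell me about this user?", stream=True)
for chunk in response_stream.iter_text():
    buffer += chunk
    # Flush whole sentences; keep the trailing fragment buffered
    while ". " in buffer:
        sentence, buffer = buffer.split(". ", 1)
        handle_sentence(sentence + ".")  # hypothetical: render or log a full sentence

if buffer:
    handle_sentence(buffer)  # flush whatever remains at end of stream
```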
## Example: Restaurant Recommendation Chat

```python Python theme={null}
from honcho import Honcho

def restaurant_recommendation_chat():
    # Initialize client
    honcho = Honcho()

    # Create peers
    user = honcho.peer("food-lover")
    assistant = honcho.peer("restaurant-assistant")

    # Create session
    session = honcho.session("food-preferences-session")

    # Add peers to session
    session.add_peers([user, assistant])

    # Store multiple user messages about food preferences
    user_messages = [
        "I absolutely love spicy Thai food, especially curries with coconut milk.",
        "Italian cuisine is another favorite - fresh pasta and wood-fired pizza are my weakness!",
        "I try to eat vegetarian most of the time, but occasionally enjoy seafood.",
        "I can't handle overly sweet desserts, but love something with dark chocolate."
    ]

    # Add the user's messages to the session
    session_messages = [user.message(message) for message in user_messages]
    session.add_messages(session_messages)

    # Print the user messages
    for message in user_messages:
        print(f"User: {message}")

    # Ask for restaurant recommendations based on preferences
    print("\nRequesting restaurant recommendations...")
    print("Assistant: ", end="", flush=True)

    full_response = ""

    # Stream the response using the user's peer to get recommendations
    response_stream = user.chat(
        "Based on this user's food preferences, recommend 3 restaurants they might enjoy in the Lower East Side.",
        stream=True,
        session_id=session.id
    )

    for chunk in response_stream.iter_text():
        print(chunk, end="", flush=True)
        full_response += chunk

    # Store the assistant's complete response
    session.add_messages([
        assistant.message(full_response)
    ])

if __name__ == "__main__":
    restaurant_recommendation_chat()
```

```typescript TypeScript theme={null}
import { Honcho } from '@honcho-ai/sdk';

(async () => {
  async function restaurantRecommendationChat() {
    // Initialize client
    const honcho = new Honcho({});

    // Create peers
    const user = await honcho.peer('food-lover');
    const assistant = await honcho.peer('restaurant-assistant');

    // Create session
    const session = await honcho.session('food-preferences-session');

    // Add peers to session
    await session.addPeers([user, assistant]);

    // Store multiple user messages about food preferences
    const userMessages = [
      "I absolutely love spicy Thai food, especially curries with coconut milk.",
      "Italian cuisine is another favorite - fresh pasta and wood-fired pizza are my weakness!",
      "I try to eat vegetarian most of the time, but occasionally enjoy seafood.",
      "I can't handle overly sweet desserts, but love something with dark chocolate."
    ];

    // Add the user's messages to the session
    const sessionMessages = userMessages.map(message => user.message(message));
    await session.addMessages(sessionMessages);

    // Print the user messages
    for (const message of userMessages) {
      console.log(`User: ${message}`);
    }

    // Ask for restaurant recommendations based on preferences
    console.log("\nRequesting restaurant recommendations...");
    process.stdout.write("Assistant: ");

    let fullResponse = "";

    // Stream the response using the user's peer to get recommendations
    const responseStream = await user.chat(
      "Based on this user's food preferences, recommend 3 restaurants they might enjoy in the Lower East Side.",
      { stream: true, sessionId: session.id }
    );

    for await (const chunk of responseStream.iter_text()) {
      process.stdout.write(chunk);
      fullResponse += chunk;
    }

    // Store the assistant's complete response
    await session.addMessages([
      assistant.message(fullResponse)
    ]);
  }

  await restaurantRecommendationChat();
})();
```

## Performance Considerations

When implementing streaming:

* Consider connection stability for mobile or unreliable networks
* Implement appropriate timeouts for stream operations
* Be mindful of memory usage when accumulating large responses
* Use appropriate error handling for network interruptions (see the sketch below)
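For the last point, a minimal sketch of guarding a stream against interruption. The partial response is preserved so your application can retry or store what arrived; the broad `except` is illustrative only, and in practice you would narrow it to your SDK's network errors:

```python
full_response = ""

try:
    response_stream = user.chat("Summarize this user's preferences", stream=True)
    for chunk in response_stream.iter_text():
        full_response += chunk
except Exception as e:
    # Stream interrupted: keep the partial response and decide whether to retry
    print(f"\nStream interrupted after {len(full_response)} characters: {e}")
```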
Streaming responses provide a more interactive and engaging user experience. By implementing streaming in your Honcho applications, you can create more responsive AI-powered features that feel natural and immediate to your users.

# Summarizer

Source: https://honcho.dev/docs/v3/documentation/features/advanced/summarizer

How Honcho creates summaries of conversations

Almost all agents require, in addition to personalization and memory, a way to quickly prime a context window with a summary of the conversation (in Honcho, this is equivalent to a `session`).

The general strategy for summarization is to combine a list of recent messages verbatim with a compressed LLM-generated summary of the older messages not included. Implementing this correctly, in such a way that the resulting context is:

* Exhaustive: the combination of recent messages and summary should cover the entire conversation
* Dynamically sized: the tokens used on both summary and recent messages should be malleable based on desired token usage
* Performant: while creation of the summary by LLM introduces necessary latency, this should never add latency to an arbitrary end-user request

...is a non-trivial problem. You shouldn't have to re-implement summarization for every new agent you build, so Honcho comes with a built-in solution.

### Creating Summaries

Honcho already has an asynchronous task queue for the purpose of deriving facts from messages. This is the ideal place to create summaries where they won't add latency to a message.

Currently, Honcho has two configurable summary types:

* Short summaries: by default, enqueued every 20 messages and given a token limit of 1000
* Long summaries: by default, enqueued every 60 messages and given a token limit of 4000

Both summaries are designed to be exhaustive: when enqueued, they are given the *prior* summary of their type plus every message after that summary. This recursive compression process naturally biases the summary towards recent messages while still covering the entire conversation.

For example, if message 160 in a conversation triggers a short summary, as it would with default settings, the summary task would retrieve the prior short summary (message 140) plus messages 141-160. It would then produce a summary of messages 0-160 and store that in the short summary slot on the session. Every session has a single slot for each summary type: new summaries replace old ones.

It's important to keep in mind that summary tasks run in the background and are not guaranteed to complete before the next message. However, they are guaranteed to complete in order, so that if a user saves 100 messages in a single batch, the short summary will first be created for messages 0-20, then 21-40, and so on, in our desired recursive way.

### Retrieving Summaries

Summaries are retrieved from the session by the [`get_context`](/docs/v3/documentation/features/get-context) method. This method has two parameters:

* `summary`: A boolean indicating whether to include the summary in the return type. The default is true.
* `tokens`: An integer indicating the maximum number of tokens to use for the context. **If not provided, `get_context` will retrieve as many tokens as are required to create exhaustive conversation coverage.**

The return type is simply a list of recent messages and a summary if the flag is used. These two components are dynamically sized based on the token limit. Combined, they will always be below the given token limit. Honcho reserves 60% of the context size for recent messages and 40% for the summary.
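In SDK terms, retrieval might look like the sketch below. The parameter names follow the `session.context()` examples elsewhere in these docs; exposing the `summary` flag the same way is an assumption here:

```python
# Exhaustive coverage: no token limit, Honcho sizes summary + messages itself
context = session.context()

# Bounded: at most 3000 tokens, split roughly 60/40 between
# recent messages (~1800) and the summary (~1200)
context = session.context(tokens=3000)

# Recent messages only, no summary (assumed parameter)
context = session.context(summary=False, tokens=1500)
```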
There's a critical trade-off to understand between exhaustiveness and token usage. Let's go through some scenarios:

* If the *last message* contains more tokens than the context token limit, no summary *or* message list is possible -- both will be empty.
* If the *last few messages* contain more tokens than the context token limit, no summary is possible -- the context will only contain the last 1 or 2 messages that fit in the token limit.
* If the summaries contain more tokens than the context token limit, no summary is possible -- the context will only contain the X most recent messages that fit in the token limit. Note that while summaries will often be smaller than their token limits, avoiding this scenario means passing a higher token limit than the Honcho-configured summary size(s). For this reason, the default token limit for `get_context` is a few times larger than the configured long summary size.

The above scenarios indicate where summarization is not possible -- therefore, the context retrieved will almost certainly **not** be exhaustive. Sometimes, gaps in context aren't an issue. In these cases, it's best to pass a reasonable token limit depending on your needs. Other cases demand exhaustive context -- don't pass a token limit and just let Honcho retrieve the ideal combination of summary and recent messages. Finally, if you don't care about the conversation at large and just want the last few messages, set `summary` to false and `tokens` to some multiple of your desired message count. Note that context messages are not paginated, so there's a hard limit on the amount of context that can be retrieved (currently 100,000 tokens).

As a final note, remember that summaries are generated asynchronously and therefore may not be available immediately. If you batch-save a large number of messages, assume that summaries will not be available until those messages are processed, which can take seconds to minutes depending on the number of messages and the configured LLM provider. Exhaustive `get_context` calls performed during this time will likely just return the messages in the session.

# Using Filters

Source: https://honcho.dev/docs/v3/documentation/features/advanced/using-filters

Learn how to filter workspaces, peers, sessions, and messages using Honcho's powerful filtering system

Honcho provides a sophisticated filtering system that allows you to query workspaces, peers, sessions, and messages with precise control. The filtering system supports logical operators, comparison operators, metadata filtering, and wildcards to help you find exactly what you need.

## Basic Filtering Concepts

Filters in Honcho are expressed as dictionaries that define conditions for matching resources. The system supports both simple equality filters and complex queries with multiple conditions.
### Simple Filters The most basic filters check for exact matches: ```python Python theme={null} from honcho import Honcho # Initialize client honcho = Honcho() # Simple peer filter peers = honcho.peers(filters={"peer_id": "alice"}) # Simple session filter with metadata sessions = honcho.sessions(filters={ "metadata": {"type": "support"} }) # Simple message filter messages = honcho.messages(filters={ "session_id": "support-chat-1", "peer_id": "alice" }) ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; (async () => { // Initialize client const honcho = new Honcho({}); // Simple peer filter const peers = await honcho.peers({ filters: { peer_id: "alice" } }); // Simple session filter with metadata const sessions = await honcho.sessions({ filters: { metadata: { type: "support" } } }); // Simple message filter const messages = await honcho.messages({ filters: { session_id: "support-chat-1", peer_id: "alice" } }); })(); ``` ## Logical Operators Combine multiple conditions using logical operators for complex queries: ### AND Operator Use AND to require all conditions to be true: ```python Python theme={null} messages = honcho.messages(filters={ "AND": [ {"session_id": "chat-1"}, {"created_at": {"gte": "2024-01-01"}} ] }) ``` ```typescript TypeScript theme={null} (async () => { const messages = await honcho.messages({ filters: { AND: [ { session_id: "chat-1" }, { created_at: { gte: "2024-01-01" } } ] } }); })(); ``` ### OR Operator Use OR to match any of the specified conditions: ```python Python theme={null} # Find messages from either alice or bob messages = session.messages(filters={ "OR": [ {"peer_id": "alice"}, {"peer_id": "bob"} ] }) # Complex OR with metadata conditions sessions = honcho.sessions(filters={ "OR": [ {"metadata": {"priority": "high"}}, {"metadata": {"urgent": True}}, {"metadata": {"escalated": True}} ] }) ``` ```typescript TypeScript theme={null} (async () => { // Find messages from either alice or bob const messages = await session.messages({ filters: { OR: [ { peer_id: "alice" }, { peer_id: "bob" } ] } }); // Complex OR with metadata conditions const sessions = await honcho.sessions({ filters: { OR: [ { metadata: { priority: "high" } }, { metadata: { urgent: true } }, { metadata: { escalated: true } } ] } }); })(); ``` ### NOT Operator Use NOT to exclude specific conditions: ```python Python theme={null} # Find all peers except alice peers = honcho.peers(filters={ "NOT": [ {"peer_id": "alice"} ] }) # Find sessions that are NOT completed sessions = honcho.sessions(filters={ "NOT": [ {"metadata": {"status": "completed"}} ] }) ``` ```typescript TypeScript theme={null} (async () => { // Find all peers except alice const peers = await honcho.peers({ filters: { NOT: [ { peer_id: "alice" } ] } }); // Find sessions that are NOT completed const sessions = await honcho.sessions({ filters: { NOT: [ { metadata: { status: "completed" } } ] } }); })(); ``` ### Combining Logical Operators Create sophisticated queries by combining different logical operators: ```python Python theme={null} # Find messages from alice OR bob, but NOT where message has archived set to true in metadata messages = session.messages(filters={ "AND": [ { "OR": [ {"peer_id": "alice"}, {"peer_id": "bob"} ] }, { "NOT": [ {"metadata": {"archived": True}} ] } ] }) ``` ```typescript TypeScript theme={null} (async () => { // Find messages from alice OR bob, but NOT where message has archived set to true in metadata const messages = await session.messages({ filters: { AND: [ { OR: [ { 
peer_id: "alice" }, { peer_id: "bob" } ] }, { NOT: [ { metadata: { archived: true } } ] } ] } }); })(); ``` ## Comparison Operators Use comparison operators for range queries and advanced matching: ### Numeric Comparisons ```python Python theme={null} # Find sessions created after a specific date sessions = honcho.sessions(filters={ "created_at": {"gte": "2024-01-01"} }) # Find messages within a date range messages = session.messages(filters={ "created_at": { "gte": "2024-01-01", "lte": "2024-12-31" } }) # Metadata numeric comparisons sessions = honcho.sessions(filters={ "metadata": { "score": {"gt": 8.5}, "duration": {"lte": 3600} } }) ``` ```typescript TypeScript theme={null} (async () => { // Find sessions created after a specific date const sessions = await honcho.sessions({ filters: { created_at: { gte: "2024-01-01" } } }); // Find messages within a date range const messages = await session.messages({ filters: { created_at: { gte: "2024-01-01", lte: "2024-12-31" } } }); // Metadata numeric comparisons const filteredSessions = await honcho.sessions({ filters: { metadata: { score: { gt: 8.5 }, duration: { lte: 3600 } } } }); })(); ``` ### List Membership ```python Python theme={null} # Find messages from specific peers in a session messages = session.messages(filters={ "peer_id": {"in": ["alice", "bob", "charlie"]} }) # Find sessions with specific tags sessions = honcho.sessions(filters={ "metadata": { "tag": {"in": ["important", "urgent", "follow-up"]} } }) # Not equal comparisons peers = honcho.peers(filters={ "metadata": { "status": {"ne": "inactive"} } }) ``` ```typescript TypeScript theme={null} (async () => { // Find messages from specific peers in a session const messages = await session.messages({ filters: { peer_id: { in: ["alice", "bob", "charlie"] } } }); // Find sessions with specific tags const sessions = await honcho.sessions({ filters: { metadata: { tag: { in: ["important", "urgent", "follow-up"] } } } }); // Not equal comparisons const peers = await honcho.peers({ filters: { metadata: { status: { ne: "inactive" } } } }); })(); ``` ## Metadata Filtering Metadata filtering is particularly powerful in Honcho, supporting nested conditions and complex queries: ### Basic Metadata Filtering ```python Python theme={null} # Simple metadata equality sessions = honcho.sessions(filters={ "metadata": { "type": "customer_support", "priority": "high" } }) # Nested metadata objects peers = honcho.peers(filters={ "metadata": { "profile": { "role": "admin", "department": "engineering" } } }) ``` ```typescript TypeScript theme={null} (async () => { // Simple metadata equality const sessions = await honcho.sessions({ filters: { metadata: { type: "customer_support", priority: "high" } } }); // Nested metadata objects const peers = await honcho.peers({ filters: { metadata: { profile: { role: "admin", department: "engineering" } } } }); })(); ``` ### Advanced Metadata Queries If you want to do advanced queries like these, make sure not to create metadata fields that use the same names as the included comparison operators! For example, if you have a metadata field called `contains`, it will conflict with the `contains` operator. 
```python Python theme={null} # Metadata with comparison operators sessions = honcho.sessions(filters={ "metadata": { "score": {"gte": 4.0, "lte": 5.0}, "created_by": {"ne": "system"}, "tags": {"contains": "important"} } }) # Complex metadata conditions messages = session.messages(filters={ "AND": [ {"metadata": {"sentiment": {"in": ["positive", "neutral"]}}}, {"metadata": {"confidence": {"gt": 0.8}}}, {"content": {"icontains": "thank"}} ] }) ``` ```typescript TypeScript theme={null} (async () => { // Metadata with comparison operators const sessions = await honcho.sessions({ filters: { metadata: { score: { gte: 4.0, lte: 5.0 }, created_by: { ne: "system" }, tags: { contains: "important" } } } }); // Complex metadata conditions const messages = await session.messages({ filters: { AND: [ { metadata: { sentiment: { in: ["positive", "neutral"] } } }, { metadata: { confidence: { gt: 0.8 } } }, { content: { icontains: "thank" } } ] } }); })(); ``` ## Wildcards Use wildcards (\*) to match any value for a field: ```python Python theme={null} # Find all sessions with any peer_id (essentially all sessions) sessions = honcho.sessions(filters={ "peer_id": "*" }) # Wildcard in lists - matches everything messages = session.messages(filters={ "peer_id": {"in": ["alice", "bob", "*"]} }) # Metadata wildcards sessions = honcho.sessions(filters={ "metadata": { "type": "*", # Any type "status": "active" # But status must be active } }) ``` ```typescript TypeScript theme={null} (async () => { // Find all sessions with any peer_id (essentially all sessions) const sessions = await honcho.sessions({ filters: { peer_id: "*" } }); // Wildcard in lists - matches everything const messages = await session.messages({ filters: { peer_id: { in: ["alice", "bob", "*"] } } }); // Metadata wildcards const filteredSessions = await honcho.sessions({ filters: { metadata: { type: "*", // Any type status: "active" // But status must be active } } }); })(); ``` ## Resource-Specific Examples ### Filtering Workspaces ```python Python theme={null} # Find workspaces by name pattern workspaces = honcho.workspaces(filters={ "name": {"contains": "prod"} }) # Filter by metadata workspaces = honcho.workspaces(filters={ "metadata": { "environment": "production", "team": {"in": ["backend", "frontend", "devops"]} } }) ``` ```typescript TypeScript theme={null} (async () => { // Find workspaces by name pattern const workspaces = await honcho.workspaces({ filters: { name: { contains: "prod" } } }); // Filter by metadata const filteredWorkspaces = await honcho.workspaces({ filters: { metadata: { environment: "production", team: { in: ["backend", "frontend", "devops"] } } } }); })(); ``` ### Filtering Messages ```python Python theme={null} # Find error messages from the last week from datetime import datetime, timedelta week_ago = (datetime.now() - timedelta(days=7)).isoformat() messages = session.messages(filters={ "AND": [ {"content": {"icontains": "error"}}, {"created_at": {"gte": week_ago}}, {"metadata": {"level": {"in": ["error", "critical"]}}} ] }) # Find messages in specific sessions with sentiment analysis messages = session.messages(filters={ "AND": [ {"session_id": {"in": ["support-1", "support-2", "support-3"]}}, {"metadata": {"sentiment": "negative"}}, {"metadata": {"confidence": {"gte": 0.7}}} ] }) ``` ```typescript TypeScript theme={null} (async () => { // Find error messages from the last week const weekAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString(); const messages = await session.messages({ filters: { AND: [ { content:
{ icontains: "error" } }, { created_at: { gte: weekAgo } }, { metadata: { level: { in: ["error", "critical"] } } } ] } }); // Find messages in specific sessions with sentiment analysis const sentimentMessages = await session.messages({ filters: { AND: [ { session_id: { in: ["support-1", "support-2", "support-3"] } }, { metadata: { sentiment: "negative" } }, { metadata: { confidence: { gte: 0.7 } } } ] } }); })(); ``` ## Error Handling Handle filter errors gracefully: ```python Python theme={null} from honcho.exceptions import FilterError try: # Invalid filter - unsupported operator messages = session.messages(filters={ "created_at": {"invalid_operator": "2024-01-01"} }) except FilterError as e: print(f"Filter error: {e}") # Handle the error appropriately try: # Invalid column name sessions = honcho.sessions(filters={ "nonexistent_field": "value" }) except FilterError as e: print(f"Invalid field: {e}") ``` ```typescript TypeScript theme={null} (async () => { try { // Invalid filter - unsupported operator const messages = await session.messages({ filters: { created_at: { invalid_operator: "2024-01-01" } } }); } catch (error) { if (error.message.includes("filters")) { console.error(`Filter error: ${error.message}`); // Handle the error appropriately } } try { // Invalid column name const sessions = await honcho.sessions({ filters: { nonexistent_field: "value" } }); } catch (error) { console.error(`Invalid field: ${error.message}`); } })(); ``` ## Conclusion Honcho's filtering system provides powerful capabilities for querying your conversational data. By understanding how to: * Use simple equality filters and complex logical operators * Apply comparison operators for range and pattern matching * Filter metadata with nested conditions * Handle wildcards and dynamic filter construction * Follow best practices for performance and validation You can build sophisticated applications that efficiently find and process exactly the conversations, messages, and insights you need from your Honcho data. # Chat Endpoint Source: https://honcho.dev/docs/v3/documentation/features/chat An endpoint for reasoning about your users The Chat endpoint (`peer.chat()`) is the natural language interface to Honcho's reasoning. Instead of manually retrieving conclusions, your LLM can ask questions and get synthesized answers based on all the reasoning Honcho has done about a peer. Think of it as agent-to-agent communication. ## Basic Usage The simplest way to use the chat endpoint is to ask a question and get a text response: ```python Python theme={null} from honcho import Honcho honcho = Honcho() peer = honcho.peer("user-123") # Ask Honcho about the peer query = "What is the user's favorite way of completing the task?" answer = peer.chat(query) print(answer) # "Based on conclusions, the user prefers using keyboard shortcuts..." ``` ```typescript TypeScript theme={null} import { Honcho } from '@honcho-ai/sdk'; const honcho = new Honcho({}); const peer = await honcho.peer("user-123"); // Ask Honcho about the peer const query = "What is the user's favorite way of completing the task?"; const answer = await peer.chat(query); console.log(answer); // "Based on conclusions, the user prefers using keyboard shortcuts..." ``` The chat endpoint searches through the peer's representation--all the conclusions Honcho has drawn about them--and synthesizes a natural language answer. ## Reasoning Level Use `reasoning_level` to trade off speed against depth for a specific chat request. It is optional and defaults to `low`.
Accepted values are `minimal`, `low`, `medium`, `high`, and `max`. The reasoning level controls which model the request is routed to, the tools used by the agent, the thinking budget, the maximum tool-iteration count, and output token limits. | Level | When to use | Notes | | --------- | ----------------------------------- | ----------------------------------------------------------- | | `minimal` | Fast factual lookups | Smallest prefetch window and minimal tools for lower cost. | | `low` | Default balance | Standard tool set and budgets. | | `medium` | Multi-step or ambiguous questions | Calls fewer tools than `low`, but thinks harder and longer. | | `high` | Complex synthesis across sources | Thinks like `medium`, but uses more tools. | | `max` | Deep research, most complex queries | Highest thinking budget, max iterations. | ```python Python theme={null} query = "Summarize the user's long-term goals." answer = peer.chat(query, reasoning_level="high") ``` ```typescript TypeScript theme={null} const query = "Summarize the user's long-term goals."; const answer = await peer.chat(query, { reasoningLevel: "high" }); ``` ## Streaming Responses For longer answers, use streaming to get incremental responses: ```python Python theme={null} query = "What do we know about the user?" response_stream = peer.chat(query, stream=True) for chunk in response_stream.iter_text(): print(chunk, end="", flush=True) ``` ```typescript TypeScript theme={null} const query = "What do we know about the user?"; const responseStream = await peer.chat(query, { stream: true }); for await (const chunk of responseStream.iter_text()) { process.stdout.write(chunk); } ``` Streaming is useful for displaying real-time responses in chat interfaces or when asking complex questions that require longer answers. ## Integration Patterns ### Dynamic Prompt Enhancement Let your LLM decide what it needs to know, then inject that context into the next generation: ```python Python theme={null} # Your LLM generates a query based on the conversation llm_query = "Does the user prefer formal or casual communication?" # Get answer from Honcho context = peer.chat(llm_query) # Add to your next LLM prompt enhanced_prompt = f""" Context about the user: {context} User message: {user_input} Respond appropriately based on the context. """ ``` ```typescript TypeScript theme={null} // Your LLM generates a query based on the conversation const llmQuery = "Does the user prefer formal or casual communication?"; // Get answer from Honcho const context = await peer.chat(llmQuery); // Add to your next LLM prompt const enhancedPrompt = ` Context about the user: ${context} User message: ${userInput} Respond appropriately based on the context. 
`; ``` ### Conditional Logic Use chat endpoint responses to drive application logic: ```python Python theme={null} # Check if user has completed onboarding onboarding_status = peer.chat("Has the user completed the onboarding flow?") if "yes" in onboarding_status.lower(): # Show main interface pass else: # Show onboarding pass ``` ```typescript TypeScript theme={null} // Check if user has completed onboarding const onboardingStatus = await peer.chat("Has the user completed the onboarding flow?"); if (onboardingStatus.toLowerCase().includes("yes")) { // Show main interface } else { // Show onboarding } ``` ### Preference Extraction Extract specific preferences for personalization: ```python Python theme={null} # Get multiple insights tone = peer.chat("What tone does the user prefer in responses?") expertise = peer.chat("What is the user's level of technical expertise?") goals = peer.chat("What are the user's main goals or objectives?") # Use these to configure your agent's behavior ``` ```typescript TypeScript theme={null} // Get multiple insights const tone = await peer.chat("What tone does the user prefer in responses?"); const expertise = await peer.chat("What is the user's level of technical expertise?"); const goals = await peer.chat("What are the user's main goals or objectives?"); // Use these to configure your agent's behavior ``` ## How Honcho Answers When you call `peer.chat(query)`: 1. Honcho searches through the peer's peer card and representation--conclusions drawn from reasoning over their messages 2. Retrieves conclusions semantically relevant to your query 3. Combines them with segments of source messages, if needed, to gather more context 4. Synthesizes them into a coherent natural language response to your query Honcho [reasoning](/docs/v3/documentation/core-concepts/reasoning) runs continuously in the background, processing new messages and updating representations. The chat endpoint always has access to Honcho's latest conclusions about the peer. ## Best Practices ### Ask specific questions Instead of "Tell me about the user", ask "What communication style does the user prefer?" You'll get more actionable answers. ### Let your LLM formulate queries The chat endpoint shines when your LLM decides what it needs to know. This creates dynamic, context-aware personalization. If you're building an agent, an excellent way to achieve this is to expose the Honcho chat endpoint to it as just another tool. ### Use for runtime decisions Don't just use chat for LLM prompts - use it to drive application logic, routing, and feature flags based on user behavior. ### Combine with context() Use `context()` for conversation context and `peer.chat()` for specific insights. They complement each other.
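For instance, here's a minimal sketch combining the two (assuming `session`, `peer`, and `assistant` handles are set up as in the earlier examples; `context()` and `to_openai()` are covered on the Get Context page):

```python Python theme={null}
# Conversation context for the LLM's message history
context = session.context(tokens=2000)
messages = context.to_openai(assistant=assistant)

# A targeted insight to steer tone or behavior
tone = peer.chat("What tone does the user prefer in responses?")

# Feed both into your LLM call: `messages` as history, `tone` in the system prompt
```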
For more ideas on using the chat endpoint, see our [guides](/docs/v3/guides/overview). # Get Context Source: https://honcho.dev/docs/v3/documentation/features/get-context Learn how to use context() to retrieve and format conversation context for LLM integration The `context()` method is a powerful feature that retrieves formatted conversation context from sessions, making it easy to integrate with LLMs like OpenAI, Anthropic, and others. This guide covers everything you need to know about working with session context. By default, the context includes a blend of summary and messages ***which covers the entire session history***. Summaries are generated automatically at intervals, and the number of recent messages included depends on how many tokens the context is intended to hold. You can specify any token limit you want, and can disable summaries to fill that limit entirely with recent messages. To get representation data, you need to specify a target peer. ## Basic Usage The `context()` method is available on all Session objects and returns a `SessionContext` that contains the formatted conversation history. ```python Python theme={null} from honcho import Honcho # Initialize client and create session honcho = Honcho() session = honcho.session("conversation-1") # Get basic context (not very useful before adding any messages!) context = session.context() ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; (async () => { // Initialize client and create session const honcho = new Honcho({}); const session = await honcho.session("conversation-1"); // Get basic context (not very useful before adding any messages!) const context = await session.context(); })(); ``` ## Context Parameters The `context()` method accepts several optional parameters to customize the retrieved context: ### Token Limits Control the size of the context by setting a maximum token count: ```python Python theme={null} # Limit context to 1500 tokens context = session.context(tokens=1500) # Limit context to 3000 tokens for larger conversations context = session.context(tokens=3000) ``` ```typescript TypeScript theme={null} (async () => { // Limit context to 1500 tokens const context = await session.context({ tokens: 1500 }); // Limit context to 3000 tokens for larger conversations const largerContext = await session.context({ tokens: 3000 }); })(); ``` ### Summary Mode Summary mode (on by default) returns a condensed version of older conversation history alongside recent messages: ```python Python theme={null} # Get context with summary enabled -- will contain both summary and messages context = session.context(summary=True) # Combine summary=False with token limits to get more messages context = session.context(summary=False, tokens=2000) ``` ```typescript TypeScript theme={null} (async () => { // Get context with summary enabled -- will contain both summary and messages const context = await session.context({ summary: true }); // Combine summary: false with token limits to get more messages const messagesContext = await session.context({ summary: false, tokens: 2000 }); })(); ``` ### Peer Representation in Context You can include a peer's [representation](/docs/v3/documentation/core-concepts/representation) and peer card in the context by specifying `peer_target`. This is useful for providing the LLM with knowledge about a specific peer.
```python Python theme={null} # Get context with peer representation included context = session.context( tokens=2000, peer_target="user-123" # Include representation of user-123 ) # Access the representation and peer card print(context.peer_representation) # String representation print(context.peer_card) # List of peer card items # Get representation from a specific peer's perspective context = session.context( tokens=2000, peer_target="user-123", peer_perspective="assistant" # From assistant's viewpoint ) ``` ```typescript TypeScript theme={null} (async () => { // Get context with peer representation included const context = await session.context({ tokens: 2000, peerTarget: "user-123" // Include representation of user-123 }); // Access the representation and peer card console.log(context.peerRepresentation); // String representation console.log(context.peerCard); // Array of peer card items // Get representation from a specific peer's perspective const perspectiveContext = await session.context({ tokens: 2000, peerTarget: "user-123", peerPerspective: "assistant" // From assistant's viewpoint }); })(); ``` ### Semantic Search Use `search_query` to fetch semantically relevant conclusions based on a query string (requires `peer_target`): ```python Python theme={null} context = session.context( tokens=2000, peer_target="user-123", search_query="What are my coding preferences?", search_top_k=10, # Number of relevant conclusions to fetch search_max_distance=0.8, # Max semantic distance (0.0-1.0) include_most_frequent=True, # Include most frequent conclusions max_conclusions=25 # Cap total conclusions ) ``` ```typescript TypeScript theme={null} (async () => { const context = await session.context({ tokens: 2000, peerTarget: "user-123", representationOptions: { searchQuery: "What are my coding preferences?", searchTopK: 10, // Number of relevant conclusions to fetch searchMaxDistance: 0.8, // Max semantic distance (0.0-1.0) includeMostFrequent: true, // Include most frequent conclusions maxConclusions: 25 // Cap total conclusions } }); })(); ``` ### Session-Scoped Representations Use `limit_to_session` to only include conclusions from the current session: ```python Python theme={null} # Get context limited to this session's conclusions only context = session.context( tokens=2000, peer_target="user-123", limit_to_session=True # Only conclusions from this session ) ``` ```typescript TypeScript theme={null} (async () => { // Get context limited to this session's conclusions only const context = await session.context({ tokens: 2000, peerTarget: "user-123", limitToSession: true // Only conclusions from this session }); })(); ``` ### All Parameters Reference | Parameter | Type | Description | | ----------------------- | ------- | ------------------------------------------------- | | `summary` | `bool` | Include summary in context (default: true) | | `tokens` | `int` | Maximum tokens to include | | `peer_target` | `str` | Peer ID to include representation for | | `peer_perspective` | `str` | Peer ID for perspective (requires peer\_target) | | `search_query` | `str` | Query for semantic search (requires peer\_target) | | `limit_to_session` | `bool` | Limit to session conclusions only | | `search_top_k` | `int` | Semantic search results to include (1-100) | | `search_max_distance` | `float` | Max semantic distance (0.0-1.0) | | `include_most_frequent` | `bool` | Include most frequent conclusions | | `max_conclusions` | `int` | Maximum conclusions to include (1-100) | ## Converting to LLM Formats The 
`SessionContext` object provides methods to convert the context into formats compatible with popular LLM APIs. When converting to OpenAI format, you must specify the assistant peer so that its messages are mapped to the `assistant` role and all other peers' messages to the `user` role. ### OpenAI Format Convert context to OpenAI's chat completion format: ```python Python theme={null} # Create peers alice = honcho.peer("alice") assistant = honcho.peer("assistant") # Add some conversation session.add_messages([ alice.message("What's the weather like today?"), assistant.message("It's sunny and 75°F outside!") ]) # Get context and convert to OpenAI format context = session.context() openai_messages = context.to_openai(assistant=assistant) # The messages are now ready for OpenAI API print(openai_messages) # [ # {"role": "user", "content": "What's the weather like today?"}, # {"role": "assistant", "content": "It's sunny and 75°F outside!"} # ] ``` ```typescript TypeScript theme={null} (async () => { // Create peers const alice = await honcho.peer("alice"); const assistant = await honcho.peer("assistant"); // Add some conversation await session.addMessages([ alice.message("What's the weather like today?"), assistant.message("It's sunny and 75°F outside!") ]); // Get context and convert to OpenAI format const context = await session.context(); const openaiMessages = context.toOpenAI(assistant); // The messages are now ready for OpenAI API console.log(openaiMessages); // [ // {"role": "user", "content": "What's the weather like today?"}, // {"role": "assistant", "content": "It's sunny and 75°F outside!"} // ] })(); ``` ### Anthropic Format Convert context to Anthropic's Claude format: ```python Python theme={null} # Get context and convert to Anthropic format context = session.context() anthropic_messages = context.to_anthropic(assistant=assistant) # Ready for Anthropic API print(anthropic_messages) ``` ```typescript TypeScript theme={null} (async () => { // Get context and convert to Anthropic format const context = await session.context(); const anthropicMessages = context.toAnthropic(assistant); // Ready for Anthropic API console.log(anthropicMessages); })(); ``` ## Complete LLM Integration Examples ### Using with OpenAI ```python Python theme={null} import openai from honcho import Honcho # Initialize clients honcho = Honcho() openai_client = openai.OpenAI() # Set up conversation session = honcho.session("support-chat") user = honcho.peer("user-123") assistant = honcho.peer("support-bot") # Add conversation history session.add_messages([ user.message("I'm having trouble with my account login"), assistant.message("I can help you with that. What error message are you seeing?"), user.message("It says 'Invalid credentials' but I'm sure my password is correct") ]) # Get context for LLM messages = session.context(tokens=2000).to_openai(assistant=assistant) # Add new user message and get AI response messages.append({ "role": "user", "content": "Can you reset my password?"
}) response = openai_client.chat.completions.create( model="gpt-4", messages=messages ) # Add AI response back to session session.add_messages([ user.message("Can you reset my password?"), assistant.message(response.choices[0].message.content) ]) ``` ```typescript TypeScript theme={null} import OpenAI from 'openai'; import { Honcho } from "@honcho-ai/sdk"; (async () => { // Initialize clients const honcho = new Honcho({}); const openai = new OpenAI(); // Set up conversation const session = await honcho.session("support-chat"); const user = await honcho.peer("user-123"); const assistant = await honcho.peer("support-bot"); // Add conversation history await session.addMessages([ user.message("I'm having trouble with my account login"), assistant.message("I can help you with that. What error message are you seeing?"), user.message("It says 'Invalid credentials' but I'm sure my password is correct") ]); // Get context for LLM const messages = (await session.context({ tokens: 2000 })).toOpenAI(assistant); // Add new user message and get AI response const response = await openai.chat.completions.create({ model: "gpt-4", messages: [ ...messages, { role: "user", content: "Can you reset my password?" } ] }); // Add AI response back to session await session.addMessages([ user.message("Can you reset my password?"), assistant.message(response.choices[0].message.content) ]); })(); ``` ### Multi-Turn Conversation Loop ```python Python theme={null} def chat_loop(): """Example of a continuous chat loop using context()""" session = honcho.session("chat-session") user = honcho.peer("user") assistant = honcho.peer("ai-assistant") while True: # Get user input user_input = input("You: ") if user_input.lower() in ['quit', 'exit']: break # Add user message to session session.add_messages([user.message(user_input)]) # Get conversation context context = session.context(tokens=2000) messages = context.to_openai(assistant=assistant) # Get AI response response = openai_client.chat.completions.create( model="gpt-4", messages=messages ) ai_response = response.choices[0].message.content print(f"Assistant: {ai_response}") # Add AI response to session session.add_messages([assistant.message(ai_response)]) # Start the chat loop chat_loop() ``` ```typescript TypeScript theme={null} (async () => { async function chatLoop() { const session = await honcho.session("chat-session"); const user = await honcho.peer("user"); const assistant = await honcho.peer("ai-assistant"); // This would be replaced with actual user input handling in a real app const userInputs = [ "Hello, how are you?", "What's the weather like?", "Tell me a joke" ]; for (const userInput of userInputs) { console.log(`You: ${userInput}`); // Add user message to session await session.addMessages([user.message(userInput)]); // Get conversation context const context = await session.context({ tokens: 2000 }); const messages = context.toOpenAI(assistant); // Get AI response const response = await openai.chat.completions.create({ model: "gpt-4", messages: messages }); const aiResponse = response.choices[0].message.content; console.log(`Assistant: ${aiResponse}`); // Add AI response to session await session.addMessages([assistant.message(aiResponse)]); } } // Start the chat loop await chatLoop(); })(); ``` ## Advanced Context Usage ### Context with Summaries for Long Conversations For very long conversations, use summaries to maintain context while controlling token usage: ```python Python theme={null} # For long conversations, use summary mode long_session = 
honcho.session("long-conversation") # Get summarized context to fit within token limits context = long_session.context(summary=True, tokens=1500) messages = context.to_openai(assistant=assistant) # This will include a summary of older messages and recent full messages print(f"Context contains {len(messages)} formatted messages") ``` ```typescript TypeScript theme={null} (async () => { // For long conversations, use summary mode const longSession = await honcho.session("long-conversation"); // Get summarized context to fit within token limits const context = await longSession.context({ summary: true, tokens: 1500 }); const messages = context.toOpenAI(assistant); // This will include a summary of older messages and recent full messages console.log(`Context contains ${messages.length} formatted messages`); })(); ``` ### Context for Different Assistant Types You can get context formatted for different types of assistants in the same session: ```python Python theme={null} # Create different assistant peers chatbot = honcho.peer("chatbot") analyzer = honcho.peer("data-analyzer") moderator = honcho.peer("moderator") # Get context formatted for each assistant type chatbot_context = session.context().to_openai(assistant=chatbot) analyzer_context = session.context().to_openai(assistant=analyzer) moderator_context = session.context().to_openai(assistant=moderator) # Each context will format the conversation from that assistant's perspective ``` ```typescript TypeScript theme={null} (async () => { // Create different assistant peers const chatbot = await honcho.peer("chatbot"); const analyzer = await honcho.peer("data-analyzer"); const moderator = await honcho.peer("moderator"); // Get context formatted for each assistant type const context = await session.context(); const chatbotContext = context.toOpenAI(chatbot); const analyzerContext = context.toOpenAI(analyzer); const moderatorContext = context.toOpenAI(moderator); // Each context will format the conversation from that assistant's perspective })(); ``` ## Best Practices ### 1. Token Management Always set appropriate token limits to control costs and ensure context fits within LLM limits: ```python Python theme={null} # Good: Set reasonable token limits based on your model context = session.context(tokens=3000) # For GPT-4 context = session.context(tokens=1500) # For smaller models # Good: Use summaries for very long conversations context = session.context(summary=True, tokens=2000) ``` ```typescript TypeScript theme={null} (async () => { // Good: Set reasonable token limits based on your model const gpt4Context = await session.context({ tokens: 3000 }); // For GPT-4 const smallModelContext = await session.context({ tokens: 1500 }); // For smaller models // Good: Use summaries for very long conversations const summaryContext = await session.context({ summary: true, tokens: 2000 }); })(); ``` ### 2. Context Caching For applications with frequent context retrieval, consider caching context when appropriate: ```python Python theme={null} # Cache context for multiple LLM calls within the same request context = session.context(tokens=2000) openai_messages = context.to_openai(assistant=assistant) anthropic_messages = context.to_anthropic(assistant=assistant) # Use the same context object for multiple format conversions ``` ```typescript TypeScript theme={null} (async () => { // Cache context for multiple LLM calls within the same request const context = await session.context({ tokens: 2000 }); const openaiMessages = context.toOpenAI(assistant); const anthropicMessages = context.toAnthropic(assistant); // Use the same context object for multiple format conversions })(); ``` ### 3. Error Handling Always handle potential errors when retrieving context: ```python Python theme={null} try: context = session.context(tokens=2000) except Exception as e: print(f"Error getting context: {e}") # Handle error appropriately (fallback to basic context, retry, etc.) ``` ```typescript TypeScript theme={null} (async () => { try { const context = await session.context({ tokens: 2000 }); } catch (error) { console.error(`Error getting context: ${error}`); // Handle error appropriately (fallback to basic context, retry, etc.) } })(); ``` ## Conclusion The `context()` method is essential for integrating Honcho sessions with LLMs. By understanding how to: * Retrieve context with appropriate parameters * Convert context to LLM-specific formats * Manage token limits and summaries * Handle multi-turn conversations You can build sophisticated AI applications that maintain conversation history and context across interactions while integrating seamlessly with popular LLM providers. # Storing Data Source: https://honcho.dev/docs/v3/documentation/features/storing-data Store Data in Honcho to Generate Memories and Insights The most basic building block of Honcho's data model is the `Message` object. A `Message` is sent by a `Peer` and saved in a `Session`. ```python Python theme={null} from honcho import Honcho honcho = Honcho() peer = honcho.peer("sample-peer") session = honcho.session("sample-session") message = peer.message("Hello, world!") session.add_messages([message]) ``` ```typescript TypeScript theme={null} import { Honcho } from '@honcho-ai/sdk'; const honcho = new Honcho({}); const peer = await honcho.peer('sample-peer'); const session = await honcho.session('sample-session'); const message = peer.message('Hello, world!'); await session.addMessages([message]); ``` Once a `Message` is saved in Honcho, it will kick off a background task that looks at the new data to generate insights about the `Peer` that sent the `Message`. This is the default behavior of Honcho and can be turned off by [configuring the Peer or Session](/docs/v3/documentation/features/advanced/reasoning-configuration). This pattern of having a Peer, Session, and Messages is highly flexible and works for many different use cases and agent setups. Some use cases may only need a single Peer, but many Sessions. Others will only use a single `Session` for their entire app. These are flexible components that work in any situation. ## Chat Bots A common use case for Honcho is to build a chatbot like ChatGPT or Claude. In this case you can simply: * Make a `Peer` for the User * Make a `Peer` for the AI Then you can make a `Session` for each thread of conversation and save `Messages` from the user and assistant in each turn of conversation, as in the sketch below.
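Here's a minimal illustration of that pattern in Python (the peer and session identifiers are hypothetical; the calls are the same ones shown above):

```python Python theme={null}
from honcho import Honcho

honcho = Honcho()
user = honcho.peer("user-123")
ai = honcho.peer("assistant")

# One session per conversation thread
session = honcho.session("thread-456")

# Each turn, save both sides of the exchange
session.add_messages([
    user.message("How do I reset my password?"),
    ai.message("Go to Settings > Security and choose 'Reset password'."),
])
```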
# Honcho Overview Source: https://honcho.dev/docs/v3/documentation/introduction/overview Honcho is an open source memory library with a managed service for building stateful agents. Use it with any model, framework, or architecture. It enables agents to build and maintain state about any entity--users, agents, groups, ideas, and more. And because it's a continual learning system, it understands entities that change over time. Using Honcho as your memory system will earn your agents higher retention, more trust, and help you build data moats to out-compete incumbents. Honcho has defined the Pareto Frontier of Agent Memory. Watch the [video](https://x.com/honchodotdev/status/2002090546521911703?s=20), check out our [evals page](https://evals.honcho.dev/), and read the [blog post](https://blog.plasticlabs.ai/research/Benchmarking-Honcho) for more detail. Sign up and start building with Honcho Build your first stateful agent in minutes ## Why Use Honcho? Honcho streamlines the agent-building process by offering elegant, flexible primitives for managing context. It also reasons over that context to give developers access to far richer insights only accessible through reasoning. Take the following scenario: * You find a use case for LLMs and build an agent around it * It works well initially but can't maintain context across sessions * You spend weeks engineering a RAG solution that seems to help * Then the cycle begins... * Users report the agent forgetting things, contradicting itself, or losing context mid-session * You build evals to quantify the problem * You re-engineer your entire RAG pipeline with better chunking, embeddings, retrieval strategies * The problems shift but don't disappear * Repeat Eventually you realize the issue isn't engineering--it's that you're not extracting all the latent information from your data. You need to reason exhaustively, handle contradictions, track patterns over time, and maintain coherent state. In other words, you'd need to build Honcho. Break free from this cycle. Honcho is a general solution to context engineering, memory, and statefulness. ## How Honcho Works Honcho is a memory system that reasons. Read more on the approach [here](https://blog.plasticlabs.ai/blog/Memory-as-Reasoning). Honcho has four storage primitives that work together: ```mermaid theme={null} graph LR W[Workspaces] -->|have| P[Peers] W -->|have| S[Sessions] S -->|have| SM[Messages] P <-.->|many-to-many| S style W fill:#B6DBFF,stroke:#333,color:#000 style P fill:#B6DBFF,stroke:#333,color:#000 style S fill:#B6DBFF,stroke:#333,color:#000 style SM fill:#B6DBFF,stroke:#333,color:#000 ``` * **Workspaces** - Top-level containers that isolate different applications or environments * **Peers** - Any entity that persists but changes over time (users, agents, objects, and more) * **Sessions** - Interaction threads between peers with temporal boundaries * **Messages** - Units of data that trigger reasoning (conversations, events, activity, documents, and more) When you write messages to Honcho, they're stored and processed in the background. Custom reasoning models perform formal logical [*reasoning*](/docs/v3/documentation/core-concepts/reasoning) to generate conclusions about each peer.
These conclusions are stored as [*representations*](/docs/v3/documentation/core-concepts/representation) that you can query to provide rich context for your agents. *(Figure: Honcho Architecture)* The diagram shows the flow: agents write messages to Honcho, which triggers reasoning that updates what's stored in representations. Developers (or agents) can then query to get additional context for their next response. ## Why Reasoning? Traditional RAG systems retrieve what was explicitly said, but they miss what matters most—the insights only accessible by *rigorously thinking* about your data. Without reasoning, you're leaving latent information on the table. Static retrieval can't surface implicit connections, struggles when new information contradicts old data, and fails when you need to make predictions under uncertainty. Honcho uses formal logic to extract all that latent information. This reasoning is AI-native—it performs the rigorous, compute-intensive thinking that humans struggle with, instantly and consistently. The result is memory that goes beyond simple RAG recall to provide exhaustive context for statefulness. ## Get Started Honcho gives you maximum control over your agent's context and memory. The data model is flexible and composable, the reasoning backend is powerful yet cost-effective, and everything is built to give developers levers to manage token usage, latency, and reasoning depth. We're just scratching the surface. Dive into the quickstart to see Honcho in action, explore the architecture to understand how it all fits together, or jump straight to building. Welcome to Honcho. We're excited to have you at the frontier of AI with us 🫡. Sign up for the Honcho platform and get your API key Build your first stateful agent in minutes Deep dive into how Honcho's primitives fit together Learn how Honcho reasons about data to build memory # Quickstart Source: https://honcho.dev/docs/v3/documentation/introduction/quickstart Let's get started with Honcho. In this quickstart, you will: * Set up a workspace with peers (user and assistant) * Ingest messages across multiple sessions * Query the reasoning Honcho produces to get synthesized insights about the user Running the code below requires an API key. Create an account and get your API key at [app.honcho.dev](https://app.honcho.dev) under "API KEYS". Every new tenant gets \$100.00 in free credits on sign up. The code below costs \~\$0.04 to run, so don't worry--still plenty of free credits for iterating. #### 1. Install the SDK ```bash Python (uv) theme={null} uv add honcho-ai ``` ```bash Python (pip) theme={null} pip install honcho-ai ``` ```bash TypeScript (npm) theme={null} npm install @honcho-ai/sdk ``` ```bash TypeScript (yarn) theme={null} yarn add @honcho-ai/sdk ``` ```bash TypeScript (pnpm) theme={null} pnpm add @honcho-ai/sdk ``` #### 2. Initialize the Client The Honcho client is the main entry point for interacting with Honcho's API. It uses a workspace called `default` unless specified, so let's create a `first-honcho-test` workspace for this quickstart. ```python Python theme={null} from honcho import Honcho # Initialize client honcho = Honcho(workspace_id="first-honcho-test", api_key=HONCHO_API_KEY) ``` ```typescript TypeScript theme={null} import { Honcho } from '@honcho-ai/sdk'; // Initialize client const honcho = new Honcho({ workspaceId: "first-honcho-test", apiKey: HONCHO_API_KEY }); ``` #### 3.
Create Peers ```python Python theme={null} user = honcho.peer("user") assistant = honcho.peer("assistant") ``` ```typescript TypeScript theme={null} const user = await honcho.peer("user") const assistant = await honcho.peer("assistant") ``` #### 4. Add Messages to Sessions We've generated an example conversation dataset with 14 messages across 4 sessions. At a high level, the conversation contains a user chatting with an assistant to get help debugging software infrastructure problems for work *and* jam strategy on a side project they're working on. Spoiler alert--the user is way more interested in their side project. Create a file called `conversation.json` and add the content in the accordion below. Then we'll loop through the sessions and messages in that file and write them to Honcho. ```json Example Conversation theme={null} { "sessions": [ { "id": "session_1", "messages": [ { "role": "user", "content": "containers getting OOM killed in CI after someone on my team updated the base image. quick fix?" }, { "role": "assistant", "content": "OOM kills after a base image update usually means the new image has higher memory requirements. Quick fix: bump the memory limit in your CI config. If you need to unblock fast, you could also pin back to the previous base image version temporarily." } ] }, { "id": "session_2", "messages": [ { "role": "user", "content": "hey, been playing with honcho for a side project. do you know if there's a limit on how many users i can have?" }, { "role": "assistant", "content": "Honcho's free tier is based on context ingested, not number of users - so you can have as many users as you want, you're just limited by how much content you're processing. What are you building?" }, { "role": "user", "content": "ok so basically it's a personal finance assistant that actually remembers your situation - like it knows you're saving for a house, that you hate subscriptions, that you always regret impulse buys over $50. i've been thinking a lot about the UX, like when should it proactively bring up past context vs wait for you to ask? and how do you balance feeling \"known\" vs feeling surveilled? i have a whole notion doc on this lol" }, { "role": "assistant", "content": "That's a really thoughtful approach to the personalization problem! The \"known vs surveilled\" tension is exactly right - the best experiences feel like talking to a friend who remembers, not a system that tracks. Proactive context works best when it's in service of the user's stated goals." } ] }, { "id": "session_3", "messages": [ { "role": "user", "content": "do i need to update docker-compose too or just the CI yaml for that memory thing" }, { "role": "assistant", "content": "Just the CI yaml for the memory limit - docker-compose is separate and only matters if you're hitting the same OOM issue locally." }, { "role": "user", "content": "k. oh btw 3 friends are using the finance app now, they keep asking when they can pay me lol. one of them said it's the first budgeting thing that \"gets\" her" }, { "role": "assistant", "content": "That's awesome - \"when can I pay you\" is the best kind of validation. And that quote about it being the first thing that \"gets\" her is exactly the value prop of personalized memory. Sounds like you're onto something real." } ] }, { "id": "session_4", "messages": [ { "role": "user", "content": "if i wanted to charge like $5/mo for the finance app, would honcho's pricing still make sense at that price point? 
trying to figure out unit economics early" }, { "role": "assistant", "content": "You should be fine at that price point. Unlike other memory frameworks that charge you to access your own data, Honcho only charges for the reasoning layer - ingestion and inference. Storage and retrieval are free. So your unit economics are mostly about how often you're synthesizing insights, not how much history you're keeping." }, { "role": "user", "content": "ok interesting. wish i had more time to work on this" }, { "role": "assistant", "content": "Yeah, sounds like you've got something worth making time for. The early traction is real." } ] } ] } ``` ```python Python theme={null} import json # Load conversation data with open("conversation.json", "r") as f: data = json.load(f) # Process each session for session_data in data["sessions"]: session = honcho.session(session_data["id"]) session.add_peers([user, assistant]) # Add messages with correct roles messages = [] for msg in session_data["messages"]: if msg["role"] == "user": messages.append(user.message(msg["content"])) elif msg["role"] == "assistant": messages.append(assistant.message(msg["content"])) session.add_messages(messages) ``` ```typescript TypeScript theme={null} import * as fs from 'fs'; const data = JSON.parse(fs.readFileSync("conversation.json", "utf-8")); for (const sessionData of data.sessions) { const session = await honcho.session(sessionData.id); await session.addPeers([user, assistant]); const messages = sessionData.messages.map((msg: any) => msg.role === "user" ? user.message(msg.content) : assistant.message(msg.content) ); await session.addMessages(messages); } ``` #### 5. Query for Insights Now ask Honcho what it's learned--this is where the magic happens: ```python Python theme={null} response = user.chat("What should I know about this user? 3 sentences max") print(response) ``` ```typescript TypeScript theme={null} user.chat("What should I know about this user? 3 sentences max").then((response) => { console.log(response); }) ``` Honcho needs a short amount of time to process messages you write to it. There are several utilities to [check the status](/docs/v3/documentation/features/advanced/queue-status) of the queue. Honcho also offers numerous ways to query reasoning to fit latency needs: see the [Get Context](/docs/v3/documentation/features/get-context) page. The response will look something like this: > User is a personal finance app developer building a personalized finance assistant that's generating real demand (friends are already asking when they can pay). They're notably thoughtful about product design, carefully considering the UX balance between making users feel "known" versus "surveilled" when their app proactively surfaces remembered context like savings goals and spending regrets. They're business-minded and working through unit economics early, exploring a \$5/month subscription model with usage-based cost structure focused on insight generation frequency rather than data storage—though they wish they had more time to dedicate to the project. Honcho synthesizes signal by reasoning about the user to draw conclusions beyond what was explicitly stated. It identifies the user as "notably thoughtful about product design", "business-minded" from the discussion of unit economics, and surfaces the signal that they desire to work on the project more. This is rich personal context for domain-specific agents to do what they want with.
* A life coach agent might see "they wish they had more time to dedicate to the project" and "friends are already asking when they can pay" and ask "have you thought about what it would take to go full-time?" * A productivity agent might see the same pattern and say "let's protect your weekend time for the finance app." * A financial advisor agent might see it and ask "what runway would you need to make the leap?" Honcho acts almost like a detective--it reasons about new and existing evidence in order to form conclusions that can be used to make a *case*. These conclusions wait to be composed dynamically based on how you, the ~~judge~~ developer, query it. This approach is what drives our [pareto-frontier](https://evals.honcho.dev) performance on memory benchmarks, and our custom models allow us to optimize speed and cost. ## Next Steps You just saw how Honcho reasons about data to build rich peer representations. In this quickstart, you: * Set up a workspace with peers (user and assistant) * Ingested messages across multiple sessions * Queried the reasoning to get synthesized insights about the user Here's the full working code if you want to run it yourself: ```python Python theme={null} # uv sync # uv run python test.py import json import uuid from honcho import Honcho from dotenv import load_dotenv load_dotenv() # Initialize Honcho client with a unique workspace workspace_id = f"docs-example-{uuid.uuid4().hex[:8]}" honcho = Honcho(environment="production", workspace_id=workspace_id) # Create peers to represent the user and assistant user = honcho.peer("user") assistant = honcho.peer("assistant") # Load conversation data from JSON file with open("conversation.json", "r") as f: conversation_data = json.load(f) # Import historical conversation sessions for session_data in conversation_data["sessions"]: session = honcho.session(session_data["id"]) session.add_peers([user, assistant]) # Convert messages to peer messages with correct attribution messages = [] for msg in session_data["messages"]: if msg["role"] == "user": messages.append(user.message(msg["content"])) elif msg["role"] == "assistant": messages.append(assistant.message(msg["content"])) session.add_messages(messages) # Query insights about the user based on conversation history response = user.chat("What should I know about this user?
3 sentences max") print(response) ``` ```typescript Typescript theme={null} // npm install // npx ts-node test.ts import * as fs from 'fs'; import { randomUUID } from 'crypto'; import * as dotenv from 'dotenv'; import { Honcho } from '@honcho-ai/sdk'; dotenv.config(); // Initialize Honcho client with a unique workspace const workspaceId = `docs-example-${randomUUID().slice(0, 8)}`; const honcho = new Honcho({ environment: "production", workspaceId, }); // Create peers to represent the user and assistant const user = await honcho.peer("user"); const assistant = await honcho.peer("assistant"); // Load conversation data from JSON file const conversationData = JSON.parse(fs.readFileSync("conversation.json", "utf-8")); // Import historical conversation sessions for (const sessionData of conversationData.sessions) { const session = await honcho.session(sessionData.id); await session.addPeers([user, assistant]); // Convert messages to peer messages with correct attribution const messages = []; for (const msg of sessionData.messages) { if (msg.role === "user") { messages.push(user.message(msg.content)); } else if (msg.role === "assistant") { messages.push(assistant.message(msg.content)); } } await session.addMessages(messages); } // Query insights about the user based on conversation history const response = await user.chat("What should I know about this user? 3 sentences max"); console.log(response); ``` From here, you can explore how to use Honcho's features in your own applications: Learn how to fetch the right context for your agent's next response Deep dive into how Honcho's primitives fit together Chat with Honcho about your users Integration patterns and advanced use cases # Agentic Development Source: https://honcho.dev/docs/v3/documentation/introduction/vibecoding Agent skills, MCP server, and tools for building with Honcho ## MCP Server The fastest way to give any AI tool persistent memory is through the Honcho MCP server. It works with any client that supports the Model Context Protocol. **Get started in 2 minutes:** 1. Get an API key at [app.honcho.dev](https://app.honcho.dev) 2. Add the config for your client below 3. Restart your client See the [full MCP documentation](/docs/v3/guides/integrations/mcp) for all available tools, advanced configuration, and setup instructions for every supported client. ```json Claude Desktop theme={null} { "mcpServers": { "honcho": { "command": "npx", "args": [ "mcp-remote", "https://mcp.honcho.dev", "--header", "Authorization:${AUTH_HEADER}", "--header", "X-Honcho-User-Name:${USER_NAME}" ], "env": { "AUTH_HEADER": "Bearer hch-your-key-here", "USER_NAME": "YourName" } } } } ``` ```json Cursor theme={null} { "mcpServers": { "honcho": { "url": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } ``` ```bash Claude Code theme={null} claude mcp add honcho \ --transport http \ --url "https://mcp.honcho.dev" \ --header "Authorization: Bearer hch-your-key-here" \ --header "X-Honcho-User-Name: YourName" ``` *** ## CLI Inspect and debug a running Honcho deployment from your terminal. The honcho CLI wraps the Python SDK with agent-friendly defaults — JSON output, structured errors, and commands for every primitive (workspaces, peers, sessions, messages, conclusions). **Get started:** ```bash theme={null} uv tool install honcho-cli honcho init # configure apiKey + environmentUrl honcho doctor # verify connectivity ``` The CLI also ships an agent skill. 
Install it with `npx skills add plastic-labs/honcho` and pick `honcho-cli` from the list. See the [full CLI reference](/docs/v3/documentation/reference/cli) for all commands, flags, and environment variables. *** ## Claude Code Plugin Use Honcho to build with Honcho! The [plugin](/docs/v3/guides/integrations/claudecode) provides Claude Code persistent memory that survives context wipes and session restarts. ```bash theme={null} /plugin marketplace add plastic-labs/claude-honcho /plugin install honcho@honcho # Tools for Claude to use Honcho to manage its own context /plugin install honcho-dev@honcho # Skills to teach Claude how to integrate Honcho ``` The marketplace also includes all the agent skills below, so you can use `/honcho-dev:integrate` directly after installing. See the [full Claude Code integration guide](/docs/v3/guides/integrations/claudecode) for setup details. *** ## OpenCode Plugin The [OpenCode plugin](/docs/v3/guides/integrations/opencode) gives OpenCode sessions persistent memory that survives context wipes, session restarts, and fresh chats. ```bash theme={null} bunx @honcho-ai/opencode-honcho install ``` Then run `/honcho:setup` inside OpenCode. See the [full OpenCode integration guide](/docs/v3/guides/integrations/opencode) for setup details. *** ## Agent Skills We provide agent skills for coding assistants like Claude Code, OpenCode, Cursor, Windsurf, and others. ```bash Install via npx (Recommended) theme={null} npx skills add plastic-labs/honcho ``` ```bash Install as Claude Skill Manually theme={null} curl -o ~/.claude/skills/honcho-integration.md https://raw.githubusercontent.com/plastic-labs/honcho/main/docs/SKILL.md ``` ### Available Skills #### honcho-integration **For new integrations.** This skill helps you add Honcho to an existing Python or TypeScript codebase. It provides a guided, interactive experience: 1. **Explores your codebase** to understand your language, framework, and existing AI/LLM integrations 2. **Interviews you** about which entities should be peers, your preferred integration pattern, and session structure 3. **Implements the integration** based on your answers—installing the SDK, creating peers, configuring sessions, and wiring up the chat endpoint 4. **Verifies the setup** to ensure everything is configured correctly Invoke with `/honcho-integration` in your coding agent. #### honcho-cli **For inspection & debugging.** Teaches your coding agent the right commands and flags for the [honcho CLI](#cli) — peer memory, session context, queue status, dialectic quality. Invoke implicitly when you ask your agent to inspect a Honcho deployment. #### migrate-honcho-py / migrate-honcho-ts **For SDK upgrades.** Migrates code from v1.6.0 to v2.0.0 (required for Honcho 3.0.0+). Use when upgrading the SDK or seeing errors about removed APIs like `observations`, `Representation`, `.core`, or `get_config`. Both skills handle: terminology changes (`Observation` → `Conclusion`), `Representation` class removal, method renames, and streaming API updates. | Python | TypeScript | | ------------------------------- | -------------------------- | | `/migrate-honcho-py` | `/migrate-honcho-ts` | | `AsyncHoncho` → `.aio` accessor | `@honcho-ai/core` removal | | | `snake_case` → `camelCase` | *** ## Universal Starter Prompt ``` I want to start building with Honcho - an open source memory library for building stateful agents. 
## Honcho Resources **Documentation:** - Main docs: https://docs.honcho.dev - API Reference: https://docs.honcho.dev/v3/api-reference/introduction - Quickstart: https://docs.honcho.dev/v3/documentation/introduction/quickstart - Architecture: https://docs.honcho.dev/v3/documentation/core-concepts/architecture **Code & Examples:** - Core repo: https://github.com/plastic-labs/honcho - Python SDK: https://github.com/plastic-labs/honcho-python - TypeScript SDK: https://github.com/plastic-labs/honcho-node - CLI (inspect & debug a deployment): https://github.com/plastic-labs/honcho/tree/main/honcho-cli - Discord bot starter: https://github.com/plastic-labs/discord-python-starter - Telegram bot example: https://github.com/plastic-labs/telegram-python-starter **What Honcho Does:** Honcho is an open source memory library with a managed service for building stateful agents. It enables agents to build and maintain state about any entity--users, agents, groups, ideas, and more. Because it's a continual learning system, it understands entities that change over time. When you write messages to Honcho, they're stored and processed in the background. Custom reasoning models perform formal logical reasoning to generate conclusions about each peer. These conclusions are stored as representations that you can query to provide rich context for your agents. **Architecture Overview:** - Core primitives: Workspaces contain Peers (any entity that persists but changes) and Sessions (interaction threads between peers) - Peers can observe other peers in sessions (configurable with observe_me and observe_others) - Background reasoning processes messages to extract premises, draw conclusions, and build representations - Representations enable continuous improvement as new messages refine existing conclusions and scaffold new ones over time - Chat endpoint provides personalized responses based on learned context - Supports any LLM (OpenAI, Anthropic, open source) - Can use managed service or self-host Please assess the resources above and ask me relevant questions to help build a well-structured application using Honcho. Consider asking about: - What I'm trying to build - My technical preferences and stack - Whether I want to use the managed service or self-host - My experience level with the technologies involved - Specific features I need (multi-peer sessions, perspective-taking, streaming, etc.) Once you understand my needs, help me create a working implementation with proper memory and statefulness. ``` # CLI Reference Source: https://honcho.dev/docs/v3/documentation/reference/cli Command-line interface for Honcho — inspect workspaces, peers, sessions, and memory from your terminal ## Install ```bash uv (recommended) theme={null} uv tool install honcho-cli ``` ```bash uvx (ephemeral) theme={null} uvx honcho-cli ``` ## Quick Start ```bash theme={null} honcho init # confirm/set apiKey + Honcho URL in ~/.honcho/config.json honcho doctor # verify your config + connectivity honcho # show banner + command list ``` ## Configuration The CLI resolves config in this order: **flag → env var → config file → default**. | Value | File key | Env var | Flag | Persisted? 
| | ----------- | ---------------- | --------------------- | -------------------- | ---------- | | API key | `apiKey` | `HONCHO_API_KEY` | — | Yes | | API URL | `environmentUrl` | `HONCHO_BASE_URL` | — | Yes | | Workspace | — | `HONCHO_WORKSPACE_ID` | `-w` / `--workspace` | No | | Peer | — | `HONCHO_PEER_ID` | `-p` / `--peer` | No | | Session | — | `HONCHO_SESSION_ID` | `-s` / `--session` | No | | JSON output | — | `HONCHO_JSON` | `--json` | No | ### Persisted config The CLI shares `~/.honcho/config.json` with sibling Honcho tools. It owns only `apiKey` and `environmentUrl` at the top level — everything else (`hosts`, `sessions`, etc.) is written by other tools and left untouched on save. ```json theme={null} { "apiKey": "hch-v3-...", "environmentUrl": "https://api.honcho.dev", "hosts": { "claude_code": { "...": "..." } } } ``` Per-command scoping (workspace / peer / session) is handled via `-w` / `-p` / `-s` flags or `HONCHO_*` env vars. **Not** persisted as CLI defaults. This is deliberate: every invocation is explicit about what it operates on. ### Runtime overrides Workspace, peer, and session scoping are **per-command only** — pass flags or `HONCHO_*` env vars on every invocation. ```bash theme={null} # Per-command flags honcho peer card -w prod -p user # Or export once per shell export HONCHO_WORKSPACE_ID=prod export HONCHO_PEER_ID=user honcho peer card # One-off against a different server HONCHO_BASE_URL=http://localhost:8000 honcho workspace list # CI/CD — env vars only, no config file needed export HONCHO_API_KEY=hch-v3-xxx export HONCHO_BASE_URL=https://api.honcho.dev honcho workspace list ``` ## Output & exit codes Every command adapts its output to the context: * **TTY** — human-readable tables via Rich. * **Piped or redirected** — JSON automatically (detected via `isatty`). * **`--json` flag / `HONCHO_JSON=1`** — force JSON regardless of terminal. Collection commands emit JSON arrays; single-resource commands emit JSON objects. Errors are always structured: ```json theme={null} { "error": { "code": "PEER_NOT_FOUND", "message": "Peer 'abc' not found in workspace 'my-ws'", "details": {"workspace_id": "my-ws", "peer_id": "abc"} } } ``` | Exit code | Meaning | | --------- | -------------------------------------------- | | `0` | Success | | `1` | Client error (bad input, resource not found) | | `2` | Server error | | `3` | Auth error (missing or invalid API key) | CI pipelines and agent runtimes can branch on these without parsing stderr. ## Command reference ## honcho conclusion List, search, create, and delete peer conclusions (Honcho's memory atoms). Create a conclusion. ```bash theme={null} honcho conclusion create ``` Observer peer ID. Observed peer ID. Session context. Short alias: `-s`. Delete a conclusion. ```bash theme={null} honcho conclusion delete ``` Observer peer ID. Observed peer ID. Skip confirmation. Short alias: `-y`. List conclusions. ```bash theme={null} honcho conclusion list ``` Observer peer ID. Observed peer ID. Max results. Semantic search over conclusions. ```bash theme={null} honcho conclusion search ``` Observer peer ID. Observed peer ID. Max results. ## honcho config Inspect CLI configuration. ```bash theme={null} honcho config ``` ## honcho doctor Verify config and connectivity. Scope with -w / -p to check workspace, peer, and queue health. ```bash theme={null} honcho doctor ``` ## honcho help Show help message. ```bash theme={null} honcho help ``` ## honcho init Set API key and server URL in \~/.honcho/config.json. 
Press Enter to keep the current value or type a replacement. Workspace / peer / session scoping is per-command via -w / -p / -s or HONCHO\_\* env vars — never persisted.

```bash theme={null}
honcho init
```

API key (admin JWT). Honcho API URL (e.g. [https://api.honcho.dev](https://api.honcho.dev), [http://localhost:8000](http://localhost:8000)).

## honcho message

List, create, and get messages within a session.

Create a message in a session.

```bash theme={null}
honcho message create
```

Peer ID of the message sender. Short alias: `-p`. JSON metadata to associate with the message. Session ID. Short alias: `-s`.

Get a single message by ID.

```bash theme={null}
honcho message get
```

Session ID. Short alias: `-s`.

List messages in a session. Scoped to a peer with -p.

```bash theme={null}
honcho message list [<session>]
```

Number of recent messages. Show oldest first (default is newest first). Show only IDs, peer, token count, and created\_at (no content). Filter by peer ID. Short alias: `-p`.

## honcho peer

List, create, chat with, search, and manage peers and their representations.

Get raw peer card content.

```bash theme={null}
honcho peer card [<peer>]
```

Target peer for relationship card.

Query the dialectic about a peer.

```bash theme={null}
honcho peer chat
```

Target peer for perspective. Reasoning level: minimal, low, medium, high, max. Short alias: `-r`.

Create or get a peer.

```bash theme={null}
honcho peer create
```

Whether Honcho will form a representation of this peer. Negate with `--no-observe-me`. JSON metadata to associate with the peer.

Get metadata for a peer.

```bash theme={null}
honcho peer get-metadata [<peer>]
```

Inspect a peer: card, session count, recent conclusions.

```bash theme={null}
honcho peer inspect [<peer>]
```

List all peers in the workspace.

```bash theme={null}
honcho peer list
```

Get the formatted representation for a peer.

```bash theme={null}
honcho peer representation [<peer>]
```

Target peer to get representation about. Semantic search query to filter conclusions. Maximum number of conclusions to include.

Search a peer's messages.

```bash theme={null}
honcho peer search
```

Max results.

Set metadata for a peer.

```bash theme={null}
honcho peer set-metadata
```

Peer ID (uses default if omitted). Short alias: `-p`.

## honcho session

List, inspect, create, delete, and manage conversation sessions and their peers.

Add peers to a session.

```bash theme={null}
honcho session add-peers
```

Get session context (what an agent would see).

```bash theme={null}
honcho session context [<session>]
```

Token budget. Include summary. Negate with `--no-summary`.

Create or get a session.

```bash theme={null}
honcho session create
```

Comma-separated peer IDs to add to the session. JSON metadata to associate with the session.

Delete a session and all its data. Destructive — requires --yes or interactive confirm.

```bash theme={null}
honcho session delete [<session>]
```

Skip confirmation. Short alias: `-y`.

Get metadata for a session.

```bash theme={null}
honcho session get-metadata [<session>]
```

Inspect a session: peers, message count, summaries, config.

```bash theme={null}
honcho session inspect [<session>]
```

List sessions in the workspace.

```bash theme={null}
honcho session list
```

Filter by peer. Short alias: `-p`.

List peers in a session.

```bash theme={null}
honcho session peers [<session>]
```

Remove peers from a session.

```bash theme={null}
honcho session remove-peers
```

Get the representation of a peer within a session.

```bash theme={null}
honcho session representation [<session>]
```

Target peer (what peer\_id knows about target).
Semantic search query to filter conclusions. Maximum number of conclusions to include.

Search messages in a session.

```bash theme={null}
honcho session search [<session>]
```

Max results.

Set metadata for a session.

```bash theme={null}
honcho session set-metadata [<session>]
```

JSON metadata to set (e.g. '\{"key": "value"}'). Short alias: `-d`.

Get session summaries (short + long).

```bash theme={null}
honcho session summaries [<session>]
```

## honcho workspace

List, create, inspect, delete, and search workspaces.

Create or get a workspace.

```bash theme={null}
honcho workspace create
```

JSON metadata to associate with the workspace.

Delete a workspace. Use --dry-run first to see what will be deleted. Requires --yes to skip confirmation, or will prompt interactively. If sessions exist, requires --cascade to delete them first.

```bash theme={null}
honcho workspace delete
```

Skip confirmation prompt (for scripted/agent use). Short alias: `-y`. Delete all sessions before deleting the workspace. Show what would be deleted without deleting.

Inspect a workspace: peers, sessions, config.

```bash theme={null}
honcho workspace inspect [<workspace>]
```

List all accessible workspaces.

```bash theme={null}
honcho workspace list
```

Get queue processing status.

```bash theme={null}
honcho workspace queue-status
```

Filter by observer peer. Filter by sender peer.

Search messages across workspace.

```bash theme={null}
honcho workspace search
```

Max results.

## Workflows

### Inspect an unfamiliar workspace

When you pick up a workspace and need to orient — start broad, narrow to the peer and session you care about.

```bash theme={null}
honcho workspace inspect --json
honcho peer list --json
```

```bash theme={null}
honcho peer inspect --json
honcho peer card --json
```

```bash theme={null}
honcho conclusion list --observer <peer> --json
honcho conclusion search "topic" --observer <peer> --json
```

```bash theme={null}
honcho session inspect --json
honcho message list --last 20 --json
honcho session context --json
honcho session summaries --json
```

`honcho session context` shows exactly what an agent would receive at inference time — check it before `honcho peer chat` if a response surprises you.

### A peer isn't learning

If new messages aren't producing new conclusions, work down the diagnostic ladder.

```bash theme={null}
# Is observation enabled for this peer?
honcho peer inspect --json | jq '.configuration'

# Is the deriver actually processing?
honcho workspace queue-status --json

# Do any conclusions exist at all? Any for the expected topic?
honcho conclusion list --observer <peer> --json
honcho conclusion search "expected topic" --observer <peer> --json
```

### Session context looks wrong

When an agent's responses don't reflect what you expect it to know.

```bash theme={null}
honcho session context --json
honcho session summaries --json
honcho message list --last 50 --json
```

### Dialectic returns bad answers

When `honcho peer chat` or the dialectic API is hallucinating or missing context.

```bash theme={null}
# What does the peer card actually say?
honcho peer card --json

# Any conclusions for this topic?
honcho conclusion search "topic" --observer <peer> --json

# Reproduce the query against the CLI
honcho peer chat "what do you know about X?" --json
```

## Scripting & automation

Pipe commands into `jq` for inline transforms, or set `HONCHO_*` env vars for a CI/CD environment with no config file:

```bash theme={null}
# Pipe to jq
honcho peer list --json | jq '.[].id'
honcho workspace inspect --json | jq '.peers'

# Machine-parseable health check — exit code for CI, details for logs
honcho doctor --json

# CI/CD — env vars only, no ~/.honcho/config.json
export HONCHO_API_KEY=hch-v3-xxx
export HONCHO_BASE_URL=https://api.honcho.dev
honcho workspace list
```

Non-interactive onboarding:

```bash theme={null}
# Pre-seed via flags / env vars; init still prompts for anything missing
HONCHO_API_KEY=hch-v3-xxx honcho init --base-url https://api.honcho.dev
```

# The Honcho Dashboard

Source: https://honcho.dev/docs/v3/documentation/reference/platform

Build stateful agents without worrying about infrastructure

Start using the platform to manage Honcho instances for your workspace or app. The quickest way to begin using Honcho in production is with the [Honcho Cloud Service](https://app.honcho.dev). Sign up, generate an API key, and start building with Honcho.

## 1. Go to [app.honcho.dev](https://app.honcho.dev)

Create an account to start using Honcho. If a teammate already uses Honcho, ask them to invite you to their organization. Otherwise, you'll see a banner prompting you to create a new one.
Honcho Platform Dashboard
Once you've created an organization, you'll be taken to the dashboard and see the Welcome page with integration guidance and links to documentation.

Honcho Dashboard Getting Started

Each organization runs on dedicated infrastructure to isolate your workloads. Once you add a valid payment method on the [Billing](https://app.honcho.dev/billing) page, your instance will turn on.

## 2. Activate your Honcho instance

Navigate to the [Billing](https://app.honcho.dev/billing) page to add a payment method. Your Honcho instance provisions automatically, and you can monitor the deployment on the [Instance Status](https://app.honcho.dev/status) page until all systems show a green check mark.

Instance Status Page

You can also upgrade Honcho directly from the status page when new versions become available.
Upgrade Honcho
The **Performance** page provides comprehensive monitoring with usage metrics, health analytics, API response times, and endpoint usage across Honcho.

Performance Analytics Dashboard

## 3. Manage API Keys

The [API Keys](https://app.honcho.dev/api-keys) page allows you to create and manage authentication tokens for different environments. You can create admin-level keys with full instance access or scope keys to specific `Workspaces`, `Peers`, or `Sessions`.

API Key Management Dashboard

## 4. Test with API Playground

The [API Playground](https://app.honcho.dev/playground) provides a Postman-like interface to test queries, explore endpoints, and validate your integration. Authenticate with an API key and send requests directly to your Honcho instance with real-time responses and full request/response logging.

API Playground Interface

## 5. Workspaces

The [Explore](https://app.honcho.dev/explore) page provides comprehensive `Workspace` management where you can create workspaces and begin exploring the platform. Each `Workspace` serves as a container for organizing your Honcho data.

Workspace Table

Click into any workspace to access a general overview of `Peers` and `Sessions`. Here you can quickly create `Peers` and `Sessions`, and add multiple `Peers` to any `Session`. Edit the metadata and configuration for a `Workspace` with the Edit Config button. Click into any entity to navigate to its utilities page, or click the expand icon to view Workspace-wide `Peers` and `Sessions` data tables with more details.

Workspace Dashboard Overview

## 6. Peer Dashboard & Utilities

Expand the `Peers` list from the `Workspace` dashboard to see a detailed view of `Peers`.

Peer Dashboard

Click into any peer to navigate to its utilities page. Next to the `Peer` name you can edit the [Peer Configuration](/docs/v3/documentation/features/advanced/reasoning-configuration), and in the tabs below, explore all utilities for the `Peer`.

Peer Management Dashboard

Utilities include:

* **Message search** across all sessions for a `Peer`
* **Chat** to query `Peer` representations with an optional session scope (results vary based on the `Peer`'s configuration)

Chat Endpoint

* **Session logs** to view which `Sessions` the `Peer` is active in
* **Peer configuration and metadata management** including [Session-Peer Configuration](/docs/v3/documentation/features/advanced/reasoning-configuration#session-configuration)

Peer Management Dashboard

## 7. Session Dashboard & Utilities

Click into the sessions view within a workspace to see a table of all of your `Sessions` data.

Sessions Table

Click into a `Session` to open its utilities page.

Session Utilities

Here you can:

* **View and add Messages** within the `Session`; filter messages by `Peer`
* **Advanced search** across `Session` messages
* **Peer management** for adding/removing `Peers` and editing a `Peer`'s Session-level configuration
* **Get Context** to generate LLM-ready context with customizable token limits

Get Context

## 8. Webhooks Integration

The [Webhooks](https://app.honcho.dev/webhooks) page enables webhook creation and management.

Webhooks Dashboard

## 9. Organization Member Access

The [Members](https://app.honcho.dev/members) page provides organization administration tools to manage your team's access to Honcho, with the ability to grant admin permissions.

Members Dashboard

## Go Further

View the [Architecture](/docs/v3/documentation/core-concepts/architecture) to see how Honcho works under the hood.
Dive into our [API Reference](/docs/v3/api-reference) to explore all available endpoints. ## Next Steps Get started with managed Honcho instances Connect with 1000+ developers building with Honcho View our guidelines and explore the codebase See Honcho in action with real examples # SDK Reference Source: https://honcho.dev/docs/v3/documentation/reference/sdk Complete SDK documentation and examples for Python and TypeScript The Honcho SDKs provide ergonomic interfaces for building agentic AI applications with Honcho in Python and TypeScript/JavaScript. ## Installation ```bash Python (uv) theme={null} uv add honcho-ai ``` ```bash Python (pip) theme={null} pip install honcho-ai ``` ```bash TypeScript (npm) theme={null} npm install @honcho-ai/sdk ``` ```bash TypeScript (yarn) theme={null} yarn add @honcho-ai/sdk ``` ```bash TypeScript (pnpm) theme={null} pnpm add @honcho-ai/sdk ``` ## Quickstart Without configuration, the SDK defaults to the demo server. For production use: 1. Get your API key at [app.honcho.dev/api-keys](https://app.honcho.dev/api-keys) 2. Set `environment="production"` and provide your `api_key` ```python Python theme={null} from honcho import Honcho # Initialize client (using the default workspace) honcho = Honcho() # Create peers alice = honcho.peer("alice") assistant = honcho.peer("assistant") # Create a session for conversation session = honcho.session("conversation-1") # Add messages to conversation session.add_messages([ alice.message("What's the weather like today?"), assistant.message("It's sunny and 75°F outside!") ]) # Chat with Honcho about a peer response = alice.chat("What did the assistant tell this user about the weather?") # Get conversation context for LLM completions context = session.context() openai_messages = context.to_openai(assistant=assistant) ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; // Initialize client (using the default workspace) const honcho = new Honcho({}); // Create peers const alice = await honcho.peer("alice"); const assistant = await honcho.peer("assistant"); // Create a session for conversation const session = await honcho.session("conversation-1"); // Add messages to conversation await session.addMessages([ alice.message("What's the weather like today?"), assistant.message("It's sunny and 75°F outside!") ]); // Chat with Honcho about a peer const response = await alice.chat("What did the assistant tell this user about the weather?"); // Get conversation context for LLM completions const context = await session.context(); const openaiMessages = context.toOpenAI(assistant); ``` ## Core Concepts ### Peers and Representations **Representations** are how Honcho models what peers know. Each peer has a **global representation** (everything they know across all sessions) and **local representations** (what other specific peers know about them, scoped by session or globally). 
```python Python theme={null} # Query alice's global knowledge response = alice.chat("What does the user know about weather?") # Query what alice knows about the assistant (local representation) response = alice.chat("What does the user know about the assistant?", target=assistant) # Query scoped to a specific session response = alice.chat("What happened in our conversation?", session=session.id) ``` ```typescript TypeScript theme={null} // Query alice's global knowledge const response = await alice.chat("What does the user know about weather?"); // Query what alice knows about the assistant (local representation) const targetResponse = await alice.chat("What does the user know about the assistant?", { target: assistant }); // Query scoped to a specific session const sessionResponse = await alice.chat("What happened in our conversation?", { sessionId: session.id }); ``` ## Core Classes ### Honcho Client The main entry point for workspace operations: ```python Python theme={null} from honcho import Honcho # Basic initialization (uses environment variables) honcho = Honcho(workspace_id="my-app-name") # Full configuration honcho = Honcho( workspace_id="my-app-name", api_key="my-api-key", environment="production", # or "local", "demo" base_url="https://api.honcho.dev", timeout=30.0, max_retries=3 ) ``` ```typescript TypeScript theme={null} import { Honcho } from "@honcho-ai/sdk"; // Basic initialization (uses environment variables) const honcho = new Honcho({ workspaceId: "my-app-name" }); // Full configuration const honcho = new Honcho({ workspaceId: "my-app-name", apiKey: "my-api-key", environment: "production", // or "local", "demo" baseURL: "https://api.honcho.dev", timeout: 30000, maxRetries: 3, defaultHeaders: { "X-Custom-Header": "value" }, defaultQuery: { "param": "value" } }); ``` **Environment Variables:** * `HONCHO_API_KEY` - API key for authentication * `HONCHO_BASE_URL` - Base URL for the Honcho API * `HONCHO_WORKSPACE_ID` - Default workspace ID **Key Methods:** ```python Python theme={null} # Get or create a peer peer = honcho.peer(id) # Get or create a session session = honcho.session(id) # List all peers in workspace peers = honcho.peers() # List all sessions in workspace sessions = honcho.sessions() # Search across all content in workspace results = honcho.search(query) # Workspace metadata management metadata = honcho.get_metadata() honcho.set_metadata(dict) # Get list of all workspace IDs workspaces = honcho.workspaces() ``` ```typescript TypeScript theme={null} // Get or create a peer const peer = await honcho.peer(id); // Get or create a session const session = await honcho.session(id); // List all peers in workspace (returns Page) const peers = await honcho.peers(); // List with pagination and filtering const filtered = await honcho.peers({ filters: { metadata: { role: "user" } }, page: 1, size: 25, reverse: true }); // List all sessions in workspace (returns Page) const sessions = await honcho.sessions(); // Search across all content in workspace (returns Page) const results = await honcho.search(query); // Workspace metadata management const metadata = await honcho.getMetadata(); await honcho.setMetadata(metadata); // Get list of all workspace IDs const workspaces = await honcho.workspaces(); ``` `peer()` and `session()` always make a get-or-create API call, returning objects with cached metadata, configuration, and timestamps. 
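Because `peer()` and `session()` are get-or-create calls, they are idempotent: calling them again with the same ID returns the same underlying entity, with its metadata and timestamps already populated from the response. A minimal sketch (the IDs here are illustrative):

```python Python theme={null}
# First call creates the peer; later calls with the same ID fetch it
alice = honcho.peer("alice")
same_alice = honcho.peer("alice")

assert alice.id == same_alice.id
print(alice.created_at)  # cached from the get-or-create response
```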
### Peer

Represents an entity that can participate in conversations:

```python Python theme={null}
# Create peers (get-or-create API call)
alice = honcho.peer("alice")
assistant = honcho.peer("assistant")

# Create with immediate configuration
# This will make an API call to create the peer with the custom configuration and/or metadata
bob = honcho.peer("bob", config={"role": "user", "active": True}, metadata={"location": "NYC", "role": "developer"})

# Peer properties
print(f"Peer ID: {alice.id}")
print(f"Workspace: {alice.workspace_id}")
print(f"Created: {alice.created_at}")  # Available after API fetch

# Chat with peer's representations (supports streaming)
response = alice.chat("What did I have for breakfast?")
response = alice.chat("What do I know about Bob?", target="bob")
response = alice.chat("What happened in session-1?", session="session-1")
response = alice.chat("Summarize what matters most to me.", reasoning_level="high")

# Add content to a session with a peer
session = honcho.session("session-1")
session.add_messages([
    alice.message("I love Python programming"),
    alice.message("Today I learned about async programming"),
    alice.message("I prefer functional programming patterns")
])

# Get peer's sessions
sessions = alice.sessions()

# Search peer's messages
results = alice.search("programming")

# Metadata management
metadata = alice.get_metadata()
metadata["location"] = "Paris"
alice.set_metadata(metadata)

# Peer card management
card = alice.get_card()  # Get peer card
card = alice.get_card(target="bob")  # Get card about another peer
updated = alice.set_card(["Likes Python", "Lives in NYC"])  # Set peer card
updated = alice.set_card(["Works at Acme"], target="bob")  # Set card about another peer

# Get peer context (representation + peer card in one call)
context = alice.context()
context = alice.context(target="bob")  # What alice knows about bob

# Get working representation with semantic search
rep = alice.representation(search_query="preferences", search_top_k=10)

# Access conclusions
self_conclusions = alice.conclusions.list()  # Self-conclusions
bob_conclusions = alice.conclusions_of("bob").list()  # Conclusions of bob
```

```typescript TypeScript theme={null}
// Create peers (returns Promise)
const alice = await honcho.peer("alice");
const assistant = await honcho.peer("assistant");

// Peer properties
console.log(`Peer ID: ${alice.id}`);
console.log(`Created: ${alice.createdAt}`);  // Available after API fetch

// Chat with peer's representations (supports streaming)
const response = await alice.chat("What did I have for breakfast?");
const targetResponse = await alice.chat("What do I know about Bob?", { target: "bob" });
const sessionResponse = await alice.chat("What happened in session-1?", { sessionId: "session-1" });
const deeperResponse = await alice.chat("Summarize what matters most to me.", { reasoningLevel: "high" });

// Chat with streaming support
const streamResponse = await alice.chat("Tell me a story", { stream: true });

// Add content to a session with a peer
const session = await honcho.session("session-1");
await session.addMessages([
    alice.message("I love TypeScript programming"),
    alice.message("Today I learned about async programming"),
    alice.message("I prefer functional programming patterns")
]);

// Get peer's sessions
const sessions = await alice.sessions();

// Search peer's messages
const results = await alice.search("programming");

// Metadata management
const metadata = await alice.getMetadata();
await alice.setMetadata({ ...metadata, location: "Paris" });

// Peer
card management const card = await alice.getCard(); // Get peer card const targetCard = await alice.getCard("bob"); // Get card about another peer const updated = await alice.setCard(["Likes TypeScript", "Lives in NYC"]); // Set peer card const updatedTarget = await alice.setCard(["Works at Acme"], "bob"); // Set card about another peer // Get peer context (representation + peer card in one call) const context = await alice.context(); const targetContext = await alice.context({ target: "bob" }); // What alice knows about bob // Get working representation with semantic search const rep = await alice.representation({ searchQuery: "preferences", searchTopK: 10 }); // Access conclusions const selfConclusions = await alice.conclusions.list(); // Self-conclusions const bobConclusions = await alice.conclusionsOf("bob").list(); // Conclusions of bob ``` ### Peer Context The `context()` method on peers retrieves both the working representation and peer card in a single API call: ```python Python theme={null} # Get peer's own context context = alice.context() print(context.representation) # Working representation print(context.peer_card) # Peer card as list of strings # Get context about another peer (what alice knows about bob) bob_context = alice.context(target="bob") # Get context with semantic search context = alice.context( target="bob", search_query="work preferences", search_top_k=10, search_max_distance=0.8, include_most_frequent=True, max_conclusions=50 ) ``` ```typescript TypeScript theme={null} // Get peer's own context const context = await alice.context(); console.log(context.representation); // Working representation console.log(context.peerCard); // Peer card as array of strings // Get context about another peer (what alice knows about bob) const bobContext = await alice.context({ target: "bob" }); // Get context with semantic search const searchedContext = await alice.context({ target: "bob", searchQuery: "work preferences", searchTopK: 10, searchMaxDistance: 0.8, includeMostFrequent: true, maxConclusions: 50 }); ``` ### Peer Card The peer card contains stable biographical facts about a peer (name, preferences, background). Use `get_card()` / `getCard()` to retrieve it and `set_card()` / `setCard()` to overwrite it: ```python Python theme={null} # Get peer's own card card = alice.get_card() print(card) # ["Likes Python", "Lives in NYC", ...] # Get card about another peer (local representation) bob_card = alice.get_card(target="bob") # Set peer's own card updated = alice.set_card(["Likes Python", "Lives in NYC"]) # Set card about another peer updated = alice.set_card(["Works at Acme", "Enjoys hiking"], target="bob") ``` ```typescript TypeScript theme={null} // Get peer's own card const card = await alice.getCard(); console.log(card); // ["Likes TypeScript", "Lives in NYC", ...] // Get card about another peer (local representation) const bobCard = await alice.getCard("bob"); // Set peer's own card const updated = await alice.setCard(["Likes TypeScript", "Lives in NYC"]); // Set card about another peer const updatedBob = await alice.setCard(["Works at Acme", "Enjoys hiking"], "bob"); ``` Peer cards are automatically maintained by the dreaming agent during message processing. Use `set_card()` / `setCard()` when you need to manually override or seed the card — the peer will be created automatically if it doesn't already exist. 
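For example, if your application already collects profile details at signup, you might seed the card up front. A sketch, where the `signup` object and its fields are hypothetical stand-ins for your own user model:

```python Python theme={null}
# `signup` is a hypothetical object from your own signup flow
new_user = honcho.peer(f"user-{signup.id}")
new_user.set_card([
    f"Name: {signup.name}",
    f"Timezone: {signup.timezone}",
])
```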
### Conclusions Peers can access their conclusions (facts derived from messages) through the `conclusions` property and `conclusions_of()` method: ```python Python theme={null} # Access self-conclusions (what honcho knows about alice) self_conclusions = alice.conclusions # List self-conclusions conclusions_list = self_conclusions.list() # Search self-conclusions semantically results = self_conclusions.query("food preferences") # Delete a conclusion self_conclusions.delete("conclusion-id") # Access conclusions of another peer (what alice knows about bob) bob_conclusions = alice.conclusions_of("bob") bob_conclusions_list = bob_conclusions.list() bob_search = bob_conclusions.query("work history") ``` ```typescript TypeScript theme={null} // Access self-conclusions (what honcho knows about alice) const selfConclusions = alice.conclusions; // List self-conclusions const conclusionsList = await selfConclusions.list(); // Search self-conclusions semantically const results = await selfConclusions.query("food preferences"); // Delete a conclusion await selfConclusions.delete("conclusion-id"); // Access conclusions of another peer (what alice knows about bob) const bobConclusions = alice.conclusionsOf("bob"); const bobConclusionsList = await bobConclusions.list(); const bobSearch = await bobConclusions.query("work history"); ``` #### Creating Conclusions Manually You can also create conclusions directly, which is useful for importing data or adding explicit facts: ```python Python theme={null} # Create conclusions for what alice knows about bob bob_conclusions = alice.conclusions_of("bob") # Create a single conclusion created = bob_conclusions.create([ {"content": "User prefers dark mode", "session_id": "session-1"} ]) # Create multiple conclusions in batch created = bob_conclusions.create([ {"content": "User prefers dark mode", "session_id": "session-1"}, {"content": "User works late at night", "session_id": "session-1"}, {"content": "User enjoys programming", "session_id": "session-1"}, ]) # Returns list of created Conclusion objects with IDs for conclusion in created: print(f"Created conclusion: {conclusion.id} - {conclusion.content}") ``` ```typescript TypeScript theme={null} // Create conclusions for what alice knows about bob const bobConclusions = alice.conclusionsOf("bob"); // Create a single conclusion const created = await bobConclusions.create([ { content: "User prefers dark mode", sessionId: "session-1" } ]); // Create multiple conclusions in batch const batchCreated = await bobConclusions.create([ { content: "User prefers dark mode", sessionId: "session-1" }, { content: "User works late at night", sessionId: "session-1" }, { content: "User enjoys programming", sessionId: "session-1" }, ]); // Returns array of created Conclusion objects with IDs for (const conclusion of batchCreated) { console.log(`Created conclusion: ${conclusion.id} - ${conclusion.content}`); } ``` Manually created conclusions are marked as "explicit" and are treated the same as system-derived conclusions. Each conclusion must be tied to a session and the content length is validated against the embedding token limit. 
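This makes one-time imports straightforward. A sketch of backfilling facts from an existing profile store, where `legacy_facts` is a hypothetical export from your previous system:

```python Python theme={null}
# Hypothetical facts exported from a previous system
legacy_facts = ["Prefers dark mode", "Works in fintech"]

# Conclusions must be tied to a session, so use a dedicated import session
import_session = honcho.session("legacy-import")

assistant.conclusions_of("alice").create([
    {"content": fact, "session_id": import_session.id}
    for fact in legacy_facts
])
```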
### Session Manages multi-party conversations: ```python Python theme={null} # Create session (get-or-create API call) session = honcho.session("conversation-1") # Create with immediate configuration # This will make an API call to create the session with the custom configuration and/or metadata session = honcho.session("meeting-1", config={"type": "meeting", "max_peers": 10}) # Session properties print(f"Session ID: {session.id}") print(f"Workspace: {session.workspace_id}") print(f"Created: {session.created_at}") # Available after API fetch print(f"Active: {session.is_active}") # Available after API fetch # Peer management session.add_peers([alice, assistant]) session.add_peers([(alice, SessionPeerConfig(observe_others=True))]) session.set_peers([alice, bob, charlie]) # Replace all peers session.remove_peers([alice]) # Get session peers and their configurations peers = session.peers() peer_config = session.get_peer_configuration(alice) session.set_peer_configuration(alice, SessionPeerConfig(observe_me=False)) # Message management session.add_messages([ alice.message("Hello everyone!"), assistant.message("Hi Alice! How can I help today?") ]) # Get messages (with optional pagination) messages = session.messages() messages = session.messages(page=1, size=100, reverse=True) # Get a single message by ID message = session.get_message("message-id") # Get conversation context context = session.context(summary=True, tokens=2000) # Get context with peer representation included context = session.context( tokens=2000, peer_target="user", peer_perspective="assistant", search_query="What are my preferences?", limit_to_session=True, search_top_k=10, search_max_distance=0.8, include_most_frequent=True, max_conclusions=25 ) # Search session content results = session.search("help") # Working representation queries with semantic search global_rep = session.representation("alice") targeted_rep = session.representation(alice, target=bob) searched_rep = session.representation( "alice", search_query="preferences", search_top_k=10, include_most_frequent=True ) # Upload a file to create messages messages = session.upload_file( file=open("document.pdf", "rb"), peer="user", metadata={"source": "upload"}, created_at="2024-01-15T10:30:00Z" ) # Clone a session (creates a copy with all data) # Copies: messages, metadata, configuration, peers, and peer configurations cloned = session.clone() # Clone up to a specific message (inclusive) # Only messages up to and including the specified message are copied cloned_partial = session.clone(message_id="msg-123") # Delete session (async - returns 202) session.delete() # Metadata management session.set_metadata({"topic": "product planning", "status": "active"}) metadata = session.get_metadata() ``` ```typescript TypeScript theme={null} // Create session (returns Promise) const session = await honcho.session("conversation-1"); // Session properties console.log(`Session ID: ${session.id}`); console.log(`Created: ${session.createdAt}`); // Available after API fetch console.log(`Active: ${session.isActive}`); // Available after API fetch // Peer management await session.addPeers([alice, assistant]); await session.addPeers("single-peer-id"); await session.setPeers([alice, bob, charlie]); // Replace all peers await session.removePeers([alice]); await session.removePeers("single-peer-id"); // Get session peers const peers = await session.peers(); // Message management await session.addMessages([ alice.message("Hello everyone!"), assistant.message("Hi Alice! 
How can I help today?") ]);

// Get messages (with optional pagination)
const messages = await session.messages();
const paged = await session.messages({ page: 1, size: 100, reverse: true });

// Get a single message by ID
const message = await session.getMessage("message-id");

// Get conversation context
const context = await session.context({ summary: true, tokens: 2000 });

// Get context with peer representation included
const richContext = await session.context({
    tokens: 2000,
    peerTarget: "user",
    peerPerspective: "assistant",
    limitToSession: true,
    representationOptions: {
        searchQuery: "What are my preferences?",
        searchTopK: 10,
        searchMaxDistance: 0.8,
        includeMostFrequent: true,
        maxConclusions: 25
    }
});

// Search session content
const results = await session.search("help");

// Working representation queries with semantic search
const globalRep = await session.representation("alice");
const targetedRep = await session.representation(alice, { target: bob });
const searchedRep = await session.representation("alice", {
    searchQuery: "preferences",
    searchTopK: 10,
    includeMostFrequent: true
});

// Upload a file to create messages
const uploaded = await session.uploadFile(
    fileBuffer,
    "user",
    { metadata: { source: "upload" }, createdAt: "2024-01-15T10:30:00Z" }
);

// Clone a session (creates a copy with all data)
// Copies: messages, metadata, configuration, peers, and peer configurations
const cloned = await session.clone();

// Clone up to a specific message (inclusive)
// Only messages up to and including the specified message are copied
const clonedPartial = await session.clone("msg-123");

// Delete session (async - returns 202)
await session.delete();

// Metadata management
await session.setMetadata({ topic: "product planning", status: "active" });
const metadata = await session.getMetadata();
```

**Session-Level Theory of Mind Configuration:**

**Theory of Mind** controls whether peers can form models of what other peers think. Use `observe_others=False` to prevent a peer from modeling others within a session, and `observe_me=False` to prevent others from modeling this peer within a session.
```python Python theme={null}
from honcho.api_types import SessionPeerConfig

# Configure peer observation settings
config = SessionPeerConfig(
    observe_others=False,  # whether this peer forms theory-of-mind of other peers (default: False)
    observe_me=True  # whether other peers may form theory-of-mind of this peer (default: True)
)
session.add_peers([(alice, config)])
```

```typescript TypeScript theme={null}
// Configure peer observation settings
const config = new SessionPeerConfig({
    observeOthers: false,  // whether this peer forms theory-of-mind of other peers (default: false)
    observeMe: true  // whether other peers may form theory-of-mind of this peer (default: true)
});
await session.addPeers([[alice, config]]);  // pair the peer with its config, mirroring the Python tuple
```

### SessionContext

Provides formatted conversation context for LLM integration:

```python Python theme={null}
# Get session context
context = session.context(summary=True, tokens=1500)

# Convert to LLM-friendly formats
openai_messages = context.to_openai(assistant=assistant)
anthropic_messages = context.to_anthropic(assistant=assistant)
```

```typescript TypeScript theme={null}
// Get session context
const context = await session.context({ summary: true, tokens: 1500 });

// Convert to LLM-friendly formats
const openaiMessages = context.toOpenAI(assistant);
const anthropicMessages = context.toAnthropic(assistant);
```

The SessionContext object has the following structure:

```json theme={null}
{
  "id": "string",
  "messages": [
    {
      "id": "string",
      "content": "string",
      "peer_id": "string",
      "session_id": "string",
      "workspace_id": "string",
      "metadata": {},
      "created_at": "2024-01-15T10:30:00Z",
      "token_count": 42
    }
  ],
  "summary": {
    "content": "string",
    "message_id": 123,
    "summary_type": "short|long",
    "created_at": "2024-01-15T10:30:00Z"
  },
  "peer_representation": "string (optional)",
  "peer_card": ["string"] // optional, included when peer_target is provided
}
```

**Session Context Parameters:**

| Parameter | Type | Description |
| ------------------------------------------- | ------------------ | -------------------------------------------------- |
| `summary` | `bool` | Whether to include summary (default: true) |
| `tokens` | `int` | Maximum tokens to include |
| `peer_target` | `str` | Peer ID to get representation for |
| `peer_perspective` | `str` | Peer ID for perspective (requires peer\_target) |
| `limit_to_session` | `bool` | Limit representation to session only |
| `representationOptions.searchQuery` | `str` or `Message` | Query string or Message object for semantic search |
| `representationOptions.searchTopK` | `int` | Number of semantic search results (1-100) |
| `representationOptions.searchMaxDistance` | `float` | Max semantic distance (0.0-1.0) |
| `representationOptions.includeMostFrequent` | `bool` | Include most frequent conclusions |
| `representationOptions.maxConclusions` | `int` | Max conclusions to include (1-100) |

In TypeScript, the search options are nested under `representationOptions`; in Python, pass the equivalent `snake_case` keyword arguments (for example `search_query`, `search_top_k`) directly to `session.context()`.

## Advanced Usage

### Multi-Party Conversations

```python Python theme={null}
# Create multiple peers
users = [honcho.peer(f"user-{i}") for i in range(5)]
moderator = honcho.peer("moderator")

# Create group session
group_chat = honcho.session("group-discussion")
group_chat.add_peers(users + [moderator])

# Add messages from different peers
group_chat.add_messages([
    users[0].message("What's our agenda for today?"),
    moderator.message("We'll discuss the new feature roadmap"),
    users[1].message("I have some concerns about the timeline")
])

# Query different perspectives
user_perspective = users[0].chat("What are people's concerns?")
moderator_view = moderator.chat("What feedback am I getting?",
session=group_chat.id) ``` ```typescript TypeScript theme={null} // Create multiple peers const users = await Promise.all( Array.from({ length: 5 }, (_, i) => honcho.peer(`user-${i}`)) ); const moderator = await honcho.peer("moderator"); // Create group session const groupChat = await honcho.session("group-discussion"); await groupChat.addPeers([...users, moderator]); // Add messages from different peers await groupChat.addMessages([ users[0].message("What's our agenda for today?"), moderator.message("We'll discuss the new feature roadmap"), users[1].message("I have some concerns about the timeline") ]); // Query different perspectives const userPerspective = await users[0].chat("What are people's concerns?"); const moderatorView = await moderator.chat("What feedback am I getting?", { sessionId: groupChat.id }); ``` ### LLM Integration ```python Python theme={null} import openai # Get conversation context context = session.context(tokens=3000) messages = context.to_openai(assistant=assistant) # Call OpenAI API response = openai.chat.completions.create( model="gpt-4", messages=messages + [ {"role": "user", "content": "Summarize the key discussion points."} ] ) ``` ```typescript TypeScript theme={null} import OpenAI from 'openai'; const openai = new OpenAI(); // Get conversation context const context = await session.context({ tokens: 3000 }); const messages = context.toOpenAI(assistant); // Call OpenAI API const response = await openai.chat.completions.create({ model: "gpt-4", messages: [ ...messages, { role: "user", content: "Summarize the key discussion points." } ] }); ``` ### Custom Message Timestamps When creating messages, you can optionally specify a custom `created_at` timestamp instead of using the server's current time: ```bash theme={null} curl -X POST "https://api.honcho.dev/v3/workspaces/{workspace_id}/sessions/{session_id}/messages" \ -H "Authorization: Bearer $API_KEY" \ -H "Content-Type: application/json" \ -d '{ "messages": [ { "peer_id": "user123", "content": "This message happened yesterday", "created_at": "2024-01-01T12:00:00Z", "metadata": {"source": "historical_data"} } ] }' ``` This is useful for: * Importing historical conversation data * Backfilling messages from other systems * Maintaining accurate timeline ordering when processing batch data If `created_at` is not provided, messages will use the server's current timestamp. ### Metadata and Filtering See [Using Filters](/docs/v3/documentation/features/advanced/using-filters) for more examples on how to use filters. 
```python Python theme={null} # Add messages with metadata session.add_messages([ alice.message("Let's discuss the budget", metadata={ "topic": "finance", "priority": "high" }), assistant.message("I'll prepare the financial report", metadata={ "action_item": True, "due_date": "2024-01-15" }) ]) # Filter messages by metadata finance_messages = session.messages(filters={"metadata": {"topic": "finance"}}) action_items = session.messages(filters={"metadata": {"action_item": True}}) ``` ```typescript TypeScript theme={null} // Add messages with metadata await session.addMessages([ alice.message("Let's discuss the budget", { metadata: { topic: "finance", priority: "high" } }), assistant.message("I'll prepare the financial report", { metadata: { action_item: true, due_date: "2024-01-15" } }) ]); // Filter messages by metadata const financeMessages = await session.messages({ filters: { metadata: { topic: "finance" } } }); const actionItems = await session.messages({ filters: { metadata: { action_item: true } } }); ``` ### Pagination All list methods support `page`, `size`, and `reverse` parameters: ```python Python theme={null} # Default pagination (page 1, size 50) for session in honcho.sessions(): print(f"Session: {session.id}") # Custom page size for message in session.messages(size=100): print(f" {message.peer_id}: {message.content}") # Start at a specific page page3 = session.messages(page=3, size=25) # Reverse ordering recent_first = session.messages(reverse=True) # Combine with filters filtered = session.messages(filters={"peer_id": "alice"}, size=10) ``` ```typescript TypeScript theme={null} // Default pagination (page 1, size 50) const peersPage = await honcho.peers(); // Custom page size and filtering const filtered = await honcho.peers({ filters: { metadata: { role: "user" } }, size: 25 }); // Start at a specific page const page3 = await session.messages({ page: 3, size: 25 }); // Reverse ordering const recent = await session.messages({ reverse: true }); // Iterate through all items (auto-paginates) for await (const peer of await honcho.peers()) { console.log(`Peer: ${peer.id}`); } // Manual pagination let currentPage = peersPage; while (currentPage) { const data = await currentPage.data(); console.log(`Processing ${data.length} items`); currentPage = await currentPage.nextPage(); } ``` ## Best Practices ### Resource Management ```python Python theme={null} # Peers and sessions are lightweight - create as needed alice = honcho.peer("alice") session = honcho.session("chat-1") # Use descriptive IDs for better debugging user_session = honcho.session(f"user-{user_id}-support-{ticket_id}") support_agent = honcho.peer(f"agent-{agent_id}") ``` ```typescript TypeScript theme={null} // Peers and sessions are lightweight - create as needed const alice = await honcho.peer("alice"); const session = await honcho.session("chat-1"); // Use descriptive IDs for better debugging const userSession = await honcho.session(`user-${userId}-support-${ticketId}`); const supportAgent = await honcho.peer(`agent-${agentId}`); ``` ### Performance Optimization ```python Python theme={null} # Create peers (each makes a get-or-create call) peers = [honcho.peer(f"user-{i}") for i in range(100)] # Batch operations when possible session.add_messages([peer.message(f"Message {i}") for i, peer in enumerate(peers)]) # Use context limits to control token usage context = session.context(tokens=1500) # Limit context size ``` ```typescript TypeScript theme={null} // Create peers (each makes a get-or-create call) const peers = 
await Promise.all( Array.from({ length: 100 }, (_, i) => honcho.peer(`user-${i}`)) ); // Batch operations when possible await session.addMessages( peers.map((peer, i) => peer.message(`Message ${i}`)) ); // Use context limits to control token usage const context = await session.context({ tokens: 1500 }); // Limit context size // Iterate efficiently with async iteration for await (const peer of await honcho.peers()) { // Process one peer at a time without loading all into memory } ``` # Agent Zero Source: https://honcho.dev/docs/v3/guides/community/agent0 Add AI-native memory to Agent Zero [Agent Zero](https://github.com/agent0ai/agent-zero) is a general AI agent framework with a plugin-first architecture. The Honcho plugin gives Agent Zero persistent memory across chat sessions — users are remembered with their preferences, context, and behavioral patterns, even after sessions end and new ones begin. ## Getting Started The Honcho plugin is a community integration. See the [plugin README](https://github.com/alogotron/a0-plugin-honcho) for full installation and configuration instructions. The integration requires: 1. A Honcho API key from [app.honcho.dev](https://app.honcho.dev) 2. Cloning the plugin into your Agent Zero plugins directory 3. Enabling it via **Settings > Plugins** in Agent Zero's UI ## How It Works The plugin hooks into Agent Zero's extension system. It syncs user and assistant messages to Honcho after every turn, prefetches user context into the system prompt on each new turn, and maintains separate peer models for the user and agent. If Honcho is unavailable, the agent continues normally. ## Next Steps Source code, installation, and full documentation. Learn about peers, sessions, and dialectic reasoning. # Pi Source: https://honcho.dev/docs/v3/guides/community/pi-honcho-memory Persistent memory extension for the pi coding agent [pi-honcho-memory](https://github.com/agneym/pi-honcho-memory) is a persistent memory extension for [pi](https://pi.dev), a coding agent CLI. It gives pi long-term memory across sessions — user preferences, project context, and past decisions are remembered and automatically injected into the system prompt. ## Getting Started Install the extension inside pi: ```bash theme={null} pi install npm:@agney/pi-honcho-memory ``` The integration requires: 1. A Honcho API key from [app.honcho.dev](https://app.honcho.dev) 2. Running `/honcho-setup` inside pi for interactive configuration, or setting `HONCHO_API_KEY` in your environment The Honcho plugin is a community integration. See the [plugin README](https://github.com/agneym/pi-honcho-memory/blob/main/README.md) for full installation and configuration instructions. ## How It Works The extension hooks into pi's extension system. It automatically syncs user and assistant messages to Honcho after each agent response, injects cached user profile and project context into the system prompt with zero network latency, and exposes LLM tools (`honcho_search`, `honcho_chat`, `honcho_remember`) for active memory operations. Session scoping is configurable — memory can be shared per repo, per git branch, or per directory. If Honcho is unavailable, pi continues working normally. ## Next Steps Source code, installation, and full documentation. Learn about peers, sessions, and dialectic reasoning. # Discord Bots with Honcho Source: https://honcho.dev/docs/v3/guides/discord Use Honcho to build a Discord bot with conversational memory and context management. 
> Example code is available on [GitHub](https://github.com/plastic-labs/discord-python-starter) Any application interface that defines logic based on events and supports special commands can work easily with Honcho. Here's how to use Honcho with **Discord** as an interface. If you're not familiar with Discord bot application logic, the [py-cord](https://pycord.dev/) docs would be a good place to start. ## Events Most Discord bots have async functions that listen for specific events, the most common one being messages. We can use Honcho to store messages by user and session based on an interface's event logic. Take the following function definition for example: ```python theme={null} @bot.event async def on_message(message): """ Receive a message from Discord and respond with a message from our LLM assistant. """ if not validate_message(message): return input = sanitize_message(message) # If the message is empty after sanitizing, ignore it if not input: return peer = honcho_client.peer(id=get_peer_id_from_discord(message)) session = honcho_client.session(id=str(message.channel.id)) async with message.channel.typing(): response = llm(session, input) await send_discord_message(message, response) # Save both the user's message and the bot's response to the session session.add_messages( [ peer.message(input), assistant.message(response), ] ) ``` Let's break down what this code is doing... ```python theme={null} @bot.event async def on_message(message): if not validate_message(message): return ``` This is how you define an event function in `py-cord` that listens for messages. We use a helper function `validate_message()` to check if the message should be processed. ## Helper Functions The code uses several helper functions to keep the main logic clean and readable. Let's examine each one: ### Message Validation ```python theme={null} def validate_message(message) -> bool: """ Determine if the message is valid for the bot to respond to. Return True if it is, False otherwise. Currently, the bot will only respond to messages that tag it with an @mention in a public channel and are not from the bot itself. """ if message.author == bot.user: # ensure the bot does not reply to itself return False if isinstance(message.channel, discord.DMChannel): return False if not bot.user.mentioned_in(message): return False return True ``` This function centralizes all the logic for determining whether the bot should respond to a message. It checks that: * The message isn't from the bot itself * The message isn't in a DM channel * The bot is mentioned in the message ### Message Sanitization ```python theme={null} def sanitize_message(message) -> str | None: """Remove the bot's mention from the message content if present""" content = message.content.replace(f"<@{bot.user.id}>", "").strip() if not content: return None return content ``` This helper removes the bot's mention from the message content, leaving just the actual user input. ### Peer ID Generation ```python theme={null} def get_peer_id_from_discord(message): """Get a Honcho peer ID for the message author""" return f"discord_{str(message.author.id)}" ``` This creates a unique peer identifier for each Discord user by prefixing their Discord ID. ### LLM Integration ```python theme={null} def llm(session, prompt) -> str: """ Call the LLM with the given prompt and chat history. You should expand this function with custom logic, prompts, etc. 
""" messages: list[dict[str, object]] = session.context().to_openai( assistant=assistant ) messages.append({"role": "user", "content": prompt}) try: completion = openai.chat.completions.create( model=MODEL_NAME, messages=messages, ) return completion.choices[0].message.content except Exception as e: print(e) return f"Error: {e}" ``` This function handles the LLM interaction. It uses Honcho's built-in `to_openai()` method to automatically convert the session context into the format expected by OpenAI's chat completions API. ### Message Sending ```python theme={null} async def send_discord_message(message, response_content: str): """Send a message to the Discord channel""" if len(response_content) > 1500: # Split response into chunks at newlines, keeping under 1500 chars chunks = [] current_chunk = "" for line in response_content.splitlines(keepends=True): if len(current_chunk) + len(line) > 1500: chunks.append(current_chunk) current_chunk = line else: current_chunk += line if current_chunk: chunks.append(current_chunk) for chunk in chunks: await message.channel.send(chunk) else: await message.channel.send(response_content) ``` This function handles sending messages to Discord, automatically splitting long responses into multiple messages to stay within Discord's character limits. ## Honcho Integration The new Honcho peer/session API makes integration much simpler: ```python theme={null} peer = honcho_client.peer(id=get_peer_id_from_discord(message)) session = honcho_client.session(id=str(message.channel.id)) ``` Here we create a peer object for the user and a session object using the Discord channel ID. This automatically handles user and session management. ```python theme={null} # Save both the user's message and the bot's response to the session session.add_messages( [ peer.message(input), assistant.message(response), ] ) ``` After generating the response, we save both the user's input and the bot's response to the session using the `add_messages()` method. The `peer.message()` creates a message from the user, while `assistant.message()` creates a message from the assistant. ## Slash Commands Discord bots also offer slash command functionality. Here's an example using Honcho's chat endpoint feature: ```python theme={null} @bot.slash_command( name="chat", description="Chat with Honcho about a peer.", ) async def chat(ctx, query: str): await ctx.defer() try: peer = honcho_client.peer(id=get_peer_id_from_discord(ctx)) session = honcho_client.session(id=str(ctx.channel.id)) response = peer.chat( query=query, session_id=session.id, ) if response: await ctx.followup.send(response) else: await ctx.followup.send( f"I don't know anything about {ctx.author.name} because we haven't talked yet!" ) except Exception as e: logger.error(f"Error calling Dialectic API: {e}") await ctx.followup.send( f"Sorry, there was an error processing your request: {str(e)}" ) ``` This slash command uses Honcho's chat endpoint functionality to answer questions about the user based on their conversation history. 
## Setup and Configuration The bot requires several environment variables and setup: ```python theme={null} honcho_client = Honcho() assistant = honcho_client.peer(id="assistant", config={"observe_me": False}) openai = OpenAI(base_url="https://openrouter.ai/api/v1", api_key=MODEL_API_KEY) ``` * `honcho_client`: The main Honcho client * `assistant`: A peer representing the bot/assistant * `openai`: OpenAI client configured to use OpenRouter ## Recap The new Honcho peer/session API makes Discord bot integration much simpler and more intuitive. Key patterns we learned: * **Peer/Session Model**: Users are represented as peers, conversations as sessions * **Automatic Context Management**: `session.context().to_openai()` automatically formats chat history * **Message Storage**: `session.add_messages()` stores both user and assistant messages * **Representation Queries**: `peer.chat()` enables querying conversation history * **Helper Functions**: Clean code organization with focused helper functions This approach provides a clean, maintainable structure for building Discord bots with conversational memory and context management. # Gmail Source: https://honcho.dev/docs/v3/guides/gmail Load Gmail threads into Honcho to give your AI agents memory of email conversations. In this tutorial, we'll walk through how to ingest your Gmail emails into Honcho. By the end, each email thread will be a Honcho session and each participant will be a peer — giving your agents memory of who said what across your email history. This guide includes a ready-to-run Python script that handles everything: Gmail OAuth, thread fetching, participant extraction, and Honcho ingestion. You can run it as-is or use the full tutorial below to understand each piece as you go. The full script is available on [GitHub](https://github.com/plastic-labs/honcho/tree/main/examples/gmail). This is a developer-focused tutorial — it requires creating a Google Cloud project and OAuth credentials. ## TL;DR If you just want to get your emails into Honcho, here's everything you need. ### 1. Set Up Google Cloud Credentials Follow Google's official [Gmail API Python Quickstart](https://developers.google.com/gmail/api/quickstart/python) to: 1. Create a Google Cloud project and enable the Gmail API 2. Configure the OAuth consent screen 3. Create OAuth credentials (select **Desktop app** as the application type) 4. Download the credentials JSON into the same directory as the script The script auto-detects Google's default `client_secret_*.json` filename, so no renaming needed. The script only needs the `gmail.readonly` scope. ### 2. Install Dependencies ```bash uv theme={null} uv pip install google-api-python-client google-auth-oauthlib honcho-ai ``` ```bash pip theme={null} pip install google-api-python-client google-auth-oauthlib honcho-ai ``` ### 3. Preview with a Dry Run ```bash uv theme={null} uv run honcho_gmail.py --dry-run --max-threads 5 ``` ```bash python theme={null} python honcho_gmail.py --dry-run --max-threads 5 ``` On first run, a browser window opens for OAuth consent. After authorizing, a `token.json` file is created — future runs skip this step. ### 4. 
Load into Honcho

```bash uv theme={null}
export HONCHO_API_KEY=your_api_key
uv run honcho_gmail.py --workspace gmail-inbox --max-threads 20
```

```bash python theme={null}
export HONCHO_API_KEY=your_api_key
python honcho_gmail.py --workspace gmail-inbox --max-threads 20
```

You can filter threads with Gmail search syntax:

```bash uv theme={null}
uv run honcho_gmail.py --query "from:alice@example.com"
uv run honcho_gmail.py --label INBOX
uv run honcho_gmail.py --query "after:2024/01/01 has:attachment" --max-threads 50
```

```bash python theme={null}
python honcho_gmail.py --query "from:alice@example.com"
python honcho_gmail.py --label INBOX
python honcho_gmail.py --query "after:2024/01/01 has:attachment" --max-threads 50
```

That's it — your emails are now queryable in Honcho. Read on if you want to understand how the script works and the design decisions behind it.

***

## Full Tutorial

### How Gmail Maps to Honcho

The core idea is straightforward: each Gmail thread becomes a Honcho session, and each email participant becomes a peer. Here's the full mapping:

| Gmail Concept      | Honcho Concept                | Details                                            |
| ------------------ | ----------------------------- | -------------------------------------------------- |
| Your Gmail account | Workspace (`gmail`)           | One workspace for all email data                   |
| Email participant  | Peer                          | Email address as ID for deduplication              |
| Email thread       | Session (`gmail-thread-{id}`) | One session per thread, all participants attached  |
| Individual email   | Message                       | Attributed to the sender with original timestamp   |

### Email as Peer ID

The script normalizes email addresses into URL-safe peer IDs — `alice@example.com` becomes `alice-example-com`. This means the same person is automatically deduplicated across threads. If Alice emails you in 10 different threads, all of those conversations accumulate under a single peer.

```python theme={null}
def peer_id_from_email(email: str) -> str:
    """Convert email to a valid Honcho peer ID."""
    return email.replace("@", "-").replace(".", "-")
```

This also means peers are consistent across data sources. If you import Granola meetings and Gmail threads for the same person, they merge under the same peer ID.

### Extracting Participants

Every email has a sender, recipients, and optionally CC/BCC addresses. The script extracts all of these to build a complete picture of who's involved in each thread:

```python theme={null}
for m in msgs:
    register_peer(m["from"])
    for addr in parse_address_list(m["to"]):
        register_peer(addr)
    for addr in parse_address_list(m["cc"]):
        register_peer(addr)
    for addr in parse_address_list(m["bcc"]):
        register_peer(addr)
```

Display names are extracted when available (e.g., `Alice Smith <alice@example.com>` → name: "Alice Smith"). When only an email is present, the script generates a name from the local part.

### Message Attribution and Timestamps

Each email becomes a message attributed to its sender via `peer.message()`. The original email timestamp is preserved using `created_at`, so Honcho sees the conversation in chronological order — not the order you imported it.

```python theme={null}
honcho_msgs.append(peer.message(
    content,
    metadata={
        "gmail_id": m["id"],
        "subject": m["subject"],
        "from": m["from"],
        "to": m["to"],
        "labels": m["labels"],
    },
    created_at=m["timestamp"],
))
```

### Multi-Peer Sessions

Each thread's session is linked to all participants using `session.add_peers()`. This means when you query Honcho about a peer, it has context not just from their messages but from the full conversations they participated in.
```python theme={null} session = honcho.session(session_id, metadata={ "gmail_thread_id": tid, "subject": subject, "source": "gmail", "message_count": len(msgs), }) session.add_peers(thread_peers) ``` ### Stripping Quoted Replies Email threads are full of quoted replies — each message repeats everything above it. The script strips these out so only the new content is stored per message, avoiding duplication in Honcho's memory: ```python theme={null} def strip_quoted_replies(text: str) -> str: """Strip quoted reply text, keeping only the new content.""" lines = text.split("\n") clean_lines = [] for line in lines: stripped = line.strip() if re.match(r"^On .+wrote:\s*$", stripped): break if stripped.startswith(">"): break # ... other reply markers clean_lines.append(line) return "\n".join(clean_lines).rstrip() ``` ### Querying After Import Once your emails are in Honcho, you can query any peer: ```python theme={null} import os from honcho import Honcho honcho = Honcho(workspace_id="gmail-inbox", api_key=os.environ["HONCHO_API_KEY"]) alice = honcho.peer("alice-example-com") print(alice.chat("What has Alice been discussing with me?")) print(alice.chat("What action items has Alice mentioned?")) ``` *** ## CLI Reference ``` usage: honcho_gmail.py [-h] [--workspace WORKSPACE] [--query QUERY] [--label LABEL] [--max-threads N] [--dry-run] [--credentials PATH] [--token PATH] options: --workspace, -w Honcho workspace ID (default: gmail) --query, -q Gmail search query (e.g., 'from:alice@example.com') --label, -l Gmail label to filter by (e.g., INBOX) --max-threads, -n Max threads to fetch (default: 10) --dry-run Preview without writing to Honcho --credentials, -c Path to OAuth credentials JSON (auto-detects client_secret*.json) --token, -t Path to store access token (default: token.json) ``` ## Troubleshooting ### "No client\_secret\*.json file found" Download OAuth credentials from Google Cloud Console and place the `client_secret_*.json` file in the same directory as the script. ### "Access blocked: This app's request is invalid" Your OAuth consent screen may not be configured correctly. Ensure you've added the `gmail.readonly` scope. ### "Token has been expired or revoked" Delete `token.json` and run the script again to re-authenticate. ### Rate Limits The script includes a small delay when creating peers to avoid hitting Honcho's rate limits. For large imports (100+ threads), consider running in batches. ### Unique Messages Use an AI assistant in your inbox? Want to parse out its messages differently? Feel free to modify and improve the structure of this script to fit your bespoke email setup. This script was written for agents and as such is easy to update with your coding assistant. ## Full Script ```python theme={null} #!/usr/bin/env python3 """Load Gmail messages into Honcho. Uses the Gmail API directly (with OAuth) to fetch emails and the Honcho Python SDK to store them. Each Gmail thread becomes a Honcho session, each sender becomes a peer. Prerequisites: 1. Create a Google Cloud project and enable the Gmail API 2. Create OAuth 2.0 credentials (Desktop app type) 3. Download the credentials JSON (client_secret_*.json) into this directory 4. Install dependencies: pip install google-api-python-client google-auth-oauthlib honcho-ai On first run, a browser window will open for OAuth consent. After authorizing, a 'token.json' file will be created to store your credentials for future runs. 
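Example invocations (these mirror the TL;DR section above):

    python honcho_gmail.py --dry-run --max-threads 5
    python honcho_gmail.py --workspace gmail-inbox --max-threads 20
    python honcho_gmail.py --query "from:alice@example.com"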
""" import argparse import base64 import glob import os import re import time from datetime import datetime, timezone from email.header import decode_header, make_header from email.utils import getaddresses, parseaddr from google.auth.transport.requests import Request from google.oauth2.credentials import Credentials from google_auth_oauthlib.flow import InstalledAppFlow from googleapiclient.discovery import build from googleapiclient.errors import HttpError SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"] PEER_ID_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$") def find_credentials() -> str: """Find a Google OAuth credentials file in the current directory.""" matches = glob.glob("client_secret*.json") if matches: return matches[0] raise FileNotFoundError( "No client_secret*.json file found.\n" "Download OAuth credentials from Google Cloud Console:\n" "1. Go to console.cloud.google.com\n" "2. Create/select a project and enable Gmail API\n" "3. Create OAuth 2.0 credentials (Desktop app)\n" "4. Download the JSON into this directory" ) def get_gmail_service(credentials_file: str | None = None, token_file: str = "token.json"): """Authenticate and return a Gmail API service instance.""" creds = None if os.path.exists(token_file): creds = Credentials.from_authorized_user_file(token_file, SCOPES) if not creds or not creds.valid: if creds and creds.expired and creds.refresh_token: print("Refreshing expired credentials...") creds.refresh(Request()) else: if credentials_file is None: credentials_file = find_credentials() print(f"Using credentials: {credentials_file}") print("Opening browser for OAuth consent...") flow = InstalledAppFlow.from_client_secrets_file(credentials_file, SCOPES) creds = flow.run_local_server(port=0) with open(token_file, "w") as token: token.write(creds.to_json()) print(f"Credentials saved to {token_file}") return build("gmail", "v1", credentials=creds) def list_threads(service, query: str = None, label_ids: list = None, max_results: int = 10) -> list[dict]: """List Gmail threads with pagination support.""" all_threads = [] page_token = None while len(all_threads) < max_results: try: params = { "userId": "me", "maxResults": min(100, max_results - len(all_threads)), } if query: params["q"] = query if label_ids: params["labelIds"] = label_ids if page_token: params["pageToken"] = page_token response = service.users().threads().list(**params).execute() threads = response.get("threads", []) all_threads.extend(threads) page_token = response.get("nextPageToken") if not page_token: break except HttpError as e: print(f"Error listing threads: {e}") break return all_threads[:max_results] def get_thread(service, thread_id: str) -> dict: """Fetch a complete Gmail thread with all messages.""" try: return service.users().threads().get( userId="me", id=thread_id, format="full" ).execute() except HttpError as e: print(f"Error fetching thread {thread_id}: {e}") return {} def _decode_header_str(header: str) -> str: """Decode an RFC 2047 encoded header string to plain Unicode.""" return str(make_header(decode_header(header))) def extract_email(from_header: str) -> str: """Extract bare email from an RFC 5322 header value.""" _, addr = parseaddr(_decode_header_str(from_header)) return addr.lower().strip() def extract_name(from_header: str) -> str: """Extract display name from an RFC 5322 header value.""" name, _ = parseaddr(_decode_header_str(from_header)) return name.strip() or from_header.strip() def decode_body(payload: dict) -> str: """Recursively extract plain text from a Gmail message 
payload.""" if payload.get("mimeType") == "text/plain": data = payload.get("body", {}).get("data", "") if data: return base64.urlsafe_b64decode(data).decode("utf-8", errors="replace") parts = payload.get("parts", []) for part in parts: text = decode_body(part) if text: return text return "" def strip_quoted_replies(text: str) -> str: """Strip quoted reply text from an email body, keeping only the new content.""" lines = text.split("\n") clean_lines = [] for line in lines: stripped = line.strip() if re.match(r"^On .+wrote:\s*$", stripped): break if stripped.startswith("---------- Forwarded message"): break if stripped.startswith(">"): break if re.match(r"^[-_]{10,}$", stripped): break clean_lines.append(line) return "\n".join(clean_lines).rstrip() def parse_address_list(header: str) -> list[str]: """Parse a comma-separated email header into individual addresses.""" if not header.strip(): return [] decoded = _decode_header_str(header) return [ f"{name} <{addr}>" if name else addr for name, addr in getaddresses([decoded]) if addr ] def peer_id_from_email(email: str) -> str: """Convert email to a valid Honcho peer ID.""" peer_id = re.sub(r"[^A-Za-z0-9_-]+", "-", email).strip("-").lower() peer_id = re.sub(r"-{2,}", "-", peer_id) if not peer_id: peer_id = "unknown-peer" if not PEER_ID_PATTERN.fullmatch(peer_id): raise ValueError(f"Generated peer ID is invalid: {peer_id!r}") return peer_id def fetch_thread_messages(service, thread_id: str) -> list[dict]: """Fetch all messages in a Gmail thread with full content.""" data = get_thread(service, thread_id) messages = [] for msg in data.get("messages", []): headers = {h["name"]: h["value"] for h in msg.get("payload", {}).get("headers", [])} body = strip_quoted_replies(decode_body(msg.get("payload", {}))) ts = int(msg.get("internalDate", "0")) / 1000 messages.append({ "id": msg["id"], "thread_id": msg["threadId"], "from": headers.get("From", ""), "to": headers.get("To", ""), "cc": headers.get("Cc", ""), "bcc": headers.get("Bcc", ""), "subject": headers.get("Subject", ""), "date": headers.get("Date", ""), "timestamp": datetime.fromtimestamp(ts, tz=timezone.utc), "body": body.strip(), "labels": msg.get("labelIds", []), "snippet": msg.get("snippet", ""), }) return messages def main(): parser = argparse.ArgumentParser(description="Load Gmail messages into Honcho") parser.add_argument("--workspace", "-w", default="gmail", help="Honcho workspace ID (default: gmail)") parser.add_argument("--query", "-q", default=None, help="Gmail search query (e.g. 'from:alice@example.com')") parser.add_argument("--label", "-l", default=None, help="Gmail label to filter by (e.g. 
INBOX)") parser.add_argument("--max-threads", "-n", type=int, default=10, help="Max threads to fetch (default: 10)") parser.add_argument("--dry-run", action="store_true", help="Print what would be loaded without writing to Honcho") parser.add_argument("--credentials", "-c", default=None, help="Path to OAuth credentials JSON (auto-detects client_secret*.json)") parser.add_argument("--token", "-t", default="token.json", help="Path to store/load access token") args = parser.parse_args() # Authenticate print("Authenticating with Gmail API...") service = get_gmail_service(args.credentials, args.token) print(" Authenticated successfully!") label_ids = [args.label] if args.label else None # List threads print(f"\nFetching up to {args.max_threads} threads from Gmail...") threads = list_threads(service, query=args.query, label_ids=label_ids, max_results=args.max_threads) print(f" Found {len(threads)} threads") if not threads: print("No threads found. Try adjusting --query or --label.") return # Fetch full messages for each thread all_thread_messages = {} seen_peers = {} def register_peer(addr: str): email = extract_email(addr) if email and email not in seen_peers: name = extract_name(addr) if name.lower().strip() == email or "@" in name: name = email.split("@")[0].replace(".", " ").title() seen_peers[email] = { "name": name, "peer_id": peer_id_from_email(email), "email": email, } for i, t in enumerate(threads): tid = t["id"] print(f" Fetching thread {i+1}/{len(threads)}: {tid}") msgs = fetch_thread_messages(service, tid) all_thread_messages[tid] = msgs for m in msgs: register_peer(m["from"]) for addr in parse_address_list(m["to"]): register_peer(addr) for addr in parse_address_list(m["cc"]): register_peer(addr) for addr in parse_address_list(m["bcc"]): register_peer(addr) # Summary total_msgs = sum(len(v) for v in all_thread_messages.values()) print("\nSummary:") print(f" Threads: {len(all_thread_messages)}") print(f" Messages: {total_msgs}") print(f" Unique participants: {len(seen_peers)}") for email, info in seen_peers.items(): print(f" {info['peer_id']} ({info['name']} <{email}>)") if args.dry_run: print("\n[DRY RUN] Would create the above in Honcho. 
Showing first message per thread:") for tid, msgs in all_thread_messages.items(): m = msgs[0] body_preview = m["body"][:120].replace("\n", " ") if m["body"] else m["snippet"][:120] print(f" Thread {tid}: {m['subject']}") print(f" {m['from']} @ {m['date']}") print(f" {body_preview}...") return # Load into Honcho from honcho import Honcho print(f"\nLoading into Honcho workspace '{args.workspace}'...") honcho = Honcho(workspace_id=args.workspace) # Create peers peers = {} for i, (email, info) in enumerate(seen_peers.items()): if i > 0 and i % 4 == 0: time.sleep(1) peers[email] = honcho.peer(info["peer_id"], metadata={ "email": email, "name": info["name"], "source": "gmail", }) print(f" Peer: {info['peer_id']}") # Create sessions and messages per thread for tid, msgs in all_thread_messages.items(): subject = msgs[0]["subject"] if msgs else "No subject" session_id = f"gmail-thread-{tid}" thread_peer_emails = set() for m in msgs: thread_peer_emails.add(extract_email(m["from"])) for addr in parse_address_list(m["to"]): thread_peer_emails.add(extract_email(addr)) for addr in parse_address_list(m["cc"]): thread_peer_emails.add(extract_email(addr)) for addr in parse_address_list(m["bcc"]): thread_peer_emails.add(extract_email(addr)) thread_peers = [peers[e] for e in thread_peer_emails if e in peers] session = honcho.session(session_id, metadata={ "gmail_thread_id": tid, "subject": subject, "source": "gmail", "message_count": len(msgs), }) session.add_peers(thread_peers) honcho_msgs = [] for m in msgs: email = extract_email(m["from"]) peer = peers.get(email) if not peer: continue content = m["body"] if m["body"] else m["snippet"] if not content: continue honcho_msgs.append(peer.message( content, metadata={ "gmail_id": m["id"], "subject": m["subject"], "from": m["from"], "to": m["to"], "labels": m["labels"], }, created_at=m["timestamp"], )) if honcho_msgs: session.add_messages(honcho_msgs) print(f" Session {session_id}: {len(honcho_msgs)} messages — {subject[:60]}") print(f"\nDone! Loaded {total_msgs} messages into workspace '{args.workspace}'.") if __name__ == "__main__": main() ``` ## Next Steps See how the Granola integration maps to common Honcho patterns. Source code and example script. # Granola Source: https://honcho.dev/docs/v3/guides/granola Import meeting notes and transcripts from Granola into Honcho In this tutorial, we'll walk through how to import your [Granola](https://granola.ai) meeting data into Honcho. By the end, your meeting participants, transcripts, and summaries will be mapped onto Honcho's peer and session model — giving your agents queryable memory of the people you meet with. This guide includes a ready-to-run Python script that handles everything: Granola OAuth, meeting fetching, participant detection, and interactive import. You can run it as-is or use the full tutorial below to understand each design decision. The full script is available on [GitHub](https://github.com/plastic-labs/honcho/tree/main/examples/granola). ## TL;DR If you just want to get your meetings into Honcho, here's everything you need. ### 1. Install Dependencies ```bash uv theme={null} uv pip install honcho-ai httpx ``` ```bash pip theme={null} pip install honcho-ai httpx ``` ### 2. Set Your API Key ```bash theme={null} export HONCHO_API_KEY="your-key-from-app.honcho.dev" ``` ### 3. Run the Script ```bash uv theme={null} uv run python honcho_granola.py ``` ```bash python theme={null} python honcho_granola.py ``` The script will: 1. Open your browser for Granola OAuth authentication 2. 
Fetch all meetings and their content
3. Walk you through each meeting interactively — confirm peers, choose import mode, skip meetings you don't want
4. Print a summary of what was transferred

That's it — your meetings are now queryable in Honcho. Read on if you want to understand how the script works and the design decisions behind it.

***

## Full Tutorial

### How Granola Maps to Honcho

The core idea is straightforward: each Granola meeting becomes a Honcho session, and each participant becomes a peer. Here's the full mapping:

| Granola Concept      | Honcho Concept            | Details                                        |
| -------------------- | ------------------------- | ---------------------------------------------- |
| Your Granola account | Workspace (`granola`)     | One workspace for all meetings                 |
| Meeting participant  | Peer                      | Email as ID for deduplication across meetings  |
| Individual meeting   | Session (`meeting-{id}`)  | One session per meeting                        |
| Transcript turns     | Messages with attribution | Two-person calls get full speaker attribution  |
| Meeting summary      | Message from note creator | Multi-person calls store the summary           |

### Email as Peer ID

The script uses email addresses as the basis for peer IDs, normalized to a URL-safe format (e.g., `alice@example.com` becomes `alice-example-com`). This ensures consistent identification across meetings — if you meet someone in 5 different calls, all conversations accumulate under the same peer.

```python theme={null}
# These all resolve to the same peer:
honcho.peer("alice-example-com")  # From Meeting A
honcho.peer("alice-example-com")  # From Meeting B
```

This also means peers are consistent across data sources. If you import both Granola meetings and Gmail threads for the same person, they merge under the same peer ID.

### Auto-Detecting "Me"

Granola marks the note creator in its participant list with `(note creator)`. The script uses this to identify you automatically — no configuration needed.

```
Participants: You (note creator) <you@example.com> from Your Company, Alice <alice@example.com> from Acme Corp
```

### Two-Person Calls: Full Attribution

When exactly one other participant is present *and* the transcript contains `Them:` turns, the script stores the transcript with speaker-attributed messages. Consecutive same-speaker turns are merged before storing, cleaning up the fragmentation that's common in raw transcripts.

```python theme={null}
session.add_messages([
    me.message("What's your timeline for the launch?"),
    them.message("We're targeting Q2, but it depends on the API integration."),
])
```

### Multi-Person Calls: Summary Mode

Granola's transcript uses `Them:` for all non-creator speakers with no disambiguation — in a 4-person call, everyone else is just `Them:`. Rather than guess incorrectly, the script stores Granola's summary as your record of the meeting, with participants in metadata.

```python theme={null}
session.add_messages([
    me.message(
        f"Meeting: Product Planning\n"
        f"Date: Mar 5, 2026 2:00 PM\n"
        f"Participants: Alice from Acme Corp, Bob from Widgets Inc\n\n"
        f"{meeting_summary}",
        metadata={
            "participants": "Alice from Acme Corp, Bob from Widgets Inc",
            "mode": "summary",
            "granola_meeting_id": meeting_id,
        }
    )
])
```

The summary is attributed to you because it's *your* record of what happened. Granola captured your notes from a meeting where those people were present.

### Interactive Confirmation

For each meeting, you choose the import mode: two-person (full attribution), summary, or skip.
For multi-person calls that are actually 1:1s (extra participants listed but didn't speak), you can override the detection and select the actual speaker. ### Noisy Transcripts Preserved Granola's raw transcripts are often fragmented (`Me: Yeah. Them: Yeah. Me: And.`). The script merges consecutive same-speaker turns but otherwise preserves the raw content. Honcho's reasoning extracts signal from noisy data. ### Querying After Import Once your meetings are in Honcho, you can query any peer: ```python theme={null} import os from honcho import Honcho honcho = Honcho(workspace_id="granola", api_key=os.environ["HONCHO_API_KEY"]) # Peer IDs are normalized from emails: alice@example.com -> alice-example-com alice = honcho.peer("alice-example-com") print(alice.chat("What is Alice working on?")) print(alice.chat("What concerns has Alice raised?")) me = honcho.peer("you-example-com") print(me.chat("What topics do I discuss most frequently?")) ``` ### Combining with Other Sources Because meetings live in a standard Honcho workspace, you can enrich peer representations with data from other channels: ```python theme={null} # Same workspace, same peer — data accumulates alice = honcho.peer("alice-example-com") me = honcho.peer("you-example-com") discord_session = honcho.session("discord-general-2024-03") discord_session.add_messages([ alice.message("Just shipped the new API version!"), me.message("Congrats! How's the migration guide coming?"), ]) # Queries now draw from both meeting transcripts AND Discord history alice.chat("What has Alice shipped recently?") ``` *** ## Troubleshooting | Issue | Fix | | -------------------------------- | ------------------------------------------------------------------------------------------------------------ | | Granola OAuth fails | Ensure you have a paid Granola plan (MCP requires Pro+). Clear cached token and retry. | | Missing transcripts | Free tier has no transcript access. The script falls back to summary content. | | 500 errors from Honcho | Check for null bytes or control characters in transcript content. The script sanitizes these automatically. | | Rate limiting with many meetings | The script processes sequentially with delays. Honcho ingestion is async — don't poll for immediate results. | ## Full Script ```python theme={null} #!/usr/bin/env python3 """Load Granola meeting notes into Honcho. Uses the Granola MCP server (with OAuth) to fetch meetings and the Honcho Python SDK to store them. Each meeting becomes a Honcho session. Two-person meetings get full speaker attribution; multi-person meetings are stored as summaries. 
Prerequisites: pip install honcho-ai httpx Environment Variables: HONCHO_API_KEY - Your Honcho API key (get from app.honcho.dev/api-keys) Usage: python honcho_granola.py """ import asyncio import base64 import hashlib import json import os import re import secrets import sys import threading import traceback import webbrowser from dataclasses import dataclass, field from datetime import datetime, timezone from http.server import HTTPServer, BaseHTTPRequestHandler from typing import Any from urllib.parse import parse_qs, urlencode, urlparse import httpx @dataclass class Participant: name: str email: str | None = None org: str | None = None @dataclass class ParsedParticipants: note_creator: Participant | None = None others: list[Participant] = field(default_factory=list) @dataclass class TranscriptTurn: speaker: str text: str # Granola MCP + OAuth endpoints GRANOLA_MCP_URL = "https://mcp.granola.ai/mcp" AUTH_BASE = "https://mcp-auth.granola.ai" OAUTH_REDIRECT_PORT = 8765 OAUTH_REDIRECT_URI = f"http://localhost:{OAUTH_REDIRECT_PORT}/callback" # Honcho message size limit (25000 max, leave headroom) MAX_MESSAGE_LEN = 24000 # --------------------------------------------------------------------------- # OAuth callback handler (must be a class for BaseHTTPRequestHandler) # --------------------------------------------------------------------------- class _OAuthCallback(BaseHTTPRequestHandler): auth_result: dict[str, str | None] = {"code": None, "error": None} def do_GET(self): params = parse_qs(urlparse(self.path).query) if "code" in params: _OAuthCallback.auth_result["code"] = params["code"][0] self.send_response(200) self.send_header("Content-Type", "text/html") self.end_headers() self.wfile.write(b"
Authenticated! You can close this window.") elif "error" in params: _OAuthCallback.auth_result["error"] = params.get("error_description", params["error"])[0] self.send_response(400) self.send_header("Content-Type", "text/html") self.end_headers() self.wfile.write(f"Error: {_OAuthCallback.auth_result['error']}
".encode()) else: self.send_response(404) self.end_headers() def log_message(self, fmt, *args): pass # --------------------------------------------------------------------------- # Granola OAuth + MCP # --------------------------------------------------------------------------- async def authenticate(http_client: httpx.AsyncClient) -> str: """Perform OAuth (DCR + PKCE) with Granola. Returns access token.""" _OAuthCallback.auth_result = {"code": None, "error": None} print("\nAuthenticating with Granola...") # Register client (DCR) resp = await http_client.post( f"{AUTH_BASE}/oauth2/register", json={ "client_name": "Granola to Honcho Transfer", "redirect_uris": [OAUTH_REDIRECT_URI], "grant_types": ["authorization_code"], "response_types": ["code"], "token_endpoint_auth_method": "none", }, ) if resp.status_code not in (200, 201): raise RuntimeError(f"Client registration failed: {resp.status_code}") client_id = resp.json().get("client_id") # PKCE verifier = secrets.token_urlsafe(32) challenge = base64.urlsafe_b64encode(hashlib.sha256(verifier.encode()).digest()).rstrip(b"=").decode() # Browser auth auth_url = f"{AUTH_BASE}/oauth2/authorize?" + urlencode({ "client_id": client_id, "redirect_uri": OAUTH_REDIRECT_URI, "response_type": "code", "state": "granola-honcho-transfer", "code_challenge": challenge, "code_challenge_method": "S256", }) server = HTTPServer(("localhost", OAUTH_REDIRECT_PORT), _OAuthCallback) thread = threading.Thread(target=server.handle_request) thread.start() print(" Opening browser for authentication...") webbrowser.open(auth_url) thread.join(timeout=120) server.server_close() auth_result = _OAuthCallback.auth_result if auth_result["error"]: raise RuntimeError(f"Authentication failed: {auth_result['error']}") if not auth_result["code"]: raise RuntimeError("Authentication timed out") # Exchange code for token resp = await http_client.post( f"{AUTH_BASE}/oauth2/token", data={ "grant_type": "authorization_code", "code": auth_result["code"], "redirect_uri": OAUTH_REDIRECT_URI, "client_id": client_id, "code_verifier": verifier, }, headers={"Content-Type": "application/x-www-form-urlencoded"}, ) if resp.status_code != 200: raise RuntimeError(f"Token exchange failed: {resp.status_code}") print(" Authenticated successfully!") return resp.json()["access_token"] async def call_mcp_tool( http_client: httpx.AsyncClient, access_token: str, tool_name: str, arguments: dict[str, Any] | None = None, ) -> dict[str, Any]: """Call a Granola MCP tool, handling both JSON and SSE responses.""" resp = await http_client.post( GRANOLA_MCP_URL, json={ "jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": tool_name, "arguments": arguments or {}}, }, headers={ "Authorization": f"Bearer {access_token}", "Content-Type": "application/json", "Accept": "application/json, text/event-stream", }, ) if resp.status_code != 200: raise RuntimeError(f"MCP call failed: {resp.status_code} - {resp.text}") # SSE response if "text/event-stream" in resp.headers.get("content-type", ""): result = None for line in resp.text.split("\n"): if line.strip().startswith("data: "): try: parsed = json.loads(line.strip()[6:]) if "result" in parsed: result = parsed elif "error" in parsed: raise RuntimeError(f"MCP error: {parsed['error']}") except json.JSONDecodeError: continue if result: final = result.get("result", {}) return final if isinstance(final, dict) else {"result": final} raise RuntimeError("No result in SSE response") # JSON response result = resp.json() if "error" in result: raise RuntimeError(f"MCP error: 
{result['error']}")
    return result.get("result", {})


def extract_mcp_text(result: dict[str, Any]) -> str:
    """Extract text from the first content block of an MCP result.

    Raises ValueError if the response structure is unexpected.
    """
    content = result.get("content", [])
    if not isinstance(content, list) or not content:
        raise ValueError(f"MCP response missing content array: {list(result.keys())}")
    first = content[0]
    if not isinstance(first, dict) or "text" not in first:
        raise ValueError(f"MCP content block missing 'text' field: {first}")
    return str(first["text"])


# ---------------------------------------------------------------------------
# Granola data fetching
# ---------------------------------------------------------------------------

async def list_meetings(
    http_client: httpx.AsyncClient,
    access_token: str,
    limit: int = 100,
) -> list[dict[str, Any]]:
    """List meetings from Granola MCP. Parses Granola's XML-like response format."""
    result = await call_mcp_tool(http_client, access_token, "list_meetings", {"limit": limit})
    text = extract_mcp_text(result)
    meetings: list[dict[str, Any]] = []
    # Reconstructed parsing: Granola returns XML-like <meeting> blocks. The exact
    # tag and attribute names in this pattern are assumptions inferred from the
    # fields used below; adjust them to match the actual response format.
    for match in re.finditer(r'<meeting id="([^"]*)" title="([^"]*)" date="([^"]*)">', text):
        mid, title, date = match.group(1), match.group(2), match.group(3)
        block_end = text.find("</meeting>", match.end())
        block = text[match.end():block_end] if block_end != -1 else ""
        p_match = re.search(r"<participants>\s*(.*?)\s*</participants>", block, re.DOTALL)
        meetings.append({
            "id": mid,
            "title": title,
            "date": date,
            "participants": p_match.group(1).strip() if p_match else "",
        })
    return meetings


async def get_meeting_details(
    http_client: httpx.AsyncClient,
    access_token: str,
    meeting_id: str,
) -> dict[str, Any]:
    """Get full meeting details including notes."""
    result = await call_mcp_tool(http_client, access_token, "get_meetings", {"meeting_ids": [meeting_id]})
    text = extract_mcp_text(result)
    return {"id": meeting_id, "raw_content": text}


async def get_meeting_transcript(
    http_client: httpx.AsyncClient,
    access_token: str,
    meeting_id: str,
    max_retries: int = 3,
) -> str | None:
    """Get transcript for a meeting (paid tiers only).

    Retries on rate limit responses with exponential backoff.
    """
    for attempt in range(max_retries):
        try:
            result = await call_mcp_tool(http_client, access_token, "get_meeting_transcript", {"meeting_id": meeting_id})
            text = extract_mcp_text(result)
        except Exception as e:
            print(f" Transcript unavailable: {e}")
            return None
        if not text or "no transcript" in text.lower():
            return None
        # Granola returns rate limit errors as content text, not HTTP errors
        if "rate limit" in text.lower():
            wait = 2 ** attempt * 3  # 3s, 6s, 12s
            print(f" ⚠ Granola rate limit hit (attempt {attempt + 1}/{max_retries}), waiting {wait}s...")
            await asyncio.sleep(wait)
            continue
        return text
    print(f" ⚠ Transcript skipped after {max_retries} rate limit retries")
    return None


async def fetch_all_meetings(
    http_client: httpx.AsyncClient,
    access_token: str,
) -> list[dict[str, Any]]:
    """Fetch meeting list and enrich each with transcript and details."""
    print("\nFetching meetings from Granola...")
    meetings = await list_meetings(http_client, access_token, limit=500)
    if not meetings:
        print("No meetings found.")
        return []
    print(f" Found {len(meetings)} meetings. Fetching content...\n")
    for i, m in enumerate(meetings, 1):
        mid = m.get("id")
        if not mid:
            continue
        transcript = await get_meeting_transcript(http_client, access_token, mid)
        if transcript:
            m["transcript"] = transcript
        try:
            m.update(await get_meeting_details(http_client, access_token, mid))
        except Exception as exc:
            print(f" Failed to fetch details for {mid}: {exc}")
        has_t = "transcript" in m
        has_s = bool(extract_summary(m))
        label = "transcript+summary" if has_t and has_s else "transcript only" if has_t else "summary only" if has_s else "basic only"
        print(f" [{i}/{len(meetings)}] {label}: {m.get('title', 'Untitled')[:45]}")
        await asyncio.sleep(1.5)  # rate limit
    return meetings


# ---------------------------------------------------------------------------
# Parsing helpers
# ---------------------------------------------------------------------------

def parse_participants(participants_str: str) -> ParsedParticipants:
    """Parse Granola's participant string into structured participants.

    Warns on unparsable entries instead of silently dropping them.
    """
    result = ParsedParticipants()
    if not participants_str:
        return result
    # Split on commas, but not inside angle brackets
    entries, current, depth = [], [], 0
    for ch in participants_str:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth = max(depth - 1, 0)
        elif ch == "," and depth == 0:
            entries.append("".join(current))
            current = []
            continue
        current.append(ch)
    if current:
        entries.append("".join(current))
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue
        is_creator = "(note creator)" in entry
        clean = entry.replace("(note creator)", "").strip()
        email_match = re.search(r"<([^>]+)>", clean)
        email = email_match.group(1) if email_match else None
        name = re.sub(r"\s*<[^>]+>", "", clean).strip()
        if not name:
            print(f" Warning: could not parse participant entry: {entry!r}")
            continue
        org = None
        org_match = re.match(r"(.+?)\s+from\s+(.+)", name)
        if org_match:
            name, org = org_match.group(1).strip(), org_match.group(2).strip()
        person = Participant(name=name, email=email, org=org)
        if is_creator:
            result.note_creator = person
        else:
            result.others.append(person)
    return result


def parse_transcript_turns(raw: str) -> list[TranscriptTurn]:
    """Split a Granola transcript into speaker turns."""
    # Unwrap JSON wrapper if present
    try:
        parsed = json.loads(raw)
        if isinstance(parsed, dict) and "transcript" in parsed:
            raw = str(parsed["transcript"])
    except (json.JSONDecodeError, TypeError):
        pass
    parts = re.split(r"(?:^|\s{2,})(Me|Them):\s*", raw)
    turns: list[TranscriptTurn] = []
    i = 1
    while i < len(parts) - 1:
        text = parts[i + 1].strip()
        if text:
            turns.append(TranscriptTurn(speaker=parts[i], text=text))
        i += 2
    return turns


def extract_summary(meeting: dict[str, Any]) -> str:
    """Extract best available summary text from meeting data."""
    candidates = []
    for key in ("summary", "notes", "note", "meeting_notes", "description"):
        val = meeting.get(key)
        if isinstance(val, str) and val.strip():
            candidates.append(val.strip())
    raw = meeting.get("raw_content")
    if isinstance(raw, str) and raw.strip():
        candidates.append(raw.strip())
    for c in candidates:
        for tag in ("summary", "notes"):
            m = re.search(rf"<{tag}>\s*(.*?)\s*</{tag}>", c, re.DOTALL)
            if m:
                return m.group(1).strip()
    return candidates[0] if candidates else ""


def peer_id_from(value: str) -> str:
    """Normalize a name or email into a Honcho-safe peer ID."""
    norm = re.sub(r"[^a-z0-9_-]+", "-", value.strip().lower())
    norm = re.sub(r"-{2,}", "-", norm).strip("-_")
    return (norm or "peer")[:100]


def sanitize(text: str) -> str:
    """Remove null
bytes and control characters.""" return re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text) def parse_date(date_str: str) -> datetime: """Parse Granola's date format into a timezone-aware datetime. Raises ValueError if the date string doesn't match any known format. """ for fmt in ["%b %d, %Y %I:%M %p", "%b %d, %Y %I:%M:%S %p", "%B %d, %Y %I:%M %p"]: try: return datetime.strptime(date_str, fmt).replace(tzinfo=timezone.utc) except ValueError: continue raise ValueError(f"Unrecognized date format: {date_str!r}") # --------------------------------------------------------------------------- # Honcho import helpers # --------------------------------------------------------------------------- def build_messages( peer: Any, content: str, metadata: dict[str, object] | None, created_at: datetime, ) -> list[Any]: """Build chunked messages for a single peer, attaching metadata to the first chunk.""" messages = [] content = sanitize(content) for start in range(0, len(content), MAX_MESSAGE_LEN): chunk = content[start:start + MAX_MESSAGE_LEN] msg_meta = metadata if start == 0 else None messages.append(peer.message(chunk, metadata=msg_meta, created_at=created_at)) return messages def send_messages(session: Any, messages: list[Any]) -> None: """Send messages to a session in batches of 100.""" for batch_start in range(0, len(messages), 100): session.add_messages(messages[batch_start:batch_start + 100]) def import_two_person( honcho: Any, session: Any, me_peer_id: str, them_peer_id: str, turns: list[TranscriptTurn], metadata: dict[str, object], created_at: datetime, ) -> None: """Import a two-person meeting with speaker attribution.""" me_peer = honcho.peer(me_peer_id) them_peer = honcho.peer(them_peer_id) # Merge consecutive same-speaker turns merged: list[TranscriptTurn] = [] for t in turns: if merged and merged[-1].speaker == t.speaker: merged[-1].text += " " + t.text else: merged.append(TranscriptTurn(speaker=t.speaker, text=t.text)) messages: list[Any] = [] for i, t in enumerate(merged): peer = me_peer if t.speaker == "Me" else them_peer msg_meta = metadata if i == 0 else None messages.extend(build_messages(peer, t.text, msg_meta, created_at)) send_messages(session, messages) print(f" -> Imported as 2-person ({me_peer_id} + {them_peer_id})") def import_summary( honcho: Any, session: Any, me_peer_id: str, meeting: dict[str, Any], metadata: dict[str, object], created_at: datetime, ) -> None: """Import a meeting as a summary message.""" me_peer = honcho.peer(me_peer_id) summary = extract_summary(meeting) if not summary: raw_t = meeting.get("transcript", "") try: parsed = json.loads(raw_t) summary = str(parsed.get("transcript", "")) if isinstance(parsed, dict) else raw_t except (json.JSONDecodeError, TypeError): summary = raw_t summary = summary or "No content available" title = meeting.get("title", "Untitled") date = meeting.get("date", "") header = f"Meeting: {title}\nDate: {date}\nParticipants: {meeting.get('participants', '')}\n\n" messages = build_messages(me_peer, header + summary, metadata, created_at) send_messages(session, messages) print(" -> Imported as summary") def resolve_them_participant(others: list[Participant]) -> Participant | None: """Ask user to pick which participant is 'Them' from a multi-person meeting.""" for j, p in enumerate(others, 1): email_str = f" <{p.email}>" if p.email else "" print(f" {j}. {p.name}{email_str}") idx_str = input(f" Who is 'Them'? 
[1-{len(others)}]: ").strip() try: return others[int(idx_str) - 1] except (ValueError, IndexError): print(" Invalid selection.") return None def review_meeting( index: int, total: int, meeting: dict[str, Any], participants: ParsedParticipants, turns: list[TranscriptTurn], ) -> tuple[str, Participant | None]: """Display meeting info and get user's import choice. Returns (mode, them_participant) where mode is one of: - "two_person": import with speaker attribution using them_participant - "summary": import as a single summary message - "skip": skip this meeting """ title = meeting.get("title", "Untitled") date = meeting.get("date", "") creator = participants.note_creator others = participants.others me_turns = sum(1 for t in turns if t.speaker == "Me") them_turns = len(turns) - me_turns total_words = sum(len(t.text.split()) for t in turns) print(f"\n{'─' * 60}") print(f" [{index}/{total}] {title}") print(f" Date: {date}") if creator: print(f" You: {creator.name} <{creator.email}>") for j, p in enumerate(others, 1): email_str = f" <{p.email}>" if p.email else "" org_str = f" ({p.org})" if p.org else "" print(f" {j}. {p.name}{email_str}{org_str}") has_transcript = bool(meeting.get("transcript")) if turns: print(f" Transcript: {me_turns} Me, {them_turns} Them, ~{total_words} words") if them_turns == 0: print(" ** No 'Them' turns — nobody else spoke **") if total_words < 30: print(" ** Very short — might be empty **") elif has_transcript: raw = meeting["transcript"] print(f" Transcript: present ({len(raw)} chars) but could not parse speaker turns") print(f" Preview: {raw[:200]!r}") else: print(f" Content: {'summary available' if extract_summary(meeting) else 'metadata only'}") # Two-person default: exactly one other participant with transcript if len(others) == 1 and them_turns > 0: them_label = others[0].name + (f" <{others[0].email}>" if others[0].email else "") print(f"\n Detected: 2-person call (you + {them_label})") choice = input(" [Enter] 2-person / [s]ummary / [k] skip: ").strip().lower() while choice not in ("", "s", "k"): choice = input(" [Enter] 2-person / [s]ummary / [k] skip: ").strip().lower() if choice == "k": return ("skip", None) if choice == "s": return ("summary", None) return ("two_person", others[0]) # Multi-person with transcript if len(others) > 1 and them_turns > 0: print(f"\n {len(others)} participants") choice = input(" [Enter] summary / [2] 2-person / [k] skip: ").strip().lower() while choice not in ("", "2", "k"): choice = input(" [Enter] summary / [2] 2-person / [k] skip: ").strip().lower() if choice == "k": return ("skip", None) if choice == "2": them = resolve_them_participant(others) if them is None: return ("summary", None) return ("two_person", them) return ("summary", None) # No transcript or no other speakers choice = input(" [Enter] summary / [k] skip: ").strip().lower() while choice not in ("", "k"): choice = input(" [Enter] summary / [k] skip: ").strip().lower() if choice == "k": return ("skip", None) return ("summary", None) # --------------------------------------------------------------------------- # Main # --------------------------------------------------------------------------- async def main(): print("=" * 60) print(" Granola -> Honcho Meeting Notes Transfer") print("=" * 60) if not os.environ.get("HONCHO_API_KEY"): print("\nError: HONCHO_API_KEY not set.") print(" Get your key at: https://app.honcho.dev/api-keys") sys.exit(1) async with httpx.AsyncClient(timeout=60.0) as http_client: try: access_token = await authenticate(http_client) meetings = 
await fetch_all_meetings(http_client, access_token) if not meetings: sys.exit(0) from honcho import Honcho honcho = Honcho(workspace_id="granola") seen_peers: set[str] = set() results = {"imported": 0, "skipped": 0, "failed": 0} print("\n" + "=" * 60) print(" Review each meeting") print("=" * 60) for i, m in enumerate(meetings, 1): mid = m.get("id") if not mid: continue participants = parse_participants(m.get("participants", "")) turns = parse_transcript_turns(m["transcript"]) if m.get("transcript") else [] mode, them = review_meeting(i, len(meetings), m, participants, turns) if mode == "skip": print(" -> Skipped") results["skipped"] += 1 continue # Resolve creator peer creator = participants.note_creator me_source = (creator.email or creator.name) if creator else None if not me_source: print(" -> Skipped (no creator identifier)") results["skipped"] += 1 continue me_peer_id = peer_id_from(me_source) if me_peer_id not in seen_peers: print(f" New peer: {me_source} ({me_peer_id})") seen_peers.add(me_peer_id) try: created_at = parse_date(m.get("date", "")) session = honcho.session(f"meeting-{mid}") metadata: dict[str, object] = { "title": m.get("title", "Untitled"), "date": m.get("date", ""), "granola_meeting_id": mid, "mode": mode, } if mode == "two_person" and them is not None: them_source = them.email or them.name them_peer_id = peer_id_from(them_source) if them_peer_id not in seen_peers: print(f" New peer: {them_source} ({them_peer_id})") seen_peers.add(them_peer_id) import_two_person(honcho, session, me_peer_id, them_peer_id, turns, metadata, created_at) else: import_summary(honcho, session, me_peer_id, m, metadata, created_at) results["imported"] += 1 except ValueError as e: print(f" -> FAILED: {e}") results["failed"] += 1 except Exception as e: print(f" -> FAILED: {e}") traceback.print_exc() results["failed"] += 1 # Done print("\n" + "=" * 60) print(" Transfer Complete!") print("=" * 60) print(f"\n Imported: {results['imported']}") print(f" Skipped: {results['skipped']}") print(f" Failed: {results['failed']}") print(" Workspace: granola") print(f" Peers: {sorted(seen_peers)}") except KeyboardInterrupt: print("\n\nAborted.") sys.exit(0) except Exception as e: print(f"\nTransfer failed: {e}") traceback.print_exc() sys.exit(1) if __name__ == "__main__": asyncio.run(main()) ```
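To sanity-check a transfer after the script finishes, you can list what landed in the workspace. Here's a minimal sketch, assuming the script's default `granola` workspace and `HONCHO_API_KEY` set in your environment:

```python theme={null}
import os
from honcho import Honcho

# Assumes the defaults used by the script above: workspace "granola"
honcho = Honcho(workspace_id="granola", api_key=os.environ["HONCHO_API_KEY"])

# The script creates one session per meeting, named "meeting-{id}"
for session in honcho.sessions():
    print(f"Session: {session.id}")
```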
## Next Steps See how the Granola integration maps to common Honcho patterns. Source code and example script. # Claude Code Source: https://honcho.dev/docs/v3/guides/integrations/claude-code Add AI-native memory to Claude Code Give Claude Code long-term memory that survives context wipes, session restarts, and `ctrl+c`. Claude remembers what you're working on, your preferences, and what it was doing across all your projects. ## Quick Start ### Step 1: Get Your Honcho API Key 1. Go to **[app.honcho.dev](https://app.honcho.dev)** 2. Sign up or log in 3. Copy your API key (starts with `hch-`) ### Step 2: Set Environment Variables Add these to your shell config (`~/.zshrc`, `~/.bashrc`, or `~/.profile`): ```bash theme={null} # Required export HONCHO_API_KEY="hch-your-api-key-here" # Optional (defaults shown) export HONCHO_PEER_NAME="$USER" # Your name/identity ``` Then reload your shell: ```bash theme={null} source ~/.zshrc # or ~/.bashrc ``` ### Step 3: Install the Plugin This plugin requires [Bun](https://bun.sh). If you don't have it: `curl -fsSL https://bun.sh/install | bash` Open Claude Code and run: ``` /plugin marketplace add plastic-labs/claude-honcho ``` Then install: ``` /plugin install honcho@honcho ``` ### Step 4: Restart Claude Code ```bash theme={null} # Exit Claude Code (ctrl+c or /exit) # Start it again claude ``` **That's it!** You should see the Honcho pixel art and memory loading on startup. ### Step 5: (Optional) Kickstart with an Interview ``` /honcho:interview ``` Claude will interview you about your personal preferences to kickstart a representation of you. What it learns will be saved in Honcho and remembered forever. The interview is specific to the peer name you chose — it carries across different projects! ## What You Get * **Persistent Memory** — Claude remembers your preferences, projects, and context across sessions * **Survives Context Wipes** — Even when Claude's context window resets, memory persists * **Git Awareness** — Detects branch switches, commits, and changes made outside Claude * **Flexible Sessions** — Map sessions per directory, per git branch, or per chat instance * **AI Self-Awareness** — Claude knows what it was working on, even after restarts * **Cross-Tool Context** — Link workspaces across Claude Code, Cursor, and other hosts so context flows between tools * **Team Support** — Multiple people can share a workspace and build context together * **MCP Tools** — Search memory, query knowledge about you, and save insights ## Configuration All configuration lives in a single global file at `~/.honcho/config.json`. You can edit it directly, use the `/honcho:config` skill interactively, or use the `set_config` MCP tool. Environment variables work for initial setup but the config file takes precedence once it exists. 
```jsonc theme={null} { // Required "apiKey": "hch-v2-...", // Identity "peerName": "alice", // Your name (default: $USER) // Host-specific settings — each tool gets its own workspace and AI peer "hosts": { "claude_code": { "workspace": "claude_code", // Workspace for Claude Code sessions "aiPeer": "claude", // AI identity in this workspace "linkedHosts": ["cursor"] // Read context from other hosts (optional) }, "cursor": { "workspace": "cursor", "aiPeer": "cursor" } }, // Session mapping "sessionStrategy": "per-directory", // "per-directory" | "git-branch" | "chat-instance" "sessionPeerPrefix": true, // Prefix session names with peerName (default: true) // Message handling "saveMessages": true, "messageUpload": { "maxUserTokens": null, // Truncate user messages (null = no limit) "maxAssistantTokens": null, // Truncate assistant messages (null = no limit) "summarizeAssistant": false // Summarize instead of sending full assistant text }, // Context retrieval "contextRefresh": { "messageThreshold": 30, // Refresh context every N messages "ttlSeconds": 300, // Cache TTL for context "skipDialectic": false // Skip dialectic chat() calls in user-prompt hook }, // Endpoint "endpoint": { "environment": "production" // "production" | "local" // or: "baseUrl": "http://your-server:8000/v3" }, // Miscellaneous "localContext": { "maxEntries": 50 }, "enabled": true, "logging": true, // Advanced: force all hosts to use the same workspace "globalOverride": false } ``` ### Session Strategies Session strategy controls how Honcho maps your conversations to sessions: | Strategy | Behavior | Best for | | ------------------------- | ----------------------------------------------------------------------------------- | --------------------------------------------------------- | | `per-directory` (default) | One session per project directory. Stable across restarts. | Most users — each project accumulates its own memory | | `git-branch` | Session name includes the current git branch. Switching branches switches sessions. | Feature-branch workflows where context per branch matters | | `chat-instance` | Each Claude Code chat gets its own session. No continuity between restarts. | Ephemeral usage or when you want a clean slate each time | Session names are prefixed with your `peerName` by default (e.g., `alice-my-project`). Set `sessionPeerPrefix: false` if you're the only user and want shorter names. ### Host-Aware Configuration The plugin auto-detects which tool is running it (Claude Code, Cursor, etc.) and reads the matching block from `hosts`. Each host gets its own workspace and AI peer name, so data stays separated by default. **Host detection priority:** 1. `HONCHO_HOST` env var (explicit override) 2. `cursor_version` in hook stdin (Cursor detected) 3. `CURSOR_PROJECT_DIR` env var (Cursor child process) 4. Default: `claude_code` ### Linking Hosts for Cross-Tool Context If you use both Claude Code and Cursor, you can link them so context from one is readable in the other. Writes always stay in the current host's workspace — linking only adds read access. ```jsonc theme={null} { "hosts": { "claude_code": { "workspace": "claude_code", "aiPeer": "claude", "linkedHosts": ["cursor"] // Claude Code can read Cursor's context }, "cursor": { "workspace": "cursor", "aiPeer": "cursor", "linkedHosts": ["claude_code"] // Cursor can read Claude Code's context } } } ``` Or use `/honcho:config` and select **Workspace > Linking** to set this up interactively. 
### Global Override If you want all hosts to share a single workspace (instead of per-host isolation), set `globalOverride: true` and a flat `workspace` field: ```jsonc theme={null} { "globalOverride": true, "workspace": "shared", "hosts": { "claude_code": { "aiPeer": "claude" }, "cursor": { "aiPeer": "cursor" } } } ``` All tools will read and write to the `shared` workspace. Each tool still uses its own AI peer name. ## Building with Teammates Multiple people can share context by pointing to the same workspace. Each person uses their own `peerName` as identity, and sessions are automatically prefixed with it to avoid collisions. **Person A** (`~/.honcho/config.json`): ```json theme={null} { "apiKey": "hch-v2-team-key...", "peerName": "alice", "hosts": { "claude_code": { "workspace": "team-acme", "aiPeer": "claude" } } } ``` **Person B** (`~/.honcho/config.json`): ```json theme={null} { "apiKey": "hch-v2-team-key...", "peerName": "bob", "hosts": { "claude_code": { "workspace": "team-acme", "aiPeer": "claude" } } } ``` Both Alice and Bob write to the `team-acme` workspace. Their sessions are namespaced (e.g., `alice-my-project`, `bob-my-project`) so data doesn't collide, but Honcho's dialectic reasoning can draw on context from both users. ## Logging The plugin logs activity to `~/.honcho/` and to Claude Code's verbose mode, so you can see exactly how Honcho is being used — what context is loaded at session start, what messages are saved, and what context is injected into Claude's prompts. Set `logging` to `false` in your config (or `HONCHO_LOGGING=false`) to disable file logging. ## MCP Tools The plugin provides these tools via MCP: | Tool | Description | | ------------------- | ----------------------------------------------- | | `search` | Semantic search across session messages | | `chat` | Query Honcho's knowledge about the user | | `create_conclusion` | Save insights about the user to memory | | `get_config` | View current configuration and status | | `set_config` | Change any configuration field programmatically | ## Skills (Slash Commands) | Command | Description | | ------------------- | ----------------------------------------------------------- | | `/honcho:status` | Show current memory status and connection info | | `/honcho:config` | Interactive configuration menu | | `/honcho:setup` | First-time setup — validate API key and create config | | `/honcho:interview` | Interview to capture stable, cross-project user preferences | ### The Interview The `/honcho:interview` skill conducts a short interview to learn stable, cross-project aspects about you: * **Communication style** — Concise answers, detailed explanations, or a mix * **Tone** — Direct and professional or conversational * **Structure** — Bullet points, step-by-step, or narrative * **Technical depth** — Beginner, intermediate, or expert * **Code quality focus** — Clarity, performance, tests, or minimal changes * **Collaboration style** — Make changes directly, propose options, or ask first Each answer is saved as a conclusion in Honcho memory and persists across all your projects. ## Environment Variables Environment variables work for initial bootstrap (before a config file exists). Once `~/.honcho/config.json` is written, the config file takes precedence for host-specific fields like `workspace`. 
| Variable | Required | Default | Description | | ---------------------- | -------- | ------------- | ----------------------------------------------------------------- | | `HONCHO_API_KEY` | **Yes** | — | Your Honcho API key from [app.honcho.dev](https://app.honcho.dev) | | `HONCHO_PEER_NAME` | No | `$USER` | Your identity in the memory system | | `HONCHO_WORKSPACE` | No | `claude_code` | Workspace name (used only when no config file exists) | | `HONCHO_AI_PEER` | No | `claude` | AI peer name | | `HONCHO_HOST` | No | auto-detected | Force host detection: `claude_code`, `cursor`, or `obsidian` | | `HONCHO_ENDPOINT` | No | `production` | `production`, `local`, or a full URL | | `HONCHO_ENABLED` | No | `true` | Set to `false` to disable | | `HONCHO_SAVE_MESSAGES` | No | `true` | Set to `false` to stop saving messages | | `HONCHO_LOGGING` | No | `true` | Set to `false` to disable file logging to `~/.honcho/` | ### Using a local Honcho instance Via config file: ```json theme={null} { "endpoint": { "environment": "local" } } ``` Or via env var: ```bash theme={null} export HONCHO_ENDPOINT="local" # Uses http://localhost:8000/v3 ``` *** ## Using Honcho with Claude Desktop You can also use Honcho with the Claude Desktop app via MCP. This lets Claude manage its own memory in the native desktop experience. ### Step 1: Get Your API Key Get an API key from [app.honcho.dev](https://app.honcho.dev). ### Step 2: Configure Claude Desktop This requires [Node.js](https://nodejs.org). Claude Desktop or Claude Code can help you install it! Navigate to Claude Desktop's custom MCP servers settings and add Honcho: ```json theme={null} { "mcpServers": { "honcho": { "command": "npx", "args": [ "mcp-remote", "https://mcp.honcho.dev", "--header", "Authorization:${AUTH_HEADER}", "--header", "X-Honcho-User-Name:${USER_NAME}" ], "env": { "AUTH_HEADER": "Bearer ", "USER_NAME": "" } } } } ``` **Optional customization** — You can also set a custom assistant name and workspace ID: ```json theme={null} { "mcpServers": { "honcho": { "command": "npx", "args": [ "mcp-remote", "https://mcp.honcho.dev", "--header", "Authorization:${AUTH_HEADER}", "--header", "X-Honcho-User-Name:${USER_NAME}", "--header", "X-Honcho-Assistant-Name:${ASSISTANT_NAME}", "--header", "X-Honcho-Workspace-ID:${WORKSPACE_ID}" ], "env": { "AUTH_HEADER": "Bearer ", "USER_NAME": "", "ASSISTANT_NAME": "", "WORKSPACE_ID": "" } } } } ``` ### Step 3: Restart Claude Desktop Upon relaunch, Honcho should start and the tools will be available. ### Step 4: Add Instructions The Desktop app doesn't allow system prompts directly, but you can create a project and paste [these instructions](https://raw.githubusercontent.com/plastic-labs/honcho/refs/heads/main/mcp/instructions.md) into the "Project Instructions" field. Claude will then query for insights before responding and write your messages to storage! *** ## Next Steps Source code, issues, and README. Learn about peers, sessions, and dialectic reasoning. # CrewAI Source: https://honcho.dev/docs/v3/guides/integrations/crewai Build AI agents with persistent memory using CrewAI and Honcho Integrate Honcho with CrewAI to build AI agents that maintain memory across sessions. This guide shows you how to use Honcho's memory layer with CrewAI's agent orchestration framework. 
The full code is available on [GitHub](https://github.com/plastic-labs/honcho/tree/main/examples/crewai) with examples in [Python](https://github.com/plastic-labs/honcho/tree/main/examples/crewai/python/examples) ## What We're Building We'll create AI agents that remember and reason over past conversations. Here's how the pieces fit together: * **CrewAI** orchestrates agent behavior and task execution * **Honcho** stores messages and retrieves relevant context The key benefit: CrewAI automatically retrieves relevant conversation history from Honcho without you needing to manually manage context, token limits, or message formatting. This tutorial demonstrates a single-agent setup to show how Honcho integrates with CrewAI. For production applications, you can extend this to multi-agent crews with shared or individual memory using Honcho's `peer` system. ## Setup Install required packages: ```bash Python (uv) theme={null} uv add honcho-crewai crewai python-dotenv ``` ```bash Python (pip) theme={null} pip install honcho-crewai crewai python-dotenv ``` Use any LLM provider for your Crew. Create a `.env` file with your API keys: ```bash theme={null} OPENAI_API_KEY=your_openai_key ``` This tutorial uses the Honcho demo server at [https://demo.honcho.dev](https://demo.honcho.dev), which runs a small instance of Honcho on the latest version. For production, get your Honcho API key at [app.honcho.dev](https://app.honcho.dev). For local development, use `environment="local"`. ## CrewAI Honcho Storage The `honcho_crewai` package provides `HonchoStorage`, a storage provider that implements CrewAI's `Storage` interface using Honcho's session-based memory. Before proceeding, it's important to understand Honcho's core concepts (`Peers` and `Sessions`). Review the [Honcho Architecture](/docs/v3/documentation/core-concepts/architecture) to familiarize yourself with these primitives. Under the hood, `HonchoStorage` maps the `Storage` interface onto Honcho's `peer` and `session` primitives: ```python theme={null} storage = HonchoStorage( user_id="demo-user", # Required: Honcho `peer` ID for the user session_id=None, # Optional: Specific `session` ID (auto-generated UUID if None) honcho_client=None, # Optional: Pre-configured Honcho client instance ) ``` The `HonchoStorage` class implements three key methods: * **`save()`** - Stores messages in Honcho's `session`, associating them with the appropriate `peer` (user or assistant) * **`search()`** - Performs semantic vector search using `session.search()` to find messages most relevant to the query. Supports an optional `filters` parameter for fine-grained scoping. * **`reset()`** - Creates a new `session` to start fresh conversations CrewAI automatically calls these methods when agents need to store or retrieve memory, creating a seamless integration.
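To see the interface in isolation, here's a minimal sketch that calls the storage methods directly. It is illustrative only: the `"agent"` metadata key mirrors the basic example below and is assumed to drive peer attribution, and the return shape of `search()` isn't specified in this guide.

```python theme={null}
from honcho_crewai import HonchoStorage

# Storage backed by an auto-generated session for this user
storage = HonchoStorage(user_id="demo-user")

# save() writes the value into the Honcho session as a message;
# the "agent" metadata key marks which peer said it (assumption)
storage.save("I'm learning Python programming", metadata={"agent": "user"})

# search() runs semantic vector search over the session's messages
results = storage.search("what is the user learning?")

# reset() discards the current session and starts a fresh one
storage.reset()
```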
### Search with Filters The `search()` method supports an optional `filters` parameter for fine-grained scoping of search results: ```python theme={null} # Search with peer_id filter (only messages from a specific peer) results = storage.search("query", filters={"peer_id": "user123"}) # Search with metadata filter results = storage.search("query", filters={"metadata": {"priority": "high"}}) # Search with time range filter results = storage.search("query", filters={"created_at": {"gte": "2024-01-01"}}) # Complex filter with logical operators results = storage.search("query", filters={ "AND": [ {"peer_id": "user123"}, {"metadata": {"topic": "python"}} ] }) ``` For the full filter syntax including logical operators (AND, OR, NOT), comparison operators, and metadata filtering, see the [Using Filters](https://docs.honcho.dev/v3/documentation/features/advanced/using-filters) documentation. For comprehensive details about CrewAI's memory system, see the [official CrewAI Memory documentation](https://docs.crewai.com/en/concepts/memory). Let's create a basic example showing how CrewAI agents use Honcho's memory automatically: ```python Python theme={null} from dotenv import load_dotenv from crewai import Agent, Task, Crew, Process from crewai.memory.external.external_memory import ExternalMemory from honcho_crewai import HonchoStorage load_dotenv() storage = HonchoStorage(user_id="simple-demo-user") external_memory = ExternalMemory(storage=storage) messages = [ ("user", "I'm learning Python programming"), ("assistant", "Great! Python is an excellent language to learn."), ("user", "I'm particularly interested in web development"), ] for role, message in messages: external_memory.save(message, metadata={"agent": role}) agent = Agent( role="Programming Mentor", goal="Help users learn programming by remembering their interests and progress", backstory=( "You are a patient programming mentor who remembers what students " "have told you about their learning journey and interests." ), verbose=True, allow_delegation=False ) task = Task( description=( "Based on what you know about the user's interests, " "suggest a simple web development project they could build to practice Python." ), expected_output="A specific project suggestion with brief explanation", agent=agent ) crew = Crew( agents=[agent], tasks=[task], process=Process.sequential, external_memory=external_memory, verbose=True ) result = crew.kickoff() print(result.raw) ``` ## CrewAI Tool Integration Honcho provides specialized tools that give CrewAI agents explicit control over memory retrieval: * **`HonchoGetContextTool`** - Retrieves comprehensive conversation history with token limits. Use for tasks needing broad conversation understanding. * **`HonchoDialecticTool`** - Queries representations about `peer`s. Use for understanding user preferences and characteristics without full message history. * **`HonchoSearchTool`** - Performs semantic search for specific information. Supports optional `filters` parameter for fine-grained scoping. Use for targeted queries like "what did the user say about budget?" Agents can use multiple tools in sequence: search for topics, query dialectic for preferences, then get full context for generation. 
Here's an example demonstrating all three tools: ```python Python theme={null} from dotenv import load_dotenv from crewai import Agent, Task, Crew, Process from honcho import Honcho from honcho_crewai import ( HonchoGetContextTool, HonchoDialecticTool, HonchoSearchTool, ) load_dotenv() honcho = Honcho() user_id = "demo-user-45" session_id = "tools-demo-session" user = honcho.peer(user_id) session = honcho.session(session_id) messages = [ "I'm planning a trip to Japan in March", "I love trying authentic local cuisine, especially ramen and sushi", "My budget is around $3000 for a 10-day trip", "I'm interested in visiting both Tokyo and Kyoto", "I prefer staying in traditional ryokans over hotels", ] for msg in messages: session.add_messages([user.message(msg)]) context_tool = HonchoGetContextTool( honcho=honcho, session_id=session_id, peer_id=user_id ) dialectic_tool = HonchoDialecticTool( honcho=honcho, session_id=session_id, peer_id=user_id ) search_tool = HonchoSearchTool(honcho=honcho, session_id=session_id) # Note: The search tool supports optional filters for fine-grained scoping # Agents can use filters like {"peer_id": "user123"} or {"metadata": {"priority": "high"}} travel_agent = Agent( role="Travel Planning Specialist", goal="Create personalized travel recommendations using memory tools", backstory=( "You are an expert travel planner with access to conversation memory tools. " "Use the tools to understand the user's preferences before making recommendations." ), tools=[context_tool, dialectic_tool, search_tool], verbose=True, allow_delegation=False ) task = Task( description=( "Create a personalized 3-day Tokyo itinerary. " "Use the memory tools to understand:\n" " • Food preferences (use search_tool for 'cuisine' or 'food')\n" " • Travel style and budget (use dialectic_tool to query user knowledge)\n" " • Recent context (use context_tool to get conversation history)\n" "Then create a detailed plan matching their interests." ), expected_output=( "A 3-day Tokyo itinerary with:\n" " • Daily activities matching user interests\n" " • Restaurant recommendations\n" " • Accommodation suggestions\n" " • Budget considerations" ), agent=travel_agent ) crew = Crew( agents=[travel_agent], tasks=[task], process=Process.sequential, verbose=True ) crew.kickoff() ``` ## Tool-Based vs Automatic Memory **Use `HonchoStorage`** for automatic memory - CrewAI handles everything transparently. Best for simple conversational flows. **Use Honcho Tools** for strategic control - agents decide when and how to query memory. Best for multi-step reasoning, when different query types are needed, or multi-agent systems. You can combine both: automatic memory for baseline context, tools for specific queries. See the [hybrid memory example](https://github.com/plastic-labs/honcho/blob/main/examples/crewai/python/examples/hybrid_memory_example.py) for a complete implementation. **Multi-Agent Memory:** Use Honcho tools with different `peer_id` values to give each agent distinct memory and identity. 
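As a concrete sketch of that multi-agent pattern, each agent gets tools constructed with its own `peer_id`. The agent roles, goals, and the `crew-session` ID below are invented for illustration:

```python theme={null}
from crewai import Agent
from honcho import Honcho
from honcho_crewai import HonchoGetContextTool

honcho = Honcho()

# Each agent reads memory as its own peer, keeping identities distinct
researcher_context = HonchoGetContextTool(
    honcho=honcho, session_id="crew-session", peer_id="researcher"
)
writer_context = HonchoGetContextTool(
    honcho=honcho, session_id="crew-session", peer_id="writer"
)

researcher = Agent(
    role="Researcher",
    goal="Gather relevant background from memory",
    backstory="You dig through conversation history for facts.",
    tools=[researcher_context],
)
writer = Agent(
    role="Writer",
    goal="Draft responses using the researcher's findings",
    backstory="You turn gathered context into polished prose.",
    tools=[writer_context],
)
```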
## Next Steps Now that you have a working CrewAI integration with Honcho, you can: * **Create specialized agents** with domain-specific memory and context * **Use CrewAI's advanced features** like hierarchical processes, tool delegation, and conditional task execution * **Leverage logical reasoning** via the Dialectic API for deep `peer` understanding * **Implement custom tools** to give agents explicit control over memory retrieval ## Related Resources Understand Honcho's peer-based model and core primitives Learn about retrieving and formatting conversation context Query `peer` representations for deeper understanding Build stateful agents with LangGraph and Honcho # Hermes Agent + Honcho Source: https://honcho.dev/docs/v3/guides/integrations/hermes How Hermes Agent uses Honcho for persistent cross-session memory and user modeling [Hermes Agent](https://github.com/NousResearch/hermes-agent) is an open-source AI agent from [Nous Research](https://nousresearch.com) with tool-calling, terminal access, a skills system, and multi-platform deployment (Telegram, Discord, Slack, WhatsApp). Honcho gives Hermes persistent cross-session memory and user modeling. For setup, configuration, and CLI commands, see the [Hermes Agent Honcho docs](https://hermes-agent.nousresearch.com/docs/user-guide/features/honcho). ## What Honcho provides Honcho acts as a long-term memory and user-model layer alongside Hermes' built-in memory files (`MEMORY.md` and `USER.md`). It gives Hermes three capabilities: 1. **Prompt-time context injection** -- durable context about a user loaded into the prompt before generating a response. 2. **Cross-session continuity** -- recall of stable preferences, project history, and working context across conversations. 3. **Durable writeback** -- stable facts learned during a conversation stored back for future turns. These sit alongside Hermes' local session history. Session history remembers the current conversation. Honcho remembers what should still matter later. ## Dual-peer architecture Both the user and the AI agent have peer representations in Honcho: * **User peer**: observed from user messages. Learns preferences, goals, communication style. * **AI peer**: observed from assistant messages. Builds the agent's knowledge representation. Both representations are injected into the system prompt, giving Hermes awareness of both who it's talking to and what it knows. ## Available tools Hermes exposes four Honcho tools to the agent: | Tool | What it does | | ----------------- | --------------------------------------------------------------------------------------------------- | | `honcho_profile` | Fast peer card retrieval (no LLM). Returns curated key facts about the user. | | `honcho_search` | Semantic search over memory. Returns raw excerpts ranked by relevance. | | `honcho_context` | Dialectic Q\&A powered by Honcho's LLM. Synthesizes answers from conversation history. | | `honcho_conclude` | Writes durable facts to Honcho when the user states preferences, corrections, or important context. | ## Running Honcho locally with Hermes Follow the [Self-Hosting Guide](/docs/v3/contributing/self-hosting) to get Honcho running locally. 
Once it's up, point Hermes at your instance: ```bash theme={null} hermes memory setup # select "honcho", enter http://localhost:8000 as the base URL ``` Or manually create/edit the config file (checked in order: `$HERMES_HOME/honcho.json` > `~/.hermes/honcho.json` > `~/.honcho/config.json`): ```json theme={null} { "baseUrl": "http://localhost:8000", "hosts": { "hermes": { "enabled": true, "aiPeer": "hermes", "peerName": "your-name", "workspace": "hermes" } } } ``` For the full list of config fields (`recallMode`, `writeFrequency`, `sessionStrategy`, `dialecticReasoningLevel`, etc.), see the [Hermes memory provider docs](https://hermes-agent.nousresearch.com/docs/user-guide/features/memory-providers#honcho). **Community quick-start**: [elkimek/honcho-self-hosted](https://github.com/elkimek/honcho-self-hosted) provides a one-command installer with pre-configured model tiers and Hermes Agent integration. ## Verifying the integration ### 1. Check status ```bash theme={null} hermes memory status ``` This should show Honcho as the active memory provider with your base URL. ### 2. Store a fact and recall it across sessions In one conversation, tell Hermes something specific: ```text theme={null} My favorite programming language is Rust and I always use dark mode. ``` Start a **new session** (different thread, new CLI invocation, or a different platform). Ask: ```text theme={null} What do you know about my preferences? ``` If Hermes mentions Rust and dark mode without being told again, cross-session memory is working. The deriver processed your messages, extracted observations, and the dialectic recalled them. ### 3. Test tool calling directly Ask Hermes to use a specific Honcho tool: ```text theme={null} Use your honcho_search tool to find anything you know about me. ``` If Hermes calls the tool and returns results, the full tool pipeline (API connection, vector search, embedding) is functional. ## Configuration options | Field | Default | Description | | ------------------------- | --------------- | ----------------------------------------------------------------------------- | | `recallMode` | `hybrid` | `hybrid` (auto-inject + tools), `context` (inject only), `tools` (tools only) | | `writeFrequency` | `async` | `async`, `turn`, `session`, or integer N | | `sessionStrategy` | `per-directory` | `per-directory`, `per-repo`, `per-session`, `global` | | `dialecticReasoningLevel` | `low` | `minimal`, `low`, `medium`, `high`, `max` | | `dialecticDynamic` | `true` | Auto-bump reasoning level by query complexity | | `messageMaxChars` | `25000` | Max chars per message (chunked if exceeded) | ## Next steps Setup, configuration, CLI commands, and all config options. Source code, installation, and full documentation. Peers, sessions, and how reasoning works. Full local environment setup, provider configuration, and troubleshooting. # LangGraph Source: https://honcho.dev/docs/v3/guides/integrations/langgraph Build a stateful conversational AI agent with LangGraph and Honcho Integrate Honcho with LangGraph to build a conversational AI agent that maintains memory across sessions. This guide shows you how to use Honcho's memory layer with LangGraph's orchestration. 
The full code is available on [GitHub](https://github.com/plastic-labs/honcho/tree/main/examples/langgraph) with examples in both [Python](https://github.com/plastic-labs/honcho/blob/main/examples/langgraph/python/main.py) and [TypeScript](https://github.com/plastic-labs/honcho/blob/main/examples/langgraph/typescript/main.ts) ## What We're Building We'll create a conversational agent that remembers and reasons over past exchanges with the user. Here's how the pieces fit together: * **LangGraph** orchestrates the conversation flow * **Honcho** stores messages and retrieves relevant context * **Your LLM** generates responses using Honcho's formatted context The key benefit: You don't manually manage conversation history, token limits, or message formatting. Honcho handles memory so you can focus on your agent's logic. This tutorial demonstrates a simple linear conversation flow to show how Honcho integrates with LangGraph. For production applications, you'll likely want to add LangGraph features like conditional routing, tool calling, and multi-agent orchestration. ## Setup Install required packages: ```bash Python (uv) theme={null} uv add honcho-ai langgraph langchain-core openai python-dotenv ``` ```bash Python (pip) theme={null} pip install honcho-ai langgraph langchain-core openai python-dotenv ``` ```bash TypeScript (npm) theme={null} npm install @honcho-ai/sdk @langchain/langgraph openai dotenv ``` ```bash TypeScript (yarn) theme={null} yarn add @honcho-ai/sdk @langchain/langgraph openai dotenv ``` ```bash TypeScript (pnpm) theme={null} pnpm add @honcho-ai/sdk @langchain/langgraph openai dotenv ``` This tutorial uses OpenAI, but Honcho works with any LLM provider. Create a `.env` file with your API keys: ```bash theme={null} OPENAI_API_KEY=your_openai_key ``` This tutorial uses the Honcho demo server at [https://demo.honcho.dev](https://demo.honcho.dev) which runs a small instance of Honcho on the latest version. For production, get your Honcho API key at [app.honcho.dev](https://app.honcho.dev). For local development, use `environment="local"`. ## Initialize Clients ```python Python theme={null} import os from dotenv import load_dotenv from typing_extensions import TypedDict from honcho import Honcho, Peer, Session from openai import OpenAI from langgraph.graph import StateGraph, START, END load_dotenv() # Initialize Honcho honcho = Honcho() # Initialize OpenAI llm = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) ``` ```typescript TypeScript theme={null} import * as dotenv from "dotenv"; import { Honcho, Peer, Session } from "@honcho-ai/sdk"; import OpenAI from "openai"; import { Annotation } from "@langchain/langgraph"; import { StateGraph, START, END } from "@langchain/langgraph"; import * as readline from "readline/promises"; dotenv.config(); // Initialize Honcho const honcho = new Honcho({}); // Initialize OpenAI const llm = new OpenAI({ apiKey: process.env.OPENAI_API_KEY }); ``` ## Define LangGraph State Define your state schema to pass data through the graph. The state stores Honcho objects directly along with the current user message and assistant response. Before proceeding, it's important to understand Honcho's core concepts (`Peers` and `Sessions`). Review the [Honcho Architecture](/docs/v3/documentation/core-concepts/architecture) to familiarize yourself with these primitives. 
```python Python theme={null} class State(TypedDict): user_message: str assistant_response: str user: Peer assistant: Peer session: Session ``` ```typescript TypeScript theme={null} const StateAnnotation = Annotation.Root({ userMessage: Annotation<string>(), assistantResponse: Annotation<string>(), user: Annotation<Peer>(), assistant: Annotation<Peer>(), session: Annotation<Session>(), }); type State = typeof StateAnnotation.State; ``` ## Build the LangGraph Define your chatbot logic, using Honcho to retrieve conversation context. This function demonstrates how Honcho can store messages, retrieve context, and generate responses. ```python Python theme={null} def chatbot(state: State): user_message = state["user_message"] # Get objects from state user = state["user"] assistant = state["assistant"] session = state["session"] # Step 1: Store the user's message in the session # This adds it to Honcho's memory for future context retrieval session.add_messages([user.message(user_message)]) # Step 2: Get context in OpenAI format with token limit # context() retrieves relevant conversation history # tokens=2000 limits the context to 2000 tokens to manage costs and fit within model limits # to_openai() converts it to the format expected by OpenAI's API messages = session.context(tokens=2000).to_openai(assistant=assistant) # Step 3: Generate response using the context response = llm.chat.completions.create( model="gpt-5.1", messages=messages ) assistant_response = response.choices[0].message.content # Step 4: Store assistant response in Honcho for future context session.add_messages([assistant.message(assistant_response)]) return {"assistant_response": assistant_response} ``` ```typescript TypeScript theme={null} async function chatbot(state: State) { const userMessage = state.userMessage; // Get objects from state const user = state.user; const assistant = state.assistant; const session = state.session; // Step 1: Store the user's message in the session // This adds it to Honcho's memory for future context retrieval await session.addMessages([user.message(userMessage)]); // Step 2: Get context in OpenAI format with token limit // context() retrieves relevant conversation history // tokens: 2000 limits the context to 2000 tokens to manage costs and fit within model limits // toOpenAI() converts it to the format expected by OpenAI's API const messages = (await session.context({ tokens: 2000 })).toOpenAI(assistant); // Step 3: Generate response using the context const response = await llm.chat.completions.create({ model: "gpt-5.1", messages: messages }); const assistantResponse = response.choices[0].message.content!; // Step 4: Store assistant response for future context await session.addMessages([assistant.message(assistantResponse)]); return { assistantResponse: assistantResponse }; } ``` Now let's build the LangGraph: ```python Python theme={null} graph = StateGraph(State) \ .add_node("chatbot", chatbot) \ .add_edge(START, "chatbot") \ .add_edge("chatbot", END) \ .compile() ``` ```typescript TypeScript theme={null} const graph = new StateGraph(StateAnnotation) .addNode("chatbot", chatbot) .addEdge(START, "chatbot") .addEdge("chatbot", END) .compile(); ``` ### Understanding context() The [`context()`](/docs/v3/documentation/features/get-context) method retrieves comprehensive conversation context and formats it for your LLM.
It automatically: * **Manages conversation history** - Tracks all messages and determines what's relevant * **Respects token limits** - Stays within context window constraints without manual counting * **Handles long conversations** - Combines recent detailed messages with summaries of older exchanges * **Provides `peer` understanding** - Includes representations and `peer` cards when requested The `SessionContext` object always includes fields for messages, summaries, `peer` representations, and `peer` cards. By default, only `messages` and `summary` are populated. To populate peer-specific context, pass a `peer_target` parameter: **Using `peer_target` for Context:** * **Without `peer_perspective`**: Returns Honcho's omniscient view of `peer_target` (all conclusions and context) * **With `peer_perspective`**: Returns what `peer_perspective` knows about `peer_target` (perspective-based conclusions and context) That's it. Call `session.context().to_openai(assistant)` and you get properly formatted context tailored for your assistant. **Adding System Prompts:** Since `context()` returns conversation messages, you can easily prepend custom system instructions. Just add your system prompt to the beginning of the messages array before sending it to your LLM: `[{"role": "system", "content": "..."}, ...context_messages]`. For more details on all available parameters, see the [`context()` documentation](/docs/v3/documentation/features/get-context). ## Chat Loop Now we'll create the main conversation function. To simplify logic, we initialize Honcho objects once per conversation and pass them through the LangGraph state. The `run_conversation_turn` function initializes Honcho `Session` and `Peer` objects, passes them to the LangGraph, and returns the assistant's response. By calling it repeatedly with the same `user_id` and in the same session, the chat builds context over time. **Production Usage:** Honcho accepts any nanoid-compatible string for `user_id` and `session_id`. You can use IDs directly from your authentication system (Auth0, Firebase, Clerk, etc.) and session management without modification. This tutorial uses hardcoded values for simplicity. ```python Python theme={null} def run_conversation_turn(user_id: str, user_input: str, session_id: str | None = None): if not session_id: session_id = f"session_{user_id}" # Initialize Honcho objects user = honcho.peer(user_id) assistant = honcho.peer("assistant") session = honcho.session(session_id) result = graph.invoke({ "user_message": user_input, "user": user, "assistant": assistant, "session": session }) return result["assistant_response"] if __name__ == "__main__": print("Welcome to the AI Assistant! How can I help you today?") user_id = "test-user-123" while True: user_input = input("You: ") if user_input.lower() in ['quit', 'exit']: break response = run_conversation_turn(user_id, user_input) print(f"Assistant: {response}\n") ``` ```typescript TypeScript theme={null} async function runConversationTurn( userId: string, userInput: string, sessionId?: string ): Promise<string> { if (!sessionId) { sessionId = `session_${userId}`; } // Initialize Honcho objects const user = await honcho.peer(userId); const assistant = await honcho.peer("assistant"); const session = await honcho.session(sessionId); const result = await graph.invoke({ userMessage: userInput, user: user, assistant: assistant, session: session, }); return result.assistantResponse; } // Interactive chat loop async function main() { console.log("Welcome to the AI Assistant! How can I help you today?");
How can I help you today?"); const userId = "test-user-123"; const rl = readline.createInterface({ input: process.stdin, output: process.stdout, }); while (true) { const userInput = await rl.question("You: "); if (userInput.toLowerCase() === "quit" || userInput.toLowerCase() === "exit") { rl.close(); break; } const response = await runConversationTurn(userId, userInput); console.log(`Assistant: ${response}\n`); } } main(); ``` ## Next Steps Now that you have a working LangGraph integration with Honcho, you can: * **Create custom [LangChain tools](https://docs.langchain.com/oss/python/langchain/tools#customize-tool-properties) for your agent** - to fully utilize Honcho's memory & context management features * **Build a multi-agent LangGraph** where each agent is a Honcho `Peer` with its own memory ## Related Resources Learn more about retrieving and formatting conversation context Use Honcho in Claude Desktop with MCP # Model Context Protocol (MCP) Source: https://honcho.dev/docs/v3/guides/integrations/mcp Give any AI tool persistent memory with the Honcho MCP server The Honcho MCP server gives any MCP-compatible AI tool persistent memory and personalization. Connect it once and your AI assistant learns who you are, remembers your preferences, and gets better over time — across every conversation. **Server URL:** `https://mcp.honcho.dev` You'll need an API key from [app.honcho.dev](https://app.honcho.dev) to use the hosted MCP server. ## Client Setup Pick your client below and add the config. After adding, **restart the client fully** for changes to take effect. ### Claude Desktop Edit `~/Library/Application Support/Claude/claude_desktop_config.json`: Edit `%APPDATA%\Claude\claude_desktop_config.json`: ```json theme={null} { "mcpServers": { "honcho": { "command": "npx", "args": [ "mcp-remote", "https://mcp.honcho.dev", "--header", "Authorization:${AUTH_HEADER}", "--header", "X-Honcho-User-Name:${USER_NAME}" ], "env": { "AUTH_HEADER": "Bearer hch-your-key-here", "USER_NAME": "YourName" } } } } ``` After saving, fully quit and relaunch Claude Desktop. The Honcho tools should appear in the tool picker. For best results, create a project and paste these [instructions](https://raw.githubusercontent.com/plastic-labs/honcho/refs/heads/main/mcp/instructions.md) into the "Project Instructions" field so Claude knows how to use the memory tools. ### Claude Code ```bash theme={null} claude mcp add honcho \ --transport http \ --url "https://mcp.honcho.dev" \ --header "Authorization: Bearer hch-your-key-here" \ --header "X-Honcho-User-Name: YourName" ``` Or if you prefer the [Claude Code Honcho plugin](/docs/v3/guides/integrations/claudecode) for a deeper integration with persistent memory, git awareness, and agent skills: ```bash theme={null} /plugin marketplace add plastic-labs/claude-honcho ``` ### Codex Add to `~/.codex/config.toml`: ```toml theme={null} [mcp_servers.honcho] command = "npx" args = [ "mcp-remote", "https://mcp.honcho.dev", "--header", "Authorization:Bearer hch-your-key-here", "--header", "X-Honcho-User-Name:YourName" ] ``` Codex only supports stdio transport, so it uses `mcp-remote` as a bridge. Restart both the Codex CLI and VS Code extension after editing. ### Cursor Cursor supports MCP servers natively via HTTP. 
Add to your global config at `~/.cursor/mcp.json` or per-project at `.cursor/mcp.json`: ```json theme={null} { "mcpServers": { "honcho": { "url": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } ``` Alternatively, go to **Cursor Settings → MCP** and add a new HTTP server with the URL and headers above. ### Windsurf Add to `~/.codeium/windsurf/mcp_config.json`: ```json theme={null} { "mcpServers": { "honcho": { "serverUrl": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } ``` Windsurf uses `serverUrl` instead of `url`. ### VS Code (Copilot Chat) Add to your workspace `.vscode/mcp.json`: ```json theme={null} { "servers": { "honcho": { "type": "http", "url": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } ``` Or add to your User Settings JSON (`Cmd+Shift+P` → "Preferences: Open User Settings (JSON)"): ```json theme={null} { "mcp": { "servers": { "honcho": { "type": "http", "url": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } } ``` ### Cline Cline supports remote MCP servers natively. Open Cline's MCP settings at: `~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json` `%APPDATA%\Code\User\globalStorage\saoudrizwan.claude-dev\settings\cline_mcp_settings.json` ```json theme={null} { "mcpServers": { "honcho": { "url": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } ``` Or add it via the Cline sidebar: click the MCP Servers icon → **Configure** → **Remote Servers**. ### Zed Add to `~/.config/zed/settings.json`: ```json theme={null} { "context_servers": { "honcho": { "url": "https://mcp.honcho.dev", "headers": { "Authorization": "Bearer hch-your-key-here", "X-Honcho-User-Name": "YourName" } } } } ``` Zed uses `context_servers` instead of `mcpServers`. Native HTTP support requires Zed v0.214.5 or later. *** ## Optional Configuration You can customize the assistant name and workspace ID by adding extra headers. Both are optional. | Header | Default | Description | | ------------------------- | ------------- | --------------------------- | | `Authorization` | *required* | `Bearer hch-your-key-here` | | `X-Honcho-User-Name` | *required* | What the AI should call you | | `X-Honcho-Assistant-Name` | `"Assistant"` | Name for the AI peer | | `X-Honcho-Workspace-ID` | `"default"` | Isolate memory per project | Example with all headers (Claude Desktop format): ```json theme={null} { "mcpServers": { "honcho": { "command": "npx", "args": [ "mcp-remote", "https://mcp.honcho.dev", "--header", "Authorization:${AUTH_HEADER}", "--header", "X-Honcho-User-Name:${USER_NAME}", "--header", "X-Honcho-Assistant-Name:${ASSISTANT_NAME}", "--header", "X-Honcho-Workspace-ID:${WORKSPACE_ID}" ], "env": { "AUTH_HEADER": "Bearer hch-your-key-here", "USER_NAME": "YourName", "ASSISTANT_NAME": "Claude", "WORKSPACE_ID": "my-project" } } } } ``` *** ## Available Tools The recommended flow for a standard conversation uses `create_session` + `add_messages_to_session` + `chat`. See the [full instructions](https://raw.githubusercontent.com/plastic-labs/honcho/refs/heads/main/mcp/instructions.md) for a complete walkthrough. 
**Workspace** — `inspect_workspace`, `list_workspaces`, `search`, `get_metadata`, `set_metadata` **Peers** — `create_peer`, `list_peers`, `chat`, `get_peer_card`, `set_peer_card`, `get_peer_context`, `get_representation` **Sessions** — `create_session`, `list_sessions`, `delete_session`, `clone_session`, `add_peers_to_session`, `remove_peers_from_session`, `get_session_peers`, `inspect_session`, `add_messages_to_session`, `get_session_messages`, `get_session_message`, `get_session_context` **Conclusions** — `list_conclusions`, `query_conclusions`, `create_conclusions`, `delete_conclusion` **System** — `schedule_dream`, `get_queue_status` *** ## Verify It Works After setup, try asking your AI assistant: > "What do you know about me?" On the first conversation there won't be much — but after a few exchanges, Honcho's background reasoning will start building a representation of you. Ask again after a couple of conversations and you'll see the difference. *** ## Troubleshooting | Problem | Fix | | --- | --- | | Tools don't show up | Make sure you fully restarted the client after adding the config. | | Authorization errors | Check your API key at [app.honcho.dev](https://app.honcho.dev). It should start with `hch-`. | | `npx` not found | Install Node.js — your AI assistant can help with this. | | "No personalization insights found" | Normal for new users. Honcho needs a few conversations to build context. | | Connection timeouts | Check that `https://mcp.honcho.dev` is accessible from your network. | Need help? Join us on [Discord](https://discord.gg/honcho) or open an issue on [GitHub](https://github.com/plastic-labs/honcho/tree/main/mcp). # n8n Source: https://honcho.dev/docs/v3/guides/integrations/n8n Connect Honcho to n8n to build intelligent automation workflows and agents that leverage persistent memory across sessions. ## Quick Start ### Prerequisites * n8n instance (self-hosted or cloud) * Honcho API key ([get one here](https://app.honcho.dev)) * Basic understanding of n8n workflows * Basic understanding of [Honcho architecture](/docs/v3/documentation/core-concepts/architecture), specifically **workspaces**, **sessions**, **peers**, and **messages** ### Before You Start **This integration uses HTTP Request nodes.** There's no native Honcho node for n8n yet. While this requires more setup, it gives you full control over the API and works with any n8n version. **This tutorial is instructional, not production-ready.** We load a single Gmail message with hardcoded IDs to demonstrate the concepts clearly. See [Next Steps](#next-steps) for handling multiple messages and dynamic configurations. **Why Honcho over n8n's built-in memory?** n8n's "memory" nodes are vector databases for RAG-style retrieval. Honcho offers richer context and reasoning—it builds understanding of users over time, not just similarity search. [Learn more](https://blog.plasticlabs.ai/blog/Memory-as-Reasoning). ### Setting Up the HTTP Request Node The Honcho integration in n8n uses the HTTP Request node to interact with the Honcho API. Here's how to configure it: 1. Add an **HTTP Request** node to your workflow (Core > HTTP Request)
*Screenshot: adding the HTTP Request node from Core nodes.*
2. Set the **Method** based on your operation (typically `POST` for creating resources, `GET` for retrieving them) 3. Set the **URL** to the appropriate Honcho API endpoint. For example, the create-workspace endpoint is `https://api.honcho.dev/v3/workspaces` 4. For authentication, select **Generic Credential Type** and then **Bearer Auth** 5. Click **Create New Credential** and paste your Honcho API key into the Bearer Token field
*Screenshot: Bearer Auth credential setup in n8n.*
6. Check **Send Body** and select **JSON** as the Body Content Type when creating resources ## Step-by-Step Tutorial We'll build a workflow that ingests Gmail emails into Honcho, then uses that memory to power a conversational AI chatbot. The workflow has two parts (separated by sticky notes in the canvas): 1. **Data Ingestion**: Manual trigger → Workspace → Session → Gmail → Extract Peers → Create Peers → Add to Session → Create Messages 2. **AI Chat Interface**: Chat Trigger → Agent (with LLM and Honcho tools) The Agent uses Honcho's `context()` endpoint to retrieve relevant information about email conversations, enabling contextual conversations about your email data. *Screenshot: complete workflow overview showing both data ingestion and chat sections.* ### Part 1: Loading Email Data into Honcho These nodes handle the initial setup and data ingestion: **Pro Tip: Copy from API Playground** The fastest way to configure any Honcho endpoint is to copy the curl command directly from the [app.honcho.dev](https://app.honcho.dev) API playground: 1. Navigate to the endpoint you want to use in the API playground 2. Fill in your parameters, verify the results, and click **Copy as cURL** 3. Then, in n8n, use the **Import cURL** button to import the request directly (be sure to verify that the bearer token imported correctly) #### Step 1: Manual Trigger Start with a **Manual Trigger** node to execute the workflow on demand. This is useful for initial setup and testing before automating with a Gmail trigger. #### Step 2: Get or Create Workspace 1. Add an **HTTP Request** node 2. **Method**: `POST` 3. **URL**: `https://api.honcho.dev/v3/workspaces` 4. **Body** (JSON): `{ "id": "email-test", "metadata": {} }` **Verify Your Data in Honcho** As you build the data ingestion workflow, verify everything is created correctly in your [Honcho instance](https://app.honcho.dev/). #### Step 3: Get or Create Session 1. Add another **HTTP Request** node 2. **Method**: `POST` 3. **URL**: `https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/sessions` 4. **Body** (JSON): `{ "id": "new_session" }` #### Step 4: Get Gmail Message 1. Add a **Gmail** node 2. **Operation**: Get 3. **Message ID**: Your target message ID (a string of letters & numbers) 4. Configure your Gmail OAuth2 credentials **Finding the Gmail Message ID** The easiest way to find a Gmail message ID is to use n8n's Gmail Get Many operation. Temporarily add it, set the limit to 1, and execute. Use the message ID from the output in the Message ID field. In this tutorial, we load only a single message to demonstrate the workflow. #### Step 5: Extract Peers from Email Use native n8n nodes to extract email participants as peers: **5a. Add a Set node ("Combine Email Fields")** * Combines From, To, Cc, Bcc into an array of individual emails * **Field name**: `allEmails` * **Type**: Array * **Value**: `{{ [$json.From, $json.To, $json.Cc, $json.Bcc].filter(Boolean).flatMap(field => field.split(',').map(e => e.trim())).filter(Boolean) }}` **5b. Add a Split Out node** * Splits the array into individual items (one per email address) * **Field to Split Out**: `allEmails` **5c. Add a Set node ("Clean Names")** * Extracts the display name from each email and formats it * **Field name**: `name` * **Value**: `{{ $json.allEmails.split('<')[0].trim().replace(/ /g, '_') }}` #### Step 6: Get or Create Peer 1. Add an **HTTP Request** node 2. **Method**: `POST` 3.
**URL**: `https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/peers` 4. **Body**: `{ "id": "{{ $json.name }}" }` This creates a peer for each email participant, allowing Honcho to build understanding of each person. #### Step 7: Add Peers to Session 1. Add an **HTTP Request** node 2. **Method**: `POST` 3. **URL**: `https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/sessions/{{ $('Get or Create Session').item.json.id }}/peers` 4. **Body**: `{ "{{ $json.id }}": {} }` #### Step 8: Limit Node Add a **Limit** node to control the flow so the message is added to the session only once. #### Step 9: Create Message for Session 1. Add an **HTTP Request** node 2. **Method**: `POST` 3. **URL**: `https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/sessions/{{ $('Get or Create Session').item.json.id }}/messages/` 4. **Body** (JSON): `{ "messages": [{ "content": "{{ $('Get a message').item.json.snippet }}", "peer_id": "{{ $('Get a message').item.json.From.split('<')[0].trim().replace(/ /g, '_') }}" }] }` The `peer_id` must exactly match a peer created in Step 6. The expression above uses the same cleaning logic as the Clean Names node (`split('<')[0].trim().replace(/ /g, '_')`). ### Part 2: Building a Stateful AI Chatbot Now that data is loaded into Honcho, create a chat interface that leverages this memory: #### Step 1: Chat Trigger Add a **When chat message received** node (from LangChain nodes) to create an interactive chat interface. #### Step 2: AI Agent 1. Add an **Agent** node (LangChain) 2. Configure the system message: ``` You are a helpful assistant that retrieves context about email conversations. Use the Context tool to retrieve session context. Today's date: {{ $now }} ``` #### Step 3: Connect LLM Add an **OpenAI Chat Model** node (or your preferred LLM) and connect it to the Agent. #### Step 4: Add Honcho Tools Create an HTTP Request Tool node for Honcho's context retrieval: **Context Tool:** * **Method**: `GET` * **URL**: `https://api.honcho.dev/v3/workspaces/email-test/sessions/new_session/context` * Returns formatted context for the entire session including all messages and peer interactions Connect the tool to the Agent node. The URL uses the same workspace (`email-test`) and session (`new_session`) IDs created during data ingestion. *** ## Import the Workflow Want to skip the manual setup? Import this workflow directly into n8n. In n8n, go to **Workflows** → **Import from URL** (use the raw JSON link below) or **Import from File**.
[Import from URL (raw JSON)](https://raw.githubusercontent.com/plastic-labs/honcho/main/examples/n8n/n8n.json) or expand below to copy: ```json theme={null} { "name": "Honcho Empowered Email AI Agent", "nodes": [ { "parameters": { "content": "## Data Ingestion\nLoads Gmail email into Honcho.\n\n**Run this section first** by clicking 'Execute workflow'.", "height": 356, "width": 2008 }, "type": "n8n-nodes-base.stickyNote", "typeVersion": 1, "position": [ -16, -64 ], "id": "53f767a8-8df9-4ad7-8a3d-37c145495627", "name": "Sticky Note - Data Ingestion" }, { "parameters": { "content": "## AI Chat With Honcho context()\nQuery your email data using natural language.\n\n**Run after data ingestion** to chat with the agent.", "height": 480, "width": 752 }, "type": "n8n-nodes-base.stickyNote", "typeVersion": 1, "position": [ 32, 464 ], "id": "34066961-d29e-4fe8-93bf-8f7043e142b0", "name": "Sticky Note - AI Chat" }, { "parameters": { "model": "gpt-4o", "options": {} }, "id": "16da7a98-5622-427a-bbab-1be39f828d0b", "name": "OpenAI Chat Model", "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi", "position": [ 240, 800 ], "typeVersion": 1, "credentials": { "openAiApi": { "id": "qrvGphL3ydUODxQZ", "name": "OpenAi account" } } }, { "parameters": { "options": { "systemMessage": "You are a helpful assistant that retrieves context about email conversations.\n\nUse the Context tool to retrieve session context.\n\nToday's date: {{ $now }}" } }, "id": "049c3c19-756c-4755-88d9-94312857d9bb", "name": "AI Agent", "type": "@n8n/n8n-nodes-langchain.agent", "position": [ 368, 576 ], "typeVersion": 1.7 }, { "parameters": { "options": {} }, "id": "b9a2ef6a-0e81-45c9-a5ea-16e74a9aa77d", "name": "When chat message received", "type": "@n8n/n8n-nodes-langchain.chatTrigger", "position": [ 80, 576 ], "webhookId": "c91764c2-0b51-4025-ad74-d5f44127aa5a", "typeVersion": 1.1 }, { "parameters": { "method": "POST", "url": "=https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/peers", "authentication": "predefinedCredentialType", "nodeCredentialType": "httpBearerAuth", "sendBody": true, "bodyParameters": { "parameters": [ { "name": "id", "value": "={{ $json.name }}" } ] }, "options": {} }, "type": "n8n-nodes-base.httpRequest", "typeVersion": 4.3, "position": [ 1296, 96 ], "id": "f2e81445-a866-4d9f-9a8a-b2dc8cbaea8b", "name": "Get or Create Peer", "credentials": { "httpBearerAuth": { "id": "NbrkGo1GdYWQY3OX", "name": "Bearer Auth account" } } }, { "parameters": { "operation": "get", "messageId": "19b8fee837985953" }, "type": "n8n-nodes-base.gmail", "typeVersion": 2.2, "position": [ 608, 96 ], "id": "ee5d8cc7-876b-4096-b6e5-14f1b92084a6", "name": "Get a message", "webhookId": "4ab02540-af03-405d-a6eb-dfcaa76477fc", "credentials": { "gmailOAuth2": { "id": "a2RvA5NMNjfOeHtd", "name": "Gmail account 2" } } }, { "parameters": { "method": "POST", "url": "=https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/sessions", "authentication": "predefinedCredentialType", "nodeCredentialType": "httpBearerAuth", "sendBody": true, "bodyParameters": { "parameters": [ { "name": "id", "value": "=new_session" } ] }, "options": {} }, "type": "n8n-nodes-base.httpRequest", "typeVersion": 4.3, "position": [ 448, 96 ], "id": "eef01db2-3355-484d-97b8-34480c40fcf8", "name": "Get or Create Session", "credentials": { "httpBearerAuth": { "id": "NbrkGo1GdYWQY3OX", "name": "Bearer Auth account" } } }, { "parameters": {}, "type": "n8n-nodes-base.manualTrigger", "typeVersion": 1, "position": [ 48, 96 ], 
"id": "a9de9d8f-7911-444f-99bd-7d287f587eff", "name": "When clicking 'Execute workflow'" }, { "parameters": { "method": "POST", "url": "https://api.honcho.dev/v3/workspaces", "authentication": "predefinedCredentialType", "nodeCredentialType": "httpBearerAuth", "sendBody": true, "specifyBody": "json", "jsonBody": "{\n \"id\": \"email-test\",\n \"metadata\": {}\n}", "options": {} }, "type": "n8n-nodes-base.httpRequest", "typeVersion": 4.3, "position": [ 240, 96 ], "id": "49b91f82-1cd8-49aa-85a5-57a8ad7b5128", "name": "Get or Create Workspace", "credentials": { "httpBearerAuth": { "id": "NbrkGo1GdYWQY3OX", "name": "Bearer Auth account" } } }, { "parameters": { "assignments": { "assignments": [ { "id": "allEmails", "name": "allEmails", "type": "array", "value": "={{ [$json.From, $json.To, $json.Cc, $json.Bcc].filter(Boolean).flatMap(field => field.split(',').map(e => e.trim())).filter(Boolean) }}" } ] }, "options": {} }, "type": "n8n-nodes-base.set", "typeVersion": 3.4, "position": [ 784, 96 ], "id": "3eb4ce63-d95c-42d7-8a16-f35e2419d8c0", "name": "Combine Email Fields" }, { "parameters": { "fieldToSplitOut": "allEmails", "options": {} }, "type": "n8n-nodes-base.splitOut", "typeVersion": 1, "position": [ 960, 96 ], "id": "4702001d-40f5-4b44-b443-6fb0b94e58a9", "name": "Split Out" }, { "parameters": { "assignments": { "assignments": [ { "id": "name", "name": "name", "type": "string", "value": "={{ $json.allEmails.split('<')[0].trim().replace(/ /g, '_') }}" } ] }, "options": {} }, "type": "n8n-nodes-base.set", "typeVersion": 3.4, "position": [ 1136, 96 ], "id": "93327201-bfbe-4385-bfcd-7646f0513a62", "name": "Clean Names" }, { "parameters": { "method": "POST", "url": "=https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/sessions/{{ $('Get or Create Session').item.json.id }}/messages/", "authentication": "predefinedCredentialType", "nodeCredentialType": "httpBearerAuth", "sendBody": true, "specifyBody": "json", "jsonBody": "={\"messages\": [{\"content\": \"{{ $('Get a message').item.json.snippet }}\", \"peer_id\": \"{{ $('Get a message').item.json.From.split('<')[0].trim().replace(/ /g, '_') }}\"}]}", "options": {} }, "type": "n8n-nodes-base.httpRequest", "typeVersion": 4.3, "position": [ 1808, 96 ], "id": "cb6732d0-e5a9-4d6e-98ef-8fe1c290740b", "name": "Create Message for Session", "credentials": { "httpBearerAuth": { "id": "NbrkGo1GdYWQY3OX", "name": "Bearer Auth account" } } }, { "parameters": { "method": "POST", "url": "=https://api.honcho.dev/v3/workspaces/{{ $('Get or Create Workspace').item.json.id }}/sessions/{{ $('Get or Create Session').item.json.id }}/peers", "authentication": "predefinedCredentialType", "nodeCredentialType": "httpBearerAuth", "sendBody": true, "specifyBody": "json", "jsonBody": "={\"{{ $json.id }}\": {}}", "options": {} }, "type": "n8n-nodes-base.httpRequest", "typeVersion": 4.3, "position": [ 1472, 96 ], "id": "3c2065b6-bd68-4698-a740-db3cd52f2267", "name": "Add Peers to Session", "credentials": { "httpBearerAuth": { "id": "NbrkGo1GdYWQY3OX", "name": "Bearer Auth account" } } }, { "parameters": {}, "type": "n8n-nodes-base.limit", "typeVersion": 1, "position": [ 1632, 96 ], "id": "d7c38f49-1a76-4a67-a540-4bec7aae1d9f", "name": "Limit" }, { "parameters": { "url": "https://api.honcho.dev/v3/workspaces/email-test/sessions/new_session/context", "authentication": "predefinedCredentialType", "nodeCredentialType": "httpBearerAuth", "options": {} }, "type": "n8n-nodes-base.httpRequestTool", "typeVersion": 4.3, "position": [ 656, 784 ], "id": 
"095e62d6-aae3-4eb8-9b9c-c473bfe2716d", "name": "Context", "credentials": { "httpBearerAuth": { "id": "NbrkGo1GdYWQY3OX", "name": "Bearer Auth account" } } } ], "pinData": {}, "connections": { "OpenAI Chat Model": { "ai_languageModel": [ [ { "node": "AI Agent", "type": "ai_languageModel", "index": 0 } ] ] }, "When chat message received": { "main": [ [ { "node": "AI Agent", "type": "main", "index": 0 } ] ] }, "When clicking 'Execute workflow'": { "main": [ [ { "node": "Get or Create Workspace", "type": "main", "index": 0 } ] ] }, "Get or Create Workspace": { "main": [ [ { "node": "Get or Create Session", "type": "main", "index": 0 } ] ] }, "Get a message": { "main": [ [ { "node": "Combine Email Fields", "type": "main", "index": 0 } ] ] }, "Get or Create Session": { "main": [ [ { "node": "Get a message", "type": "main", "index": 0 } ] ] }, "Combine Email Fields": { "main": [ [ { "node": "Split Out", "type": "main", "index": 0 } ] ] }, "Split Out": { "main": [ [ { "node": "Clean Names", "type": "main", "index": 0 } ] ] }, "Clean Names": { "main": [ [ { "node": "Get or Create Peer", "type": "main", "index": 0 } ] ] }, "Get or Create Peer": { "main": [ [ { "node": "Add Peers to Session", "type": "main", "index": 0 } ] ] }, "Add Peers to Session": { "main": [ [ { "node": "Limit", "type": "main", "index": 0 } ] ] }, "Limit": { "main": [ [ { "node": "Create Message for Session", "type": "main", "index": 0 } ] ] }, "Context": { "ai_tool": [ [ { "node": "AI Agent", "type": "ai_tool", "index": 0 } ] ] } }, "active": false, "settings": { "executionOrder": "v1", "availableInMCP": false }, "versionId": "ba0a3b77-cc19-49fd-9189-aaee65b35f99", "meta": { "templateCredsSetupCompleted": true, "instanceId": "4e34c96e55eb26be21fa69ca62c4851a5d09b678190481f5d47c084b6b327003" }, "id": "dKOYeEOdrZOetmFRmIAUJ", "tags": [] } ``` **Important:** After importing, you'll need to: * Add your Honcho API key to the Bearer Auth credential * Connect your Gmail OAuth2 credential * Add your OpenAI API key (or swap for your preferred LLM) * Update the Gmail Message ID in "Get a message" node **Running the workflow:** 1. First, execute the data ingestion section (click "Execute workflow") 2. Then use the chat interface to query your email data *** ## Next Steps Once you have the basic workflow running, consider these enhancements: * **Dynamic IDs**: Use n8n variables instead of hardcoding `email-test` and `new_session` * **Chat with Peers**: Add an HTTP Request Tool for natural language queries about peer representations. Read more in the [docs](/docs/v3/documentation/features/chat). * **Load more messages**: Use Gmail's "Get All" operation to load entire conversation threads * **Make it real-time**: Add a **Gmail Trigger** node to automatically ingest new emails as they arrive * **Add error handling**: Connect an **Error Trigger** node with notifications (Email, Slack) and retry logic * **Expand to other data sources**: Honcho works with Slack messages, CRM interactions, support tickets, and more *** ## Related Resources Understand workspaces, sessions, peers, and messages Learn about retrieving formatted conversation context # OpenClaw Source: https://honcho.dev/docs/v3/guides/integrations/openclaw Add AI-native memory to OpenClaw [OpenClaw](https://openclaw.ai) is a general AI agent that can perform actions on behalf of a user. The Honcho plugin gives OpenClaw memory across every channel — WhatsApp, Telegram, Discord, Slack, and more. Honcho can run entirely locally with OpenClaw — no external API required. 
Keep your data on your machine while getting full memory capabilities across all channels. See the [self-hosting guide](/docs/v3/contributing/self-hosting) to get started.

For OpenClaw's own documentation on Honcho, see the [Honcho Memory guide](https://docs.openclaw.ai/concepts/memory-honcho).

## Install the Plugin

```bash theme={null}
openclaw plugins install @honcho-ai/openclaw-honcho
openclaw honcho setup
openclaw gateway --force
```

`openclaw honcho setup` prompts for your API key, writes the config, and optionally uploads any legacy memory files to Honcho.
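If you are pointing the plugin at a self-hosted Honcho instance, it can help to confirm the server is reachable before running setup. A minimal sketch, assuming a local deployment listening on `localhost:8000` (an assumed address, not a documented default; substitute whatever host and port your deployment uses):

```bash theme={null}
# Probe Honcho's health check endpoint; a 200 response means the server is up.
# localhost:8000 is an assumption for a local deployment.
curl -s http://localhost:8000/health
```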