Welcome to the Honcho changelog! This section documents all notable changes to the Honcho API and SDKs.
Each release is documented with:
  • Added: New features and capabilities
  • Changed: Modifications to existing functionality
  • Deprecated: Features that will be removed in future versions
  • Removed: Features that have been removed
  • Fixed: Bug fixes and corrections
  • Security: Security-related improvements

Version Format

Honcho follows Semantic Versioning:
  • MAJOR version for incompatible API changes
  • MINOR version for backwards-compatible functionality additions
  • PATCH version for backwards-compatible bug fixes
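Because versions are plain MAJOR.MINOR.PATCH triples, precedence can be sketched with a simple tuple compare. This helper is illustrative only, not part of the Honcho SDKs:

```python
def parse_semver(version: str) -> tuple[int, int, int]:
    """Split a MAJOR.MINOR.PATCH string into a comparable tuple."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

# Tuples compare element by element, matching semver precedence
assert parse_semver("3.0.6") > parse_semver("3.0.5")
assert parse_semver("3.1.0") > parse_semver("3.0.6")
```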

Honcho API and SDK Changelogs

v3.0.6 (Current)

Changed

  • Tightened transaction scopes across search, agent tools, queue manager, and webhook delivery to minimize DB connection hold time during external operations (#525)
  • Search operations refactored to two-phase pattern — external work (embeddings, LLM calls) completes before opening a transaction (#525)
  • Agent tool executor performs external operations before acquiring DB sessions (#525)
  • Queue manager transaction scope reduced to only the critical section (#525)
  • Webhook delivery no longer holds a DB session parameter (#525)
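The two-phase pattern these entries describe can be sketched as follows. All names here (`compute_embedding`, `transaction`, `FakeSession`) are illustrative stand-ins, not Honcho internals; the point is the ordering, with external work finishing before any DB connection is held:

```python
import asyncio
from contextlib import asynccontextmanager

async def compute_embedding(query: str) -> list[float]:
    """Stand-in for an external, network-bound embeddings call."""
    await asyncio.sleep(0)
    return [float(len(query))] * 3

class FakeSession:
    """Stand-in for a pooled DB session."""
    def nearest(self, embedding: list[float]) -> str:
        return f"doc-for-{embedding[0]:g}"

@asynccontextmanager
async def transaction():
    # In the real service this checks out a pooled connection;
    # keeping this scope as short as possible is the point of the refactor.
    yield FakeSession()

async def search(query: str) -> str:
    # Phase 1: external work (embeddings, LLM calls) completes first,
    # so no DB connection is held while waiting on the network.
    embedding = await compute_embedding(query)
    # Phase 2: the transaction covers only the critical DB section.
    async with transaction() as db:
        return db.nearest(embedding)

result = asyncio.run(search("hello"))
```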

Fixed

  • Session leakage in non-session-scoped dialectic chat calls (#526)

Added

  • Health check endpoint (/health) for container orchestration and load balancer probes (#510)

v3.0.5

Fixed

  • Explicit rollback on all transactions to force connections closed

v3.0.4

Added

  • JSONB metadata validation enforces 100 key limit and max depth of 5 (#419)
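A guard of the kind this entry describes can be sketched as below. Whether the 100-key limit applies only at the top level, and how lists count toward depth, are assumptions here; the actual check lives in Honcho's Pydantic schemas and may differ:

```python
MAX_KEYS = 100
MAX_DEPTH = 5

def validate_metadata(value, depth: int = 1) -> None:
    """Reject metadata that is too wide or too deeply nested (sketch)."""
    # Containers beyond the depth cap are rejected; scalar leaves are fine.
    if isinstance(value, (dict, list)) and depth > MAX_DEPTH:
        raise ValueError(f"metadata nested deeper than {MAX_DEPTH} levels")
    if isinstance(value, dict):
        # Assumption: the 100-key limit is enforced at the top level only.
        if depth == 1 and len(value) > MAX_KEYS:
            raise ValueError(f"metadata exceeds {MAX_KEYS} top-level keys")
        for nested in value.values():
            validate_metadata(nested, depth + 1)
    elif isinstance(value, list):
        for item in value:
            validate_metadata(item, depth + 1)
```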

Changed

  • Schemas refactored from single schemas.py into schemas/api.py, schemas/configuration.py, and schemas/internal.py with backwards-compatible re-exports (#419)

Fixed

  • Missing deleted_at filter on RepresentationManager._query_documents_recent() and ._query_documents_most_derived() allowed soft-deleted documents to leak into the deriver’s working representation (#456)
  • CleanupStaleItemsCompletedEvent emitted spuriously when no queue item was actually deleted (#454)
  • Empty JSON file uploads caused unhandled errors; now returns normalized error responses (#434)
  • Memory leak: _observation_locks switched to WeakValueDictionary to prevent unbounded growth (#419)
  • SQL injection in dependencies.py: parameterized set_config calls to prevent injection via request context (#419)
  • NUL byte crashes: string inputs (message content, queries, peer cards) now stripped at schema level (#419)
  • Filter recursion depth capped at 5 to prevent stack overflow (#419)
  • Dedup-skipped observations now correctly reflected in created counts (#477)
  • External vector store support for message search — routes queries through configured external vector store with oversampling and deduplication to handle chunked embeddings (#479)
  • Dialectic agent no longer holds a DB connection during LLM calls — embeddings are pre-computed before tool execution, DB sessions isolated in extract_preferences, query_documents no longer accepts a DB session parameter (#477)

v3.0.3

Added

  • Consolidated session context into a single DB session with 40/60 token budget allocation between summary and messages
  • Observation validation via ObservationInput Pydantic schema with partial-success support and batch embedding with per-observation fallback
  • Peer card hard cap of 40 facts with case-insensitive deduplication and whitespace normalization
  • Safe integer coercion (_safe_int) for all LLM tool inputs to handle non-integer values like "Infinity"
  • Embedding pre-computation and reuse across multiple search calls in dialectic and representation flows
  • Peer existence validation in dialectic chat endpoints — raises ResourceNotFoundException instead of silently failing
  • Logging filter to suppress noisy GET /metrics access logs
  • Oolong long-context aggregation benchmark (synth and real variants, 1K–4M token context windows)
  • MolecularBench fact quality evaluation (ambiguity, decontextuality, minimality scoring)
  • CoverageBench information recall evaluation (gold fact extraction, coverage matching, QA verification)
  • LoCoMo summary-as-context baseline evaluation
  • Webhook delivery tests, dependency lifecycle tests, queue cleanup tests, summarizer fallback tests
  • Parallel test execution via pytest-xdist with worker-specific databases
  • test_reasoning_levels.py script for LoCoMo dataset testing across reasoning levels

Changed

  • Workspace deletion is now async — returns 202 Accepted, validates no active sessions (409 Conflict), cascade-deletes in background
  • Redis caching layer now stores plain dicts instead of ORM objects, with v2-prefixed keys, resilient safe_cache_set/safe_cache_delete helpers, and deferred post-commit cache invalidation
  • All get_or_create_* CRUD operations now use savepoints (db.begin_nested()) instead of commit/rollback for race condition prevention
  • Reconciler vector sync uses direct ORM mutation instead of batch parameterized UPDATE statements
  • Summarizer enforces hard word limit in prompt and creates fallback text for empty summaries with summary_tokens = 0
  • Blocked Gemini responses (SAFETY, RECITATION, PROHIBITED_CONTENT, BLOCKLIST) now raise LLMError to trigger retry/backup-provider logic
  • Gemini client explicitly sets max_output_tokens from max_tokens parameter
  • All deriver and metrics collector logging replaced with structured logging.getLogger(__name__) calls
  • Dreamer specialist prompts updated to enforce durable-facts-only peer cards with max 40 entries and deduplication
  • GetOrCreateResult changed from NamedTuple to dataclass with async post_commit() method
  • FastAPI upgraded from 0.111.0 to 0.131.0; added pyarrow dependency
  • Queue status filtering to only show user-facing tasks (representation, summary, dream); excludes internal infrastructure tasks
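The savepoint-based get-or-create above can be sketched with stdlib sqlite3; SQLAlchemy's `db.begin_nested()` issues the same `SAVEPOINT` commands under the hood. Table and function names are illustrative, not Honcho's:

```python
import sqlite3

def get_or_create_peer(conn: sqlite3.Connection, name: str) -> int:
    """Race-safe get-or-create using a savepoint (illustrative sketch)."""
    row = conn.execute("SELECT id FROM peers WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]
    conn.execute("SAVEPOINT get_or_create")
    try:
        cur = conn.execute("INSERT INTO peers (name) VALUES (?)", (name,))
        conn.execute("RELEASE SAVEPOINT get_or_create")
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # A concurrent writer won the race: roll back only the savepoint,
        # leaving the outer transaction intact, then re-read the winner.
        conn.execute("ROLLBACK TO SAVEPOINT get_or_create")
        conn.execute("RELEASE SAVEPOINT get_or_create")
        row = conn.execute("SELECT id FROM peers WHERE name = ?", (name,)).fetchone()
        return row[0]
```

The advantage over commit/rollback is that a lost race only unwinds the savepoint, not everything else the surrounding transaction has done.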

Fixed

  • JWT timestamp bug — JWTParams.t was evaluated once at class definition time instead of per-instance
  • Session cache invalidation on deletion was missing
  • get_peer_card() now properly propagates ResourceNotFoundException instead of swallowing it
  • set_peer_card() ensures peer exists via get_or_create_peers() before updating
  • Backup provider failover with proper tool input type safety
  • Removed setup_admin_jwt() from server startup
  • Sentry coroutine detection switched from asyncio.iscoroutinefunction to inspect.iscoroutinefunction
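The JWT timestamp bug above is the classic default-evaluated-once pitfall. A minimal reconstruction (field names match the changelog; the class body is illustrative):

```python
import time
from dataclasses import dataclass, field

# Buggy shape (illustrative): `t: float = time.time()` runs once at class
# definition, so every instance shares the same timestamp.
#
# Fixed shape: default_factory defers evaluation to each instantiation.
@dataclass
class JWTParams:
    t: float = field(default_factory=time.time)
```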

Removed

  • explicit.py and obex.py benchmarks replaced by coverage.py and molecular.py
  • Claude Code review automation workflow (.github/workflows/claude.yml)
  • Coverage reporting from default pytest configuration

v3.0.2

Added

  • Documentation for reasoning_level and Claude Code plugin

Changed

  • Improved dreaming sub-agents' prompting around peer card creation; tweaked overall prompts

Fixed

  • Added message-search fallback for memory search tool, necessary in fresh sessions
  • Made FLUSH_ENABLED a config value
  • Removed N+1 query in search_messages

v3.0.1

Fixed

  • Token counting in Explicit Agent Loop
  • Backwards compatibility of queue items

v3.0.0

Added

  • Agentic Dreamer for intelligent memory consolidation using LLM agents
  • Agentic Dialectic for query answering using LLM agents with tool use
  • Reasoning levels configuration for dialectic (minimal, low, medium, high, max)
  • Prometheus token tracking for deriver and dialectic operations
  • n8n integration
  • Cloud Events for auditable telemetry
  • External Vector Store support for turbopuffer and lancedb with reconciliation flow

Changed

  • API route renaming for consistency
  • Dreamer and dialectic now respect peer card configuration settings
  • Observations renamed to Conclusions across API and SDKs
  • Deriver to buffer representation tasks to normalize workloads
  • Local Representation tasks to create singular QueueItems
  • getContext endpoint to use search_query rather than force last_user_message

Fixed

  • Dream scheduling bugs
  • Summary creation when start_message_id > end_message_id
  • Cashews upgrade to prevent NoScriptError
  • Memory leak in accumulate_metric call

Removed

  • Peer card configuration from message configuration; peer cards no longer created/updated in deriver process

v2.5.1

Fixed

  • Backwards compatibility for message_ids field in documents to handle legacy tuple format

v2.5.0

Added

  • Message level configurations
  • CRUD operations for observations
  • Comprehensive test cases for harness
  • Peer level get_context
  • Set Peer Card Method
  • Manual dreaming trigger endpoint

Changed

  • Configurations to support more flags for fine-grained control of the deriver, peer cards, summaries, etc.
  • Working Representations to support more fine-grained parameters

Fixed

  • File uploads to match MessageCreate structure
  • Cache invalidation strategy

v2.4.3

Added

  • Redis caching to improve DB IO
  • Backup LLM provider to avoid failures when a provider is down

Changed

  • QueueItems to use standardized columns
  • Improved Deduplication logic for Representation Tasks
  • More fine-grained metrics for representation, summary, and peer card tasks
  • DB constraint to follow standard naming conventions

v2.4.2

Fixed

  • Langfuse tracing to have readable waterfalls
  • Alembic Migrations to match models.py
  • message_in_seq correctly included in webhook payload

Changed

  • Alembic to always use a session pooler
  • Statement timeout during alembic operations to 5 min

v2.4.1

Added

  • Alembic migration validation test suite

Fixed

  • Alembic migrations to batch changes
  • Batch message creation sequence number

Changed

  • Logging infrastructure to remove noisy messages
  • Sentry integration is centralized

v2.4.0

Added

  • Unified Representation class
  • vllm client support
  • Periodic queue cleanup logic
  • WIP Dreaming Feature
  • LongMemEval to Test Bench
  • Prometheus Client for better Metrics
  • Performance metrics instrumentation
  • Error reporting to deriver
  • Workspace Delete Method
  • Multi-db option in test harness

Changed

  • Working Representations are Queried on the fly rather than cached in metadata
  • EmbeddingStore to RepresentationFactory
  • Summary Response Model to use public_id of message for cutoff
  • Semantics across the codebase to reference resources based on observer and observed
  • Prompts for Deriver & Dialectic to reference peer_id and add examples
  • Get Context route returns peer card and representation in addition to messages and summaries
  • Refactoring logger.info calls to logger.debug where applicable

Fixed

  • Gemini client to use async methods

v2.3.3

Changed

  • Deriver Rollup Queue processes interleaved messages for more context

Fixed

  • Dialectic Streaming to follow SSE conventions
  • Sentry tracing in the deriver

v2.3.2

Added

  • Get peer cards endpoint (GET /v2/peers/{peer_id}/card) for retrieving targeted peer context information

Changed

  • Replaced Mirascope dependency with small client implementation for better control
  • Optimized deriver performance by using joins on messages table instead of storing token count in queue payload
  • Database scope optimization for various operations
  • Batch representation task processing for ~10x speed improvement in practice

Fixed

  • Separated clean and claim work units in queue manager to prevent race conditions
  • Skip locked ActiveQueueSession rows on delete operations
  • Langfuse SDK integration updates for compatibility
  • Added configurable maximum message size to prevent token overflow in deriver
  • Various minor bugfixes

v2.3.1

Fixed

  • Added max message count to deriver to avoid overflowing token limits

v2.3.0

Added

  • getSummaries endpoint to get all available summaries for a session directly
  • Peer Card feature to improve context for deriver and dialectic

Changed

  • Session Peer limit to be based on observers instead, renamed config value to SESSION_OBSERVERS_LIMIT
  • Messages can take a custom timestamp for the created_at field, defaulting to the current time
  • get_context endpoint returns detailed Summary object rather than just summary content
  • Working representations use a FIFO queue structure to maintain facts rather than a full rewrite
  • Optimized deriver enqueue by prefetching message sequence numbers (eliminates N+1 queries)
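The FIFO structure described above can be sketched with a bounded deque, where appending past capacity evicts the oldest fact instead of rewriting the whole representation. The capacity of 40 echoes the peer-card cap mentioned elsewhere in this changelog and is only an illustrative choice here:

```python
from collections import deque

class WorkingRepresentation:
    """Illustrative FIFO fact store; not Honcho's actual class."""

    def __init__(self, capacity: int = 40) -> None:
        # maxlen makes the deque self-evicting: oldest entries drop off
        self.facts: deque[str] = deque(maxlen=capacity)

    def add_fact(self, fact: str) -> None:
        self.facts.append(fact)
```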

Fixed

  • Deriver uses get_context internally to prevent context window limit errors
  • Embedding store will truncate context when querying documents to prevent embedding token limit errors
  • Queue manager to schedule work based on available work units rather than total number of workers
  • Queue manager to use atomic db transactions rather than long lived transaction for the worker lifecycle
  • Timestamp formats unified to ISO 8601 across the codebase
  • Internal get_context method’s cutoff value is exclusive now

v2.2.0

Added

  • Arbitrary filters now available on all search endpoints
  • Search combines full-text and semantic using reciprocal rank fusion
  • Webhook support (currently only supports queue_empty and test events, more to come)
  • Small test harness and custom test format for evaluating Honcho output quality
  • Added MCP server and documentation for it
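Reciprocal rank fusion, as used for the combined search above, sums 1 / (k + rank) per document across the full-text and semantic rankings. A minimal sketch; k = 60 is the conventional constant from the RRF literature, and Honcho's actual parameters are not specified in the changelog:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids into one (sketch)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked highly in any list accumulate larger scores
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```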

Changed

  • Search has 10 results by default, max 100 results
  • Queue structure generalized to handle more event types
  • Summarizer now exhaustive by default and tuned for performance

Fixed

  • Resolve race condition for peers that leave a session while sending messages
  • Added explicit rollback to solve integrity error in queue
  • Re-introduced Sentry tracing to deriver
  • Better integrity logic in get_or_create API methods

v2.1.2

Fixed

  • Summarizer module to ignore empty summaries and pass appropriate one to get_context
  • Structured Outputs calls with OpenAI provider to pass strict=True to Pydantic Schema

v2.1.1

Added

  • Test harness for custom Honcho evaluations
  • Better support for session and peer aware dialectic queries
  • Langfuse settings
  • Added recent history to dialectic prompt, dynamic based on new context window size setting

Fixed

  • Summary queue logic
  • Formatting of logs
  • Filtering by session
  • Peer targeting in queries

Changed

  • Made query expansion in dialectic off by default
  • Overhauled logging
  • Refactor summarization for performance and code clarity
  • Refactor queue payloads for clarity

v2.1.0

Added

  • File uploads
  • Brand new “ROTE” deriver system
  • Updated dialectic system
  • Local working representations
  • Better logging for deriver/dialectic
  • Deriver Queue Status no longer has redundant data

Fixed

  • Document insertion
  • Session-scoped and peer-targeted dialectic queries work now
  • Minor bugs

Removed

  • Peer-level messages

Changed

  • Dialectic chat endpoint takes a single query
  • Rearranged configuration values (LLM, Deriver, Dialectic, History->Summary)

v2.0.5

Fixed

  • Groq API client to use the Async library

v2.0.4

Fixed

  • Migration/provision scripts did not have correct database connection arguments, causing timeouts

v2.0.3

Fixed

  • Bug that causes runtime error when Sentry flags are enabled

v2.0.2

Fixed

  • Database initialization was misconfigured and led to provision_db script failing: switch to consistent working configuration with transaction pooler

v2.0.1

Added

  • Ergonomic SDKs for Python and TypeScript (uses Stainless underneath)
  • Deriver Queue Status endpoint
  • Complex arbitrary filters on workspace/session/peer/message
  • Message embedding table for full semantic search

Changed

  • Overhauled documentation
  • BasedPyright typing for entire project
  • Resource filtering expanded to include logical operators

Fixed

  • Various bugs
  • Use new config arrangement everywhere
  • Remove hardcoded responses

v2.0.0

Added

  • Ability to get a peer’s working representation
  • Metadata to all data primitives (Workspaces, Peers, Sessions, Messages)
  • Internal metadata to store Honcho’s state no longer exposed in API
  • Batch message operations and enhanced message querying with token and message count limits
  • Search and summary functionalities scoped by workspace, peer, and session
  • Session context retrieval with summaries and token allocation
  • HNSW Index for Documents Table
  • Centralized Configuration via Environment Variables or config.toml file

Changed

  • New architecture centered around the concept of a “peer” replaces the former “app”/“user”/“session” paradigm
  • Workspaces replace “apps” as top-level namespace
  • Peers replace “users”
  • Sessions no longer nested beneath peers and no longer limited to a single user-assistant model. A session exists independently of any one peer and peers can be added to and removed from sessions.
  • Dialectic API is now part of the Peer, not the Session
  • Dialectic API now allows queries to be scoped to a session or “targeted” to a fellow peer
  • Database schema migrated to adopt workspace/peer/session naming and structure
  • Authentication and JWT scopes updated to workspace/peer/session hierarchy
  • Queue processing now works on ‘work units’ instead of sessions
  • Message token counting updated with tiktoken integration and fallback heuristic
  • Queue and message processing updated to handle sender/target and task types for multi-peer scenarios
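Token counting with a fallback heuristic, as mentioned above, can be sketched like this. The choice of the cl100k_base encoding and the chars/4 heuristic are assumptions, not Honcho's confirmed implementation:

```python
def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when available, else estimate (sketch)."""
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:
        # Rough fallback: ~4 characters per token is a common English estimate
        return max(1, len(text) // 4)
```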

Fixed

  • Improved error handling and validation for batch message operations and metadata
  • Database Sessions to be more atomic to reduce idle in transaction time

Removed

  • Metamessages removed in favor of metadata
  • Collections and Documents no longer exposed in the API, solely internal
  • Obsolete tests for apps, users, collections, documents, and metamessages

v1.1.0

Added

  • Normalize resources to remove joins and increase query performance
  • Query tracing for debugging

Changed

  • /list endpoints to not require a request body
  • metamessage_type to label with backwards compatibility
  • Database Provisioning to rely on alembic
  • Database Session Manager to explicitly rollback transactions before closing the connection

Fixed

  • Alembic Migrations to include initial database migrations
  • Sentry Middleware to not report Honcho Exceptions

v1.0.0

Added

  • JWT based API authentication
  • Configurable logging
  • Consolidated LLM Inference via ModelClient class
  • Dynamic logging configurable via environment variables

Changed

  • Deriver & Dialectic API to use Hybrid Memory Architecture
  • Metamessages are not strictly tied to a message
  • Database provisioning is a separate script instead of happening on startup
  • Consolidated session/chat and session/chat/stream endpoints

Previous Releases

For a complete history of all releases, see our GitHub Releases page.

Getting Help

If you encounter issues using the Honcho API or its SDKs:
  1. Open an issue on GitHub
  2. Join our Discord community for support