State-of-the-art on LongMem, LoCoMo, and BEAM. View Results

MEMORY THAT
REASONS

Continual learning for stateful agents. Better context. Fewer tokens.

$ uv add honcho-ai
npx skills add plastic-labs/honcho

Run in your terminal to install the Honcho skill. Learn more

[Why Honcho]

Beyond Store & Retrieve

Challenge

Memory systems store facts. Users and agents are more than facts. And facts waste tokens. Delivering the context that matters requires reasoning.

Solution

Honcho doesn't just store data. It continually learns. Every message triggers comprehensive reasoning that saves tokens downstream.

Custom Models
[Powered by Neuromancer]
Neuromancer

Our reasoning models achieve SOTA performance at lower cost and latency than frontier models. Neuromancer powers learning beyond explicit facts: it reasons toward conclusions that follow, patterns across interactions, and hypotheses to test against new data.

The result: token savings and richer user context.

Model Card
Open Benchmarks
Verify on GitHub
60-90%
Token savings
Token Efficiency
Get the 10K tokens you need, not the 100K you don't. Optimize savings and accuracy. Context window management is solved.
SOTA Scores
[Benchmarks]
LoCoMo: 89.9%
LongMem S: 90.4%
BEAM 100K: 0.630
BEAM 500K: 0.646
BEAM 1M: 0.618
BEAM 10M: 0.409
View evals
[Honcho Memory]

Statefulness, Solved

Memory is solved. Retrieval is unlimited. Stateful agents with a single method call.

How it works

Write messages to Honcho and two things happen:

  • Messages are stored and indexed
  • Neuromancer reasons and learns automatically

Just call get_context() for effortless state.

get_context() instantly returns curated reasoning plus conversation history: everything an agent needs to maintain continuity, efficient enough to use at every turn.
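The write-then-read loop above can be pictured with a toy in-memory model. The method name get_context() mirrors this page; everything else — the class, the stubbed reasoning step, the return shape — is an illustrative assumption, not the Honcho SDK:

```python
from dataclasses import dataclass, field

@dataclass
class ToySession:
    # In-memory stand-in for a session: raw messages plus the
    # conclusions that reasoning derives from them.
    messages: list = field(default_factory=list)
    conclusions: list = field(default_factory=list)

    def add_message(self, peer: str, content: str) -> None:
        # 1. Store and index the message.
        self.messages.append({"peer": peer, "content": content})
        # 2. Reasoning runs automatically on ingestion (stubbed here).
        self.conclusions.append(f"{peer} said something about: {content[:30]}")

    def get_context(self) -> dict:
        # One call returns curated reasoning plus conversation history.
        return {"reasoning": self.conclusions, "history": self.messages}

session = ToySession()
session.add_message("alice", "What should I do about the job offer?")
ctx = session.get_context()
```

The point of the shape: writes do two things at once, and reads need only one call. See the docs for the real SDK surface.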

Granular control. Shape retrieval with search, token budgets, summaries, and peer scoping.
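Token budgets in particular can be pictured as packing the most recent items that fit. A toy sketch — the real service optimizes this server-side, and the whitespace word count here is a crude stand-in for a tokenizer:

```python
def pack_context(items: list[str], token_budget: int,
                 count_tokens=lambda s: len(s.split())) -> list[str]:
    # Keep the newest items that fit the budget, returned in
    # chronological order. `count_tokens` stands in for a real tokenizer.
    picked, used = [], 0
    for item in reversed(items):
        cost = count_tokens(item)
        if used + cost > token_budget:
            break
        picked.append(item)
        used += cost
    return list(reversed(picked))

print(pack_context(["a b", "c d e", "f"], token_budget=4))  # ['c d e', 'f']
```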

Explore the docs
[Basic]
What should I do about the job offer?
session.get_context()
Default context with summary + messages
Sounds like you already know. What's holding you back?
You're right, I've already decided.

Any stack. Any scale. Honcho just works.

AUTOMATIC
Ingestion triggers reasoning
~200MS
Fast enough for every turn
TOKEN-BUDGET
You set the limit; we optimize
MODEL-AGNOSTIC
OpenAI, Anthropic, custom
UNLIMITED RETRIEVAL
It's your data.
[Honcho Reasoning]

Advanced Reasoning On-Demand

Call .chat() for specific queries. More reasoning resources when it matters.

Minimal

$0.001 · Instant

Single semantic search. One lookup.


Low

$0.01 · Instant

Conclusions with surrounding context. Default tier.


Medium

$0.05 · Fast

Multiple searches. Directed synthesis.


High

$0.10 · Async

Multi-pass analysis. Patterns over time.


Max

$0.50 · Async

Research-grade. Exhaustive full history search with quantitative methods.

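The tier table above can be written down as data, with a small helper that lists which tiers fit a latency requirement. Prices and latency classes come from this page; the routing helper is an illustrative assumption, not part of the Honcho API:

```python
TIERS = [  # (name, USD per query, latency class)
    ("minimal", 0.001, "instant"),
    ("low", 0.01, "instant"),
    ("medium", 0.05, "fast"),
    ("high", 0.10, "async"),
    ("max", 0.50, "async"),
]

LATENCY_ORDER = ["instant", "fast", "async"]

def tiers_within(max_latency: str) -> list[str]:
    # Tiers whose latency class is at or below the caller's tolerance.
    limit = LATENCY_ORDER.index(max_latency)
    return [name for name, _, lat in TIERS if LATENCY_ORDER.index(lat) <= limit]

print(tiers_within("fast"))  # ['minimal', 'low', 'medium']
```

A caller that needs an answer this turn stays in the instant tiers; background analytics can afford high or max.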
[Built for the Future]

Transcend Memory

Honcho is more than memory. It enables interaction patterns that weren't possible before.

Peers

Other memory systems restrict use to the user-assistant paradigm. Honcho is more dynamic. It can learn about any entity, modeling users, agents, NPCs, groups, and their relationships.

Peers can be freely added to or removed from Sessions, and their perspectives can be scoped globally or surgically. Designed for ultimate flexibility.

Architect:
Group Conversations
Agents + Subagents
Adaptive NPC Memory
Scoped Perspectives
Granular Context Management
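The Peer model above can be sketched as a toy workspace: any entity (user, agent, NPC, group) is a peer, peers join and leave sessions freely, and what one peer has learned about another is scoped per observer. All names here are illustrative, not the Honcho SDK:

```python
from collections import defaultdict

class ToyWorkspace:
    def __init__(self):
        self.sessions = defaultdict(set)   # session -> peers present
        self.notes = defaultdict(list)     # (observer, target) -> notes

    def join(self, session: str, peer: str) -> None:
        self.sessions[session].add(peer)

    def leave(self, session: str, peer: str) -> None:
        self.sessions[session].discard(peer)

    def observe(self, observer: str, target: str, note: str) -> None:
        # A scoped perspective: what `observer` believes about `target`.
        self.notes[(observer, target)].append(note)

    def perspective(self, observer: str, target: str) -> list:
        return self.notes[(observer, target)]

w = ToyWorkspace()
w.join("tavern", "npc_guard")
w.join("tavern", "player1")
w.observe("npc_guard", "player1", "haggles over every price")
```

Note the asymmetry: the guard's view of the player is not the player's view of the guard — that is what scoped perspectives buy you over a single user-assistant memory.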
[Peer Dynamics]
Sessions
Peers
Messages
Inference

Dreaming

Honcho also reasons in the background. Dreaming extends continual learning.

Asynchronous reasoning runs automatically, continuously optimizing Honcho's understanding of each Peer without impacting runtime performance.

Schedule:
Pattern Identification
Hypothesis Testing
Conclusion Weighting
Conflict Resolution
Deep Research
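One item on that schedule, conclusion weighting, can be pictured as a toy offline pass that re-weights conclusions by how often they recur and drops one-off noise. Purely illustrative; the real background inference is far richer than a frequency count:

```python
from collections import Counter

def dream_pass(conclusions: list[str], min_support: int = 2) -> dict:
    # Weight each conclusion by recurrence; keep only those seen at
    # least `min_support` times.
    weights = Counter(conclusions)
    return {c: n for c, n in weights.items() if n >= min_support}

notes = ["prefers async updates", "prefers async updates", "mentioned a cat once"]
print(dream_pass(notes))  # {'prefers async updates': 2}
```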
What you get

Standard dreaming is included with every workspace. Enhanced dreaming options are available for deeper reasoning use cases.

[Honcho Deployed]

Stateful Production

Honcho's primitives support diverse agents. From vibecode to scale, vertical-agnostic.

AI COMPANIONS
Relationships that deepen, not reset.

Enable what matters: emotional arcs, shared history, implicit meaning, preference alignment. No more "dementia." Build lasting connections.

CODING AGENTS
Permanent first days, solved.

Learn team conventions and architecture philosophy. Anticipate intent. Code that matches your style, not generic best practices.

GAMING
Memory that scales with games.

NPCs form opinions, track relationships, build identities, and reshape narratives without $0.50-per-interaction costs breaking the budget.

EDUCATION
From adaptive difficulty to adaptive pedagogy.

Track misconceptions across sessions. Predict obstacles before they stall progress. Tutoring that compounds over semesters.

CUSTOMER SUPPORT
Context that survives handoffs.

Customer history persists across sessions, channels, agents. No more "can you start from the beginning?"

PRODUCTIVITY
End infinite onboarding.

Workflow context persists across tools and time. Needs aren't just met, they're anticipated.

[Pricing]

Transparent. Predictable.

Use the 10K tokens you need, not the 100K you don't.

HONCHO MEMORY

Ingestion
Store + Neuromancer reasoning
$2.00/M
get_context()
No limits. ~200ms.
UNLIMITED

HONCHO REASONING

Minimal
Basic conclusions — instant
$0.001/q
Low
Efficient synthesis — instant
$0.01/q
Medium
Steerable reasoning — fast
$0.05/q
High
Deep synthesis — async
$0.10/q
Max
Research-grade — async
$0.50/q
Dreaming
Background inference
Included
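A back-of-envelope cost model falls straight out of the prices above. The usage numbers below are made-up inputs; only the unit prices come from this page:

```python
INGEST_PER_M_TOKENS = 2.00  # $/M tokens, Honcho Memory ingestion
CHAT_PRICE = {"minimal": 0.001, "low": 0.01, "medium": 0.05,
              "high": 0.10, "max": 0.50}  # $/query, Honcho Reasoning

def monthly_cost(ingested_tokens: int, chat_queries: dict) -> float:
    ingest = ingested_tokens / 1_000_000 * INGEST_PER_M_TOKENS
    chat = sum(CHAT_PRICE[tier] * n for tier, n in chat_queries.items())
    return round(ingest + chat, 2)

# 50M tokens ingested ($100) + 1,000 low ($10) + 50 high ($5) queries:
print(monthly_cost(50_000_000, {"low": 1_000, "high": 50}))  # 115.0
```

get_context() and standard dreaming don't appear in the formula because, per the table above, they're unlimited and included.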

Startups (<$5M raised)

$1,000 in credits. 12 months subsidized pricing. Integration support. Grow with Honcho.

Apply Now

Enterprise

Custom plans. Forward-deployed engineers. Dedicated integration and maintenance support.

Contact Founders