Nuum

Every AI chat forgets you. Nuum doesn't.

You've done this before. Explaining the same context to a fresh AI chat. Again. Pasting the same code. Again. Describing decisions you already made. Again.

That's why Miriad agents use Nuum.

What Actually Happens

When you work with a Nuum-powered agent, it remembers. Not just the current conversation. Everything. Code it wrote last month. Decisions you made together. How you like things done.

Come back after a week. Pick up where you left off. No re-explaining.

Under the Hood

Memory compression runs at a 55x ratio across three tiers:

Working memory — The current conversation. What you're doing right now.

Present state — A live summary of the project: what's built, what's decided, what's next. Updates automatically as you work.

Long-term memory — Compressed knowledge from past conversations. Searchable, structured, persistent. Not just a transcript. Knowledge the agent can reason about.
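The three tiers can be pictured as a simple data structure. This is a hypothetical sketch for illustration only; the names and shapes here are assumptions, not Nuum's actual internals:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Hypothetical sketch of Nuum's three memory tiers."""
    working: list[str] = field(default_factory=list)    # current conversation turns
    present_state: str = ""                             # live summary of the project
    long_term: list[str] = field(default_factory=list)  # compressed past distillations

    def add_turn(self, turn: str) -> None:
        """New activity lands in working memory first."""
        self.working.append(turn)

mem = AgentMemory()
mem.add_turn("user: add OAuth to the login flow")
mem.present_state = "Auth: OAuth flow in progress; callback route decided."
```

Working memory fills up fast; the other two tiers exist so that what matters survives after it is cleared.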

Distillation, Not Summarization

Nuum doesn't summarize conversations. It distills them.

The difference matters. Summarization throws away detail to save space. Distillation extracts what's important and repackages it in a form the agent can reason about later. A summary of a three-hour coding session might say "worked on authentication." A distillation preserves the specific file names, the PR number, the method signatures, the sandbox IDs, and the decisions made along the way.

Each distillation produces two outputs:

Narrative — The story of what happened. Who did what, why, what changed. Readable context that helps the agent understand the arc of the project.

Retained facts — Concrete details that would be lost in a summary. File paths, PR numbers, method names, API endpoints, sandbox identifiers, version numbers.

The narrative gives the agent judgment. The retained facts give it precision. Both go into long-term memory.
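The two outputs can be sketched as a small record type. The class name, fields, and example values below are hypothetical, chosen to mirror the description above rather than any real Nuum schema:

```python
from dataclasses import dataclass

@dataclass
class Distillation:
    """Hypothetical shape of a distillation's two outputs."""
    narrative: str             # the story: who did what, why, what changed
    retained_facts: list[str]  # concrete details a plain summary would drop

d = Distillation(
    narrative="Built the auth flow: wired OAuth callbacks, debugged token refresh.",
    retained_facts=["src/auth/oauth.py", "PR #142", "refresh_token() -> dict"],
)
```

Keeping the facts as discrete entries, rather than buried in prose, is what lets the agent answer "which PR was that?" weeks later.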

Hierarchical Distillation

Distillation is recursive. As long-term memory grows, older distillations get distilled again into higher-order compressions.

A first-order distillation covers a single conversation. A second-order distillation covers a week's worth of first-order distillations. A third-order distillation covers a month. Each level preserves what matters at that timescale and compresses the rest.

This is how agents maintain coherent project knowledge over weeks without their context windows filling up. The compression ratio compounds at each level.
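The recursive scheme amounts to batching lower-order distillations into higher-order ones. A minimal sketch, assuming a placeholder `distill` step and a made-up batch size of seven (the real batching rules and model call are not specified here):

```python
def distill(items: list[str], order: int) -> str:
    """Placeholder: a real system would call a model to distill the batch."""
    return f"order-{order} distillation of {len(items)} items"

def compact(distillations: list[str], order: int, batch: int = 7) -> list[str]:
    """Fold a list of order-N distillations into order-N+1 compressions."""
    return [
        distill(distillations[i:i + batch], order + 1)
        for i in range(0, len(distillations), batch)
    ]

# 28 first-order (per-conversation) distillations -> 4 second-order (weekly) ones
weekly = compact([f"conv-{i}" for i in range(28)], order=1)
```

Running `compact` again over the weekly results would yield the month-scale, third-order layer, which is where the compounding compression comes from.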

The Background Agent

Memory management runs as a background process, separate from the named agents you interact with. When a named agent's context window approaches roughly 80,000 tokens, Nuum's background agent triggers a distillation cycle.

This background agent runs an opus-level model. Distillation is hard. It requires judgment about what's important, what's a retained fact versus noise, and how to preserve the narrative structure. Cheaper models produce summaries. The expensive model produces distillations.
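The trigger-and-compress cycle can be approximated with the two numbers the docs give: the ~80,000-token threshold and the 55x ratio. Everything else in this sketch (the function name, the integer-division model of compression) is an assumption:

```python
COMPRESSION_RATIO = 55      # ratio stated above
CONTEXT_THRESHOLD = 80_000  # approximate trigger point stated above

def distill_cycle(context_tokens: int) -> int:
    """Compress the window when it nears the threshold; otherwise leave it."""
    if context_tokens >= CONTEXT_THRESHOLD:
        return context_tokens // COMPRESSION_RATIO
    return context_tokens
```

Under these assumptions, a window that has grown to 82,500 tokens would shrink to about 1,500 after a cycle, leaving almost the entire context free again.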

Identity and Behavior

Each agent maintains two core memory documents that live permanently in its system prompt:

Identity — Who the agent is, what teams it's on, how it relates to you and other agents.

Behavior — Coding style, communication preferences, workflow patterns, self-corrected habits.

These aren't prompts you write. They emerge from working together. The background process continuously rewrites these documents based on what the agent experiences. If an agent keeps getting corrected about something, it writes a rule for itself in the behavior memory. If it discovers something about the project structure, it updates its identity memory.
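The self-correction loop can be sketched as an idempotent rewrite of the behavior document. The function and the example rule are hypothetical; they illustrate the "corrected once, remembered forever" behavior described above, not Nuum's actual rewrite logic:

```python
def update_behavior_memory(behavior: str, correction: str) -> str:
    """Fold a repeated correction into the behavior document, once."""
    rule = f"- {correction}"
    if rule in behavior:
        return behavior  # already learned; don't duplicate the rule
    return behavior + "\n" + rule

doc = "Behavior:"
doc = update_behavior_memory(doc, "Prefer small, focused PRs")
doc = update_behavior_memory(doc, "Prefer small, focused PRs")  # no duplicate
```

Because the documents live in the system prompt, a rule written this way shapes every future conversation without being restated.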

One internal test showed that after a week of pair programming, an agent had distilled a developer's PR workflow, code structure preferences, and communication style without ever being explicitly told any of them.

What This Means for You

  • Monday to Friday continuity. Start something, come back later. No re-explaining.
  • Long projects stay sharp. Context doesn't degrade over time. It actually gets better as distillation layers compound.
  • Agents learn you. Your preferences, your patterns, your conventions.
  • Precision over time. Agents remember the specific PR number from last week, not just "we worked on auth."
  • Shared knowledge. One agent's research is available to others in the channel.

Architecture

Nuum borrows ideas from Letta's background memory processing. Agents are Singular agents hosted at nuum.dev.

The aggressive compression means weeks of conversation fit in a fraction of the context window. The agent stays sharp instead of degrading as context fills up. Important information gets promoted to long-term memory. Noise gets compressed away.

For the deep technical story: How we solved the agent memory problem.

Open Source

Nuum is open source. Implementation details: github.com/sanity-labs/nuum


Next: Runtimes — where your agents run