AI Context Engine

Context Engine

Shared memory and context layer for AI agents and platform services.

Persistent memory across sessions

Ranked retrieval within token budgets

Cross-agent shared context & coordination

Role: Design & implementation (in progress) — memory model, retrieval pipeline, agent coordination

TypeScript · Vector Search · PostgreSQL · Embeddings · LLM Pipelines

Problem

LLMs are stateless; products aren't. Every agent interaction started from zero — re-explaining the user, re-fetching the same context, repeating questions answered last week. Context windows kept growing, but stuffing history into a prompt isn't memory, it's expensive amnesia.

It got worse with multiple agents: two agents serving the same user had no way to share what either had learned, and no protocol for splitting work without stepping on each other.

Solution

The Context Engine is a memory layer that sits between agents and models. Agents write observations; an extraction pipeline distills them into structured memories — facts, preferences, episodes, procedures — with provenance and timestamps.
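
As a rough illustration of what a structured memory with provenance and timestamps might look like, here is a minimal sketch. All type and field names (`Memory`, `Provenance`, `Observation`, `toMemory`) are assumptions made for this example, not the project's actual schema, and the extraction step is reduced to a plain function where a real pipeline would presumably involve a model call.

```typescript
// Hypothetical shapes; names are illustrative, not the project's schema.
type MemoryKind = "fact" | "preference" | "episode" | "procedure";

interface Provenance {
  agentId: string;       // agent that wrote the source observation
  observationId: string; // raw observation this memory was distilled from
}

interface Memory {
  id: string;
  kind: MemoryKind;
  scope: string;         // e.g. "user:42"
  content: string;
  provenance: Provenance;
  createdAt: Date;
  updatedAt: Date;
}

interface Observation {
  id: string;
  agentId: string;
  scope: string;
  text: string;
  at: Date;
}

// Stand-in for extraction: the caller supplies the kind and distilled content.
function toMemory(obs: Observation, kind: MemoryKind, content: string): Memory {
  return {
    id: `mem:${obs.id}`,
    kind,
    scope: obs.scope,
    content,
    provenance: { agentId: obs.agentId, observationId: obs.id },
    createdAt: obs.at,
    updatedAt: obs.at,
  };
}
```

Keeping the observation id on every memory is what lets a memory be traced back to the message that produced it.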

On read, a retrieval pipeline scores memories by semantic relevance, recency, and importance, then packs the best set into the caller's token budget. The agent gets a context block, not a database query result.

Shared context is the coordination primitive: agents subscribe to the same memory scope, so what one learns, others know — and task-level state (who's doing what) lives in the same substrate.
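
One way to picture shared scope as a coordination primitive is a small in-process sketch: subscribers to a scope hear each other's writes, and task ownership is recorded in the same object. The `SharedScope` class and its method names are hypothetical, and a first-claim-wins rule is only one assumed policy for task state.

```typescript
// Hypothetical in-process model of a shared memory scope.
type Listener = (entry: string, fromAgent: string) => void;

class SharedScope {
  private listeners = new Map<string, Listener>();
  private claims = new Map<string, string>(); // taskId -> owning agentId

  subscribe(agentId: string, onMemory: Listener): void {
    this.listeners.set(agentId, onMemory);
  }

  // Deliver a new memory to every subscriber except the writer.
  write(agentId: string, entry: string): void {
    this.listeners.forEach((listener, id) => {
      if (id !== agentId) listener(entry, agentId);
    });
  }

  // First claim wins; returns true if this agent owns the task afterwards.
  claim(taskId: string, agentId: string): boolean {
    const owner = this.claims.get(taskId);
    if (owner === undefined) {
      this.claims.set(taskId, agentId);
      return true;
    }
    return owner === agentId;
  }
}
```

Here an agent that fails to claim a task simply moves on, so no separate messaging protocol between agents is needed for this case.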

Architecture

Memory is written asynchronously — extraction never blocks the agent's response path.
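
The non-blocking write path can be sketched as a buffer: the agent's request path only appends, and extraction runs later when something else drains the buffer. `WriteBuffer`, `observe`, and `flush` are hypothetical names, and the idea that a background worker or timer calls `flush` is an assumption; the text does not say how the real pipeline is scheduled.

```typescript
// Hypothetical write buffer separating the response path from extraction.
class WriteBuffer {
  private pending: string[] = [];
  readonly stored: string[] = [];

  // Called on the agent's response path: a constant-time append, no extraction.
  observe(text: string): void {
    this.pending.push(text);
  }

  // Called off the response path (assumed: a background worker or timer).
  // `extract` returns a distilled memory, or null to drop the observation.
  flush(extract: (text: string) => string | null): number {
    const batch = this.pending;
    this.pending = [];
    let written = 0;
    for (const text of batch) {
      const memory = extract(text);
      if (memory !== null) {
        this.stored.push(memory);
        written++;
      }
    }
    return written;
  }
}
```

The trade-off in a design like this is a short window in which a fresh observation is not yet retrievable.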

Storage is polyglot behind one API: vectors for similarity, relational rows for structured facts and scopes. Callers see memories, not storage engines.
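
A sketch of "one API over two stores" could look like the following. The interfaces (`VectorIndex`, `RowStore`), the `MemoryStore` facade, and the in-memory implementations are all hypothetical stand-ins for whatever vector index and relational tables the real system uses.

```typescript
// Hypothetical interfaces; in-memory stand-ins replace the real stores.
type Row = { scope: string; content: string };

interface VectorIndex {
  upsert(id: string, embedding: number[]): void;
  nearest(query: number[], k: number): string[];
}

interface RowStore {
  put(id: string, row: Row): void;
  get(id: string): Row | undefined;
}

// Cosine similarity; assumes both vectors have the same length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

class InMemoryVectors implements VectorIndex {
  private items = new Map<string, number[]>();
  upsert(id: string, embedding: number[]): void {
    this.items.set(id, embedding);
  }
  nearest(query: number[], k: number): string[] {
    const scored: { id: string; sim: number }[] = [];
    this.items.forEach((vec, id) => {
      scored.push({ id, sim: cosine(query, vec) });
    });
    return scored.sort((x, y) => y.sim - x.sim).slice(0, k).map((s) => s.id);
  }
}

class InMemoryRows implements RowStore {
  private rows = new Map<string, Row>();
  put(id: string, row: Row): void { this.rows.set(id, row); }
  get(id: string): Row | undefined { return this.rows.get(id); }
}

// The single API callers see; neither backing store leaks through it.
class MemoryStore {
  constructor(private vectors: VectorIndex, private rows: RowStore) {}

  save(id: string, scope: string, content: string, embedding: number[]): void {
    this.rows.put(id, { scope, content });
    this.vectors.upsert(id, embedding);
  }

  // Filtering by scope after the top-k lookup can return fewer than k results.
  search(scope: string, query: number[], k: number): string[] {
    const hits: string[] = [];
    for (const id of this.vectors.nearest(query, k)) {
      const row = this.rows.get(id);
      if (row !== undefined && row.scope === scope) hits.push(row.content);
    }
    return hits;
  }
}
```

Callers only ever hold a `MemoryStore`, so either backing implementation can be swapped without touching them.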

Challenges

Relevance vs. recency

Pure vector similarity surfaces stale-but-similar memories; pure recency forgets what matters. Scoring blends three signals: similarity to the query, age decay, and access frequency. The blend is tunable per memory type, because preferences age differently from events.
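
To make the blend concrete, here is a minimal scoring sketch. The weights, half-lives, and the saturating frequency term are invented for illustration; only the three signals (similarity, age decay, access frequency) and the per-type tuning come from the description above.

```typescript
type MemoryKind = "fact" | "preference" | "episode" | "procedure";

interface BlendWeights {
  similarity: number;
  recency: number;
  frequency: number;
  halfLifeDays: number; // age at which the recency term halves
}

// Illustrative values only; the real per-type tuning is not given in the text.
const WEIGHTS: Record<MemoryKind, BlendWeights> = {
  fact:       { similarity: 0.7, recency: 0.1, frequency: 0.2, halfLifeDays: 365 },
  preference: { similarity: 0.6, recency: 0.2, frequency: 0.2, halfLifeDays: 180 },
  episode:    { similarity: 0.5, recency: 0.4, frequency: 0.1, halfLifeDays: 14 },
  procedure:  { similarity: 0.7, recency: 0.1, frequency: 0.2, halfLifeDays: 365 },
};

interface Candidate {
  kind: MemoryKind;
  similarity: number;  // similarity to the query, assumed to lie in [0, 1]
  lastUpdated: Date;
  accessCount: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function score(c: Candidate, now: Date): number {
  const w = WEIGHTS[c.kind];
  const ageDays = Math.max(0, (now.getTime() - c.lastUpdated.getTime()) / MS_PER_DAY);
  const recency = Math.pow(0.5, ageDays / w.halfLifeDays); // exponential decay
  const frequency = c.accessCount / (c.accessCount + 1);   // saturates toward 1
  return w.similarity * c.similarity + w.recency * recency + w.frequency * frequency;
}
```

With these example numbers, a 60-day-old preference keeps most of its recency credit while a 60-day-old episode has lost nearly all of it, which is the kind of per-type difference the tuning is meant to express.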

Write amplification

Naively storing every message drowns the store in noise. The extraction pipeline consolidates: new observations update or supersede existing memories instead of accumulating near-duplicates.
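
Consolidation can be sketched as an upsert that distinguishes three outcomes: a new memory, a repeat of a known one, and a replacement. This example matches memories by an exact string key and normalized text, which is a deliberate simplification; a real pipeline would more plausibly compare meaning. `ConsolidatingStore` and its fields are hypothetical.

```typescript
interface StoredMemory {
  key: string;
  content: string;
  version: number;
  confirmations: number; // how many observations restated this content
  previous?: string;     // content this memory superseded, if any
}

type WriteResult = "created" | "confirmed" | "superseded";

function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

class ConsolidatingStore {
  private byKey = new Map<string, StoredMemory>();

  write(key: string, content: string): WriteResult {
    const existing = this.byKey.get(key);
    if (existing === undefined) {
      this.byKey.set(key, { key, content, version: 1, confirmations: 1 });
      return "created";
    }
    // Same content restated: strengthen the memory instead of duplicating it.
    if (normalize(existing.content) === normalize(content)) {
      existing.confirmations += 1;
      return "confirmed";
    }
    // Different content for the same key: the new observation supersedes the old.
    this.byKey.set(key, {
      key,
      content,
      version: existing.version + 1,
      confirmations: 1,
      previous: existing.content,
    });
    return "superseded";
  }

  get(key: string): StoredMemory | undefined { return this.byKey.get(key); }
  count(): number { return this.byKey.size; }
}
```

Three writes about the same subject leave one record in the store, with the superseded value kept for audit.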

Token budgets

Retrieval is a packing problem — maximize useful context within a hard budget. Memories carry a compact rendered form, and the packer trades breadth against detail depending on the task.
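
The packing step can be approximated with a greedy heuristic: walk memories in score order and take the full form if it fits, otherwise fall back to the compact form. This is a sketch under assumptions, not the engine's packer. Greedy selection is not an optimal knapsack solution, and the `preferBreadth` flag is a hypothetical way to express the breadth-versus-detail trade.

```typescript
interface Packable {
  id: string;
  score: number;
  fullTokens: number;    // cost of the detailed rendering
  compactTokens: number; // cost of the compact rendering
}

interface Packed {
  id: string;
  form: "full" | "compact";
}

// Greedy heuristic: highest score first, never exceeding the budget.
function pack(items: Packable[], budget: number, preferBreadth: boolean): Packed[] {
  const ranked = items.slice().sort((a, b) => b.score - a.score);
  const out: Packed[] = [];
  let remaining = budget;
  for (const m of ranked) {
    if (!preferBreadth && m.fullTokens <= remaining) {
      out.push({ id: m.id, form: "full" });
      remaining -= m.fullTokens;
    } else if (m.compactTokens <= remaining) {
      out.push({ id: m.id, form: "compact" });
      remaining -= m.compactTokens;
    }
    // Otherwise skip: neither form fits in what is left.
  }
  return out;
}
```

With a 70-token budget and three memories costing 60/20, 50/15, and 40/10 tokens (full/compact), detail mode returns two memories, one of them in full, while breadth mode returns all three in compact form.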

Tenancy and trust

Memory scopes enforce isolation: per-user, per-team, per-agent. An agent can't read what it wasn't granted, and provenance makes every memory auditable back to its source.
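
Scope isolation can be sketched as a grant check in front of every read and write. The `ScopedMemory` class, the flat `"user:42"` style scope strings, and the throw-on-denied behavior are assumptions for this example; the text does not specify how grants are modeled or how denial is reported.

```typescript
interface ScopedRecord {
  scope: string;
  content: string;
  writtenBy: string; // provenance: the agent that wrote this memory
}

class ScopedMemory {
  private grants = new Map<string, Set<string>>(); // agentId -> granted scopes
  private records: ScopedRecord[] = [];

  grant(agentId: string, scope: string): void {
    const scopes = this.grants.get(agentId) ?? new Set<string>();
    scopes.add(scope);
    this.grants.set(agentId, scopes);
  }

  private check(agentId: string, scope: string): void {
    const scopes = this.grants.get(agentId);
    if (scopes === undefined || !scopes.has(scope)) {
      throw new Error(`agent ${agentId} has no grant for scope ${scope}`);
    }
  }

  write(agentId: string, scope: string, content: string): void {
    this.check(agentId, scope);
    this.records.push({ scope, content, writtenBy: agentId });
  }

  read(agentId: string, scope: string): ScopedRecord[] {
    this.check(agentId, scope);
    return this.records.filter((r) => r.scope === scope);
  }
}
```

Because every record carries `writtenBy`, a reader with access can see which agent a memory came from.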

Lessons

Memory is a ranking problem wearing a storage costume. The store was easy; deciding what deserves the context window is the actual product.

Forgetting is a feature. Decay and consolidation matter as much as writes — a memory layer that only accumulates becomes noise.

Agent coordination doesn't need a new protocol; shared, scoped memory gets surprisingly far.
