Knowledge System

The Problem

AI agents lose context between sessions. The richest reasoning happens in conversation (decisions debated, alternatives rejected, rationale formed), but when the context window closes, that reasoning evaporates. The next agent sees what was decided but not why.

This problem compounds in a multi-agent environment. If each agent maintains its own context, the research agent doesn't know what the coding agent decided. A terminal session's insights die when the terminal closes. The human becomes the sole carrier of project continuity, re-explaining context to every new agent in every new session.

The Design Principle

Sidespace solves this with a singular memory organ: an independent knowledge layer that doesn't belong to any single agent, any single session, or any single project.

Three properties make this architecturally significant:

  • Agent-independent. Every agent reads from and writes to the same memory bank. A decision made during a Claude Code terminal session is immediately available to Hoshi in the chat panel, and vice versa. Agents don't have separate memories. They share one.
  • Session-independent. Context survives session boundaries. When a new agent starts, it inherits the accumulated knowledge from every agent that came before. Starting a new session means picking up where the last one left off.
  • Project-spanning. Learnings from one project can inform work on another. Architectural patterns discovered in Project A are available when working on Project B. The memory organ connects knowledge across projects rather than siloing it.

This is not a feature of the knowledge system. It is the connective tissue of the entire product.

What's Built

The Memory Bank

The core is a PostgreSQL table with vector search capabilities, currently holding over 1,500 memories. Each memory is a self-contained piece of knowledge (an architectural decision, a debugging insight, a user preference) written so that an agent with zero prior context can understand it.

Every memory carries a type (decision, architecture, fact, preference, insight), an importance level, and dual search indexes: a vector embedding for semantic search and a keyword index for exact matches. Memories are never hard-deleted; soft archival preserves them while removing them from active results. Each memory tracks the creating agent by constellation name (Orion, Lyra, Vela), so multi-agent sessions have clear attribution.
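
A minimal sketch of the shape of one memory record, in Python; the field names and types below are illustrative assumptions, not Sidespace's published schema:

```python
# Illustrative sketch only: field names and types are assumptions,
# not the actual Sidespace schema.
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum

class MemoryType(Enum):
    DECISION = "decision"
    ARCHITECTURE = "architecture"
    FACT = "fact"
    PREFERENCE = "preference"
    INSIGHT = "insight"

@dataclass
class Memory:
    content: str                 # self-contained text an agent can read with zero context
    type: MemoryType
    importance: int              # importance level
    embedding: list[float]       # vector index for semantic search
    keywords: str                # source text for the keyword (exact-match) index
    strength: float = 1.0        # rises when surfaced, decays when unused
    agent: str = "Orion"         # creating agent, by constellation name
    archived: bool = False       # soft archival: never hard-deleted
    created_at: datetime = field(default_factory=datetime.utcnow)
```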

Memories are living data. Each has a strength value that increases when the memory is surfaced to an agent and decays when unused. Frequently valuable memories naturally rise to the top of search results over time.
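
A sketch of how such a lifecycle could work, assuming exponential decay and a fixed boost per surfacing (neither curve is documented):

```python
# Hypothetical lifecycle: the docs say strength rises when surfaced and
# decays when unused, but the exact curves here are assumptions.
DECAY_HALF_LIFE_DAYS = 30.0  # assumed half-life of an untouched memory
SURFACE_BOOST = 0.1          # assumed boost each time a memory is surfaced

def decayed(strength: float, days_idle: float) -> float:
    """Strength halves for every DECAY_HALF_LIFE_DAYS of disuse."""
    return strength * 0.5 ** (days_idle / DECAY_HALF_LIFE_DAYS)

def reinforced(strength: float) -> float:
    """Surfacing a memory to an agent nudges strength back up, capped at 1.0."""
    return min(1.0, strength + SURFACE_BOOST)
```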

Search uses two engines running in parallel:

  1. BM25 keyword search is the always-on backbone. It handles exact matches like ticket numbers, function names, and error codes reliably.
  2. Vector cosine similarity is the semantic boost. It bridges vocabulary gaps so a query about "rate limiting" can match a memory about "429 retry backoff."

Results from both engines are fused into a combined relevance score. The final ranking blends 70% combined relevance with 30% memory strength, so the lifecycle mechanic described above directly shapes search order.
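
A sketch of the scoring step; the 70/30 blend comes from the description above, while the equal-weight fusion of the two engines (and score normalization to [0, 1]) is an assumption:

```python
def final_score(bm25: float, cosine: float, strength: float) -> float:
    relevance = 0.5 * bm25 + 0.5 * cosine    # fuse the two engines (assumed weights)
    return 0.7 * relevance + 0.3 * strength  # blend relevance with memory strength

def rank(candidates: list[dict]) -> list[dict]:
    """Order candidates by blended score, highest first."""
    return sorted(
        candidates,
        key=lambda m: final_score(m["bm25"], m["cosine"], m["strength"]),
        reverse=True,
    )
```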

If the embedding pipeline is temporarily down, search degrades gracefully: keyword search runs entirely in the database, with no external dependencies.
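
A sketch of that fallback, with hypothetical stand-ins for the embedding client and the two search engines; the point is the control flow:

```python
class EmbeddingUnavailable(RuntimeError):
    """Raised when the embedding pipeline is down."""

def embed(query: str) -> list[float]:
    raise EmbeddingUnavailable  # stand-in: the real client calls an embedding API

def keyword_search(query: str) -> list[str]:
    return [f"bm25 hit for {query!r}"]  # stand-in for in-database BM25

def vector_search(vector: list[float]) -> list[str]:
    return []  # stand-in for cosine-similarity search

def search(query: str) -> list[str]:
    hits = keyword_search(query)             # always available, no external deps
    try:
        hits += vector_search(embed(query))  # semantic boost when embeddings work
    except EmbeddingUnavailable:
        pass                                 # degrade gracefully to keyword-only
    return hits
```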

Document Indexing

Formal project documents (design docs, session logs, plans) live on disk in each project's directory. A file watcher monitors these directories. When a file changes, it is uploaded, chunked, and vectorized. The app shows a read-only indexed view: you author documents in your editor, and Sidespace makes them searchable.
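
A minimal sketch of such a watch-and-index loop using Python's watchdog library; the fixed-size chunking and the index_chunk helper are illustrative stand-ins for the real upload/chunk/vectorize pipeline:

```python
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

def index_chunk(path: str, chunk: str) -> None:
    print(f"indexing {len(chunk)} chars from {path}")  # stand-in: embed + store

class DocHandler(FileSystemEventHandler):
    def on_modified(self, event):
        if event.is_directory:
            return
        with open(event.src_path, encoding="utf-8") as f:
            text = f.read()
        # Naive fixed-size chunking; real chunking likely respects document structure.
        for i in range(0, len(text), 1000):
            index_chunk(event.src_path, text[i:i + 1000])

observer = Observer()
observer.schedule(DocHandler(), path="./projects", recursive=True)
observer.start()
```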

Documents and memories serve different purposes. Documents are project-scoped reference material. Memories are extracted insights that span projects. Both are searchable through the same interface.

Continuous Enrichment

The memory organ doesn't just store what agents write to it. It is continuously enriched from multiple sources:

  • Agent sessions. Decisions, patterns, and insights are captured during normal work and extracted from session transcripts.
  • Research pipeline. Umbra scans external sources and stages findings for quality review; promoted results join the memory bank.
  • Automated mining. Kosmos processes session transcripts after each session, extracting valuable context that might otherwise be lost.
  • Document indexing. Formal docs are chunked and made searchable alongside memories.

Context Injection

Memories surface themselves automatically. On every user message in Hoshi chat, the system searches for relevant memories and injects the top results into the agent's prompt. This gives the agent continuity ("you've been here before") without requiring an explicit search.
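
A sketch of what that injection could look like; search_memories stands in for the hybrid search described above, and the prompt format is invented:

```python
def search_memories(query: str) -> list[str]:
    return ["Decision: adopted soft archival instead of hard deletes"]  # stand-in

def build_prompt(user_message: str, top_k: int = 5) -> str:
    """Inject the top-k relevant memories ahead of the user's message."""
    memories = search_memories(user_message)[:top_k]
    context = "\n".join(f"- {m}" for m in memories)
    return f"Relevant memories from earlier sessions:\n{context}\n\nUser: {user_message}"
```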

Terminal agents access the same memory bank through MCP tools. When a Claude Code session starts in Squad View, it can search memories, read project context, and write new memories, using the same data that Hoshi and every other agent sees.
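
A sketch of exposing such tools via the official MCP Python SDK; the tool names and bodies are assumptions, not Sidespace's actual interface:

```python
# Only the FastMCP usage follows the official MCP Python SDK; everything
# else is a stand-in.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("memory-bank")

@mcp.tool()
def search_memories(query: str) -> list[str]:
    """Search the shared memory bank (stand-in implementation)."""
    return [f"top memory for {query!r}"]

@mcp.tool()
def write_memory(content: str, memory_type: str = "insight") -> str:
    """Write a new memory, immediately visible to every other agent."""
    return "created"  # stand-in: the real tool inserts a row in the memory bank

if __name__ == "__main__":
    mcp.run()  # serves the tools over stdio to the terminal agent
```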

Health Monitoring

Seven automated checks run every 6 hours, covering embedding health, search accuracy, data integrity, and canary quality (a known test memory that should always return at expected similarity). When a check fails, the Kosmos heal cycle (running every 12 hours) picks up the results and executes automated remediation: re-embedding orphaned memories, re-testing canaries, and surfacing unrecoverable issues to the user.
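
A sketch of a canary check, with a fake deterministic embedding standing in for the live model and an illustrative similarity threshold:

```python
import math

CANARY_TEXT = "canary: known test memory"  # illustrative canary content
SIMILARITY_FLOOR = 0.95                    # illustrative expected similarity

def embed(text: str) -> list[float]:
    # Deterministic fake embedding so the sketch runs; the real check
    # calls the live embedding model.
    return [1.0 + ((abs(hash(text)) >> i) & 0xFF) for i in range(0, 64, 8)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def canary_check(stored_vector: list[float]) -> bool:
    """Re-embed the canary and confirm it still matches its stored vector."""
    return cosine(embed(CANARY_TEXT), stored_vector) >= SIMILARITY_FLOOR
```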

The memory organ is not static storage. It is a maintained, self-healing system.

Where It's Heading

The next phase is cross-project intelligence: not just sharing memories across projects, but actively connecting them. Pattern recognition that says "you solved a similar problem in Project A, here's what worked." Knowledge gap detection that identifies what the system should know about a project but does not. And as memory counts grow, a two-step retrieval pattern (search returns snippets, then drill down on demand) will keep context injection lean while maintaining depth.
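
A sketch of the planned two-step pattern, with hypothetical names and an in-memory store; the real version would sit on top of the hybrid search described earlier:

```python
SNIPPET_LEN = 200

MEMORY_STORE: dict[int, str] = {
    1: "Decision: soft-archive memories instead of hard-deleting them, because ...",
}

def search_snippets(query: str) -> list[tuple[int, str]]:
    """Step 1: return (id, snippet) pairs so injected context stays lean."""
    return [(mid, text[:SNIPPET_LEN]) for mid, text in MEMORY_STORE.items()]

def get_memory(memory_id: int) -> str:
    """Step 2: drill down to the full memory only when the agent asks."""
    return MEMORY_STORE[memory_id]
```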