Memory & Learning

OpenKoi maintains a layered memory system that persists knowledge across sessions, enables semantic recall, and drives continuous improvement through learning extraction and decay. All data is stored locally -- no cloud dependencies.

Storage Layout

OpenKoi follows XDG conventions, splitting configuration from data:

~/.openkoi/                              # XDG_CONFIG_HOME/openkoi
  config.toml                            # Configuration (TOML)
  credentials/                           # API keys (chmod 600)
    providers.json
    integrations.json
  SOUL.md                                # Agent identity (user-editable)

~/.local/share/openkoi/                  # XDG_DATA_HOME/openkoi
  openkoi.db                             # SQLite (structured data + vectors)
  sessions/
    <session-id>.jsonl                   # Full transcripts (episodic memory)
  skills/
    managed/                             # Installed skills
    proposed/                            # Auto-proposed from pattern mining
    user/                                # User-created task skills
  evaluators/
    managed/                             # Installed evaluator skills
    proposed/                            # Auto-proposed evaluator skills
    user/                                # User-created evaluator skills
  plugins/
    wasm/                                # WASM plugin binaries
    scripts/                             # Rhai scripts

The configuration directory (~/.openkoi/) holds user-facing files: settings, credentials, and the soul document. The data directory (~/.local/share/openkoi/) holds operational data: the database, session transcripts, skills, and plugins.

You can override these paths with environment variables:

bash
OPENKOI_CONFIG=~/.openkoi/config.toml
OPENKOI_DATA=~/.local/share/openkoi
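The override-then-fallback resolution can be sketched as a pure function. Note that `resolve_data_dir` and its arguments are illustrative, not OpenKoi's actual API:

```rust
use std::path::PathBuf;

/// Illustrative resolution of the data directory: an explicit
/// OPENKOI_DATA-style override wins; otherwise fall back to the
/// XDG default under the user's home directory.
fn resolve_data_dir(env_override: Option<&str>, home: &str) -> PathBuf {
    match env_override {
        Some(dir) => PathBuf::from(dir),
        None => PathBuf::from(home).join(".local/share/openkoi"),
    }
}
```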

SQLite Schema

All structured data lives in a single SQLite database (openkoi.db). Vector search is provided by sqlite-vec loaded as an extension. Full-text search uses SQLite's built-in FTS5.

Core Tables

| Table | Purpose | Key Fields |
|---|---|---|
| sessions | Tracks conversation sessions | id, channel, model_provider, model_id, total_tokens, total_cost_usd, transcript_path |
| tasks | Records every task and its outcome | id, description, category, session_id, final_score, iterations, decision, total_tokens, total_cost_usd |
| iteration_cycles | Per-iteration data within a task | id, task_id, iteration, score, decision, input_tokens, output_tokens, duration_ms |
| findings | Individual evaluation findings | id, cycle_id, severity, dimension, title, description, location, fix, resolved_in |
| learnings | Extracted knowledge from task outcomes | id, type, content, category, confidence, source_task, reinforced, last_used, expires_at |
| skill_effectiveness | Performance tracking per skill per category | skill_name, task_category, avg_score, sample_count, last_used |

Memory Tables

| Table | Purpose | Key Fields |
|---|---|---|
| memory_chunks | Source text for semantic memory | id, source, text, created_at |
| memory_vec | Vector index (sqlite-vec virtual table) | id, embedding float[1536] |
| memory_fts | Full-text search index (FTS5 virtual table) | text (content synced from memory_chunks) |

Usage & Pattern Tables

| Table | Purpose | Key Fields |
|---|---|---|
| usage_events | Raw event log for pattern mining | id, event_type, channel, description, category, skills_used (JSON), score, timestamp, day, hour, day_of_week |
| usage_patterns | Detected recurring patterns | id, pattern_type, description, frequency, trigger_json, confidence, sample_count, proposed_skill, status |
| _migrations | Schema version tracking | version, name, applied_at |

Full Schema (SQL)

sql
CREATE TABLE sessions (
  id              TEXT PRIMARY KEY,
  channel         TEXT,
  model_provider  TEXT,
  model_id        TEXT,
  created_at      TEXT NOT NULL,
  updated_at      TEXT NOT NULL,
  total_tokens    INTEGER DEFAULT 0,
  total_cost_usd  REAL DEFAULT 0.0,
  transcript_path TEXT
);

CREATE TABLE tasks (
  id              TEXT PRIMARY KEY,
  description     TEXT NOT NULL,
  category        TEXT,
  session_id      TEXT REFERENCES sessions(id),
  final_score     REAL,
  iterations      INTEGER,
  decision        TEXT,
  total_tokens    INTEGER,
  total_cost_usd  REAL,
  created_at      TEXT NOT NULL,
  completed_at    TEXT
);

CREATE TABLE iteration_cycles (
  id              TEXT PRIMARY KEY,
  task_id         TEXT NOT NULL REFERENCES tasks(id),
  iteration       INTEGER NOT NULL,
  score           REAL,
  decision        TEXT NOT NULL,
  input_tokens    INTEGER,
  output_tokens   INTEGER,
  duration_ms     INTEGER,
  created_at      TEXT NOT NULL,
  UNIQUE(task_id, iteration)
);

CREATE TABLE findings (
  id              TEXT PRIMARY KEY,
  cycle_id        TEXT REFERENCES iteration_cycles(id),
  severity        TEXT NOT NULL,
  dimension       TEXT NOT NULL,
  title           TEXT NOT NULL,
  description     TEXT,
  location        TEXT,
  fix             TEXT,
  resolved_in     TEXT REFERENCES iteration_cycles(id)
);

CREATE TABLE learnings (
  id              TEXT PRIMARY KEY,
  type            TEXT NOT NULL,
  content         TEXT NOT NULL,
  category        TEXT,
  confidence      REAL NOT NULL,
  source_task     TEXT REFERENCES tasks(id),
  reinforced      INTEGER DEFAULT 0,
  created_at      TEXT NOT NULL,
  last_used       TEXT,
  expires_at      TEXT
);

CREATE TABLE skill_effectiveness (
  skill_name      TEXT NOT NULL,
  task_category   TEXT NOT NULL,
  avg_score       REAL NOT NULL,
  sample_count    INTEGER NOT NULL,
  last_used       TEXT NOT NULL,
  PRIMARY KEY (skill_name, task_category)
);

CREATE TABLE memory_chunks (
  id              TEXT PRIMARY KEY,
  source          TEXT NOT NULL,
  text            TEXT NOT NULL,
  created_at      TEXT NOT NULL
);

CREATE VIRTUAL TABLE memory_vec USING vec0(
  id TEXT PRIMARY KEY,
  embedding float[1536]
);

CREATE VIRTUAL TABLE memory_fts USING fts5(
  text, content='memory_chunks', content_rowid='rowid'
);

CREATE TABLE usage_events (
  id              TEXT PRIMARY KEY,
  event_type      TEXT NOT NULL,
  channel         TEXT,
  description     TEXT,
  category        TEXT,
  skills_used     TEXT,
  score           REAL,
  timestamp       TEXT NOT NULL,
  day             TEXT NOT NULL,
  hour            INTEGER,
  day_of_week     INTEGER
);

CREATE TABLE usage_patterns (
  id              TEXT PRIMARY KEY,
  pattern_type    TEXT NOT NULL,
  description     TEXT NOT NULL,
  frequency       TEXT,
  trigger_json    TEXT,
  confidence      REAL NOT NULL,
  sample_count    INTEGER NOT NULL,
  first_seen      TEXT NOT NULL,
  last_seen       TEXT NOT NULL,
  proposed_skill  TEXT,
  status          TEXT DEFAULT 'detected'
);

CREATE INDEX idx_events_day ON usage_events(day);
CREATE INDEX idx_learnings_type ON learnings(type);
CREATE INDEX idx_tasks_category ON tasks(category);

Memory Layers

OpenKoi uses five distinct memory layers, each serving a different temporal and functional role:

| Layer | Name | Storage | Lifetime | Purpose |
|---|---|---|---|---|
| 1 | Working Memory | LLM context window | Single turn | Active reasoning; compressed between iterations |
| 2 | Task Memory | In-process structs | Single task | Accumulates iteration cycles; flushed to SQLite on completion |
| 3 | Long-term Memory | SQLite + vec + FTS5 | Persistent | Learnings, task history, semantic search across all past data |
| 4 | Episodic Memory | JSONL transcripts | Persistent | Complete session transcripts, indexed into chunks for recall |
| 5 | Skill Memory | SKILL.md files + effectiveness matrix | Persistent | Reusable instructions plus per-category performance tracking |

How the Layers Interact

Working Memory (context window)
       ^
       | recall / inject
       |
Task Memory (in-process)
       |
       | flush on completion
       v
Long-term Memory (SQLite) <---> Episodic Memory (JSONL)
       ^                               |
       | effectiveness updates          | chunk + embed
       |                               v
Skill Memory (SKILL.md files)    memory_chunks + memory_vec

During a task, the orchestrator pulls relevant context from long-term memory into working memory via the recall system. When a task completes, task memory is persisted to SQLite. Session transcripts are written as JSONL and their key content is chunked into the embedding index.

Context Compaction

When a conversation grows too long, OpenKoi compacts older messages to stay within the model's context window. Compaction triggers when the estimated token count of the message history exceeds max_tokens.

Compaction Process

  1. Split point: Messages are divided at the 2/3 mark. The older two-thirds are candidates for compaction; the recent one-third is kept intact.
  2. Fact extraction: Before summarizing, durable facts are extracted from the old messages and persisted to long-term memory. This prevents information loss.
  3. Summarization: The old messages are summarized to approximately 500 tokens using the LLM.
  4. Reassembly: The history becomes a single system message containing the compacted summary, followed by the recent messages intact.
rust
pub async fn compact(
    messages: &[Message],
    max_tokens: u32,
    model: &dyn ModelProvider,
) -> Result<Vec<Message>> {
    let total = estimate_total_tokens(messages);
    if total <= max_tokens {
        return Ok(messages.to_vec());
    }

    let split_point = messages.len() * 2 / 3;
    let (old, recent) = messages.split_at(split_point);

    // Extract durable facts before summarizing
    let facts = extract_facts(old, model).await?;
    persist_facts(&facts).await?;

    // Summarize to ~500 tokens
    let summary = summarize(old, model, 500).await?;

    let mut compacted = vec![Message::system(format!(
        "[Compacted history]\n{summary}"
    ))];
    compacted.extend_from_slice(recent);
    Ok(compacted)
}

The key insight is that compaction is not just truncation -- it extracts facts into long-term memory first, so nothing truly important is lost.

Learning Types

OpenKoi extracts three types of learning from completed tasks:

| Type | Pattern | Example | Use |
|---|---|---|---|
| Heuristic | "Do X" | "Diminishing returns after 2 iterations on this type of task. Consider reducing max_iterations to 2." | Positive guidance injected into future task prompts |
| AntiPattern | "Don't do X" | "Iteration 2 regressed from 0.85 to 0.72. The attempted fix was counterproductive." | Highest-priority recall to prevent repeating mistakes |
| Preference | "X better than Y for Z" | "For SQL tasks, the sql-safety evaluator produces more actionable findings than general." | Comparative knowledge guiding skill and strategy selection |

Learning Extraction

Learning extraction happens automatically after every task completes. It uses a two-tier approach to balance cost and depth.

Tier 1: Rule-Based Extraction (0 Tokens)

Rule-based extraction runs on every task and costs nothing. It detects:

  • Regressions: When a score drops by more than 0.1 between iterations, the attempted fix is recorded as an AntiPattern.
  • Diminishing returns: When the last two iterations in a 3+ iteration task improved by less than 0.02, a Heuristic is recorded suggesting fewer iterations.
  • Recurring blockers: When the same evaluation dimension produces 2+ blocker-severity findings across iterations, an AntiPattern is recorded.
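As a sketch, the regression rule above can be written as a pure scan over per-iteration scores. The function name and types here are illustrative, not OpenKoi's actual API:

```rust
/// Illustrative form of the regression rule: a score drop of more than
/// 0.1 between consecutive iterations flags that iteration as an
/// anti-pattern candidate.
fn detect_regressions(scores: &[f32]) -> Vec<usize> {
    scores
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| pair[0] - pair[1] > 0.1)
        .map(|(i, _)| i + 1) // index of the iteration where the score dropped
        .collect()
}
```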

Tier 2: LLM-Assisted Extraction (~500 Tokens)

LLM extraction runs only when the task is complex enough to contain non-obvious learnings. The conditions are:

  • The task ran for 2 or more iterations (cycles >= 2)
  • At least one evaluation produced 3 or more findings (findings.len() >= 3)
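Assuming both conditions must hold, the trigger reduces to a two-line predicate (names here are illustrative):

```rust
/// Illustrative tier-2 trigger: at least two iteration cycles, and at
/// least one evaluation that produced three or more findings.
fn should_run_llm_extraction(cycles: usize, max_findings_in_one_eval: usize) -> bool {
    cycles >= 2 && max_findings_in_one_eval >= 3
}
```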

When triggered, the LLM is given a tight budget (~500 tokens) and asked to extract 1-3 reusable learnings as single sentences.

Extract 1-3 reusable learnings from this task execution.
Each learning should be a single sentence that would help
with similar future tasks.

[Summary of iteration cycles, scores, and findings]

Extraction Flow

Task completes
     |
     +-- Rule-based extraction (0 tokens)
     |     +-- Score regressions       --> AntiPattern
     |     +-- Diminishing returns     --> Heuristic
     |     +-- Recurring findings      --> AntiPattern
     |
     +-- LLM extraction (~500 tokens, conditional)
     |     +-- Non-obvious learnings   --> Heuristic | Preference
     |
     +-- Skill effectiveness update (0 tokens)
     |     +-- (skill_name, category, avg_score) --> skill_effectiveness table
     |
     +-- Deduplication
     |     +-- Similar existing (>0.8) --> reinforce existing
     |     +-- Novel                   --> persist new learning
     |
     +-- Persist to SQLite

Learning Lifecycle

Deduplication

Before persisting a new learning, OpenKoi checks for semantic overlap with existing learnings. If text_similarity between a new learning and an existing one exceeds 0.8, the new learning is dropped and the existing learning is reinforced instead (its confidence is bumped and last_used is updated).
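How text_similarity is computed is not specified here; as a stand-in, a token-level Jaccard similarity illustrates the drop-or-reinforce decision:

```rust
use std::collections::HashSet;

/// Illustrative token-overlap (Jaccard) similarity, standing in for the
/// real text_similarity implementation, which may differ.
fn text_similarity(a: &str, b: &str) -> f32 {
    let ta: HashSet<&str> = a.split_whitespace().collect();
    let tb: HashSet<&str> = b.split_whitespace().collect();
    if ta.is_empty() && tb.is_empty() {
        return 1.0;
    }
    let inter = ta.intersection(&tb).count() as f32;
    let union = ta.union(&tb).count() as f32;
    inter / union
}

/// New learnings above the 0.8 threshold reinforce the existing one
/// instead of being persisted.
fn should_reinforce(new: &str, existing: &str) -> bool {
    text_similarity(new, existing) > 0.8
}
```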

Skill Effectiveness Updates

After every task, each skill that was used has its effectiveness record updated:

sql
-- Upsert into skill_effectiveness
INSERT INTO skill_effectiveness (skill_name, task_category, avg_score, sample_count, last_used)
VALUES (?1, ?2, ?3, 1, ?4)
ON CONFLICT(skill_name, task_category)
DO UPDATE SET
  avg_score = (avg_score * sample_count + ?3) / (sample_count + 1),
  sample_count = sample_count + 1,
  last_used = ?4;

This creates a running average of how well each skill performs for each task category, which feeds into the skill selector for future tasks.
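The same incremental mean, extracted from the SQL upsert into plain Rust (names illustrative):

```rust
/// Fold one new score into a running average, mirroring the
/// ON CONFLICT branch of the skill_effectiveness upsert:
/// new_avg = (avg * n + score) / (n + 1).
fn update_effectiveness(avg_score: f64, sample_count: u32, new_score: f64) -> (f64, u32) {
    let n = sample_count as f64;
    ((avg_score * n + new_score) / (n + 1.0), sample_count + 1)
}
```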

Confidence Decay

Learnings lose confidence over time unless they are reinforced by appearing relevant to new tasks. The decay formula is:

confidence *= exp(-rate_per_week * weeks_since_reinforced)

| Parameter | Default | Meaning |
|---|---|---|
| rate_per_week | 0.05 | Decay rate per week |
| Prune threshold | 0.1 | Learnings below this confidence are deleted |

Example decay timeline (starting confidence 0.8, never reinforced):

| Weeks | Confidence | Status |
|---|---|---|
| 0 | 0.80 | Active |
| 4 | 0.65 | Active |
| 8 | 0.54 | Active |
| 16 | 0.36 | Active |
| 24 | 0.24 | Active |
| 32 | 0.16 | Active |
| 40 | 0.11 | Active |
| 44 | 0.09 | Pruned |
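The timeline can be reproduced directly from the closed-form decay (a sanity check, not OpenKoi code):

```rust
/// Confidence after `weeks` with no reinforcement, per the decay formula
/// confidence *= exp(-rate_per_week * weeks_since_reinforced).
fn confidence_after(initial: f32, rate_per_week: f32, weeks: f32) -> f32 {
    initial * (-rate_per_week * weeks).exp()
}
```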

A learning that is reinforced resets last_used to now, effectively restarting the decay clock. Frequently reinforced learnings become long-lived knowledge.

rust
pub fn apply_decay(learnings: &mut [Learning], rate_per_week: f32) {
    let now = Utc::now();
    for learning in learnings.iter_mut() {
        let weeks_since_reinforced = (now - learning.last_used)
            .num_days() as f32 / 7.0;
        let decay = (-rate_per_week * weeks_since_reinforced).exp();
        learning.confidence *= decay;
    }
    learnings.retain(|l| l.confidence >= 0.1);
}

The decay rate is configurable in config.toml:

toml
[memory]
learning_decay_rate = 0.05   # per week

Recall System

When a new task arrives, the Historian recalls relevant context from long-term memory. Recall is token-budgeted -- it fills context in priority order until the budget is exhausted.

Recall Priority

| Priority | Category | Max Items | Rationale |
|---|---|---|---|
| 1 (highest) | Anti-patterns | 5 | "Don't do X" is the most valuable recall -- prevents repeating known mistakes |
| 2 | Skill recommendations | 3 | Low token cost, directly influences skill selection |
| 3 | Learnings (heuristics/preferences) | 5 | "Do X" and "prefer X over Y" guidance for the executor |
| 4 (lowest) | Similar past tasks | 3 | Expensive (full task summaries), only loaded if budget remains |

The recall budget is typically 1/10 of the total task token budget. For a default 200,000-token budget, recall gets up to 20,000 tokens.

rust
pub async fn recall(
    &self,
    task: &TaskInput,
    token_budget: u32,
) -> Result<HistoryRecall> {
    let embedding = self.embed(&task.description).await?;
    let mut used_tokens: u32 = 0;
    let mut recall = HistoryRecall::default();

    // Priority 1: Anti-patterns
    let anti_patterns = self.query_learnings(
        LearningType::AntiPattern, &embedding, 5
    ).await?;
    for ap in anti_patterns {
        let tokens = estimate_tokens(&ap.content);
        if used_tokens + tokens > token_budget { break; }
        used_tokens += tokens;
        recall.anti_patterns.push(ap);
    }

    // Priority 2: Skill recommendations
    // Priority 3: Learnings (heuristics)
    // Priority 4: Similar past tasks (only if budget > 50% remaining)
    // ...

    Ok(recall)
}

Similar past tasks are only loaded if less than half the recall budget has been consumed by higher-priority items. This prevents expensive task summaries from crowding out actionable learnings.
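That gate can be stated as a one-line predicate (names illustrative):

```rust
/// Priority-4 gate: similar past tasks load only while less than half
/// of the recall budget has been consumed by higher-priority items.
fn may_load_similar_tasks(used_tokens: u32, recall_budget: u32) -> bool {
    used_tokens < recall_budget / 2
}
```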

Embedding Model

OpenKoi uses text-embedding-3-small from OpenAI by default, producing vectors with 1536 dimensions. The embedding model is configured separately from the chat model:

toml
[models]
embedder = "openai/text-embedding-3-small"

Vector Storage

Embeddings are stored in a sqlite-vec virtual table, which provides approximate nearest-neighbor search directly within SQLite:

sql
CREATE VIRTUAL TABLE memory_vec USING vec0(
  id TEXT PRIMARY KEY,
  embedding float[1536]
);

Recall uses a hybrid search strategy combining vector similarity (semantic) with FTS5 (keyword):

  1. Vector search via memory_vec: Finds semantically similar content even when exact keywords differ. Used for finding similar tasks, relevant learnings, and related memory chunks.
  2. FTS5 search via memory_fts: Catches exact keyword matches that embedding similarity might miss (e.g., specific function names, error codes, file paths).

Results from both searches are merged, deduplicated, and ranked by a combined score. This hybrid approach provides better recall than either method alone.
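The exact combined score is not specified here; one common choice for merging two ranked lists is reciprocal rank fusion (RRF), sketched below with illustrative types:

```rust
use std::collections::HashMap;

/// Hypothetical merge of vector-search and FTS5 result lists using
/// reciprocal rank fusion: each list contributes 1/(K + rank) to a
/// document's score, and duplicates accumulate from both lists.
fn merge_hybrid(vec_ids: &[&str], fts_ids: &[&str]) -> Vec<String> {
    const K: f64 = 60.0; // conventional RRF damping constant
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (rank, id) in vec_ids.iter().enumerate() {
        *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (K + rank as f64 + 1.0);
    }
    for (rank, id) in fts_ids.iter().enumerate() {
        *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (K + rank as f64 + 1.0);
    }
    let mut merged: Vec<(String, f64)> = scores.into_iter().collect();
    merged.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    merged.into_iter().map(|(id, _)| id).collect()
}
```

A document found by both searches outranks one found by only a single search, which matches the intuition that agreement between the semantic and keyword signals is strong evidence of relevance.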

How Learnings Feed Back

| Learning Type | Where Used | Effect |
|---|---|---|
| Heuristic | Recalled into system prompt (Priority 3) | "Do X" guidance for the executor |
| AntiPattern | Recalled into system prompt (Priority 1) | "Don't do X" -- highest priority recall |
| Preference | Recalled into system prompt (Priority 3) | "Prefer X over Y" guidance |
| Skill effectiveness | Skill selector scoring | Higher-scoring skills ranked first |
| Reinforced learnings | Confidence stays high, survives decay | Long-lived knowledge |
| Unreinforced learnings | Confidence decays over time | Eventually pruned from memory |

The learning system creates a feedback loop: task outcomes produce learnings, learnings influence future tasks, and successful influence reinforces the learning. Over time, this produces an agent that avoids known pitfalls and gravitates toward strategies that work for your specific usage patterns.

Released under the MIT License.