Building Persistent Memory for AI Agents with ChromaDB and OpenClaw
February 5, 2026
One of the fundamental limitations of LLM-based assistants is their lack of persistent memory. Each conversation starts fresh—your AI has no recollection of previous interactions, decisions, or lessons learned. This article explores how to solve this problem by implementing a semantic vector memory system using ChromaDB, specifically designed for AI agents running on platforms like OpenClaw.
The Memory Problem
Large Language Models like Claude or GPT have context windows—a limited amount of text they can process at once. While these windows have grown significantly (200K+ tokens for Claude), they're still fundamentally session-based. Your AI assistant can't:
- Remember what you discussed last week
- Learn from corrections you've made
- Build up knowledge about your preferences over time
- Recall decisions and their reasoning
Traditional solutions involve dumping conversation logs into the context, but this is inefficient and doesn't scale. What we need is semantic memory—the ability to store and retrieve information based on meaning, not just keywords.
Why Vector Memory?
Vector databases store text as high-dimensional embeddings—numerical representations that capture semantic meaning. A search for "my coffee preferences" can surface an entry like "I like dark roast from Ethiopia" even though the two share almost no keywords.
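To make this concrete, here is a minimal sketch using ChromaDB's in-memory client and its default local embedding model; the stored documents are invented purely for illustration:

# Minimal sketch: semantic recall with an in-memory ChromaDB collection.
import chromadb

client = chromadb.Client()  # ephemeral client; nothing is persisted to disk
collection = client.create_collection(name="demo")

collection.add(
    documents=[
        "I like dark roast coffee from Ethiopia",
        "The standup meeting moved to 10am",
    ],
    ids=["pref-1", "note-1"],
)

# No keyword overlap with the stored text, yet the coffee note should rank first.
results = collection.query(query_texts=["my coffee preferences"], n_results=1)
print(results["documents"][0][0])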
This is fundamentally different from traditional databases:
| Traditional DB | Vector DB |
|---|---|
| Exact keyword matching | Semantic similarity |
| Rigid schema | Flexible documents |
| SQL queries | Natural language queries |
| Returns exact matches | Returns related concepts |
For AI agents, vector memory enables:
- Semantic recall: Find relevant memories by meaning
- Efficient storage: Only store what matters, retrieve what's relevant
- Learning over time: Accumulate knowledge without bloating context
- Correction memory: Remember mistakes to avoid repeating them
The Architecture
Our solution uses ChromaDB, an open-source embedding database that runs locally without external API calls. The architecture is simple:
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  AI Agent   │────▶│  Memory CLI  │────▶│  ChromaDB   │
│ (OpenClaw)  │     │   (Python)   │     │   (Local)   │
└─────────────┘     └──────────────┘     └─────────────┘
                                                │
                                         ┌──────┴───────┐
                                         │ Memory Types │
                                         ├──────────────┤
                                         │ • Learnings  │
                                         │ • Decisions  │
                                         │ • Corrections│
                                         │ • Files      │
                                         │ • Notes      │
                                         └──────────────┘
The AI agent interacts with memory through a CLI wrapper, which handles embedding generation, storage, and retrieval. All data stays local—no external API calls for embeddings.
Implementation
Step 1: Set Up the Directory Structure
mkdir -p ~/.openclaw/vector-memory
mkdir -p ~/.openclaw/bin
Step 2: Create the Memory Service
Create ~/.openclaw/vector-memory/memory_service.py:
#!/usr/bin/env python3
"""
Long-Term Memory Service
Semantic memory storage and retrieval using ChromaDB
"""
import chromadb
from chromadb.config import Settings
import json
import hashlib
from datetime import datetime
from pathlib import Path
import argparse

# Configuration
MEMORY_DIR = Path.home() / ".openclaw" / "vector-memory"
CHROMA_DIR = MEMORY_DIR / "chroma_db"
COLLECTION_NAME = "agent_memory"

# Initialize ChromaDB client (local, no external APIs)
client = chromadb.PersistentClient(
    path=str(CHROMA_DIR),
    settings=Settings(anonymized_telemetry=False)
)

collection = client.get_or_create_collection(
    name=COLLECTION_NAME,
    metadata={"description": "Agent long-term semantic memory"}
)

def generate_id(text: str, source: str = "") -> str:
    """Generate a unique ID for a memory chunk."""
    content = f"{text}{source}{datetime.now().isoformat()}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]

def add_memory(text: str, source: str = "manual",
               memory_type: str = "note", metadata: dict = None) -> str:
    """Add a new memory to the store."""
    if not text.strip():
        return "Error: Empty text"
    doc_id = generate_id(text, source)
    meta = {
        "source": source,
        "type": memory_type,
        "timestamp": datetime.now().isoformat(),
        "date": datetime.now().strftime("%Y-%m-%d"),
    }
    if metadata:
        meta.update(metadata)
    collection.add(
        documents=[text],
        ids=[doc_id],
        metadatas=[meta]
    )
    return f"Memory added: {doc_id}"

def search_memory(query: str, top_k: int = 5,
                  memory_type: str = None) -> list:
    """Search memories semantically."""
    where_filter = {"type": memory_type} if memory_type else None
    results = collection.query(
        query_texts=[query],
        n_results=top_k,
        where=where_filter
    )
    memories = []
    if results and results['documents'] and results['documents'][0]:
        for i, doc in enumerate(results['documents'][0]):
            memories.append({
                "text": doc,
                "metadata": results['metadatas'][0][i],
                "distance": results['distances'][0][i],
                "id": results['ids'][0][i]
            })
    return memories

def add_learning(text: str, context: str = "") -> str:
    """Add a learning/insight to memory."""
    full_text = f"Context: {context}\n\nLearning: {text}" if context else text
    return add_memory(full_text, "learning", "learning", {"context": context})

def add_decision(decision: str, reasoning: str = "") -> str:
    """Record a decision and its reasoning."""
    full_text = f"Decision: {decision}"
    if reasoning:
        full_text += f"\n\nReasoning: {reasoning}"
    return add_memory(full_text, "decision", "decision", {"reasoning": reasoning})

def add_correction(mistake: str, correction: str, context: str = "") -> str:
    """Record a correction for learning from mistakes."""
    full_text = f"Mistake: {mistake}\n\nCorrection: {correction}"
    if context:
        full_text = f"Context: {context}\n\n{full_text}"
    return add_memory(full_text, "correction", "correction", {"context": context})

def index_markdown_file(filepath: str, chunk_size: int = 500) -> int:
    """Index a markdown file, splitting into semantic chunks."""
    path = Path(filepath)
    if not path.exists():
        return 0
    content = path.read_text()
    chunks = []
    current_chunk = ""
    for line in content.split('\n'):
        if line.startswith('#') and current_chunk:
            if len(current_chunk.strip()) > 50:
                chunks.append(current_chunk.strip())
            current_chunk = line + "\n"
        else:
            current_chunk += line + "\n"
            if len(current_chunk) > chunk_size:
                chunks.append(current_chunk.strip())
                current_chunk = ""
    if current_chunk.strip() and len(current_chunk.strip()) > 50:
        chunks.append(current_chunk.strip())
    indexed = 0
    for i, chunk in enumerate(chunks):
        doc_id = f"{path.name}_{i}_{hashlib.md5(chunk.encode()).hexdigest()[:8]}"
        existing = collection.get(ids=[doc_id])
        if existing and existing['ids']:
            continue
        collection.add(
            documents=[chunk],
            ids=[doc_id],
            metadatas=[{
                "source": str(filepath),
                "type": "file",
                "filename": path.name,
                "chunk_index": i,
                "timestamp": datetime.now().isoformat(),
            }]
        )
        indexed += 1
    return indexed

def get_stats() -> dict:
    """Get memory statistics."""
    return {
        "total_memories": collection.count(),
        "collection": COLLECTION_NAME,
        "storage_path": str(CHROMA_DIR)
    }
Step 3: Create the CLI Wrapper
Create ~/.openclaw/bin/memory:
#!/bin/bash
python3 ~/.openclaw/vector-memory/memory_service.py "$@"
Make it executable:
chmod +x ~/.openclaw/bin/memory
Step 4: Set Up the Python Environment
cd ~/.openclaw/vector-memory
python3 -m venv venv
source venv/bin/activate
pip install chromadb
Update the CLI to use the virtual environment:
#!/bin/bash
~/.openclaw/vector-memory/venv/bin/python3 \
~/.openclaw/vector-memory/memory_service.py "$@"
Using the Memory System
Basic Commands
# Search memories semantically
~/.openclaw/bin/memory search "user preferences for coffee" -n 5
# Add a learning
~/.openclaw/bin/memory learn "User prefers dark roast coffee" \
-c "Mentioned during morning routine discussion"
# Record a decision
~/.openclaw/bin/memory decide "Use PostgreSQL for the project" \
-r "Better JSON support and team familiarity"
# Record a correction (when you correct the AI)
~/.openclaw/bin/memory correct \
"Assumed user was in EST timezone" \
"User is in CET (Europe/Rome)" \
-c "Scheduling meeting"
# Index a markdown file
~/.openclaw/bin/memory index-file ~/notes/project-requirements.md
# Check stats
~/.openclaw/bin/memory stats
Integrating with Your AI Agent
Add instructions to your agent's configuration to use memory:
## Memory System
Before answering questions about past work, preferences, or decisions:
```bash
~/.openclaw/bin/memory search "relevant query" -n 5
```
When learning something important:
```bash
~/.openclaw/bin/memory learn "what I learned" -c "context"
```
When corrected by user:
```bash
~/.openclaw/bin/memory correct "mistake" "correction" -c "context"
```
The Memory Workflow
Having the system is one thing—using it consistently is another. After running this in production, here's the workflow that actually sticks:
At Session Start
- Read context files: MEMORY.md and recent daily logs give immediate context
- Search before answering: If the user asks about past work, run memory search first:
# User asks: "What did we decide about the database?"
~/.openclaw/bin/memory search "database decision" -n 5
This prevents the AI from making things up or forgetting prior decisions.
During Work
| Trigger | Action |
|---|---|
| Completed a significant feature | memory learn "Built auth system with JWT" -c "Project X" |
| Made a non-obvious decision | memory decide "Using Redis for sessions" -r "Need sub-ms latency" |
| User corrected a mistake | memory correct "Assumed UTC" "User is in CET" -c "scheduling" |
| Created important documentation | memory index-file ./docs/architecture.md |
At Session End
- Update the daily log (memory/YYYY-MM-DD.md) with key events
- Consider what's worth adding to long-term MEMORY.md
- Index any new important files
The Key Principle
If you might need to remember it next session, write it down NOW.
Mental notes don't survive context window resets. Files and vector memories do.
Memory Types and Their Uses
| Type | Purpose | Example |
|---|---|---|
| learning | Insights and lessons | "User prefers concise responses" |
| decision | Recorded choices with reasoning | "Chose React over Vue for better ecosystem" |
| correction | Mistakes and their fixes | "Don't assume timezone; always ask" |
| file | Indexed document chunks | Project docs, meeting notes |
| note | General information | Contact details, preferences |
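Because every memory carries its type in the stored metadata, retrieval can be scoped to a single category via the where filter used in search_memory() above. For example:

# Only corrections are considered; maps to where={"type": "correction"} internally.
# Assumes the helpers from memory_service.py are in scope (or imported).
corrections = search_memory("timezone assumptions", top_k=3, memory_type="correction")
for m in corrections:
    print(m["metadata"]["date"], "-", m["text"])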
Why ChromaDB?
Several vector databases could work for this use case, but ChromaDB offers specific advantages for local AI agents:
- Local-first: Runs entirely on your machine, no API keys or external calls
- Built-in embeddings: Uses Sentence Transformers by default—no OpenAI API needed (see the sketch after this list for pinning the model explicitly)
- Lightweight: Single Python package, minimal dependencies
- Persistent: Data survives restarts without explicit save calls
- Simple API: Add, query, and manage with minimal code
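ChromaDB's default embedding function runs a small local MiniLM model, which is enough for most agent-memory workloads. If you would rather pin the model explicitly, ChromaDB lets you pass an embedding function when creating the collection. A sketch, assuming the sentence-transformers package is installed in the same virtual environment:

from pathlib import Path
import chromadb
from chromadb.utils import embedding_functions

# Explicitly pin the local embedding model instead of relying on ChromaDB's default.
ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

client = chromadb.PersistentClient(
    path=str(Path.home() / ".openclaw" / "vector-memory" / "chroma_db")
)
collection = client.get_or_create_collection(
    name="agent_memory",
    embedding_function=ef
)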
Results in Practice
After implementing this system, the AI agent can:
- Recall context from weeks ago: "What did we decide about the database schema?"
- Learn from corrections: the same mistake doesn't recur once it has been corrected
- Build institutional knowledge: Preferences, patterns, and decisions accumulate
- Search semantically: Find related memories even with different phrasing
The difference is transformative. The assistant goes from a stateless tool to something that feels like it actually knows you.
Conclusion
Persistent memory transforms AI assistants from stateless responders into genuine collaborators that learn and grow. With ChromaDB and a simple CLI wrapper, you can add this capability to any LLM-based agent in under an hour.
The key insight is that memory doesn't need to be complex. A vector database, some careful categorization of memory types, and clear instructions for when to store and retrieve—that's all it takes to give your AI assistant a functioning long-term memory.