Stefano De Cillis

Engineer | Tech Entrepreneur

Building Persistent Memory for AI Agents with ChromaDB and OpenClaw

February 5, 2026

One of the fundamental limitations of LLM-based assistants is their lack of persistent memory. Each conversation starts fresh—your AI has no recollection of previous interactions, decisions, or lessons learned. This article explores how to solve this problem by implementing a semantic vector memory system using ChromaDB, specifically designed for AI agents running on platforms like OpenClaw.


The Memory Problem

Large Language Models like Claude or GPT have context windows—a limited amount of text they can process at once. While these windows have grown significantly (200K+ tokens for Claude), they are still fundamentally session-bound: whatever isn't carried into the current conversation is gone. Your AI assistant can't:

  • Remember what you discussed last week
  • Learn from corrections you've made
  • Build up knowledge about your preferences over time
  • Recall decisions and their reasoning

Traditional solutions involve dumping conversation logs into the context, but this is inefficient and doesn't scale. What we need is semantic memory—the ability to store and retrieve information based on meaning, not just keywords.

Why Vector Memory?

Vector databases store text as high-dimensional embeddings—numerical representations that capture semantic meaning. When you search for "my coffee preferences," the database can surface an entry like "I like dark roast from Ethiopia" even though the two share almost no words.
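
To make that concrete, here is a tiny, self-contained sketch using an in-memory ChromaDB collection (the implementation later in this article uses a persistent one). The query shares no keywords with the stored sentence, yet the closest match is still the coffee note:

# Minimal sketch: semantic retrieval with an ephemeral (in-memory) ChromaDB collection
import chromadb

client = chromadb.Client()  # ephemeral client; nothing is written to disk
collection = client.create_collection("demo")

collection.add(
    documents=[
        "I like dark roast from Ethiopia",
        "The deployment pipeline runs on GitHub Actions",
    ],
    ids=["pref-1", "infra-1"],
)

results = collection.query(query_texts=["my coffee preferences"], n_results=1)
print(results["documents"][0][0])  # -> "I like dark roast from Ethiopia"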

This is fundamentally different from traditional databases:

| Traditional DB | Vector DB |
|---|---|
| Exact keyword matching | Semantic similarity |
| Rigid schema | Flexible documents |
| SQL queries | Natural language queries |
| Returns exact matches | Returns related concepts |

For AI agents, vector memory enables:

  • Semantic recall: Find relevant memories by meaning
  • Efficient storage: Only store what matters, retrieve what's relevant
  • Learning over time: Accumulate knowledge without bloating context
  • Correction memory: Remember mistakes to avoid repeating them

The Architecture

Our solution uses ChromaDB, an open-source embedding database that runs locally without external API calls. The architecture is simple:

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  AI Agent   │────▶│ Memory CLI   │────▶│  ChromaDB   │
│  (OpenClaw) │     │   (Python)   │     │  (Local)    │
└─────────────┘     └──────────────┘     └─────────────┘
                           │
                    ┌──────┴───────┐
                    │ Memory Types │
                    ├──────────────┤
                    │ • Learnings  │
                    │ • Decisions  │
                    │ • Corrections│
                    │ • Files      │
                    │ • Notes      │
                    └──────────────┘

The AI agent interacts with memory through a CLI wrapper, which handles embedding generation, storage, and retrieval. All data stays local—no external API calls for embeddings.

Implementation

Step 1: Set Up the Directory Structure

mkdir -p ~/.openclaw/vector-memory
mkdir -p ~/.openclaw/bin

Step 2: Create the Memory Service

Create ~/.openclaw/vector-memory/memory_service.py:

#!/usr/bin/env python3
"""
Long-Term Memory Service
Semantic memory storage and retrieval using ChromaDB
"""

import chromadb
from chromadb.config import Settings
import json
import hashlib
from datetime import datetime
from pathlib import Path
import argparse

# Configuration
MEMORY_DIR = Path.home() / ".openclaw" / "vector-memory"
CHROMA_DIR = MEMORY_DIR / "chroma_db"
COLLECTION_NAME = "agent_memory"

# Initialize ChromaDB client (local, no external APIs)
client = chromadb.PersistentClient(
    path=str(CHROMA_DIR),
    settings=Settings(anonymized_telemetry=False)
)

collection = client.get_or_create_collection(
    name=COLLECTION_NAME,
    metadata={"description": "Agent long-term semantic memory"}
)


def generate_id(text: str, source: str = "") -> str:
    """Generate a unique ID for a memory chunk."""
    content = f"{text}{source}{datetime.now().isoformat()}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]


def add_memory(text: str, source: str = "manual", 
               memory_type: str = "note", metadata: dict = None) -> str:
    """Add a new memory to the store."""
    if not text.strip():
        return "Error: Empty text"
    
    doc_id = generate_id(text, source)
    meta = {
        "source": source,
        "type": memory_type,
        "timestamp": datetime.now().isoformat(),
        "date": datetime.now().strftime("%Y-%m-%d"),
    }
    if metadata:
        meta.update(metadata)
    
    collection.add(
        documents=[text],
        ids=[doc_id],
        metadatas=[meta]
    )
    
    return f"Memory added: {doc_id}"


def search_memory(query: str, top_k: int = 5, 
                  memory_type: str = None) -> list:
    """Search memories semantically."""
    where_filter = {"type": memory_type} if memory_type else None
    
    results = collection.query(
        query_texts=[query],
        n_results=top_k,
        where=where_filter
    )
    
    memories = []
    if results and results['documents'] and results['documents'][0]:
        for i, doc in enumerate(results['documents'][0]):
            memories.append({
                "text": doc,
                "metadata": results['metadatas'][0][i],
                "distance": results['distances'][0][i],
                "id": results['ids'][0][i]
            })
    
    return memories


def add_learning(text: str, context: str = "") -> str:
    """Add a learning/insight to memory."""
    full_text = f"Context: {context}\n\nLearning: {text}" if context else text
    return add_memory(full_text, "learning", "learning", {"context": context})


def add_decision(decision: str, reasoning: str = "") -> str:
    """Record a decision and its reasoning."""
    full_text = f"Decision: {decision}"
    if reasoning:
        full_text += f"\n\nReasoning: {reasoning}"
    return add_memory(full_text, "decision", "decision", {"reasoning": reasoning})


def add_correction(mistake: str, correction: str, context: str = "") -> str:
    """Record a correction for learning from mistakes."""
    full_text = f"Mistake: {mistake}\n\nCorrection: {correction}"
    if context:
        full_text = f"Context: {context}\n\n{full_text}"
    return add_memory(full_text, "correction", "correction", {"context": context})


def index_markdown_file(filepath: str, chunk_size: int = 500) -> int:
    """Index a markdown file, splitting into semantic chunks."""
    path = Path(filepath)
    if not path.exists():
        return 0
    
    content = path.read_text()
    chunks = []
    current_chunk = ""
    
    for line in content.split('\n'):
        if line.startswith('#') and current_chunk:
            if len(current_chunk.strip()) > 50:
                chunks.append(current_chunk.strip())
            current_chunk = line + "\n"
        else:
            current_chunk += line + "\n"
            if len(current_chunk) > chunk_size:
                chunks.append(current_chunk.strip())
                current_chunk = ""
    
    if current_chunk.strip() and len(current_chunk.strip()) > 50:
        chunks.append(current_chunk.strip())
    
    indexed = 0
    for i, chunk in enumerate(chunks):
        doc_id = f"{path.name}_{i}_{hashlib.md5(chunk.encode()).hexdigest()[:8]}"
        
        existing = collection.get(ids=[doc_id])
        if existing and existing['ids']:
            continue
        
        collection.add(
            documents=[chunk],
            ids=[doc_id],
            metadatas=[{
                "source": str(filepath),
                "type": "file",
                "filename": path.name,
                "chunk_index": i,
                "timestamp": datetime.now().isoformat(),
            }]
        )
        indexed += 1
    
    return indexed


def get_stats() -> dict:
    """Get memory statistics."""
    return {
        "total_memories": collection.count(),
        "collection": COLLECTION_NAME,
        "storage_path": str(CHROMA_DIR)
    }
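
As written, this listing defines the storage functions but never wires up a command-line interface, even though it imports argparse and the wrapper in the next step passes its arguments straight through to the script. A minimal entry point, appended to the end of memory_service.py, might look like the following sketch; the subcommand and flag names mirror the usage examples later in this article, but the exact details are assumptions rather than a canonical implementation:

def main():
    # Minimal CLI sketch: subcommands mirror the usage examples below
    parser = argparse.ArgumentParser(description="Long-term semantic memory CLI")
    sub = parser.add_subparsers(dest="command", required=True)

    p = sub.add_parser("search", help="Semantic search over stored memories")
    p.add_argument("query")
    p.add_argument("-n", "--top-k", type=int, default=5)
    p.add_argument("-t", "--type", default=None, help="Filter by memory type")

    p = sub.add_parser("learn", help="Store a learning/insight")
    p.add_argument("text")
    p.add_argument("-c", "--context", default="")

    p = sub.add_parser("decide", help="Record a decision and its reasoning")
    p.add_argument("decision")
    p.add_argument("-r", "--reasoning", default="")

    p = sub.add_parser("correct", help="Record a correction")
    p.add_argument("mistake")
    p.add_argument("correction")
    p.add_argument("-c", "--context", default="")

    p = sub.add_parser("index-file", help="Index a markdown file")
    p.add_argument("filepath")

    sub.add_parser("stats", help="Show memory statistics")

    args = parser.parse_args()

    if args.command == "search":
        for m in search_memory(args.query, args.top_k, args.type):
            print(f"[{m['metadata'].get('type', '?')}] (distance {m['distance']:.3f})")
            print(m["text"])
            print()
    elif args.command == "learn":
        print(add_learning(args.text, args.context))
    elif args.command == "decide":
        print(add_decision(args.decision, args.reasoning))
    elif args.command == "correct":
        print(add_correction(args.mistake, args.correction, args.context))
    elif args.command == "index-file":
        print(f"Indexed {index_markdown_file(args.filepath)} chunks")
    elif args.command == "stats":
        print(json.dumps(get_stats(), indent=2))


if __name__ == "__main__":
    main()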

Step 3: Create the CLI Wrapper

Create ~/.openclaw/bin/memory:

#!/bin/bash
python3 ~/.openclaw/vector-memory/memory_service.py "$@"

Make it executable:

chmod +x ~/.openclaw/bin/memory

Step 4: Set Up the Python Environment

cd ~/.openclaw/vector-memory
python3 -m venv venv
source venv/bin/activate
pip install chromadb

Update the CLI to use the virtual environment:

#!/bin/bash
~/.openclaw/vector-memory/venv/bin/python3 \
    ~/.openclaw/vector-memory/memory_service.py "$@"

Using the Memory System

Basic Commands

# Search memories semantically
~/.openclaw/bin/memory search "user preferences for coffee" -n 5

# Add a learning
~/.openclaw/bin/memory learn "User prefers dark roast coffee" \
    -c "Mentioned during morning routine discussion"

# Record a decision
~/.openclaw/bin/memory decide "Use PostgreSQL for the project" \
    -r "Better JSON support and team familiarity"

# Record a correction (when you correct the AI)
~/.openclaw/bin/memory correct \
    "Assumed user was in EST timezone" \
    "User is in CET (Europe/Rome)" \
    -c "Scheduling meeting"

# Index a markdown file
~/.openclaw/bin/memory index-file ~/notes/project-requirements.md

# Check stats
~/.openclaw/bin/memory stats

Integrating with Your AI Agent

Add instructions to your agent's configuration to use memory:

## Memory System

Before answering questions about past work, preferences, or decisions:
```bash
~/.openclaw/bin/memory search "relevant query" -n 5
```

When learning something important:
```bash
~/.openclaw/bin/memory learn "what I learned" -c "context"
```

When corrected by user:
```bash
~/.openclaw/bin/memory correct "mistake" "correction" -c "context"
```

The Memory Workflow

Having the system is one thing—using it consistently is another. After running this in production, here's the workflow that actually sticks:

At Session Start

  1. Read context files: MEMORY.md and recent daily logs give immediate context
  2. Search before answering: If the user asks about past work, run memory search first:

# User asks: "What did we decide about the database?"
~/.openclaw/bin/memory search "database decision" -n 5

This prevents the AI from making things up or forgetting prior decisions.

During Work

| Trigger | Action |
|---|---|
| Completed a significant feature | memory learn "Built auth system with JWT" -c "Project X" |
| Made a non-obvious decision | memory decide "Using Redis for sessions" -r "Need sub-ms latency" |
| User corrected a mistake | memory correct "Assumed UTC" "User is in CET" -c "scheduling" |
| Created important documentation | memory index-file ./docs/architecture.md |

At Session End

  • Update the daily log (memory/YYYY-MM-DD.md) with key events
  • Consider what's worth adding to long-term MEMORY.md
  • Index any new important files

The Key Principle

If you might need to remember it next session, write it down NOW.

Mental notes don't survive context window resets. Files and vector memories do.

Memory Types and Their Uses

| Type | Purpose | Example |
|---|---|---|
| learning | Insights and lessons | "User prefers concise responses" |
| decision | Recorded choices with reasoning | "Chose React over Vue for better ecosystem" |
| correction | Mistakes and their fixes | "Don't assume timezone; always ask" |
| file | Indexed document chunks | Project docs, meeting notes |
| note | General information | Contact details, preferences |
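
Because the type is stored as metadata on every entry, retrieval can be scoped to a single category (the search_memory function already accepts a type filter). Assuming the -t/--type flag from the entry-point sketch in Step 2, recalling only past corrections looks like this:

# Only return memories stored with type "correction"
~/.openclaw/bin/memory search "timezone assumptions" -n 3 -t correction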

Why ChromaDB?

Several vector databases could work for this use case, but ChromaDB offers specific advantages for local AI agents:

  1. Local-first: Runs entirely on your machine, no API keys or external calls
  2. Built-in embeddings: Uses a local Sentence Transformers model by default—no OpenAI API needed (see the sketch after this list)
  3. Lightweight: Single Python package, minimal dependencies
  4. Persistent: Data survives restarts without explicit save calls
  5. Simple API: Add, query, and manage with minimal code
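
Point 2 deserves a closer look. ChromaDB's built-in default embeds documents with a local all-MiniLM-L6-v2 model, so nothing leaves your machine. If you prefer to pin that choice explicitly (or swap in a different model later), here is a sketch of a drop-in change to the collection setup in memory_service.py, assuming sentence-transformers is installed in the same virtual environment:

# Sketch: pin the local embedding model explicitly instead of relying on the default
# (assumes: pip install sentence-transformers)
from chromadb.utils import embedding_functions

embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # same model family as ChromaDB's built-in default
)

collection = client.get_or_create_collection(
    name=COLLECTION_NAME,
    embedding_function=embed_fn,
    metadata={"description": "Agent long-term semantic memory"},
)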

Results in Practice

After implementing this system, the AI agent can:

  • Recall context from weeks ago: "What did we decide about the database schema?"
  • Learn from corrections: Same mistake doesn't repeat after being corrected once
  • Build institutional knowledge: Preferences, patterns, and decisions accumulate
  • Search semantically: Find related memories even with different phrasing

The difference is transformative. The assistant goes from a stateless tool to something that feels like it actually knows you.

Conclusion

Persistent memory transforms AI assistants from stateless responders into genuine collaborators that learn and grow. With ChromaDB and a simple CLI wrapper, you can add this capability to any LLM-based agent in under an hour.

The key insight is that memory doesn't need to be complex. A vector database, some careful categorization of memory types, and clear instructions for when to store and retrieve—that's all it takes to give your AI assistant a functioning long-term memory.