Firebolt Memory Layer (FML)

An intelligent, persistent memory system for LLM agents using local Firebolt Core with HNSW vector search. Designed to give Cursor, Claude Code, Antigravity Codes, and other MCP-compatible tools long-term memory that persists across sessions. Includes setup instructions for Antigravity Codes, Claude Code, Google Gemini, and Cursor IDE.

🔒 Security-First Mindset: FML brings a security-first approach to AI IDE development, with built-in secret detection, credential protection, and privacy-focused architecture to keep sensitive data safe.

Credits

Core Author: John Tuite. The FML MCP server architecture and implementation are John's work. This project builds upon his foundational design for intelligent memory management in LLM agents.


What This Does

  • Working Memory: Session-scoped context that persists during a conversation
  • Long-Term Memory: Vector-indexed persistent storage with semantic search
  • Auto-Classification: Memories are automatically categorized (episodic, semantic, procedural, preference)
  • Semantic Recall: Find relevant memories based on meaning, not just keywords (see the sketch after this list)
  • 100% Local: Runs entirely on your machine using Firebolt Core + Ollama (no cloud dependencies)
  • Monitoring Dashboard: Built-in local UI for real-time monitoring, troubleshooting, and analytics; critical for ongoing heavy usage
  • Security-First: Built-in secret detection, credential protection, and privacy-focused architecture to keep sensitive data safe in AI IDE workflows
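
As a rough illustration of how semantic recall works under the hood: memories and queries are embedded as vectors and compared by similarity, so a query can match a memory that shares no keywords with it. Below is a minimal sketch using Ollama's embeddings API (assuming nomic-embed-text is already pulled); note that FML's actual recall runs inside Firebolt's HNSW index, not in Python:

# Embed two texts with Ollama's /api/embeddings endpoint and compare
# them by cosine similarity. Illustrative sketch only.
import json
import math
import urllib.request

def embed(text):
    payload = json.dumps({"model": "nomic-embed-text", "prompt": text}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["embedding"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# High similarity despite zero keyword overlap between the two strings.
print(cosine(embed("user prefers pytest"), embed("which test framework do I like?")))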

Getting Started

Prerequisites

Required services (must be running before setup):

  1. Firebolt Core - Local database server

    # Install: bash <(curl -s https://get-core.firebolt.io/)
    # Verify running:
    curl 'http://localhost:3473/?output_format=TabSeparated' -d "SELECT 1"
  2. Ollama - Local LLM for embeddings and classification

    # Install: brew install ollama
    # Start: ollama serve
    # Pull models (one per command):
    ollama pull llama3:8b
    ollama pull nomic-embed-text

Installation

git clone git@github.com:firebolt-db/firebolt-memory-layer.git
cd firebolt-memory-layer/fml/fml-server
./scripts/bootstrap.sh

The bootstrap script sets up everything automatically. Restart Cursor (Cmd+Q) after completion.

Cursor IDE

⚠️ Prerequisites Required: Before adding FML to Cursor, ensure Firebolt Core and Ollama are running (see Prerequisites above). The MCP server needs these services to function.

Add to Cursor

Note: Click the button above to open Cursor's MCP settings, then follow the steps below to configure FML.

Setup Steps:

  1. Run the bootstrap script (recommended):

    cd firebolt-memory-layer/fml/fml-server
    ./scripts/bootstrap.sh

    This automatically configures Cursor for you.

  2. Or manually configure by editing ~/.cursor/mcp.json (or via Cursor Settings > Tools & MCP):

    {
      "mcpServers": {
        "fml": {
          "command": "/FULL/PATH/TO/firebolt-memory-layer/fml/fml-server/.venv/bin/python",
          "args": ["-m", "src.server"],
          "cwd": "/FULL/PATH/TO/firebolt-memory-layer/fml/fml-server",
          "env": {
            "PYTHONPATH": "/FULL/PATH/TO/firebolt-memory-layer/fml/fml-server"
          }
        }
      }
    }

    Replace /FULL/PATH/TO/ with your actual repository path.

  3. Restart Cursor (Cmd+Q) to activate FML.

Other MCP Clients: See Platform Setup Guide for Claude Code, Google Gemini, and others.


Manual Setup Guide (Detailed)

If you prefer manual setup or the bootstrap script doesn't work for your environment:

Prerequisites

You need:

  • macOS (tested on macOS 15.x)
  • Docker Desktop installed and running
  • Python 3.10+ (3.14 recommended)
  • Homebrew for package management
  • Cursor IDE (or any MCP-compatible client)

Step 1: Install Dependencies

# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install Python (if needed)
brew install python@3.14

# Install Ollama (local LLM)
brew install ollama

# Start Ollama and pull required models
ollama serve  # Run in a separate terminal, keep it running
ollama pull llama3:8b        # For classification
ollama pull nomic-embed-text  # For embeddings (768 dimensions)

Step 2: Install Firebolt Core (Local Database)

Firebolt Core is a local version of Firebolt that runs in Docker.

# Install Firebolt Core using the official installer
bash <(curl -s https://get-core.firebolt.io/)

# Or if you have the manage script:
# cd firebolt-core-local && ./manage-firebolt.sh start

# Verify it's running (should return a response)
curl 'http://localhost:3473/?output_format=TabSeparated' -d "SELECT 1"

Firebolt Core runs at http://localhost:3473 by default.
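
If you want to script this check, a standard-library Python equivalent of the curl probe above (same endpoint, same query) looks like this:

# Health check for Firebolt Core, mirroring the curl probe above.
import urllib.request

req = urllib.request.Request(
    "http://localhost:3473/?output_format=TabSeparated",
    data=b"SELECT 1",  # the SQL is sent as the POST body, like curl -d
)
with urllib.request.urlopen(req, timeout=5) as resp:
    print(resp.read().decode().strip())  # expect "1" when Core is up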

Step 3: Clone and Set Up FML

# Clone this repository
git clone git@github.com:firebolt-db/firebolt-memory-layer.git
cd firebolt-memory-layer

# Create Python virtual environment
cd fml/fml-server
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -e ".[dev]"

Step 4: Configure Environment

# Copy example config
cp config/env.example .env

# Edit .env with your settings
nano .env

For local-only setup (Firebolt Core + Ollama), use:

# Firebolt Core (Local)
FIREBOLT_USE_CORE=true
FIREBOLT_CORE_URL=http://localhost:3473
FIREBOLT_DATABASE=fml_memory

# Ollama (Local LLM - runs at localhost:11434)
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
OLLAMA_EMBEDDING_DIMENSIONS=768

# Optional: OpenAI (only if you want to use OpenAI embeddings instead)
# OPENAI_API_KEY=your-key-here

Step 5: Initialize Database Schema

# Ensure virtual environment is active
source .venv/bin/activate

# Create database and tables in Firebolt Core
python scripts/migrate.py

This creates the following objects (a quick verification sketch follows the list):

  • fml_memory database
  • session_contexts table (working memory sessions)
  • working_memory_items table (active context)
  • long_term_memories table (persistent memories with vector index)
  • memory_access_log table (analytics)
  • idx_memories_embedding HNSW vector index (768 dimensions for Ollama)
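
To sanity-check the migration, you can list the tables it created. This is an illustrative snippet; depending on your Core configuration you may need to select the fml_memory database explicitly:

# List tables via information_schema and check the expected ones exist.
# Illustrative; you may need to target the fml_memory database explicitly.
import urllib.request

sql = b"SELECT table_name FROM information_schema.tables"
req = urllib.request.Request(
    "http://localhost:3473/?output_format=TabSeparated", data=sql
)
with urllib.request.urlopen(req, timeout=5) as resp:
    tables = set(resp.read().decode().split())

for expected in ("session_contexts", "working_memory_items",
                 "long_term_memories", "memory_access_log"):
    print(expected, "OK" if expected in tables else "MISSING")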

Step 6: Seed Core Memories (Recommended)

Pre-load essential memories for security rules, workflows, and troubleshooting:

# Seed core memories for optimal experience
python scripts/seed_core_memories.py

This creates foundational memories that help FML:

  • Enforce security rules consistently
  • Follow proper session workflows
  • Troubleshoot common issues
  • Understand FML architecture

Step 7: Test the Setup

# Test all connections
python scripts/test_connections.py

# Test the MCP tools
python scripts/test_tools.py

# Test security validation
python scripts/test_security.py

Step 8: Configure MCP Client to Use FML

FML works with any MCP-compatible client. See Platform Setup Guide for detailed instructions for:

  • Claude Code (Anthropic)
  • Google Gemini / Antigravity Codes
  • Cursor IDE (shown below)

For Cursor IDE

Create or edit ~/.cursor/mcp.json:

{
  "mcpServers": {
    "fml": {
      "command": "/path/to/firebolt-memory-layer/fml/fml-server/.venv/bin/python",
      "args": ["-m", "src.server"],
      "cwd": "/path/to/firebolt-memory-layer/fml/fml-server",
      "env": {
        "PYTHONPATH": "/path/to/firebolt-memory-layer/fml/fml-server"
      }
    }
  }
}

Important: Replace /path/to/ with the actual path where you cloned the repo.

Example for a typical setup:

{
  "mcpServers": {
    "fml": {
      "command": "/Users/YOUR_USERNAME/DevelopmentArea/firebolt-memory-layer/fml/fml-server/.venv/bin/python",
      "args": ["-m", "src.server"],
      "cwd": "/Users/YOUR_USERNAME/DevelopmentArea/firebolt-memory-layer/fml/fml-server",
      "env": {
        "PYTHONPATH": "/Users/YOUR_USERNAME/DevelopmentArea/firebolt-memory-layer/fml/fml-server"
      }
    }
  }
}

Configuration templates are available in the fml/fml-server/config/ directory.

Step 9: Add Global Rules (Cursor IDE Only)

Note: This step is specific to Cursor IDE. Other MCP clients (Claude Code, Gemini CLI) don't use separate rule files.

For Cursor IDE: Create ~/.cursor/rules/fml-memory.mdc with the content from cursor-rules/fml-memory.mdc in this repo. This tells all Cursor agents to use FML automatically.

Step 10: Restart Your MCP Client

After adding the MCP config and rules, restart your MCP client completely:

  • Cursor IDE: Quit completely (Cmd+Q on Mac, then reopen)
  • Claude Code: Quit the application completely and reopen
  • Google Gemini CLI: Restart the CLI (gemini command)
  • Other MCP clients: Restart the application completely

Verification

After setup, test in a new Cursor chat:

  1. The agent should automatically call init_session at the start
  2. The agent should call recall_memories based on your query
  3. You can explicitly ask: "What do you remember about me?" to test recall

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Cursor IDE (MCP Client)                                    │
└─────────────────────┬───────────────────────────────────────┘
                      │ MCP Protocol (stdio)
                      ▼
┌─────────────────────────────────────────────────────────────┐
│  FML MCP Server (Python)                                    │
│  ├── Working Memory Tools (5 tools)                         │
│  ├── Long-Term Memory Tools (5 tools)                       │
│  ├── Context Assembly Tools (2 tools)                       │
│  └── Stats/Analytics Tools (3 tools)                        │
└─────────────────────┬───────────────────────────────────────┘
                      │
          ┌──────────┴──────────┐
          ▼                     ▼
┌─────────────────┐   ┌─────────────────┐
│  Firebolt Core  │   │     Ollama      │
│  (localhost:    │   │  (localhost:    │
│   3473)         │   │   11434)        │
│  - SQL Storage  │   │  - Embeddings   │
│  - Vector Index │   │ - Classification│
└─────────────────┘   └─────────────────┘

MCP Tools Available

Working Memory (5 tools)

| Tool | Description |
| --- | --- |
| init_session | Initialize or resume a memory session |
| add_to_working_memory | Add item to working memory |
| get_working_memory | Retrieve current working memory state |
| update_working_memory_item | Update item properties (pinned, relevance) |
| clear_working_memory | Clear working memory for a session |

Long-Term Memory (5 tools)

| Tool | Description |
| --- | --- |
| store_memory | Store a memory with auto-classification |
| recall_memories | Semantic search for relevant memories |
| update_memory | Update an existing memory |
| forget_memory | Delete a memory (soft delete) |
| forget_all_user_memories | GDPR-compliant full deletion |

Context Assembly (2 tools)

| Tool | Description |
| --- | --- |
| get_relevant_context | Assemble optimal context from all sources |
| checkpoint_working_memory | Promote working memory to long-term |

Stats (3 tools)

| Tool | Description |
| --- | --- |
| get_fml_stats | Server statistics and metrics |
| get_recent_calls | Recent API call history |
| get_memory_analytics | Memory distribution analytics |
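
Under the hood, MCP clients invoke these tools as JSON-RPC tools/call requests over stdio. A hypothetical request for recall_memories might look like the following; the envelope is standard MCP, but the argument names here are illustrative rather than taken from the FML source:

# Hypothetical MCP "tools/call" request for recall_memories.
# The JSON-RPC envelope is standard MCP; argument names are illustrative.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "recall_memories",
        "arguments": {"query": "preferred test framework", "limit": 5},
    },
}
print(json.dumps(request))  # the MCP client writes this to the server's stdin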

Memory Categories

Memories are auto-classified into human-aligned categories:

| Category | Use For | Subtypes |
| --- | --- | --- |
| episodic | Events, decisions, outcomes | decision, outcome, interaction, milestone |
| semantic | Facts, knowledge, entities | entity, concept, relationship, domain |
| procedural | Workflows, patterns, how-to | workflow, pattern, command, troubleshooting |
| preference | User preferences, style | coding_style, communication, tool_preference, constraint |
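
For intuition, here are hypothetical examples of content and the category/subtype the classifier would plausibly assign (made-up illustrations, not actual classifier output):

# Hypothetical content-to-category examples; illustrations only.
examples = {
    ("episodic", "decision"): "We chose Firebolt Core over SQLite for vector search.",
    ("semantic", "entity"): "The FML server exposes 15 MCP tools.",
    ("procedural", "command"): "Run scripts/migrate.py to rebuild the schema.",
    ("preference", "coding_style"): "User prefers type-annotated Python.",
}
for (category, subtype), content in examples.items():
    print(f"{category}/{subtype}: {content}")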

Troubleshooting

FML not responding in your MCP client

  1. Check Firebolt Core is running: curl 'http://localhost:3473/?output_format=TabSeparated' -d "SELECT 1"
  2. Check Ollama is running: curl http://localhost:11434/api/tags
  3. Check MCP config path is correct:
    • Cursor IDE: ~/.cursor/mcp.json
    • Claude Code: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
    • Google Gemini CLI: ~/.gemini/settings.json
  4. Restart your MCP client completely:
    • Cursor IDE: Cmd+Q (Mac) or quit and reopen
    • Claude Code: Quit application completely and reopen
    • Google Gemini CLI: Restart the CLI
  5. Use the dashboard (if running): Check http://localhost:5174 for real-time error logs, connection status, and system health metrics

Dashboard showing incorrect/stale data

The HTTP API server (used by the dashboard) doesn't auto-reload when code changes. If you see:

  • Incorrect "All-Time Calls" counts (e.g., showing 4 instead of hundreds)
  • Missing data in sections (Recent LLM Calls empty, Memory Data Size showing 0)
  • Features not working that should be implemented

Solution: Check if the server needs a restart:

# Check server sync status
curl http://localhost:8082/api/version

# If "needs_restart": true, restart the server:
pkill -f "python.*http_api"
cd fml/fml-server
source .venv/bin/activate
PYTHONPATH=. python -m src.http_api &

The dashboard will also show a yellow warning banner when the server is running stale code, with the restart command included.

Why this happens: The http_api server is a separate Python process that loads code once at startup. Unlike the MCP server (which restarts with each MCP client session), the HTTP API runs continuously. After code changes, the running server still has the old code in memory until manually restarted.

"tuple index out of range" error

This usually means the database is empty or the vector index doesn't exist. Run:

python scripts/migrate.py

Vector dimension mismatch

Ensure you're using Ollama's nomic-embed-text (768 dimensions; a quick check is sketched below). If switching from OpenAI (1536 dimensions), you need to:

  1. Drop the existing vector index
  2. Recreate with dimension = 768
  3. Re-embed all existing memories
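
A quick way to confirm what dimension your Ollama model actually produces (this uses Ollama's public /api/embeddings endpoint; the expected value matches OLLAMA_EMBEDDING_DIMENSIONS above):

# Confirm nomic-embed-text returns 768-dimensional vectors.
import json
import urllib.request

payload = json.dumps({"model": "nomic-embed-text",
                      "prompt": "dimension check"}).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/embeddings",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=30) as resp:
    print(len(json.load(resp)["embedding"]))  # expect 768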

Transaction conflicts

Firebolt Core only allows one write transaction at a time. The FML server uses a mutex to serialize requests, but if you see transaction errors, wait a moment and retry.
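
If you script against Core directly, a retry loop with backoff is one way to cope with the single-writer constraint. A minimal sketch; the error handling is illustrative, so match it to your client's actual exceptions:

# Retry-with-backoff sketch for Firebolt Core's single-writer limit.
# Error handling is illustrative; adapt it to your client's exceptions.
import time
import urllib.error
import urllib.request

def run_sql(sql, retries=3):
    for attempt in range(retries):
        try:
            req = urllib.request.Request(
                "http://localhost:3473/?output_format=TabSeparated",
                data=sql.encode(),
            )
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.read().decode()
        except urllib.error.HTTPError:
            if attempt == retries - 1:
                raise
            time.sleep(0.5 * 2 ** attempt)  # back off before retrying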


Dashboard (Recommended for Heavy Usage)

FML includes a local monitoring dashboard: a React-based UI that provides real-time insights into your memory system. It is critical for ongoing heavy usage because it helps you:

  • Monitor performance: Track API call rates, response times, and system health
  • Troubleshoot issues: Identify errors, connection problems, and configuration issues
  • Analyze usage patterns: View memory distribution, access patterns, and user activity
  • Track memory growth: Monitor database size, memory counts, and storage usage
  • Debug problems: Inspect recent calls, error logs, and system status

Quick Start

# Start the HTTP API (required for dashboard)
cd fml/fml-server
source .venv/bin/activate
python -m src.http_api  # Runs on http://localhost:8082

# In another terminal, start the dashboard
cd fml/dashboard
npm install
npm run dev  # Runs on http://localhost:5174

Open http://localhost:5174 in your browser to access the dashboard.

The dashboard visualizes both the FML memory flow and the optional Firebolt MCP query flow (see below), giving you complete visibility into your memory system's operation.


Running with Firebolt MCP (Optional)

For full Firebolt capabilities alongside memory management, you can run both FML and the official Firebolt MCP server in parallel. This gives your agent:

  • FML: Persistent memory storage and semantic recall
  • Firebolt MCP: Direct SQL queries against Firebolt Core databases

The dashboard UI shows both data flows side-by-side, demonstrating how they work together through Firebolt Core.

Installing Firebolt MCP

The Firebolt MCP server is available via npm:

npm install -g @anthropic/firebolt-mcp
# Or use npx directly in the config

Combined MCP Configuration

To run both servers, update your ~/.cursor/mcp.json:

{
  "mcpServers": {
    "fml": {
      "command": "/path/to/firebolt-memory-layer/fml/fml-server/.venv/bin/python",
      "args": ["-m", "src.server"],
      "cwd": "/path/to/firebolt-memory-layer/fml/fml-server",
      "env": {
        "PYTHONPATH": "/path/to/firebolt-memory-layer/fml/fml-server"
      }
    },
    "firebolt": {
      "command": "npx",
      "args": ["-y", "@anthropic/firebolt-mcp"],
      "env": {
        "FIREBOLT_CORE_URL": "http://localhost:3473"
      }
    }
  }
}

How They Work Together

| Server | Purpose | Tools |
| --- | --- | --- |
| FML | Memory management | store_memory, recall_memories, get_relevant_context, etc. |
| Firebolt MCP | SQL queries | firebolt_query, firebolt_connect, firebolt_docs_search |

Example workflow:

  1. Agent uses FML to recall relevant memories about a project
  2. Agent uses Firebolt MCP to query actual data from your databases
  3. Agent uses FML to store new learnings from the analysis

Both servers connect to the same Firebolt Core instance, so your memory data and query data live in the same database engine.


Project Structure

firebolt-memory-layer/
├── fml/
│   ├── fml-server/              # Core MCP server (Python)
│   │   ├── src/
│   │   │   ├── server.py        # MCP server entry point
│   │   │   ├── config.py        # Configuration management
│   │   │   ├── db/              # Database client
│   │   │   ├── llm/             # Ollama/OpenAI integration
│   │   │   ├── memory/          # Memory taxonomy
│   │   │   └── tools/           # MCP tool implementations
│   │   ├── scripts/
│   │   │   ├── schema.sql       # Database schema
│   │   │   └── migrate.py       # Migration script
│   │   └── config/
│   │       └── env.example      # Example environment config
│   └── dashboard/               # React monitoring dashboard
├── cursor-rules/
│   └── fml-memory.mdc           # Global Cursor rules for FML
├── firebolt-core-local/         # Firebolt Core management (if present)
└── README.md                    # This file

Security

FML implements multiple layers of security to protect sensitive data and brings a security-first mindset to AI IDE development. This ensures that sensitive information like API keys, credentials, and secrets are never accidentally stored in memory or exposed.

Built-in Secret Detection

FML includes programmatic security validation that automatically blocks storage of sensitive data. When you attempt to store content containing secrets, the operation is rejected with a detailed error.

Detected patterns include:

| Category | Examples |
| --- | --- |
| API Keys | OpenAI (sk-*), GitHub (ghp_*), AWS (AKIA*), Google (AIza*), Anthropic, Slack, Stripe |
| Tokens | Bearer tokens, JWT tokens, Authorization headers |
| Passwords | Password assignments, database connection strings, passwords in URLs |
| Private Keys | RSA, PGP, and generic private key blocks |
| Secrets | Generic secret/token assignments, .env-style content |

Example rejection:

{
  "error": "SECURITY_VIOLATION",
  "message": "SECURITY BLOCK: Content contains 1 critical security violation(s). Detected: OpenAI API Key.",
  "hint": "Sensitive data like API keys, passwords, and tokens should not be stored in memory."
}
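
The detection itself amounts to regex pattern matching over content before it is stored. A simplified sketch of the idea follows; these patterns are illustrative, and the real rules live in the server's security validation code:

# Simplified secret-detection sketch; patterns are illustrative,
# not the actual rules shipped with FML.
import re

PATTERNS = {
    "OpenAI API Key": re.compile(r"\bsk-[A-Za-z0-9]{20,}"),
    "GitHub Token": re.compile(r"\bghp_[A-Za-z0-9]{36}"),
    "AWS Access Key": re.compile(r"\bAKIA[0-9A-Z]{16}"),
    "Private Key Block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def find_violations(content):
    return [name for name, pattern in PATTERNS.items() if pattern.search(content)]

print(find_violations("my key is sk-" + "a" * 24))  # ['OpenAI API Key']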

Security Best Practices

When using FML, follow these guidelines:

  1. Never store actual credentials - Store descriptions or references instead

    • "The API key is sk-abc123..."
    • "OpenAI API key is stored in .env as OPENAI_API_KEY"
  2. Use environment variables - All secrets should be in .env files

    • .env files are git-ignored by default
    • Use .env.example for templates with placeholder values
  3. Local-first architecture - When using Firebolt Core + Ollama:

    • No data leaves your machine
    • No cloud API calls required
    • Full control over your data

Pre-commit Hooks

The repository includes pre-commit hooks for additional security:

# Install pre-commit hooks
cd fml/fml-server
pip install pre-commit detect-secrets
pre-commit install

# Run security scan manually
pre-commit run --all-files

# Scan for secrets
detect-secrets scan .

Configured hooks:

  • detect-secrets - Scans for hardcoded credentials
  • detect-private-key - Prevents committing private keys
  • check-env-files - Blocks .env file commits

Running Security Tests

Verify the security validation is working:

cd fml/fml-server
source .venv/bin/activate
python scripts/test_security.py

Reporting Security Issues

If you discover a security vulnerability, please:

  1. Do not open a public issue
  2. Email the maintainer directly with details
  3. Allow time for a fix before public disclosure

License

Apache 2.0
