#semantic-search #knowledge-graph #database #cognitive

bin+lib kotadb

A custom database for distributed human-AI cognition

4 releases (breaking)

0.5.0 Aug 15, 2025
0.4.0 Aug 15, 2025
0.3.1 Aug 14, 2025
0.2.1 Aug 13, 2025

#68 in Database implementations

MIT license

1.5MB
28K SLoC

Rust 15K SLoC // 0.1% comments
TypeScript 7K SLoC // 0.1% comments
Python 4K SLoC // 0.2% comments
Shell 1.5K SLoC // 0.1% comments
JavaScript 337 SLoC // 0.3% comments
Just 143 SLoC // 0.3% comments

KotaDB

A custom database for distributed human-AI cognition, built entirely by LLM agents.

🚀 Quick Start - Choose Your Language

Python

pip install kotadb-client

TypeScript/JavaScript

npm install kotadb-client

Rust

cargo add kotadb

Go (Coming Soon)

🚧 Work in Progress - The Go client is currently under development. See #114 for progress.

# Will be available soon at:
# go get github.com/jayminwest/kota-db/clients/go

⚡ 60-Second Quick Start

Get from zero to first query in under 60 seconds:

Option 1: Docker (Easiest)

# One command to start everything
docker-compose -f docker-compose.quickstart.yml up -d

# Run Python demo (shows all features)
docker-compose -f docker-compose.quickstart.yml --profile demo up python-demo

Option 2: Shell Script (Local Install)

# One-liner installation and demo
curl -sSL https://raw.githubusercontent.com/jayminwest/kota-db/main/quickstart/install.sh | bash

Option 3: Manual Setup

# Start server
docker run -p 8080:8080 ghcr.io/jayminwest/kota-db:latest serve

# Install client and try it
pip install kotadb-client
# A quoted heredoc keeps the shell from interpreting `!` and quotes inside the script
python - <<'EOF'
from kotadb import KotaDB, DocumentBuilder
db = KotaDB('http://localhost:8080')
doc_id = db.insert_with_builder(
    DocumentBuilder()
    .path('/hello.md')
    .title('Hello KotaDB!')
    .content('My first document')
)
print(f'Created document: {doc_id}')
results = db.query('hello')
print(f'Found {len(results.get("documents", []))} documents')
EOF

🎉 That's it! You're now running KotaDB with type-safe client libraries.

KotaDB combines document storage, graph relationships, and semantic search
into a unified system designed for the way humans and AI think together.

Performance

Real-world benchmarks on Apple Silicon:

Operation         Latency   Throughput
B+ Tree Search    489 µs    2,000 queries/sec
Trigram Search    <10 ms    100+ queries/sec
Document Insert   277 µs    3,600 ops/sec
Bulk Operations   20 ms     50,000 ops/sec

10,000 document dataset, Apple Silicon M-series


🎯 Complete Examples

Production-ready applications demonstrating real-world usage:

🌐 Flask Web App

Complete web application with REST API and UI

cd examples/flask-web-app && pip install -r requirements.txt && python app.py
# Visit http://localhost:5000

📝 Note-Taking App

Advanced document management with folders and tags

cd examples/note-taking-app && pip install -r requirements.txt && python note_app.py
# Visit http://localhost:5001  

🧠 RAG Pipeline

AI-powered question answering with document retrieval

cd examples/rag-pipeline && pip install -r requirements.txt && python rag_demo.py
# Requires OPENAI_API_KEY for best results

⚡ Quick Examples

# Python type-safe usage
from kotadb import KotaDB, DocumentBuilder, ValidatedPath

db = KotaDB("http://localhost:8080")
doc_id = db.insert_with_builder(
    DocumentBuilder()
    .path(ValidatedPath("/notes/meeting.md"))
    .title("Team Meeting")
    .content("Discussion about project timeline...")
    .add_tag("meeting")
    .add_tag("important")
)

# Advanced search with filters
from kotadb import QueryBuilder
results = db.query_with_builder(
    QueryBuilder()
    .text("project timeline") 
    .tag_filter("meeting")
    .limit(10)
)

🦀 Rust (Full Feature Access)

# Clone and build
git clone https://github.com/jayminwest/kota-db.git
cd kota-db && cargo build --release

# Start server
cargo run --bin kotadb -- serve

# CLI operations  
cargo run --bin kotadb -- insert /docs/rust.md "Rust Guide" "Ownership concepts..."
cargo run --bin kotadb -- search "ownership"  # Full-text search
cargo run --bin kotadb -- search "*"          # List all documents  
cargo run --bin kotadb -- stats              # Database statistics

Development Commands

just dev              # Auto-reload development server
just test             # Run comprehensive test suite
just check            # Format, lint, and test everything
just bench            # Performance benchmarks
just release-preview  # Preview next release

Architecture

┌──────────────────────────────────────────────────────────────┐
│                    Query Interface                           │
│              Natural Language + Structured                   │
├──────────────────────────────────────────────────────────────┤
│                    Query Router                              │
│         Automatic index selection based on query             │
├──────────────┬───────────────┬───────────────┬───────────────┤
│   Primary    │   Full-Text   │     Graph     │   Semantic    │
│   B+ Tree    │    Trigram    │  (Planned)    │     HNSW      │
├──────────────┴───────────────┴───────────────┴───────────────┤
│                    Storage Engine                            │
│        Pages + WAL + Compression + Memory Map                │
└──────────────────────────────────────────────────────────────┘
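
The routing layer in the diagram can be pictured as a small dispatch function. The sketch below is illustrative only: the `Index` enum, the `route_query` name, and the routing rules themselves are assumptions made for this example and are not taken from KotaDB's source.

```python
from enum import Enum


class Index(Enum):
    PRIMARY = "b+tree"      # path lookups and full listings
    FULL_TEXT = "trigram"   # free-text search
    GRAPH = "graph"         # relationship traversal (planned)
    SEMANTIC = "hnsw"       # vector similarity


def route_query(query):
    """Pick an index for a query (hypothetical rules, for illustration)."""
    if isinstance(query, dict):
        # Structured queries carry enough shape to choose an index directly.
        if query.get("type") == "semantic":
            return Index.SEMANTIC
        if "start" in query and "follow" in query:
            return Index.GRAPH
        return Index.FULL_TEXT
    text = query.strip()
    if text == "*" or text.startswith("/"):
        # Wildcards and explicit paths go to the primary index.
        return Index.PRIMARY
    return Index.FULL_TEXT


for q in ["*", "/docs/rust.md", "ownership", {"type": "semantic", "query": "distributed systems"}]:
    print(q, "->", route_query(q).value)
```

The point of the design is that callers issue one kind of request and never name an index; the choice is made from the shape of the query alone.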

Core Features

Storage

  • Native Format: Markdown files with YAML frontmatter
  • Git Compatible: Human-readable, diff-friendly
  • Crash-Safe: WAL ensures data durability
  • Zero Database Dependencies: No external database required
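
To make the native format concrete, here is a minimal sketch of splitting a Markdown file into its frontmatter and body using only the Python standard library. The field names (`title`, `tags`) and the `split_frontmatter` helper are assumptions for illustration; the sketch handles only flat `key: value` pairs and inline lists, and it is not KotaDB's parser.

```python
def split_frontmatter(text):
    """Split a Markdown document into (metadata, body).

    Handles flat `key: value` pairs and `[a, b]` inline lists only;
    anything richer needs a real YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # no closing delimiter: treat as plain Markdown
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key/value pairs
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


doc = """---
title: Team Meeting
tags: [meeting, important]
---
# Team Meeting

Discussion about project timeline...
"""
meta, body = split_frontmatter(doc)
print(meta)
print(body)
```

Because both halves stay plain text, a change to a tag or a paragraph shows up as an ordinary line diff in Git.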

Indexing

  • B+ Tree: O(log n) path-based lookups
  • Trigram: Fuzzy-tolerant full-text search
  • Graph: Relationship traversal (MCP tools only, not fully implemented)
  • Vector: Semantic similarity with HNSW
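
The "fuzzy-tolerant" property of a trigram index follows from how the text is tokenized: a single typo changes only a few three-character windows, so most of them still match. The sketch below shows the general technique with an in-memory inverted index. The `TrigramIndex` class and the 0.5 overlap threshold are assumptions for illustration, not KotaDB's implementation or ranking.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of 3-character substrings of the lowercased text."""
    s = text.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


class TrigramIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> ids of docs containing it

    def add(self, doc_id, text):
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, query, min_overlap=0.5):
        """Rank docs by the fraction of query trigrams they contain."""
        q = trigrams(query)
        if not q:
            return []
        hits = defaultdict(int)
        for gram in q:
            for doc_id in self.postings.get(gram, ()):
                hits[doc_id] += 1
        scored = [(n / len(q), d) for d, n in hits.items() if n / len(q) >= min_overlap]
        # Best score first; ties broken by document id for stable output.
        return [d for score, d in sorted(scored, key=lambda p: (-p[0], p[1]))]


index = TrigramIndex()
index.add("/docs/rust.md", "Ownership concepts in Rust")
index.add("/notes/meeting.md", "Discussion about project timeline")
print(index.search("ownership"))  # exact term
print(index.search("ownrship"))   # a typo still shares most trigrams
```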

Safety

  • Systematic Testing: 6-stage risk reduction methodology
  • Type Safety: Validated types (Rust compile-time, Python/TypeScript runtime)
  • Observability: Distributed tracing on every operation (Rust only)
  • Resilience: Automatic retries with exponential backoff (all client libraries)
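
As a rough picture of retries with exponential backoff, the sketch below doubles the wait ceiling after each failed attempt and adds random jitter. The `retry_with_backoff` helper, its parameters, and its default values are assumptions for this example; the client libraries' actual retry settings are not documented here.

```python
import random
import time


def retry_with_backoff(op, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call op(); on a retryable error wait up to base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return op()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads out clients that failed at the same moment.
            sleep(random.uniform(0, delay))


attempts = {"n": 0}


def flaky():
    """Fail twice with a connection error, then succeed."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("server not ready")
    return "ok"


print(retry_with_backoff(flaky), "after", attempts["n"], "attempts")
```

Only errors listed in `retry_on` are retried, so a validation failure or other non-transient error surfaces immediately instead of being repeated.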

Code Examples

Rust (Full Feature Access)

use anyhow::Result; // assumption: the original snippet leaves `Result` unimported
use kotadb::{create_file_storage, DocumentBuilder};

#[tokio::main]
async fn main() -> Result<()> {
    // Production-ready storage with all safety features
    let mut storage = create_file_storage("~/.kota/db", Some(1000)).await?;
    
    // Type-safe document construction
    let doc = DocumentBuilder::new()
        .path("/knowledge/rust-patterns.md")?
        .title("Advanced Rust Design Patterns")?
        .content(b"# Advanced Rust Patterns\n\n...")?
        .build()?;
    
    // Automatically traced, validated, cached, with retries
    storage.insert(doc).await?;
    
    Ok(())
}

Python (Client Library)

from kotadb import KotaDB, DocumentBuilder, QueryBuilder, ValidatedPath

# Connect to KotaDB server
db = KotaDB("http://localhost:8080")

# Type-safe document construction (runtime validation)
doc_id = db.insert_with_builder(
    DocumentBuilder()
    .path(ValidatedPath("/knowledge/python-patterns.md"))
    .title("Python Design Patterns")
    .content("# Python Patterns\n\n...")
    .add_tag("python")
    .add_tag("patterns")
)

# Query with builder pattern
results = db.query_with_builder(
    QueryBuilder()
    .text("design patterns")
    .limit(10)
    .tag_filter("python")
)

TypeScript (Client Library)

import { KotaDB, DocumentBuilder, QueryBuilder, ValidatedPath } from 'kotadb-client';

// Connect to KotaDB server
const db = new KotaDB({ url: 'http://localhost:8080' });

// Type-safe document construction (runtime validation)
const docId = await db.insertWithBuilder(
  new DocumentBuilder()
    .path("/knowledge/typescript-patterns.md")
    .title("TypeScript Design Patterns")
    .content("# TypeScript Patterns\n\n...")
    .addTag("typescript")
    .addTag("patterns")
);

// Query with builder pattern and full IntelliSense support
const results = await db.queryWithBuilder(
  new QueryBuilder()
    .text("design patterns")
    .limit(10)
    .tagFilter("typescript")
);

Query Language

Natural, intuitive queries designed for human-AI interaction:

// Natural language
"meetings about rust programming last week"

// Structured precision
{
  type: "semantic",
  query: "distributed systems",
  filter: { tags: { $contains: "architecture" } },
  limit: 10
}

// Graph traversal
GRAPH {
  start: "projects/kota-ai/README.md",
  follow: ["related", "references"],
  depth: 2
}
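
Since graph queries are still planned, the semantics of `start`, `follow`, and `depth` can only be guessed at. One plausible reading is a breadth-first walk that follows the named edge types for at most `depth` hops. The sketch below implements that reading over an in-memory adjacency map; the `traverse` function, the edge data, and every path other than the start document are hypothetical.

```python
from collections import deque


def traverse(edges, start, follow, depth):
    """Breadth-first walk from `start` over edge types in `follow`, up to `depth` hops.

    `edges` maps a document path to a list of (edge_type, target_path) pairs.
    Returns {path: hops_from_start} for every reachable document except `start`.
    """
    allowed = set(follow)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand beyond the hop limit
        for edge_type, target in edges.get(node, []):
            if edge_type in allowed and target not in seen:
                seen[target] = seen[node] + 1
                queue.append(target)
    del seen[start]
    return seen


# Hypothetical relationship data for illustration.
edges = {
    "projects/kota-ai/README.md": [("related", "docs/architecture.md"),
                                   ("authored_by", "people/alice.md")],
    "docs/architecture.md": [("references", "docs/storage.md")],
    "docs/storage.md": [("references", "docs/wal.md")],
}
reachable = traverse(edges, "projects/kota-ai/README.md", ["related", "references"], depth=2)
print(reachable)
```

Under this reading, `docs/wal.md` is excluded because it is three hops away, and `people/alice.md` is excluded because `authored_by` is not in the `follow` list.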

Project Status

Complete

  • Storage engine with WAL and compression
  • B+ tree primary index with persistence
  • Trigram full-text search with ranking
  • Intelligent query routing
  • CLI interface
  • Performance benchmarks

In Progress

  • Model Context Protocol (MCP) server
  • Python/TypeScript client libraries
  • Semantic vector search
  • Graph relationship queries

Documentation

Architecture • API Reference • Development Guide • Agent Guide


Installation

Client Libraries

Python

pip install kotadb-client

TypeScript/JavaScript

npm install kotadb-client
# or
yarn add kotadb-client

Go (Coming Soon)

# Go client is currently under development
# See https://github.com/jayminwest/kota-db/issues/114
# Will be available at: github.com/jayminwest/kota-db/clients/go

Server Installation

As a CLI Tool

cargo install kotadb
# or from source:
cargo install --path .

kotadb serve                    # Start HTTP server
kotadb insert /path "Title" "Content"  # Insert document
kotadb search "query"           # Search documents

As a Rust Library

[dependencies]
kotadb = "0.5.0"
# or from git (use this instead of the line above, not alongside it):
# kotadb = { git = "https://github.com/jayminwest/kota-db" }

Docker

# Using pre-built image (recommended)
docker pull ghcr.io/jayminwest/kota-db:latest
docker run -p 8080:8080 ghcr.io/jayminwest/kota-db:latest serve

# Or build from source
docker build -t kotadb .
docker run -p 8080:8080 kotadb serve

Language Support Matrix

Feature                  Rust   Python   TypeScript   Go

Basic Operations
Document CRUD            ✅     ✅       ✅           ❌
Text Search              ✅     ✅       ✅           ❌
Semantic Search          ✅     ✅       ✅           ❌
Hybrid Search            ✅     ✅       ✅           ❌

Type Safety
Validated Types          ✅     ✅       ✅           ❌
Builder Patterns         ✅     ✅       ✅           ❌

Advanced Features
Query Routing            ✅     ❌*      ❌*          ❌*
Graph Queries            🚧     ❌       ❌           ❌
Direct Storage Access    ✅     ❌       ❌           ❌
Observability/Tracing    ✅     ❌       ❌           ❌

Development
Connection Pooling       ✅     ✅       ✅           ❌
Retry Logic              ✅     ✅       ✅           ❌
Error Handling           ✅     ✅       ✅           ❌

Legend: ✅ Complete • 🚧 In Progress • ❌ Not Available

*Query routing happens automatically on the server for client libraries


Benchmarks Detail

Apple M2 Ultra (192GB RAM)

Operation         Size     Latency   Throughput
BTree Insert      100      15.8 µs   63,300 ops/sec
BTree Insert      1,000    325 µs    3,080 ops/sec
BTree Insert      10,000   4.77 ms   210 ops/sec
BTree Search      100      2.08 µs   482,000 queries/sec
BTree Search      1,000    33.2 µs   30,100 queries/sec
BTree Search      10,000   546 µs    1,830 queries/sec
Bulk Operations   1,000    25.4 ms   39,400 ops/sec
Bulk Operations   5,000    23.7 ms   211,000 ops/sec
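
When comparing rows, it helps to check how the throughput column relates to the latency column. Working backwards from the published figures, the BTree rows are consistent with 1 / latency, while the Bulk Operations rows are consistent with size / latency. This is an observation from the numbers in the table, not a statement about how the benchmark harness reports results; `per_second` is a helper written for this check.

```python
def per_second(latency_seconds, items=1):
    """Items completed per second if `items` take `latency_seconds` in total."""
    return items / latency_seconds


US, MS = 1e-6, 1e-3

# BTree rows: the throughput column matches 1 / latency.
print(round(per_second(15.8 * US)))          # BTree Insert, size 100
print(round(per_second(546 * US)))           # BTree Search, size 10,000

# Bulk rows: the throughput column matches size / latency.
print(round(per_second(25.4 * MS, 1_000)))   # Bulk Operations, size 1,000
print(round(per_second(23.7 * MS, 5_000)))   # Bulk Operations, size 5,000
```

The practical consequence is that the BTree and Bulk throughput figures are not directly comparable without first agreeing on what one "op" means in each row.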

Contributing

This project is developed entirely by LLM agents. Human contributions follow the same process:

  1. Open an issue describing the change
  2. Agents will review and implement
  3. Changes are validated through comprehensive testing
  4. Documentation is automatically updated

See AGENT.md for the agent collaboration protocol.


License

MIT - See LICENSE for details.


Built for KOTA • Inspired by LevelDB, Tantivy, and FAISS

The best database is the one designed specifically for your problem.

Dependencies

~25–50MB
~811K SLoC