ReasonKit Mem
Memory & Retrieval Infrastructure for ReasonKit
The Long-Term Memory Layer ("Hippocampus") for AI Reasoning
ReasonKit Mem is the memory layer ("Hippocampus") for ReasonKit. It provides vector storage, hybrid search, RAPTOR trees, and embedding support.
Features
- Vector Storage - Qdrant-based dense vector storage with embedded mode
- Hybrid Search - Dense (Qdrant) + Sparse (Tantivy BM25) fusion
- RAPTOR Trees - Hierarchical retrieval for long-form QA
- Embeddings - Local (BGE-M3) and remote (OpenAI) embedding support
- Reranking - Cross-encoder reranking for precision
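The sparse half of hybrid search is classic BM25 ranking. As a point of reference, here is the standard BM25 scoring formula sketched in plain Rust; this illustrates the math Tantivy implements, not reasonkit-mem's API:

```rust
// Illustrative BM25 scoring. Parameter names and the tiny corpus below
// are made up for the example; only the formula itself is standard.
fn bm25_score(
    tf: f64,      // term frequency in the document
    df: f64,      // number of documents containing the term
    n_docs: f64,  // total documents in the corpus
    doc_len: f64, // length of this document (in terms)
    avg_len: f64, // average document length
) -> f64 {
    let k1 = 1.2; // term-frequency saturation
    let b = 0.75; // length normalization
    let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * doc_len / avg_len))
}

fn main() {
    // The query term appears more often in doc A, so A should outrank B.
    let a = bm25_score(3.0, 2.0, 10.0, 100.0, 120.0);
    let b = bm25_score(1.0, 2.0, 10.0, 150.0, 120.0);
    assert!(a > b);
    println!("doc A: {a:.3}, doc B: {b:.3}");
}
```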
Installation
Universal Installer (Recommended)
Installs all 4 ReasonKit projects together:
```shell
curl -fsSL https://get.reasonkit.sh | bash -s -- --with-memory
```
Platform & Shell Support:
- ✅ All platforms (Linux/macOS/Windows/WSL)
- ✅ All shells (Bash/Zsh/Fish/Nu/PowerShell/Elvish)
- ✅ Auto-detects shell and configures PATH
- ✅ Beautiful progress visualization
Cargo (Rust Library)
Add to your `Cargo.toml`:

```toml
[dependencies]
reasonkit-mem = "0.1"
tokio = { version = "1", features = ["full"] }
```
Usage
Basic Usage (Embedded Mode)
```rust
use reasonkit_mem::storage::Storage;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Create embedded storage (automatic file storage fallback)
    let storage = Storage::new_embedded().await?;

    // Use storage...

    Ok(())
}
```
Storage with Custom Configuration
```rust
use reasonkit_mem::storage::{Storage, EmbeddedStorageConfig};
use std::path::PathBuf;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Create storage with a custom file path
    let config = EmbeddedStorageConfig::file_only(PathBuf::from("./data"));
    let storage = Storage::new_embedded_with_config(config).await?;

    // Or use Qdrant (requires a running server)
    let qdrant_config = EmbeddedStorageConfig::with_qdrant(
        "http://localhost:6333",
        "my_collection",
        1536,
    );
    let qdrant_storage = Storage::new_embedded_with_config(qdrant_config).await?;

    Ok(())
}
```
Hybrid Search with KnowledgeBase
```rust
use reasonkit_mem::retrieval::KnowledgeBase;
use reasonkit_mem::{Document, DocumentType, Source, SourceType};
use chrono::Utc;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Create an in-memory knowledge base
    let kb = KnowledgeBase::in_memory()?;

    // Create a document
    let source = Source {
        source_type: SourceType::Local,
        url: None,
        path: Some("notes.md".to_string()),
        arxiv_id: None,
        github_repo: None,
        retrieved_at: Utc::now(),
        version: None,
    };
    let doc = Document::new(DocumentType::Note, source)
        .with_content("Machine learning is a subset of artificial intelligence.".to_string());

    // Add the document to the knowledge base
    kb.add(&doc).await?;

    // Search using sparse retrieval (BM25)
    let results = kb.retriever().search_sparse("machine learning", 5).await?;
    for result in results {
        println!("Score: {:.3}, Text: {}", result.score, result.text);
    }

    Ok(())
}
```
Using Embeddings
```rust
use reasonkit_mem::embedding::{EmbeddingPipeline, OpenAIEmbedding};
use reasonkit_mem::retrieval::KnowledgeBase;
use std::sync::Arc;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Create an OpenAI embedding provider (requires the OPENAI_API_KEY env var)
    let embedding_provider = OpenAIEmbedding::openai()?;
    let pipeline = Arc::new(EmbeddingPipeline::new(Arc::new(embedding_provider)));

    // Create a knowledge base with embedding support
    let kb = KnowledgeBase::in_memory()?
        .with_embedding_pipeline(pipeline);

    // Hybrid search now uses both dense (vector) and sparse (BM25) retrieval:
    // let results = kb.query("semantic search query", 10).await?;

    Ok(())
}
```
Embedded Mode Documentation
For detailed information about embedded mode, see `docs/EMBEDDED_MODE_GUIDE.md`.
Architecture
The RAPTOR Algorithm (Hierarchical Indexing)
ReasonKit Mem implements RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) to answer high-level questions across large document sets.
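The core idea can be sketched in a few lines: leaf chunks are grouped, each group is summarized, and the process repeats until a single root remains, so queries can match at any level of abstraction. In this toy sketch (not the crate's implementation), grouping is a fixed-size split and `summarize` is a stand-in for what would be semantic clustering and an LLM summarization call:

```rust
// Stand-in for an LLM summary: concatenate the group's members.
fn summarize(group: &[String]) -> String {
    group.join(" | ")
}

// Build levels bottom-up: levels[0] holds the leaf chunks,
// each higher level holds summaries, the last level is the root.
fn build_tree(mut level: Vec<String>, branching: usize) -> Vec<Vec<String>> {
    let mut levels = vec![level.clone()];
    while level.len() > 1 {
        level = level
            .chunks(branching)
            .map(|group| summarize(group))
            .collect();
        levels.push(level.clone());
    }
    levels
}

fn main() {
    let leaves: Vec<String> = (1..=8).map(|i| format!("chunk{i}")).collect();
    let levels = build_tree(leaves, 2);
    // 8 leaves -> 4 summaries -> 2 -> 1 root: four levels in total.
    assert_eq!(levels.len(), 4);
    assert_eq!(levels.last().unwrap().len(), 1);
    println!("{} levels built", levels.len());
}
```

Retrieval then searches across all levels at once, so a high-level question can hit a mid-tree summary instead of an individual leaf chunk.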

The Memory Dashboard


Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Qdrant | qdrant-client 1.10+ | Dense vector storage |
| Tantivy | tantivy 0.22+ | BM25 sparse search |
| RAPTOR | Custom Rust | Hierarchical retrieval |
| Embeddings | BGE-M3 / OpenAI | Dense representations |
| Reranking | Cross-encoder | Final precision boost |
Project Structure
```
reasonkit-mem/
├── src/
│   ├── storage/      # Qdrant vector + file-based storage
│   ├── embedding/    # Dense vector embeddings
│   ├── retrieval/    # Hybrid search, fusion, reranking
│   ├── raptor/       # RAPTOR hierarchical tree structure
│   ├── indexing/     # BM25/Tantivy sparse indexing
│   └── rag/          # RAG pipeline orchestration
├── benches/          # Performance benchmarks
├── examples/         # Usage examples
├── docs/             # Additional documentation
└── Cargo.toml
```
Feature Flags
| Feature | Description |
|---|---|
| `default` | Core functionality |
| `python` | Python bindings via PyO3 |
| `local-embeddings` | Local BGE-M3 embeddings via ONNX Runtime |
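As a sketch, a downstream crate would opt into the non-default features like this (feature names taken from the table above):

```toml
[dependencies]
reasonkit-mem = { version = "0.1", features = ["python", "local-embeddings"] }
```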
API Reference
Core Types (re-exported at crate root)
```rust
use reasonkit_mem::{
    // Documents
    Document, DocumentType, DocumentContent,
    // Chunks
    Chunk, EmbeddingIds,
    // Sources
    Source, SourceType,
    // Metadata
    Metadata, Author,
    // Search
    SearchResult, MatchSource, RetrievalConfig,
    // Processing
    ProcessingStatus, ProcessingState, ContentFormat,
    // Errors
    MemError, MemResult,
};
```
Storage Module
```rust
use reasonkit_mem::storage::{
    Storage,
    EmbeddedStorageConfig,
    StorageBackend,
    InMemoryStorage,
    FileStorage,
    QdrantStorage,
    AccessContext,
    AccessLevel,
};
```
Embedding Module
```rust
use reasonkit_mem::embedding::{
    EmbeddingProvider,  // Trait for embedding backends
    OpenAIEmbedding,    // OpenAI API embeddings
    EmbeddingConfig,    // Configuration
    EmbeddingPipeline,  // Batch processing pipeline
    EmbeddingResult,    // Single embedding result
    EmbeddingVector,    // Vec<f32> alias
    cosine_similarity,  // Utility function
    normalize_vector,   // Utility function
};
```
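The two utility functions implement standard vector math. A self-contained sketch of what they compute (plain Rust for illustration, not the crate's own code or exact signatures):

```rust
// Cosine similarity: dot product of the vectors divided by the
// product of their L2 norms; 1.0 for identical directions, 0.0 for
// orthogonal vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

// L2 normalization in place: after this, the vector has unit length,
// so dot products between normalized vectors equal cosine similarity.
fn normalize_vector(v: &mut [f32]) {
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn main() {
    assert!(cosine_similarity(&[1.0, 0.0], &[1.0, 0.0]) > 0.999);
    assert!(cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]).abs() < 1e-6);

    let mut v = [3.0f32, 4.0];
    normalize_vector(&mut v);
    assert!((v[0] - 0.6).abs() < 1e-6 && (v[1] - 0.8).abs() < 1e-6);
    println!("ok");
}
```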
Retrieval Module
```rust
use reasonkit_mem::retrieval::{
    HybridRetriever,  // Main retrieval engine
    KnowledgeBase,    // High-level API
    HybridResult,     // Search result
    RetrievalStats,   // Statistics
    // Fusion
    FusionEngine,
    FusionStrategy,
    // Reranking
    Reranker,
    RerankerConfig,
};
```
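The fusion step merges the ranked lists produced by dense and sparse retrieval into one. A common strategy for this is Reciprocal Rank Fusion (RRF); the crate's actual `FusionStrategy` variants may differ, but the idea looks like this generic sketch:

```rust
use std::collections::HashMap;

// Reciprocal Rank Fusion: each ranked list contributes 1/(k + rank)
// per document, so documents ranked well in either list float to the
// top without needing comparable raw scores. Generic sketch, not
// reasonkit-mem's FusionEngine.
fn rrf(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (rank, doc) in list.iter().enumerate() {
            *scores.entry((*doc).to_string()).or_insert(0.0) +=
                1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut out: Vec<_> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    let dense = vec!["d2", "d1", "d3"];  // vector-search ranking
    let sparse = vec!["d1", "d2", "d4"]; // BM25 ranking
    let fused = rrf(&[dense, sparse], 60.0);
    // d1 and d2 appear near the top of both lists, so one of them leads.
    assert!(fused[0].0 == "d1" || fused[0].0 == "d2");
    println!("{fused:?}");
}
```

RRF's one parameter `k` (often 60) damps the influence of top ranks; raw dense and sparse scores never need to be normalized against each other, which is why rank-based fusion is a popular default for hybrid search.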
Docsets (@Docs-style URL indexing)
reasonkit-mem can crawl documentation sites (start URL + subpages), normalize pages to Markdown, chunk, and index them for retrieval. This is exposed via the optional `docset` feature and the `rk-mem` CLI.

Notes:
- Refresh uses conditional HTTP (`ETag`/`Last-Modified`) when available to skip unchanged pages (`304 Not Modified`) and speed up weekly updates.
- Per-URL refresh metadata is stored in `RKMEM_DATA_DIR/docsets/<DOCSET_ID>.json` (deleting it is safe; it only reduces refresh efficiency until rebuilt).
```shell
# Build/install the CLI (docset feature)
cargo install --path reasonkit-mem --features docset

# Add and refresh a docset
rk-mem docs add react https://react.dev/reference/
rk-mem docs refresh --due

# Query (agent-friendly JSON)
rk-mem docs query "useState" --docset react --top-k 5 --json
```
MCP server (agents)
Run a Rust MCP server over stdio so agents can call `rkmem_docs_query` directly:

```shell
cargo install --path reasonkit-mem --features docset
rk-mem-docs-mcp
```
Example MCP config:

```json
{
  "mcpServers": {
    "rkmem_docs": {
      "command": "rk-mem-docs-mcp",
      "args": [],
      "env": {
        "RKMEM_DATA_DIR": "/home/user/.local/share/reasonkit/mem",
        "RKMEM_MCP_ALLOW_WRITE": "0"
      }
    }
  }
}
```
Write operations (`rkmem_docs_add` / `rkmem_docs_refresh` / `rkmem_docs_remove`) are disabled unless `RKMEM_MCP_ALLOW_WRITE=1`.
Weekly refresh without reasonkit-org (systemd user timer)
```shell
mkdir -p ~/.config/systemd/user
cp reasonkit-mem/systemd/user/rkmem-docs-refresh.* ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now rkmem-docs-refresh.timer
```

Use `RKMEM_DATA_DIR` to control where docsets and indexes are stored.
Version & Maturity
| Component | Status | Notes |
|---|---|---|
| Vector Storage | ✅ Stable | Qdrant integration production-ready |
| Hybrid Search | ✅ Stable | Dense + Sparse fusion working |
| RAPTOR Trees | ✅ Stable | Hierarchical retrieval implemented |
| Embeddings | ✅ Stable | OpenAI API fully supported |
| Local Embeddings | 🔶 Beta | BGE-M3 ONNX (enable with local-embeddings feature) |
| Python Bindings | 🔶 Beta | Build from source with --features python |
Current Version: v0.1.2 | CHANGELOG | Releases
Verify Installation
```rust
use reasonkit_mem::storage::Storage;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Quick verification: creates embedded storage
    let _storage = Storage::new_embedded().await?;
    println!("ReasonKit Mem initialized successfully!");
    Ok(())
}
```
License
Apache License 2.0 - see LICENSE