13 releases (7 breaking)

0.10.1 Aug 12, 2024
0.9.0 Aug 12, 2024
0.5.1 Jul 30, 2024

#773 in Database interfaces

MIT license

98KB
2.5K SLoC

dRAGon

dRAGon is an embedded vector database written in Rust, with helper functions for Retrieval-Augmented Generation (RAG). It stores and queries vector embeddings for natural language processing and machine learning applications.

Features

  • 🚀 Efficient vector storage and retrieval using PostgreSQL with pgvector
  • 📄 Support for multiple file formats (PDF, TXT, DOCX, CSV, Markdown)
  • 🔍 Similarity search functionality
  • 📦 Batch operations for adding and updating vectors
  • 🧠 Customizable embedding model and inference via fastembed-rs
  • 🌐 RESTful API for easy integration
  • 🔄 Hybrid search combining full-text and vector similarity
  • 🔒 Embedded PostgreSQL for easy setup and deployment

Installation

Prerequisites

  • Rust
  • protoc (Protocol Buffers Compiler)

Installing protoc

macOS

brew install protobuf

Linux (Debian/Ubuntu)

sudo apt-get install protobuf-compiler

Windows

choco install protobuf

or

scoop install protobuf

Building from Source

  1. Clone the repository:

    git clone https://github.com/portalcorp/dRAGon.git
    cd dRAGon
    
  2. Build the project:

    cargo build --release
    
  3. Run the server (pass --release to run the optimized build from step 2):

    cargo run
    

Usage

As a Library

dRAGon can be used as a library in your Rust projects. Here's how you can import and use it:

  1. Add dRAGon to your project:

    cargo add dragon_db

  2. In your Rust code, import and use dRAGon:

    use dragon_db::start_server;

    #[rocket::launch]
    pub async fn run() -> _ {
        start_server().await
    }

This code snippet demonstrates how to start the dRAGon server from your own package. It uses the start_server() function from the dRAGon library to configure and launch the server.

API Usage

Once the server is running, you can interact with it using HTTP requests.

API Endpoints

For detailed API documentation, refer to the docs UI available at http://localhost:8000/docs when running the server.

Configuration

The dRAGon server can be configured using environment variables:

  • ROCKET_PORT: The port on which the server will listen (default: 8000)
  • BASE_PATH: The base path under which the server's routes are mounted (default: "/")
  • DB_PATH: The path to store the database files (default: "data/lance")
  • COLLECTION_NAME: The name of the default collection (default: "vectors")
  • EMBEDDING_MODEL: The embedding model to use (default: "BGESmallENV15")
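
The variables above and their defaults can be mirrored in a wrapper binary, for example to log the effective configuration before launching the server. The following is a minimal sketch using only the standard library. The `Settings` struct and the `load_settings` and `load_settings_from` functions are hypothetical names introduced for illustration; they are not part of the dragon_db API, and dRAGon's own parsing of these variables may differ.

```rust
use std::env;

/// Hypothetical container for the documented settings (not a dragon_db type).
#[derive(Debug, PartialEq)]
struct Settings {
    port: u16,
    base_path: String,
    db_path: String,
    collection_name: String,
    embedding_model: String,
}

/// Builds `Settings` from any lookup function, falling back to the
/// defaults listed in this README when a variable is unset.
fn load_settings_from(get: impl Fn(&str) -> Option<String>) -> Settings {
    let var_or = |key: &str, default: &str| get(key).unwrap_or_else(|| default.to_string());
    Settings {
        // An unset or unparsable port falls back to 8000 (a choice made in this sketch).
        port: get("ROCKET_PORT")
            .and_then(|v| v.parse::<u16>().ok())
            .unwrap_or(8000),
        base_path: var_or("BASE_PATH", "/"),
        db_path: var_or("DB_PATH", "data/lance"),
        collection_name: var_or("COLLECTION_NAME", "vectors"),
        embedding_model: var_or("EMBEDDING_MODEL", "BGESmallENV15"),
    }
}

/// Reads the settings from the process environment.
fn load_settings() -> Settings {
    load_settings_from(|key| env::var(key).ok())
}

fn main() {
    let settings = load_settings();
    println!("{:?}", settings);
}
```

Taking the lookup as a closure keeps the defaults testable without touching the real process environment.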

Development

To set up the development environment:

  1. Install Rust and Cargo: https://www.rust-lang.org/tools/install
  2. Clone the repository and navigate to the project directory
  3. Fetch and compile dependencies: cargo build
  4. Run the development server: cargo run

Testing

To run the test suite:

cargo test -- --test-threads=1

The --test-threads=1 flag runs the tests serially. Parallel database tests would each create a temporary database and increase memory usage.

License

dRAGon is released under the MIT License. See the LICENSE file for details.

Dependencies

~89–125MB
~2M SLoC