🚧 Under Construction

Reservoir is in active development. It's not ready for production use yet. Expect breaking changes.

Reservoir

What is Reservoir?

Reservoir is your helpful memory for AI conversations. It sits between your app and the OpenAI Chat Completions API, making it easier to have rich, ongoing conversations with your favorite language models.

Why does this matter?

When you use the OpenAI Chat Completions API, you need to send the full conversation history with every request. For example:

[
  {"role": "user", "content": "What is 1 + 1?"},
  {"role": "assistant", "content": "2"},
  {"role": "user", "content": "What is the answer times 3?"}
]

If you only send the last question, the model won't know what "the answer" refers to. You have to keep track of all previous messages and include them every time.

This can get tricky as conversations grow!

Reservoir acts as a smart proxy: it automatically stores your chat history and inserts the right context into each request. You just talk to the API as usual; Reservoir handles the memory and context, and even finds other relevant messages from your past conversations to help the model give better answers.

  • No more manual history management
  • Automatic context enrichment
  • Your data stays private and local
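
For example, instead of resending the whole history, you could send only the newest message and let Reservoir supply the earlier exchange. A minimal sketch, assuming the requests library and the endpoint format described under Quick Start (the partition and instance names are illustrative):

import os
import requests

# Reservoir endpoint (see Quick Start); partition/instance names are illustrative.
URL = f"http://localhost:3017/partition/{os.getenv('USER')}/instance/my-app/v1/chat/completions"

# Only the newest message is sent; Reservoir injects the earlier
# "What is 1 + 1?" exchange from its stored history.
response = requests.post(
    URL,
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "What is the answer times 3?"}],
    },
)
print(response.json()["choices"][0]["message"]["content"])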

Use Reservoir with Multiple Apps

You can point multiple apps or clients to a single Reservoir instance. This means you can keep context and history across different tools on your computer—like your terminal, a web app, or a chat client. If you want to keep conversations separate, you can use Reservoir's partitioning feature to organize chats by app, project, or any context you choose.
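A minimal sketch of this setup, assuming the openai client library, that tools pointed at the same partition share history, and that a different partition keeps history separate (all names are illustrative):

import os
from openai import OpenAI

API_KEY = os.environ.get("OPENAI_API_KEY")

# Two tools sharing one partition: context carries across both.
terminal = OpenAI(
    base_url="http://localhost:3017/partition/alice/instance/terminal/v1",
    api_key=API_KEY,
)
webapp = OpenAI(
    base_url="http://localhost:3017/partition/alice/instance/webapp/v1",
    api_key=API_KEY,
)

# A separate partition keeps this project's conversations isolated.
side_project = OpenAI(
    base_url="http://localhost:3017/partition/side-project/instance/cli/v1",
    api_key=API_KEY,
)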

Why Use Reservoir?

  • Own your AI history: All your conversations are stored locally, never in the cloud.
  • Search and recall: Instantly find previous chats, ideas, or code snippets from your AI interactions.
  • Enrich context: Automatically inject relevant history into new prompts for more coherent, personalized responses.
  • Visualize conversations: See how your discussions branch and connect over time.
  • Stay private: Your data never leaves your device.

Screenshot

Reservoir lets you have conversations with multiple AI models and providers, all while keeping your data private and local. Every interaction is stored on your device, building a personal knowledge base that never leaves your network. A single thread of conversation can span multiple models without losing context, allowing you to seamlessly switch between different AI providers while maintaining the flow of your discussion.
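
For instance, a thread could start on an OpenAI model and continue on a local Ollama model. A minimal sketch, assuming both providers are configured behind Reservoir and that the model names shown are available on your setup:

import os
from openai import OpenAI

client = OpenAI(
    base_url=f"http://localhost:3017/partition/{os.getenv('USER')}/instance/reservoir/v1",
    api_key=os.environ.get("OPENAI_API_KEY"),
)

# First turn on an OpenAI model.
client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet in one line."}],
)

# Follow-up on a different model; Reservoir carries the context across.
followup = client.chat.completions.create(
    model="llama3",  # illustrative Ollama model name
    messages=[{"role": "user", "content": "Now retell it as a limerick."}],
)
print(followup.choices[0].message.content)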

Overview

Reservoir intercepts your API calls, enriches them with relevant history, manages token limits, and then forwards them to the actual LLM service.

sequenceDiagram
    participant App
    participant Reservoir
    participant Neo4j
    participant LLM as OpenAI/Ollama

    App->>Reservoir: Request (e.g. /partition/$USER/instance/my-application/v1/chat/completions)
    Reservoir->>Reservoir: Check if last message exceeds token limit (Return error if true)
    Reservoir->>Reservoir: Tag with Trace ID + Partition
    Reservoir->>Neo4j: Store original request message(s)

    %% --- Context Enrichment Steps ---
    Reservoir->>Neo4j: Query for similar & recent messages
    Neo4j-->>Reservoir: Return relevant context messages
    Reservoir->>Reservoir: Inject context messages into request payload
    %% --- End Enrichment Steps ---

    Reservoir->>Reservoir: Check total token count & truncate if needed (preserving system/last messages)

    Reservoir->>LLM: Forward enriched & potentially truncated request
    LLM->>Reservoir: Return LLM response
    Reservoir->>Neo4j: Store LLM response message
    Reservoir->>App: Return LLM response

This sequence diagram provides a high-level overview of how Reservoir processes requests and responses.

Conversation Threads via Synapses

Reservoir uses synapse relationships to create "threads" of semantically related messages within the conversation graph. As messages are added, synapses link them sequentially, forming a continuous flow. When the similarity between messages drops below a threshold, the thread is split, marking a topic change. This results in distinct conversation threads, making it easy to visualize and retrieve related exchanges.
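
Conceptually, the thread-splitting rule looks like the following sketch, which takes a list of message embedding vectors; the actual threshold, embedding model, and graph updates are internal to Reservoir:

import numpy as np

SIMILARITY_THRESHOLD = 0.75  # illustrative value, not Reservoir's actual setting

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def split_into_threads(embeddings):
    """Group consecutive message embeddings into threads, starting a new
    thread whenever similarity to the previous message drops below the
    threshold (i.e. the topic changes)."""
    if not embeddings:
        return []
    threads, current = [], [0]
    for i in range(1, len(embeddings)):
        if cosine_similarity(embeddings[i - 1], embeddings[i]) >= SIMILARITY_THRESHOLD:
            current.append(i)  # same topic: extend the synapse chain
        else:
            threads.append(current)  # similarity dropped: split the thread
            current = [i]
    threads.append(current)
    return threads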

You can see an example of this structure in the following graph visualization:

Conversation Graph View

Documentation

Reservoir's documentation is organized into the following sections:

  • Architecture: System and component overview.
  • API: API endpoints, usage, and examples.
  • Data Model: How data is stored in Neo4j, including the schema.
  • Development: Setting up the development environment, running locally, and contributing.
  • Features: Key features and future roadmap.
  • Deployment: Steps to deploy Reservoir locally or in production.
  • FAQ: Troubleshooting, common questions, and tips.

Quick Start

Reservoir provides an OpenAI-compatible API endpoint. You can use your system username as the partition and your application name as the instance for best results.

Starting the Server

To start the Reservoir server:

cargo run -- start

This command:

  1. Initializes the vector index in Neo4j for semantic search
  2. Starts the server on the configured port (default: 3017)

The server will be available at http://localhost:3017 (or your configured port).

Import/Export Data

Reservoir supports exporting all message nodes to a JSON file and importing them back into the database. This is useful for backup, migration, or sharing your AI conversation history.

Export all message nodes to JSON

cargo run -- export > messages.json

This command prints all message nodes in the database as pretty-printed JSON to stdout. Redirect the output to a file to save it.

Import message nodes from a JSON file

cargo run -- import path/to/messages.json

This command reads the specified JSON file (in the same format as the export) and imports all message nodes into the database.

View the last N messages

cargo run -- view <COUNT> [--partition <PARTITION>] [--instance <INSTANCE>]

Displays the last <COUNT> messages in the given partition and instance. If omitted, the partition defaults to "default" and the instance defaults to the partition name.

Example:

cargo run -- view 5 --partition sales --instance eu-west

Sample output:

2025-05-09T14:23:01+00:00 [abc123] user: Hello there!
2025-05-09T14:23:02+00:00 [abc123] assistant: Hi! How can I help?
2025-05-09T14:24:10+00:00 [def456] user: Show me last week's sales report.
2025-05-09T14:24:12+00:00 [def456] assistant: Here is the summary for last week's sales...
2025-05-09T14:25:00+00:00 [ghi789] user: Thanks!

Example Usage

  • Instead of:
    https://api.openai.com/v1/chat/completions
  • Use:
    http://127.0.0.1:3017/partition/$USER/instance/reservoir/v1/chat/completions

Here, $USER is your system username, and reservoir is the instance name.

Curl Example

curl "http://127.0.0.1:3017/partition/$USER/instance/reservoir/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{
        "model": "gpt-4",
        "messages": [
            {
                "role": "user",
                "content": "Write a one-sentence bedtime story about a brave little toaster."
            }
        ]
    }'

Python Example (using openai library)

import os
from openai import OpenAI

INSTANCE = "my-application"
PARTITION = os.getenv("USER")
RESERVOIR_PORT = os.getenv('RESERVOIR_PORT', '3017')
RESERVOIR_BASE_URL = f"http://localhost:{RESERVOIR_PORT}/v1/partition/{PARTITION}/instance/{INSTANCE}"

client = OpenAI(
    base_url=RESERVOIR_BASE_URL,
    api_key=os.environ.get("OPENAI_API_KEY")
)

completion = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "Write a one-sentence bedtime story about a curious robot."
        }
    ]
)
print(completion.choices[0].message.content)

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
