1 unstable release

0.1.1 Aug 9, 2023
0.1.0 Aug 9, 2023

MPL-2.0 license

llmvm

A protocol and modular application suite for language models.

Includes a code assistant that automatically retrieves context, powered by LSP.

Overview

llmvm consists of three types of executable applications:

  • Frontends: specialized applications that use language models
  • The core: acts as a middleman between frontends and backends; manages state related to text generation, such as:
    • Model presets
    • Prompt templates
    • Message threads
    • Projects/workspaces
  • Backends: wrappers for language models that handle raw text generation requests

The protocol acts as the glue between the above applications. It uses multilink and tower to achieve this.

Available crates

  • Frontends
    • codeassist: An LLM-powered code assistant that automatically retrieves context (e.g. type definitions) from a Language Server Protocol server
    • chat: A CLI chat interface
  • Core
  • Backends
    • outsource: Forwards generation requests to known hosted language model providers such as OpenAI, Hugging Face and Ollama.
    • llmrs: Uses the llm crate to process generation requests. Supported models include LLaMA, GPT-2, GPT-J and more.

IPC details

Each component can interact with a dependency component via three methods:

  • Local child process: the component invokes the dependency component as a child process, and communicates via stdio using JSON-RPC
  • Remote HTTP service: the dependency component acts as an HTTP API, and the dependent component is configured to make web requests to the API
  • Direct linking: the core and backends have library crates which can be used directly. This only works if the dependent component is a Rust application.

This allows for flexible hosting configurations, such as those shown in the following figure:

[Figure: Hosting scenarios]
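To make the local child process method more concrete, here is a minimal sketch of building a JSON-RPC 2.0 request and writing it to a stream such as a child's stdin. The envelope fields (jsonrpc, id, method, params) come from the JSON-RPC 2.0 specification. The method name `generate_text`, the newline framing and both helper functions are assumptions made for illustration; they are not taken from the llmvm protocol crate.

```rust
use std::io::{self, Write};

/// Builds a JSON-RPC 2.0 request object as one line of text.
/// `params_json` must already be valid JSON, and `method` must not
/// contain characters that need JSON escaping (no escaping is done here;
/// a real application would use a JSON library instead).
fn jsonrpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params_json
    )
}

/// Writes one request followed by a newline to any writer, for example
/// the stdin handle of a spawned backend process.
/// ASSUMPTION: newline-delimited framing; the real framing may differ.
fn send_request<W: Write>(out: &mut W, request: &str) -> io::Result<()> {
    out.write_all(request.as_bytes())?;
    out.write_all(b"\n")?;
    out.flush()
}
```

Because `send_request` accepts any `Write` implementation, the same code can target a `std::process::ChildStdin` in production and a `Vec<u8>` in tests.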

Benefits

  • Single protocol for state-managed text generation requests
  • A frontend or backend can be implemented in any language; it only requires a stdio and/or HTTP server/client to be available.
  • Uses Handlebars for prompt templates, allowing powerful prompt generation
  • Saves message threads, presets and prompt templates on the filesystem for easy editing/tweaking
  • Workspace / project management for isolating project state from global state
  • Modular design; any component can be invoked by the user via CLI for a one-off low-level or high-level request.

Installation

cargo is needed to install the binaries. Use rustup to install cargo.

Install the core by running:

cargo install llmvm-core

Install the desired frontends & backends listed under "Available crates". See their READMEs for more details.

Usage / configuration

See the README of each relevant component for more information on usage and configuration.

Model IDs

Model IDs in llmvm are strings consisting of three parts:

<backend name>/<provider name>/<model name>

The provider name must have the suffix -chat or -text.

Examples:

  • outsource/openai-chat/gpt-3.5-turbo
  • llmrs/llmrs-text/mpt-7b-chat-q4_0-ggjt

By default, the core will invoke the process llmvm-<backend name> for local process communication.
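The format above can be sketched as a small parser. This is an illustration only, not the core's actual implementation: the `ModelId` struct, the `ModelIdError` enum and both functions are hypothetical names, and letting the model name contain further `/` characters is an assumption.

```rust
/// Parsed form of a model ID such as `outsource/openai-chat/gpt-3.5-turbo`.
#[derive(Debug, PartialEq)]
struct ModelId {
    backend: String,
    provider: String,
    model: String,
}

#[derive(Debug, PartialEq)]
enum ModelIdError {
    WrongPartCount,
    EmptyPart,
    BadProviderSuffix,
}

/// Splits `<backend name>/<provider name>/<model name>` and checks the
/// provider suffix rule.
fn parse_model_id(id: &str) -> Result<ModelId, ModelIdError> {
    // ASSUMPTION: splitn(3, ..) leaves any extra '/' inside the model name.
    let parts: Vec<&str> = id.splitn(3, '/').collect();
    if parts.len() != 3 {
        return Err(ModelIdError::WrongPartCount);
    }
    if parts.iter().any(|p| p.is_empty()) {
        return Err(ModelIdError::EmptyPart);
    }
    let provider = parts[1];
    // The provider name must end in -chat or -text.
    if !(provider.ends_with("-chat") || provider.ends_with("-text")) {
        return Err(ModelIdError::BadProviderSuffix);
    }
    Ok(ModelId {
        backend: parts[0].to_string(),
        provider: provider.to_string(),
        model: parts[2].to_string(),
    })
}

impl ModelId {
    /// Name of the process invoked by default for local communication.
    fn default_process_name(&self) -> String {
        format!("llmvm-{}", self.backend)
    }
}
```

With this sketch, parsing `outsource/openai-chat/gpt-3.5-turbo` yields the backend `outsource`, and `default_process_name` returns `llmvm-outsource`.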

Presets / Projects / Threads / Prompt Templates

See the core README for more information.

Model weights

See the relevant backend README (e.g. llmrs).

License

Mozilla Public License, version 2.0
