3 releases

0.1.3 May 18, 2025
0.1.2 May 18, 2025
0.1.1 May 17, 2025

#1570 in Network programming

35 downloads per month

MIT license

41KB
756 lines

Model Service CLI

GitHub Actions Workflow Status Crates.io Version Crates.io Downloads (recent) dependency status Crates.io License Crates.io Size

A command-line interface (CLI) to interact with the Model Service Orchestrator/Proxy. This tool allows you to register, unregister, and list model services managed by the backend server.

Features

  • Register: Register a new model service (e.g., a vLLM instance) with the orchestrator, specifying its model name and address.
  • Unregister: Remove a previously registered model service from the orchestrator using its address.
  • List: Display all currently registered model services, showing their model names and addresses.

Prerequisites

  1. Rust Toolchain: You need Rust and Cargo installed to build the CLI. Visit rust-lang.org for installation instructions.
  2. Running Backend Server: The Axum-based Model Service Orchestrator/Proxy must be running and accessible. By default, this CLI expects the server to be at http://127.0.0.1:11450.

Building

  1. Clone the repository (if you have it in one):
    git clone <your-repo-url>
    cd <repository-name>
    
  2. Build the CLI:
    • For a development build:
      cargo build
      
      The executable will be in ./target/debug/llmproxy.
    • For a release (optimized) build:
      cargo build --release
      
      The executable will be in ./target/release/llmproxy.

Usage

The general command structure is:

./path/to/llmproxy <COMMAND> [OPTIONS]

You can get help for the main command or any subcommand:

./path/to/llmproxy --help
./path/to/llmproxy register --help

Commands

1. register

Registers a new model service with the orchestrator.

Options:

  • --model-name <MODEL_NAME>: The name of the model being served (e.g., "Qwen/Qwen2-7B-Instruct"). (Required)
  • --addr <ADDR>: The address (host:port) of the model service (e.g., "localhost:8001"). (Required)

Example:

./target/debug/llmproxy register --model-name "Qwen/Qwen2-7B-Instruct" --addr "127.0.0.1:8001"

Expected Output (Success):

Success (201 Created): Server registered successfully

or if already registered:

Success (200 OK): Server already registered

2. unregister

Unregisters an existing model service from the orchestrator using its address.

Options:

  • --addr <ADDR>: The address (host:port) of the model service to unregister (e.g., "127.0.0.1:8001"). (Required)

Example:

./target/debug/llmproxy unregister --addr "127.0.0.1:8001"

Expected Output (Success):

Success (200 OK): Server unregistered successfully

or if not found:

Failed (404 Not Found): Server not found

3. list

Lists all currently registered model services.

Example:

./target/debug/llmproxy list

Expected Output:

Registered model services (2):
  - Model: Qwen/Qwen2-7B-Instruct, Addr: 127.0.0.1:8001
  - Model: Llama3-8B, Addr: 127.0.0.1:8002

or if none are registered:

No model services registered.

Backend Server

This CLI tool is a client for the Axum-based backend server. Ensure the server is running and configured correctly (defaulting to http://127.0.0.1:11450). The server is responsible for:

  • Maintaining the list of active model services.
  • Proxying incoming requests to the appropriate registered model service based on the model field in the request body.
cargo run --release --bin llmproxyd

Troubleshooting

  • Connection Refused: Ensure the backend server is running and accessible at http://127.0.0.1:11450 (or the configured address if you modify the BASE_URL in the CLI source).
  • Unexpected JSON Errors: Verify that the backend server's API responses match what the CLI expects.
  • Failed to parse server response: This could indicate an issue with the server's response format or a network problem. The CLI will attempt to print the raw body which might give clues.

Dependencies

~18–31MB
~467K SLoC