3 releases
Version | Date |
---|---|
0.1.3 | May 18, 2025 |
0.1.2 | May 18, 2025 |
0.1.1 | May 17, 2025 |
Model Service CLI
A command-line interface (CLI) to interact with the Model Service Orchestrator/Proxy. This tool allows you to register, unregister, and list model services managed by the backend server.
Features
- Register: Register a new model service (e.g., a vLLM instance) with the orchestrator, specifying its model name and address.
- Unregister: Remove a previously registered model service from the orchestrator using its address.
- List: Display all currently registered model services, showing their model names and addresses.
Prerequisites
- Rust Toolchain: You need Rust and Cargo installed to build the CLI. Visit rust-lang.org for installation instructions.
- Running Backend Server: The Axum-based Model Service Orchestrator/Proxy must be running and accessible. By default, this CLI expects the server to be at http://127.0.0.1:11450.
Building
- Clone the repository (if you have it in one):
  git clone <your-repo-url>
  cd <repository-name>
- Build the CLI:
  - For a development build:
    cargo build
    The executable will be in ./target/debug/llmproxy.
  - For a release (optimized) build:
    cargo build --release
    The executable will be in ./target/release/llmproxy.
Usage
The general command structure is:
./path/to/llmproxy <COMMAND> [OPTIONS]
You can get help for the main command or any subcommand:
./path/to/llmproxy --help
./path/to/llmproxy register --help
Commands
1. register
Registers a new model service with the orchestrator.
Options:
--model-name <MODEL_NAME>: The name of the model being served (e.g., "Qwen/Qwen2-7B-Instruct"). (Required)
--addr <ADDR>: The address (host:port) of the model service (e.g., "localhost:8001"). (Required)
Example:
./target/debug/llmproxy register --model-name "Qwen/Qwen2-7B-Instruct" --addr "127.0.0.1:8001"
Expected Output (Success):
Success (201 Created): Server registered successfully
or if already registered:
Success (200 OK): Server already registered
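The --addr value is described as host:port. As a rough illustration of what that format implies, the sketch below splits such a string and checks the port. The helper name split_host_port is hypothetical, and the real CLI may validate the address differently or pass it to the server unchecked.

```rust
/// Splits a `host:port` string into its host and numeric port.
/// Hypothetical helper for illustration; not taken from the CLI source.
fn split_host_port(addr: &str) -> Result<(String, u16), String> {
    // Split at the last ':' so the port is always the final component.
    let (host, port) = addr
        .rsplit_once(':')
        .ok_or_else(|| format!("missing ':' in address '{addr}'"))?;
    if host.is_empty() {
        return Err(format!("missing host in address '{addr}'"));
    }
    // A valid TCP port fits in a u16 (0 to 65535).
    let port: u16 = port
        .parse()
        .map_err(|_| format!("invalid port '{port}' in address '{addr}'"))?;
    Ok((host.to_string(), port))
}

fn main() {
    for addr in ["127.0.0.1:8001", "localhost:8001", "localhost", "host:99999"] {
        match split_host_port(addr) {
            Ok((host, port)) => println!("{addr} -> host={host}, port={port}"),
            Err(e) => println!("{addr} -> error: {e}"),
        }
    }
}
```

Splitting at the last colon keeps the sketch simple; it does not handle bracketed IPv6 addresses.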
2. unregister
Unregisters an existing model service from the orchestrator using its address.
Options:
--addr <ADDR>: The address (host:port) of the model service to unregister (e.g., "127.0.0.1:8001"). (Required)
Example:
./target/debug/llmproxy unregister --addr "127.0.0.1:8001"
Expected Output (Success):
Success (200 OK): Server unregistered successfully
or if not found:
Failed (404 Not Found): Server not found
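The output lines shown for register and unregister share one shape: a Success or Failed prefix, the HTTP status, and a message. The sketch below reproduces that shape. It assumes the prefix is chosen by whether the status is in the 2xx range and that the message text comes from the server; the function name format_outcome is hypothetical and the CLI's own logic may differ.

```rust
/// Builds an output line in the shape shown above:
/// `Success (<code> <reason>): <message>` or `Failed (<code> <reason>): <message>`.
/// Assumption: any 2xx status counts as success.
fn format_outcome(status: u16, message: &str) -> String {
    // Only the statuses that appear in this README are named here.
    let reason = match status {
        200 => "OK",
        201 => "Created",
        404 => "Not Found",
        _ => "Unknown",
    };
    let prefix = if (200..300).contains(&status) {
        "Success"
    } else {
        "Failed"
    };
    format!("{prefix} ({status} {reason}): {message}")
}

fn main() {
    println!("{}", format_outcome(201, "Server registered successfully"));
    println!("{}", format_outcome(200, "Server unregistered successfully"));
    println!("{}", format_outcome(404, "Server not found"));
}
```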
3. list
Lists all currently registered model services.
Example:
./target/debug/llmproxy list
Expected Output:
Registered model services (2):
- Model: Qwen/Qwen2-7B-Instruct, Addr: 127.0.0.1:8001
- Model: Llama3-8B, Addr: 127.0.0.1:8002
or if none are registered:
No model services registered.
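For reference, the listing format above can be produced from a plain collection of (model name, address) pairs. This is a minimal sketch: the ModelService struct and its field names are assumptions made for illustration, not the types used in the CLI source.

```rust
/// Hypothetical record for one registered service.
struct ModelService {
    model_name: String,
    addr: String,
}

/// Renders the services in the same layout as the sample output above.
fn format_listing(services: &[ModelService]) -> String {
    if services.is_empty() {
        return "No model services registered.".to_string();
    }
    // Header line with the count, then one line per service.
    let mut out = format!("Registered model services ({}):", services.len());
    for s in services {
        out.push_str(&format!("\n- Model: {}, Addr: {}", s.model_name, s.addr));
    }
    out
}

fn main() {
    let services = vec![
        ModelService {
            model_name: "Qwen/Qwen2-7B-Instruct".to_string(),
            addr: "127.0.0.1:8001".to_string(),
        },
        ModelService {
            model_name: "Llama3-8B".to_string(),
            addr: "127.0.0.1:8002".to_string(),
        },
    ];
    println!("{}", format_listing(&services));
    println!("{}", format_listing(&[]));
}
```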
Backend Server
This CLI tool is a client for the Axum-based backend server. Ensure the server is running and configured correctly (defaulting to http://127.0.0.1:11450). The server is responsible for:
- Maintaining the list of active model services.
- Proxying incoming requests to the appropriate registered model service based on the model field in the request body.
To run the backend server:
cargo run --release --bin llmproxyd
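The two server responsibilities above can be pictured as a small in-memory registry. The sketch below is not the server's implementation. It assumes services are keyed by address (since unregister takes only an address), that re-registering a known address is reported as already registered, and that routing picks any address serving the requested model. The names Registry, RegisterOutcome, and route are hypothetical.

```rust
use std::collections::HashMap;

/// Mirrors the two register results shown in the CLI output (201 vs 200).
#[derive(Debug, PartialEq)]
enum RegisterOutcome {
    Created,
    AlreadyRegistered,
}

/// Hypothetical in-memory registry: address -> model name.
#[derive(Default)]
struct Registry {
    by_addr: HashMap<String, String>,
}

impl Registry {
    fn register(&mut self, model_name: &str, addr: &str) -> RegisterOutcome {
        if self.by_addr.contains_key(addr) {
            // Assumption: an existing address is left untouched.
            RegisterOutcome::AlreadyRegistered
        } else {
            self.by_addr
                .insert(addr.to_string(), model_name.to_string());
            RegisterOutcome::Created
        }
    }

    /// Returns true if the address was registered, false otherwise (the 404 case).
    fn unregister(&mut self, addr: &str) -> bool {
        self.by_addr.remove(addr).is_some()
    }

    /// Finds an address serving `model`. If several match, which one is
    /// returned is unspecified, because HashMap iteration order is arbitrary.
    fn route(&self, model: &str) -> Option<&str> {
        self.by_addr
            .iter()
            .find(|(_, m)| m.as_str() == model)
            .map(|(addr, _)| addr.as_str())
    }
}

fn main() {
    let mut registry = Registry::default();
    println!("{:?}", registry.register("Qwen/Qwen2-7B-Instruct", "127.0.0.1:8001"));
    println!("{:?}", registry.register("Qwen/Qwen2-7B-Instruct", "127.0.0.1:8001"));
    println!("{:?}", registry.route("Qwen/Qwen2-7B-Instruct"));
    println!("{}", registry.unregister("127.0.0.1:8001"));
}
```

A real proxy would also need shared, thread-safe state and a policy for choosing among several backends for the same model; both are left out here.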
Troubleshooting
- Connection Refused: Ensure the backend server is running and accessible at http://127.0.0.1:11450 (or the configured address if you modify the BASE_URL in the CLI source).
- Unexpected JSON Errors: Verify that the backend server's API responses match what the CLI expects.
- Failed to parse server response: This could indicate an issue with the server's response format or a network problem. The CLI will attempt to print the raw body which might give clues.
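When diagnosing a refused connection, it can help to check whether anything is listening on the server address at all. The following is a minimal standalone probe using only the Rust standard library. It is not part of the CLI, the function name server_reachable is hypothetical, and a successful TCP connect only shows that some process accepts connections on that port, not that it is the orchestrator.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Tries to open a TCP connection to `addr` (an `ip:port` string) within `timeout`.
/// Returns Ok(()) if the connection succeeds, or a description of the failure.
fn server_reachable(addr: &str, timeout: Duration) -> Result<(), String> {
    // connect_timeout needs a concrete socket address, so hostnames are not accepted here.
    let sock: SocketAddr = addr
        .parse()
        .map_err(|e| format!("invalid address '{addr}': {e}"))?;
    TcpStream::connect_timeout(&sock, timeout)
        .map(|_| ())
        .map_err(|e| format!("cannot reach {addr}: {e}"))
}

fn main() {
    // Default server address taken from this README.
    match server_reachable("127.0.0.1:11450", Duration::from_secs(2)) {
        Ok(()) => println!("Something is listening on 127.0.0.1:11450"),
        Err(e) => println!("{e}"),
    }
}
```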