1 unstable release

Uses new Rust 2024

0.1.0 Aug 1, 2025

MIT license

llama.rust

LLM inference in Rust

Inference

cargo run --release -p llama-rust -- --model "meta-llama/Llama-2-7b-hf" \
    --prompt "What is the capital of France?" --max-tokens 20 --temperature 0.7 --cpu

Parameters

  • --model: The Hugging Face model ID (for example, meta-llama/Llama-2-7b-hf).
  • --prompt: The prompt to use for inference.
  • --max-tokens: The maximum number of tokens to generate.
  • --temperature: The temperature to use for sampling.
  • --cpu: Run inference on the CPU.
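
To illustrate what --temperature typically controls, here is a minimal sketch of temperature-scaled sampling. It is not taken from this crate's source: the function names softmax_with_temperature and sample_index are hypothetical, the logits are made-up values, and the uniform draw is passed in by hand so the example needs no external crates.

```rust
/// Convert raw logits to probabilities using temperature scaling.
/// Assumes `temperature > 0.0`; lower values sharpen the distribution.
/// (Hypothetical helper for illustration, not this crate's API.)
fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    // Divide each logit by the temperature before applying softmax.
    let scaled: Vec<f32> = logits.iter().map(|l| l / temperature).collect();
    // Subtract the maximum scaled logit for numerical stability.
    let max = scaled.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scaled.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Pick a token index from `probs` given a uniform draw `u` in [0, 1).
/// (Hypothetical helper; a real sampler would draw `u` from an RNG.)
fn sample_index(probs: &[f32], u: f32) -> usize {
    let mut cumulative = 0.0;
    for (i, p) in probs.iter().enumerate() {
        cumulative += p;
        if u < cumulative {
            return i;
        }
    }
    // Fall back to the last index if rounding keeps the total below `u`.
    probs.len() - 1
}

fn main() {
    // Made-up logits for a three-token vocabulary.
    let logits = [2.0, 1.0, 0.1];
    let probs = softmax_with_temperature(&logits, 0.7);
    let token = sample_index(&probs, 0.5);
    println!("probs = {:?}, sampled index = {}", probs, token);
}
```

In this sketch, a temperature below 1.0 concentrates probability on the highest-scoring token, while a temperature above 1.0 flattens the distribution and makes the output more varied.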

References

This project draws inspiration from: