#cuda #llm-client

llm_devices

Managing Devices and Builds for LLMs

2 releases

0.0.2 Oct 4, 2024
0.0.1 Oct 4, 2024

#100 in #cuda

57 downloads per month
Used in 2 crates

MIT license

59KB
1.5K SLoC

llm_devices: Managing Devices and Builds for LLMs

This crate is part of the llm_client crate.

The llm_interface crate uses it as a dependency for building llama.cpp.

Its functionality includes:

  • Cloning llama.cpp at the specified tag and building it.

  • Checking for device availability (CUDA, macOS) to determine which platform to build for.

  • Fetching available VRAM or system RAM to estimate which model will fit.

  • Offloading model layers to memory.

  • Logging tools.
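
To make the device-check and layer-offloading items above more concrete, here is a minimal sketch in plain Rust using only the standard library. It is not this crate's API: `BuildTarget`, `pick_build_target`, and `layers_that_fit` are hypothetical names invented for illustration. The sketch assumes that macOS builds use Metal, that the caller has already detected CUDA by some other means, and that every model layer is the same size.

```rust
/// Hypothetical build targets for llama.cpp (illustrative only, not the crate's types).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BuildTarget {
    Cuda,
    Metal,
    Cpu,
}

/// Choose a build target. Assumption: macOS always builds for Metal;
/// elsewhere CUDA is used only if the caller reports a CUDA device.
pub fn pick_build_target(cuda_available: bool) -> BuildTarget {
    if cfg!(target_os = "macos") {
        BuildTarget::Metal
    } else if cuda_available {
        BuildTarget::Cuda
    } else {
        BuildTarget::Cpu
    }
}

/// Estimate how many model layers fit in device memory.
/// Assumption: all layers are the same size (`bytes_per_layer`), and
/// `reserved_bytes` is held back for context buffers and other overhead.
pub fn layers_that_fit(
    available_bytes: u64,
    reserved_bytes: u64,
    bytes_per_layer: u64,
    total_layers: u32,
) -> u32 {
    if bytes_per_layer == 0 {
        return 0;
    }
    // saturating_sub avoids underflow when the reserve exceeds available memory.
    let usable = available_bytes.saturating_sub(reserved_bytes);
    let fit = usable / bytes_per_layer;
    // Never report more layers than the model actually has.
    fit.min(u64::from(total_layers)) as u32
}
```

Under these assumptions, a caller would offload the returned number of layers to the GPU and keep the remainder in system RAM. Real layer sizes vary with the model architecture and quantization, so a production estimate would need per-layer sizes.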

See the build documentation for more notes.

Dependencies

~7–33MB
~492K SLoC