#gguf #ggml #llama-cpp

app gguf-utils

Utilities for handling gguf files

3 releases

0.0.2 Dec 29, 2024
0.0.1 Dec 10, 2024
0.0.0 Nov 25, 2024

#109 in Filesystem

296 downloads per month

MIT license

155KB
4K SLoC

gguf

ggus library

gguf utilities

Help

gguf-utils --help

# in project dir
cargo xtask --help
gguf-utils is a command-line tool for working with gguf files

Usage: gguf-utils <COMMAND>

Commands:
  show      Show the contents of gguf files
  split     Split gguf files into shards
  merge     Merge shards into a single gguf file
  cast      Cast data types in gguf files
  convert   Convert gguf files to different format
  to-llama  Convert gguf files to Llama format
  set-meta  Set metadata of gguf files
  help      Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

Show contents

gguf-utils show --help

# in project dir
cargo show --help
Show the contents of gguf files

Usage: gguf-utils show [OPTIONS] <FILE>

Arguments:
  <FILE>  The file to show

Options:
      --shards                         If set, show all shards in the directory
  -n, --array-detail <ARRAY_DETAIL>    How many elements to show in arrays, `all` for all elements [default: 8]
  -m, --filter-meta <FILTER_META>      Meta to show [default: *]
  -t, --filter-tensor <FILTER_TENSOR>  Tensors to show [default: *]
      --log <LOG>                      Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                           Print help

Split into shards

gguf-utils split --help

# in project dir
cargo split --help
Split gguf files into shards

Usage: gguf-utils split [OPTIONS] <FILE>

Arguments:
  <FILE>  File to split

Options:
  -o, --output-dir <OUTPUT_DIR>    Output directory for converted files
  -t, --max-tensors <MAX_TENSORS>  Max count of tensors per shard
  -s, --max-bytes <MAX_BYTES>      Max size in bytes per shard
      --no-tensor-first            If set, the first shard will not contain any tensor
      --no-data                    If set, tensor data will not be written to output files
      --log <LOG>                  Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                       Print help

Merge shards

gguf-utils merge --help

# in project dir
cargo merge --help
Merge shards into a single gguf file

Usage: gguf-utils merge [OPTIONS] <FILE>

Arguments:
  <FILE>  One of the shards to merge

Options:
  -o, --output-dir <OUTPUT_DIR>  Output directory for merged file
      --no-data                  If set, tensor data will not be written to output files
      --log <LOG>                Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                     Print help

Cast data types

gguf-utils cast --help

# in project dir
cargo cast --help
Cast data types in gguf files

Usage: gguf-utils cast [OPTIONS] --types <TYPES> <FILE>

Arguments:
  <FILE>  File to convert

Options:
  -x, --types <TYPES>
  -o, --output-dir <OUTPUT_DIR>    Output directory for converted files
  -t, --max-tensors <MAX_TENSORS>  Max count of tensors per shard
  -s, --max-bytes <MAX_BYTES>      Max size in bytes per shard
      --no-tensor-first            If set, the first shard will not contain any tensor
      --no-data                    If set, tensor data will not be written to output files
      --log <LOG>                  Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                       Print help

Convert format

gguf-utils convert --help

# in project dir
cargo convert --help
Convert gguf files to different format

Usage: gguf-utils convert [OPTIONS] --steps <STEPS> <FILE>

Arguments:
  <FILE>  File to convert

Options:
  -x, --steps <STEPS>              Steps to apply, separated by "->", maybe "sort", "merge-linear", "split-linear", "filter-meta:<key>" or "filter-tensor:<name>"
  -o, --output-dir <OUTPUT_DIR>    Output directory for converted files
  -t, --max-tensors <MAX_TENSORS>  Max count of tensors per shard
  -s, --max-bytes <MAX_BYTES>      Max size in bytes per shard
      --no-tensor-first            If set, the first shard will not contain any tensor
      --no-data                    If set, tensor data will not be written to output files
      --log <LOG>                  Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                       Print help
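
The `--steps` syntax described in the help text above can be illustrated with a small Rust sketch. This is not the tool's actual parser: the `Step` enum and the `parse_steps` function are hypothetical names, and details such as trimming whitespace around `->` are assumptions.

```rust
/// One step of a `--steps` pipeline, named after the help text.
/// Hypothetical type: not taken from the gguf-utils source.
#[derive(Debug, PartialEq)]
enum Step {
    Sort,
    MergeLinear,
    SplitLinear,
    FilterMeta(String),
    FilterTensor(String),
}

/// Parses a string such as "sort->filter-meta:general.*" into steps.
/// Assumption: whitespace around each step is ignored.
fn parse_steps(spec: &str) -> Result<Vec<Step>, String> {
    spec.split("->")
        .map(|raw| {
            let s = raw.trim();
            match s {
                "sort" => Ok(Step::Sort),
                "merge-linear" => Ok(Step::MergeLinear),
                "split-linear" => Ok(Step::SplitLinear),
                _ => {
                    if let Some(key) = s.strip_prefix("filter-meta:") {
                        Ok(Step::FilterMeta(key.to_string()))
                    } else if let Some(name) = s.strip_prefix("filter-tensor:") {
                        Ok(Step::FilterTensor(name.to_string()))
                    } else {
                        Err(format!("unknown step: {s:?}"))
                    }
                }
            }
        })
        .collect()
}
```

Under these assumptions, `parse_steps("sort->merge-linear")` yields `[Sort, MergeLinear]`, and an unrecognized step name is reported as an error. Because the sketch splits on every `->`, a filter pattern that itself contains `->` would not survive; how the real tool treats that case is not stated in the help text.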

Convert to standard Llama

gguf-utils to-llama --help

# in project dir
cargo to-llama --help
Convert gguf files to Llama format

Usage: gguf-utils to-llama [OPTIONS] <FILE>

Arguments:
  <FILE>  File to convert

Options:
  -x, --extra <EXTRA>              Extra metadata for convertion
  -o, --output-dir <OUTPUT_DIR>    Output directory for converted files
  -t, --max-tensors <MAX_TENSORS>  Max count of tensors per shard
  -s, --max-bytes <MAX_BYTES>      Max size in bytes per shard
      --no-tensor-first            If set, the first shard will not contain any tensor
      --no-data                    If set, tensor data will not be written to output files
      --log <LOG>                  Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                       Print help

Set metadata

gguf-utils set-meta --help

# in project dir
cargo set-meta --help
Set metadata of gguf files

Usage: gguf-utils set-meta [OPTIONS] <FILE> <META_KVS>

Arguments:
  <FILE>      File to set metadata
  <META_KVS>  Meta data to set for the file

Options:
  -o, --output-dir <OUTPUT_DIR>    Output directory for converted files
  -t, --max-tensors <MAX_TENSORS>  Max count of tensors per shard
  -s, --max-bytes <MAX_BYTES>      Max size in bytes per shard
      --no-tensor-first            If set, the first shard will not contain any tensor
      --no-data                    If set, tensor data will not be written to output files
      --log <LOG>                  Log level, may be "off", "trace", "debug", "info" or "error"
  -h, --help                       Print help

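<META_KVS> is either a string in a specific format or the path to a text file. The tool first checks whether the argument is a path to a file; if it is, the content is read from that file, otherwise the argument is treated as a string literal.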
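
The path-or-literal rule for <META_KVS> could look roughly like the following Rust sketch. The function name `resolve_meta_kvs` is hypothetical, and using `Path::is_file` as the check is an assumption; the tool's own test may differ.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the metadata text for a <META_KVS> argument.
/// Assumption: "is a path" means "names an existing regular file".
fn resolve_meta_kvs(arg: &str) -> io::Result<String> {
    if Path::new(arg).is_file() {
        // The argument names a file: read the settings from it.
        fs::read_to_string(arg)
    } else {
        // Otherwise the argument itself is the settings text.
        Ok(arg.to_string())
    }
}
```

One consequence of this ordering is that a literal which happens to match an existing file name would be read as a file.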

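The format is as follows:

  1. Setting algebraic-type metadata

    Algebraic types include integers, unsigned integers, floating-point numbers, and booleans.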

    '<KEY>'<Ty> <VAL>
    
  2. Setting string metadata

    Single-line string:

    '<KEY>'str "<VAL>"
    

    Multi-line string:

    '<KEY>'str<Sep>
    <Sep> [Content]
    <Sep> [Content]
    <Sep> [Content]
    
    

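    Here Sep is the separator that marks a continuation of the string. It must immediately follow str with no whitespace in between, and the separator itself must not contain whitespace. Each continuation line of a multi-line string must begin with the separator followed by a space; all remaining characters on that line (including the newline) are treated as content of the multi-line string and are not escaped. Any line that does not begin with the separator (including an empty line) ends the multi-line string.

  3. Setting array metadata

    TODO: this feature is not implemented yet.

Here is an example of a file that sets metadata: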

'llama.block_count'             u64 22
'llama.context_length'          u64 2048
'llama.embedding_length'        u64 2048
'llama.feed_forward_length'     u64 5632
'llama.attention.head_count'    u64 32
'llama.attention.head_count_kv' u64 4
'llama.rope.dimension_count'    u64 64

'tokenizer.chat_template' str|
| {%- for message in messages -%}
| {%- if message['role'] == 'user' -%}
| {{ '<|user|>
| ' + message['content'] + eos_token }}
| {%- elif message['role'] == 'system' -%}
| {{ '<|system|>
| ' + message['content'] + eos_token }}
| {%- elif message['role'] == 'assistant' -%}
| {{ '<|assistant|>
| ' + message['content'] + eos_token }}
| {%- endif -%}
| {%- if loop.last and add_generation_prompt -%}
| {{ '<|assistant|>
| ' }}
| {%- endif -%}
| {%- endfor -%}
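
To make the format rules concrete, here is a minimal Rust sketch of a parser for files like the example above. It is an illustration, not the tool's implementation: `MetaKv` and `parse_meta_kvs` are hypothetical names, values are kept as raw text rather than converted to typed numbers, and single-line strings are assumed to contain no escape sequences. Whitespace between the key and the type is allowed here because the example file uses it.

```rust
/// One parsed entry: key, type name, and raw value text.
/// Hypothetical type: not taken from the gguf-utils source.
#[derive(Debug, PartialEq)]
struct MetaKv {
    key: String,
    ty: String,
    val: String,
}

fn parse_meta_kvs(text: &str) -> Result<Vec<MetaKv>, String> {
    let mut out = Vec::new();
    // Keep the newline on each line so multi-line content is preserved as is.
    let mut lines = text.split_inclusive('\n').peekable();
    while let Some(line) = lines.next() {
        let line = line.trim_end();
        if line.is_empty() {
            continue; // blank lines between entries are skipped
        }
        let rest = line
            .strip_prefix('\'')
            .ok_or_else(|| format!("expected quoted key: {line:?}"))?;
        let (key, rest) = rest
            .split_once('\'')
            .ok_or_else(|| format!("unterminated key: {line:?}"))?;
        let rest = rest.trim_start();
        if let Some(after) = rest.strip_prefix("str") {
            if after.is_empty() || after.starts_with(char::is_whitespace) {
                // Single-line string: 'key'str "value"
                let val = after
                    .trim()
                    .strip_prefix('"')
                    .and_then(|v| v.strip_suffix('"'))
                    .ok_or_else(|| format!("expected quoted string: {line:?}"))?;
                out.push(MetaKv { key: key.into(), ty: "str".into(), val: val.into() });
            } else {
                // Multi-line string: whatever directly follows `str` is the separator.
                if after.contains(char::is_whitespace) {
                    return Err(format!("separator contains whitespace: {line:?}"));
                }
                let prefix = format!("{after} ");
                let mut val = String::new();
                // Consume lines while they start with separator + space.
                while let Some(content) =
                    lines.peek().copied().and_then(|l| l.strip_prefix(prefix.as_str()))
                {
                    val.push_str(content);
                    lines.next();
                }
                out.push(MetaKv { key: key.into(), ty: "str".into(), val });
            }
        } else {
            // Algebraic type: 'key'<Ty> <VAL>
            let (ty, val) = rest
                .split_once(char::is_whitespace)
                .ok_or_else(|| format!("expected type and value: {line:?}"))?;
            out.push(MetaKv { key: key.into(), ty: ty.into(), val: val.trim().into() });
        }
    }
    Ok(out)
}
```

Applied to the example file, this sketch would produce seven `u64` entries followed by one `str` entry whose value is the chat template with the leading `| ` stripped from every line. Array metadata is not handled, matching the TODO above.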

Dependencies

~7–19MB
~208K SLoC