yek
A fast Rust-based tool to serialize text-based files in a repository or directory for LLM consumption.[^1]
By default:
- Uses .gitignore rules to skip unwanted files.
- Uses the Git history to infer which files are more important.
- Infers additional ignore patterns (binary, large, etc.).
- Automatically detects if output is being piped and streams content instead of writing to files.
- Supports processing multiple directories in a single command.
- Configurable via a yek.yaml file.
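The pipe detection mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not yek's actual code: the `Sink` enum and the `choose_sink` and `detect_sink` functions are hypothetical names, and the only real API used is the standard library's `IsTerminal` trait (Rust 1.70+).

```rust
use std::io::{self, IsTerminal};

/// Where serialized output should go (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Sink {
    /// Stream straight to stdout because output is piped or redirected.
    Stdout,
    /// Write to a file and print its path because stdout is a terminal.
    TempFile,
}

/// Pick a sink from whether stdout is attached to a terminal.
fn choose_sink(stdout_is_terminal: bool) -> Sink {
    if stdout_is_terminal {
        Sink::TempFile
    } else {
        Sink::Stdout
    }
}

/// Convenience wrapper that queries the real stdout of the process.
#[allow(dead_code)]
fn detect_sink() -> Sink {
    choose_sink(io::stdout().is_terminal())
}
```

Keeping the decision in a pure function (`choose_sink`) separate from the terminal query makes the behavior easy to test without a real TTY.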
Yek يک means "One" in Farsi/Persian.
Consider having a simple repo like this:
.
├── README.md
├── src
│ ├── main.rs
│ └── utils.rs
└── tests
└── test.rs
Running yek in this directory will produce a single file and write it to the temp directory with the following content:
>>>> README.md
... content ...
>>>> tests/test.rs
... content ...
>>>> src/utils.rs
... content ...
>>>> src/main.rs
... content ...
[!NOTE]
yek prioritizes more important files by placing them last in the output. This is useful for LLM consumption, since LLMs tend to pay more attention to content that appears later in the context.
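The ordering described in the note can be sketched as below. This is an illustration under stated assumptions, not yek's implementation: `ScoredFile` and `render` are hypothetical names, the scores are made up, and the template uses the FILE_PATH and FILE_CONTENT placeholders from the default output format shown above.

```rust
/// A file paired with a hypothetical importance score (higher = more important).
struct ScoredFile {
    path: String,
    content: String,
    score: i32,
}

/// Render files so that the most important ones come last in the output.
fn render(mut files: Vec<ScoredFile>, template: &str) -> String {
    // Stable ascending sort: lowest score first, highest score last.
    files.sort_by_key(|f| f.score);
    files
        .iter()
        .map(|f| {
            // Naive placeholder substitution; a path that itself contains
            // the text "FILE_CONTENT" would be rewritten too.
            template
                .replace("FILE_PATH", &f.path)
                .replace("FILE_CONTENT", &f.content)
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Called with a low-scored README.md and a high-scored src/main.rs and the template `">>>> FILE_PATH\nFILE_CONTENT"`, this places README.md first and src/main.rs last, matching the layout of the sample output.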
Installation
Choose the installation method for your platform:
Unix-like Systems (macOS, Linux)
curl -fsSL https://bodo.run/yek.sh | bash
Windows (PowerShell)
irm https://bodo.run/yek.ps1 | iex
Build from Source
git clone https://github.com/bodo-run/yek
cd yek
cargo install --path .
Usage
yek has sensible defaults: simply run yek in a directory to serialize the entire repository. It serializes all files in the repository and writes them to a temporary file, then prints the path to that file to the console.
Examples
Process current directory and write to temp directory:
yek
Pipe output to clipboard (macOS):
yek src/ | pbcopy
Cap the max output size to 128K tokens:
yek --tokens 128k
[!NOTE]
yek will remove any files that won't fit in the capped context size. It will try to fit in the more important files first.
Cap the max output size to 100KB and write to /tmp/yek, processing only src/:
yek --max-size 100KB --output-dir /tmp/yek src/
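One way to picture the capping behavior is a greedy selection under a budget. The sketch below assumes that files are considered from most to least important and that a file is skipped when it does not fit; whether yek skips or stops at the first file that does not fit is not stated here, so treat that as an assumption. `Candidate` and `fit_to_budget` are hypothetical names, and `size` stands for bytes or tokens depending on which cap is in use.

```rust
/// A candidate file: path, size (bytes or tokens), and importance score.
struct Candidate {
    path: String,
    size: usize,
    score: i32,
}

/// Keep the most important files that fit within `budget`.
/// The result is ordered least important first, most important last.
fn fit_to_budget(mut files: Vec<Candidate>, budget: usize) -> Vec<Candidate> {
    // Consider the most important files first so they claim budget first.
    files.sort_by(|a, b| b.score.cmp(&a.score));

    let mut used = 0usize;
    let mut kept = Vec::new();
    for f in files {
        // saturating_add avoids overflow on very large sizes.
        let next = used.saturating_add(f.size);
        if next <= budget {
            used = next;
            kept.push(f);
        }
        // Assumption: a file that does not fit is skipped, and smaller,
        // less important files may still be added afterwards.
    }

    // Restore the "most important last" output order.
    kept.reverse();
    kept
}
```

With a budget of 100 and three files of sizes 60, 50, and 30 (scores 10, 5, and 1), the sketch keeps the 60 and 30 files and drops the 50, because the 50 no longer fits once the most important file has been placed.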
Process multiple directories:
yek src/ tests/
CLI Reference
yek --help
Usage: yek [OPTIONS] [input-dirs]...
Arguments:
[input-dirs]...
Options:
--no-config
--config-file <CONFIG_FILE>
--max-size <MAX_SIZE> [default: 10MB]
--tokens <TOKENS>
--json
--debug
--output-dir [<OUTPUT_DIR>]
--output-template <OUTPUT_TEMPLATE> [default: ">>>> FILE_PATH\nFILE_CONTENT"]
--ignore-patterns <IGNORE_PATTERNS>...
--unignore-patterns <UNIGNORE_PATTERNS>...
-h, --help Print help
Configuration File
You can place a file called yek.yaml at your project root or pass a custom path via --config-file. The configuration file allows you to:
- Add custom ignore patterns
- Define file priority rules for processing order
- Add additional binary file extensions to ignore (extends the built-in list)
- Configure Git-based priority boost
- Define output directory
- Define output template
Example yek.yaml
You can also use yek.toml or yek.json instead of yek.yaml.
This is optional; you can place the yek.yaml file at the root of your project.
# Add patterns to ignore (in addition to .gitignore)
ignore_patterns:
- "ai-prompts/**"
- "__generated__/**"
# Configure Git-based priority boost (optional)
git_boost_max: 50 # Maximum score boost based on Git history (default: 100)
# Define priority rules for processing order
# Higher scores are processed first
priority_rules:
- score: 100
pattern: "^src/lib/"
- score: 90
pattern: "^src/"
- score: 80
pattern: "^docs/"
# Add additional binary file extensions to ignore
# These extend the built-in list (.jpg, .png, .exe, etc.)
binary_extensions:
- ".blend" # Blender files
- ".fbx" # 3D model files
- ".max" # 3ds Max files
- ".psd" # Photoshop files
# Define output directory
output_dir: /tmp/yek
# Define output template.
# FILE_PATH and FILE_CONTENT are expected to be present in the template.
output_template: "{{{FILE_PATH}}}\n\nFILE_CONTENT"
All configuration keys are optional. By default:
- No extra ignore patterns, only the ones from .gitignore are used.
- All files have equal priority (score: 1)
- Git-based priority boost maximum is 100
- Common binary file extensions are ignored (.jpg, .png, .exe, etc. - see source for full list)
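To make the interaction between priority_rules and git_boost_max more concrete, here is a rough sketch. It rests on several assumptions that the text above does not confirm: that the scores of all matching rules are summed, that the Git boost is simply capped at git_boost_max, and that only patterns of the form `^literal-prefix` need handling (real patterns are regular expressions). The default baseline score of 1 is not modeled. `PriorityRule`, `rule_score`, and `total_score` are hypothetical names.

```rust
/// One entry of `priority_rules` (hypothetical in-memory form).
struct PriorityRule {
    score: i32,
    pattern: &'static str,
}

/// Sum the scores of all rules whose pattern matches `path`.
/// Simplification: only "^prefix" patterns are understood; anything
/// else never matches in this sketch.
fn rule_score(path: &str, rules: &[PriorityRule]) -> i32 {
    rules
        .iter()
        .filter(|r| {
            r.pattern
                .strip_prefix('^')
                .map_or(false, |prefix| path.starts_with(prefix))
        })
        .map(|r| r.score)
        .sum()
}

/// Combine the rule score with a Git-derived boost capped at `git_boost_max`.
fn total_score(path: &str, rules: &[PriorityRule], git_boost: i32, git_boost_max: i32) -> i32 {
    // Cap the boost from above, then keep it non-negative.
    let capped = git_boost.min(git_boost_max).max(0);
    rule_score(path, rules) + capped
}
```

Under these assumptions and the example rules above, src/lib/a.rs matches both `^src/lib/` and `^src/` and scores 190, while a file with a raw Git boost of 70 and git_boost_max: 50 only receives 50 extra points.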
Performance
yek is fast. It's written in Rust and does many things in parallel to speed up processing.
Here is a benchmark comparing it to Repomix serializing the Next.js project:
time yek
Executed in 5.19 secs fish external
usr time 2.85 secs 54.00 micros 2.85 secs
sys time 6.31 secs 629.00 micros 6.31 secs
time repomix
Executed in 22.24 mins fish external
usr time 21.99 mins 0.18 millis 21.99 mins
sys time 0.23 mins 1.72 millis 0.23 mins
yek is more than 230x faster than repomix in this run (22.24 min versus 5.19 s of wall-clock time).
Roadmap
See proposed features. I am open to accepting new feature requests. Please write a detailed proposal to discuss new features.
Alternatives
- Repomix: A tool to serialize a repository into a single file in a similar way to yek.
- Aider: A full IDE-like experience for coding using AI
License
[^1]: yek is not "blazingly" fast. It's just fast, as fast as your computer can be.