#file #directory #file-content #concatenate #utility #cli-file #file-extension

app dirscribe

A CLI tool that combines contents of files with specific extensions from a directory

10 releases (stable)

1.1.3 Feb 1, 2025
1.1.2 Jan 29, 2025
0.5.0-alpha.1 Jan 11, 2025
0.4.0-alpha.1 Dec 7, 2024
0.1.0 Dec 3, 2024

#106 in Filesystem


1,064 downloads per month

MIT license

73KB
1.5K SLoC

dirscribe

A CLI tool that collects and combines files with specific extensions from a directory into a single output. The output is copied to the clipboard by default.

Features and Options

  • Recursively traverse directory and filter by file extension
  • Automatically apply .gitignore rules
  • Configure subpaths to include or exclude
  • Filter by positive and/or negative keyword filters
  • Output only diffs, either between two commit IDs or from a specified commit ID to the current state
  • Embed output in prompt template
  • Write output to file
  • Create summaries of file contents using LLM APIs
  • Save summaries as comments on top of files
  • Retrieve summaries from files with summaries added to them

Installation

cargo install dirscribe

Usage

Basic syntax:

dirscribe <comma_separated_suffixes_or_file_names_or_wildcard> [options]

Examples:

dirscribe md,py,Dockerfile
dirscribe "*"
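As a rough illustration of how such a comma-separated argument could select files, here is a minimal Rust sketch. The helper name `matches_spec` and the exact matching rules (a dot-prefixed suffix match, an exact file-name match, or the `*` wildcard) are assumptions made for this example and are not taken from dirscribe's source.

```rust
/// Hypothetical helper (not part of dirscribe): returns true if `file_name`
/// is selected by a comma-separated spec such as "md,py,Dockerfile" or "*".
/// Assumed rules: "*" matches everything, an entry matches a file with that
/// exact name, or a file whose name ends in ".<entry>".
fn matches_spec(file_name: &str, spec: &str) -> bool {
    spec.split(',')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .any(|entry| {
            entry == "*"
                || file_name == entry
                || file_name.ends_with(format!(".{}", entry).as_str())
        })
}

fn main() {
    for name in ["README.md", "main.py", "Dockerfile", "lib.rs"] {
        println!("{}: {}", name, matches_spec(name, "md,py,Dockerfile"));
    }
}
```

Under these assumed rules, `Dockerfile` is selected by its full name while `README.md` and `main.py` are selected by suffix.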

Demo (on YouTube)

Video showing how to use dirscribe

Options

'Deterministic' Processing options

  • --exclude-paths: Comma-separated paths to exclude
  • --include-paths: Comma-separated paths to include
  • --or-keywords: Only include files containing at least one of these keywords
  • --and-keywords: Only include files containing all of these keywords
  • --exclude-keywords: Exclude files containing any of these keywords
  • --diff-only: Only process files that have Git changes
  • --start-commit-id: Starting commit ID for the Git diff range (optional). If provided without --end-commit-id, the diff runs from this commit to the current state of the working directory
  • --end-commit-id: Ending commit ID for the Git diff range (optional). Must be used together with --start-commit-id
  • --prompt-template-path: Path to a template file that will wrap the output. The template must contain the placeholder ${${CONTENT}$}$ where the collected content should be inserted
  • --output-path: Path where the output file should be written. If not provided, output will be copied to clipboard
  • --dont-use-gitignore: Include files that .gitignore would otherwise exclude
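The three keyword options can be read as one predicate over a file's contents. The sketch below is a hedged model of that reading, not dirscribe's implementation: the `KeywordFilter` type is hypothetical, and both the case-sensitive substring matching and the way the three options combine when used together are assumptions.

```rust
/// Hypothetical model of --or-keywords, --and-keywords and --exclude-keywords.
/// Assumptions: matching is a case-sensitive substring test, an empty list
/// imposes no constraint, and all three conditions must hold together.
struct KeywordFilter<'a> {
    or_keywords: &'a [&'a str],
    and_keywords: &'a [&'a str],
    exclude_keywords: &'a [&'a str],
}

impl KeywordFilter<'_> {
    /// Returns true if a file with this content would be kept.
    fn keeps(&self, content: &str) -> bool {
        // At least one OR keyword must appear (if any were given).
        let or_ok = self.or_keywords.is_empty()
            || self.or_keywords.iter().any(|k| content.contains(*k));
        // Every AND keyword must appear.
        let and_ok = self.and_keywords.iter().all(|k| content.contains(*k));
        // No EXCLUDE keyword may appear.
        let excluded = self.exclude_keywords.iter().any(|k| content.contains(*k));
        or_ok && and_ok && !excluded
    }
}

fn main() {
    let filter = KeywordFilter {
        or_keywords: &["TODO", "FIXME"],
        and_keywords: &[],
        exclude_keywords: &["generated"],
    };
    for content in ["// TODO: refactor", "fn main() {}", "// TODO in generated code"] {
        println!("{:?} -> keep: {}", content, filter.keeps(content));
    }
}
```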

LLM-based options

  • --summarize: Pass either file contents or file diffs to an LLM for summarization
  • --summarize-keywords: Pass either file contents or file diffs to an LLM for summarization, and extract the classes, functions and methods that are defined or used
  • --apply: Write the LLM-generated summaries as multiline comments at the top of each file, so that they do not have to be generated again
  • --retrieve: Retrieve summaries from files to which they were previously written with --apply

Example with Diff Only

# Example using Git commit range
dirscribe rs,md \
  --diff-only \
  --start-commit-id abc123 \
  --end-commit-id def456

This will only process files that changed between commits abc123 and def456.

Example with Summarize

dirscribe rs,md --summarize --apply

This will pass each discovered file to the Anthropic, DeepSeek or Gemini API, or to a locally running Ollama endpoint. The provider is selected with the environment variable DIRSCRIBE_PROVIDER, which can be set to anthropic, deepseek, gemini or ollama.

For each non-local provider, PROVIDER_API_KEY needs to be set.

The model used can be specified using DIRSCRIBE_MODEL.

The number of concurrent requests used can be set using DIRSCRIBE_CONCURRENT_REQUESTS.
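To make the role of these variables concrete, the following Rust sketch reads them into a small struct. It is an illustration only: the `LlmConfig` type, the `load_config` function, the rejection of unknown provider names, and the treatment of the model and concurrency settings as optional are all assumptions, and the API key is deliberately left out. The lookup is passed in as a closure so the logic can be exercised without touching the real environment.

```rust
use std::env;

/// Hypothetical configuration holder; not dirscribe's actual type.
#[derive(Debug, PartialEq)]
struct LlmConfig {
    provider: String,
    model: Option<String>,
    concurrent_requests: Option<usize>,
}

/// Builds a config from any variable lookup function.
/// Assumption: DIRSCRIBE_PROVIDER is required, the other two are optional.
fn load_config(get: impl Fn(&str) -> Option<String>) -> Result<LlmConfig, String> {
    let provider = get("DIRSCRIBE_PROVIDER")
        .ok_or_else(|| "DIRSCRIBE_PROVIDER is not set".to_string())?;
    // Provider names listed in the README.
    if !["anthropic", "deepseek", "gemini", "ollama"].contains(&provider.as_str()) {
        return Err(format!("unknown provider: {}", provider));
    }
    let concurrent_requests = match get("DIRSCRIBE_CONCURRENT_REQUESTS") {
        Some(raw) => Some(
            raw.parse::<usize>()
                .map_err(|e| format!("invalid DIRSCRIBE_CONCURRENT_REQUESTS: {}", e))?,
        ),
        None => None,
    };
    Ok(LlmConfig {
        provider,
        model: get("DIRSCRIBE_MODEL"),
        concurrent_requests,
    })
}

fn main() {
    // Read from the real process environment.
    match load_config(|name| env::var(name).ok()) {
        Ok(cfg) => println!("{:?}", cfg),
        Err(msg) => eprintln!("configuration error: {}", msg),
    }
}
```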

Example with Prompt Template

dirscribe rs,md \
  --exclude-paths src/core,src/temp \
  --or-keywords "TODO,FIXME" \
  --prompt-template-path "summarize-issues-to-address-prompt.txt"

Output Format

The output is in this format:

File Paths:
/path/to/file1.txt
/path/to/file2.md

File Contents:
File: /path/to/file1.txt
[Contents of file1.txt]

File: /path/to/file2.md
[Contents of file2.md]

If a prompt template path is specified, this output will be embedded in that template for the final output.
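The layout above can be reproduced with a few lines of Rust. This is a sketch under assumptions: `render_output` is a hypothetical function, and the exact placement of blank lines is inferred from the sample rather than from dirscribe's source.

```rust
/// Hypothetical renderer: takes (path, content) pairs and produces the
/// "File Paths" / "File Contents" layout shown in the README sample.
fn render_output(files: &[(&str, &str)]) -> String {
    let mut out = String::from("File Paths:\n");
    for (path, _) in files {
        out.push_str(path);
        out.push('\n');
    }
    out.push_str("\nFile Contents:\n");
    // One block per file, separated by a blank line.
    let blocks: Vec<String> = files
        .iter()
        .map(|(path, content)| format!("File: {}\n{}\n", path, content))
        .collect();
    out.push_str(&blocks.join("\n"));
    out
}

fn main() {
    let files = [
        ("/path/to/file1.txt", "[Contents of file1.txt]"),
        ("/path/to/file2.md", "[Contents of file2.md]"),
    ];
    print!("{}", render_output(&files));
}
```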

Template

You can specify a template to embed the output in. The template should be a txt file that contains the string "${${CONTENT}$}$" (without quotation marks), and that string will be replaced with the output as shown above.
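The substitution itself amounts to a string replacement. The Rust sketch below shows one way it could work; `apply_template` is a hypothetical name, and both the error on a missing placeholder and the replacement of every occurrence (rather than only the first) are assumptions.

```rust
/// The placeholder string documented in the README.
const PLACEHOLDER: &str = "${${CONTENT}$}$";

/// Hypothetical helper: embeds `content` in `template`.
/// Assumption: a template without the placeholder is treated as an error.
fn apply_template(template: &str, content: &str) -> Result<String, String> {
    if !template.contains(PLACEHOLDER) {
        return Err(format!("template is missing the placeholder {}", PLACEHOLDER));
    }
    // `str::replace` substitutes every occurrence of the placeholder.
    Ok(template.replace(PLACEHOLDER, content))
}

fn main() {
    let template = "Summarize the issues in these files:\n\n${${CONTENT}$}$\n";
    match apply_template(template, "File Paths:\n/path/to/file1.txt\n") {
        Ok(prompt) => print!("{}", prompt),
        Err(msg) => eprintln!("{}", msg),
    }
}
```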

License

MIT License

Dependencies

~20–34MB
~570K SLoC