2 releases
0.1.1 | Jun 18, 2024 |
---|---|
0.1.0 | Jun 18, 2024 |
esdump-rs
Dump Elasticsearch or OpenSearch indexes to blob storage, really-really fast 🚀
Features:
- Super-duper fast
- Supports compressing output with zstd or gzip
- Natively supports blob storage on AWS, Google Cloud and Azure
- Supports filtering and selecting specific fields
- Detailed progress output and logging
- Comes as a single, small static binary or a Docker image
- Runs on Windows, Linux or macOS
- Written in Rust 🦀
Installation
Releases: Grab a pre-built executable from the releases page
Docker: docker run ghcr.io/gitguardian/esdump-rs:v0.1.0
Usage
Pass the Elasticsearch or OpenSearch HTTP(S) URL and a blob storage URL. Set the credentials in the environment (see example.env), then run:
$ esdump-rs http://localhost:9200 s3://es-dump/test/ \
--index=test-index \
--batches-per-file=5 \
--batch-size=5000 \
--concurrency=10
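As a rough sizing aid for the command above, the two batching flags multiply: each output file holds up to `batch-size × batches-per-file` records. The sketch below works through that arithmetic. The `limit` value is hypothetical, and the file count assumes every file is filled completely before the next one starts, which the tool may not guarantee when several tasks write concurrently.

```sh
#!/bin/sh
# Values taken from the example invocation above.
batch_size=5000
batches_per_file=5

# Upper bound on the number of records written to a single file.
records_per_file=$((batch_size * batches_per_file))
echo "records per file: $records_per_file"

# Hypothetical total record count (for example, a value passed to --limit).
limit=1000000

# Ceiling division: minimum number of files needed to hold `limit` records,
# assuming each file is filled to capacity (an assumption, not tool behaviour).
min_files=$(( (limit + records_per_file - 1) / records_per_file ))
echo "at least $min_files files"
```

With these values each file holds at most 25000 records, so a million records need at least 40 files.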
Settings such as the batch size and concurrency are set as flags:
Usage: esdump-rs [OPTIONS] --index <INDEX> --concurrency <CONCURRENCY> --batch-size <BATCH_SIZE> --batches-per-file <BATCHES_PER_FILE> <ELASTICSEARCH_URL> <OUTPUT_LOCATION>
Arguments:
<ELASTICSEARCH_URL> Elasticsearch cluster to dump
<OUTPUT_LOCATION> Location to write results. Can be a file://, s3:// or gs:// URL
Options:
-i, --index <INDEX>
Index to dump
-c, --concurrency <CONCURRENCY>
Number of concurrent requests to use
-l, --limit <LIMIT>
Limit the total number of records returned
-b, --batch-size <BATCH_SIZE>
Number of records in each batch
--batches-per-file <BATCHES_PER_FILE>
Number of batches to write per file
-q, --query <QUERY>
A file path containing a query to execute while dumping
-f, --field <FIELD>
Specific fields to fetch
--compression <COMPRESSION>
Compress the output files [default: zstd] [possible values: gzip, zstd]
--concurrent-uploads <CONCURRENT_UPLOADS>
Max chunks to concurrently upload *per task*
--upload-size <UPLOAD_SIZE>
Size of each uploaded [default: 15MB]
-d, --distribution <DISTRIBUTION>
Distribution of the cluster [possible values: elasticsearch, opensearch]
--env-file <ENV_FILE>
Distribution of the cluster [default: .env]
-h, --help
Print help
-V, --version
Print version
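The options above can be combined into a filtered dump. The following is a minimal sketch rather than a tested recipe: it assumes the `--query` file holds a standard Elasticsearch query DSL body (whether the top-level `query` key is expected is not confirmed here), and the `@timestamp` and `message` field names, the index name, and the scratch directory are all hypothetical. It only prints the command so it can be inspected before running.

```sh
#!/bin/sh
set -eu

# Hypothetical scratch directory for the query file and the local dump.
work_dir=$(mktemp -d)
query_file="$work_dir/query.json"

# Assumption: the query file contains a query DSL body.
# The field name "@timestamp" is illustrative only.
cat > "$query_file" <<'EOF'
{
  "query": {
    "range": { "@timestamp": { "gte": "now-7d/d" } }
  }
}
EOF

# Build the argument list using only flags listed in the help text.
set -- \
  http://localhost:9200 \
  "file://$work_dir/dump/" \
  --index=test-index \
  --concurrency=4 \
  --batch-size=1000 \
  --batches-per-file=10 \
  --query="$query_file" \
  --field=message \
  --compression=gzip \
  --limit=50000

# Dry run: show the command instead of executing it.
dump_cmd="esdump-rs $*"
echo "$dump_cmd"

# To run the dump for real, replace the echo with:
#   esdump-rs "$@"
```

Printing first makes it easy to check the query path and output location before any requests reach the cluster.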