app csv-splitter

Splits CSV files into chunks by line count. Supports chunk upload to RabbitMQ.

3 unstable releases

0.2.1 Sep 24, 2023
0.1.1 Sep 23, 2023
0.1.0 Sep 23, 2023

MIT and LGPL-3.0

MIT License Rust

Super Fast CSV Splitter

This is a Rust program that splits CSV files into multiple smaller files (chunks), each holding at most a given number of lines. It is designed to be as fast as possible.

This is also a work in progress, so your mileage may vary.

Installation

$ cargo install csv-splitter

or:

$ git clone git@github.com:patterns-complexity/csv-splitter
$ cd csv-splitter
$ cargo build --release
$ cd ./target/release/
$ ./csv-splitter -h

Usage and help

$ csv-splitter -i <input_directory> -o <output_directory> -g <granularity>

or:

$ csv-splitter --input <input_directory> --output <output_directory> --granularity <granularity>

or (with AMQP):

$ csv-splitter -i <input_directory> -o <output_directory> -g <granularity> -q <queue_name> -u <amqp_url>

or (with AMQP, long form):

$ csv-splitter --input <input_directory> --output <output_directory> --granularity <granularity> --queue_name <queue_name> --amqp_url <amqp_url>

help:

$ csv-splitter -h

Arguments

Short form  Long form      Description                                                     Required
-h          --help         Prints help information                                         No
-i          --input        The directory containing the CSV files to split                 Yes
-o          --output       The directory to output the chunks to                           Yes
-g          --granularity  Chunk line count                                                Yes
-q          --queue_name   AMQP queue name, needs -u if supplied                           No
-u          --amqp_url     The URI of the AMQP broker (amqp://user:password@host:port)     No
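
The rule that -q needs -u can be pictured with a small check like the one below. This is only a sketch, not the tool's actual argument handling: the function name `amqp_target` is hypothetical, and treating a missing queue name as "no upload" is an assumption, since the table only documents the -q case.

```rust
/// Hypothetical helper (not part of csv-splitter): decides whether an AMQP
/// upload is requested, given the optional -q and -u values.
///
/// Returns Ok(Some((queue, url))) when both are present, Ok(None) when no
/// upload is requested, and Err when -q is given without -u.
fn amqp_target<'a>(
    queue_name: Option<&'a str>,
    amqp_url: Option<&'a str>,
) -> Result<Option<(&'a str, &'a str)>, String> {
    match (queue_name, amqp_url) {
        // Both supplied: chunks can be uploaded to this queue.
        (Some(queue), Some(url)) => Ok(Some((queue, url))),
        // Documented rule: a queue name without a URL is an error.
        (Some(_), None) => Err("-q/--queue_name requires -u/--amqp_url".to_string()),
        // Assumption: without a queue name there is nothing to upload to.
        (None, _) => Ok(None),
    }
}
```

With the values from the AMQP example, `amqp_target(Some("my_queue"), Some("amqp://user:password@host:port"))` yields the queue and URL pair, while `amqp_target(Some("my_queue"), None)` yields an error.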

Example

$ csv-splitter -i ./input -o ./output -g 500
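
To illustrate what the granularity value means, here is a minimal sketch of chunking by line count. It is not the tool's implementation: the function name `split_by_line_count` is hypothetical, it works on any in-memory or file reader rather than on directories, and it assumes one record per line (quoted fields containing newlines and any special header handling are not considered).

```rust
use std::io::{self, BufRead};

/// Hypothetical helper (not part of csv-splitter): groups the lines of
/// `reader` into chunks of at most `granularity` lines each.
/// Every chunk is returned as one string with newline-terminated lines.
fn split_by_line_count<R: BufRead>(reader: R, granularity: usize) -> io::Result<Vec<String>> {
    if granularity == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "granularity must be at least 1",
        ));
    }

    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut lines_in_current = 0;

    for line in reader.lines() {
        // `lines()` strips the line ending, so add a newline back.
        current.push_str(&line?);
        current.push('\n');
        lines_in_current += 1;

        // The chunk is full: store it and start a new one.
        if lines_in_current == granularity {
            chunks.push(std::mem::take(&mut current));
            lines_in_current = 0;
        }
    }

    // Keep the final, possibly shorter, chunk.
    if lines_in_current > 0 {
        chunks.push(current);
    }
    Ok(chunks)
}
```

Under these assumptions, a 1,200-line input with a granularity of 500 would produce three chunks of 500, 500, and 200 lines.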

Example with AMQP

$ csv-splitter -i ./input -o ./output -g 500 -q my_queue -u amqp://user:password@host:port

License

MIT License

Author

Patterns Complexity
