5 releases

0.2.15 Oct 9, 2024
0.2.14 Sep 6, 2024
0.2.13 Aug 15, 2024
0.2.12 Jul 25, 2024
0.1.5 Mar 27, 2024

#1342 in Parser implementations

200 downloads per month

Custom license

105KB
269 lines

Convert CSV To Parquet (CC2P)

CC2P is a Rust-based command-line tool that converts the CSV files in a selected folder into Parquet format. It provides a simple and efficient way to convert CSV data files.

Installation & Usage

Prerequisites

  • Rust 1.75

Building

Building requires the Rust compiler and Cargo. Cargo downloads and compiles the necessary crates automatically.

Install cc2p with Cargo:

cargo install cc2p

Running

Pass the path of the CSV file (or a pattern matching several CSV files) to convert. If no path is given, cc2p searches for *.csv.

cc2p [OPTIONS] /path/to/csv/file.csv

Options:

  • delimiter: delimiter character used in the CSV files (default: ,)
  • no-header: treat the CSV files as having no header row (default: false)
  • worker: number of worker threads used for the conversion (default: 1)
  • sampling: number of rows sampled to infer the schema (default: 100)

The full help output:

> cc2p --help

Convert a CSV to parquet file format

Usage: cc2p.exe [OPTIONS] [PATH]

Arguments:
  [PATH]  Represents the folder path for CSV search [default: *.csv]

Options:
  -d, --delimiter <DELIMITER>  Represents the delimiter used in CSV files [default: ,]
  -n, --no-header              Represents whether to include the header in the CSV search column
  -w, --worker <WORKER>        Number of worker threads to use for performing the task [default: 1]
  -s, --sampling <SAMPLING>    Number of rows to sample for inferring the schema. [default: 100]
  -h, --help                   Print help
  -V, --version                Print version
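
The --sampling, --delimiter and --no-header options all affect how column types are guessed before conversion. The sketch below illustrates the general idea of sample-based schema inference using only the standard library. It is not cc2p's actual implementation: the names `ColumnType`, `infer_cell`, `widen` and `infer_schema` are hypothetical, only three types are modelled, and the naive split does not handle quoted fields.

```rust
/// Column types this sketch can infer (an assumption; a real converter
/// would map to the much richer set of Parquet/Arrow types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    Int,
    Float,
    Text,
}

/// Infer the narrowest type that fits a single cell.
fn infer_cell(cell: &str) -> ColumnType {
    let cell = cell.trim();
    if cell.parse::<i64>().is_ok() {
        ColumnType::Int
    } else if cell.parse::<f64>().is_ok() {
        ColumnType::Float
    } else {
        ColumnType::Text
    }
}

/// Widen two types to the narrowest type that can hold both.
fn widen(a: ColumnType, b: ColumnType) -> ColumnType {
    use ColumnType::*;
    match (a, b) {
        (Int, Int) => Int,
        (Text, _) | (_, Text) => Text,
        _ => Float,
    }
}

/// Infer one type per column from at most `sampling` data rows.
/// `has_header` is the inverse of the --no-header flag.
fn infer_schema(csv: &str, delimiter: char, has_header: bool, sampling: usize) -> Vec<ColumnType> {
    let mut lines = csv.lines().filter(|l| !l.trim().is_empty());
    if has_header {
        lines.next(); // skip the header row
    }
    let mut schema: Vec<ColumnType> = Vec::new();
    for line in lines.take(sampling) {
        // Naive split: quoted fields containing the delimiter are not handled.
        for (i, cell) in line.split(delimiter).enumerate() {
            let ty = infer_cell(cell);
            if i < schema.len() {
                schema[i] = widen(schema[i], ty);
            } else {
                schema.push(ty);
            }
        }
    }
    schema
}
```

In this sketch a sample that is too small can miss a later non-numeric value and infer a type that is too narrow, which is the trade-off a larger sampling value is meant to reduce.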

Features

  • Fast and reliable CSV to Parquet conversion.
  • Multithreaded processing with the help of the tokio crate.
  • Progress indication during conversion with the help of the indicatif crate.
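
To illustrate how a fixed number of workers can share a list of files, here is a minimal sketch of a worker pool. It deliberately uses `std::thread` rather than tokio so that it has no dependencies, so it should not be read as cc2p's own code. `run_with_workers` and `output_name` are hypothetical names, and `output_name` is only a stand-in for the real conversion step.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Run `job` over every item using up to `workers` threads.
/// Results are returned in the same order as the input items.
fn run_with_workers<T, R, F>(items: Vec<T>, workers: usize, job: F) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    let total = items.len();
    // Shared queue of (index, item) pairs; each worker pops until it is empty.
    let queue = Arc::new(Mutex::new(items.into_iter().enumerate().collect::<Vec<_>>()));
    let job = Arc::new(job);

    // Always start at least one worker, even if 0 was requested.
    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let job = Arc::clone(&job);
            thread::spawn(move || {
                let mut done = Vec::new();
                loop {
                    // The lock is held only while taking the next item,
                    // not while the job itself runs.
                    let next = queue.lock().unwrap().pop();
                    match next {
                        Some((i, item)) => done.push((i, job(item))),
                        None => break,
                    }
                }
                done
            })
        })
        .collect();

    // Gather every worker's results and restore the input order.
    let mut results: Vec<(usize, R)> = handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker thread panicked"))
        .collect();
    results.sort_by_key(|(i, _)| *i);
    debug_assert_eq!(results.len(), total);
    results.into_iter().map(|(_, r)| r).collect()
}

/// Hypothetical stand-in for the conversion: maps `a.csv` to `a.parquet`.
fn output_name(input: &str) -> String {
    match input.strip_suffix(".csv") {
        Some(stem) => format!("{stem}.parquet"),
        None => format!("{input}.parquet"),
    }
}
```

Calling `run_with_workers(files, 4, |f: String| output_name(&f))` would process the list with four threads. Each file is handled by exactly one worker, so a larger worker count only helps when there are several files to convert.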

Contributing

If you wish to contribute, please feel free to fork the repository, make your changes, and submit a pull request. All contributions are welcome!

License

This project is licensed under the MIT License; see the LICENSE file for details.

Contact

Project Link: https://github.com/rayyildiz/cc2p

Dependencies

~31–42MB
~806K SLoC