#csv #parquet #cli

bin+lib cc2p

Convert a CSV to parquet file format

3 unstable releases

Uses new Rust 2024

new 0.3.7 May 4, 2025
0.3.5 Mar 30, 2025
0.3.1 Dec 21, 2024
0.3.0 Nov 14, 2024
0.1.5 Mar 27, 2024

#1191 in Parser implementations


241 downloads per month

Custom license

105KB
255 lines

Convert CSV To Parquet (CC2P)


CC2P is a Rust-based command-line tool that converts the CSV files in a selected folder to Parquet format. It provides a simple and efficient way to convert your CSV data files.

Installation & Usage

Prerequisites

  • Rust 1.85 (edition: 2024)

Building

Install cc2p from crates.io with Cargo. This compiles the tool locally, so the Rust toolchain listed under Prerequisites must be installed:

cargo install cc2p

Running

Pass the path of the CSV file to convert as the argument. PATH is optional; according to the help output below, it defaults to *.csv.

cc2p [OPTIONS] /path/to/csv/file.csv

Options:

  • delimiter: delimiter character used in the CSV files (default: ,)
  • no-header: set this flag when the CSV files have no header row (default: false)
  • worker: number of worker threads to use for the conversion (default: 1)
  • sampling: number of rows to sample when inferring the schema (default: 100)

> cc2p --help

Convert a CSV to parquet file format

Usage: cc2p.exe [OPTIONS] [PATH]

Arguments:
  [PATH]  Represents the folder path for CSV search [default: *.csv]

Options:
  -d, --delimiter <DELIMITER>  Represents the delimiter used in CSV files [default: ,]
  -n, --no-header              Represents whether to include the header in the CSV search column
  -w, --worker <WORKER>        Number of worker threads to use for performing the task [default: 1]
  -s, --sampling <SAMPLING>    Number of rows to sample for inferring the schema. [default: 100]
  -h, --help                   Print help
  -V, --version                Print version
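
To illustrate what the delimiter, no-header, and sampling options control, here is a minimal sketch of sampling-based schema inference. It is not cc2p's actual implementation: the `ColumnType` enum, `infer_field`, `merge`, and `infer_schema` are hypothetical names, the type set is deliberately small, and fields are split naively on the delimiter, so quoted fields that contain the delimiter are not handled.

```rust
/// Column types this sketch can infer (hypothetical, not cc2p's own type set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnType {
    Int64,
    Float64,
    Boolean,
    Utf8,
}

/// Guess the type of a single field; an empty field gives no evidence.
fn infer_field(field: &str) -> Option<ColumnType> {
    let f = field.trim();
    if f.is_empty() {
        None
    } else if f.parse::<i64>().is_ok() {
        Some(ColumnType::Int64)
    } else if f.parse::<f64>().is_ok() {
        Some(ColumnType::Float64)
    } else if f.eq_ignore_ascii_case("true") || f.eq_ignore_ascii_case("false") {
        Some(ColumnType::Boolean)
    } else {
        Some(ColumnType::Utf8)
    }
}

/// Combine two observations for the same column into one compatible type.
fn merge(a: ColumnType, b: ColumnType) -> ColumnType {
    use ColumnType::*;
    match (a, b) {
        (x, y) if x == y => x,
        // Integers widen to floats; every other mismatch falls back to text.
        (Int64, Float64) | (Float64, Int64) => Float64,
        _ => Utf8,
    }
}

/// Infer one type per column from at most `sampling` data rows.
pub fn infer_schema(
    csv: &str,
    delimiter: char,
    has_header: bool,
    sampling: usize,
) -> Vec<ColumnType> {
    let mut types: Vec<Option<ColumnType>> = Vec::new();
    let rows = csv
        .lines()
        .skip(if has_header { 1 } else { 0 })
        .take(sampling);
    for row in rows {
        for (i, field) in row.split(delimiter).enumerate() {
            if i >= types.len() {
                types.resize(i + 1, None);
            }
            if let Some(t) = infer_field(field) {
                types[i] = Some(match types[i] {
                    Some(prev) => merge(prev, t),
                    None => t,
                });
            }
        }
    }
    // Columns with no evidence in the sample default to text.
    types
        .into_iter()
        .map(|t| t.unwrap_or(ColumnType::Utf8))
        .collect()
}
```

The sketch also shows the trade-off behind a sampling limit: rows beyond the sample are never inspected, so a small value is faster but can pick a type that later rows do not fit.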

macOS Users

Note for macOS users: the Apple signing and notarization of the binary is not complete yet, so you have to run the following command once before you can launch the application. Download the app, then run:

xattr -c cc2p

Features

  • Fast and reliable CSV to Parquet conversion.
  • Multithreaded processing with the help of the tokio crate.
  • Progress indication during conversion with the help of the indicatif crate.
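
As a rough illustration of how a fixed number of workers can share a list of files, the sketch below uses standard-library threads and a channel instead of tokio so that it stays self-contained. It is an assumption about the general pattern, not cc2p's code: `convert` is a hypothetical stand-in that only derives the output path, and `convert_all` is an invented name.

```rust
use std::path::{Path, PathBuf};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real conversion: only derives the output path.
fn convert(input: &Path) -> PathBuf {
    input.with_extension("parquet")
}

/// Process `files` with `workers` threads that pull from one shared queue.
pub fn convert_all(files: Vec<PathBuf>, workers: usize) -> Vec<PathBuf> {
    let queue = Arc::new(Mutex::new(files.into_iter()));
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let tx = tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only long enough to take the next file.
                let next = queue.lock().unwrap().next();
                match next {
                    Some(path) => tx.send(convert(&path)).unwrap(),
                    None => break,
                }
            })
        })
        .collect();
    // Drop the original sender so the channel closes when the workers finish.
    drop(tx);

    for handle in handles {
        handle.join().unwrap();
    }
    // Results arrive in completion order; sort them for a stable result.
    let mut outputs: Vec<PathBuf> = rx.into_iter().collect();
    outputs.sort();
    outputs
}
```

Pulling from a shared queue, rather than splitting the list into fixed chunks up front, keeps every worker busy even when some files take much longer to convert than others.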

Contributing

If you wish to contribute, please feel free to fork the repository, make your changes, and submit a pull request. All contributions are welcome!

License

This project is licensed under the MIT License; see the LICENSE file for details.

Contact

Project Link: https://github.com/rayyildiz/cc2p

Dependencies

~33–44MB
~831K SLoC