#csv #parquet #cli #cc2p

bin+lib cc2p

Convert a CSV to parquet file format

2 unstable releases

Uses new Rust 2024

0.3.6 Apr 8, 2025
0.3.5 Mar 30, 2025
0.3.4 Feb 19, 2025
0.3.1 Dec 21, 2024
0.1.5 Mar 27, 2024

#1742 in Parser implementations


242 downloads per month

Custom license

105KB
255 lines

Convert CSV To Parquet (CC2P)


CC2P is a Rust-based tool that converts the CSV files in a selected folder to Parquet format. It provides a simple and efficient way to convert your CSV data files.

Installation & Usage

Prerequisites

  • Rust 1.85 (edition: 2024)

Building

Install the Rust toolchain (1.85 or later) first. Cargo downloads and compiles the required crates during installation.

Install cc2p from crates.io with Cargo:

cargo install cc2p

Running

Pass the path of a CSV file, or a search pattern such as *.csv, as the argument:

cc2p [OPTIONS] /path/to/csv/file.csv

Options:

  • delimiter: delimiter character used in the CSV files (default: ,)
  • no-header: set this flag when the CSV files have no header row (default: false)
  • worker: number of worker threads to use for the conversion (default: 1 in the --help output below)
  • sampling: number of rows to sample when inferring the schema (default: 100)

> cc2p --help

Convert a CSV to parquet file format

Usage: cc2p.exe [OPTIONS] [PATH]

Arguments:
  [PATH]  Represents the folder path for CSV search [default: *.csv]

Options:
  -d, --delimiter <DELIMITER>  Represents the delimiter used in CSV files [default: ,]
  -n, --no-header              Represents whether to include the header in the CSV search column
  -w, --worker <WORKER>        Number of worker threads to use for performing the task [default: 1]
  -s, --sampling <SAMPLING>    Number of rows to sample for inferring the schema. [default: 100]
  -h, --help                   Print help
  -V, --version                Print version
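
The delimiter, no-header, and sampling options all feed schema inference, so a small illustration may help. The following is a minimal sketch using only the standard library, not cc2p's actual implementation: the names `ColType`, `classify`, `widen`, and `infer_schema` are hypothetical, the type set is an assumption, and fields are split naively on the delimiter without handling quoted values.

```rust
/// Column types this sketch distinguishes (an assumed set, not cc2p's own).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColType {
    Int,
    Float,
    Bool,
    Text,
}

/// Classify a single non-empty field by trying the narrowest type first.
fn classify(field: &str) -> ColType {
    let f = field.trim();
    if f.parse::<i64>().is_ok() {
        ColType::Int
    } else if f.parse::<f64>().is_ok() {
        ColType::Float
    } else if f.eq_ignore_ascii_case("true") || f.eq_ignore_ascii_case("false") {
        ColType::Bool
    } else {
        ColType::Text
    }
}

/// Merge two observed types into one that can represent both.
fn widen(a: ColType, b: ColType) -> ColType {
    use ColType::*;
    match (a, b) {
        (x, y) if x == y => x,
        (Int, Float) | (Float, Int) => Float,
        _ => Text,
    }
}

/// Infer one type per column from at most `sampling` data rows.
/// When `has_header` is true, the first line is skipped.
fn infer_schema(csv: &str, delimiter: char, has_header: bool, sampling: usize) -> Vec<ColType> {
    let mut schema: Vec<Option<ColType>> = Vec::new();
    let skip = if has_header { 1 } else { 0 };
    for line in csv.lines().skip(skip).filter(|l| !l.is_empty()).take(sampling) {
        for (i, field) in line.split(delimiter).enumerate() {
            if i >= schema.len() {
                schema.resize(i + 1, None);
            }
            // An empty field carries no type information, so skip it.
            if field.trim().is_empty() {
                continue;
            }
            let observed = classify(field);
            schema[i] = Some(match schema[i] {
                Some(previous) => widen(previous, observed),
                None => observed,
            });
        }
    }
    // Columns that never held a value fall back to Text.
    schema.into_iter().map(|t| t.unwrap_or(ColType::Text)).collect()
}
```

A larger sampling value examines more rows and is less likely to pick a type that a later row violates, at the cost of more work before conversion starts.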

macOS Users

NOTE for macOS users: Apple signing and notarization are not complete yet, so you need to run the following command once before you can start the application. Download the app and run:

xattr -c cc2p

Features

  • Fast and reliable CSV to Parquet conversion.
  • Multithreaded processing with the help of the tokio crate.
  • Progress indication during conversion with the help of the indicatif crate.
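
To illustrate how a fixed number of workers can share a list of files, here is a minimal sketch. It deliberately uses standard-library threads and a channel instead of the tokio crate that cc2p uses, so it shows the general worker-pool idea rather than the project's code. The function name `run_with_workers` and its signature are hypothetical.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Run `job` on every item using up to `workers` threads and
/// return the results in the same order as the input.
fn run_with_workers<T, R, F>(items: Vec<T>, workers: usize, job: F) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    let total = items.len();
    // Shared queue of (index, item) pairs; each worker pops until it is empty.
    let queue = Arc::new(Mutex::new(items.into_iter().enumerate().collect::<Vec<_>>()));
    let job = Arc::new(job);
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let job = Arc::clone(&job);
            let tx = tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only long enough to take one item.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some((idx, item)) => tx.send((idx, job(item))).unwrap(),
                    None => break,
                }
            })
        })
        .collect();
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);

    let mut results: Vec<Option<R>> = (0..total).map(|_| None).collect();
    for (idx, result) in rx {
        results[idx] = Some(result);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    results
        .into_iter()
        .map(|r| r.expect("every item produces a result"))
        .collect()
}
```

In a converter, `items` would be the CSV paths found by the search and `job` the per-file conversion; the received results are also a natural place to advance a progress bar.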

Contributing

If you wish to contribute, please feel free to fork the repository, make your changes, and submit a pull request. All contributions are welcome!

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

Project Link: https://github.com/rayyildiz/cc2p

Dependencies

~32–44MB
~819K SLoC