18 releases (7 breaking)

0.8.1 Mar 1, 2023
0.7.2 Feb 2, 2023
0.6.0 Oct 2, 2022
0.4.0 May 19, 2022
0.2.1 Mar 5, 2021

MIT/Apache

CSV to Arrow

Convert CSV files to Apache Arrow. This package is part of Arrow CLI tools.

Installation

Download prebuilt binaries

You can get the latest releases from https://github.com/domoritz/arrow-tools/releases.

With Cargo

cargo install csv2arrow

With Cargo B(inary)Install

To avoid re-compilation and speed up installation, you can install this tool with cargo binstall:

cargo binstall csv2arrow

Usage

Usage: csv2arrow [OPTIONS] <CSV> [ARROW]

Arguments:
  <CSV>    Input CSV file
  [ARROW]  Output file, stdout if not present

Options:
  -s, --schema-file <SCHEMA_FILE>
          File with Arrow schema in JSON format
  -m, --max-read-records <MAX_READ_RECORDS>
          The number of records to infer the schema from. All rows if not present. Setting max-read-records to zero will stop schema inference and all columns will be string typed
      --header <HEADER>
          Set whether the CSV file has headers [possible values: true, false]
  -d, --delimiter <DELIMITER>
          Set the CSV file's column delimiter as a byte character [default: ,]
  -p, --print-schema
          Print the schema to stderr
  -n, --dry
          Only print the schema
  -h, --help
          Print help information
  -V, --version
          Print version information
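
As a minimal sketch of how the options above fit together, the following script builds a tiny CSV and converts it. The file names and sample data are hypothetical, and the conversion steps are guarded so they only run when csv2arrow is on the PATH. Only flags listed in the help text are used.

```shell
# Create a small sample CSV (hypothetical file name and data).
printf 'id,name,score\n1,alice,9.5\n2,bob,7.25\n' > sample.csv

# Run the conversion only if csv2arrow is installed.
if command -v csv2arrow >/dev/null 2>&1; then
  # Infer the schema from at most 100 records and only print it (no conversion).
  csv2arrow --dry --max-read-records 100 sample.csv

  # Convert to an Arrow file, stating the header and delimiter explicitly
  # and also printing the inferred schema.
  csv2arrow --header true --delimiter ',' --print-schema sample.csv sample.arrow

  # With no [ARROW] argument the output goes to stdout, so it can be redirected.
  csv2arrow sample.csv > sample_stdout.arrow
fi
```

Passing --max-read-records 0 instead would skip inference and type every column as a string, as described in the option's help text.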

The --schema-file option uses the same file format as --dry and --print-schema.
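
Because the formats match, a schema printed once can in principle be saved and reused. The sketch below assumes that --dry writes the schema JSON to stdout, which the help text does not state (it only says --print-schema goes to stderr), so the script checks that the captured file is non-empty before using it. The file names are hypothetical.

```shell
# Create a small sample CSV (hypothetical file name and data).
printf 'id,name\n1,alice\n2,bob\n' > people.csv

if command -v csv2arrow >/dev/null 2>&1; then
  # Assumption: --dry prints the schema JSON on stdout.
  csv2arrow --dry people.csv > schema.json

  if [ -s schema.json ]; then
    # Reuse the saved schema instead of inferring it again.
    csv2arrow --schema-file schema.json people.csv people.arrow
  else
    # Nothing was captured, so the stdout assumption did not hold.
    echo "schema.json is empty; the schema may have been printed to stderr" >&2
  fi
fi
```

Saving the schema this way also gives a starting point for hand-editing column types before running the conversion.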
