15 releases (7 breaking)

Version | Released |
---|---|
0.8.1 | Mar 1, 2023 |
0.7.2 | Feb 2, 2023 |
0.6.0 | Oct 2, 2022 |
0.4.0 | May 19, 2022 |
0.2.2 | Mar 5, 2021 |
JSON to Arrow
Convert JSON files to Apache Arrow. This package is part of Arrow CLI tools.
Installation
Download prebuilt binaries
You can get the latest releases from https://github.com/domoritz/arrow-tools/releases.
With Cargo
cargo install json2arrow
With Cargo B(inary)Install
To avoid re-compilation and speed up installation, you can install this tool with cargo binstall:
cargo binstall json2arrow
Usage
Usage: json2arrow [OPTIONS] <JSON> [ARROW]

Arguments:
  <JSON>   Input JSON file
  [ARROW]  Output file, stdout if not present

Options:
  -s, --schema-file <SCHEMA_FILE>
          File with Arrow schema in JSON format
  -m, --max-read-records <MAX_READ_RECORDS>
          The number of records to infer the schema from. All rows if not present. Setting max-read-records to zero will stop schema inference and all columns will be string typed
  -p, --print-schema
          Print the schema to stderr
  -n, --dry
          Only print the schema
  -h, --help
          Print help information
  -V, --version
          Print version information
The --schema-file option expects the same file format that --dry and --print-schema print.
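To show how the options above combine on one command line, here is a minimal sketch that assembles a json2arrow invocation from a script. The helper names `build_json2arrow_command` and `run_json2arrow` and the file names `data.json` and `data.arrow` are hypothetical; only the flags listed in the usage text are used.

```python
import subprocess
from typing import List, Optional


def build_json2arrow_command(
    json_path: str,
    arrow_path: Optional[str] = None,
    schema_file: Optional[str] = None,
    max_read_records: Optional[int] = None,
    print_schema: bool = False,
    dry: bool = False,
) -> List[str]:
    """Assemble an argument list from the flags documented in the usage text."""
    cmd = ["json2arrow"]
    if schema_file is not None:
        cmd += ["--schema-file", schema_file]
    if max_read_records is not None:
        cmd += ["--max-read-records", str(max_read_records)]
    if print_schema:
        cmd.append("--print-schema")
    if dry:
        cmd.append("--dry")
    cmd.append(json_path)            # required <JSON> argument
    if arrow_path is not None:
        cmd.append(arrow_path)       # optional [ARROW] argument
    return cmd


def run_json2arrow(cmd: List[str]) -> None:
    """Run the command; assumes the json2arrow binary is on PATH."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the argument order.
    cmd = build_json2arrow_command(
        "data.json", "data.arrow", max_read_records=100, print_schema=True
    )
    print(" ".join(cmd))
    # prints: json2arrow --max-read-records 100 --print-schema data.json data.arrow
```

Building the argument list separately from running it keeps the command easy to inspect before anything is executed.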
Limitations
Since we use the Arrow JSON loader, we are limited to what it supports. Right now, it supports only line-delimited JSON files, with one JSON object per line:
{ "a": 42, "b": true }
{ "a": 12, "b": false }
{ "a": 7, "b": true }
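If your data is stored as a single top-level JSON array rather than one object per line, it needs to be rewritten before conversion. The following is a minimal sketch of that preprocessing step using only the Python standard library; it is not part of json2arrow, and the function names `to_json_lines` and `convert_array_file` are hypothetical.

```python
import json
from typing import Any, Iterable


def to_json_lines(records: Iterable[Any]) -> str:
    """Serialize each record as compact JSON on its own line."""
    return "".join(
        json.dumps(record, separators=(",", ":")) + "\n" for record in records
    )


def convert_array_file(src_path: str, dst_path: str) -> int:
    """Rewrite a file holding one JSON array as line-delimited JSON.

    Returns the number of records written. The whole array is loaded into
    memory, so this is only suitable for files that fit in RAM.
    """
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(to_json_lines(records))
    return len(records)
```

The resulting file can then be passed to json2arrow as the `<JSON>` argument.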