28 releases (14 stable)

5.4.0 Sep 15, 2021
5.0.0 Jul 17, 2021
4.4.0 Jun 28, 2021
3.0.0 Jan 26, 2021
0.3.0 May 5, 2018

#44 in Database interfaces

26,988 downloads per month
Used in 29 crates (15 directly)

Apache-2.0

4MB
88K SLoC

Apache Parquet Official Native Rust Implementation

This crate contains the official Native Rust implementation of Apache Parquet, which is part of the Apache Arrow project.

Example

Example usage of reading data:

use std::fs::File;
use std::path::Path;
use parquet::file::reader::{FileReader, SerializedFileReader};

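// Open the Parquet file and wrap it in a serialized file reader.
let file = File::open(&Path::new("/path/to/file")).unwrap();
let reader = SerializedFileReader::new(file).unwrap();
// Passing `None` as the projection reads every column.
let iter = reader.get_row_iter(None).unwrap();
// Print each row record in turn.
for record in iter {
    println!("{}", record);
}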

See the crate documentation for the available API.

Rust Version Compatibility

This crate is tested with the latest stable version of Rust. We do not currently test against older versions of the Rust compiler.

Upgrading from versions prior to 4.0

If you are upgrading from version 3.0 or earlier of this crate, you will likely need to change your code to use [ConvertedType] rather than [LogicalType] to preserve its existing behaviour.

Version 2.4.0 of the Parquet format introduced a LogicalType to replace the existing ConvertedType. Before version 4.0, this crate used the name parquet::basic::LogicalType for the type that maps to the format's ConvertedType; from version 4.0 of this crate onward, that type is named parquet::basic::ConvertedType.

The ConvertedType is deprecated in the format, but is still written to preserve backward compatibility. LogicalType is preferred, as it supports nanosecond-precision timestamps without using the deprecated Int96 Parquet type.
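
To illustrate why a nanosecond timestamp stored as a plain 64-bit integer is simpler than the deprecated Int96 representation, here is a minimal standalone sketch. It assumes the conventional Int96 timestamp layout (8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number). The function names are hypothetical and are not part of this crate's API.

```rust
/// Julian day number of the Unix epoch (1970-01-01).
const JULIAN_DAY_OF_EPOCH: i64 = 2_440_588;
/// Number of nanoseconds in one day.
const NANOS_PER_DAY: i64 = 86_400 * 1_000_000_000;

/// Decode a 12-byte Int96 timestamp into nanoseconds since the Unix epoch.
/// Assumed layout: bytes 0..8 = nanoseconds of day, bytes 8..12 = Julian day.
fn int96_to_unix_nanos(bytes: [u8; 12]) -> i64 {
    let mut nanos = [0u8; 8];
    nanos.copy_from_slice(&bytes[..8]);
    let mut day = [0u8; 4];
    day.copy_from_slice(&bytes[8..]);

    let nanos_of_day = i64::from_le_bytes(nanos);
    let julian_day = i64::from(u32::from_le_bytes(day));
    (julian_day - JULIAN_DAY_OF_EPOCH) * NANOS_PER_DAY + nanos_of_day
}

/// Encode nanoseconds since the Unix epoch into the assumed Int96 layout.
/// Hypothetical helper, included only to show the round trip.
fn unix_nanos_to_int96(ts: i64) -> [u8; 12] {
    // Euclidean division keeps nanos_of_day non-negative for pre-1970 values.
    let days = ts.div_euclid(NANOS_PER_DAY);
    let nanos_of_day = ts.rem_euclid(NANOS_PER_DAY);
    let julian_day = (days + JULIAN_DAY_OF_EPOCH) as u32;

    let mut out = [0u8; 12];
    out[..8].copy_from_slice(&nanos_of_day.to_le_bytes());
    out[8..].copy_from_slice(&julian_day.to_le_bytes());
    out
}
```

With a nanosecond-precision logical timestamp, the stored 64-bit value is already the number of nanoseconds since the epoch, so no Julian-day arithmetic of this kind is needed.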

Supported Parquet Version

  • Parquet-format 2.6.0

To update to a newer Parquet format version, check whether a newer version of the parquet-format crate is available, then update the version of the parquet-format crate in Cargo.toml.

Features

  • All encodings supported
  • All compression codecs supported
  • Read support
    • Primitive column value readers
    • Row record reader
    • Arrow record reader
  • Statistics support
  • Write support
    • Primitive column value writers
    • Row record writer
    • Arrow record writer
  • Predicate pushdown
  • Parquet format 2.6.0 support
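
As a rough illustration of how statistics and predicate pushdown fit together, the sketch below prunes row groups using per-row-group min/max values. It is a conceptual, standalone example: the RowGroupStats struct and the row_groups_to_scan function are hypothetical and do not reflect this crate's actual types or API.

```rust
/// Hypothetical stand-in for the min/max statistics kept for one column
/// in one row group.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

/// Return the indices of the row groups that may contain a value `v`
/// with `lower <= v <= upper`. Row groups whose [min, max] range cannot
/// overlap the predicate range are skipped without being read.
fn row_groups_to_scan(stats: &[RowGroupStats], lower: i64, upper: i64) -> Vec<usize> {
    stats
        .iter()
        .enumerate()
        .filter(|(_, s)| s.max >= lower && s.min <= upper)
        .map(|(i, _)| i)
        .collect()
}
```

Statistics can only rule row groups out. A row group that passes this check still has to be read and filtered row by row.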

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.

Dependencies

~18MB
~377K SLoC