45 releases (major breaking)

new 55.0.0	Apr 11, 2025
54.3.1	Mar 30, 2025
54.2.1	Feb 27, 2025
54.0.0	Dec 23, 2024
25.0.0	Oct 17, 2022

#160 in Testing

299 downloads per month
Used in chewdata

Apache-2.0

595KB
11K SLoC

Support for the Apache Arrow JSON test data format

These utilities define structs that read the integration JSON format for integration testing purposes.

This is not a canonical format, but it provides a human-readable way of verifying language implementations.

Native Rust implementation of Apache Arrow and Apache Parquet

Welcome to the Rust implementation of Apache Arrow, the popular in-memory columnar format.

This repo contains the following main components:

Crate	Description	Latest API Docs	README
`arrow`	Core functionality (memory layout, arrays, low level computations)	docs.rs	(README)
`arrow-flight`	Support for Arrow-Flight IPC protocol	docs.rs	(README)
`object_store`	Support for object store interactions (aws, azure, gcp, local, in-memory)	docs.rs	(README)
`parquet`	Support for Parquet columnar file format	docs.rs	(README)
`parquet_derive`	A crate for deriving RecordWriter/RecordReader for arbitrary, simple structs	docs.rs	(README)

The API documentation for the current development version of this repo can be found here.

Release Versioning and Schedule

`arrow` and `parquet` crates

The Arrow Rust project releases approximately monthly and follows Semantic Versioning.

Due to limited maintainer and testing bandwidth, the arrow crates (arrow, arrow-flight, etc.) are released on the same schedule and with the same versions as the parquet and parquet_derive crates.

This crate releases every month. We release new major versions (with potentially breaking API changes) at most once a quarter, and release incremental minor versions in the intervening months. See ticket #5368 for more details.

To keep our maintenance burden down, we do regularly scheduled releases (major and minor) from the main branch. How we handle PRs with breaking API changes is described in the contributing guide.

Planned Release Schedule

Approximate Date	Version	Notes
Mar 2025	`54.2.0`	Minor, NO breaking API changes
Apr 2025	`55.0.0`	Major, potentially breaking API changes
May 2025	`55.1.0`	Minor, NO breaking API changes
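The version numbers in the schedule above can be classified mechanically under Semantic Versioning. The following is a minimal sketch, assuming plain `MAJOR.MINOR.PATCH` strings with no pre-release suffix. The names `Version`, `ReleaseKind`, `parse_version`, and `release_kind` are hypothetical and are not part of any arrow-rs crate, and the `Patch` case is added from the general Semantic Versioning convention rather than from the table.

```rust
/// Hypothetical helper type for this sketch; not an arrow-rs API.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Kind of release implied by a version number.
#[derive(Debug, PartialEq)]
pub enum ReleaseKind {
    /// `X.0.0`: potentially breaking API changes.
    Major,
    /// `X.Y.0` with `Y > 0`: no breaking API changes.
    Minor,
    /// `X.Y.Z` with `Z > 0`: bug-fix release.
    Patch,
}

/// Parse a `MAJOR.MINOR.PATCH` string, returning `None` on any malformed input.
pub fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    // Reject trailing components such as "55.0.0.1".
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Classify a version the way the schedule table labels releases.
pub fn release_kind(v: Version) -> ReleaseKind {
    if v.patch != 0 {
        ReleaseKind::Patch
    } else if v.minor != 0 {
        ReleaseKind::Minor
    } else {
        ReleaseKind::Major
    }
}
```

With these definitions, `55.0.0` is classified as `Major` and `54.2.0` as `Minor`, matching the notes column of the table.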

`object_store` crate

The object_store crate is released independently of the arrow and parquet crates and follows Semantic Versioning. We aim to release new versions approximately every 2 months.

Planned Release Schedule

Approximate Date	Version	Notes
Feb 2025	`0.12.0`	Major, potentially breaking API changes
Apr 2025	`0.12.1`	Minor, NO breaking API changes

Guidelines for `panic` vs `Result`

In general, use panics for bad states that are unreachable, unrecoverable, or harmful. For bad states caused by invalid user input, however, we prefer to report the problem gracefully as an error result instead of panicking. Invalid input should result in an Error as early as possible. It is OK for code paths after validation to assume that validation has already occurred, and to panic if it has not. See ticket #6737 for more nuances.
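One way to read this guideline is sketched below: user-supplied input is validated once at the boundary and rejected with an error, while the code after validation relies on the invariant and panics if it is broken. All names here (`SliceError`, `CheckedRange`, `check_range`, `sum_range`) are hypothetical and chosen for illustration; they are not arrow-rs APIs.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; not an arrow-rs type.
#[derive(Debug, PartialEq)]
pub enum SliceError {
    OutOfBounds { offset: usize, len: usize, available: usize },
}

impl fmt::Display for SliceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SliceError::OutOfBounds { offset, len, available } => write!(
                f,
                "offset {} with length {} exceeds {} available elements",
                offset, len, available
            ),
        }
    }
}

impl std::error::Error for SliceError {}

/// A range that has already been validated against a buffer length.
pub struct CheckedRange {
    offset: usize,
    len: usize,
}

/// Boundary check: invalid user input becomes an `Err` as early as possible.
pub fn check_range(
    offset: usize,
    len: usize,
    available: usize,
) -> Result<CheckedRange, SliceError> {
    // `checked_add` also catches arithmetic overflow in the user's input.
    match offset.checked_add(len) {
        Some(end) if end <= available => Ok(CheckedRange { offset, len }),
        _ => Err(SliceError::OutOfBounds { offset, len, available }),
    }
}

/// Internal code path: assumes validation has happened. If the range does not
/// fit `values`, the slice index panics, which signals a bug in the caller.
pub fn sum_range(values: &[i64], range: &CheckedRange) -> i64 {
    values[range.offset..range.offset + range.len].iter().sum()
}
```

A caller would typically propagate the error from `check_range` with `?` and only then call `sum_range`, so a panic in `sum_range` indicates a programming error rather than bad input.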

Deprecation Guidelines

Minor releases may deprecate, but not remove APIs. Deprecating APIs allows downstream Rust programs to still compile, but generate compiler warnings. This gives downstream crates time to migrate prior to API removal.

To deprecate an API:

Mark the API as deprecated using #[deprecated] and specify the exact arrow-rs version in which it was deprecated
Concisely describe the preferred API to help the user transition

The deprecated version is the next version which will be released (please consult the list above). To mark the API as deprecated, use the #[deprecated(since = "...", note = "...")] attribute.

For example:

#[deprecated(since = "51.0.0", note = "Use `date_part` instead")]

In general, deprecated APIs will remain in the codebase for at least two major releases after they were deprecated (typically 6 to 9 months). For example, an API deprecated in 51.3.0 can be removed in 54.0.0 (or later). Deprecated APIs may be removed earlier or later than these guidelines suggest, at the discretion of the maintainers.
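The deprecation steps above might look like the following in practice. This is a sketch under stated assumptions: `field_count` and `count_fields` are hypothetical functions, not arrow-rs APIs, and `earliest_removal_major` encodes one reading of the guideline (an API deprecated during major version N stays through N+1 and N+2), which matches the 51.3.0 to 54.0.0 example but is not an official rule.

```rust
/// Preferred replacement API (hypothetical name for this sketch).
pub fn field_count(line: &str) -> usize {
    line.split(',').count()
}

/// Old API, kept so downstream code still compiles (with a warning).
/// It forwards to the replacement so the two cannot drift apart.
#[deprecated(since = "51.0.0", note = "Use `field_count` instead")]
pub fn count_fields(line: &str) -> usize {
    field_count(line)
}

/// Assumed reading of the guideline: an API deprecated in any `N.x.y`
/// release may be removed in `(N + 3).0.0` at the earliest.
pub fn earliest_removal_major(deprecated_major: u32) -> u32 {
    deprecated_major + 3
}
```

Downstream code that still calls `count_fields` compiles but gets a compiler warning carrying the `note` text, which is what gives users time to migrate.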

There are several related crates in different repositories

Crate	Description	Documentation
`datafusion`	In-memory query engine with SQL support	(README)
`ballista`	Distributed query execution	(README)
`object_store_opendal`	Use `opendal` as `object_store` backend	(README)
`parquet_opendal`	Use `opendal` for `parquet` Arrow IO	(README)

Collectively, these crates support a wider array of functionality for analytic computations in Rust.

For example, you can write a SQL query or a DataFrame expression (using the datafusion crate) to read a parquet file (using the parquet crate), evaluate it in memory using Arrow's columnar format (using the arrow crate), and send the results to another process (using the arrow-flight crate).

Generally speaking, the arrow crate offers functionality for using Arrow arrays, and datafusion offers most operations typically found in SQL, including joins and window functions.

You can find more details about each crate in their respective READMEs.

Arrow Rust Community

The dev@arrow.apache.org mailing list serves as the core communication channel for the Arrow community. Instructions for signing up and links to the archives can be found on the Arrow Community page. All major announcements and communications happen there.

The Rust Arrow community also uses the official ASF Slack for informal discussions and coordination. This is a great place to meet other contributors and get guidance on where to contribute. Join us in the #arrow-rust channel and feel free to ask for an invite via:

the dev@arrow.apache.org mailing list
the GitHub Discussions
the Discord channel

The Rust implementation uses GitHub issues as the system of record for new features and bug fixes and this plays a critical role in the release process.

For design discussions we generally collaborate on Google documents and file a GitHub issue linking to the document.

There is more information in the contributing guide.

Dependencies

~16MB
~285K SLoC