44 releases (10 breaking)
Version | Released |
---|---|
0.12.0 (new) | Jan 8, 2025 |
0.11.5 | Dec 9, 2024 |
0.11.3 | Nov 21, 2024 |
0.7.7 | Jun 26, 2024 |
0.1.0 | |
#75 in Data structures
2,552 downloads per month
410KB
10K SLoC
An experimental (work-in-progress) statically typed implementation of Apache Arrow.
This crate provides methods to automatically generate types to support reading and writing instances of abstract data types in Arrow's in-memory data structures.
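To make "Arrow's in-memory data structures" more concrete, the sketch below writes out by hand a simplified columnar form of a small struct, using only the standard library. It is an illustration under stated assumptions, not the code this crate generates: `Row`, `RowColumns`, and `to_columns` are hypothetical names, the validity information is a `Vec<bool>` rather than a packed bitmap, and the string offsets are `usize` rather than Arrow's 32-bit offsets.

```rust
/// Hypothetical row type, used only for this illustration.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    a: bool,
    b: u32,
    c: Option<String>,
}

/// Hypothetical hand-written columnar form of `Row`: one column per field.
struct RowColumns {
    a: Vec<bool>,
    b: Vec<u32>,
    /// `true` where `c` is `Some` (Arrow packs this into a bitmap).
    c_validity: Vec<bool>,
    /// One more offset than there are rows; row `i` spans
    /// `c_offsets[i]..c_offsets[i + 1]` in `c_data`.
    c_offsets: Vec<usize>,
    c_data: String,
}

impl RowColumns {
    fn new() -> Self {
        Self {
            a: Vec::new(),
            b: Vec::new(),
            c_validity: Vec::new(),
            c_offsets: vec![0],
            c_data: String::new(),
        }
    }

    /// Appends one row by pushing each field to its own column.
    fn push(&mut self, row: Row) {
        self.a.push(row.a);
        self.b.push(row.b);
        self.c_validity.push(row.c.is_some());
        if let Some(s) = &row.c {
            self.c_data.push_str(s);
        }
        // A missing string still gets an (empty) offset range.
        self.c_offsets.push(self.c_data.len());
    }

    fn len(&self) -> usize {
        self.a.len()
    }

    /// Reassembles row `i` from the columns.
    fn get(&self, i: usize) -> Row {
        let c = self.c_validity[i]
            .then(|| self.c_data[self.c_offsets[i]..self.c_offsets[i + 1]].to_owned());
        Row { a: self.a[i], b: self.b[i], c }
    }
}

fn to_columns(rows: Vec<Row>) -> RowColumns {
    let mut columns = RowColumns::new();
    for row in rows {
        columns.push(row);
    }
    columns
}

fn main() {
    let rows = vec![
        Row { a: false, b: 0, c: None },
        Row { a: true, b: 42, c: Some("hello world".to_owned()) },
    ];
    let columns = to_columns(rows.clone());
    assert_eq!(columns.len(), 2);
    assert_eq!(columns.b.iter().sum::<u32>(), 42);
    let round_trip: Vec<Row> = (0..columns.len()).map(|i| columns.get(i)).collect();
    assert_eq!(round_trip, rows);
}
```

Even for this flat struct, the bookkeeping for the nullable string column has to be kept consistent by hand, which is the kind of code a derive macro can take over.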
Why
- The arrow crate provides APIs that make sense when the array types are only known at run time. Many of its APIs require the use of trait objects and downcasting. However, for applications where the types are known at compile time, these APIs are not ergonomic.
- Builders for nested array types are complex and error-prone.
There are other crates that aim to prevent users from having to maintain array builder code by providing derive macros. These builders typically produce type-erased arrays, whereas this crate only provides fully statically typed arrays.
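The ergonomic difference between type-erased and statically typed arrays can be sketched with the standard library alone. The example below does not use the arrow crate or this crate: `DynArray`, `U32Column`, `BoolColumn`, `sum_erased`, and `sum_typed` are hypothetical names chosen for illustration. It only shows the general pattern of downcasting a trait object versus carrying the type in the signature.

```rust
use std::any::Any;

/// Hypothetical stand-in for a type-erased array interface.
trait DynArray {
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical concrete column of unsigned 32-bit integers.
struct U32Column(Vec<u32>);

/// Hypothetical concrete column of booleans.
struct BoolColumn(Vec<bool>);

impl DynArray for U32Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl DynArray for BoolColumn {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Type-erased: the caller has to guess the concrete type at run time
/// and handle the case where the guess is wrong.
fn sum_erased(array: &dyn DynArray) -> Option<u32> {
    let column = array.as_any().downcast_ref::<U32Column>()?;
    Some(column.0.iter().sum())
}

/// Statically typed: the signature carries the type, so a mismatch is a
/// compile error and no downcast is needed.
fn sum_typed(column: &U32Column) -> u32 {
    column.0.iter().sum()
}

fn main() {
    let numbers = U32Column(vec![0, 42]);
    assert_eq!(sum_erased(&numbers), Some(42));
    assert_eq!(sum_typed(&numbers), 42);

    // Passing the wrong array type is only detected at run time here.
    let flags = BoolColumn(vec![true, false]);
    assert_eq!(sum_erased(&flags), None);
    // `sum_typed(&flags)` would not compile.
}
```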
Goals and non-goals
Goals
- Provide production-ready, fully statically typed, safe and efficient Arrow array implementations
- Enable everyone using Rust to easily benefit from the Arrow ecosystem
- Provide zero-copy interop with the arrow crate
- Support custom buffer implementations, e.g. to support accelerators
- Explore expressing Arrow concepts using the Rust type system, and mapping Rust concepts to Arrow
Non-goals
- Support arbitrary array types at runtime (the arrow crate supports this use case)
- Provide compute kernels
- Replace other Arrow implementations
Example
use narrow::{
    array::{StructArray, UnionArray},
    ArrayType, Length,
};

#[derive(ArrayType, Default, Clone, Debug, PartialEq, Eq)]
struct Foo {
    a: bool,
    b: u32,
    c: Option<String>,
}

#[derive(ArrayType, Default, Clone, Debug, PartialEq, Eq)]
struct Bar(Vec<u8>);

#[derive(ArrayType, Clone, Debug, PartialEq, Eq)]
enum FooBar {
    Foo(Foo),
    Bar(Bar),
    None,
}

let foos = vec![
    Foo {
        a: false,
        b: 0,
        c: None,
    },
    Foo {
        a: true,
        b: 42,
        c: Some("hello world".to_owned()),
    },
];

// Collect the rows into a statically typed struct array.
let struct_array = foos.clone().into_iter().collect::<StructArray<Foo>>();
assert_eq!(struct_array.len(), 2);

// Each field is accessible as its own array.
assert!(struct_array.0.a.iter().any(|x| x));
assert_eq!(struct_array.0.b.iter().sum::<u32>(), 42);
assert_eq!(struct_array.0.c.iter().filter_map(|x| x).collect::<String>(), "hello world");

// Iterating the array yields the original rows.
assert_eq!(struct_array.into_iter().collect::<Vec<_>>(), foos);

let foo_bars = vec![
    FooBar::Foo(Foo {
        a: true,
        b: 42,
        c: Some("hello world".to_owned()),
    }),
    FooBar::Bar(Bar(vec![1, 2, 3, 4])),
    FooBar::None,
    FooBar::None,
];

// Enums are collected into a union array.
let union_array = foo_bars
    .clone()
    .into_iter()
    .collect::<UnionArray<FooBar, 3>>();
assert_eq!(union_array.len(), 4);
assert_eq!(union_array.into_iter().collect::<Vec<_>>(), foo_bars);
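For readers unfamiliar with Arrow unions, the following standard-library sketch shows roughly how an enum can be laid out as a dense union: an 8-bit type id and a 32-bit offset per element, plus one child column per variant. This is a simplified illustration, not this crate's implementation. `Value`, `DenseUnion`, and their methods are hypothetical names, and the data-less variant is modeled as a simple counter.

```rust
/// Hypothetical enum used only for this illustration.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Number(u32),
    Text(String),
    Empty,
}

/// Hypothetical hand-written dense union layout for `Value`.
#[derive(Default)]
struct DenseUnion {
    /// Which variant each element is (0 = Number, 1 = Text, 2 = Empty).
    type_ids: Vec<i8>,
    /// Index of each element within its variant's child column.
    offsets: Vec<i32>,
    numbers: Vec<u32>,
    texts: Vec<String>,
    /// The `Empty` variant carries no data, so only its length is tracked.
    empties: usize,
}

impl DenseUnion {
    /// Appends a value to the child column of its variant and records
    /// the type id and offset needed to find it again.
    fn push(&mut self, value: Value) {
        match value {
            Value::Number(n) => {
                self.type_ids.push(0);
                self.offsets.push(self.numbers.len() as i32);
                self.numbers.push(n);
            }
            Value::Text(s) => {
                self.type_ids.push(1);
                self.offsets.push(self.texts.len() as i32);
                self.texts.push(s);
            }
            Value::Empty => {
                self.type_ids.push(2);
                self.offsets.push(self.empties as i32);
                self.empties += 1;
            }
        }
    }

    fn len(&self) -> usize {
        self.type_ids.len()
    }

    /// Reads element `i` back by following its type id and offset.
    fn get(&self, i: usize) -> Value {
        let offset = self.offsets[i] as usize;
        match self.type_ids[i] {
            0 => Value::Number(self.numbers[offset]),
            1 => Value::Text(self.texts[offset].clone()),
            _ => Value::Empty,
        }
    }
}

fn main() {
    let values = vec![
        Value::Number(42),
        Value::Text("hello".to_owned()),
        Value::Empty,
        Value::Empty,
    ];
    let mut union = DenseUnion::default();
    for value in values.clone() {
        union.push(value);
    }
    assert_eq!(union.len(), 4);
    let round_trip: Vec<Value> = (0..union.len()).map(|i| union.get(i)).collect();
    assert_eq!(round_trip, values);
}
```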
Features
The crate supports the following optional features:
Derive support
- derive: adds ArrayType derive support.
Interop
- arrow-rs: array conversion methods for arrow.
Additional ArrayType implementations
- chrono: support for some chrono types.
- map: support for std::collections::HashMap.
- uuid: support for uuid::Uuid.
Minimum supported Rust version
The minimum supported Rust version for this crate is Rust 1.79.0.
License
Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Dependencies
~0–6MB
~32K SLoC