1 unstable release

Uses new Rust 2024

new 0.4.0-beta.4 May 14, 2025

#47 in Geospatial

MIT/Apache

1MB
17K SLoC

geoarrow-array

The central type in Apache Arrow is the array: a known-length sequence of values that all have the same type. This crate provides concrete implementations of each array type defined in the GeoArrow specification, as well as a [GeoArrowArray] trait that can be used for type erasure.

To minimize the overhead of dynamic downcasting, the array types in this crate are defined "natively". Converting between a GeoArrow array type and an [arrow][arrow_array] array type is an O(1) operation.
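To illustrate why such a conversion can be O(1), here is a minimal std-only sketch. `RawArray` and `NativePointArray` are hypothetical stand-ins, not types from this crate or from arrow; the sketch only assumes that both representations share reference-counted buffers, so converting copies a pointer rather than the data.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an untyped Arrow array: a shared buffer of values.
#[derive(Clone)]
pub struct RawArray {
    pub values: Arc<Vec<f64>>,
}

/// Hypothetical "native" point array with a concrete type of its own.
#[derive(Clone)]
pub struct NativePointArray {
    // Interleaved x, y pairs behind a reference-counted pointer.
    coords: Arc<Vec<f64>>,
}

impl NativePointArray {
    pub fn new(coords: Vec<f64>) -> Self {
        assert!(coords.len() % 2 == 0, "expected interleaved x, y pairs");
        Self { coords: Arc::new(coords) }
    }

    /// Number of points stored in the array.
    pub fn len(&self) -> usize {
        self.coords.len() / 2
    }

    /// Returns the (x, y) pair at index `i`.
    pub fn get(&self, i: usize) -> (f64, f64) {
        (self.coords[2 * i], self.coords[2 * i + 1])
    }

    /// O(1): clones the `Arc` pointer, not the underlying buffer.
    pub fn to_raw(&self) -> RawArray {
        RawArray { values: Arc::clone(&self.coords) }
    }

    /// O(1): validates the length, then shares the same buffer.
    pub fn from_raw(raw: &RawArray) -> Option<Self> {
        if raw.values.len() % 2 == 0 {
            Some(Self { coords: Arc::clone(&raw.values) })
        } else {
            None
        }
    }

    /// True when both arrays point at the same allocation.
    pub fn shares_buffer_with(&self, raw: &RawArray) -> bool {
        Arc::ptr_eq(&self.coords, &raw.values)
    }
}
```

In this sketch, `to_raw` and `from_raw` cost the same regardless of how many points the array holds, because only the reference count changes.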

Building a GeoArrow Array

Use [builders][builder] to construct GeoArrow arrays. These builders offer a push-based interface to construct arrays from a series of objects that implement [geo-traits][geo_traits].

use geo_traits::{CoordTrait, PointTrait};
use geoarrow_array::array::PointArray;
use geoarrow_array::builder::PointBuilder;
use geoarrow_array::scalar::Point;
use geoarrow_array::GeoArrowArrayAccessor;
use geoarrow_schema::{CoordType, Dimension, PointType};

let point_type = PointType::new(CoordType::Separated, Dimension::XY, Default::default());
let mut builder = PointBuilder::new(point_type);

builder.push_point(Some(&geo_types::point!(x: 0., y: 1.)));
builder.push_point(Some(&geo_types::point!(x: 2., y: 3.)));
builder.push_point(Some(&geo_types::point!(x: 4., y: 5.)));

let array: PointArray = builder.finish();

let point_0: Point<'_> = array.get(0).unwrap().unwrap();
assert_eq!(point_0.coord().unwrap().x_y(), (0., 1.));

Converting a builder to an array via finish() is always O(1).
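The following std-only sketch shows one way a push-based builder can offer an O(1) `finish()`. `SketchPointBuilder` and `SketchPointArray` are hypothetical names and do not reflect this crate's internals; the assumption is simply that the builder owns growable buffers and hands them over by move.

```rust
/// Hypothetical finished array: separated x and y buffers plus validity flags.
pub struct SketchPointArray {
    xs: Vec<f64>,
    ys: Vec<f64>,
    validity: Vec<bool>,
}

impl SketchPointArray {
    /// Number of slots, including nulls.
    pub fn len(&self) -> usize {
        self.validity.len()
    }

    /// Returns `None` for a null slot, otherwise the (x, y) pair.
    pub fn get(&self, i: usize) -> Option<(f64, f64)> {
        if self.validity[i] {
            Some((self.xs[i], self.ys[i]))
        } else {
            None
        }
    }
}

/// Hypothetical push-based builder that owns growable buffers.
#[derive(Default)]
pub struct SketchPointBuilder {
    xs: Vec<f64>,
    ys: Vec<f64>,
    validity: Vec<bool>,
}

impl SketchPointBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Appends one point, or a null slot when `point` is `None`.
    pub fn push_point(&mut self, point: Option<(f64, f64)>) {
        match point {
            Some((x, y)) => {
                self.xs.push(x);
                self.ys.push(y);
                self.validity.push(true);
            }
            None => {
                // Placeholder values keep the buffers aligned with `validity`.
                self.xs.push(f64::NAN);
                self.ys.push(f64::NAN);
                self.validity.push(false);
            }
        }
    }

    /// O(1): moves the buffers into the array without copying any elements.
    pub fn finish(self) -> SketchPointArray {
        SketchPointArray { xs: self.xs, ys: self.ys, validity: self.validity }
    }
}
```

Because `finish` takes `self` by value, the builder cannot be reused afterwards, which is what lets the buffers be moved rather than cloned.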

Converting to and from [arrow][arrow_array] Arrays

The geoarrow crates depend on, and are designed to be used with, the upstream [Arrow][arrow_array] crates, so converting between the two representations is straightforward.

Note that an Array or ArrayRef only maintains information about the physical DataType and will lose any extension type information. Because of this, it's imperative to store an Array and Field together since the Field persists the Arrow extension metadata. A RecordBatch holds an Array and Field together for each column, so a RecordBatch will persist extension metadata.
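As a rough std-only illustration of why the field matters, the sketch below models an array that knows only its physical type and a field that carries metadata. `SketchArray`, `SketchField`, and `PhysicalType` are hypothetical; the metadata key `ARROW:extension:name` and the value `geoarrow.point` follow the Arrow and GeoArrow conventions for naming an extension type.

```rust
use std::collections::HashMap;

/// Metadata key under which Arrow records an extension type's name.
pub const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Hypothetical, simplified physical data types.
#[derive(Debug, Clone, PartialEq)]
pub enum PhysicalType {
    StructOfFloat64,
    Binary,
}

/// Hypothetical array: knows its physical layout and nothing else.
pub struct SketchArray {
    pub physical_type: PhysicalType,
    pub len: usize,
}

/// Hypothetical field: name, physical type, and free-form metadata.
pub struct SketchField {
    pub name: String,
    pub physical_type: PhysicalType,
    pub metadata: HashMap<String, String>,
}

/// Builds a field tagged as a GeoArrow point column.
pub fn point_field(name: &str) -> SketchField {
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), "geoarrow.point".to_string());
    SketchField {
        name: name.to_string(),
        physical_type: PhysicalType::StructOfFloat64,
        metadata,
    }
}

/// Builds a field with the same physical type but no extension metadata.
pub fn plain_field(name: &str) -> SketchField {
    SketchField {
        name: name.to_string(),
        physical_type: PhysicalType::StructOfFloat64,
        metadata: HashMap::new(),
    }
}

/// Reads the extension name, which only the field can provide.
pub fn extension_name(field: &SketchField) -> Option<&str> {
    field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str)
}
```

In this sketch, two columns can have identical physical types while only one of them is a point column, and the array alone cannot tell them apart.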

Converting to GeoArrow Arrays

If you have an Array and Field but don't know the geometry type of the array, you can use from_arrow_array:

use std::sync::Arc;

use arrow_array::Array;
use arrow_schema::Field;
use geoarrow_array::array::{from_arrow_array, PointArray};
use geoarrow_array::cast::AsGeoArrowArray;
use geoarrow_array::{GeoArrowArray, GeoArrowType};

fn use_from_arrow_array(array: &dyn Array, field: &Field) {
    let geoarrow_array: Arc<dyn GeoArrowArray> = from_arrow_array(array, field).unwrap();
    match geoarrow_array.data_type() {
        GeoArrowType::Point(_) => {
            let array: &PointArray = geoarrow_array.as_point();
        }
        _ => todo!("handle other geometry types"),
    }
}

If you know the geometry type of your array, you can use one of its TryFrom implementations to convert directly to that type. This means you don't have to downcast on the GeoArrow side from an Arc<dyn GeoArrowArray>.

use arrow_array::Array;
use arrow_schema::Field;
use geoarrow_array::array::PointArray;

fn convert_to_point_array(array: &dyn Array, field: &Field) {
    let point_array = PointArray::try_from((array, field)).unwrap();
}
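For readers unfamiliar with implementing `TryFrom` on a tuple, here is a std-only sketch of the pattern. `DemoArray`, `DemoField`, `DemoPointArray`, and `ConvertError` are all hypothetical and much simpler than the real types; the sketch assumes the conversion checks the field's extension metadata before accepting the data.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical untyped array of interleaved coordinates.
pub struct DemoArray {
    pub values: Vec<f64>,
}

/// Hypothetical field carrying only metadata.
pub struct DemoField {
    pub metadata: HashMap<String, String>,
}

impl DemoField {
    /// Field tagged with the GeoArrow point extension name.
    pub fn point() -> Self {
        let mut metadata = HashMap::new();
        metadata.insert("ARROW:extension:name".to_string(), "geoarrow.point".to_string());
        Self { metadata }
    }

    /// Field without any extension metadata.
    pub fn plain() -> Self {
        Self { metadata: HashMap::new() }
    }
}

/// Hypothetical typed point array.
#[derive(Debug, PartialEq)]
pub struct DemoPointArray {
    coords: Vec<(f64, f64)>,
}

impl DemoPointArray {
    pub fn coords(&self) -> &[(f64, f64)] {
        &self.coords
    }
}

#[derive(Debug, PartialEq)]
pub enum ConvertError {
    NotAPointField,
    OddCoordinateCount,
}

impl TryFrom<(&DemoArray, &DemoField)> for DemoPointArray {
    type Error = ConvertError;

    fn try_from((array, field): (&DemoArray, &DemoField)) -> Result<Self, Self::Error> {
        // The field, not the array, says whether this is a point column.
        match field.metadata.get("ARROW:extension:name").map(String::as_str) {
            Some("geoarrow.point") => {}
            _ => return Err(ConvertError::NotAPointField),
        }
        if array.values.len() % 2 != 0 {
            return Err(ConvertError::OddCoordinateCount);
        }
        let coords = array.values.chunks_exact(2).map(|c| (c[0], c[1])).collect();
        Ok(Self { coords })
    }
}
```

Returning a `Result` lets callers handle a mismatched field explicitly rather than calling `unwrap()` as the short example above does.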

Converting to [arrow][arrow_array] Arrays

You can use the to_array_ref or into_array_ref methods on GeoArrowArray to convert to an ArrayRef.

Alternatively, if you have a concrete GeoArrow array type, you can use IntoArray to convert to a concrete arrow array type.

The easiest way today to access an arrow Field is to use IntoArray::ext_type and then call to_field on the result. We would like to make this process simpler in the future.

Downcasting a GeoArrow array

Arrays are often passed around as a dynamically typed &dyn GeoArrowArray or [Arc<dyn GeoArrowArray>][GeoArrowArray].

While these arrays can be passed directly to compute functions, you will often want to work with the concrete arrays directly.

This requires downcasting to the concrete type of the array. Use the cast::AsGeoArrowArray extension trait to do this ergonomically.

use geoarrow_array::cast::AsGeoArrowArray;
use geoarrow_array::{GeoArrowArrayAccessor, GeoArrowArray};

fn iter_line_string_array(array: &dyn GeoArrowArray) {
    for row in array.as_line_string().iter() {
        // do something with each row
    }
}
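The extension-trait pattern behind this kind of downcasting can be sketched with `std::any::Any` alone. `DynGeoArray`, `AsDemoArray`, `DemoPoints`, and `DemoLineStrings` are hypothetical names; the sketch assumes the type-erased trait exposes an `as_any` method and makes no claim about how this crate implements its own casts.

```rust
use std::any::Any;

/// Hypothetical type-erased array trait.
pub trait DynGeoArray {
    fn as_any(&self) -> &dyn Any;
    fn len(&self) -> usize;
}

/// Hypothetical concrete arrays.
pub struct DemoPoints(pub Vec<(f64, f64)>);
pub struct DemoLineStrings(pub Vec<Vec<(f64, f64)>>);

impl DynGeoArray for DemoPoints {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

impl DynGeoArray for DemoLineStrings {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

/// Extension trait that adds ergonomic downcasts to the trait object.
pub trait AsDemoArray {
    fn as_points_opt(&self) -> Option<&DemoPoints>;
    fn as_line_strings_opt(&self) -> Option<&DemoLineStrings>;

    /// Panics if the array is not a line string array.
    fn as_line_strings(&self) -> &DemoLineStrings {
        self.as_line_strings_opt().expect("not a line string array")
    }
}

impl AsDemoArray for dyn DynGeoArray + '_ {
    fn as_points_opt(&self) -> Option<&DemoPoints> {
        self.as_any().downcast_ref()
    }
    fn as_line_strings_opt(&self) -> Option<&DemoLineStrings> {
        self.as_any().downcast_ref()
    }
}

/// Counts vertices across every line string in a type-erased array.
pub fn total_vertices(array: &dyn DynGeoArray) -> usize {
    array.as_line_strings().0.iter().map(Vec::len).sum()
}
```

Offering both a panicking accessor and an `Option`-returning one lets callers choose between brevity when the type is already known and explicit handling when it is not.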

Dependencies

~7–9.5MB
~179K SLoC