Cargo Features
[dependencies]
minarrow = { version = "0.4.1", default-features = false, features = ["parallel_proc", "c_ffi_tests", "extended_categorical", "extended_numeric_types", "cube", "scalar_type", "value_type", "chunked", "large_string", "views", "matrix", "zstd", "snappy", "cast_arrow", "cast_polars", "datetime", "simd", "datetime_ops", "str_arithmetic", "fast_hash", "broadcast", "size", "select", "regex"] }
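The dependency line above switches on every feature at once. As a minimal sketch, assuming a project that only wants the defaults plus a couple of extras, a narrower entry could look like the following. The particular selection is illustrative only and is not a recommendation from the crate authors.

```toml
[dependencies]
# Defaults stay on (no `default-features = false`), with two extras added.
# The feature list declares `datetime_ops = datetime`, so requesting
# datetime_ops also turns on datetime.
minarrow = { version = "0.4.1", features = ["datetime_ops", "zstd"] }
```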
- parallel_proc = rayon
-
Adds parallel iterators via Rayon.
- c_ffi_tests = cc
-
Adds roundtrip FFI tests. Leave off if you don't need it in your build pipeline, as it's mostly C-code.
- extended_categorical = extended_numeric_types
-
Adds Categorical8, Categorical16, and Categorical64.
Keeping these off is highly recommended unless they are required (e.g., in constrained or embedded environments), as they add combinatorial weight to the binary and enum match arms.
- extended_numeric_types extended_categorical?
-
Adds UInt8, UInt16, Int8, Int16 types.
Keeping these off is highly recommended unless they are required (e.g., in constrained or embedded environments), as they add combinatorial weight to the binary and enum match arms.
For most analytical use cases, they get upcast anyway.
- cube
-
Adds a cube object for stacking tables on an extra axis. Useful for time series and group analytics.
Affects
array::broadcast_array_to_cube, cube::broadcast_cube_to_array, cube::broadcast_fieldarray_to_cube, cube::broadcast_cube_to_fieldarray, cube::broadcast_table_to_cube, cube::broadcast_cube_to_table, broadcast::cube, minarrow::structs.cube, aliases::CubeV, cube::broadcast_cube_to_scalar, cube::broadcast_arrayview_to_cube, cube::broadcast_cube_to_arrayview, cube::broadcast_numericarrayview_to_cube, cube::broadcast_cube_to_numericarrayview, cube::broadcast_textarrayview_to_cube, cube::broadcast_cube_to_textarrayview, cube::broadcast_tableview_to_cube, cube::broadcast_superarray_to_cube, cube::broadcast_cube_to_superarray, cube::broadcast_supertable_to_cube…
- scalar_type default
-
Adds a unified scalar type that is useful for Array aggregations and other use cases where you end up with one value. However, it is one of several downcasting methods available in Rust, and when working predominantly with numbers, one might prefer my_function::<i32>() semantics, which address the type immediately (e.g., in conjunction with T: Numeric, T: Integer or T: Float generic functions), rather than getting a Scalar object that then needs .i32()-style access or a manual match. It is a pain that Rust can't just get the value when it's wrapped in such cases, but this is an inherent type-safety limitation.
Affects
scalar::Scalar, array::broadcast_array_to_scalar, scalar::broadcast_scalar_to_table, scalar::broadcast_scalar_to_array, scalar::broadcast_scalar_to_tuple2, scalar::broadcast_scalar_to_tuple3, scalar::broadcast_scalar_to_tuple4, scalar::broadcast_scalar_to_tuple5, scalar::broadcast_scalar_to_tuple6, scalar::broadcast_scalar_to_fieldarray, scalar::broadcast_fieldarray_to_scalar, table::broadcast_table_to_scalar, arithmetic::scalar_arithmetic, minarrow::enums.scalar, cube::broadcast_cube_to_scalar, matrix::broadcast_matrix_scalar_add, matrix::broadcast_scalar_matrix_add, broadcast::broadcast_value, scalar::broadcast_scalar_to_tableview, scalar::broadcast_scalar_to_superarray…
- value_type
-
Adds a unified value enum that can be used for engine-level orchestration or any situation where a catch-all, unified encompassing type is required to satisfy the compiler. It includes roundtrip From and TryFrom for each inner type so that signatures do not need to couple to it directly. Leaving it off is recommended if you don't need it.
Affects
minarrow::enums.value, matrix::broadcast_matrix_array_add, matrix::broadcast_array_matrix_add, broadcast::broadcast_value…
- chunked default
-
ChunkedArray and ChunkedTable objects that support iterating over multiple inner objects of the same type, for memory-mapped streaming etc.
Affects
aliases::ChunkedTable, array::broadcast_array_to_supertable, super_array::broadcast_superarray_to_table, super_array::route_super_array_broadcast, super_table::broadcast_super_table_with_operator, super_table::broadcast_supertable_to_array, super_table::broadcast_fieldarray_to_supertable, super_table::broadcast_supertable_to_fieldarray, super_table::broadcast_superarray_to_supertable, super_table::broadcast_supertable_to_superarray, table::broadcast_super_table_add, table::broadcast_table_to_superarray, minarrow::structs.chunked, utils::create_aligned_chunks_from_array, array::broadcast_array_to_supertableview, cube::broadcast_superarray_to_cube, cube::broadcast_cube_to_superarray, cube::broadcast_supertable_to_cube, cube::broadcast_cube_to_supertable, field_array::broadcast_fieldarray_to_superarrayview…
- large_string default cast_polars?
-
Int64-based string
- views default
-
Provides windowed collection views for Numeric, String, and Temporal types. Often, everything can be done with only the ArrayView abstraction or the ArrayViewT (&Array, Offset, Length) tuple from aliases. These views are for the cases where those fall short, e.g., when you have numeric- or text-specific functions and want to streamline type management. In those cases, these abstractions provide the equivalent of Into<NumericArrayView> for several types, and accept both the original and windowed view variants. One can therefore unify numeric entry points through them, enabling a flexible API at the cost of more surface complexity.
Affects
array_view::broadcast_arrayview_to_table, array_view::broadcast_arrayview_to_tableview, array_view::broadcast_arrayview_to_supertableview, super_table_view::broadcast_supertableview_to_arrayview, table::broadcast_table_to_arrayview, table_view::broadcast_tableview_to_tableview, table_view::broadcast_tableview_to_arrayview, minarrow::kernels.routing, minarrow::views.collections, minarrow::views.array_view, minarrow::views.table_view, minarrow::traits.view, aliases::CubeV, array::broadcast_array_to_supertableview, cube::broadcast_arrayview_to_cube, cube::broadcast_cube_to_arrayview, cube::broadcast_numericarrayview_to_cube, cube::broadcast_cube_to_numericarrayview, cube::broadcast_textarrayview_to_cube, cube::broadcast_cube_to_textarrayview…
- matrix
-
Adds a 2D matrix that uses a flat buffer in a format compatible with BLAS and LAPACK Fortran and C kernels. Includes TryFrom conversion methods so it's easy to move from Table column selections into the matrix without worrying too much about buffers and strides. Hence, if you are working only with matrices, you may want this from the get-go. If you are working predominantly with tabular data but running PCAs and SVDs (for example), you can keep your data in Table format, and any functions that accept Matrix should also work for your Table, with a small one-off performance penalty for cloning the columns into a contiguous buffer that becomes noticeable with large data sizes.
Affects
matrix::broadcast_matrix_add, broadcast::matrix, minarrow::structs.matrix, matrix::broadcast_matrix_scalar_add, matrix::broadcast_scalar_matrix_add, matrix::broadcast_matrix_array_add, matrix::broadcast_array_matrix_add…
- zstd
-
Adds the zstd compression option for Parquet and IPC formats.
Zstd offers a higher compression ratio but is slightly slower.
Enables zstd
- snappy
-
Adds the snappy compression option for Parquet and IPC formats.
Snappy is lightweight and minimal, but compresses less than zstd.
Enables snappy
- cast_arrow = arrow, arrow-schema
-
Adds to_apache_arrow() for casting into that library.
- cast_polars = large_string, polars, polars-arrow
-
Adds to_polars() for casting into that library.
- datetime default datetime_ops?
-
Adds Datetime array types.
Affects
aliases::DatetimeAVT, aliases::DtArr, minarrow::collections.temporal_array, minarrow::variants.datetime, minarrow::collections.temporal_array_view, datetime::DatetimeArray, scalar::broadcast_scalar_to_temporal_arrayview, scalar::broadcast_temporal_arrayview_to_scalar, cube::broadcast_temporalarrayview_to_cube, cube::broadcast_cube_to_temporalarrayview, super_table::broadcast_temporalarrayview_to_supertable, super_table::broadcast_supertable_to_temporal_arrayview…
- simd default
-
Adds SIMD for the Bitmask and Arithmetic kernels.
A much more extensive set of kernels is available under the downstream simd-kernels crate.
Affects
arithmetic::simd, bitmask::simd…
- datetime_ops = datetime
-
Adds full datetime functionality with the time crate, including:
  - Human-readable datetime conversions
  - Timezone-aware operations
  - Date/time arithmetic (add/subtract durations, dates)
  - Comparison operations
  - Component extraction (year, month, day, hour, etc.)
This comes at the expense of an external dependency.
Without this feature, datetime values are raw integer offsets. The ArrowType stored in Field and/or FieldArray specifies the logical type (Date32, Date64, Timestamp, etc.) for Arrow FFI compatibility.
Affects
tz::TimezoneInfo, tz::TZ_DATABASE, tz::ABBR_TO_OFFSET, tz::lookup_timezone…
- str_arithmetic = memchr, ryu
-
Adds string arithmetic kernels.
Includes (small) external dependencies, and supports str concatenation with floats in the arithmetic kernels, e.g., "Hello" + 1.0 = "Hello1".
Also overloads std::ops::Add, Mul, Sub, Div, Pow with best-case String equivalents (e.g., '+' concatenates), for type unification rather than panicking.
Affects
string::apply_str_float, string::apply_str_str, string::apply_dict32_str, string::apply_str_dict32, string::apply_dict32_num, string::format_finite…
- fast_hash
-
Replaces all hashmaps and hashsets used for count distinct operations and categorical dictionary interning with the faster ahash.
Enables ahash
- broadcast
-
Adds typed arithmetic broadcasting for add, sub, mult, div, rem.
Affects
arithmetic::types, minarrow::kernels.broadcast…
- size
-
Adds a byte size trait for best-effort size calculation.
Affects
minarrow::traits.byte_size…
- select
-
Adds pandas-style selection for Table and TableV with .c() and .r() methods.
Affects
minarrow::traits.selection, array_view::ArrayV.active_data_selection, table_view::TableV.active_col_selection, table_view::TableV.active_row_selection…
- default = chunked, datetime, large_string, scalar_type, simd, views
-
These default features are set whenever minarrow is added without default-features = false somewhere in the dependency tree.
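As a sketch of opting out, assuming a build that wants only views and simd from the default set, the entry below disables the defaults and re-adds two of them by name. The selection is hypothetical, and whether this exact subset builds cleanly has not been checked here. Because Cargo unifies features across the dependency tree, the remaining defaults come back if any other crate in the tree depends on minarrow without default-features = false.

```toml
[dependencies]
# Start from no features, then list only what this crate needs.
# Defaults dropped in this example: chunked, datetime, large_string, scalar_type.
minarrow = { version = "0.4.1", default-features = false, features = ["views", "simd"] }
```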
Features from optional dependencies
In crates that don't use the dep: syntax, optional dependencies automatically become Cargo features. These features may have been created by mistake, and this functionality may be removed in the future.
None of the external dependencies below need to be enabled directly. See [features] for the relevant feature that enables each of them.
- arrow cast_arrow?
-
Enables arrow ^55.2.0
Arrow and Polars back the optional to/from_apache_arrow() and to/from_polars() conversions, which are available via the optional cast_arrow and cast_polars features.
- arrow-schema cast_arrow?
-
Enables arrow-schema ^55.2.0
- polars cast_polars?
-
Enables polars ^0.50.0
- polars-arrow cast_polars?
-
Enables polars-arrow ^0.50.0
- rayon parallel_proc?
- ryu str_arithmetic?
- memchr str_arithmetic?
- regex implicit feature
-
Affects
string::regex_str_str, string::regex_dict_str, string::regex_str_dict, string::regex_dict_dict…
- cc build c_ffi_tests?
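To illustrate the note that these dependencies do not need to be enabled directly, here is a minimal sketch assuming a project that wants the Apache Arrow conversion. It requests the named feature, which the feature list declares as cast_arrow = arrow, arrow-schema, instead of naming the optional dependencies themselves.

```toml
[dependencies]
# Preferred: ask for the named feature; it pulls in the optional
# arrow and arrow-schema dependencies.
minarrow = { version = "0.4.1", features = ["cast_arrow"] }

# Not needed: listing the implicit dependency features by hand.
# minarrow = { version = "0.4.1", features = ["arrow", "arrow-schema"] }
```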