12 unstable releases (4 breaking)
Version | Release date |
---|---|
0.6.1 | Aug 13, 2023 |
0.6.0 | May 22, 2023 |
0.5.4 | Apr 29, 2023 |
0.4.0 | Mar 16, 2023 |
0.2.1 | Sep 27, 2022 |
#293 in Machine learning
40 downloads per month
175KB, 2.5K SLoC
ai-dataloader
A Rust port of the PyTorch `dataloader` library.
Note: This project is still heavily in development and is at an early stage.
Highlights
- Iterable or indexable (map-style) `DataLoader`.
- Customizable `Sampler`, `BatchSampler` and `collate_fn`.
- Parallel dataloader using `rayon` for indexable dataloader (experimental).
- Integration with `ndarray` and `tch-rs`, CPU and GPU support.
- Default collate function that will automatically collate most of your types (supporting nesting).
- Shuffling for iterable and indexable `DataLoader`.
More info in the documentation.
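To make the idea of collating concrete, here is a minimal sketch in plain Rust of what a collate step does conceptually: it turns a batch of samples into one container per field. The helper `collate_pairs` is hypothetical and is not the crate's implementation or API; it only illustrates the shape of the transformation for two-field samples.

```rust
/// Hypothetical helper (not part of ai-dataloader): collates a batch of
/// two-field samples into one `Vec` per field.
fn collate_pairs<A, B>(batch: Vec<(A, B)>) -> (Vec<A>, Vec<B>) {
    // `unzip` splits an iterator of pairs into two collections.
    batch.into_iter().unzip()
}

fn main() {
    // One batch of (label, text) samples.
    let batch = vec![(0, "hola"), (1, "hello")];
    let (labels, texts) = collate_pairs(batch);
    println!("Label {labels:?}");
    println!("Text {texts:?}");
}
```

The crate's default collate function is described as handling most types, including nested ones; this sketch covers only the flat tuple case.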
Examples
More examples can be found in the examples folder; here is a simple one:
    use ai_dataloader::DataLoader;

    // Build a shuffled loader that yields batches of two samples.
    let loader = DataLoader::builder(vec![(0, "hola"), (1, "hello"), (2, "hallo"), (3, "bonjour")])
        .batch_size(2)
        .shuffle()
        .build();

    for (label, text) in &loader {
        println!("Label {label:?}");
        println!("Text {text:?}");
    }
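The example above asks for batches of two over four samples. As a rough illustration of what a sequential batch sampler does with dataset indices, here is a sketch in plain Rust. The function `batch_indices` and its `drop_last` parameter are hypothetical names (the latter borrowed from PyTorch terminology) and are assumptions, not the crate's API; shuffling is left out.

```rust
/// Hypothetical helper (not part of ai-dataloader): groups the indices
/// `0..len` into consecutive batches of `batch_size`.
/// When `drop_last` is true, a trailing incomplete batch is discarded.
fn batch_indices(len: usize, batch_size: usize, drop_last: bool) -> Vec<Vec<usize>> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let indices: Vec<usize> = (0..len).collect();
    indices
        .chunks(batch_size)
        // Keep every chunk unless it is short and `drop_last` is set.
        .filter(|chunk| !drop_last || chunk.len() == batch_size)
        .map(|chunk| chunk.to_vec())
        .collect()
}

fn main() {
    // Four samples with a batch size of two, as in the example above.
    for batch in batch_indices(4, 2, false) {
        println!("{batch:?}");
    }
}
```

A loader would then fetch the samples at each batch of indices and pass them to the collate function.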
`tch-rs` integration
To collate your data into torch tensors that can run on the GPU, you must activate the `tch` feature.
This feature relies on the `tch` crate for bindings to the C++ `libtorch` API. The `libtorch` library is required and can be downloaded either automatically or manually. For detailed setup instructions or support, please refer to the `tch` crate documentation.
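As a sketch, activating the feature in `Cargo.toml` might look like the following. The feature name `tch` is the one named above; the version constraint is an assumption based on the latest release listed on this page (0.6.1), so adjust it to the release you actually use.

```toml
[dependencies]
# Feature name `tch` as described above; the version constraint is an assumption.
ai-dataloader = { version = "0.6", features = ["tch"] }
```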
Next Features
These features could be added in the future:
- `RandomSampler` with replacement
- parallel `dataloader` for iterable datasets
- distributed `dataloader`
MSRV
The current MSRV is 1.60.
Dependencies
~3.5–6MB
~120K SLoC