8 releases
Version | Released |
---|---|
0.2.0 | Oct 17, 2024 |
0.1.2 | Mar 6, 2024 |
0.1.1 | Feb 8, 2024 |
0.1.0-beta.2 | Jan 29, 2024 |
#179 in HTTP client
226 downloads per month
Used in bed-reader
46KB
387 lines
cloud-file
Simple reading of cloud files in Rust
Highlights
- HTTP, AWS S3, Azure, Google, or local
- Sequential or random access
- Simplifies use of the powerful object_store crate, focusing on a useful subset of its features
- Access files via URLs and string-based options
- Read binary or text
- Fully async
- Used by genomics crate BedReader, which is used by other Rust and Python projects
- Also see "Nine Rules for Accessing Cloud Files from Your Rust Code: Practical Lessons from Upgrading Bed-Reader, a Bioinformatics Library" in Towards Data Science.
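To make the "access files via URLs" highlight concrete, here is a minimal sketch of how a URL's scheme can indicate which kind of storage it points to. The helper `backend_for` and its scheme list (`s3`, `az`, `gs`, `file`, `http`/`https`) are assumptions for illustration, based on commonly used URL schemes; they are not cloud-file's or object_store's actual dispatch code.

```rust
/// Hypothetical helper: names the storage backend that a URL's scheme suggests.
/// The scheme list is an illustrative assumption, not cloud-file's own logic.
fn backend_for(url: &str) -> Option<&'static str> {
    // Everything before "://" is the scheme; strings without one are rejected.
    let (scheme, _rest) = url.split_once("://")?;
    match scheme.to_ascii_lowercase().as_str() {
        "http" | "https" => Some("HTTP"),
        "s3" => Some("AWS S3"),
        "az" => Some("Azure"),
        "gs" => Some("Google"),
        "file" => Some("local"),
        _ => None, // unknown scheme
    }
}
```

With this sketch, `backend_for("s3://bucket/key")` yields `Some("AWS S3")`, while a string with no scheme yields `None`. In the real crate, the URL is simply passed to `CloudFile::new` or `CloudFile::new_with_options`, as in the examples in this README.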
Install
cargo add cloud-file
Examples
Find the size of a cloud file.
use cloud_file::CloudFile;
# Runtime::new().unwrap().block_on(async { // '#' needed for doctest
let url = "https://raw.githubusercontent.com/fastlmm/bed-sample-files/main/toydata.5chrom.fam";
let cloud_file = CloudFile::new(url)?;
let file_size = cloud_file.read_file_size().await?;
assert_eq!(file_size, 14_361);
# Ok::<(), Box<dyn std::error::Error>>(()) }).unwrap();
# use {cloud_file::CloudFileError, tokio::runtime::Runtime};
Find the number of lines in a cloud file.
use cloud_file::CloudFile;
use futures::StreamExt; // Enables `.next()` on streams.
# Runtime::new().unwrap().block_on(async { // '#' needed for doctest
let url = "https://raw.githubusercontent.com/fastlmm/bed-sample-files/main/toydata.5chrom.fam";
let cloud_file = CloudFile::new_with_options(url, [("timeout", "30s")])?;
let mut chunks = cloud_file.stream_chunks().await?;
let mut newline_count: usize = 0;
while let Some(chunk) = chunks.next().await {
    let chunk = chunk?;
    newline_count += bytecount::count(&chunk, b'\n');
}
assert_eq!(newline_count, 500);
# Ok::<(), Box<dyn std::error::Error>>(()) }).unwrap();
# use {cloud_file::CloudFileError, tokio::runtime::Runtime};
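The counting step in the example above does not depend on where the stream happens to split the file: each newline is a single byte, so per-chunk counts simply add up. The following offline sketch shows that with in-memory chunks and only the standard library. The function `count_newlines` is a hypothetical stand-in for the loop above, and a plain iterator filter is swapped in for `bytecount::count`.

```rust
/// Counts b'\n' bytes across a sequence of chunks, one chunk at a time.
/// Hypothetical stand-in for the streaming loop; uses std only.
fn count_newlines<'a, I>(chunks: I) -> usize
where
    I: IntoIterator<Item = &'a [u8]>,
{
    chunks
        .into_iter()
        // Count the newline bytes inside this chunk only.
        .map(|chunk| chunk.iter().filter(|&&b| b == b'\n').count())
        // Per-chunk counts add up to the whole-file count.
        .sum()
}
```

Splitting the same bytes at different boundaries gives the same total, which is why the streaming example needs no state beyond the running counter.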
More examples
Example | Demonstrates |
---|---|
line_count | Read a file as binary chunks. |
nth_line | Read a file as text lines. |
bigram_counts | Read random regions of a file, without regard to order. |
aws_file_size | Find the size of a file on AWS. |
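Reading text lines from binary chunks takes slightly more care than counting bytes, because a line can straddle a chunk boundary. The sketch below illustrates that idea offline with the standard library only. It is not the crate's nth_line example, whose implementation may differ; the function name `nth_line` here, the zero-based index, and the choice to split on `\n` alone (no `\r\n` handling) are assumptions.

```rust
/// Returns the line at zero-based `index`, reading `chunks` in order.
/// A line may straddle chunk boundaries, so the unfinished tail is carried forward.
fn nth_line<'a, I>(chunks: I, index: usize) -> Option<String>
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut pending: Vec<u8> = Vec::new(); // bytes of the line being assembled
    let mut line_number = 0;
    for chunk in chunks {
        for &byte in chunk {
            if byte == b'\n' {
                if line_number == index {
                    return Some(String::from_utf8_lossy(&pending).into_owned());
                }
                line_number += 1;
                pending.clear();
            } else {
                pending.push(byte);
            }
        }
    }
    // A final line without a trailing newline still counts.
    if line_number == index && !pending.is_empty() {
        return Some(String::from_utf8_lossy(&pending).into_owned());
    }
    None
}
```

Because the function returns as soon as the requested line is complete, later chunks are never touched, which mirrors the benefit of streaming over downloading the whole file.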
Dependencies
~9–18MB
~236K SLoC