bin+lib cached-path

Download and cache HTTP resources

12 releases (4 breaking)

0.5.0 Jan 29, 2021
0.4.5 Sep 15, 2020
0.3.0 Jun 13, 2020
0.2.0 Jun 12, 2020
0.1.0 Dec 30, 2019

1,223 downloads per month
Used in 2 crates (via rust-bert)

Apache-2.0

72KB
1K SLoC

rust-cached-path

The idea behind cached-path is to provide a unified, simple interface for accessing both local and remote files. It can sit behind other APIs that need to access files regardless of where they are located.

This is based on allennlp/common/file_utils.py and transformers/file_utils.py.

Installation

cached-path can be used as both a library and a command-line tool. To install cached-path as a command-line tool, run

cargo install --features build-binary cached-path

Usage

For remote resources, cached-path downloads and caches the resource, using the ETag to decide when to update the cache. The path returned is the local path to the latest cached version:

use cached_path::cached_path;

let path = cached_path(
    "https://github.com/epwalsh/rust-cached-path/blob/master/README.md"
).unwrap();
assert!(path.is_file());

# From the command line:
$ cached-path https://github.com/epwalsh/rust-cached-path/blob/master/README.md
/tmp/cache/055968a99316f3a42e7bcff61d3f590227dd7b03d17e09c41282def7c622ba0f.efa33e7f611ef2d163fea874ce614bb6fa5ab2a9d39d5047425e39ebe59fe782
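The cached path shown above consists of two hex digests joined by a dot, which would be consistent with naming the file after the URL and the ETag. The source does not confirm this scheme, so the following is only a minimal sketch of how ETag-keyed cache names could drive the "update when the ETag changes" behaviour. The helpers `hex_digest`, `cache_file_name`, and `needs_download` are hypothetical and not part of the cached-path API, and `DefaultHasher` is a std-only stand-in for a stable cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a string to a fixed-width hex digest.
/// ASSUMPTION: `DefaultHasher` keeps the sketch std-only; a real cache
/// would want a stable cryptographic hash so names survive upgrades.
fn hex_digest(input: &str) -> String {
    let mut hasher = DefaultHasher::new();
    input.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Hypothetical helper: derive a cache file name from a URL and its ETag.
/// A new ETag for the same URL yields a new file name.
fn cache_file_name(url: &str, etag: Option<&str>) -> String {
    match etag {
        Some(tag) => format!("{}.{}", hex_digest(url), hex_digest(tag)),
        None => hex_digest(url),
    }
}

/// Hypothetical helper: a download is needed unless a file named for the
/// latest ETag is already present in the cache listing.
fn needs_download(existing: &[String], url: &str, latest_etag: Option<&str>) -> bool {
    !existing.contains(&cache_file_name(url, latest_etag))
}
```

With this naming, a changed ETag simply fails the lookup and triggers a fresh download, while an unchanged ETag resolves to the file already on disk.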

For local files, the path returned is just the original path supplied:

use cached_path::cached_path;

let path = cached_path("README.md").unwrap();
assert_eq!(path.to_str().unwrap(), "README.md");

# From the command line:
$ cached-path README.md
README.md
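The split between remote and local resources can be pictured as a simple dispatch on the resource string. This is a sketch under the assumption that a resource counts as remote when it starts with `http://` or `https://`; the `Resource` enum and `classify` function are hypothetical names for illustration, not the crate's actual internals.

```rust
use std::path::PathBuf;

/// Hypothetical classification of a resource string.
#[derive(Debug, PartialEq)]
enum Resource {
    /// Would be downloaded and cached.
    Remote(String),
    /// Would be returned as the original path.
    Local(PathBuf),
}

/// ASSUMPTION: only the URL scheme decides whether a resource is remote.
fn classify(resource: &str) -> Resource {
    if resource.starts_with("http://") || resource.starts_with("https://") {
        Resource::Remote(resource.to_string())
    } else {
        Resource::Local(PathBuf::from(resource))
    }
}
```

Under this sketch, `classify("README.md")` yields `Resource::Local` holding the same path that was passed in, which mirrors the behaviour described above for local files.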

For resources that are archives, like *.tar.gz files, cached-path can also automatically extract the files:

use cached_path::{cached_path_with_options, Options};

let path = cached_path_with_options(
    "https://raw.githubusercontent.com/epwalsh/rust-cached-path/master/test_fixtures/utf-8_sample/archives/utf-8.tar.gz",
    &Options::default().extract(),
).unwrap();
assert!(path.is_dir());

# From the command line:
$ cached-path --extract https://raw.githubusercontent.com/epwalsh/rust-cached-path/master/test_fixtures/utf-8_sample/archives/utf-8.tar.gz
# Prints the local path of the extracted directory in the cache.
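To make the extraction step more concrete, here is a rough sketch of how a cache might decide that a resource is an archive and where to unpack it. Both helpers are hypothetical: the suffix list (`.tar.gz`, `.zip`) and the `-extracted` directory suffix are assumptions for illustration, not documented behaviour of cached-path.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical check: treat a resource as an archive based on its suffix.
/// ASSUMPTION: the suffix list here is illustrative only.
fn looks_like_archive(resource: &str) -> bool {
    resource.ends_with(".tar.gz") || resource.ends_with(".zip")
}

/// Hypothetical naming scheme: unpack next to the cached file, in a
/// directory named after it with "-extracted" appended.
fn extraction_dir(cached_file: &Path) -> PathBuf {
    let mut name = cached_file.as_os_str().to_os_string();
    name.push("-extracted");
    PathBuf::from(name)
}
```

Keeping the extracted directory alongside the cached archive would let a later call reuse the unpacked files rather than extracting again.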

It's also easy to customize the cache location, the HTTP client, and other options by using a CacheBuilder to construct a custom Cache object. This is the recommended approach if your application makes multiple calls to cached_path, since it avoids the overhead of creating a new HTTP client on each call:

use cached_path::Cache;

let cache = Cache::builder()
    .dir(std::env::temp_dir().join("my-cache/"))
    .connect_timeout(std::time::Duration::from_secs(3))
    .build().unwrap();
let path = cache.cached_path("README.md").unwrap();

# From the command line:
$ cached-path --dir /tmp/my-cache/ --connect-timeout 3 README.md
README.md

Dependencies

~7–12MB
~271K SLoC