2 releases

Version | Date
---|---
0.1.1 | Mar 31, 2024
0.1.0 | Mar 30, 2024
Roboto: Parse and use robots.txt files
Roboto provides a type-safe way to parse and apply robots.txt files. It implements the Robots Exclusion Protocol, the convention site owners use to tell web crawlers and other web robots which parts of a site they may fetch; since compliance is voluntary, the rules can only approximately control crawler behavior.
Installation
Add this to your Cargo.toml:
[dependencies]
roboto = "0.1"
Usage
use roboto::Robots;

let robots = r#"
User-agent: *
Disallow: /private
Disallow: /tmp
"#.parse::<Robots>().unwrap();

// The target type of `parse` is inferred from the `is_allowed` call below.
let user_agent = "googlebot".parse().unwrap();

assert!(robots.is_allowed(&user_agent, "/public"));
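The same rules should report the listed prefixes as blocked. A minimal continuation of the example above, assuming the standard prefix matching of the Robots Exclusion Protocol:

// Paths under a Disallow prefix are reported as blocked.
assert!(!robots.is_allowed(&user_agent, "/private"));
assert!(!robots.is_allowed(&user_agent, "/tmp"));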
lib.rs:
Parsing and applying robots.txt files.
Examples
use roboto::Robots;

let robots = r#"
User-agent: *
Disallow: /
"#.parse::<Robots>().unwrap();

let user_agent = "googlebot".parse().unwrap();

// `Disallow: /` blocks every path...
assert!(!robots.is_allowed(&user_agent, "/"));
assert!(!robots.is_allowed(&user_agent, "/foo/bar"));
// ...but robots.txt itself is still reported as allowed.
assert!(robots.is_allowed(&user_agent, "/robots.txt"));
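Groups can also target a specific user agent. The sketch below is an assumption-laden illustration: it assumes the crate follows the Robots Exclusion Protocol's group matching, where a robot obeys the most specific User-agent group that names it and ignores the wildcard group.

use roboto::Robots;

let robots = r#"
User-agent: googlebot
Disallow: /private

User-agent: *
Disallow: /
"#.parse::<Robots>().unwrap();

// googlebot matches its own group, so only /private is off limits...
let googlebot = "googlebot".parse().unwrap();
assert!(robots.is_allowed(&googlebot, "/public"));
assert!(!robots.is_allowed(&googlebot, "/private"));

// ...while a robot without its own group falls back to the wildcard rules.
let otherbot = "otherbot".parse().unwrap();
assert!(!robots.is_allowed(&otherbot, "/public"));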