3 releases (breaking)

| Version | Released |
|---|---|
| 0.3.0 | Feb 17, 2023 |
| 0.2.0 | Mar 21, 2022 |
| 0.1.0 | Oct 9, 2020 |
MaybeXml
MaybeXml is a library to scan and evaluate XML-like data into tokens. In effect, the library provides a non-validating lexer. The interface is similar to many XML pull parsers.
The library does 3 things:
- A Scanner receives byte slices and identifies the start and end of tokens like tags, character content, and declarations.

- An Evaluator transforms bytes from an input source (like instances of types which implement std::io::BufRead) into complete tokens via either a cursor or an iterator pull style API.

  From an implementation point of view, when a library user asks an Evaluator for the next token, the Evaluator reads the input and passes the bytes to an internal Scanner. The Evaluator buffers the scanned bytes and keeps reading until the Scanner determines a token has been completely read. Then all of the bytes which represent the token are returned to the library user as a variant of a token type.

- Each token type provides methods which can provide views into the underlying bytes. For instance, a tag token could provide a name() method which returns a TagName. The TagName provides a method like to_str() which can be called to get a str representation of the tag name.
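The buffering loop described above can be sketched with only the standard library. This is a toy illustration under stated assumptions, not the crate's implementation: the names scan, ToyEvaluator, and next_token are hypothetical, and the scanner assumes a tag ends at the first > byte, so a > inside a quoted attribute value is not handled.

```rust
use std::io::{BufRead, BufReader};

/// Hypothetical toy scanner (not the maybe_xml Scanner).
/// Returns the byte length of the first complete token in `buf`,
/// or `None` if more input is needed to find the token's end.
fn scan(buf: &[u8]) -> Option<usize> {
    match *buf.first()? {
        // Markup: complete once the closing `>` has been seen.
        b'<' => buf.iter().position(|&b| b == b'>').map(|i| i + 1),
        // Character content: complete once the next `<` has been seen.
        _ => buf.iter().position(|&b| b == b'<'),
    }
}

/// Hypothetical toy evaluator: reads from a `BufRead` source and buffers
/// bytes until the scanner reports that a whole token is available.
struct ToyEvaluator<R> {
    reader: R,
    pending: Vec<u8>,
}

impl<R: BufRead> ToyEvaluator<R> {
    fn new(reader: R) -> Self {
        Self { reader, pending: Vec::new() }
    }

    /// Returns the bytes of the next token, or `None` at end of input.
    fn next_token(&mut self) -> std::io::Result<Option<Vec<u8>>> {
        loop {
            // Ask the scanner whether the buffered bytes hold a full token.
            if let Some(len) = scan(&self.pending) {
                let rest = self.pending.split_off(len);
                let token = std::mem::replace(&mut self.pending, rest);
                return Ok(Some(token));
            }
            // Otherwise read more input and try again.
            let chunk = self.reader.fill_buf()?;
            if chunk.is_empty() {
                // End of input: hand back any trailing bytes as a final token.
                if self.pending.is_empty() {
                    return Ok(None);
                }
                return Ok(Some(std::mem::take(&mut self.pending)));
            }
            let n = chunk.len();
            self.pending.extend_from_slice(chunk);
            self.reader.consume(n);
        }
    }
}

fn main() -> std::io::Result<()> {
    // A 4-byte buffer forces tokens to span several reads.
    let input = "<ID>Example</ID>".as_bytes();
    let mut eval = ToyEvaluator::new(BufReader::with_capacity(4, input));
    while let Some(token) = eval.next_token()? {
        println!("{}", String::from_utf8_lossy(&token));
    }
    Ok(())
}
```

The small BufReader capacity in main is only there to exercise the buffering path. Nothing in the sketch checks that tags are nested or matched, which is the sense in which such a lexer is non-validating.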
Purpose
The purpose of the library is to provide a way to read XML documents including office suite documents, RSS/Atom feeds, config files, SVG, and web service messages.
Installation
By default, features which depend on the Rust std library are included.
[dependencies]
maybe_xml = "0.3.0"
Alloc Only
If the host environment has an allocator but does not have access to the Rust std library:
[dependencies]
maybe_xml = { version = "0.3.0", default-features = false, features = ["alloc"]}
Most of the library, except for Evaluators which rely on std types (such as std::io::BufRead), is still available.
No allocator
If the host environment does not have an allocator:
[dependencies]
maybe_xml = { version = "0.3.0", default-features = false }
The Scanner and the borrowed versions of the tokens are available.
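To illustrate why borrowed tokens need no allocator, here is a minimal sketch of a token as a thin view over bytes owned elsewhere. The types StartTagRef and TagNameRef are hypothetical stand-ins, not the crate's borrowed token types, and the name-splitting rule (stop at whitespace, / or >) is a simplifying assumption. Only items from core are used.

```rust
/// Hypothetical borrowed start tag: a view over bytes owned by the caller.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct StartTagRef<'a>(&'a [u8]);

/// Hypothetical borrowed tag name: a sub-slice of the tag's bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TagNameRef<'a>(&'a [u8]);

impl<'a> StartTagRef<'a> {
    /// Returns the bytes after `<` up to the first whitespace, `/`, or `>`.
    /// No bytes are copied; the result borrows from the original input.
    fn name(&self) -> TagNameRef<'a> {
        let inner = self.0.strip_prefix(b"<").unwrap_or(self.0);
        let end = inner
            .iter()
            .position(|b| b.is_ascii_whitespace() || *b == b'/' || *b == b'>')
            .unwrap_or(inner.len());
        TagNameRef(&inner[..end])
    }
}

impl<'a> TagNameRef<'a> {
    /// Interprets the name bytes as UTF-8 without allocating.
    fn to_str(&self) -> Result<&'a str, core::str::Utf8Error> {
        core::str::from_utf8(self.0)
    }
}

fn main() {
    let bytes: &[u8] = b"<ID attr=\"1\">";
    let tag = StartTagRef(bytes);
    println!("{:?}", tag.name().to_str()); // prints Ok("ID")
}
```

Because each view is only a slice reference, it is Copy and lives no longer than the input buffer it points into.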
Example
The following is a short example showing the iterator API. The full example with all the module imports and error handling is in the lib.rs source file.
    use maybe_xml::token::owned::{Token, StartTag, Characters, EndTag};

    // Wrapped in a function returning Result so that the `?` below compiles.
    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let input = std::io::BufReader::new(r#"<ID>Example</ID>"#.as_bytes());

        let eval = maybe_xml::eval::bufread::BufReadEvaluator::from_reader(input);

        // Lowercase start and end tags; pass every other token through unchanged.
        let mut iter = eval.into_iter()
            .map(|token| match token {
                Token::StartTag(start_tag) => {
                    if let Ok(str) = start_tag.to_str() {
                        Token::StartTag(StartTag::from(str.to_lowercase()))
                    } else {
                        // Not valid UTF-8: keep the original bytes.
                        Token::StartTag(start_tag)
                    }
                }
                Token::EndTag(end_tag) => {
                    if let Ok(str) = end_tag.to_str() {
                        Token::EndTag(EndTag::from(str.to_lowercase()))
                    } else {
                        Token::EndTag(end_tag)
                    }
                }
                _ => token,
            });

        let token = iter.next();
        assert_eq!(token, Some(Token::StartTag(StartTag::from("<id>"))));
        match token {
            Some(Token::StartTag(start_tag)) => {
                // The tag name is a view into the token's bytes.
                assert_eq!(start_tag.name().to_str()?, "id");
            }
            _ => panic!("unexpected token"),
        }
        assert_eq!(iter.next(), Some(Token::Characters(Characters::from("Example"))));
        assert_eq!(iter.next(), Some(Token::EndTag(EndTag::from("</id>"))));
        // The iterator yields an explicit Eof token before it is exhausted.
        assert_eq!(iter.next(), Some(Token::Eof));
        assert_eq!(iter.next(), None);
        Ok(())
    }
License
Licensed under either of Apache License, Version 2.0 or MIT License at your option.
Contributions
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.