Version 0.1.0 (1 unstable release), published Apr 21, 2022. Uses the old Rust 2015 edition.
rbatis_xml_parser
Warning: This library is alpha quality, so no guarantees are given.
This crate provides a push-based XML parser library that trades well-formedness for error recovery.
rbatis_xml_parser is based largely on the html5ever parser, so if you have experience with html5ever, rbatis_xml_parser will feel familiar.
The library is dual-licensed under the MIT and Apache licenses.
#Why you should use rbatis_xml_parser
The main use case for this library is parsing badly formatted XML, which usually comes from bad XML templates. XML5 tries to handle the most common errors, in a manner similar to HTML5.
When you should use it
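To make "error recovery instead of well-formedness" concrete, here is a toy sketch using only the standard library. It is not this crate's algorithm or API: `ToyBalancer`, `feed`, `finish`, and `repair` are hypothetical names invented for illustration, and the sketch only understands bare `<name>` and `</name>` tags (no attributes, no self-closing tags). It shows two ideas: input is pushed in chunks that may split a tag, and broken nesting is repaired rather than rejected.

```rust
/// Toy push-style tag balancer (hypothetical, NOT part of rbatis_xml_parser).
struct ToyBalancer {
    pending: String,   // unprocessed input, e.g. a tag split across chunks
    open: Vec<String>, // stack of currently open element names
    out: String,       // repaired output built so far
}

impl ToyBalancer {
    fn new() -> Self {
        ToyBalancer { pending: String::new(), open: Vec::new(), out: String::new() }
    }

    /// Push one chunk of input; the chunk may end in the middle of a tag.
    fn feed(&mut self, chunk: &str) {
        self.pending.push_str(chunk);
        loop {
            let lt = match self.pending.find('<') {
                Some(i) => i,
                None => {
                    // Plain text only: emit it all and wait for more input.
                    self.out.push_str(&self.pending);
                    self.pending.clear();
                    return;
                }
            };
            // Emit the text that precedes the tag.
            let text: String = self.pending.drain(..lt).collect();
            self.out.push_str(&text);
            match self.pending.find('>') {
                // Incomplete tag: keep it buffered until the next chunk.
                None => return,
                Some(gt) => {
                    let tag: String = self.pending.drain(..=gt).collect();
                    self.handle_tag(&tag);
                }
            }
        }
    }

    fn handle_tag(&mut self, tag: &str) {
        let inner = &tag[1..tag.len() - 1]; // strip '<' and '>'
        if let Some(name) = inner.strip_prefix('/') {
            // Closing tag: close every element down to the matching one.
            // A stray closing tag with no matching open element is dropped.
            if let Some(pos) = self.open.iter().rposition(|n| n.as_str() == name) {
                while self.open.len() > pos {
                    let n = self.open.pop().unwrap();
                    self.out.push_str(&format!("</{}>", n));
                }
            }
        } else {
            self.open.push(inner.to_string());
            self.out.push_str(tag);
        }
    }

    /// Signal end of input: close whatever is still open.
    /// An unterminated tag left in the buffer is discarded.
    fn finish(mut self) -> String {
        while let Some(n) = self.open.pop() {
            self.out.push_str(&format!("</{}>", n));
        }
        self.out
    }
}

/// Convenience wrapper: push all chunks, then finish.
fn repair(chunks: &[&str]) -> String {
    let mut balancer = ToyBalancer::new();
    for chunk in chunks {
        balancer.feed(chunk);
    }
    balancer.finish()
}

fn main() {
    // `<b>` is never closed and `</c>` was never opened.
    let fixed = repair(&["<a><b>te", "xt</a", "></c>"]);
    println!("{}", fixed); // prints "<a><b>text</b></a>"
}
```

A validating parser would reject this input outright; a recovering parser, like the toy above, returns a usable tree-shaped result and leaves it to the caller to decide whether the repairs are acceptable.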
- You aren't interested in well-formed documents.
- You need to get some info from your data even if it has errors (although not all possible errors are handled).
- You want features like character references or XML namespaces.
When you shouldn't use it
- You need to have your document validated.
- You require DTD support.
- You require an easy-to-use parser with lots of extensions (e.g. XPath, XQuery).
- You require a battle-tested, industry-proven solution.
#Installation
Add rbatis_xml_parser as a dependency in your project manifest.
[dependencies]
rbatis_xml_parser = "0.1.0"
Then add the crate declaration to your lib.rs:
extern crate rbatis_xml_parser;
#Getting started
Here is a very simple RcDom-backed parser:
// `to_tendril` comes from tendril's `SliceExt` trait, which must be in scope.
let input = "<xml></xml>".to_tendril();
// To parse XML into a tree form, we need a TreeSink.
// Luckily, rbatis_xml_parser comes with a static Rc-backed tree representation.
let dom: RcDom = parse(std::iter::once(input), Default::default());
// Do something with dom
The actual parsing is done by the parse function. It expects an iterator whose items can be converted into StrTendril, so you can use std::iter::once(input) or Some(input).into_iter() (where input is a StrTendril-like structure).
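The "iterator of convertible chunks" shape of that input can be sketched without the crate itself. In the standard-library-only example below, `collect_input` is a hypothetical stand-in for a parse-like entry point, and `String` plays the role that StrTendril plays in the real crate. The point is only that a single-item iterator, an `Option`, and a multi-chunk collection all satisfy the same bound.

```rust
use std::iter;

/// Hypothetical stand-in for a parse-like entry point (NOT the crate's API).
/// It accepts anything iterable whose items convert into an owned buffer.
fn collect_input<I>(chunks: I) -> String
where
    I: IntoIterator,
    I::Item: Into<String>,
{
    let mut buf = String::new();
    for chunk in chunks {
        // Convert each chunk into the owned buffer type and append it.
        buf.push_str(&chunk.into());
    }
    buf
}

fn main() {
    // A single chunk wrapped in a one-item iterator.
    let a = collect_input(iter::once("<xml></xml>"));
    // `Option` is itself iterable, so `Some(input)` works the same way.
    let b = collect_input(Some("<xml></xml>"));
    // Several chunks, as they might arrive from a network stream.
    let c = collect_input(vec!["<xml>", "</xml>"]);

    assert_eq!(a, b);
    assert_eq!(b, c);
    println!("{}", a);
}
```

Taking an iterator rather than a single string is what lets a caller hand over the whole document at once or feed it piece by piece through the same function.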