#xml-parser #xml-document #xml #xml-data #minimalist #namespaces #events

rxml

Minimalistic, restricted XML 1.0 parser which does not include dangerous XML features

13 releases (breaking)

0.10.0 Mar 16, 2024
0.9.1 Jan 25, 2023
0.8.2 Dec 17, 2022
0.8.1 May 15, 2022
0.3.0 Jun 26, 2021

#253 in Parser implementations

Download history: between 8,449 and 19,412 downloads per week from 2024-01-01 to 2024-04-15

56,110 downloads per month
Used in 29 crates (2 directly)

MIT license

365KB
10K SLoC

rxml — Restricted, minimalistic XML 1.0 parser

This crate provides "restricted" parsing of XML 1.0 documents with namespacing.


Features (some call them restrictions)

  • No external resources
  • No custom entities
  • No DTD whatsoever
  • No processing instructions
  • No comments
  • UTF-8 only
  • Namespacing-well-formedness enforced
  • XML 1.0 only
  • Streamed parsing (parser emits a subset of SAX events)
  • Streamed encoding
  • Parser can be driven push- and pull-based
  • Tokio-based asynchronicity supported via the tokio feature and AsyncReader.
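
To make the push/pull distinction concrete, here is a minimal, std-only sketch. It does not use rxml at all: `LineTokenizer` and `PullDriver` are hypothetical names, and "one event per complete line" stands in for real XML events. In push mode the caller hands bytes to the tokenizer as they arrive. In pull mode a driver owns the data source and reads from it only when the caller asks for the next event.

```rust
use std::collections::VecDeque;
use std::io::{self, BufRead};

/// Toy incremental tokenizer (hypothetical, not part of rxml):
/// it emits one event per complete line of input.
#[derive(Default)]
struct LineTokenizer {
    pending: Vec<u8>,        // bytes of the current, still incomplete line
    ready: VecDeque<String>, // completed lines not yet handed out
}

impl LineTokenizer {
    /// Push-style entry point: the caller feeds whatever bytes it has.
    fn feed(&mut self, chunk: &[u8]) {
        for &b in chunk {
            if b == b'\n' {
                let line = String::from_utf8_lossy(&self.pending).into_owned();
                self.ready.push_back(line);
                self.pending.clear();
            } else {
                self.pending.push(b);
            }
        }
    }

    /// Returns the next completed event, if one is available.
    fn poll(&mut self) -> Option<String> {
        self.ready.pop_front()
    }
}

/// Pull-style wrapper: owns the source and reads from it on demand.
struct PullDriver<R> {
    source: R,
    tokenizer: LineTokenizer,
}

impl<R: BufRead> PullDriver<R> {
    fn new(source: R) -> Self {
        Self { source, tokenizer: LineTokenizer::default() }
    }

    /// Reads more input until an event is available or the source is exhausted.
    /// A trailing line without a newline is dropped in this sketch.
    fn next_event(&mut self) -> io::Result<Option<String>> {
        loop {
            if let Some(ev) = self.tokenizer.poll() {
                return Ok(Some(ev));
            }
            let buf = self.source.fill_buf()?;
            if buf.is_empty() {
                return Ok(None); // end of input
            }
            let n = buf.len();
            self.tokenizer.feed(buf);
            self.source.consume(n);
        }
    }
}

fn main() -> io::Result<()> {
    // Push: chunk boundaries are arbitrary; an event appears once a line is complete.
    let mut tokenizer = LineTokenizer::default();
    tokenizer.feed(b"hel");
    assert_eq!(tokenizer.poll(), None);
    tokenizer.feed(b"lo\nwor");
    assert_eq!(tokenizer.poll(), Some("hello".to_string()));

    // Pull: the driver fetches input itself whenever the next event is requested.
    let mut driver = PullDriver::new(&b"first\nsecond\n"[..]);
    while let Some(ev) = driver.next_event()? {
        println!("got event: {ev}");
    }
    Ok(())
}
```

The pull driver is a thin loop around the push interface, which is one common way to offer both styles from a single state machine. Whether rxml is structured this way internally is not stated here.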

Examples

Parse data from byte slices

To parse an XML document from a byte slice (or a series of byte slices), you can use the Parser with the Parse trait directly:

use rxml::{Parse, Parser};

let mut doc = &b"<?xml version='1.0'?><hello>World!</hello>"[..];
let mut fp = Parser::new();
// parse() takes the slice by mutable reference; the loop runs until it has been consumed
while !doc.is_empty() {
	let ev = fp.parse(&mut doc, true);  // true = doc contains the entire document
	println!("got event: {:?}", ev);
}

Parse data from a standard library reader

To parse an XML document from a type implementing std::io::BufRead, you can use the Reader.

use std::io::BufReader;

// in-memory stand-in for a file; any std::io::Read source works here
let file = &mut &b"<?xml version='1.0'?><hello>World!</hello>"[..];
// let file = std::fs::File::open(..).unwrap();
let reader = BufReader::new(file);
let mut reader = rxml::Reader::<_>::new(reader);
// read_all invokes the callback once per parsed event
let result = rxml::as_eof_flag(reader.read_all(|ev| {
	println!("got event: {:?}", ev);
}));
assert!(result.unwrap());  // true indicates eof

Parse data using tokio

To parse an XML document from a type implementing tokio::io::AsyncBufRead, you can use the AsyncReader.

This requires the tokio feature.

use rxml::{AsyncReader, ResolvedEvent, XmlVersion};

// tokio_test is only used here to drive the future to completion
tokio_test::block_on(async {
	// in-memory stand-in for a socket
	let sock = &mut &b"<?xml version='1.0'?><hello>World!</hello>"[..];
	// let sock = ..;
	// wrap the source in a buffered reader, which implements tokio::io::AsyncBufRead
	let reader = tokio::io::BufReader::new(sock);
	let mut reader = AsyncReader::<_>::new(reader);
	// we expect the first event to be the XML declaration
	let ev = reader.read().await;
	assert!(matches!(ev.unwrap().unwrap(), ResolvedEvent::XmlDeclaration(_, XmlVersion::V1_0)));
});

Feature flags

  • macros: Enable macros to convert &str to &NameStr, &NcNameStr and &CDataStr respectively.
  • smartstring (default): Enable the use of smartstring for some string types to avoid allocations and conserve heap memory.
  • tokio (default): Enable AsyncReader and related types. Implies sync.
  • sync (default): Use Arc instead of Rc for deduplicated namespace URIs.
  • stream: Add a futures::Stream implementation to AsyncReader. Implies tokio.
  • shared_ns: Allow deduplication of namespace URIs within and across parsers.
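
As a rough illustration of what deduplicating namespace URIs behind a reference-counted pointer can look like, here is a std-only sketch. `NamespaceInterner` is a hypothetical type written for this example and is not rxml's implementation. It uses `Arc<str>`, which is the thread-safe choice the sync flag describes; swapping in `Rc<str>` would avoid atomic operations but prevent moving handles across threads.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interner (not an rxml type): stores each distinct
/// namespace URI once and hands out cheap shared handles to it.
#[derive(Default)]
struct NamespaceInterner {
    known: HashSet<Arc<str>>,
}

impl NamespaceInterner {
    /// Returns a shared handle for `uri`, allocating only on first sight.
    fn intern(&mut self, uri: &str) -> Arc<str> {
        if let Some(existing) = self.known.get(uri) {
            return Arc::clone(existing);
        }
        let shared: Arc<str> = Arc::from(uri);
        self.known.insert(Arc::clone(&shared));
        shared
    }

    /// Number of distinct URIs stored so far.
    fn len(&self) -> usize {
        self.known.len()
    }
}

fn main() {
    let mut interner = NamespaceInterner::default();
    let a = interner.intern("urn:example:ns");
    let b = interner.intern("urn:example:ns");

    // Both handles point at the same allocation; only one copy is stored.
    assert!(Arc::ptr_eq(&a, &b));
    assert_eq!(interner.len(), 1);

    // Arc<str> is Send + Sync, so a handle can move to another thread.
    // The same line would not compile with Rc<str>.
    let handle = std::thread::spawn(move || b.len());
    assert_eq!(handle.join().unwrap(), a.len());
}
```

Documents that repeat the same namespace on many elements benefit most from this pattern, since each element then carries a pointer instead of its own copy of the URI string.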

Dependencies

~2.3–3.5MB
~55K SLoC