#xml-parser #xml #xml-document #xml-data #namespaces #events #minimalist

rxml

Minimalistic, restricted XML 1.0 parser which does not include dangerous XML features

16 releases (10 breaking)

0.11.1 Jun 23, 2024
0.10.1 Jun 8, 2024
0.10.0 Mar 16, 2024
0.9.1 Jan 25, 2023
0.3.0 Jun 26, 2021

#295 in Parser implementations

53,574 downloads per month
Used in 30 crates (2 directly)

MIT license

425KB
12K SLoC

rxml — Restricted, minimalistic XML 1.0 parser

This crate provides "restricted" parsing of XML 1.0 documents with namespacing.

Features (some call them restrictions)

  • No external resources
  • No custom entities
  • No DTD whatsoever
  • No processing instructions
  • No comments
  • UTF-8 only
  • Namespacing-well-formedness enforced
  • XML 1.0 only
  • Streamed parsing (parser emits a subset of SAX events)
  • Streamed encoding
  • Parser can be driven in either push or pull mode
  • Tokio-based asynchronous parsing supported via the tokio feature and AsyncReader
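
Two of the items above interact: streamed parsing means input may arrive in arbitrary chunks, while UTF-8-only means every byte must eventually decode cleanly, even when a multi-byte character is split across two chunks. The following is a minimal standard-library sketch of that problem, not rxml's actual implementation; Utf8Chunker and its feed method are hypothetical names introduced purely for illustration.

```rust
use std::str;

/// Hypothetical helper (not part of rxml): decodes UTF-8 from chunks,
/// carrying an incomplete trailing sequence over to the next chunk.
#[derive(Default)]
struct Utf8Chunker {
    pending: Vec<u8>,
}

impl Utf8Chunker {
    /// Feed one chunk. Returns the text that can be decoded so far, or an
    /// error if the buffered bytes can never become valid UTF-8.
    fn feed(&mut self, chunk: &[u8]) -> Result<String, String> {
        self.pending.extend_from_slice(chunk);
        match str::from_utf8(&self.pending) {
            Ok(s) => {
                let out = s.to_owned();
                self.pending.clear();
                Ok(out)
            }
            Err(e) => {
                if e.error_len().is_some() {
                    // A definitely invalid byte sequence: reject the input.
                    return Err(format!("invalid UTF-8 at byte {}", e.valid_up_to()));
                }
                // error_len() == None: the input ended in the middle of a
                // character. Emit the valid prefix and keep the rest buffered.
                let valid = e.valid_up_to();
                let out = str::from_utf8(&self.pending[..valid])
                    .expect("prefix was reported as valid")
                    .to_owned();
                self.pending.drain(..valid);
                Ok(out)
            }
        }
    }
}

fn main() {
    let mut dec = Utf8Chunker::default();
    // "é" is encoded as 0xC3 0xA9; split it across two chunks.
    let first = dec.feed(b"<p>caf\xC3").unwrap();
    let second = dec.feed(b"\xA9</p>").unwrap();
    println!("{:?} then {:?}", first, second);
}
```

A streaming parser has to make the same distinction: an incomplete sequence at the end of a chunk only means "need more data", whereas an invalid sequence is a hard error.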

Examples

Parse data from byte slices

To parse an XML document from a byte slice (or a series of byte slices), you can use the Parser with the Parse trait directly:

use rxml::{Parse, Parser};

let mut doc = &b"<?xml version='1.0'?><hello>World!</hello>"[..];
let mut fp = Parser::new();
// parse() advances `doc` past the bytes it consumed, so the loop ends
// once the whole slice has been processed
while !doc.is_empty() {
	// true = doc contains the entire document (no more data will follow)
	let ev = fp.parse(&mut doc, true);
	println!("got event: {:?}", ev);
}
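
The loop above relies on the parser receiving the input as &mut &[u8] and moving the slice forward past whatever it consumed. The sketch below illustrates that calling convention using only the standard library; take_until_gt is a hypothetical stand-in for a single parse step and is not rxml's tokenizer.

```rust
/// Hypothetical stand-in for one push-parser step (not part of rxml):
/// consumes bytes up to and including the next b'>' and advances `input`
/// past them. Returns None, consuming nothing, if no complete token is
/// buffered yet.
fn take_until_gt<'a>(input: &mut &'a [u8]) -> Option<&'a [u8]> {
    let data: &'a [u8] = *input;
    let end = data.iter().position(|&b| b == b'>')? + 1;
    let (token, rest) = data.split_at(end);
    // Advance the caller's slice to the unconsumed remainder.
    *input = rest;
    Some(token)
}

fn main() {
    let mut doc = &b"<hello><world>tail"[..];
    while let Some(tok) = take_until_gt(&mut doc) {
        println!("token: {}", String::from_utf8_lossy(tok));
    }
    // Unconsumed bytes stay in `doc`; a caller could append more data and retry.
    println!("left over: {} bytes", doc.len());
}
```

Because the callee advances the slice, the caller never needs to track an offset, and any unconsumed tail is still available when more data arrives.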

Parse data from a standard library reader

To parse an XML document from a type implementing std::io::BufRead, you can use the Reader.

use std::io::BufReader;
// in-memory stand-in for a file; a real file could be opened like this:
// let file = std::fs::File::open(..).unwrap();
let file = &mut &b"<?xml version='1.0'?><hello>World!</hello>"[..];
let reader = BufReader::new(file);
let mut reader = rxml::Reader::new(reader);
let result = rxml::as_eof_flag(reader.read_all(|ev| {
	println!("got event: {:?}", ev);
}));
assert_eq!(result.unwrap(), true);  // true indicates eof

Parse data using tokio

To parse an XML document from a type implementing tokio::io::AsyncBufRead, you can use the AsyncReader.

This requires the tokio feature.

use rxml::{AsyncReader, Event, XmlVersion};
// tokio_test::block_on drives the future to completion for this example
tokio_test::block_on(async {
	// in-memory stand-in for a socket or other async source
	// let sock = ..;
	let sock = &mut &b"<?xml version='1.0'?><hello>World!</hello>"[..];
	// wrap the source in a buffered reader, which implements tokio::io::AsyncBufRead
	let reader = tokio::io::BufReader::new(sock);
	let mut reader = AsyncReader::new(reader);
	// we expect the first event to be the XML declaration
	let ev = reader.read().await;
	assert!(matches!(ev.unwrap().unwrap(), Event::XmlDeclaration(_, XmlVersion::V1_0)));
});

Feature flags

  • macros: Enable macros that convert &str to &NameStr, &NcNameStr, or &CDataStr.
  • compact_str (default): Enable the use of compact_str for some string types to avoid allocations and conserve heap memory.
  • tokio (default): Enable AsyncReader and related types.
  • stream: Add a futures::Stream implementation to AsyncReader. Implies tokio.
  • shared_ns: Allow deduplication of namespace URIs within and across parsers.
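
As a sketch of how these flags might be selected in a manifest: the fragment below assumes the 0.11 release line listed above and opts out of the default features (compact_str and tokio), keeping only macros. Adjust the feature list to whatever your project needs.

```toml
[dependencies]
# default-features = false drops compact_str and tokio;
# "macros" is then the only optional feature enabled
rxml = { version = "0.11", default-features = false, features = ["macros"] }
```

Enabling stream instead would pull in tokio again, since the list above states that stream implies tokio.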

Dependencies

~2.1–3.5MB
~54K SLoC