5 releases
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.1.0 | Jan 4, 2026 |
| 0.1.0-alpha.4 | Jan 3, 2026 |
| 0.1.0-alpha.3 | Dec 25, 2025 |
| 0.1.0-alpha.2 | Dec 5, 2025 |
#614 in Text processing
Used in 2 crates
48KB, 1K SLoC
markex
Fast, non-validating markup element extractor for tag elements (XML-like); support for Markdown elements is planned for later.
- Fast Extraction: Optimized for finding defined element structures without full document parsing.
- Owned & Borrowed: Provides both owned (Parts) and zero-copy reference (PartsRef) extraction.
- Iterators: Streaming iteration via TagIter and TagRefIter.
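To make "extraction without full document parsing" concrete, here is a minimal std-only sketch of the general idea. It is not markex's implementation: `find_first_elem` is a hypothetical helper that locates one named element by plain substring search and ignores everything else in the input, so malformed markup elsewhere does not matter. Nesting, self-closing tags, and quoted `>` characters in attributes are deliberately not handled.

```rust
/// Hypothetical helper (not part of markex): find the first `<NAME ...>...</NAME>`
/// element by substring search, without validating the rest of the input.
/// Returns (raw attribute text, content), both as slices of `input`.
fn find_first_elem<'a>(input: &'a str, name: &str) -> Option<(&'a str, &'a str)> {
    let open = format!("<{name}");
    let close = format!("</{name}>");
    let mut from = 0;
    while let Some(rel) = input[from..].find(&open) {
        let after_name = from + rel + open.len();
        // The tag name must end here, so `<DATA` does not match `<DATABASE`.
        match input[after_name..].chars().next() {
            Some('>') | Some(' ') => {}
            _ => {
                from = after_name;
                continue;
            }
        }
        // End of the opening tag, then the first matching closing tag.
        let gt = after_name + input[after_name..].find('>')?;
        let content_start = gt + 1;
        let end = content_start + input[content_start..].find(&close)?;
        return Some((input[after_name..gt].trim(), &input[content_start..end]));
    }
    None
}
```

Called as `find_first_elem(input, "DATA")`, the sketch only ever looks at the bytes around the requested tag, which is why a non-validating extractor can stay cheap on large inputs.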
Quick Start
use markex::tag::{self, Part};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let input = "Text before <DATA id=123>some content</DATA> and after.";

    // 1. Owned extraction
    let parts = tag::extract(input, &["DATA"], true);
    for part in parts {
        match part {
            Part::Text(t) => println!("Text: {t:?}"),
            Part::TagElem(e) => println!("Tag: {} | Content: {}", e.tag, e.content),
        }
    }

    // 2. Extrude content (concatenate text, keep elements).
    // The loop above consumed `parts`, so extract again before extruding.
    let parts = tag::extract(input, &["DATA"], true);
    let (elems, text) = parts.into_with_extrude_content();

    Ok(())
}
Zero-copy References
For high-performance scenarios, use extract_refs to get a PartsRef, whose parts hold slices of the original input instead of owned copies.
use markex::tag;

fn main() {
    let input = "<FILE path='a.txt'>content</FILE>";
    let parts_ref = tag::extract_refs(input, &["FILE"], true);
    for part in parts_ref {
        // Zero-copy slices: each part borrows from `input`
    }
}
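As an illustration of what "zero-copy" means here, the following std-only sketch contrasts a borrowed part with an owned one. All names (`BorrowedPart`, `OwnedPart`, `split_on_first`, `is_slice_of`) are hypothetical stand-ins, not markex types, and the splitter only handles a single attribute-free `<tag>...</tag>` for brevity.

```rust
/// Hypothetical borrowed part: every field is a slice of the original input.
#[derive(Debug, PartialEq)]
enum BorrowedPart<'a> {
    Text(&'a str),
    Elem { tag: &'a str, content: &'a str },
}

/// Hypothetical owned counterpart: every field is a separate heap allocation.
#[derive(Debug, PartialEq)]
enum OwnedPart {
    Text(String),
    Elem { tag: String, content: String },
}

impl BorrowedPart<'_> {
    /// Copy the borrowed slices into owned strings.
    fn to_owned_part(&self) -> OwnedPart {
        match self {
            BorrowedPart::Text(t) => OwnedPart::Text(t.to_string()),
            BorrowedPart::Elem { tag, content } => OwnedPart::Elem {
                tag: tag.to_string(),
                content: content.to_string(),
            },
        }
    }
}

/// Split `input` around the first `<tag>...</tag>` without copying any text.
fn split_on_first<'a>(input: &'a str, tag: &str) -> Vec<BorrowedPart<'a>> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let Some(start) = input.find(&open) else {
        return vec![BorrowedPart::Text(input)];
    };
    let content_start = start + open.len();
    let Some(len) = input[content_start..].find(&close) else {
        return vec![BorrowedPart::Text(input)];
    };
    let content_end = content_start + len;
    let after = content_end + close.len();

    let mut parts = Vec::new();
    if start > 0 {
        parts.push(BorrowedPart::Text(&input[..start]));
    }
    parts.push(BorrowedPart::Elem {
        tag: &input[start + 1..start + 1 + tag.len()],
        content: &input[content_start..content_end],
    });
    if after < input.len() {
        parts.push(BorrowedPart::Text(&input[after..]));
    }
    parts
}

/// True when `slice` points into the memory of `input`, i.e. nothing was copied.
fn is_slice_of(input: &str, slice: &str) -> bool {
    let base = input.as_ptr() as usize;
    let p = slice.as_ptr() as usize;
    p >= base && p + slice.len() <= base + input.len()
}
```

The trade-off is the usual one: borrowed parts cost no allocation but cannot outlive the input buffer, while owned parts can be stored or sent elsewhere at the price of a copy.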
API Highlights
- tag::extract(...) -> Parts: Returns owned data.
- tag::extract_refs(...) -> PartsRef: Returns references (zero-copy).
- Parts / PartsRef: Collection-like structures with tag_elems(), texts(), and iteration support.
- TagIter / TagRefIter: Lower-level iterators for streaming processing.
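The streaming style that the lower-level iterators refer to can be sketched with the standard `Iterator` trait. This is an assumption-laden illustration, not the actual TagIter or TagRefIter: `ContentIter` is a hypothetical type that yields the content of each attribute-free `<tag>...</tag>` element lazily, advancing a cursor through the input instead of collecting everything up front.

```rust
/// Hypothetical streaming iterator (not markex's TagRefIter): yields the content
/// of each `<tag>...</tag>` element on demand, borrowing from the input.
struct ContentIter<'a> {
    rest: &'a str,
    open: String,
    close: String,
}

impl<'a> ContentIter<'a> {
    fn new(input: &'a str, tag: &str) -> Self {
        Self {
            rest: input,
            open: format!("<{tag}>"),
            close: format!("</{tag}>"),
        }
    }
}

impl<'a> Iterator for ContentIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        // Copy the shared slice so the returned content keeps lifetime 'a.
        let rest = self.rest;
        let start = rest.find(&self.open)? + self.open.len();
        let len = rest[start..].find(&self.close)?;
        // Advance the cursor past the closing tag for the next call.
        self.rest = &rest[start + len + self.close.len()..];
        Some(&rest[start..start + len])
    }
}
```

Because nothing is scanned until `next` is called, a caller can stop early, for example with `ContentIter::new(input, "X").take(1)`, and never pay for the remainder of the input.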
Dependencies
~0.8–1.5MB
~28K SLoC