#parser-generator #transformer #parser #conversion #text-processing #data-conversion

shiva

Shiva library: Implementation in Rust of a parser and generator for documents of any type

2 releases

new 0.3.1 Apr 27, 2024
0.3.0 Apr 25, 2024
0.2.3 Apr 18, 2024
0.1.15 Apr 6, 2024
0.0.1 Mar 19, 2024

#473 in Parser implementations

Download history (downloads per week): 2024-03-16: 119, 2024-03-23: 327, 2024-03-30: 254, 2024-04-06: 480, 2024-04-13: 539, 2024-04-20: 103

1,400 downloads per month
Used in metatron

Custom license

7.5MB
2.5K SLoC

Shiva

Features

  • Common Document Model (CDM) for all document types
  • Parsers produce CDM
  • Generators consume CDM
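
The parser/generator split around a shared model can be sketched as follows. This is a minimal illustration only: `Element`, `Document`, `parse_markdown`, `generate_text` and `generate_html` are hypothetical names invented for this sketch and are not Shiva's actual types or API. It handles just headers and paragraphs and does no HTML escaping.

```rust
// Hypothetical, simplified model; these are NOT Shiva's real types.

/// A tiny stand-in for a common document model: a flat list of elements.
#[derive(Debug, Clone, PartialEq)]
pub enum Element {
    Header { level: u8, text: String },
    Paragraph(String),
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct Document {
    pub elements: Vec<Element>,
}

/// Parser side: turn a very small Markdown subset into the shared model.
pub fn parse_markdown(input: &str) -> Document {
    let mut elements = Vec::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // Count leading '#' characters; '#' is ASCII, so the count is also a byte offset.
        let level = line.chars().take_while(|c| *c == '#').count();
        if (1..=6).contains(&level) && line[level..].starts_with(' ') {
            elements.push(Element::Header {
                level: level as u8,
                text: line[level..].trim().to_string(),
            });
        } else {
            elements.push(Element::Paragraph(line.to_string()));
        }
    }
    Document { elements }
}

/// Generator side: render the shared model as plain text, one element per line.
pub fn generate_text(doc: &Document) -> String {
    let mut out = String::new();
    for el in &doc.elements {
        match el {
            Element::Header { text, .. } => out.push_str(text),
            Element::Paragraph(text) => out.push_str(text),
        }
        out.push('\n');
    }
    out
}

/// Generator side: render the same model as HTML (no escaping in this sketch).
pub fn generate_html(doc: &Document) -> String {
    let mut out = String::new();
    for el in &doc.elements {
        match el {
            Element::Header { level, text } => {
                out.push_str(&format!("<h{level}>{text}</h{level}>\n"))
            }
            Element::Paragraph(text) => out.push_str(&format!("<p>{text}</p>\n")),
        }
    }
    out
}
```

The point of the design is that adding a new input format needs only one parser, and adding a new output format needs only one generator; every parser/generator pair then works together through the shared model.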

Supported document types

Document type   Parse   Generate
Plain text      +       +
Markdown        +       +
HTML            +       +
PDF             +       +
JSON            +       +
XML             -       -
DOC             -       -
XLS             -       -

Parse document features

Document type   Header  Paragraph  List  Table  Image  Hyperlink  PageHeader  PageFooter
Plain text      -       +          -     -      -      -          -           -
Markdown        +       +          +     +      +      +          -           -
HTML            +       +          +     +      +      +          -           -
PDF             -       +          +     -      -      -          -           -
JSON            +       +          +     +      -      +          +           +

Generate document features

Document type   Header  Paragraph  List  Table  Image  Hyperlink  PageHeader  PageFooter
Plain text      +       +          +     +      -      +          +           +
Markdown        +       +          +     +      +      +          +           +
HTML            +       +          +     +      +      +          -           -
PDF             +       +          +     +      -      +          +           +
JSON            +       +          +     +      -      +          +           +
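
Read together, the two feature tables imply that an element survives a conversion only if the source format can parse it and the target format can generate it. The sketch below encodes both tables as data and computes that intersection. The function names and the lowercase format keys (`"text"`, `"markdown"`, `"html"`, `"pdf"`, `"json"`) are assumptions made for this illustration and are not part of Shiva's API.

```rust
/// Element kinds, in the column order used by the two feature tables.
pub const ELEMENTS: [&str; 8] = [
    "Header", "Paragraph", "List", "Table",
    "Image", "Hyperlink", "PageHeader", "PageFooter",
];

const Y: bool = true;
const N: bool = false;

/// Parse support per format, copied from the "Parse document features" table.
pub fn parse_support(format: &str) -> Option<[bool; 8]> {
    match format {
        "text" => Some([N, Y, N, N, N, N, N, N]),
        "markdown" => Some([Y, Y, Y, Y, Y, Y, N, N]),
        "html" => Some([Y, Y, Y, Y, Y, Y, N, N]),
        "pdf" => Some([N, Y, Y, N, N, N, N, N]),
        "json" => Some([Y, Y, Y, Y, N, Y, Y, Y]),
        _ => None, // unknown or unsupported format
    }
}

/// Generate support per format, copied from the "Generate document features" table.
pub fn generate_support(format: &str) -> Option<[bool; 8]> {
    match format {
        "text" => Some([Y, Y, Y, Y, N, Y, Y, Y]),
        "markdown" => Some([Y, Y, Y, Y, Y, Y, Y, Y]),
        "html" => Some([Y, Y, Y, Y, Y, Y, N, N]),
        "pdf" => Some([Y, Y, Y, Y, N, Y, Y, Y]),
        "json" => Some([Y, Y, Y, Y, N, Y, Y, Y]),
        _ => None,
    }
}

/// Elements that are both parsed from `from` and generated into `to`.
/// Returns `None` if either format is not listed in the tables.
pub fn surviving_elements(from: &str, to: &str) -> Option<Vec<&'static str>> {
    let parsed = parse_support(from)?;
    let generated = generate_support(to)?;
    Some(
        ELEMENTS
            .iter()
            .enumerate()
            .filter(|(i, _)| parsed[*i] && generated[*i])
            .map(|(_, name)| *name)
            .collect(),
    )
}
```

For example, `surviving_elements("markdown", "pdf")` yields headers, paragraphs, lists, tables and hyperlinks; images are dropped because the PDF generator column for Image is `-`.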

Using the Shiva library

Cargo.toml

[dependencies]
shiva = { version = "0.3.1", features = ["html", "markdown", "text", "pdf", "json"] }
bytes = "1" # needed because main.rs uses bytes::Bytes

main.rs

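use std::collections::HashMap;
// Assumption: `parse` and `generate` are trait methods, so the trait must be in scope.
use shiva::core::TransformerTrait;

fn main() {
    // Read the source document into memory as bytes.
    let input_vec = std::fs::read("input.html").unwrap();
    let input_bytes = bytes::Bytes::from(input_vec);
    // Parse HTML into the Common Document Model; the map argument is passed empty here.
    let document = shiva::html::Transformer::parse(&input_bytes, &HashMap::new()).unwrap();
    // Generate Markdown from the CDM and write it to disk.
    let output_bytes = shiva::markdown::Transformer::generate(&document, &HashMap::new()).unwrap();
    std::fs::write("out.md", output_bytes).unwrap();
}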

Shiva CLI

Install Rust on Linux/macOS

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Install Rust on Windows

https://static.rust-lang.org/rustup/dist/x86_64-pc-windows-msvc/rustup-init.exe

Build the Shiva executable

git clone https://github.com/igumnoff/shiva.git
cd shiva/cli
cargo build --release

Run the shiva executable

cd ./target/release/
./shiva --input-format=markdown --output-format=html --input-file=README.md --output-file=README.html
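
When the CLI is driven from another Rust program, the same four flags can be assembled programmatically. The sketch below uses only the flags shown in the command above; `shiva_args` and `shiva_command` are hypothetical helper names introduced for this illustration, and the path to the binary is whatever the build step produced on your machine.

```rust
use std::process::Command;

/// Build the argument list for the shiva CLI from the four documented flags.
pub fn shiva_args(
    input_format: &str,
    output_format: &str,
    input_file: &str,
    output_file: &str,
) -> Vec<String> {
    vec![
        format!("--input-format={input_format}"),
        format!("--output-format={output_format}"),
        format!("--input-file={input_file}"),
        format!("--output-file={output_file}"),
    ]
}

/// Prepare (but do not run) a `Command` for the shiva binary at `binary`.
/// The caller decides when to call `.status()` or `.output()` on it.
pub fn shiva_command(
    binary: &str,
    input_format: &str,
    output_format: &str,
    input_file: &str,
    output_file: &str,
) -> Command {
    let mut cmd = Command::new(binary);
    cmd.args(shiva_args(input_format, output_format, input_file, output_file));
    cmd
}
```

Calling `shiva_command("./shiva", "markdown", "html", "README.md", "README.html").status()` would then run the same conversion as the command line above, assuming the binary exists at that path.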

Who uses Shiva

Dependencies

~0.7–12MB
~84K SLoC