shiva

Shiva library: Implementation in Rust of a parser and generator for documents of any type

5 releases

0.3.4 May 7, 2024
0.3.3 May 4, 2024
0.3.2 Apr 30, 2024
0.2.3 Apr 18, 2024
0.0.1 Mar 19, 2024

#287 in Parser implementations

1,103 downloads per month
Used in metatron

Custom license

120KB
2.5K SLoC

Features

  • Common Document Model (CDM) for all document types
  • Parsers produce CDM
  • Generators consume CDM
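The parser/generator split around a shared model can be sketched as follows. This is a minimal illustration only: the `Element`, `Document`, `Format`, `Markdown`, `PlainText`, and `convert` names are hypothetical and far simpler than Shiva's real types, and the toy Markdown handling covers only headers and paragraphs.

```rust
/// Hypothetical, heavily simplified stand-in for a common document model.
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Header { level: u8, text: String },
    Paragraph(String),
}

#[derive(Debug, Clone, PartialEq, Default)]
struct Document {
    elements: Vec<Element>,
}

/// Each format only knows how to read into and write out of the shared model.
trait Format {
    fn parse(input: &str) -> Document;
    fn generate(doc: &Document) -> String;
}

struct Markdown;
struct PlainText;

impl Format for Markdown {
    fn parse(input: &str) -> Document {
        let mut elements = Vec::new();
        for line in input.lines().map(str::trim).filter(|l| !l.is_empty()) {
            // Count leading '#' characters; "# Title" is a level-1 header.
            let level = line.chars().take_while(|c| *c == '#').count();
            if (1..=6).contains(&level) && line[level..].starts_with(' ') {
                let text = line[level..].trim().to_string();
                elements.push(Element::Header { level: level as u8, text });
            } else {
                elements.push(Element::Paragraph(line.to_string()));
            }
        }
        Document { elements }
    }

    fn generate(doc: &Document) -> String {
        let blocks: Vec<String> = doc
            .elements
            .iter()
            .map(|e| match e {
                Element::Header { level, text } => {
                    format!("{} {}", "#".repeat(*level as usize), text)
                }
                Element::Paragraph(text) => text.clone(),
            })
            .collect();
        blocks.join("\n\n")
    }
}

impl Format for PlainText {
    fn parse(input: &str) -> Document {
        // Plain text has no structure here: every non-empty line is a paragraph.
        let elements = input
            .lines()
            .map(str::trim)
            .filter(|l| !l.is_empty())
            .map(|l| Element::Paragraph(l.to_string()))
            .collect();
        Document { elements }
    }

    fn generate(doc: &Document) -> String {
        let lines: Vec<String> = doc
            .elements
            .iter()
            .map(|e| match e {
                Element::Header { text, .. } => text.clone(),
                Element::Paragraph(text) => text.clone(),
            })
            .collect();
        lines.join("\n")
    }
}

/// Any source format can be paired with any target format via the model.
fn convert<Src: Format, Dst: Format>(input: &str) -> String {
    Dst::generate(&Src::parse(input))
}

fn main() {
    println!("{}", convert::<Markdown, PlainText>("# Title\n\nHello"));
    println!("{}", convert::<PlainText, Markdown>("Title\nHello"));
}
```

With this shape, adding a format means writing one parser and one generator, and the new format then converts to and from every existing one through `convert`.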

Supported document types

Document type   Parse   Generate
Plain text      +       +
Markdown        +       +
HTML            +       +
PDF             +       +
JSON            +       +
XML             +       +
CSV             +       +
RTF             -       -
DOCX            -       -
XLS             -       -
Typst           -       -

Parse document features

Document type   Header  Paragraph  List  Table  Image  Hyperlink  PageHeader  PageFooter
Plain text      -       +          -     -      -      -          -           -
Markdown        +       +          +     +      +      +          -           -
HTML            +       +          +     +      +      +          -           -
PDF             -       +          +     -      -      -          -           -
JSON            +       +          +     +      -      +          +           +
XML             +       +          -     -      -      +          +           +
CSV             -       -          -     +      -      -          -           -

Generate document features

Document type   Header  Paragraph  List  Table  Image  Hyperlink  PageHeader  PageFooter
Plain text      +       +          +     +      -      +          +           +
Markdown        +       +          +     +      +      +          +           +
HTML            +       +          +     +      +      +          -           -
PDF             +       +          +     +      +      +          +           +
JSON            +       +          +     +      -      +          +           +
XML             +       +          -     -      -      +          +           +
CSV             -       -          -     +      -      -          -           -
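
Read together, the two feature tables suggest which elements can survive a conversion: presumably only those that the source parser extracts and the target generator emits. The sketch below encodes both tables and intersects them. The function names and the short format keys (`"text"`, `"html"`, and so on) are hypothetical helpers for this illustration and are not part of Shiva's API.

```rust
/// Element kinds, in the column order of the feature tables.
const ELEMENTS: [&str; 8] = [
    "Header", "Paragraph", "List", "Table",
    "Image", "Hyperlink", "PageHeader", "PageFooter",
];

/// Turns a table row such as "++++++--" into one flag per element.
fn row(marks: &str) -> [bool; 8] {
    let mut flags = [false; 8];
    for (i, c) in marks.chars().take(8).enumerate() {
        flags[i] = c == '+';
    }
    flags
}

/// Rows copied from the "Parse document features" table.
fn parse_support(format: &str) -> Option<[bool; 8]> {
    let marks = match format {
        "text" => "-+------",
        "markdown" => "++++++--",
        "html" => "++++++--",
        "pdf" => "-++-----",
        "json" => "++++-+++",
        "xml" => "++---+++",
        "csv" => "---+----",
        _ => return None,
    };
    Some(row(marks))
}

/// Rows copied from the "Generate document features" table.
fn generate_support(format: &str) -> Option<[bool; 8]> {
    let marks = match format {
        "text" => "++++-+++",
        "markdown" => "++++++++",
        "html" => "++++++--",
        "pdf" => "++++++++",
        "json" => "++++-+++",
        "xml" => "++---+++",
        "csv" => "---+----",
        _ => return None,
    };
    Some(row(marks))
}

/// Elements supported by both the source parser and the target generator.
/// Returns None when either format is not listed in the tables.
fn surviving_elements(from: &str, to: &str) -> Option<Vec<&'static str>> {
    let parsed = parse_support(from)?;
    let generated = generate_support(to)?;
    let mut kept = Vec::new();
    for i in 0..ELEMENTS.len() {
        if parsed[i] && generated[i] {
            kept.push(ELEMENTS[i]);
        }
    }
    Some(kept)
}

fn main() {
    println!("{:?}", surviving_elements("html", "csv"));
    println!("{:?}", surviving_elements("markdown", "json"));
}
```

For example, under this reading an HTML to CSV conversion keeps only tables, and a Markdown to JSON conversion drops images.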

Using the Shiva library

Cargo.toml

[dependencies]
shiva = { version = "0.3.4", features = ["html", "markdown", "text", "pdf", "json", "csv"] }
bytes = "1" # needed because main.rs uses bytes::Bytes directly

main.rs

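use std::collections::HashMap;
// Assumed import path for the trait that provides `parse` and `generate`.
use shiva::core::TransformerTrait;

fn main() {
    // Read the source HTML into memory (uses the `bytes` crate).
    let input_vec = std::fs::read("input.html").unwrap();
    let input_bytes = bytes::Bytes::from(input_vec);
    // Parse the HTML into the Common Document Model.
    let document = shiva::html::Transformer::parse(&input_bytes, &HashMap::new()).unwrap();
    // Generate Markdown from the CDM and write it to disk.
    let output_bytes = shiva::markdown::Transformer::generate(&document, &HashMap::new()).unwrap();
    std::fs::write("out.md", output_bytes).unwrap();
}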

Shiva CLI

Install Rust on Linux/macOS

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Install Rust on Windows

https://static.rust-lang.org/rustup/dist/x86_64-pc-windows-msvc/rustup-init.exe

Build the Shiva executable

git clone https://github.com/igumnoff/shiva.git
cd shiva/cli
cargo build --release

Run the Shiva executable

cd ./target/release/
./shiva --input-format=markdown --output-format=html --input-file=README.md --output-file=README.html

Contributing

I would love to see contributions from the community. If you run into a bug, feel free to open an issue. If you would like to implement a new feature or a bug fix, please follow these steps:

  1. Contact me via Telegram @ievkz or Discord @igumnovnsk
  2. Confirm the e-mail invitation to the repository
  3. Clone the repository with "git clone"
  4. Create a branch for your assigned issue
  5. Create a pull request to the main branch
