19 releases (stable)
Version | Released |
---|---|
1.4.9 (new) | Nov 7, 2024 |
1.4.5 | Sep 25, 2024 |
1.2.0 | Jul 29, 2024 |
0.5.0 | Jun 6, 2024 |
0.1.10 | |
#324 in Parser implementations
453 downloads per month
Used in metatron
285KB, 6K SLoC
Shiva
Shiva library: Implementation in Rust of a parser and generator for documents of any type
Features
- Common Document Model (CDM) for all document types
- Parsers produce CDM
- Generators consume CDM
Common Document Model
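The idea behind the features above can be illustrated with a toy model. This is only a sketch: `Element`, `Document`, `parse_markdown` and `generate_html` are hypothetical names invented for the example and are not Shiva's actual types or functions. It shows a parser that produces a shared in-memory model and a generator that consumes it, so each format only has to convert to or from the model.

```rust
// Toy stand-in for a common document model. These types are illustrative
// only and are NOT Shiva's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Header { level: u8, text: String },
    Paragraph { text: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Document {
    elements: Vec<Element>,
}

// "Parser" for a tiny Markdown subset: lines starting with 1 to 6 '#'
// characters followed by a space become headers, other non-empty lines
// become paragraphs.
fn parse_markdown(input: &str) -> Document {
    let mut elements = Vec::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        let hashes = line.chars().take_while(|c| *c == '#').count();
        if hashes > 0 && hashes <= 6 && line[hashes..].starts_with(' ') {
            elements.push(Element::Header {
                level: hashes as u8,
                text: line[hashes..].trim().to_string(),
            });
        } else {
            elements.push(Element::Paragraph { text: line.to_string() });
        }
    }
    Document { elements }
}

// "Generator" that turns the model into HTML (no escaping, for brevity).
fn generate_html(doc: &Document) -> String {
    let mut out = String::new();
    for element in &doc.elements {
        match element {
            Element::Header { level, text } => {
                out.push_str(&format!("<h{level}>{text}</h{level}>\n"));
            }
            Element::Paragraph { text } => {
                out.push_str(&format!("<p>{text}</p>\n"));
            }
        }
    }
    out
}

fn main() {
    let doc = parse_markdown("# Title\n\nHello, world.");
    print!("{}", generate_html(&doc));
}
```

With a shared model like this, adding a new format needs one parser and one generator rather than a converter for every pair of formats.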
Supported document types
Document type | Parse | Generate |
---|---|---|
Plain text | + | + |
Markdown | + | + |
HTML | + | + |
PDF | + | + |
JSON | + | + |
XML | + | + |
CSV | + | + |
RTF | + | + |
DOCX | + | + |
XLS | + | - |
XLSX | + | + |
ODS | + | + |
Typst | - | + |
Parse document features
Document type | Header | Paragraph | List | Table | Image | Hyperlink | PageHeader | PageFooter |
---|---|---|---|---|---|---|---|---|
Plain text | - | + | - | - | - | - | - | - |
Markdown | + | + | + | + | + | + | - | - |
HTML | + | + | + | + | + | + | - | - |
PDF | - | + | + | - | - | - | - | - |
DOCX | + | + | + | + | - | + | - | - |
RTF | + | + | + | + | - | + | + | + |
JSON | + | + | + | + | - | + | + | + |
XML | + | + | + | + | + | + | + | + |
CSV | - | - | - | + | - | - | - | - |
XLS | - | - | - | + | - | - | - | - |
XLSX | - | - | - | + | - | - | - | - |
ODS | - | - | - | + | - | - | - | - |
Generate document features
Document type | Header | Paragraph | List | Table | Image | Hyperlink | PageHeader | PageFooter |
---|---|---|---|---|---|---|---|---|
Plain text | + | + | + | + | - | + | + | + |
Markdown | + | + | + | + | + | + | + | + |
HTML | + | + | + | + | + | + | - | - |
PDF | + | + | + | + | + | + | + | + |
DOCX | + | + | + | + | + | + | - | - |
RTF | + | + | + | + | + | + | - | - |
JSON | + | + | + | + | - | + | + | + |
XML | + | + | + | + | + | + | + | + |
CSV | - | - | - | + | - | - | - | - |
XLSX | - | - | - | + | - | - | - | - |
ODS | - | - | - | + | - | - | - | - |
Typst | + | + | + | + | + | + | + | + |
Using the Shiva library
Cargo.toml
[dependencies]
shiva = { version = "1.4.9", features = ["html", "markdown", "text", "pdf", "json",
    "csv", "rtf", "docx", "xml", "xls", "xlsx", "ods", "typst"] }
# main.rs uses bytes::Bytes directly, so the bytes crate must be a direct dependency
bytes = "1"
main.rs
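// Brings the trait's associated functions `parse` and `generate` into scope.
use shiva::core::TransformerTrait;

fn main() {
    // Read the source HTML and wrap it in `Bytes`, the input type of `parse`.
    let input_vec = std::fs::read("input.html").unwrap();
    let input_bytes = bytes::Bytes::from(input_vec);
    // Parse HTML into the Common Document Model, then generate Markdown from it.
    let document = shiva::html::Transformer::parse(&input_bytes).unwrap();
    let output_bytes = shiva::markdown::Transformer::generate(&document).unwrap();
    std::fs::write("out.md", output_bytes).unwrap();
}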
Shiva CLI & Server
Build the Shiva CLI and Shiva Server executables
git clone https://github.com/igumnoff/shiva.git
cd shiva/cli
cargo build --release
Run the Shiva CLI executable
cd ./target/release/
./shiva README.md README.html
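The command above names only an input and an output file, so the formats are presumably chosen from the file extensions. The following sketch shows one way such a lookup could work. It is an assumption for illustration, not the CLI's actual code: `format_from_path` is a hypothetical helper, and the returned names are simply the feature names from the Cargo.toml example.

```rust
use std::path::Path;

// Hypothetical helper: maps a file extension to one of the feature names
// listed in the Cargo.toml example. Not taken from the Shiva source.
fn format_from_path(path: &str) -> Option<&'static str> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "txt" => Some("text"),
        "md" => Some("markdown"),
        "html" | "htm" => Some("html"),
        "pdf" => Some("pdf"),
        "json" => Some("json"),
        "xml" => Some("xml"),
        "csv" => Some("csv"),
        "rtf" => Some("rtf"),
        "docx" => Some("docx"),
        "xls" => Some("xls"),
        "xlsx" => Some("xlsx"),
        "ods" => Some("ods"),
        "typ" => Some("typst"),
        _ => None,
    }
}

fn main() {
    for path in ["README.md", "README.html", "notes"] {
        match format_from_path(path) {
            Some(format) => println!("{path}: {format}"),
            None => println!("{path}: unknown format"),
        }
    }
}
```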
Run Shiva Server
cd ./target/release/
./shiva-server --port=8080 --host=127.0.0.1
Who uses Shiva
Contributing
I would love to see contributions from the community. If you experience bugs, feel free to open an issue. If you would like to implement a new feature or bug fix, please follow these steps:
- Fork the repository
- Comment on the issue to say that you are going to work on it
- Create a pull request
If you would like to add a new document type, you need to implement the following traits:
Required: shiva::core::TransformerTrait
pub trait TransformerTrait {
    fn parse(document: &Bytes) -> anyhow::Result<Document>;
    fn generate(document: &Document) -> anyhow::Result<Bytes>;
}
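As a rough illustration of what implementing a trait of this shape involves, here is a dependency-free sketch. Everything in it is a stand-in: `Bytes` is aliased to `Vec<u8>` instead of `bytes::Bytes`, `Result` uses `Box<dyn Error>` instead of `anyhow`, `Document` is a toy struct rather than Shiva's type, and `LineTransformer` is a hypothetical format invented for the example. A real implementation would use the library's own types.

```rust
use std::error::Error;

// Stand-ins so the sketch compiles without dependencies. In Shiva these would
// be bytes::Bytes, anyhow::Result and the library's own Document type.
type Bytes = Vec<u8>;
type Result<T> = std::result::Result<T, Box<dyn Error>>;

#[derive(Debug, PartialEq)]
struct Document {
    paragraphs: Vec<String>,
}

// Same shape as the trait shown above, redeclared locally for the sketch.
trait TransformerTrait {
    fn parse(document: &Bytes) -> Result<Document>;
    fn generate(document: &Document) -> Result<Bytes>;
}

// Hypothetical format: one paragraph per non-empty line of UTF-8 text.
struct LineTransformer;

impl TransformerTrait for LineTransformer {
    fn parse(document: &Bytes) -> Result<Document> {
        // Invalid UTF-8 is reported as an error instead of panicking.
        let text = std::str::from_utf8(document)?;
        let paragraphs = text
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect();
        Ok(Document { paragraphs })
    }

    fn generate(document: &Document) -> Result<Bytes> {
        Ok(document.paragraphs.join("\n").into_bytes())
    }
}

fn main() -> Result<()> {
    let input: Bytes = b"first line\n\nsecond line\n".to_vec();
    let document = LineTransformer::parse(&input)?;
    let output = LineTransformer::generate(&document)?;
    println!("{}", String::from_utf8(output)?);
    Ok(())
}
```

Because both functions are associated functions without a `self` receiver, a transformer in this style carries no state and is called as `LineTransformer::parse(...)`.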
Optional: shiva::core::TransformerWithImageLoaderSaverTrait (for formats whose images are stored outside the document, for example HTML and Markdown)
pub trait TransformerWithImageLoaderSaverTrait {
    fn parse_with_loader<F>(document: &Bytes, image_loader: F) -> anyhow::Result<Document>
    where
        F: Fn(&str) -> anyhow::Result<Bytes>;
    fn generate_with_saver<F>(document: &Document, image_saver: F) -> anyhow::Result<Bytes>
    where
        F: Fn(&Bytes, &str) -> anyhow::Result<()>;
}
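The loader parameter lets the caller decide where external images come from. The sketch below illustrates that pattern under stated assumptions: `Bytes` and `Result` are local stand-ins for `bytes::Bytes` and `anyhow::Result`, `image_paths` and `load_images` are hypothetical helpers, and the images come from an in-memory map instead of the file system. It is not Shiva's implementation.

```rust
use std::collections::HashMap;
use std::error::Error;

// Stand-ins for bytes::Bytes and anyhow::Result so the sketch has no dependencies.
type Bytes = Vec<u8>;
type Result<T> = std::result::Result<T, Box<dyn Error>>;

// Hypothetical helper: collects the targets of Markdown images `![alt](path)`.
fn image_paths(text: &str) -> Vec<&str> {
    let mut paths = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find("![") {
        rest = &rest[start + 2..];
        let Some(open) = rest.find("](") else { break };
        let after = &rest[open + 2..];
        let Some(close) = after.find(')') else { break };
        paths.push(&after[..close]);
        rest = &after[close + 1..];
    }
    paths
}

// Calls the caller-supplied loader once per referenced image. The loader has
// the same closure shape as `image_loader` in the trait above.
fn load_images<F>(document: &Bytes, image_loader: F) -> Result<Vec<(String, Bytes)>>
where
    F: Fn(&str) -> Result<Bytes>,
{
    let text = std::str::from_utf8(document)?;
    let mut images = Vec::new();
    for path in image_paths(text) {
        images.push((path.to_string(), image_loader(path)?));
    }
    Ok(images)
}

fn main() -> Result<()> {
    // In-memory "file system"; a real loader might call std::fs::read instead.
    let store: HashMap<&str, Bytes> = HashMap::from([("img/logo.png", vec![1u8, 2, 3])]);
    let loader = |path: &str| -> Result<Bytes> {
        store
            .get(path)
            .cloned()
            .ok_or_else(|| format!("image not found: {path}").into())
    };
    let document: Bytes = b"Intro ![logo](img/logo.png)".to_vec();
    for (path, bytes) in load_images(&document, &loader)? {
        println!("{path}: {} bytes", bytes.len());
    }
    Ok(())
}
```

Passing the loader as a closure keeps the parsing code free of any assumption about storage, so the same parser can read images from disk, from memory, or over the network.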
License
Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in Shiva by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Dependencies
~1–26MB, ~352K SLoC