48 releases (31 stable)

new 1.4.0 Feb 20, 2025
1.2.13 Jan 30, 2025
1.2.10 Jul 15, 2024
1.2.4 Feb 27, 2024
0.5.1 Jan 6, 2024

#554 in Parser implementations

957 downloads per month
Used in 5 crates (3 directly)

Apache-2.0

46KB
1K SLoC

mdka

HTML to Markdown (MD) converter written in Rust.

Summary

mdka is a text-manipulation tool that converts HTML to Markdown. The "ka" in its name comes from the Japanese "化 (か)", which denotes conversion.
It is designed with the following in mind:

  • High speed
  • Low memory consumption
  • Easy usage

Usage

Executable

Prebuilt executables for multiple platforms are available in the Assets of each release.

$ ./mdka <html_text>
converted-to-markdown-text will be printed

Help

$ ./mdka -h
Usage:
  -h, --help             : Help is shown.
  <html_text>            : Direct parameter is taken as HTML text to be converted. Either this or <html_filepath> is required.
  -i <html_filepath>     : Read HTML text from it. Optional.
  -o <markdown_filepath> : Write Markdown result to it. Optional.
  --overwrites           : Overwrite if Markdown file exists. Optional.

Examples:
  ./mdka "<p>Hello, world.</p>"
  ./mdka -i input.html
  ./mdka -o output.md "<p>Hello, world.</p>"
  ./mdka -i input.html -o output.md --overwrites

Development with Rust and cargo

Cargo.toml

[dependencies]
mdka = "1"

awesome.rs

use mdka::from_html;

fn awesome_fn() {
    let input = r#"
<h1>heading 1</h1>
<p>Hello, world.</p>"#;
    let ret = from_html(input);
    println!("{}", ret);
    // # heading 1
    // 
    // Hello, world.
    // 
}

Python integration

Python bindings are provided, so Python scripts can import this Rust library and call its functions.

Install:

$ pip install mdka

awesome.py

Convert from HTML text

from mdka import md_from_html

print(md_from_html("<p>Hello, world.</p>"))
# Hello, world.
# 

Convert from HTML file

from mdka import md_from_file

print(md_from_file("tests/fixtures/simple-01.html"))
# Hello, world.
# 

Convert from HTML text and write the result to file

from mdka import md_from_html_to_file

md_from_html_to_file("<p>Hello, world.</p>", "tests/tmp/out.md", False) # third parameter is `overwrites` Boolean

Convert from HTML file and write the result to file

from mdka import md_from_file_to_file

md_from_file_to_file("tests/fixtures/simple-01.html", "tests/tmp/out.md", False) # third parameter is `overwrites` Boolean
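Convert a whole directory of HTML files (sketch)

The functions above work on one input at a time. As a minimal sketch of how they might be combined for batch conversion, the helper below walks a directory and writes one Markdown file per HTML file. The name `convert_tree` and its parameters are hypothetical and not part of mdka. The converter is passed in as an argument, on the assumption that it takes a file path and returns Markdown text, as `md_from_file` does in the example above.

```python
from pathlib import Path
from typing import Callable


def convert_tree(src_dir: str, dst_dir: str, convert: Callable[[str], str]) -> list:
    """Convert every *.html file under src_dir into a *.md file under dst_dir.

    `convert` is any callable that takes an HTML file path and returns
    Markdown text (assumed here to behave like mdka's md_from_file).
    Returns the list of Markdown paths that were written.
    """
    src_root = Path(src_dir)
    dst_root = Path(dst_dir)
    written = []
    for html_path in sorted(src_root.rglob("*.html")):
        # Mirror the source layout, swapping the .html suffix for .md.
        md_path = dst_root / html_path.relative_to(src_root).with_suffix(".md")
        md_path.parent.mkdir(parents=True, exist_ok=True)
        # Existing files are overwritten by this sketch.
        md_path.write_text(convert(str(html_path)), encoding="utf-8")
        written.append(md_path)
    return written
```

With mdka installed, the call would presumably look like `convert_tree("tests/fixtures", "tests/tmp", md_from_file)` after `from mdka import md_from_file`. Passing the converter in keeps the helper independent of the binding and easy to try with a stand-in function.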

Acknowledgements

Depends on Servo's html5ever / markup5ever. The Python bindings additionally depend on PyO3's pyo3 / maturin.

Dependencies

~1.4–7MB
~46K SLoC