
bin+lib markitdown

A Rust library designed to facilitate the conversion of various document formats into markdown text

5 releases

new 0.1.4 Feb 17, 2025
0.1.3 Feb 6, 2025
0.1.2 Jan 26, 2025
0.1.1 Jan 22, 2025
0.1.0 Jan 22, 2025


MIT and GPL-3.0+

1MB
686 lines

markitdown-rs

markitdown-rs is a Rust library that converts various document formats to Markdown text. It is a Rust implementation of the original markitdown Python library.

Features

Supported input formats:

  • Excel (.xlsx)
  • Word (.docx)
  • PowerPoint
  • PDF
  • Images
  • Audio
  • HTML
  • Text-based formats (plain text, .csv, .xml, .rss, .atom)
  • ZIP

Usage

Command-Line

Installation

cargo install markitdown

Convert a File

markitdown path-to-file.pdf

Or use -o to specify the output file:

markitdown path-to-file.pdf -o document.md
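
To convert several files in one go, the command above can be driven from a small wrapper. The following is a minimal sketch, assuming the markitdown binary is on your PATH; it uses only the positional input path and the -o flag shown above. The markdown_path helper is hypothetical and simply swaps the input's extension for .md.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical helper: derive the output path by replacing the
/// input's extension with `.md` (or appending it if there is none).
fn markdown_path(input: &Path) -> PathBuf {
    input.with_extension("md")
}

fn main() {
    // Every command-line argument is treated as an input file.
    for arg in std::env::args().skip(1) {
        let input = PathBuf::from(arg);
        let output = markdown_path(&input);

        // Equivalent to: markitdown <input> -o <output>
        let status = Command::new("markitdown")
            .arg(&input)
            .arg("-o")
            .arg(&output)
            .status();

        match status {
            Ok(s) if s.success() => {
                println!("{} -> {}", input.display(), output.display())
            }
            Ok(s) => eprintln!("{}: markitdown exited with {}", input.display(), s),
            Err(e) => eprintln!("{}: could not run markitdown: {}", input.display(), e),
        }
    }
}
```

Run with no arguments, the wrapper does nothing; each file is converted independently, so one failure does not stop the rest.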

Rust API

Installation

Add the following to your Cargo.toml:

[dependencies]
markitdown = "0.1.4"

Initialize MarkItDown

use markitdown::MarkItDown;

let mut md = MarkItDown::new();

Convert a File

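use markitdown::{ConversionOptions, DocumentConverterResult};

// Tell the converter which format to expect; `url` is not needed for a local file.
let options = ConversionOptions {
    file_extension: Some(".xlsx".to_string()),
    url: None,
};

// `md` is the MarkItDown instance created in the previous step.
let result: Option<DocumentConverterResult> = md.convert("path/to/file.xlsx", Some(options));

// `convert` returns None when the conversion fails or the file type is unsupported.
if let Some(conversion_result) = result {
    println!("Converted Text: {}", conversion_result.text_content);
} else {
    println!("Conversion failed or unsupported file type.");
}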
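
The file_extension field in the snippet above is hard-coded as a dotted extension (".xlsx"). When paths come from user input, it may be handy to derive that value instead. The following is a minimal sketch using only the standard library; dotted_extension is a hypothetical helper, not part of markitdown, and it assumes the leading-dot form shown above is what the library expects.

```rust
use std::path::Path;

/// Hypothetical helper (not part of markitdown): returns the final
/// extension with a leading dot, e.g. ".xlsx", or None if the path
/// has no extension.
fn dotted_extension(path: &str) -> Option<String> {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| format!(".{}", ext))
}

fn main() {
    for path in ["path/to/file.xlsx", "archive.tar.gz", "README"] {
        println!("{}: {:?}", path, dotted_extension(path));
    }
}
```

The returned Option<String> has the same type as the file_extension value in the example above, so it could be passed there directly.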

Register a Custom Converter

To extend MarkItDown, implement the DocumentConverter trait for your own converter and register it:

use markitdown::{DocumentConverter, MarkItDown};

struct MyCustomConverter;

impl DocumentConverter for MyCustomConverter {
    // Implement the required methods here
}

let mut md = MarkItDown::new();
md.register_converter(Box::new(MyCustomConverter));
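
This README does not list the methods that DocumentConverter requires, so the following self-contained sketch uses stand-in types to illustrate the general register-then-dispatch pattern. The Converter trait, its convert signature, the Registry type, and LogConverter are all assumptions made for illustration; they are not markitdown's actual API, and whether the library dispatches this way internally is also an assumption.

```rust
/// Stand-in trait. The method name and signature are assumptions,
/// not markitdown's real `DocumentConverter`.
trait Converter {
    /// Return Some(markdown) if this converter handles the extension.
    fn convert(&self, extension: &str, input: &str) -> Option<String>;
}

/// Example converter: turns each line of a `.log` file into a bullet.
struct LogConverter;

impl Converter for LogConverter {
    fn convert(&self, extension: &str, input: &str) -> Option<String> {
        if extension != ".log" {
            return None; // not our format, let another converter try
        }
        let bullets: Vec<String> = input.lines().map(|l| format!("- {}", l)).collect();
        Some(bullets.join("\n"))
    }
}

/// Stand-in registry holding boxed converters in registration order.
struct Registry {
    converters: Vec<Box<dyn Converter>>,
}

impl Registry {
    fn new() -> Self {
        Registry { converters: Vec::new() }
    }

    fn register(&mut self, converter: Box<dyn Converter>) {
        self.converters.push(converter);
    }

    /// Try each converter in turn and return the first successful result.
    fn convert(&self, extension: &str, input: &str) -> Option<String> {
        self.converters
            .iter()
            .find_map(|c| c.convert(extension, input))
    }
}

fn main() {
    let mut registry = Registry::new();
    registry.register(Box::new(LogConverter));

    println!("{:?}", registry.convert(".log", "started\nstopped"));
    println!("{:?}", registry.convert(".pdf", "ignored"));
}
```

Boxing the converters as trait objects lets differently typed converters live in one list, which matches the Box::new(...) argument passed to register_converter above.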

License

MarkItDown is licensed under the MIT License. See LICENSE for more details.
