7 releases
Uses the Rust 2024 edition
Version | Released |
---|---|
0.1.7 | May 15, 2025 |
0.1.6 | May 15, 2025 |
markdownify
A Rust library for converting various document formats to Markdown, part of the mcat project.
Overview
markdownify is a Rust implementation inspired by Microsoft's markitdown Python project. It provides functionality to convert various document formats to Markdown, making them easier to view, share, and integrate into AI prompts.
Supported Formats
Format | Extension | Description |
---|---|---|
Word Documents | .docx | Microsoft Word documents |
OpenDocument Text | .odt, .odp | OpenDocument text and presentation files |
PDF | .pdf | Portable Document Format files |
PowerPoint | .pptx | Microsoft PowerPoint presentations |
Excel/Spreadsheets | .xlsx, .xls, .xlsm, .xlsb, .xla, .xlam, .ods | Various spreadsheet formats |
CSV | .csv | Comma-separated values (auto-detects delimiter) |
ZIP Archives | .zip | Extracts and converts contained files |
Other text formats | (various) | Falls back to code block formatting |
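The CSV row mentions delimiter auto-detection without saying how it works. The sketch below shows one simple way such detection could be done: count candidate delimiters on the first non-empty line and pick the most frequent. The function names `guess_delimiter` and `delimited_to_markdown` are hypothetical, and this heuristic is an assumption for illustration, not markdownify's actual implementation. It does not handle quoted fields.

```rust
/// Guess the delimiter of delimited text by counting candidate characters
/// on the first non-empty line. Illustrative heuristic only (an assumption,
/// not the crate's real detection logic).
pub fn guess_delimiter(sample: &str) -> char {
    const CANDIDATES: [char; 4] = [',', ';', '\t', '|'];

    // Use the first non-empty line (usually the header row) as the sample.
    let first_line = sample
        .lines()
        .find(|l| !l.trim().is_empty())
        .unwrap_or("");

    let mut best = ',';
    let mut best_count = 0;
    for &c in CANDIDATES.iter() {
        let count = first_line.matches(c).count();
        // Strictly greater, so ties keep the earlier candidate (comma first).
        if count > best_count {
            best = c;
            best_count = count;
        }
    }
    best
}

/// Render delimited text as a Markdown table, treating the first
/// non-empty row as the header. Quoted fields are not supported.
pub fn delimited_to_markdown(input: &str) -> String {
    let delim = guess_delimiter(input);
    let mut out = String::new();
    for (i, line) in input.lines().filter(|l| !l.trim().is_empty()).enumerate() {
        let cells: Vec<&str> = line.split(delim).map(str::trim).collect();
        out.push_str(&format!("| {} |\n", cells.join(" | ")));
        if i == 0 {
            // Markdown tables need a separator row after the header.
            let sep = vec!["---"; cells.len()];
            out.push_str(&format!("| {} |\n", sep.join(" | ")));
        }
    }
    out
}
```

Calling `delimited_to_markdown` on semicolon-separated input yields a pipe table with the first row as the header. A production converter would more likely rely on a CSV parsing crate to handle quoting and escapes.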
Installation
Add to your Cargo.toml:

    [dependencies]
    markdownify = "0.1.1"
Usage
Basic Usage
    use std::path::Path;
    use markdownify::convert;

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Convert a file to Markdown
        let path = Path::new("document.docx");
        let markdown = convert(&path, None)?;
        println!("{}", markdown);

        // With an optional name header
        let name = String::from("My Spreadsheet");
        let path = Path::new("spreadsheet.xlsx");
        let markdown = convert(&path, Some(&name))?;
        println!("{}", markdown);

        Ok(())
    }
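When converting several files at once, it can help to wrap the single-file call in a small batch helper. The sketch below is one possible shape for that: `convert_all` is a hypothetical helper, not part of markdownify, and it takes the converter as a closure so it compiles without the crate. Each input is paired with a suggested output path that keeps the file name and swaps the extension for `.md`.

```rust
use std::path::{Path, PathBuf};

/// Run `convert_fn` over each path and pair the suggested output path
/// (same name, `.md` extension) with the conversion result.
/// Hypothetical helper for illustration; not provided by markdownify.
pub fn convert_all<F, E>(
    paths: &[PathBuf],
    convert_fn: F,
) -> Vec<(PathBuf, Result<String, E>)>
where
    F: Fn(&Path) -> Result<String, E>,
{
    paths
        .iter()
        // Errors are kept per file so one failure does not abort the batch.
        .map(|p| (p.with_extension("md"), convert_fn(p.as_path())))
        .collect()
}
```

In a real program the closure would call `convert` as in the example above, and the caller would write each successful result to its output path while reporting the failures.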
Working with Specific Formats
You can also use the format-specific converters directly:
    use std::path::Path;
    use markdownify::{docx, pdf};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Convert a Word document
        let path = Path::new("document.docx");
        let markdown = docx::docx_convert(&path)?;

        // Convert a PDF
        let path = Path::new("document.pdf");
        let markdown = pdf::pdf_convert(&path)?;

        // The other format-specific converters follow the same pattern.
        Ok(())
    }
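If you call the format-specific converters yourself, you need some way to pick one per file. A minimal sketch of that choice, based only on the extensions listed under Supported Formats, might look like the following. The `Format` enum and `classify` function are hypothetical names introduced here, and matching extensions case-insensitively is an assumption rather than documented behavior of the crate.

```rust
use std::path::Path;

/// Format families from the Supported Formats table.
/// Hypothetical type for illustration; not part of markdownify.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Format {
    Docx,
    OpenDocument,
    Pdf,
    Pptx,
    Spreadsheet,
    Csv,
    Zip,
    /// Anything else, which the README says falls back to a code block.
    Other,
}

/// Classify a path by its extension (case-insensitive by assumption).
pub fn classify(path: &Path) -> Format {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());

    match ext.as_deref() {
        Some("docx") => Format::Docx,
        Some("odt") | Some("odp") => Format::OpenDocument,
        Some("pdf") => Format::Pdf,
        Some("pptx") => Format::Pptx,
        Some("xlsx") | Some("xls") | Some("xlsm") | Some("xlsb")
        | Some("xla") | Some("xlam") | Some("ods") => Format::Spreadsheet,
        Some("csv") => Format::Csv,
        Some("zip") => Format::Zip,
        // Missing or unrecognized extensions use the fallback.
        _ => Format::Other,
    }
}
```

A caller could then `match` on the returned `Format` and invoke the corresponding converter, such as `docx::docx_convert` for `Format::Docx`. In most cases the top-level `convert` function is the simpler choice, since it accepts any supported file.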
License
This project is licensed under the MIT License. See the LICENSE file in the mcat project for details.