#markdown-tables #markdown #yaml #csv #command-line-tool #excel #json

bin+lib madato

A library and command line tool for reading and writing tabular data (XLS, ODS, CSV, YAML), and Markdown

3 unstable releases

0.7.0 May 26, 2024
0.5.3 Sep 9, 2018
0.5.1 Aug 12, 2018

#438 in Parser implementations


4,956 downloads per month
Used in dtool

MIT/Apache

77KB
1K SLoC

madato

madato is a library and command line tool for working with tabular data and Markdown.


The tool is primarily centered on getting tabular data (spreadsheets, CSVs) into Markdown.

It is both a library and a CLI. The library is available for Rust and as a Python module. If you need spreadsheet support in the Rust library, enable the spreadsheets feature:

madato = { version = "0", features = ["spreadsheets"] }

  1. madato (library) - this library, which has YAML support
  2. feature = "spreadsheets" - adds support for reading and writing XLS and ODS spreadsheets
  3. madato (cli) - a command line tool built on the above
  4. The full library is also available as a Python module

Details

When generating the output:

  • Filter the Rows using basic Regex over Key/Value pairs
  • Limit the columns to named headings
  • Re-order the columns, or repeat them using the same column feature
  • Only generate a table for a named "sheet" (applicable for the XLS/ODS formats)
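
The row filter and column selection described above can be sketched in a few lines of Python. This is an illustration of the idea only, not madato's implementation: the function name `select` is hypothetical, and it assumes that a filter matches when the regex is found anywhere in the value and that multiple filters must all match.

```python
import re

def select(rows, columns=None, filters=None):
    """Keep rows whose values match every (key, regex) filter, then pick columns.

    `columns` may re-order or repeat headings; None keeps the first row's keys.
    """
    compiled = [(key, re.compile(pattern)) for key, pattern in (filters or [])]
    headings = list(columns) if columns else list(rows[0].keys()) if rows else []
    kept = [
        [row.get(name, "") for name in headings]
        for row in rows
        # Assumption: a filter passes when the pattern is found anywhere in the value.
        if all(rx.search(str(row.get(key, ""))) for key, rx in compiled)
    ]
    return headings, kept

# Sample rows taken from the Sheet1 data shown elsewhere in this README.
rows = [
    {"Rank": "1", "Language": "Python", "Trend": "+5.5 %"},
    {"Rank": "2", "Language": "Java", "Trend": "-0.5 %"},
    {"Rank": "3", "Language": "Javascript", "Trend": "+0.2 %"},
]

headings, kept = select(
    rows,
    columns=["Language", "Trend", "Language"],  # re-ordered, with one column repeated
    filters=[("Trend", r"\+")],                 # keep rows whose Trend contains "+"
)
print(headings)
print(kept)
```

Returning the headings separately from the row values keeps repeated columns intact, which a plain dict per row could not do.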

Madato is:

  • Command Line Tool (Windows, Mac, Linux) - good for CI/CD preprocessing
  • Rust Library - Good for integration into Rust Markdown tooling
  • Node JS WASM API - To be used later for Atom and VSCode Extensions

Madato expects every column to have a heading; that is, the first row holds the headings (column names). If a cell in that first row is blank, madato creates NULL0..NULLn entries as required.
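
As a rough illustration of the placeholder naming, the sketch below fills blank heading cells with NULL plus the zero-based column index. The index rule is an assumption inferred from the example output in this README, where the sixth column of "3rd Sheet" is named NULL5; `name_headings` is a hypothetical helper, not part of madato.

```python
def name_headings(first_row):
    """Return heading names, replacing blank cells with NULL<n>.

    Assumption: n is the zero-based position of the blank cell in the row.
    """
    return [
        str(cell).strip() if cell is not None and str(cell).strip() else f"NULL{index}"
        for index, cell in enumerate(first_row)
    ]

# Five named headings followed by one blank heading cell.
headings = name_headings(["col1", "col2", "col3", "col4", "col5", ""])
print(headings)
```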

Examples

  • Extract the "3rd Sheet" sheet from an MS Excel document
08:39 $ target/debug/madato table --type xlsx test/sample_multi_sheet.xlsx --sheetname "3rd Sheet"
|col1|col2| col3 |col4 |                         col5                          |NULL5|
|----|----|------|-----|-------------------------------------------------------|-----|
| 1  |that| are  |wider|  value ‘aaa’ is in the next cell, but has no heading  | aaa |
|than|the |header| row |       (open the spreadsheet to see what I mean)       |     |
  • Extract and reorder just 3 Columns
08:42 $ target/debug/madato table --type xlsx test/sample_multi_sheet.xlsx --sheetname "3rd Sheet" -c col2 -c col3 -c NULL5
|col2| col3 |NULL5|
|----|------|-----|
|that| are  | aaa |
|the |header|     |
  • Pull from the second_sheet sheet
  • Only extract Heading 4 column
  • Use a filter, where Heading 4 values must contain a letter or number.
08:48 $ target/debug/madato table --type xlsx test/sample_multi_sheet.xlsx --sheetname second_sheet -c "Heading 4" -f 'Heading 4=[a-zA-Z0-9]'
|        Heading 4         |
|--------------------------|
|         << empty         |
|*Some Bolding in Markdown*|
|   `escaped value` foo    |
|           0.22           |
|         #DIV/0!          |
|  “This cell has quotes”  |
|       😕 ← Emoticon       |
  • Filter on a column, keeping only rows that have a "+" in the Trend column
09:00 $ target/debug/madato table --type xlsx test/sample_multi_sheet.xlsx --sheetname Sheet1 -c Rank -c Language -c Trend -f "Trend=\+"
|                         Rank                         |  Language  |Trend |
|------------------------------------------------------|------------|------|
|                          1                           |   Python   |+5.5 %|
|                          3                           | Javascript |+0.2 %|
|                          7                           |     R      |+0.0 %|
|                          12                          | TypeScript |+0.3 %|
|                          16                          |   Kotlin   |+0.5 %|
|                          17                          |     Go     |+0.3 %|
|                          20                          |    Rust    |+0.0 %|
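
For CI/CD preprocessing it can help to assemble these commands in a script. The sketch below builds the argument list for the last example, using only the subcommand and flags that appear in the examples above (table, --type, --sheetname, -c, -f). It assumes a madato binary is on the PATH; the helper name `madato_table_args` is hypothetical.

```python
import shlex

def madato_table_args(path, file_type, sheetname=None, columns=(), filters=()):
    """Build the argv list for a `madato table` call (binary assumed to be on PATH)."""
    args = ["madato", "table", "--type", file_type, path]
    if sheetname is not None:
        args += ["--sheetname", sheetname]
    for column in columns:
        args += ["-c", column]        # one -c per column, in output order
    for row_filter in filters:
        args += ["-f", row_filter]    # filters use the Key=regex form
    return args

args = madato_table_args(
    "test/sample_multi_sheet.xlsx",
    "xlsx",
    sheetname="Sheet1",
    columns=["Rank", "Language", "Trend"],
    filters=[r"Trend=\+"],
)
# Show the command as a shell-quoted string for logging.
print(shlex.join(args))
```

Passing the list to `subprocess.run(args, check=True, capture_output=True, text=True)` would run the command and expose the generated table on `stdout`, without any shell quoting concerns.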

Internals

madato uses:

Tips

  • I have found that copying a table from a website (HTML) into a spreadsheet, then running it through madato, gives an excellent Markdown table of the original.

Python

pip install madato

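# py
from IPython.display import display, Markdown
import madato

# Render the Markdown table in a notebook cell
display(Markdown(madato.spreadsheet_to_md("../test/Financial Sample.xlsx")))

# Or print the raw Markdown; my_sample_spreadsheet is a path defined elsewhere
print(madato.spreadsheet_to_md(str(my_sample_spreadsheet)))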

More Commandline

Sheet List

You can list the sheets of an XLS* or ODS file with:

$ madato sheetlist test/sample_multi_sheet.xlsx 
Sheet1
second_sheet
3rd Sheet

YAML to Markdown

Madato reads a YAML file in the same way it reads a spreadsheet. This is useful for keeping tabular data in your source repository rather than, say, an XLS file.

madato table -t yaml test/www-sample/test.yml

|col3| col4  |  data1  |       data2        |
|----|-------|---------|--------------------|
|100 |gar gar|somevalue|someother value here|
|190x|       |  that   |        nice        |
|100 | ta da |  this   |someother value here|

Please see the test/www-sample/test.yml file for the expected layout of this file
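
To make the rows-to-table step concrete, here is a minimal Python sketch that renders a list of row maps as a Markdown table. It is not madato's renderer: `rows_to_markdown` is a hypothetical name, cells are left-aligned rather than centered, and the rows are typed in by hand from the table above instead of being parsed from YAML.

```python
def rows_to_markdown(rows):
    """Render a list of dicts as a Markdown table, one column per key."""
    # Collect headings in first-seen order across all rows.
    headings = []
    for row in rows:
        for key in row:
            if key not in headings:
                headings.append(key)
    cells = [[str(row.get(h, "")) for h in headings] for row in rows]
    # Each column is as wide as its longest heading or cell.
    widths = [
        max([len(h)] + [len(r[i]) for r in cells])
        for i, h in enumerate(headings)
    ]

    def line(values):
        return "|" + "|".join(v.ljust(w) for v, w in zip(values, widths)) + "|"

    out = [line(headings), "|" + "|".join("-" * w for w in widths) + "|"]
    out += [line(r) for r in cells]
    return "\n".join(out)

rows = [
    {"col3": "100", "col4": "gar gar", "data1": "somevalue", "data2": "someother value here"},
    {"col3": "190x", "col4": "", "data1": "that", "data2": "nice"},
    {"col3": "100", "col4": "ta da", "data1": "this", "data2": "someother value here"},
]
table = rows_to_markdown(rows)
print(table)
```

The sketch does not escape pipe characters inside cell values, which a real renderer would need to handle.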

Excel/ODS to YAML

Changing the output from the default Markdown (MD) to YAML gives you a YAML file of the spreadsheet.

madato table -t xlsx test/sample_multi_sheet.xlsx -s Sheet1 -o yaml
---
- Rank: "1"
  Change: ""
  Language: Python
  Share: "23.59 %"
  Trend: "+5.5 %"
- Rank: "2"
  Change: ""
  Language: Java
  Share: "22.4 %"
  Trend: "-0.5 %"
- Rank: "3"
  Change: ""
  Language: Javascript
  Share: "8.49 %"
...

If you omit the sheet name, it will dump all sheets into an ordered map of arrays of maps.
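
Once loaded into Python, an all-sheets dump could be pictured as follows. This is a sketch of the assumed shape only (sheet name, then a list of rows, then column name to cell value); the Sheet1 rows come from the output above, the other sheets are left as empty placeholders, and `describe` is a hypothetical helper.

```python
# Assumed shape: sheet name -> list of rows -> column name -> cell value.
# Python dicts keep insertion order, so the sheet order is preserved.
workbook = {
    "Sheet1": [
        {"Rank": "1", "Change": "", "Language": "Python", "Share": "23.59 %", "Trend": "+5.5 %"},
        {"Rank": "2", "Change": "", "Language": "Java", "Share": "22.4 %", "Trend": "-0.5 %"},
    ],
    # Rows of the other sheets are omitted here; the empty lists are placeholders.
    "second_sheet": [],
    "3rd Sheet": [],
}

def describe(workbook):
    """Return (sheet name, row count, column names) for each sheet, in sheet order."""
    summary = []
    for sheet_name, rows in workbook.items():
        columns = list(rows[0].keys()) if rows else []
        summary.append((sheet_name, len(rows), columns))
    return summary

for sheet_name, row_count, columns in describe(workbook):
    print(f"{sheet_name}: {row_count} rows, columns: {', '.join(columns)}")
```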

Features

  • [x] Reads a formatted YAML string and renders a Markdown Table
  • [x] Can take an optional list of column headings, and only display those from the table (filtering out other columns present)
  • [X] Native Binary Command Line (windows, linux, osx)
  • [X] Read an XLSX file and produce a Markdown Table
  • [X] Read an ODS file and produce a Markdown Table
  • [X] Read a CSV
  • [X] Published as a Python Module
  • [ ] Read a TSV, PSV (etc.) file and produce a Markdown Table
  • [ ] Support Nested Structures in the YAML input
  • [ ] Read a Markdown File, and select the "table" and turn it back into YAML

Future Goals

Known Issues

License

madato is licensed under either of the Apache License, Version 2.0, or the MIT license, at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in madato by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Dependencies

~5–13MB
~169K SLoC