1 unstable release

0.1.0 Jan 6, 2023

MIT/Apache

sxd_html_table

Provides features for working with HTML tables.

HTML tables come with a few complexities:

  • Cells can carry colspan and rowspan attributes, so the number of rows and columns cannot be read directly from the markup.
  • Cells are either th (header) or td (data), and the two kinds need to be told apart.

This library hides these complexities and makes it easy to work with the structure of a table. For example, you can convert an HTML table element to CSV.
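To make the span problem concrete, here is a minimal sketch of expanding cells with colspan and rowspan into a rectangular grid while keeping the th/td distinction. It is an illustration only and is not taken from this crate's implementation; the names `CellKind`, `SourceCell`, and `expand_spans` are hypothetical, and the sketch simply copies a spanned cell's text into every position it covers.

```rust
/// Whether a cell came from a `<th>` or a `<td>` element.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CellKind {
    Header,
    Data,
}

/// A cell as written in the markup (hypothetical type for this sketch).
#[derive(Debug, Clone)]
struct SourceCell {
    kind: CellKind,
    text: String,
    colspan: usize,
    rowspan: usize,
}

/// Expand rows of spanned cells into a rectangular grid.
/// Every position covered by a span gets a copy of the cell's kind and text;
/// positions covered by no cell stay `None`.
fn expand_spans(rows: &[Vec<SourceCell>]) -> Vec<Vec<Option<(CellKind, String)>>> {
    let mut grid: Vec<Vec<Option<(CellKind, String)>>> = vec![Vec::new(); rows.len()];
    for (r, row) in rows.iter().enumerate() {
        let mut c = 0;
        for cell in row {
            // Skip columns already filled by a rowspan from an earlier row.
            while matches!(grid[r].get(c), Some(Some(_))) {
                c += 1;
            }
            // Treat a span of 0 as 1 and clip rowspans at the last row
            // (a simplification made for this sketch).
            let colspan = cell.colspan.max(1);
            let rowspan = cell.rowspan.max(1).min(rows.len() - r);
            for target in grid.iter_mut().skip(r).take(rowspan) {
                if target.len() < c + colspan {
                    target.resize(c + colspan, None);
                }
                for slot in &mut target[c..c + colspan] {
                    *slot = Some((cell.kind, cell.text.clone()));
                }
            }
            c += colspan;
        }
    }
    // Pad every row to the widest row so the grid is rectangular.
    let width = grid.iter().map(|row| row.len()).max().unwrap_or(0);
    for row in &mut grid {
        row.resize(width, None);
    }
    grid
}
```

With a first row holding a th that spans two columns followed by a th that spans two rows, and a second row holding two td cells, the result is a 2×3 grid in which the last column of both rows carries the second header.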

Usage

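// Note: `extract_table_nodes_to_table` and `Error` must also be in scope;
// see the crate documentation for their import paths.
use sxd_html_table::Table;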

let html = r#"
<table>
  <tr>
    <th>header1</th>
    <th>header2</th>
  </tr>
  <tr>
    <td>data1</td>
    <td>data2</td>
  </tr>
</table>
"#;

// Parse the HTML, collect every <table> node under the document root,
// and convert each one into a table of cell text.
fn extract_table_texts_from_document(html: &str) -> Result<Vec<Table<String>>, Error> {
    let package = sxd_html::parse_html(html);
    let document = package.as_document();
    let tables = extract_table_nodes_to_table(document.root())?;
    let tables = tables
        .into_iter()
        .map(|table| table.to_string_table())
        .collect();
    Ok(tables)
}

let tables = extract_table_texts_from_document(html).unwrap();
// The function returns one Table per <table> element; use the first.
let csv = tables[0].to_csv().unwrap();
assert_eq!(csv, "header1,header2\ndata1,data2\n");
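For readers curious about the CSV side, the following is a minimal sketch of turning a grid of cell text into the output format asserted above. It is not this crate's `to_csv` implementation; the function names `escape_csv_field` and `rows_to_csv` are hypothetical, the quoting follows the convention described in RFC 4180, and rows end with `\n` to match the assertion in the usage example.

```rust
/// Quote a field when it contains a comma, a double quote, or a line break,
/// doubling any embedded double quotes.
fn escape_csv_field(field: &str) -> String {
    if field.contains(|ch: char| matches!(ch, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Join the fields of each row with commas and end every row with `\n`.
fn rows_to_csv(rows: &[Vec<String>]) -> String {
    let mut out = String::new();
    for row in rows {
        let fields: Vec<String> = row.iter().map(|f| escape_csv_field(f)).collect();
        out.push_str(&fields.join(","));
        out.push('\n');
    }
    out
}
```

Fields without special characters pass through unchanged, so the two-row table from the usage example would serialize to the same string as in the assertion above.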

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
