#website #favicon #scraper #logo #cloudflare-workers #command-line #cli

bin+lib site_icons

Website icon scraper that fetches sizes (with WASM support)

47 releases (5 breaking)

0.6.4 Jan 7, 2023
0.5.0 Dec 28, 2022
0.3.8 Oct 16, 2022
0.1.13 Jul 24, 2022
0.1.6 Feb 3, 2021

#342 in Images


Used in 9 crates (6 directly)

GPL-3.0 license

48KB
1.5K SLoC

site_icons


An efficient website icon scraper, usable as a Rust library or from the command line.

Features

  • Super fast!
  • Partially downloads images to find the sizes
  • Can extract a site logo <img> using a weighting system
  • Works with inline data URIs (and automatically converts <svg> elements to them)
  • Supports WASM (and Cloudflare Workers)
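
To illustrate why a partial download is enough to find an image's size, here is a minimal sketch for the PNG case. It is not the crate's actual implementation; the function name `png_dimensions` is hypothetical. It relies only on the PNG file layout: an 8-byte signature followed by the IHDR chunk, which stores width and height as big-endian 32-bit integers at byte offsets 16 and 20.

```rust
/// Hypothetical helper: read PNG pixel dimensions from the first bytes of a file.
/// Only the first 24 bytes are inspected, so a ranged or truncated download suffices.
fn png_dimensions(prefix: &[u8]) -> Option<(u32, u32)> {
    // Fixed 8-byte PNG signature.
    const SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

    // Need signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4).
    if prefix.len() < 24 || !prefix.starts_with(&SIGNATURE) {
        return None;
    }
    // The first chunk of a valid PNG must be IHDR.
    if &prefix[12..16] != b"IHDR" {
        return None;
    }

    let width = u32::from_be_bytes([prefix[16], prefix[17], prefix[18], prefix[19]]);
    let height = u32::from_be_bytes([prefix[20], prefix[21], prefix[22], prefix[23]]);
    Some((width, height))
}
```

Passing the first 24 bytes of a PNG response body returns `Some((width, height))`; anything that is too short or is not a PNG returns `None`. Other formats store their dimensions elsewhere and would need their own readers.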

Rust usage

use site_icons::SiteIcons;

let mut icons = SiteIcons::new();
// Scrape the icons from a URL. `load_website` is async, so this must run
// inside an async function.
let entries = icons.load_website("https://github.com", false).await?;

// Entries are sorted from highest to lowest resolution.
for icon in entries {
  println!("{:?}", icon);
}

Command line usage

First install the binary:

cargo install site_icons

Then run it in one of the following ways.

For text output:

Command:

site-icons https://github.com

Output:

https://github.githubassets.com/favicons/favicon.svg site_favicon svg
https://github.githubassets.com/app-icon-512.png app_icon png 512x512
https://github.githubassets.com/apple-touch-icon-180x180.png app_icon png 180x180
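
If another Rust program consumes the text output, each line can be split into its fields. The sketch below assumes the layout seen in the sample above: URL, kind, type, and an optional `WIDTHxHEIGHT` size, separated by whitespace. The names `IconLine`, `parse_size`, and `parse_icon_line` are hypothetical and not part of the crate.

```rust
/// Hypothetical record for one line of the text output.
#[derive(Debug, PartialEq)]
struct IconLine {
    url: String,
    kind: String,
    format: String,
    /// `None` when the line carries no size, as with the SVG favicon above.
    size: Option<(u32, u32)>,
}

/// Parse a size such as "512x512" into (width, height).
fn parse_size(text: &str) -> Option<(u32, u32)> {
    let (width, height) = text.split_once('x')?;
    Some((width.parse().ok()?, height.parse().ok()?))
}

/// Parse one output line; returns `None` if a required field is missing.
fn parse_icon_line(line: &str) -> Option<IconLine> {
    let mut fields = line.split_whitespace();
    let url = fields.next()?.to_string();
    let kind = fields.next()?.to_string();
    let format = fields.next()?.to_string();
    // The fourth field is optional; an unparseable size is treated as absent.
    let size = fields.next().and_then(parse_size);
    Some(IconLine { url, kind, format, size })
}
```

For structured consumers, the JSON output is likely the more robust choice, since it does not depend on field positions.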

For JSON output:

Command:

site-icons https://reactjs.org --json

Output:

[
  {
    "url": "data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9Ii0xMS41IC0xMC4yMzE3NCAyMyAyMC40NjM0OCI+CiAgPHRpdGxlPlJlYWN0IExvZ288L3RpdGxlPgogIDxjaXJjbGUgY3g9IjAiIGN5PSIwIiByPSIyLjA1IiBmaWxsPSIjNjFkYWZiIi8+CiAgPGcgc3Ryb2tlPSIjNjFkYWZiIiBzdHJva2Utd2lkdGg9IjEiIGZpbGw9Im5vbmUiPgogICAgPGVsbGlwc2Ugcng9IjExIiByeT0iNC4yIi8+CiAgICA8ZWxsaXBzZSByeD0iMTEiIHJ5PSI0LjIiIHRyYW5zZm9ybT0icm90YXRlKDYwKSIvPgogICAgPGVsbGlwc2Ugcng9IjExIiByeT0iNC4yIiB0cmFuc2Zvcm09InJvdGF0ZSgxMjApIi8+CiAgPC9nPgo8L3N2Zz4K",
    "headers": {},
    "kind": "site_logo",
    "type": "svg",
    "size": null
  },
  {
    "url": "https://reactjs.org/icons/icon-512x512.png?v=f4d46f030265b4c48a05c999b8d93791",
    "headers": {},
    "kind": "app_icon",
    "type": "png",
    "size": "512x512"
  },
  {
    "url": "https://reactjs.org/favicon.ico",
    "headers": {},
    "kind": "site_favicon",
    "type": "ico",
    "sizes": ["64x64", "32x32", "24x24", "16x16"]
  },
  {
    "url": "https://reactjs.org/favicon-32x32.png?v=f4d46f030265b4c48a05c999b8d93791",
    "headers": {},
    "kind": "site_favicon",
    "type": "png",
    "size": "32x32"
  }
]
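
The first entry above is an inline SVG logo returned as a base64 data URI. As a rough sketch of how SVG markup can be turned into such a URI, the code below uses a small hand-written base64 encoder so that it needs no dependencies. The function names are hypothetical, and the crate itself may well do this differently (for example with a base64 library).

```rust
/// Standard base64 alphabet (RFC 4648).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as padded base64.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling missing bytes.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let triple = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[(triple >> 18) as usize & 63] as char);
        out.push(ALPHABET[(triple >> 12) as usize & 63] as char);
        // Emit '=' padding where the chunk had fewer than three bytes.
        if chunk.len() > 1 {
            out.push(ALPHABET[(triple >> 6) as usize & 63] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[triple as usize & 63] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Hypothetical helper: wrap SVG markup in a `data:` URI.
fn svg_to_data_uri(svg: &str) -> String {
    format!("data:image/svg+xml;base64,{}", base64_encode(svg.as_bytes()))
}
```

Any markup that starts with `<svg` encodes to a payload beginning with `PHN2Zy`, which matches the start of the `url` value in the sample output.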

Sources

  • HTML favicon <link> tags, falling back to the default /favicon.svg and /favicon.ico paths
  • The icons field of the web app manifest
  • <img> tags on the page that are directly inside the header, or whose src, alt, or class contains the text "logo"
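
The logo rule in the last bullet can be pictured as a scoring function over candidate `<img>` tags. The sketch below is illustrative only: the struct, the function names, and the weights are all hypothetical, and the crate's real weighting is not documented here.

```rust
/// Hypothetical view of an <img> tag after HTML parsing.
struct ImgCandidate<'a> {
    src: &'a str,
    alt: &'a str,
    class: &'a str,
    /// True when the tag sits directly inside the page header.
    in_header: bool,
}

/// Score a candidate; 0 means it matches none of the logo criteria.
/// The weights are made up for illustration.
fn logo_weight(img: &ImgCandidate<'_>) -> u32 {
    let mentions_logo = |text: &str| text.to_lowercase().contains("logo");

    let mut weight = 0;
    if img.in_header {
        weight += 2;
    }
    if mentions_logo(img.src) {
        weight += 1;
    }
    if mentions_logo(img.alt) {
        weight += 1;
    }
    if mentions_logo(img.class) {
        weight += 1;
    }
    weight
}

/// Pick the highest-scoring candidate, ignoring tags that match nothing.
fn best_logo<'a, 'b>(imgs: &'b [ImgCandidate<'a>]) -> Option<&'b ImgCandidate<'a>> {
    imgs.iter()
        .filter(|img| logo_weight(img) > 0)
        .max_by_key(|img| logo_weight(img))
}
```

Scoring rather than taking the first match lets a header image whose attributes also mention "logo" outrank an unrelated image that merely has "logo" somewhere in its file name.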

Running locally

git clone https://github.com/samdenty/site_icons
cd site_icons
cargo run https://github.com

Dependencies

~16–30MB
~529K SLoC