29 releases (17 stable)

Uses new Rust 2024

1.0.17	Feb 25, 2025
1.0.15	Sep 13, 2024
1.0.12	Jul 26, 2024
1.0.9	Jan 27, 2024
0.1.4	Mar 20, 2019

#29 in Testing

34,600 downloads per month
Used in 49 crates (45 directly)

MIT license

245KB
6K SLoC

Headless Chrome

A high-level API to control headless Chrome or Chromium over the DevTools Protocol. It is the Rust equivalent of Puppeteer, a Node library maintained by the Chrome DevTools team.

It is not 100% feature compatible with Puppeteer, but there's enough here to satisfy most browser testing / web crawling use cases, and there are several 'advanced' features such as:

network request interception
JavaScript coverage monitoring
opening incognito windows
taking screenshots of elements or the entire page
saving pages to PDF
'headful' browsing
automatic downloading of 'known good' Chromium binaries for Linux / Mac / Windows
extension pre-loading

Quick Start

use std::error::Error;

use headless_chrome::Browser;
use headless_chrome::protocol::cdp::Page;

fn browse_wikipedia() -> Result<(), Box<dyn Error>> {
    let browser = Browser::default()?;

    let tab = browser.new_tab()?;

    // Navigate to wikipedia
    tab.navigate_to("https://www.wikipedia.org")?;

    // Wait for network/javascript/dom to make the search-box available
    // and click it.
    tab.wait_for_element("input#searchInput")?.click()?;

    // Type in a query and press `Enter`
    tab.type_str("WebKit")?.press_key("Enter")?;

    // We should end up on the WebKit-page once navigated
    let elem = tab.wait_for_element("#firstHeading")?;
    assert!(tab.get_url().ends_with("WebKit"));

    // Take a screenshot of the entire browser window
    let jpeg_data = tab.capture_screenshot(
        Page::CaptureScreenshotFormatOption::Jpeg,
        None,
        None,
        true)?;
    // Save the screenshot to disc
    std::fs::write("screenshot.jpeg", jpeg_data)?;

    // Take a screenshot of just the WebKit-Infobox
    let png_data = tab
        .wait_for_element("#mw-content-text > div > table.infobox.vevent")?
        .capture_screenshot(Page::CaptureScreenshotFormatOption::Png)?;
    // Save the screenshot to disc
    std::fs::write("screenshot.png", png_data)?;

    // Run JavaScript in the page
    let remote_object = elem.call_js_fn(r#"
        function getIdTwice () {
            // `this` is always the element that you called `call_js_fn` on
            const id = this.id;
            return id + id;
        }
    "#, vec![], false)?;
    match remote_object.value {
        Some(returned_string) => {
            dbg!(&returned_string);
            assert_eq!(returned_string, "firstHeadingfirstHeading".to_string());
        }
        _ => unreachable!()
    };

    Ok(())
}
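The `wait_for_element` calls above block until the element shows up or a timeout is hit. The general pattern behind that kind of call can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the crate's actual implementation: `wait_for`, its parameters, and the closure standing in for a DOM query are all hypothetical names.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `probe` until it yields a value or `timeout` elapses.
fn wait_for<T>(
    timeout: Duration,
    interval: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Result<T, String> {
    let deadline = Instant::now() + timeout;
    loop {
        // Try once; return immediately if the probe found something.
        if let Some(found) = probe() {
            return Ok(found);
        }
        // Give up once the deadline has passed.
        if Instant::now() >= deadline {
            return Err(format!("timed out after {:?}", timeout));
        }
        // Otherwise sleep briefly before the next attempt.
        thread::sleep(interval);
    }
}

fn main() {
    // Stand-in for a DOM query that only succeeds on the third attempt.
    let mut attempts = 0;
    let result = wait_for(Duration::from_secs(1), Duration::from_millis(10), || {
        attempts += 1;
        if attempts >= 3 { Some("input#searchInput") } else { None }
    });
    assert_eq!(result, Ok("input#searchInput"));

    // A probe that never succeeds surfaces as a timeout error.
    let missing: Result<(), String> =
        wait_for(Duration::from_millis(30), Duration::from_millis(10), || None);
    assert!(missing.is_err());
}
```

A short polling interval keeps latency low when the element appears quickly, while the deadline guarantees the caller gets an error instead of hanging forever.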

Auto-fetching the Chrome binary

[dependencies]
headless_chrome = {git = "https://github.com/rust-headless-chrome/rust-headless-chrome", features = ["fetch"]}
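If you would rather depend on a published release than on the git repository, the entry might look roughly like the fragment below. The version is the latest release listed at the top of this page. That the `fetch` feature is available in that published release is an assumption, so check the crate's feature list before relying on it.

```toml
[dependencies]
# Assumption: the `fetch` feature is present in the published 1.0.17 release.
headless_chrome = { version = "1.0.17", features = ["fetch"] }
```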

For fuller examples, take a look at tests/simple.rs and examples.

Before running the examples, make sure to add the failure crate to the dependencies in your project's Cargo.toml.

What can't it do?

The Chrome DevTools Protocol is huge. Currently, Puppeteer supports way more of it than we do. Some of the missing features include:

Dealing with frames
Handling file picker / chooser interactions
Tapping touchscreens
Emulating different network conditions (DevTools can alter latency, throughput, offline status, 'connection type')
Viewing timing information about network requests
Reading the SSL certificate
Replaying XHRs
HTTP Basic Auth
Inspecting EventSources (aka server-sent events or SSEs)
WebSocket inspection

If you're interested in adding one of these features but would like some advice about how to start, please reach out by creating an issue or sending me an email at alistair@sunburnt.country.

Related crates

fantoccini uses WebDriver, so it works with browsers other than Chrome. It is also asynchronous and based on Tokio, whereas headless_chrome has a synchronous API implemented with plain old threads. fantoccini has been around longer and is more battle-tested, but it doesn't support Chrome DevTools-specific functionality like JS coverage.
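To make the "synchronous API on plain threads" idea concrete, here is a minimal sketch using only the standard library. It is not how headless_chrome is actually written: `BlockingClient`, `Request`, and the echoing worker are hypothetical stand-ins for a real browser connection. The point is only that a caller can block on a channel while a background thread does the work.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical request type: a method name plus a channel for the reply.
struct Request {
    method: String,
    reply_to: Sender<String>,
}

/// Hypothetical blocking client backed by a single worker thread.
struct BlockingClient {
    requests: Sender<Request>,
}

impl BlockingClient {
    fn spawn() -> Self {
        let (requests, inbox) = mpsc::channel::<Request>();
        thread::spawn(move || {
            // Stand-in for the browser connection: echo each method back.
            for request in inbox {
                let _ = request.reply_to.send(format!("ok:{}", request.method));
            }
        });
        BlockingClient { requests }
    }

    /// Blocks the calling thread until the worker replies or `timeout` elapses.
    fn call(&self, method: &str, timeout: Duration) -> Result<String, String> {
        let (reply_to, reply) = mpsc::channel();
        self.requests
            .send(Request { method: method.to_string(), reply_to })
            .map_err(|e| e.to_string())?;
        reply.recv_timeout(timeout).map_err(|e| e.to_string())
    }
}

fn main() {
    let client = BlockingClient::spawn();
    let response = client.call("Page.navigate", Duration::from_secs(1)).unwrap();
    assert_eq!(response, "ok:Page.navigate");
}
```

The caller never sees a future or an executor. The trade-off is that each in-flight call ties up an OS thread while it waits.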

Testing

For debug output, set these environment variables before running cargo test:

RUST_BACKTRACE=1 RUST_LOG=headless_chrome=trace

Version numbers

Starting with v0.2.0, we're trying to follow SemVer strictly.

Troubleshooting

If you get errors related to timeouts, you likely need to enable sandboxing either in the kernel or as a setuid sandbox. Puppeteer's documentation has some information about how to do that.

Contributing

Pull requests and issues are most welcome, even if they're just experience reports. If you find anything frustrating or confusing, let me know!

Dependencies

~9–23MB
~369K SLoC

build build.rs
build auto_generate_cdp
dev chrono +clock
dev env_logger 0.11.3
dev filepath 0.2
dev jpeg-decoder 0.3
dev png
dev tiny_http 0.12