search_with_google

A simple library for crawling the Google search results page

8 releases (4 breaking)

0.5.0 Sep 2, 2020
0.4.1 Jul 21, 2020
0.3.1 Jul 2, 2020
0.2.2 Jun 30, 2020
0.1.0 Jun 28, 2020

MIT license

A simple library that crawls the Google search results page.

Usage

NOTE:

If you are upgrading from v0.2.x, replace "use search_with_google::search;" with "use search_with_google::blocking::search;". Since 0.3.1 the top-level search function is async.

Add this to your Cargo.toml:

[dependencies]
search_with_google = "0.5"

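For the blocking API:

use search_with_google::blocking::search;

// Fetch up to 3 results for "rust" with the default user agent.
let results = search("rust", 3, None);
if let Ok(result_list) = results {
    // first() avoids a panic when the result list is empty.
    if let Some(first) = result_list.first() {
        println!("Title : {}\nLink : {}", first.title, first.link);
    }
}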

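For the async API:

use search_with_google::search;

// Run this inside an async function; .await needs an async context.
let results = search("rust", 3, None).await;
if let Ok(result_list) = results {
    // first() avoids a panic when the result list is empty.
    if let Some(first) = result_list.first() {
        println!("Title : {}\nLink : {}", first.title, first.link);
    }
}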

If you plan to run several searches, create a Client once and reuse it.

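For the blocking API:

use search_with_google::blocking::Client;

// Build the client once and reuse it for every search.
let client = Client::default();

let results = client.search("rust", 3, None);
if let Ok(result_list) = results {
    // first() avoids a panic when the result list is empty.
    if let Some(first) = result_list.first() {
        println!("Title : {}\nLink : {}", first.title, first.link);
    }
}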

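For the async API:

use search_with_google::Client;

// Build the client once and reuse it for every search.
let client = Client::default();

// Run this inside an async function; .await needs an async context.
let results = client.search("rust", 3, None).await;
if let Ok(result_list) = results {
    // first() avoids a panic when the result list is empty.
    if let Some(first) = result_list.first() {
        println!("Title : {}\nLink : {}", first.title, first.link);
    }
}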

The second and third parameters are:

  • limit: u32 -> maximum number of search results to retrieve (default: 10)
  • agent: String -> the user agent to send with the request (default: "Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0")

Pass None for either parameter to use its default.
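
One way a function can accept either a plain value or None for the same parameter is the Into<Option<T>> pattern. The sketch below is an assumption about how such a signature could be written, not the crate's actual source; resolve_options and the two constants are hypothetical names used only for illustration, and it uses nothing beyond the standard library.

```rust
// Defaults taken from the parameter list above.
const DEFAULT_LIMIT: u32 = 10;
const DEFAULT_AGENT: &str =
    "Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0";

// Hypothetical helper (not part of search_with_google): accepts a bare value
// or None for each parameter and falls back to the default when None is given.
fn resolve_options(
    limit: impl Into<Option<u32>>,
    agent: impl Into<Option<String>>,
) -> (u32, String) {
    let limit = limit.into().unwrap_or(DEFAULT_LIMIT);
    let agent = agent.into().unwrap_or_else(|| DEFAULT_AGENT.to_string());
    (limit, agent)
}

fn main() {
    // A bare limit and the default user agent.
    let (limit, agent) = resolve_options(3, None);
    println!("limit = {}, agent = {}", limit, agent);

    // The default limit and a custom user agent.
    let (limit, agent) = resolve_options(None, "MyAgent/1.0".to_string());
    println!("limit = {}, agent = {}", limit, agent);
}
```

With this pattern a caller never has to write Some(3); the conversion into Option happens inside the function.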

SearchResult is a struct with four fields:

  • title: the title of the search result
  • link: the main URL of the search result
  • description: the short description shown on the Google results page
  • description_raw: the same description with its HTML markup intact, such as <em>, <span> and &nbsp;, which is stripped from description

pub struct SearchResult {
    pub link: String,
    pub title: String,
    pub description: String,
    pub description_raw: String,
}
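
To make the difference between description and description_raw concrete, here is a minimal, standard-library-only sketch that derives a plain description from a raw one. strip_markup is a hypothetical helper and the sample values are made up; the crate may well produce description differently, for example with a real HTML parser.

```rust
// Mirror of the struct above so the example compiles on its own.
pub struct SearchResult {
    pub link: String,
    pub title: String,
    pub description: String,
    pub description_raw: String,
}

// Hypothetical helper: drops anything between '<' and '>' and turns the
// &nbsp; entity into a plain space. Deliberately naive, for illustration only.
fn strip_markup(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut in_tag = false;
    for ch in raw.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => out.push(ch),
            _ => {}
        }
    }
    out.replace("&nbsp;", " ")
}

fn main() {
    // Made-up sample data, not real search output.
    let raw = "Rust is a <em>systems</em>&nbsp;programming language";
    let result = SearchResult {
        link: "https://www.rust-lang.org/".to_string(),
        title: "Rust Programming Language".to_string(),
        description: strip_markup(raw),
        description_raw: raw.to_string(),
    };
    println!("Title : {}\nLink : {}", result.title, result.link);
    println!("description     : {}", result.description);
    println!("description_raw : {}", result.description_raw);
}
```

Use description when you only need readable text, and description_raw when you want to keep the emphasis markup that Google returns.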

Changelog

0.1.0 -> 0.2.2

  • option is now limit
  • You can now specify a user agent
  • Optional parameters such as limit: u32 and agent: String can be passed directly, without wrapping them in Some(...)
  • SearchError is now Error

0.2.2 -> 0.3.1

  • async option available with use search_with_google::search;
  • blocking option available with use search_with_google::blocking::search;
  • Error properly implements std::error::Error

0.3.1 -> 0.4.1

  • Client for repeated searches, Client::default() and blocking::Client::default()
  • description_raw field in result

0.4.1 -> 0.5.0

  • Error is now an enum with IoError and ReqwestError variants
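
An enum error type lets callers tell a local I/O failure from a network failure with a single match. The sketch below only mirrors the two variant names from the changelog; the payload types are assumptions (the ReqwestError payload is a String stand-in so the example needs no external crates), and describe is a hypothetical helper.

```rust
use std::error::Error as StdError;
use std::fmt;
use std::io;

// Stand-in for the crate's Error enum; only the variant names come from the changelog.
#[derive(Debug)]
enum Error {
    IoError(io::Error),
    // Assumption: the real variant presumably wraps reqwest::Error.
    ReqwestError(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::IoError(e) => write!(f, "I/O error: {}", e),
            Error::ReqwestError(msg) => write!(f, "request error: {}", msg),
        }
    }
}

// With Debug and Display in place, the std Error trait needs no extra methods.
impl StdError for Error {}

// Hypothetical helper: maps each variant to a short category label.
fn describe(err: &Error) -> &'static str {
    match err {
        Error::IoError(_) => "local I/O failure",
        Error::ReqwestError(_) => "network or HTTP failure",
    }
}

fn main() {
    let err = Error::ReqwestError("connection refused".to_string());
    println!("{} ({})", err, describe(&err));

    // Because the enum implements std::error::Error, it can be boxed as a trait object.
    let boxed: Box<dyn StdError> =
        Box::new(Error::IoError(io::Error::new(io::ErrorKind::Other, "disk full")));
    println!("{}", boxed);
}
```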

Credits

Based on google-somethin
