5 releases
Version | Released |
---|---|
0.2.1 | Nov 13, 2023 |
0.2.0 | Nov 12, 2023 |
0.1.2 | Nov 7, 2023 |
0.1.1 | Nov 7, 2023 |
0.1.0 | Oct 28, 2023 |
#16 in #levenshtein
56KB
876 lines
simple_search
A simple library for searching objects.
Basic Usage
use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::base::weighted_levenshtein_similarity;

fn main() {
    let engine = SearchEngine::new()
        .with_values(vec!["hello", "world", "foo", "bar"])
        .with(|v, q| weighted_levenshtein_similarity(v, q));

    let results = engine.search("hallo");
    println!("search for hallo: {:?}", results);
}
Advanced Usage
The following example shows how to use the library with a custom type. The SearchEngine is configured to search for books by title, author and description. Each of those is weighted differently and the IncrementalLevenshtein is used to calculate the similarity.
use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::incremental::IncrementalLevenshtein;

#[derive(Debug)]
struct Book {
    title: String,
    description: String,
    author: String,
}

fn main() {
    let book1 = Book {
        title: "The Winds of Winter".to_string(),
        description: "The sixth book in the A Song of Ice and Fire series.".to_string(),
        author: "George R. R. Martin".to_string(),
    };
    let book2 = Book {
        title: "The Great Gatsby".to_string(),
        description: "A classic novel of the roaring twenties.".to_string(),
        author: "F. Scott Fitzgerald".to_string(),
    };
    let book3 = Book {
        title: "Brave New World".to_string(),
        description: "A visionary and disturbing novel about a dystopian future.".to_string(),
        author: "Aldous Huxley".to_string(),
    };
    let book4 = Book {
        title: "To Kill a Mockingbird".to_string(),
        description: "A novel that deals with issues like injustice and moral growth.".to_string(),
        author: "Harper Lee".to_string(),
    };

    let engine = SearchEngine::new()
        .with_values(vec![book1, book2, book3, book4])
        .with_state(
            |book| IncrementalLevenshtein::new("", &book.title),
            |s, _, q| s.weighted_similarity(q),
        )
        .with_state_and_weight(
            0.8,
            |book| IncrementalLevenshtein::new("", &book.author),
            |s, _, q| s.weighted_similarity(q),
        )
        .with_state_and_weight(
            0.5,
            |book| IncrementalLevenshtein::new("", &book.description),
            |s, _, q| s.weighted_similarity(q),
        );

    let results = engine.similarities("Fire adn water");
    println!("search for Fire adn water:");
    for result in results {
        println!("{:?}", result);
    }
    println!();

    let results = engine.similarities("Fitzereld");
    println!("Fitzereld");
    for result in results {
        println!("{:?}", result);
    }
    println!();
}
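To make the idea of per-field weights concrete, here is a minimal, dependency-free sketch. It does not use simple_search and does not claim to reproduce how the crate combines scores: the functions `levenshtein`, `similarity` and `weighted_score` are hypothetical helpers, and summing `weight * similarity` over the fields is only an assumed combination rule.

```rust
/// Classic Levenshtein edit distance, computed with two rolling rows.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1).min(curr[j - 1] + 1).min(prev[j - 1] + cost);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Distance normalized to a similarity in [0, 1]; 1.0 means identical.
fn similarity(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

/// Assumed combination rule: sum of `weight * similarity` over all fields.
fn weighted_score(fields: &[(f64, &str)], query: &str) -> f64 {
    fields.iter().map(|(w, text)| w * similarity(text, query)).sum()
}

fn main() {
    // (weight, field) pairs for a single record: title and author.
    let record = [(1.0, "The Great Gatsby"), (0.8, "F. Scott Fitzgerald")];
    println!("score: {:.3}", weighted_score(&record, "Fitzereld"));
}
```

With a rule like this, a close match in a low-weight field contributes less than the same match in a high-weight field, which is the effect the weights 0.8 and 0.5 aim for in the example above.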
Storing an engine
The SearchEngine usually has a very complicated type that cannot easily be written out.
To work around this, the type_erasure module provides a way to store the engine by using a trait object in a Box.
This solution is not ideal, as it requires dynamic dispatch, but the overhead is minimal.
Once the approved RFC 2515 (type_alias_impl_trait) is part of stable Rust, this will be replaced with a more elegant solution.
For more details, see the type_erasure module.
use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::incremental::IncrementalLevenshtein;
use simple_search::type_erasure::non_cloneable::MutableSearchEngine;

fn main() {
    let engine = SearchEngine::new()
        .with_values(vec!["hello", "world", "foo", "bar"])
        .with_state(
            |v| IncrementalLevenshtein::new("", v),
            |s, _, q| s.weighted_similarity(q),
        );

    let mut engine: MutableSearchEngine<&str, str> = engine.erase_type();
    let results = engine.search("hallo");
    println!("search for hallo: {:?}", results);
}
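The general technique behind this can be shown without the crate. The sketch below is an illustration only, not the type_erasure module's implementation: `Scorer`, `Engine`, `App` and `build_app` are hypothetical names. It shows why a closure-parameterized type cannot be written in a struct field, and how boxing it behind a trait object gives it a nameable type.

```rust
/// Hypothetical trait exposing the one operation callers need.
trait Scorer {
    fn score(&self, query: &str) -> Vec<(String, f64)>;
}

/// Generic engine: its full type includes the closure type `F`,
/// which has no name that could be written in a field declaration.
struct Engine<F: Fn(&str, &str) -> f64> {
    values: Vec<String>,
    metric: F,
}

impl<F: Fn(&str, &str) -> f64> Scorer for Engine<F> {
    fn score(&self, query: &str) -> Vec<(String, f64)> {
        self.values
            .iter()
            .map(|v| (v.clone(), (self.metric)(v, query)))
            .collect()
    }
}

/// Thanks to the trait object, the field type is expressible.
struct App {
    engine: Box<dyn Scorer>,
}

fn build_app() -> App {
    let engine = Engine {
        values: vec!["hello".to_string(), "world".to_string()],
        // Toy metric: 1.0 on exact match, 0.0 otherwise.
        metric: |v: &str, q: &str| if v == q { 1.0 } else { 0.0 },
    };
    // The concrete `Engine<{closure}>` type is erased here.
    App { engine: Box::new(engine) }
}

fn main() {
    let app = build_app();
    println!("{:?}", app.engine.score("hello"));
}
```

Every call through `Box<dyn Scorer>` goes through a vtable, which is the dynamic dispatch cost mentioned above.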
Parallelization
The SearchEngine can be used in parallel, using rayon iterators.
This simply involves calling the parallel version of the respective function, as long as the values and query are Send + Sync.
use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::base::weighted_levenshtein_similarity;

fn main() {
    let engine = SearchEngine::new()
        .with_values(vec!["hello", "world", "foo", "bar"])
        .with(|v, q| weighted_levenshtein_similarity(v, q));

    let results = engine.par_search("hallo");
    println!("search for hallo: {:?}", results);
}
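To illustrate why the Send + Sync requirement exists, here is a rough sketch of data-parallel scoring. It deliberately uses scoped threads from the standard library instead of rayon so that it has no dependencies, and it says nothing about how par_search is implemented. The names `score` and `par_scores` and the toy metric are assumptions made for the example.

```rust
use std::thread;

/// Toy metric: number of positions at which both strings have the same character.
fn score(value: &str, query: &str) -> usize {
    value.chars().zip(query.chars()).filter(|(a, b)| a == b).count()
}

/// Splits `values` into chunks and scores each chunk on its own thread.
/// The values and the query are shared by reference across threads,
/// which is only allowed because `&str` is `Send + Sync`.
fn par_scores<'a>(values: &[&'a str], query: &str, workers: usize) -> Vec<(&'a str, usize)> {
    let workers = workers.max(1);
    let chunk_size = ((values.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn all workers first so that they run concurrently.
        let handles: Vec<_> = values
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk.iter().map(|v| (*v, score(v, query))).collect::<Vec<_>>()
                })
            })
            .collect();
        // Join in spawn order, so results keep the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let values = ["hello", "world", "foo", "bar"];
    println!("{:?}", par_scores(&values, "hallo", 2));
}
```

A value type that is not Sync, such as one containing `Rc` or `RefCell`, would make the `spawn` call above fail to compile, and the same kind of restriction applies to rayon's parallel iterators.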
Dependencies
~31–295KB