3 releases (1 stable)
Version | Released |
---|---|
1.0.0 | Jul 10, 2023 |
0.1.1 | Jun 2, 2023 |
0.1.0 | May 31, 2023 |
RecSys
A recommender system toolkit with additional maths functions. For now it is a learning project for exploring and improving in this field, but contributions are welcome.
Example
You can find a working example here
A simple implementation would be:
// src/s/company.rs
use rec_rsys::models::{one_hot_encode, Item, ItemAdapter};

pub struct Company {
    pub id: u32,
    pub ticker: String,
    pub sector: String,
    pub industry: String,
    pub exchange: String,
    pub country: String,
    pub adj: String,
    pub growth: f32,
}

impl ItemAdapter for Company {
    // Wrap the company's id and feature vector in an `Item`.
    fn to_item(&self) -> Item {
        Item::new(self.id, self.create_values(), None)
    }

    // Feature vector: the numeric growth followed by the one-hot encoded categories.
    fn create_values(&self) -> Vec<f32> {
        let mut values = vec![self.growth];
        [
            self.encode_sector(),
            self.encode_industry(),
            self.encode_exchange(),
            self.encode_country(),
            self.encode_adjs(),
        ]
        .iter()
        .for_each(|f| values.extend(f));
        values
    }

    // Every other company converted to an `Item`; empty if the query fails.
    fn get_references(&self) -> Vec<Item> {
        match self.get_references_query() {
            Ok(items) => items.iter().map(|c| c.to_item()).collect::<Vec<Item>>(),
            Err(_e) => vec![],
        }
    }
}
impl Company {
    // `Orm`, `CRUDError`, `Self::table()` and `Self::connect()` are assumed to be
    // application-side helpers defined elsewhere; they are not imported from rec_rsys.
    fn get_references_query(&self) -> Result<Vec<Company>, CRUDError> {
        let query = Orm::new()
            .select("id, sector, industry, exchange, country, adj, growth")
            .from(&Self::table())
            .where_clause()
            .not_equal("id", &self.id.to_string())
            .ready();
        let rows = sqlx::query_as::<_, Self>(&query)
            .fetch_all(&mut Self::connect());
        match rows {
            Ok(json) => Ok(json),
            Err(_e) => Err(CRUDError::WrongParameters),
        }
    }

    fn encode_sector(&self) -> Vec<f32> {
        let sectors = vec![
            "Healthcare",
            "Unknown",
            "Automotive",
            "Technology",
            "Communication Services",
            "Basic Materials",
            "Consumer Cyclical",
            "Industrials",
            "Financial Services",
            "Energy",
            "Utilities",
            "Real Estate",
            "Consumer Defensive",
        ];
        match one_hot_encode(&sectors).get(&self.sector) {
            Some(val) => val.to_vec(),
            None => panic!("unknown sector: {}", self.sector),
        }
    }
    // rest of methods ...
}
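The `encode_sector` method relies on one-hot encoding: each category becomes a vector of zeros with a single `1.0` at its own position. The following is a minimal sketch of that idea, not the crate's `one_hot_encode`. The name `one_hot_sketch` is hypothetical, and the `HashMap<String, Vec<f32>>` return type is only an assumption inferred from how the result is used above (`.get(&self.sector)` followed by `.to_vec()`).

```rust
use std::collections::HashMap;

/// Builds a one-hot vector for every category: all zeros except a single 1.0
/// at the category's own position in `categories`.
/// Hypothetical helper for illustration; not part of rec_rsys.
fn one_hot_sketch(categories: &[&str]) -> HashMap<String, Vec<f32>> {
    categories
        .iter()
        .enumerate()
        .map(|(index, name)| {
            let mut encoded = vec![0.0; categories.len()];
            encoded[index] = 1.0;
            (name.to_string(), encoded)
        })
        .collect()
}
```

With the categories `["Energy", "Utilities", "Technology"]`, looking up `"Utilities"` yields `[0.0, 1.0, 0.0]`, and a category that is not in the list has no entry at all, which is why the method above has to handle the `None` case.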
// src/recommendations/company.rs
use rec_rsys::{
    algorithms::knn::KNN,
    models::{Item, ItemAdapter},
    similarity::SimilarityAlgos,
};

use crate::models::company::Company;

pub struct Recommendation {
    prod_id: u32,
    result: f32,
}

pub fn generate_recommendations(id: u32, num_recommendations: u8) -> Vec<Recommendation> {
    let company = Company::get(id);
    calculate_recommendations(company.to_item(), company.get_references(), num_recommendations)
}

fn calculate_recommendations(
    item: Item,
    references: Vec<Item>,
    num_recs: u8,
) -> Vec<Recommendation> {
    let knn = KNN::new(item, references, num_recs);
    knn.result(SimilarityAlgos::Cosine)
        .into_iter()
        .map(|item| Recommendation {
            prod_id: item.id,
            result: item.result,
        })
        .collect()
}
// src/main.rs
mod models;
mod recommendations;

use recommendations::generate_recommendations;

fn main() {
    let recs = generate_recommendations(1, 5);
}
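To show what the example asks the toolkit to compute, here is a minimal, dependency-free sketch of a k-nearest-neighbours ranking by cosine similarity. It does not use or reproduce the internals of rec_rsys: `cosine_similarity`, `nearest_by_cosine` and the `(id, values)` tuple layout are hypothetical names chosen for illustration only.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 when either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Returns the ids and scores of the `k` references most similar to `target`,
/// best match first.
fn nearest_by_cosine(
    target: &[f32],
    references: &[(u32, Vec<f32>)],
    k: usize,
) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = references
        .iter()
        .map(|(id, values)| (*id, cosine_similarity(target, values)))
        .collect();
    // Higher cosine similarity means a closer match, so sort descending.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

For instance, with the references `(2, [1, 0])`, `(3, [0, 1])` and `(4, [1, 1])`, a target of `[1, 0]` and `k = 2`, the sketch ranks id 2 first and id 4 second.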
Improvements
1.- Top priority:
- Fix possible errors in formulas
- Add tests for each formula to be sure that it's correct
- Normalize the documentation so it is the same everywhere
- Create two types of docs: one in a separate .md file with extensive explanations and math examples, and a second one focused on code use
- Fix typos
- Add benchmarks for the formulas and overall functions
2.- Nice to have:
- Add more related docs in .md files
- Add tests in the docs
- Normalize the results. Either 0 or 1 should represent 100% similarity, depending on the formula
- Convert the results into structs with more information
- Improve the code snippets. (The title can be the method's name)
- Make it async
3.- Final steps:
- Accept incoming data
- Convert incoming data into structs?
- Process data and get rankings
- Check ranking accuracy
- Run multiple algorithms at the same time
4.- Future nice to have:
- Save data and results
- Create some sort of "cache" to avoid repeated recalculations
- Use ndarrays or some other efficient scientific library
- Compare the performance and results between Generic types, f32 and f64.
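The "Normalize the results" item could be approached in several ways. As one possible convention, and only as a sketch, the helpers below map both similarity-style and distance-style scores onto a common scale where 1 means a perfect match. The names `normalize_cosine` and `normalize_distance` are hypothetical and the mappings are an assumption, not something the crate currently implements.

```rust
/// Maps a cosine similarity from [-1, 1] onto [0, 1],
/// where 1 means the vectors point in the same direction.
fn normalize_cosine(similarity: f32) -> f32 {
    ((similarity + 1.0) / 2.0).clamp(0.0, 1.0)
}

/// Maps a non-negative distance onto (0, 1],
/// where 1 means the distance is zero (identical items).
/// Negative inputs are treated as a distance of zero.
fn normalize_distance(distance: f32) -> f32 {
    1.0 / (1.0 + distance.max(0.0))
}
```

With this convention a cosine similarity of `0.0` becomes `0.5`, and a distance of `1.0` also becomes `0.5`, so results from different formulas can be compared on the same scale.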
How to:
- Docs structure:
/// # [Name of the concept]
/// [Small explanation of the function]
///
/// ## Parameters:
/// * `[Parameter of the function]`: [Small explanation]
///
/// ## Returns:
/// * [What the function returns]
///
/// ## Examples:
/// [Examples]
///
#[doc = include_str!("../docs/example/example.md")]
pub fn example(){}
In the folder docs/ create a new .md file with the mathematical formula, explanation and examples if necessary.
# [Name of the concept]
## Explanation:
[Explanation of the mathematical concept]
## Formula:
$$ [Mathematical formula in raw katex format] $$
### Where:
* [Definition of each component of the formula]
- Order:
Keep related concepts together
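As an illustration of the docs structure, here is the template filled in for a Euclidean distance function. This is a sketch: `euclidean_distance` is a hypothetical function written for this example rather than taken from the crate, and the `#[doc = include_str!(...)]` attribute is left out so the snippet compiles without a matching file in docs/.

```rust
/// # Euclidean distance
/// Straight-line distance between two points of the same dimension.
///
/// ## Parameters:
/// * `p`: Coordinates of the first point
/// * `q`: Coordinates of the second point, same length as `p`
///
/// ## Returns:
/// * The distance as an `f32`; `0.0` means the points are identical
///
/// ## Examples:
/// `euclidean_distance(&[0.0, 0.0], &[3.0, 4.0])` returns `5.0`
///
pub fn euclidean_distance(p: &[f32], q: &[f32]) -> f32 {
    assert_eq!(p.len(), q.len(), "points must have the same dimension");
    p.iter()
        .zip(q)
        .map(|(a, b)| (a - b) * (a - b))
        .sum::<f32>()
        .sqrt()
}
```

The matching .md file in docs/ would then hold the formula in KaTeX form, for example `$$ d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} $$`, followed by a "Where" list defining `p`, `q` and `n`.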