11 releases
0.2.0 | Feb 8, 2024 |
---|---|
0.1.9 | Jan 11, 2024 |
0.1.8 | Sep 1, 2023 |
0.1.4 | Aug 28, 2023 |
#1866 in Web programming
53 downloads per month
550KB
246 lines
RustySozluk
Description
RustySozluk
is a Rust library for fetching user entries and thread entries from eksisozluk.com and analyzing sentiment of entries.
With the power of Rust and tokio
library, it is possible to fetch entries in a thread in a very short time.
Features
- Fetch user entries by username
- Fetch entries in a particular thread
- Asynchronous API using Rust's
async/await
- Export entries to both
JSON
andCSV
formats - Calculate sentiment of entries or get simple frequency of words in entries
Installation
Add rustysozluk
to your Cargo.toml
:
[dependencies]
rustysozluk = "0.1.9"
Usage
use rustysozluk::{fetch_user, tokio};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let username= "morgomir"; // username to fetch //alınacak kullanıcı adı
let entry_number = 4; // number of entries to fetch //alınacak girdi sayısı
let entries = fetch_user(username, entry_number).await?;
println!("Extracted {} entries:", entries.len());
for entry in entries.iter() {
println!("Content: {}\nDate: {}\nUsername: {}", entry.content, entry.date, entry.username);
}
Ok(())
}
If you want to fetch entries in a thread, you can simple use fetch_thread
function just like fetch_user
function, no need to change anything.
Sentiment Analysis
rustysozluk has "analyzer" module which is used for sentiment analysis. It uses Sağlam et al., 2019 model to classify entries as positive, negative and give a "Tone" score between -1 and 1.
here is an example usage:
use rustysozluk::tokio;
use rustysozluk::fetch_title;
use rustysozluk::analyzer::analyzer::analyze_sentiment;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let title = "https://eksisozluk.com/rust-programlama-dili--5575227";
let number_of_entries = 4;
let entries = fetch_title(title, number_of_entries).await?;
analyze_sentiment(entries)?;
Ok(())
}
Important Notes 📝
To properly use the analyzer module, you'll need to have access to two CSV files that serve as lexicons for sentiment analysis. These files are:
- stopwords.csv - Contains a list of Turkish stop words to be filtered out during preprocessing.
- SWNetTR.csv - Contains the sentiment lexicon based on the aforementioned model.
Both files can be found in the files folder of this GitHub repository. Download it and place it in the same directory as your project.
Request Limitation and Rate Limiting ⚠️
When using the rustysozluk
crate, please be mindful of the number of requests you make to eksisozluk.com. Sending an excessive number of requests in a short period of time can result in your IP address being temporarily banned from accessing the site.
Recommendations 📋
-
Rate Limiting: Implement rate limiting in your code to control the frequency of your requests.
-
Batch Requests: If possible, batch multiple queries together to minimize the number of individual requests.
-
Caching: Store results locally to reduce the need for repeated requests to the same URLs.
-
By adhering to these guidelines, you help to maintain a respectful use of eksisozluk.com's resources and ensure that you can continue to benefit from the features offered by the rustysozluk library without interruption.
Contributing
Any kind of contribution is welcome! Feel free to open an issue 🙂
Dependencies
~11–24MB
~342K SLoC