3 stable releases

| Version | Released |
|---|---|
| 1.1.0 | Mar 1, 2025 |
| 1.0.1 | Feb 28, 2025 |
| 1.0.0 | Feb 15, 2025 |
csvkit: CSV Processing Toolkit in Rust
csvkit is a CSV file processing toolkit written in Rust. It aims to provide a user-friendly interface that closely mirrors the Python csvkit API, so that CSV manipulation is both efficient and reliable. It builds on Rust's performance and memory safety to do so.
Key Features
- Familiar API: Designed to be as close as possible to the Python csvkit API, making it easy for Python users to transition.
- High Performance: Utilizes Rust's performance benefits for fast and efficient CSV processing.
- Reliability: Leverages Rust's strong type system and memory safety guarantees for robust and stable operation.
- Modular Design: Separates Reader and Writer functionalities, enabling flexible CSV processing pipelines.
Installation
- Install Rust: If you haven't already, install Rust from rustup.rs.
- Add csvkit from GitHub:

cargo add --git https://github.com/StatPan/csvkit
Usage
Reader (DictReader)
The `DictReader` reads CSV files and returns each row as a `HashMap<String, String>`.
use csvkit::reader::{DictReader, ReaderOptions};
use std::collections::HashMap;
use std::fs::File;
use std::io::BufReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open the input file and wrap it in a buffered reader.
    let file = File::open("data.csv")?;
    let reader = BufReader::new(file);

    // Start from the default options.
    let options = ReaderOptions::default();
    let dict_reader = DictReader::new(reader, options)?;

    // Each item is a Result, so `?` propagates any error to the caller.
    for record in dict_reader {
        let row: HashMap<String, String> = record?;
        println!("{:?}", row);
    }
    Ok(())
}
- `DictReader::new(reader, options)`: Creates a `DictReader` with a `BufReader` and `ReaderOptions`.
- `for record in dict_reader`: `DictReader` implements the `Iterator` trait, allowing you to iterate over each row in the CSV file.
- `record?`: Each row is returned as a `Result<HashMap<String, String>, Box<dyn Error>>`, so the `?` operator is used for error handling.
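To make the iterator contract described above concrete, here is a minimal standard-library sketch of a reader that yields `Result<HashMap<String, String>, Box<dyn Error>>` items. It is not csvkit's implementation: `SketchDictReader` is a hypothetical name, and the sketch assumes fields never contain the delimiter, quote characters, or newlines.

```rust
use std::collections::HashMap;
use std::error::Error;
use std::io::BufRead;

/// Hypothetical stand-in for a DictReader-style iterator (not csvkit's code).
/// Assumption: fields contain no delimiter, quotes, or embedded newlines.
pub struct SketchDictReader<R: BufRead> {
    lines: std::io::Lines<R>,
    headers: Vec<String>,
    delimiter: char,
}

impl<R: BufRead> SketchDictReader<R> {
    /// Consumes the first line of the input as the header row.
    pub fn new(reader: R, delimiter: char) -> Result<Self, Box<dyn Error>> {
        let mut lines = reader.lines();
        // The first `?` handles empty input, the second handles an I/O error.
        let header_line = lines.next().ok_or("input is empty: no header row")??;
        let headers = header_line
            .split(delimiter)
            .map(|h| h.trim().to_string())
            .collect();
        Ok(Self { lines, headers, delimiter })
    }
}

impl<R: BufRead> Iterator for SketchDictReader<R> {
    type Item = Result<HashMap<String, String>, Box<dyn Error>>;

    fn next(&mut self) -> Option<Self::Item> {
        // `?` on the Option ends iteration once the input is exhausted.
        let line = match self.lines.next()? {
            Ok(line) => line,
            Err(e) => return Some(Err(e.into())),
        };
        let fields: Vec<&str> = line.split(self.delimiter).collect();
        if fields.len() != self.headers.len() {
            let msg = format!(
                "expected {} fields, found {}",
                self.headers.len(),
                fields.len()
            );
            return Some(Err(msg.into()));
        }
        // Pair each header with the field in the same position.
        let row: HashMap<String, String> = self
            .headers
            .iter()
            .cloned()
            .zip(fields.into_iter().map(str::to_string))
            .collect();
        Some(Ok(row))
    }
}
```

Because every item is a `Result`, a caller can either propagate the first failure with `?` inside a `for` loop, as in the example above, or inspect each item and skip malformed rows.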
Writer (DictWriter)
The `DictWriter` writes data in the form of `HashMap<String, String>` to a CSV file.
use csvkit::{
    reader::ReaderOptions, // Import necessary for options configuration
    writer::{DictWriter, WriterOptions},
};
use std::collections::HashMap;
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::create("output.csv")?;
    // The field names define the header row.
    let fieldnames = vec!["header1".to_string(), "header2".to_string()];
    let options = WriterOptions::default();
    let mut writer = DictWriter::new(file, fieldnames, options);

    // Write the header row first, then one row at a time.
    writer.writeheader()?;

    let mut row1 = HashMap::new();
    row1.insert("header1".to_string(), "value1".to_string());
    row1.insert("header2".to_string(), "value2".to_string());
    writer.writerow(row1)?;

    let mut row2 = HashMap::new();
    row2.insert("header1".to_string(), "value3".to_string());
    row2.insert("header2".to_string(), "value4".to_string());
    writer.writerow(row2)?;

    Ok(())
}
Writerows
The `writerows` method allows you to write multiple rows at once.
use csvkit::{
    reader::ReaderOptions, // Import necessary for options configuration
    writer::{DictWriter, WriterOptions},
};
use std::collections::HashMap;
use std::fs::File;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::create("output.csv")?;
    let fieldnames = vec!["header1".to_string(), "header2".to_string()];
    let options = WriterOptions::default();
    let mut writer = DictWriter::new(file, fieldnames, options);
    writer.writeheader()?;

    // Collect all rows first, then write them in a single call.
    let mut rows: Vec<HashMap<String, String>> = Vec::new();

    let mut row1 = HashMap::new();
    row1.insert("header1".to_string(), "value1".to_string());
    row1.insert("header2".to_string(), "value2".to_string());
    rows.push(row1);

    let mut row2 = HashMap::new();
    row2.insert("header1".to_string(), "value3".to_string());
    row2.insert("header2".to_string(), "value4".to_string());
    rows.push(row2);

    writer.writerows(rows)?;
    Ok(())
}
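A `HashMap` has no defined iteration order, so a dict-style writer has to use the field names to decide column order. The following standard-library sketch shows one way that could work. The function names `write_row_in_order` and `write_rows_in_order` are hypothetical, this is not csvkit's code, and it assumes values need no quoting or escaping.

```rust
use std::collections::HashMap;
use std::io::{self, Write};

/// Hypothetical helper (not part of csvkit): writes one row with its values
/// arranged in the order given by `fieldnames`.
/// Assumption: values need no quoting or escaping.
pub fn write_row_in_order<W: Write>(
    out: &mut W,
    fieldnames: &[String],
    row: &HashMap<String, String>,
    delimiter: char,
) -> io::Result<()> {
    let mut fields: Vec<&str> = Vec::with_capacity(fieldnames.len());
    for name in fieldnames {
        // Look up each column by name; a missing key is reported as an error.
        let value = row.get(name).ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::InvalidInput,
                format!("missing field: {}", name),
            )
        })?;
        fields.push(value.as_str());
    }
    let sep = delimiter.to_string();
    writeln!(out, "{}", fields.join(sep.as_str()))
}

/// Hypothetical counterpart for several rows: stops at the first error.
pub fn write_rows_in_order<W: Write>(
    out: &mut W,
    fieldnames: &[String],
    rows: &[HashMap<String, String>],
    delimiter: char,
) -> io::Result<()> {
    for row in rows {
        write_row_in_order(out, fieldnames, row, delimiter)?;
    }
    Ok(())
}
```

Any `Write` implementor can serve as the output, so passing a `Vec<u8>` instead of a `File` is a convenient way to inspect the generated text in tests.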
Options
You can control the CSV processing behavior using the `ReaderOptions` and `WriterOptions` structs.
- `delimiter`: The field delimiter (default: `,`)
- `quotechar`: The quote character (default: `"`)
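As an illustration of how these two settings typically interact, the sketch below defines a hypothetical `SketchOptions` struct with the same defaults and a `quote_field` helper that follows the common RFC 4180 convention of doubling embedded quote characters. The field types and visibility are assumptions. The actual definitions of `ReaderOptions` and `WriterOptions` in csvkit may differ.

```rust
/// Hypothetical options struct mirroring the two settings listed above.
/// csvkit's real ReaderOptions/WriterOptions may use different types.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchOptions {
    pub delimiter: char,
    pub quotechar: char,
}

impl Default for SketchOptions {
    fn default() -> Self {
        // Same defaults as documented: comma delimiter, double-quote quotechar.
        Self { delimiter: ',', quotechar: '"' }
    }
}

/// Quotes a field only when it contains the delimiter, the quote character,
/// or a line break; embedded quote characters are doubled.
pub fn quote_field(field: &str, opts: &SketchOptions) -> String {
    let needs_quoting = field.contains(opts.delimiter)
        || field.contains(opts.quotechar)
        || field.contains('\n')
        || field.contains('\r');
    if !needs_quoting {
        return field.to_string();
    }
    let q = opts.quotechar;
    let mut quoted = String::with_capacity(field.len() + 2);
    quoted.push(q);
    for c in field.chars() {
        if c == q {
            quoted.push(q); // double the embedded quote character
        }
        quoted.push(c);
    }
    quoted.push(q);
    quoted
}
```

Overriding a single setting with struct update syntax, for example `SketchOptions { delimiter: '\t', ..SketchOptions::default() }`, keeps the remaining defaults. Whether csvkit's option structs expose public fields that allow this is not confirmed here.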