3 stable releases

new 1.1.0 Mar 1, 2025
1.0.1 Feb 28, 2025
1.0.0 Feb 15, 2025

#422 in Encoding

Download history 109/week @ 2025-02-12 11/week @ 2025-02-19 299/week @ 2025-02-26

419 downloads per month

MIT/Apache

32KB
737 lines

csvkit: CSV Processing Toolkit in Rust

csvkit is a CSV file processing toolkit written in Rust. It aims to provide a user-friendly interface, closely mirroring the Python csvkit API, for efficient and reliable CSV manipulation. csvkit-rs leverages Rust's performance, safety, and ease of use to provide a superior CSV processing experience.

Key Features

  • Familiar API: Designed to be as close as possible to the Python csvkit API, making it easy for Python users to transition.
  • High Performance: Utilizes Rust's performance benefits for fast and efficient CSV processing.
  • Reliability: Leverages Rust's strong type system and memory safety guarantees for robust and stable operation.
  • Modular Design: Separates Reader and Writer functionalities, enabling flexible CSV processing pipelines.

Installation

  1. Install Rust: If you haven't already, install Rust from rustup.rs.

  2. add csvkit from github:

    cargo add --git https://github.com/StatPan/csvkit
    

Usage

Reader (DictReader)

The DictReader reads CSV files and returns each row as a HashMap<String, String>.

use csvkit::{
    reader::{DictReader, ReaderOptions},
    writer::{DictWriter, WriterOptions},
};
use std::fs::File;
use std::io::BufReader;
use std::collections::HashMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("data.csv")?;
    let reader = BufReader::new(file);
    let options = ReaderOptions::default();
    let mut dict_reader = DictReader::new(reader, options)?;
    for record in dict_reader {
        let row: HashMap<String, String> = record?;
        println!("{:?}", row);
    }
    Ok(())
}
  • DictReader::new(reader, options): Creates a DictReader with a BufReader and ReaderOptions.
  • for record in dict_reader: DictReader implements the Iterator trait, allowing you to iterate over each row in the CSV file.
  • record?: Each row is returned as a Result<HashMap<String, String>, Box<dyn Error>>, so the ? operator is used for error handling.

Writer (DictWriter)

The DictWriter writes data in the form of HashMap<String, String> to a CSV file.

use csvkit::{
    reader::ReaderOptions, // Import necessary for options configuration
    writer::{DictWriter, WriterOptions},
};
use std::fs::File;
use std::collections::HashMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::create("output.csv")?;
    let fieldnames = vec!["header1".to_string(), "header2".to_string()];
    let options = WriterOptions::default();
    let mut writer = DictWriter::new(file, fieldnames, options);
    writer.writeheader()?;
    let mut row1 = HashMap::new();
    row1.insert("header1".to_string(), "value1".to_string());
    row1.insert("header2".to_string(), "value2".to_string());
    writer.writerow(row1)?;
    let mut row2 = HashMap::new();
    row2.insert("header1".to_string(), "value3".to_string());
    row2.insert("header2".to_string(), "value4".to_string());
    writer.writerow(row2)?;
    Ok(())
}

Writerows

The writerows method allows you to write multiple rows at once.

use csvkit::{
    reader::ReaderOptions, // Import necessary for options configuration
    writer::{DictWriter, WriterOptions},
};
use std::fs::File;
use std::collections::HashMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::create("output.csv")?;
    let fieldnames = vec!["header1".to_string(), "header2".to_string()];
    let options = WriterOptions::default();
    let mut writer = DictWriter::new(file, fieldnames, options);
    writer.writeheader()?;
    let mut rows: Vec<HashMap<String, String>> = Vec::new();
    let mut row1 = HashMap::new();
    row1.insert("header1".to_string(), "value1".to_string());
    row1.insert("header2".to_string(), "value2".to_string());
    rows.push(row1);
    let mut row2 = HashMap::new();
    row2.insert("header1".to_string(), "value3".to_string());
    row2.insert("header2".to_string(), "value4".to_string());
    rows.push(row2);
    writer.writerows(rows)?;
    Ok(())
}

Options

You can control the CSV processing behavior using the ReaderOptions and WriterOptions structs.

  • delimiter: The field delimiter (default: ,)
  • quotechar: The quote character (default: ")

No runtime deps