2 unstable releases
0.2.0 | Feb 12, 2023 |
---|---|
0.1.0 | Feb 11, 2023 |
#2213 in Encoding
Used in pretty-csv
28KB
654 lines
CsvStream
Deserialize CSVs, one record at a time.
Usage
use csvstream::{ByteRecord, read_csv};
let mut csv = &b"1,2\n3,4"[..];
let records = read_csv(&mut csv).collect::<Vec<ByteRecord>>();
assert_eq!(records.len(), 2);
assert_eq!(&records[0][0], b"1");
assert_eq!(&records[0][1], b"2");
assert_eq!(&records[1][0], b"3");
assert_eq!(&records[1][1], b"4");
See the docs for more info.
CSV input is assumed to follow RFC 4180, which is the closest thing to a standard that CSV has.
RFC 4180 TL;DR: A CSV is a sequence of records separated by newlines. Records contain fields, and fields should be separated by commas. If a field needs to contain a comma, double quote, or newline character, wrap that field in double quotes. Within quoted fields, use two double quotes to represent a literal double quote:
// basic normal fields
field 1,field 2,field 3
// Sparse fields
field 1,,field 3,""
// quoted fields. The values of the fields are: `field "1"`, `field "2"`, `field "3"`
"field ""1""","field ""2""", "field ""3"""
// quote fields with newlines in them
"this field
spans two lines", "this one does not"
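The quoting rules above can be sketched as a small state machine. This is only an illustration built on the standard library: `parse_record` is a hypothetical helper, not csvstream's implementation, and it handles a single record without trimming whitespace or rejecting malformed input.

```rust
/// Split one CSV record into fields, following the RFC 4180 quoting rules.
/// Hypothetical helper for illustration; not part of csvstream.
fn parse_record(input: &str) -> Vec<String> {
    let mut fields = Vec::new();
    let mut field = String::new();
    let mut in_quotes = false;
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        match (c, in_quotes) {
            // Inside quotes, `""` stands for one literal double quote.
            ('"', true) if chars.peek() == Some(&'"') => {
                chars.next();
                field.push('"');
            }
            // A lone double quote opens or closes a quoted field.
            ('"', _) => in_quotes = !in_quotes,
            // An unquoted comma ends the current field.
            (',', false) => fields.push(std::mem::take(&mut field)),
            // Everything else, including quoted commas and newlines, is data.
            (other, _) => field.push(other),
        }
    }
    // The last field has no trailing comma, so push it explicitly.
    fields.push(field);
    fields
}
```

Because commas and newlines inside quotes fall through to the last arm, a quoted field can span lines or contain separators without ending the field.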
The RFC says nothing about the following common cases:
// empty records
field 1, field 2

field 5, field 6
// whitespace before/after a field
field 1, field2
field 3 ,field 4
Both cases are considered valid by the parser, and you'll get what you would expect.
CSV headers are not treated specially; they are parsed like any other record.
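Since the header arrives as an ordinary record, separating it from the data is left to the caller. A minimal sketch, assuming the records come from a regular Rust iterator (as the `.collect()` call in the usage example suggests); `split_header` is a hypothetical helper, not something the crate provides.

```rust
/// Take the first record as the header and return it together with an
/// iterator over the remaining records.
/// Hypothetical helper; works with any iterator of records.
fn split_header<I: Iterator>(mut records: I) -> (Option<I::Item>, I) {
    // `next()` yields `None` for empty input, so there may be no header.
    let header = records.next();
    (header, records)
}
```

The remaining iterator is still lazy, so the data records are only read as they are consumed.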
Example
You can deserialize from anything that implements the BufRead trait:
use csvstream::{ByteRecord, read_csv};
let mut csv = &b"1,2\n3,4"[..];
let records = read_csv(&mut csv).collect::<Vec<ByteRecord>>();
assert_eq!(records.len(), 2);
assert_eq!(&records[0][0], b"1");
assert_eq!(&records[0][1], b"2");
assert_eq!(&records[1][0], b"3");
assert_eq!(&records[1][1], b"4");
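In the standard library, byte slices and `std::io::Cursor` already implement `BufRead`, while sources such as `File` or `TcpStream` only implement `Read` and need to be wrapped in a `BufReader` first. The sketch below shows that pattern with std only; `count_lines` and `count_lines_unbuffered` are hypothetical stand-ins for any function that takes a `BufRead`, not csvstream functions.

```rust
use std::io::{BufRead, BufReader, Read};

/// Count the lines available from any buffered reader.
/// Hypothetical stand-in for a function that accepts `BufRead`.
fn count_lines<R: BufRead>(source: R) -> usize {
    source.lines().count()
}

/// Accept a plain `Read` source (for example a `File`) by wrapping it
/// in a `BufReader`, which provides the `BufRead` implementation.
fn count_lines_unbuffered<R: Read>(source: R) -> usize {
    count_lines(BufReader::new(source))
}
```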
This crate also offers a helper function for writing CSV. It's not really serialization, because the input is already a string to begin with, but it does help with escaping fields properly where necessary.
You can write to anything that implements the Write trait:
use csvstream::write_csv_record;
use std::str::from_utf8;
let mut output: Vec<u8> = vec![];
write_csv_record(["hello", "world"], &mut output).unwrap();
assert_eq!("hello,world\n", from_utf8(&output).unwrap());
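To illustrate what "escaping fields properly" can mean under RFC 4180, here is a rough std-only sketch. `escape_field` and `write_record` are hypothetical names, and the exact quoting and line-ending choices are assumptions for illustration rather than a description of how `write_csv_record` behaves internally.

```rust
use std::io::{self, Write};

/// Quote a field only when it contains a comma, double quote, CR, or LF,
/// doubling any embedded double quotes. Hypothetical helper.
fn escape_field(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Write one comma-separated record, followed by a newline, to any
/// `Write` sink such as a `Vec<u8>`, a `File`, or stdout.
fn write_record<W: Write>(fields: &[&str], out: &mut W) -> io::Result<()> {
    let line = fields
        .iter()
        .map(|f| escape_field(f))
        .collect::<Vec<_>>()
        .join(",");
    writeln!(out, "{}", line)
}
```

Quoting only when needed keeps simple fields readable while still round-tripping fields that contain separators or quotes.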