3 releases (breaking)
Version | Released |
---|---|
0.3.0 (new) | Feb 16, 2025 |
0.2.0 | Feb 12, 2025 |
0.1.3 | Feb 5, 2025 |
0.1.2 | |
parquet_to_excel
A Rust tool that converts parquet files to Excel (xlsx) or csv files with constant memory usage. Both a single parquet file and a folder of parquet files are supported.
You can call it from Rust or Python. The Python package is also named parquet_to_excel and can be installed with pip install parquet_to_excel. If the installation fails, install Rust and maturin (pip install maturin) first, then try again.
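The install steps above can be checked from Python before running a conversion. The following is a minimal sketch using only the standard library; the helper name install_hint is hypothetical and not part of parquet_to_excel, and the suggested commands are the ones given above.

```python
import importlib.util


def install_hint(package="parquet_to_excel"):
    """Return None if `package` is importable, else the suggested install steps.

    Hypothetical helper for illustration; it only inspects the import system
    and does not run pip itself.
    """
    if importlib.util.find_spec(package) is not None:
        return None
    return (
        f"pip install {package}\n"
        "# if that fails, install Rust first, then:\n"
        "pip install maturin\n"
        f"pip install {package}"
    )


hint = install_hint()
print("parquet_to_excel is installed" if hint is None else hint)
```
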
Functions
- parquet_file_to_csv: convert a single parquet file to a csv file
- parquet_files_to_csv: convert a folder of parquet files to a csv file
- parquet_file_to_xlsx: convert a single parquet file to an excel file
- parquet_files_to_xlsx: convert a folder of parquet files to an excel file
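To illustrate what merging many inputs into one csv with constant memory can look like, here is a conceptual sketch in plain Python. It is not the crate's implementation: the batches stand in for chunks read from one or more parquet files, and write_batches_to_csv and demo_batches are hypothetical names. Only one batch is held in memory at a time, the header is written once, and column names are replaced by labels from a mapping like the one used in the examples in this README.

```python
import csv
import io


def write_batches_to_csv(batches, out, header_labels=None):
    """Stream row batches into one csv stream, writing the header once.

    batches: iterable of lists of dicts (one dict per row).
    header_labels: optional {column_name: display_label} mapping.
    Returns the number of data rows written.
    """
    header_labels = header_labels or {}
    writer = csv.writer(out)
    columns = None
    rows_written = 0
    for batch in batches:  # only the current batch is in memory
        for row in batch:
            if columns is None:
                # Take the column order from the first row and write labels.
                columns = list(row.keys())
                writer.writerow([header_labels.get(c, c) for c in columns])
            writer.writerow([row.get(c, "") for c in columns])
            rows_written += 1
    return rows_written


def demo_batches():
    """Stand-in for chunks coming from two parquet files."""
    yield [{"gsmc": "A", "col2": 1}]
    yield [{"gsmc": "B", "col2": 2}, {"gsmc": "C", "col2": 3}]


buf = io.StringIO()
count = write_batches_to_csv(
    demo_batches(), buf, {"gsmc": "公司名称", "col2": "Column 2"}
)
print(count)
print(buf.getvalue())
```

Because the input is a generator, the total number of rows never has to fit in memory; the output file simply grows as batches arrive.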
Rust Examples
- parquet to csv
use std::collections::HashMap;
use parquet_to_excel::csv::{file_to_csv, folder_to_csv};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // map parquet column names to the header labels written to the output
    let mut headerlabels = HashMap::new();
    headerlabels.insert("gsmc".to_string(), "公司名称".to_string());
    headerlabels.insert("col2".to_string(), "Column 2".to_string());
    // parquet file to csv
    let source = r"D:\Projects\RustTool\data\.duck\csv_test\source=csv_export.xlsx\data.parquet";
    let writer = r"data\test.csv";
    file_to_csv(source, writer, headerlabels.clone())?;
    // parquet folder to csv
    let source = r"D:\Projects\RustTool\data\.duck\csv_test";
    let writer = r"data\test1.csv";
    folder_to_csv(source, writer, headerlabels)?;
    Ok(())
}
- parquet to xlsx
use std::collections::HashMap;
use parquet_to_excel::xlsx::{file_to_xlsx, folder_to_xlsx};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // map parquet column names to the header labels written to the output
    let mut headerlabels = HashMap::new();
    headerlabels.insert("gsmc".to_string(), "公司名称".to_string());
    headerlabels.insert("col2".to_string(), "Column 2".to_string());
    // parquet file to xlsx
    let source = r"D:\Projects\RustTool\data\.duck\csv_test\source=csv_export.xlsx\data.parquet";
    let writer = r"data\test.xlsx";
    file_to_xlsx(source, writer, Some("data".into()), None, headerlabels.clone())?;
    // parquet folder to xlsx
    let source = r"D:\Projects\RustTool\data\.duck\csv_test";
    let writer = r"data\test1.xlsx";
    folder_to_xlsx(source, writer, None, Some("gsmc".into()), headerlabels)?;
    Ok(())
}
Python Example
- parquet to xlsx
from parquet_to_excel import parquet_file_to_xlsx, parquet_files_to_xlsx
# the last three arguments are optional
parquet_file_to_xlsx(r"data\result\qid=160\a.parquet", r"out1.xlsx", "data", "", {"ddbm": "地点编码"})
parquet_files_to_xlsx(r"data\result\qid=160", r"out2.xlsx", "", "scfs", {"ddbm": "地点编码"})
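Since the single-file and folder functions take the same arguments in the example above, a small wrapper can choose between them based on the source path. This is a sketch only: pick_xlsx_function and to_xlsx are hypothetical helpers, not part of the package, and the optional arguments are forwarded unchanged because their exact meaning is not documented here.

```python
import os


def pick_xlsx_function(source):
    """Return the name of the package function that matches `source`.

    A directory maps to the folder variant, anything else to the
    single-file variant.
    """
    if os.path.isdir(source):
        return "parquet_files_to_xlsx"
    return "parquet_file_to_xlsx"


def to_xlsx(source, destination, *optional_args):
    """Convert a parquet file or folder to xlsx.

    optional_args are passed through as-is, in the same order as in the
    README example (assumed to be accepted positionally, as shown there).
    """
    # Imported lazily so pick_xlsx_function works without the package installed.
    import parquet_to_excel

    convert = getattr(parquet_to_excel, pick_xlsx_function(source))
    return convert(source, destination, *optional_args)
```

With the package installed, a call such as to_xlsx(r"data\result\qid=160", r"out2.xlsx", "", "scfs", {"ddbm": "地点编码"}) would dispatch to parquet_files_to_xlsx when that path is a folder.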