Pirkle — Query CSV and SQLite with PRQL

Pirkle is a fast, lightweight command-line tool that brings the power of PRQL (Pipelined Relational Query Language) to CSV and SQLite files. Transform, filter, and join your data with expressive, readable queries that compile to optimized SQL.
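
For example, a short pipeline such as:

from employees | sort {-salary} | take 3

compiles to roughly the following SQL (a sketch; the compiler's exact formatting differs):

SELECT * FROM employees ORDER BY salary DESC LIMIT 3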

Why Pirkle?

  • 🚀 Fast: Built in Rust with optimized SQLite backend
  • 📊 Flexible: Query CSV files as if they were database tables
  • 🔗 Powerful: Join multiple files and data sources
  • 📝 Readable: PRQL's pipeline syntax is intuitive and maintainable
  • 🎯 Versatile: Multiple output formats (table, CSV, JSON Lines, logfmt)

Features

  • Query CSV files as structured tables
  • Join CSV and SQLite files together
  • Write expressive queries using PRQL
  • Output as a pretty table, CSV, JSON Lines, or logfmt
  • Inspect the generated SQL
  • View schema information for files
  • Lightweight, fast, and written in Rust

Installation

Prebuilt Binaries

Download the latest release for your platform:

Platform               Download
---------------------------------------------------------------
Windows                pirkle-x86_64-pc-windows-msvc.zip
macOS (Apple Silicon)  pirkle-aarch64-apple-darwin.tar.gz
macOS (Intel)          pirkle-x86_64-apple-darwin.tar.gz
Linux (x86_64)         pirkle-x86_64-unknown-linux-musl.tar.gz
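
After downloading, unpack the archive and put the binary somewhere on your PATH. A sketch for Linux (this assumes the archive contains a single pirkle binary at its top level):

tar -xzf pirkle-x86_64-unknown-linux-musl.tar.gz
sudo mv pirkle /usr/local/bin/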

Package Managers

# Cargo (Rust)
cargo install pirkle

# Homebrew (coming soon)
# brew install pirkle

From Source

Install using Rust:

git clone https://github.com/dloss/pirkle.git
cd pirkle
cargo install --path .
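
To confirm the build, run one of the Quick Start commands against the bundled examples from the repository checkout:

pirkle examples/employees.csv --schema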

Quick Start

# Install pirkle
cargo install pirkle

# Query a CSV file
pirkle examples/data.csv --query "from data | filter price > 100 | select {name, price}"

# View file structure
pirkle examples/data.csv --schema

Usage

Basic Queries

# Query a CSV file. CSV files are auto-loaded as SQLite tables.
$ pirkle examples/employees.csv --query "from employees | filter country == 'USA' | select {name, age}"
name            age  
---------------------
John Smith      32   
Robert Johnson  41   
James Brown     39   
# Query a SQLite file
$ pirkle examples/company.sqlite --query "from employees | select {name, age} | take 5"
name            age  
---------------------
John Smith      32   
Maria Garcia    28   
Robert Johnson  41   
Lisa Wang       35   
Ahmed Hassan    29   
# Alternative syntax using -- delimiter
$ pirkle examples/employees.csv -- "from employees | filter department == 'Engineering' | select {name, age}"
name            age  
---------------------
John Smith      32   
Robert Johnson  41   
Ahmed Hassan    29   
Sarah Kim       31   
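
Pipelines can also compute new columns with derive. A minimal sketch (the bonus column is purely illustrative, not part of the example data):

pirkle examples/employees.csv --query "from employees | derive bonus = salary * 0.1 | select {name, salary, bonus}"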

Reading from Standard Input

Pirkle supports reading CSV data from standard input, making it easy to pipe data from other commands:

# Pipe data into pirkle
$ cat examples/employees.csv | pirkle stdin --query "from stdin | filter salary > 70000"
id  name             department   age  salary  country      
------------------------------------------------------------
1   John Smith       Engineering  32   85000   USA          
3   Robert Johnson   Engineering  41   92000   USA          
5   Ahmed Hassan     Engineering  29   75000   Egypt        
8   Sarah Kim        Engineering  31   83000   South Korea  
9   James Brown      Sales        39   85000   USA          
10  Fatima Al-Farsi  Marketing    36   76000   UAE

# Use stdin with files
$ cat examples/orders.csv | pirkle stdin examples/customers.csv --query "from stdin | join customers (==customer_id)"
order_id  customer_id  amount  region  customer_id  name              region  
------------------------------------------------------------------------------
1         100          250     North   100          Acme Corp         North   
2         101          300     South   101          Globex Inc        South   
3         100          150     North   100          Acme Corp         North   
4         102          400     West    102          Initech           West    
5         103          200     East    103          Stark Industries  East

# Custom table name for stdin data
$ cat examples/employees.csv | pirkle stdin:workers --query "from workers | sort {-salary}"
id  name              department   age  salary  country      
-------------------------------------------------------------
3   Robert Johnson    Engineering  41   92000   USA          
1   John Smith        Engineering  32   85000   USA          
9   James Brown       Sales        39   85000   USA          
8   Sarah Kim         Engineering  31   83000   South Korea  
10  Fatima Al-Farsi   Marketing    36   76000   UAE          
5   Ahmed Hassan      Engineering  29   75000   Egypt        
4   Lisa Wang         Marketing    35   70000   China        
7   Carlos Rodriguez  Marketing    33   68000   Spain        
2   Maria Garcia      Sales        28   65000   Mexico       
6   Emma Wilson       Sales        27   62000   UK

Pipeline Integration

Pirkle integrates seamlessly with Unix pipelines:

# From curl/API responses
curl -s api.example.com/data.csv | pirkle stdin --query "from stdin | filter active == true"

# From other command output
cat *.csv | pirkle stdin --query "from stdin | group category (aggregate {count = count this})"

# Complex pipeline
grep "ERROR" logs.csv | pirkle stdin --query "
from stdin 
| derive hour = (timestamp | date.truncate hour)
| group hour (aggregate {error_count = count this})
| sort hour"

Key features:
  • Auto-detection: Data on stdin is loaded as a table named "stdin"
  • Explicit reference: Use the filename stdin to read from stdin
  • Custom naming: Use stdin:tablename for custom table names
  • Query from stdin: If no query is provided with --query or --, Pirkle will read the query from stdin:
    $ echo "from employees | filter country == 'USA'" | pirkle examples/employees.csv
    
  • Multiple references: Use the same stdin data with different table names
    $ cat examples/employees.csv | pirkle stdin:workers stdin:staff --query "from workers | join staff (==id)"
    

Pirkle intelligently determines how to use stdin based on your command arguments, making it a flexible tool for data pipelines.

Viewing Schema Information

To see the structure of your tables:

# View schemas with the --schema flag
$ pirkle examples/employees.csv --schema
Table: employees
Columns:
  id (INTEGER)
  name (TEXT)
  department (TEXT)
  age (INTEGER)
  salary (INTEGER)
  country (TEXT)
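
The same flag works on SQLite files; for a multi-table database it should print one Table/Columns block per table:

pirkle examples/company.sqlite --schema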

Show SQL without executing

You can use the --show-sql flag to see the SQL that would be generated without executing the query:

$ pirkle examples/employees.csv --query "from employees | filter country == 'USA'" --show-sql
SELECT
  *
FROM
  employees
WHERE
  country = 'USA'
-- Generated by PRQL compiler version:0.12.2 (https://prql-lang.org)

This also works with PRQL files:

$ pirkle examples/employees.csv --query examples/queries/avg_age_by_department.prql --show-sql
SELECT
  department_id,
  AVG(age) AS avg_age
FROM
  employees
GROUP BY
  department_id
-- Generated by PRQL compiler version:0.12.2 (https://prql-lang.org)
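
Because the output is plain SQL (the version banner is a SQL comment), you can redirect it into a file and run it with the sqlite3 CLI directly. A sketch, assuming sqlite3 is installed:

pirkle examples/employees.csv --query examples/queries/avg_age_by_department.prql --show-sql > query.sql
sqlite3 examples/company.sqlite < query.sql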

Output Formats

Format  Use Case                                Example
------------------------------------------------------------------------------------------
table   Human-readable terminal output          Data exploration
csv     Spreadsheet import, further processing  pirkle data.csv --format csv > result.csv
jsonl   API integration, log analysis           pirkle logs.csv --format jsonl | jq '.'
logfmt  Structured logging, monitoring          Integration with log aggregators

Default is a readable table format.

To output CSV:

$ pirkle examples/employees.csv --format csv --query "from employees | filter salary > 70000"
1,John Smith,Engineering,32,85000,USA
3,Robert Johnson,Engineering,41,92000,USA
5,Ahmed Hassan,Engineering,29,75000,Egypt
8,Sarah Kim,Engineering,31,83000,South Korea
9,James Brown,Sales,39,85000,USA
10,Fatima Al-Farsi,Marketing,36,76000,UAE

Other supported formats:

# JSON Lines format
$ pirkle examples/employees.csv --format jsonl --query "from employees | filter country == 'USA'"
{"age":32,"country":"USA","department":"Engineering","id":1,"name":"John Smith","salary":85000}
{"age":41,"country":"USA","department":"Engineering","id":3,"name":"Robert Johnson","salary":92000}
{"age":39,"country":"USA","department":"Sales","id":9,"name":"James Brown","salary":85000}
# logfmt format
$ pirkle examples/employees.csv --format logfmt --query "from employees | filter country == 'USA'"
id="1" name="John Smith" department="Engineering" age="32" salary="85000" country="USA"
id="3" name="Robert Johnson" department="Engineering" age="41" salary="92000" country="USA"
id="9" name="James Brown" department="Sales" age="39" salary="85000" country="USA"
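
The jsonl output pairs well with jq for quick field extraction. Piping the query above through jq -r '.name' should print just the three names:

$ pirkle examples/employees.csv --format jsonl --query "from employees | filter country == 'USA'" | jq -r '.name'
John Smith
Robert Johnson
James Brown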

Using PRQL files

You can use prewritten PRQL query files:

# Use a PRQL file directly with --query
$ pirkle examples/employees.csv --query examples/queries/top_5_paid.prql
name             department   salary
-------------------------------------
Robert Johnson   Engineering  92000
John Smith       Engineering  85000
James Brown      Sales        85000
Sarah Kim        Engineering  83000
Fatima Al-Farsi  Marketing    76000
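
Judging by the output, the query file would look something like this (a sketch, not necessarily the repository's exact contents):

from employees
sort {-salary}
take 5
select {name, department, salary}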

Joining tables

To join tables, use the join operation:

$ pirkle examples/orders.csv examples/customers.csv --query "from orders
join customers (==customer_id)
select {orders.order_id, customers.name, orders.amount}"
order_id  name              amount
-----------------------------------
1         Acme Corp         250
2         Globex Inc        300
3         Acme Corp         150
4         Initech           400
5         Stark Industries  200
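
join also accepts a side parameter, so unmatched rows can be kept. A sketch of a left join that retains orders even when no customer matches:

pirkle examples/orders.csv examples/customers.csv --query "from orders
join side:left customers (==customer_id)
select {orders.order_id, customers.name, orders.amount}"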

Common Use Cases

Data Analysis

# Find average salary and employee count by department
pirkle employees.csv --query "
from employees 
| group department (aggregate {
    avg_salary = average salary, 
    count = count this
  })
| sort {-avg_salary}
| take 5"

Data Cleaning

# Remove duplicates and filter valid records
pirkle messy_data.csv --query "
from messy_data
| filter email != null
| group email (take 1)
| select {name, email, phone}"

Joining Data Sources

# Combine sales data with customer information
pirkle sales.csv customers.csv --query "
from sales
| join customers (==customer_id)
| group customers.region (aggregate {total_sales = sum sales.amount})
| sort {-total_sales}"

Time Series Analysis

# Analyze daily sales trends
pirkle transactions.csv --query "
from transactions
| derive date = (timestamp | date.truncate day)
| group date (aggregate {
    daily_sales = sum amount,
    transaction_count = count this
  })
| sort date"

Data Exploration

# Quick summary statistics
pirkle dataset.csv --query "
from dataset
| aggregate {
    min_value = min price,
    max_value = max price,
    avg_value = average price,
    total_records = count this
  }"

Performance Tips

  • Schema inference: Pirkle automatically detects column types for optimal performance
  • Memory usage: Large CSV files are streamed efficiently through SQLite
  • Query optimization: PRQL compiles each pipeline to a single SQL statement, so SQLite's query planner optimizes even complex queries
  • File formats: SQLite files are queried directly without loading into memory
  • Early filtering: For large datasets, filter early in your pipeline to reduce processing overhead (see the sketch below)
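
A sketch of the early-filtering tip, reusing the sales and customers files from above (the 1000 threshold is illustrative): filtering before the join reduces the rows SQLite has to match:

pirkle sales.csv customers.csv --query "
from sales
| filter amount > 1000
| join customers (==customer_id)
| group customers.region (aggregate {total_sales = sum sales.amount})"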

Troubleshooting

Common Issues

File not found errors

# Ensure file paths are correct
pirkle ./data/employees.csv --schema

Query syntax errors

# Use --show-sql to debug generated SQL
pirkle data.csv --query "your query here" --show-sql

Large file performance

# For very large files, consider filtering early in the pipeline
pirkle large_file.csv --query "from large_file | filter date > @2024-01-01 | ..."

Memory issues with large datasets

# Process data in chunks or use more specific filters
pirkle huge_file.csv --query "from huge_file | filter region == 'US' | take 1000"

Example Data

Included example files:

  • examples/employees.csv: Employee data with department, salary, and country information
  • examples/departments.csv: Department names and IDs
  • examples/customers.csv, examples/orders.csv: Customer-order relationship data
  • examples/queries/*.prql: Sample PRQL queries

License

MIT
