28 releases (13 breaking)
0.14.0 | Dec 18, 2024 |
---|---|
0.13.0 | Nov 2, 2024 |
0.12.0 | Sep 19, 2024 |
0.11.0 | Jun 28, 2024 |
0.3.5 | Dec 23, 2023 |
#49 in Database implementations
290KB
2K
SLoC
dfsql
- Revision: the standalone
count
command is replaced withlen
, so make sure to replace(count)
andcol "count"
withlen
andcol "len"
respectively.- the unary
count <col>
command is unaffected.
- the unary
Install
cargo install dfsql
How to run
dfsql --input your.csv --output a-new.csv
# ...or
dfsql -i your.csv -o a-new.csv
REPL
exit
/quit
: exit the REPL loop.exit
undo
: undo the previous successful operation.undo
reset
: reset all the changes and go back to the original data frame.reset
schema
: show column names and types of the data frame.schema
save
: save the current data frame to a file.save a-new.csv
Statements
select
select <expr>*
select last_name first_name
- Select columns "last_name" and "first_name" and collect them into a data frame.
- Group by
group (<col> | <var>)* agg <expr>*
group first_name agg (count)
- Group the data frame by column "first_name" and then aggregate each group with the count of the members.
filter
filter <expr>
filter first_name = "John"
limit
limit <int>
limit 5
reverse
reverse
sort
sort ((asc | desc | ()) <col>)*
sort icpsr_id
use
use <var>
use other
- Switch to the data frame called
other
.
- Switch to the data frame called
- join
(left | right | inner | full) join <var> on <col> <col>?
left join other on id ID
- left join the data frame called
other
on my columnid
and its columnID
- left join the data frame called
Expressions
col
: reference to a column.col : (<str> | <var>) -> <expr>
select col first_name
exclude
: remove columns from the data frame.exclude : <expr>* -> <expr>
select exclude last_name first_name
- literal: literal values like
42
,"John"
,1.0
, andnull
. - binary operations
select a * b
- Calculate the product of columns "a" and "b" and collect the result.
- unary operations
select -a
select sum a
- Sum all values in column "a" and collect the scalar result.
alias
: assign a name to a column.alias : (<col> | <var>) <expr> -> <expr>
select alias product a * b
- Assign the name "product" to the product and collect the new column.
- conditional
<conditional> : if <expr> then <expr> (if <expr> then <expr>)* otherwise <expr> -> <expr>
select if class = 0 then "A" if class = 1 then "B" else null
cast
: cast a column to either typestr
,int
, orfloat
.cast : <type> <expr> -> <expr>
select cast str id
- Cast the column "id" to type
str
and collect the result.
- Cast the column "id" to type
Dependencies
~27–58MB
~1M SLoC