app frawk

an efficient Awk-like language

5 releases (3 breaking)

0.4.1 Jun 11, 2021
0.4.0 Jan 18, 2021
0.3.0 Nov 8, 2020
0.2.0 Sep 27, 2020
0.1.0 Aug 22, 2020

MIT OR Apache-2.0

frawk

frawk is a small programming language for writing short programs processing textual data. To a first approximation, it is an implementation of the AWK language; many common Awk programs produce equivalent output when passed to frawk. You might be interested in frawk if you want your scripts to handle escaped CSV/TSV like standard Awk fields, or if you want your scripts to execute faster.
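To illustrate the CSV point, here is a minimal sketch. It shows plain Awk splitting a quoted field on its embedded comma, then, only if a frawk binary is on your PATH, runs the same program through frawk. The -i csv option is how I understand frawk selects CSV input; treat it as an assumption and confirm with frawk --help. The sample row is made up for illustration.

```shell
# A CSV row whose second field contains a quoted comma (made-up sample data).
row='1,"Smith, Jane",42'

# Plain Awk splitting on commas cuts the quoted field in two,
# so "field 3" is the tail of the name rather than 42.
naive=$(printf '%s\n' "$row" | awk -F, '{ print $3 }')
echo "awk -F, sees field 3 as: $naive"

# Assumption: frawk accepts -i csv to parse escaped CSV input.
# This branch is skipped when frawk is not installed.
if command -v frawk >/dev/null 2>&1; then
    printf '%s\n' "$row" | frawk -i csv '{ print $3 }'
fi
```

With CSV-aware parsing, the quoted name should count as a single field, so the third field is the trailing number rather than a fragment of the name.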

The info subdirectory has more in-depth information on frawk:

  • Overview: what frawk is all about, how it differs from Awk.
  • Types: A quick gloss on frawk's approach to types and type inference.
  • Parallelism: An overview of frawk's parallelism support.
  • Benchmarks: A sense of the relative performance of frawk and other tools when processing large CSV or TSV files.
  • Builtin Functions Reference: A list of builtin functions implemented by frawk, including some that are new when compared with Awk.

frawk is dual-licensed under MIT or Apache 2.0.

Installation

You will need to install Rust. If you would like to use the LLVM backend, you will need an installation of LLVM 10.0 on your machine:

  • See this site for installation instructions on some Debian-based Linux distros.
  • On Arch, pacman -Sy llvm llvm-libs and a C compiler (e.g. clang) are sufficient as of September 2020.
  • brew install llvm@10 or similar seems to work on macOS.

Depending on where your package manager puts these libraries, you may need to point LLVM_SYS_100_PREFIX at the llvm library installation (e.g. /usr/lib/llvm-10).
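As a rough sketch of setting that variable, the snippet below probes a few candidate directories and exports the first one that exists. Only /usr/lib/llvm-10 comes from this README; the two Homebrew-style paths are assumptions about common install locations and may not match your machine.

```shell
# Candidate LLVM 10 locations. The first is the example from this README;
# the Homebrew-style paths are assumptions, adjust them for your system.
for dir in /usr/lib/llvm-10 /usr/local/opt/llvm@10 /opt/homebrew/opt/llvm@10; do
    if [ -d "$dir" ]; then
        export LLVM_SYS_100_PREFIX="$dir"
        break
    fi
done

# Report what was chosen (or that nothing was found) before running cargo.
echo "LLVM_SYS_100_PREFIX=${LLVM_SYS_100_PREFIX:-<unset>}"
```

If nothing is found, the variable is left untouched, and you will need to locate the LLVM installation yourself before building with the LLVM backend.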

Building Without LLVM

While the LLVM backend is recommended, it is possible to build frawk only with support for the Cranelift-based JIT and its bytecode interpreter. To do this, build without the llvm_backend feature. The Cranelift backend provides comparable performance to LLVM for smaller scripts, but LLVM's optimizations can sometimes deliver a substantial performance boost over Cranelift (see the benchmarks document for some examples of this).

Building Using Stable

frawk currently requires a nightly compiler by default. To compile frawk using stable, compile without the unstable feature. Otherwise, you need rustup default nightly, or some other way of running a nightly compiler release, to build frawk.
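One way to arrive at a stable build command is to start from the feature list and drop unstable. The sketch below only prints the resulting command rather than running it. The feature names are the ones mentioned in this README; treating them as the complete default set is an assumption, so check Cargo.toml before relying on it. The +stable toolchain selector assumes a rustup-managed install.

```shell
# Feature names mentioned in this README. Assumption: these are the defaults.
features="llvm_backend,use_jemalloc,allow_avx2,unstable"

# Drop the unstable feature and rejoin the rest with commas.
stable_features=$(printf '%s' "$features" | tr ',' '\n' | grep -v '^unstable$' | paste -sd, -)

# Dry run: show the command instead of executing it.
echo "cargo +stable install --path . --no-default-features --features $stable_features"
```

Remove llvm_backend from the list as well if you also want to build without LLVM.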

Building a Binary

With those prerequisites in place, cloning this repository and running cargo build --release or cargo [+nightly] install --path <frawk repo path> will produce a binary that you can add to your PATH if you so choose:

$ cd <frawk repo path>
# With LLVM
$ cargo +nightly install --path .
# Without LLVM, but with other recommended defaults
$ cargo +nightly install --path . --no-default-features --features use_jemalloc,allow_avx2,unstable

frawk is now on crates.io, so running cargo install frawk with the desired features should also work.
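After either install route, a quick smoke test can confirm the binary is reachable. This is a minimal sketch that assumes cargo's default install location (~/.cargo/bin) and runs a trivial BEGIN program, which needs no input file.

```shell
# Check whether a frawk binary is on PATH, and if so run a trivial program.
if command -v frawk >/dev/null 2>&1; then
    result=$(frawk 'BEGIN { print 1 + 1 }')
    echo "frawk is installed; 1 + 1 = $result"
else
    result="not installed"
    echo "frawk not found on PATH; check that ~/.cargo/bin is on PATH"
fi
```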

While there are no deliberate Unix-isms in frawk, I have not tested it on Windows.

Bugs and Feature Requests

frawk has bugs, and many rough edges. If you notice a bug in frawk, filing an issue with an explanation of how to reproduce the error would be very helpful. There are no guarantees on response time or latency for a fix. No one works on frawk full-time. The same policy holds for feature requests.
