bin+lib ream

Data language for building maintainable social science datasets

5 releases

0.4.2 May 27, 2021
0.4.0 May 21, 2021
0.3.3 May 15, 2021
0.3.2 May 11, 2021
0.3.1 Apr 27, 2021

MIT license

ream-core

REAM is a data language for building maintainable social science datasets. It encourages inline documentation for individual data points, and introduces features to reduce repetition.

The language has three main components:

  • a data serialization language for structured datasets (work in progress)
  • a data template language to generate datasets (planned)
  • a collection of filters to manipulate data (planned)

REAM compiles to both human-readable documentation (HTML, PDF, etc.) and analysis-ready datasets (CSV, JSON, etc.). Two formats, one source.

# Country
- name: Belgium
- capital: Brussels
- population: 11433256
  > data from 2019; retrieved from World Bank
- euro_zone: TRUE
  > joined in 1999

## Language
- name: Dutch
- size: 0.59

## Language
- name: French
- size: 0.4

## Language
- name: German
- size: 0.01

compiles to

Belgium,Brussels,11433256,TRUE,Dutch,0.59
Belgium,Brussels,11433256,TRUE,French,0.4
Belgium,Brussels,11433256,TRUE,German,0.01
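
The row expansion in this example can be sketched in Python. This is only an illustration of the idea (parent values are repeated once per nested entry), not a description of how ream-core is implemented. The `flatten` and `to_csv` functions and the `fields`/`children` dictionary layout are hypothetical names chosen for this sketch.

```python
import csv
import io

# Hypothetical in-memory form of the Country entry above. Each entry holds its
# own fields plus a list of nested child entries. Annotations (the ">" lines)
# are documentation only, so they are left out here.
country = {
    "fields": {"name": "Belgium", "capital": "Brussels",
               "population": 11433256, "euro_zone": "TRUE"},
    "children": [
        {"fields": {"name": "Dutch", "size": 0.59}, "children": []},
        {"fields": {"name": "French", "size": 0.4}, "children": []},
        {"fields": {"name": "German", "size": 0.01}, "children": []},
    ],
}


def flatten(entry, inherited=()):
    """Return one row per leaf entry, prefixed with all ancestor values."""
    values = tuple(inherited) + tuple(entry["fields"].values())
    if not entry["children"]:
        return [list(values)]
    rows = []
    for child in entry["children"]:
        rows.extend(flatten(child, values))
    return rows


def to_csv(rows):
    """Serialize rows as header-less CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


print(to_csv(flatten(country)), end="")
```

Under these assumptions the sketch prints the same three rows as the CSV above: the four Country values appear once per Language entry.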

The official REAM documentation provides more information on the language. The rest of the README focuses on the compiler, ream-core.

Usage

Web

Two web-based editors with ream-core embedded are available without local installation:

Commandline Tool

To install the commandline tool locally, you need Cargo. Install the tool in one of two ways:

  1. Download the latest tagged version from crates.io:
cargo install ream
  2. Compile the latest development version from source:
git clone https://github.com/chmlee/ream-core
cd ream-core && cargo build

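The ream commandline tool is now available locally. If you built from source, note that cargo build places the binary under target/debug/ rather than on your PATH.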

To compile your REAM file, execute:

ream -i <INPUT> -o <OUTPUT> -f <FORMAT> [-p]

where <INPUT> is the path to the REAM file and <OUTPUT> is the path of the output file. <FORMAT> takes one of two values: CSV or AST (abstract syntax tree). If the -p flag is present, the output is also printed to stdout.

Example:

ream -i my_data.ream -o my_data.csv -f CSV -p
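
If you drive the compiler from a script, the same invocation could be wrapped as below. This is a minimal sketch, not part of ream-core: the `build_ream_command` and `compile_ream` helpers are hypothetical, only the flags documented above are used, and the sketch assumes that ream is on your PATH and signals failure with a non-zero exit status.

```python
import subprocess


def build_ream_command(input_path, output_path, fmt="CSV", print_output=False):
    """Assemble the argument list for: ream -i <INPUT> -o <OUTPUT> -f <FORMAT> [-p]."""
    if fmt not in ("CSV", "AST"):
        raise ValueError(f"unsupported format: {fmt!r}")
    command = ["ream", "-i", str(input_path), "-o", str(output_path), "-f", fmt]
    if print_output:
        command.append("-p")
    return command


def compile_ream(input_path, output_path, fmt="CSV", print_output=False):
    """Run the compiler and return the completed process.

    Assumption: a failed compilation yields a non-zero exit status, which
    check=True turns into subprocess.CalledProcessError.
    """
    command = build_ream_command(input_path, output_path, fmt, print_output)
    return subprocess.run(command, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Show the command line that would be executed for the example above.
    print(" ".join(build_ream_command("my_data.ream", "my_data.csv", "CSV", True)))
```

Keeping the argument list in its own function makes the command easy to inspect or log before anything is executed.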

Crate

To include ream-core in your Rust project, add the following to your Cargo.toml file:

[dependencies]
ream = "0.3.1"

See docs.rs for more information.

WebAssembly

wasm-pack is required to compile ream-core to WebAssembly.

git clone https://github.com/chmlee/ream-core
cd ream-core && wasm-pack build --target web

Two functions are available in the WASM module, ream2csv and ream2ast:

import init, {ream2csv, ream2ast} from "./ream.js";

// `input` is assumed to be a string holding REAM source text
init()
  .then(() => {
    let csv = ream2csv(input);
    let ast = ream2ast(input);
  });
