#plot #data-points #data #graph #cli

app plotrs

CLI app for plotting data points from a csv and writing a png to disk

1 unstable release

0.1.3 Apr 28, 2022

#174 in Visualization

MIT/Apache

540KB
3K SLoC

plotrs

A CLI app for plotting csv data sets onto a graph. It works by reading a graph definition from a .ron file, then extracts data from one or more csv files and produces a .png image. Currently only scatter graphs are supported.

Back in the mists of time I used to use GNU Octave for plotting data about plasmonic absorption and photovoltaic-thermoelectric currents. As part of my Rust journey I thought I'd try writing a program for plotting data points in a similar style.

Features

  • Overlay best fit curves onto your graph
  • Graph element/component positions and sizes are dynamically calculated based on the size of the image you want
  • Multiple colours and symbols can be used to plot data sets
  • Data can be sourced from one or more csv files - you're simply targeting certain columns in a given file for extraction
  • Error bars - plot uncertainty in x and y singly or jointly
  • Appropriate quadrants are drawn if your data makes use of negative x-y values

Install

cargo install plotrs

How To Use

Create a .ron file containing the configuration of your desired chart and generate a png with:

plotrs -g <graph_type> -c <path_to_config_ron_file> -o <dir_for_output_png>

E.g.

plotrs -g scatter -c scatter_config.ron -o here/please

Note that if your canvas is too small then your title and axis labels may become blurry.

Graph .ron Schemas

Scatter Definition

Scatter(
	title: "Energy against Time for Fuzzing About Things",
	canvas_pixel_size: (840, 600),
	x_axis_label: "Time (ms)",
	x_axis_resolution: 11, // Number of times the x-axis will be divided to show your data scale
	y_axis_label: "Energy (kJ)",
	y_axis_resolution: 11, // Number of times the y-axis will be divided to show your data scale
	has_grid: false, // Should the graph have a light grey background grid
	has_legend: false, // should a legend be generated? Only really useful with multiple data sets
	// data sets can be sourced from the same csv or from different ones and each can be configured with different colours/symbols
	data_sets: [
		DataSet(
			data_path: "scatter.csv",
			has_headers: true, // if your data has headers set to `true` so they can be ignored
			x_axis_csv_column: 0, // which column contains the x values
			x_axis_error_bar_csv_column: None, // which column contains x uncertainty Some(usize) or None
			y_axis_csv_column: 1, // which column contains the y values
			y_axis_error_bar_csv_column: None, // which column contains y uncertainty Some(usize) or None
			name: "Very interesting", // legend will indicate which colour and symbol correspond to which data set
			colour: Orange, // the colour to render a data point
			symbol: Cross, // the shape a plotted data point should take
			symbol_radius: 5, // The size of a drawn symbol in (1 + symbol_radius) pixels
			symbol_thickness: 0, // The thickness of a drawn symbol in (1 + symbol_thickness) pixels
			best_fit: None, // A curve to fit to the axes. Some(BestFit) or None
		),
	],
)

Where your csv data may look like (note the lack of whitespace between columns!):

x,y
0.5,0.5
1.0,1.0
1.5,1.5
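
To show why the whitespace matters, here is a rough standard-library sketch of pulling an x/y column pair out of csv text. The function `extract_points` is hypothetical and is not how plotrs itself reads files; it only mirrors the `has_headers`, `x_axis_csv_column` and `y_axis_csv_column` settings from the schema above.

```rust
/// Hypothetical helper (not part of plotrs): extract an (x, y) column pair from csv text.
fn extract_points(
    csv_text: &str,
    has_headers: bool,
    x_col: usize,
    y_col: usize,
) -> Result<Vec<(f32, f32)>, String> {
    let mut points = Vec::new();
    // Skip the first line when it holds column names rather than data
    let skip = if has_headers { 1 } else { 0 };
    for (line_no, line) in csv_text.lines().enumerate().skip(skip) {
        if line.trim().is_empty() {
            continue;
        }
        let fields: Vec<&str> = line.split(',').collect();
        // Parse one column of the current line, reporting where a failure happened
        let parse = |col: usize| -> Result<f32, String> {
            fields
                .get(col)
                .ok_or_else(|| format!("line {}: no column {}", line_no + 1, col))?
                .parse::<f32>()
                .map_err(|e| format!("line {}, column {}: {}", line_no + 1, col, e))
        };
        points.push((parse(x_col)?, parse(y_col)?));
    }
    Ok(points)
}
```

In this sketch a row such as `1.0, 1.0` fails because Rust's `f32` parser does not accept the leading space in ` 1.0`.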

In a directory you may have:

- my_config.ron
- data.csv

So to generate a png you'd run plotrs -g scatter -c my_config.ron from within the directory and it'll write a png next to the files.

Symbol Types/Colours

The following symbols can be used for plotting data points:

  • Cross
  • Circle
  • Triangle
  • Square
  • Point

With the following colours:

  • White
  • Black
  • Grey
  • Orange
  • Red
  • Blue
  • Green
  • Pink

Best Fit Schemas

Each data set definition can also specify a Best Fit line to be drawn. In the examples below the data sets are tiny and the symbols are coloured white to hide them in the background canvas. They really just define the extent of the axes to showcase overlaying a Best Fit.

Linear

y = gradient * x + y_intercept

Some(Linear(gradient: 1.0, y_intercept: 0.0, colour: Black))

Quadratic

y = intercept + (linear_coeff * x) + (quadratic_coeff * x.powf(2))

Some(Quadratic(intercept: 1.0, linear_coeff: 0.0, quadratic_coeff: 1.0, colour: Black))

Cubic

y = intercept + (linear_coeff * x) + (quadratic_coeff * x.powf(2)) + (cubic_coeff * x.powf(3))

Some(Cubic(intercept: 1.0, linear_coeff: -0.5, quadratic_coeff: 1.0, cubic_coeff: 1.0, colour: Black))

Generic Polynomial

For custom polynomials you supply a map of coefficients where each key is the nth power x will be raised to and the value is the coefficient it'll be multiplied by.

Roughly:

for (k, v) in coefficients.iter() {
	y += v * x.powf(k);
}

The following extends the Cubic best fit into a Quartic Polynomial:

Some(GenericPolynomial(coefficients: {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: -1.0}, colour: Black))

Which to the human eye kinda looks like: 1 + x + x^2 + x^3 - x^4.
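
A self-contained Rust sketch of that evaluation, for checking a coefficient map by hand before plotting. The function name `eval_polynomial` and the map type (`BTreeMap<u32, f32>`) are assumptions for illustration, not the types plotrs uses internally.

```rust
use std::collections::BTreeMap;

/// Sketch only: sum coefficient * x^power over every entry in the map.
/// The key type (u32) and value type (f32) are assumptions.
fn eval_polynomial(coefficients: &BTreeMap<u32, f32>, x: f32) -> f32 {
    coefficients
        .iter()
        .map(|(power, coeff)| coeff * x.powi(*power as i32))
        .sum()
}
```

With the quartic above, x = 2 gives 1 + 2 + 4 + 8 - 16 = -1.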

Exponential

y = (constant * base.powf(power * x)) + vertical_shift;

Some(Exponential(constant: 0.5, base: 2.7, power: -1.0, vertical_shift: 3.0, colour: Black))

Gaussian

y = (variance * (2.0 * PI).sqrt()).powf(-1.0) * E.powf(-(x - expected_value).powf(2.0) / (2.0 * variance.powf(2.0)))

Some(Gaussian(expected_value: 0.0, variance: 0.3, colour: Black))
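
A minimal Rust sketch of the formula above, written as a standalone function rather than taken from plotrs itself (the name `gaussian` is made up here). Note that in this formula the `variance` parameter plays the role of the standard deviation, so it sets the width of the curve directly.

```rust
use std::f32::consts::PI;

/// Sketch of the Gaussian best fit formula; not the plotrs implementation.
fn gaussian(x: f32, expected_value: f32, variance: f32) -> f32 {
    // Height of the peak: 1 / (variance * sqrt(2 * pi))
    let normalisation = 1.0 / (variance * (2.0 * PI).sqrt());
    // Fall-off either side of the expected value
    let exponent = -(x - expected_value).powi(2) / (2.0 * variance.powi(2));
    normalisation * exponent.exp()
}
```

The curve peaks at x = expected_value and is symmetric around it, which is handy when choosing data points to set the extent of the axes.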

Sinusoidal

y = amplitude * ((period * x) + phase_shift).sin() + vertical_shift;

Some(Sine(amplitude: 2.0, period: 1.0, phase_shift: 0.0, vertical_shift: 3.0, colour: Black))

Cosinusoidal

y = amplitude * ((period * x) + phase_shift).cos() + vertical_shift;

Some(Cosine(amplitude: 2.0, period: 1.0, phase_shift: 0.0, vertical_shift: 3.0, colour: Black))
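
The Sine and Cosine formulas can be tried out with a short Rust sketch like the one below. The function names are hypothetical and this is not the plotrs source. One thing the formulas imply: `period` multiplies x, so the curve repeats every 2π / period along the x-axis.

```rust
use std::f32::consts::PI;

/// Sketch of the Sine best fit formula.
fn sine_fit(x: f32, amplitude: f32, period: f32, phase_shift: f32, vertical_shift: f32) -> f32 {
    amplitude * ((period * x) + phase_shift).sin() + vertical_shift
}

/// Sketch of the Cosine best fit formula.
fn cosine_fit(x: f32, amplitude: f32, period: f32, phase_shift: f32, vertical_shift: f32) -> f32 {
    amplitude * ((period * x) + phase_shift).cos() + vertical_shift
}

/// Distance along the x-axis after which either curve repeats.
fn repeat_length(period: f32) -> f32 {
    2.0 * PI / period
}
```

With the example values (amplitude 2.0, vertical_shift 3.0) both curves stay between 1.0 and 5.0, so the data set defining the axes should cover at least that y range.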

Examples

Simple Scatter

[image: simple scatter graph]

Image Size Scales Elements Dynamically

Based on the dimensions of your image (canvas_pixel_size) the text and axes positions are automatically calculated. You can also toggle a light grey background grid drawn from the axis scales.

[image: elements scaled to the image size]

Scatter Multidata

From single or multiple csv files you can plot several data sets onto a single graph. Each data set can be configured to plot with a different colour and/or symbol. The legend can be toggled on and off. The size and thickness of the symbols are configurable on a per data set basis.

From a single csv containing multiple columns for different data sets:

[image: multiple data sets from a single csv]

From two csv files where each contains a column pair:

[image: data sets from two csv files]

Scatter Error Bars

You can also indicate uncertainty with error bars, which can be specified for either axis or both.

[images: scatter graphs with error bars]

Quadrants Derived From Data

Based on the range of values across a given number of data sets the cartesian quadrants required are determined during execution, with scale markings and axis labels moved appropriately.
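
As a rough illustration of the idea, the Rust sketch below works out which quadrants a set of points would need from the signs of their x and y values. It is an assumption about the approach, not the plotrs source: the function `quadrants_needed` is hypothetical, zero is treated as positive, and quadrants are numbered 1 to 4 anticlockwise from the top right.

```rust
/// Hypothetical sketch: decide which cartesian quadrants are needed for the data.
fn quadrants_needed(points: &[(f32, f32)]) -> Vec<u8> {
    // Which sides of each axis does the data reach?
    let has_pos_x = points.iter().any(|&(x, _)| x >= 0.0);
    let has_neg_x = points.iter().any(|&(x, _)| x < 0.0);
    let has_pos_y = points.iter().any(|&(_, y)| y >= 0.0);
    let has_neg_y = points.iter().any(|&(_, y)| y < 0.0);

    let mut quadrants = Vec::new();
    if has_pos_x && has_pos_y {
        quadrants.push(1); // top right
    }
    if has_neg_x && has_pos_y {
        quadrants.push(2); // top left
    }
    if has_neg_x && has_neg_y {
        quadrants.push(3); // bottom left
    }
    if has_pos_x && has_neg_y {
        quadrants.push(4); // bottom right
    }
    quadrants
}
```

Under these assumptions, data with only positive values needs a single quadrant, while data spanning negative and positive values on both axes needs all four.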

[images: graphs drawn with different quadrant layouts]

Troubleshooting

The numbers along the axis are long floats overlapping one another

Try changing the x and y axis resolutions to numbers which are a factor or multiple of your largest values + 10%. Under the hood, the largest values in your data set are found and scaled up slightly so that data points avoid being plotted directly on an axis and thus obscuring some text/markers. When an axis is drawn it has a certain length in pixels and the resolution decides how many times it gets chopped up to display scale markers. To map a data value (f32) to a pixel (u32) there is a conversion where a single pixel represents some amount of data value. For an awkward resolution the length between two scale markers could be a long float rather than a rounded whole number.

E.g. if the largest x value in your data is 10 try setting the x_axis_resolution to 10 * 1.1 = 11, which should produce 11 nice scale markers with whole numbers. Likewise a resolution of 22 would also produce nice markers, as 11 fits into 22 snugly.
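
The arithmetic can be checked with a small Rust sketch. It assumes the axis runs from zero to the largest value plus 10% and that markers are evenly spaced; the function `scale_markers` is hypothetical, and the real drawing code may round or position things differently.

```rust
/// Hypothetical sketch: the values shown at each scale marker along a positive axis.
fn scale_markers(largest_value: f32, resolution: u32) -> Vec<f32> {
    // Assumption: the axis extends 10% beyond the largest data value
    let axis_max = largest_value * 1.1;
    // The resolution decides how many equal pieces the axis is chopped into
    let step = axis_max / resolution as f32;
    (1..=resolution).map(|i| i as f32 * step).collect()
}
```

Here `scale_markers(10.0, 11)` steps in whole numbers up to 11, whereas `scale_markers(10.0, 7)` steps by 11 / 7, which is the sort of long float that ends up overlapping along the axis.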

The title/axis labels/legend are blurry

Try increasing the size of your canvas if the edges of the text become blurry.

Contributing

  • If you're unsure about something raise an issue first
  • Fork it
  • Tippy tap your keyboard
  • Submit a PR

LICENSE

Dual license of MIT and Apache.

TODO

  • Show BestFit types in legend
  • Allow overriding font
  • checked sub and addition to ensure pixel u32s are not overflowing maybe?
  • Split/simplify drawing methods out and then add a billion tests, many around position calculations
  • What methods/modules can be reused to draw other graph types...
