#command-line #sorting #command-line-tool #join #terminal #text-file #terminal-text

bin+lib cdx

Library and application for text file manipulation and command line data mining, a little like the GNU textutils

22 releases

0.1.21 Jun 14, 2022
0.1.20 Jun 1, 2022
0.1.19 Apr 7, 2022
0.1.18 Feb 20, 2022
0.1.1 Oct 20, 2021

#1625 in Development tools

MIT/Apache

530KB
15K SLoC

Command line data mining refers to using text files and shell scripts to do data manipulation, rather than database systems and complex formats. As it turns out, this is not only simple and powerful, but more performant in many situations.

Command line data mining tools have a dual nature. They are excellent for large production systems, where terabytes of data are constantly being processed by robust fault tolerant scripts. They also excel at ad-hoc exploration of data from a live command line.

The command line tool cdx is a set of tools reminiscent of the old GNU textutils (cut, sort, join et al.) but far more flexible and powerful. The tools are documented at https://avjewe.github.io/cdxdoc.
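To make the comparison with join concrete, here is a minimal sketch, in plain Rust with only the standard library, of the kind of operation such a tool performs on tab-delimited text. It does not use the cdx library or reproduce its behavior; the function name `join_on_first_column` and the sample data are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of cdx): inner-join two sets of
/// tab-delimited lines on their first column.
fn join_on_first_column(left: &[&str], right: &[&str]) -> Vec<String> {
    // Index the right-hand lines by key, i.e. the text before the first tab.
    let mut index: HashMap<&str, Vec<&str>> = HashMap::new();
    for &line in right {
        let (key, rest) = line.split_once('\t').unwrap_or((line, ""));
        index.entry(key).or_default().push(rest);
    }

    // For every left-hand line, emit one output row per matching right-hand line.
    let mut out = Vec::new();
    for &line in left {
        let (key, _) = line.split_once('\t').unwrap_or((line, ""));
        if let Some(matches) = index.get(key) {
            for rest in matches {
                out.push(format!("{line}\t{rest}"));
            }
        }
    }
    out
}

fn main() {
    // Made-up sample data, for illustration only.
    let users = ["1\talice", "2\tbob", "3\tcarol"];
    let orders = ["1\tbook", "3\tpen", "3\tink"];
    for row in join_on_first_column(&users, &orders) {
        println!("{row}");
    }
}
```

This sketch uses a hash index, so the inputs do not have to be sorted, at the cost of holding one input in memory. The classic join instead expects both inputs sorted on the key and streams through them.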

The “main” program for each tool is quite simple: it parses command line arguments and assembles functionality from a broad toolbox. If no tool quite meets your needs, it should be straightforward to roll your own.
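As a rough illustration of that shape (a thin main that parses arguments and delegates to reusable helpers), here is a minimal sketch of a cut-like tool using only the Rust standard library. It is not built on the cdx library, whose actual API is described in its own documentation; the helper names `parse_columns` and `select_columns` are hypothetical.

```rust
use std::env;
use std::io::{self, BufRead, Write};

/// Hypothetical helper: parse a comma-separated list of 1-based column
/// numbers such as "1,3" into 0-based indices.
fn parse_columns(spec: &str) -> Result<Vec<usize>, String> {
    spec.split(',')
        .map(|s| match s.trim().parse::<usize>() {
            Ok(n) if n >= 1 => Ok(n - 1),
            _ => Err(format!("bad column number: {s:?}")),
        })
        .collect()
}

/// Hypothetical helper: keep only the requested fields of one
/// tab-delimited line. Missing fields become empty strings.
fn select_columns(line: &str, columns: &[usize]) -> String {
    let fields: Vec<&str> = line.split('\t').collect();
    columns
        .iter()
        .map(|&i| fields.get(i).copied().unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\t")
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The "main" stays thin: parse arguments, then hand off to the helpers.
    // With no argument given, default to the first column.
    let spec = env::args().nth(1).unwrap_or_else(|| "1".to_string());
    let columns = parse_columns(&spec)?;

    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut out = stdout.lock();
    for line in stdin.lock().lines() {
        writeln!(out, "{}", select_columns(&line?, &columns))?;
    }
    Ok(())
}
```

Compiled to a binary, it would be used as a filter, reading tab-delimited lines on standard input and taking the column list as its first argument.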


lib.rs:

The command line tool cdx is a set of tools for text file manipulation and command line data mining. It is hoped that the associated library will be useful for third-party tools.

Dependencies

~33–49MB
~792K SLoC