#text-file #terminal #join #sorting #terminal-text

bin+lib cdx

Library and application for text file manipulation and command line data mining, a little like the GNU textutils

24 releases

Uses new Rust 2024

0.1.23 Mar 3, 2026
0.1.22 May 6, 2024
0.1.21 Jun 14, 2022
0.1.18 Feb 20, 2022
0.1.6 Nov 25, 2021

#434 in Development tools

MIT/Apache

600KB
16K SLoC

The command line tool cdx is a set of tools for text file manipulation and command line data mining. The associated library is intended to be useful for building third-party tools.


Command line data mining refers to using text files and shell scripts to do data manipulation, rather than database systems and complex formats. As it turns out, this is not only simple and powerful, but more performant in many situations.

Command line data mining tools have a dual nature. They are excellent for large production systems, where terabytes of data are constantly being processed by robust, fault-tolerant scripts. They also excel at ad hoc exploration of data from a live command line.

The command line tool cdx is a set of tools reminiscent of the old GNU textutils (cut, sort, join, et al.) but far more flexible and powerful. The tools are documented at https://avjewe.github.io/cdxdoc.
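To make the textutils comparison concrete, here is a minimal sketch of the kind of operation a `join`-style tool performs: a merge join of two tab-separated inputs on their first field. It uses only the Rust standard library and is not cdx's API or implementation. The names `split_key` and `join_sorted` are hypothetical, and the sketch assumes each key appears at most once per input.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: split a line into (key, rest) at the first tab.
fn split_key(line: &str) -> (&str, &str) {
    line.split_once('\t').unwrap_or((line, ""))
}

/// Hypothetical merge join of two inputs already sorted by key.
/// Assumes keys are unique within each input.
fn join_sorted(left: &[&str], right: &[&str]) -> Vec<String> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < left.len() && j < right.len() {
        let (lk, lv) = split_key(left[i]);
        let (rk, rv) = split_key(right[j]);
        match lk.cmp(rk) {
            // Advance whichever side has the smaller key.
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            // Keys match: emit key, left fields, right fields.
            Ordering::Equal => {
                out.push(format!("{lk}\t{lv}\t{rv}"));
                i += 1;
                j += 1;
            }
        }
    }
    out
}

fn main() {
    let mut users = vec!["u2\tbob", "u1\talice", "u3\tcarol"];
    let mut orders = vec!["u3\t7", "u1\t42"];
    // Sort both inputs by key before joining, as `sort` would in a pipeline.
    users.sort_by(|a, b| split_key(a).0.cmp(split_key(b).0));
    orders.sort_by(|a, b| split_key(a).0.cmp(split_key(b).0));
    for line in join_sorted(&users, &orders) {
        println!("{line}");
    }
}
```

Because both inputs are sorted, the join needs only one pass over each and holds just the current line of each side, which is part of why this style of processing scales to large files.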

The “main” program for each tool is quite simple: it parses command line arguments and assembles functionality from a broad toolbox. If no tool quite meets your needs, it should be straightforward to roll your own.
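As a rough illustration of that shape (a thin `main` that parses arguments and delegates to reusable pieces), here is a sketch of a tiny column-extraction tool. It deliberately uses only the Rust standard library, not the cdx library, whose actual API is described in the documentation linked above. The functions `cut_column` and `run` are hypothetical names chosen for this example.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical helper: the `col`-th (0-based) tab-separated field of `line`.
fn cut_column(line: &str, col: usize) -> Option<&str> {
    line.split('\t').nth(col)
}

/// Hypothetical tool body: copy one column from `input` to `output`.
/// Lines with too few fields produce an empty output line.
fn run<R: BufRead, W: Write>(input: R, mut output: W, col: usize) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        writeln!(output, "{}", cut_column(&line, col).unwrap_or(""))?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // First argument: 1-based column number, defaulting to 1.
    let col: usize = std::env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .unwrap_or(1);
    let stdin = io::stdin();
    let stdout = io::stdout();
    // Convert to a 0-based index and delegate to the reusable body.
    run(stdin.lock(), stdout.lock(), col.saturating_sub(1))
}
```

Keeping the logic in `run`, generic over any reader and writer, means it can be tested against in-memory buffers and reused by other tools, while `main` stays a few lines long.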

Dependencies

~9–23MB
~400K SLoC