3 unstable releases

Version | Released |
---|---|
0.2.6 | Feb 2, 2023 |
0.2.0 | Jul 4, 2021 |
0.1.1 | Jun 15, 2021 |
zet: Take the union, intersection, etc of files
This is a command-line utility for doing set operations on files considered as
sets of lines. For instance, `zet union x y z` outputs the lines that occur in
any of `x`, `y`, or `z`, and `zet intersect x y z` those that occur in all of them.
Here are the subcommands of `zet` and what they do:
- `zet union x y z` outputs the lines that occur in any of `x`, `y`, or `z`.
- `zet intersect x y z` outputs the lines that occur in all of `x`, `y`, and `z`.
- `zet diff x y z` outputs the lines that occur in `x` but not in `y` or `z`.
- `zet single x y z` outputs the lines that occur in exactly one of `x`, `y`, or `z`.
- `zet multiple x y z` outputs the lines that occur in two or more of `x`, `y`, and `z`.
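The subcommand semantics above can be sketched in a few lines of Python. This is only an illustration of the definitions, not zet's implementation: the function names mirror the subcommands, each "file" is assumed to be a list of lines already stripped of their terminators, and the helper `_file_counts` is hypothetical.

```python
def _file_counts(files):
    # Map each line to the number of files it occurs in.
    # Python dicts keep insertion order, so keys stay in first-seen order.
    counts = {}
    for lines in files:
        for line in dict.fromkeys(lines):  # count a line once per file
            counts[line] = counts.get(line, 0) + 1
    return counts


def union(files):
    # Lines that occur in any file.
    return list(_file_counts(files))


def intersect(files):
    # Lines of the first file that occur in every other file.
    rest = [set(lines) for lines in files[1:]]
    return [l for l in dict.fromkeys(files[0]) if all(l in s for s in rest)]


def diff(files):
    # Lines of the first file that occur in none of the other files.
    rest = [set(lines) for lines in files[1:]]
    return [l for l in dict.fromkeys(files[0]) if not any(l in s for s in rest)]


def single(files):
    # Lines that occur in exactly one file.
    return [l for l, n in _file_counts(files).items() if n == 1]


def multiple(files):
    # Lines that occur in two or more files.
    return [l for l, n in _file_counts(files).items() if n >= 2]
```

With `x = ["a", "b", "a", "c"]`, `y = ["b", "d"]`, and `z = ["b", "c", "e"]`, `union([x, y, z])` gives `["a", "b", "c", "d", "e"]`, `intersect` gives `["b"]`, `diff` gives `["a"]`, `single` gives `["a", "d", "e"]`, and `multiple` gives `["b", "c"]`. Note that the repeated `"a"` in `x` still counts as occurring in only one file.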
Notes
- Each output line occurs only once, because we're treating the files as sets and the lines as their elements.
- We do take the file structure into account in one respect: the lines are
output in the same order as they are encountered. So `zet union x` prints out
the lines of `x`, in order, with duplicates removed.
- Zet translates UTF-16LE and UTF-16BE files to UTF-8, and ignores Byte Order Marks (BOMs) when comparing lines. It prepends a BOM to its output if and only if its first file argument begins with a BOM.
- Zet ignores all line endings (`\r\n` or `\n`) when comparing lines, so two input lines compare the same if their only difference is that one ends in `\r\n` and the other in `\n`. Zet ends each output line with `\r\n` if the first line of its first file argument ends in `\r\n`, and with `\n` otherwise (that is, if the first line ends in `\n`, or the first file has only one line and that line has no line terminator).
- Zet reads its entire first input file into memory. Its memory usage is
closely proportional to the size of its first input (`zet intersect` and
`zet diff`) or the larger of the size of its first input and the size of its
output (`zet union`, `zet single`, and `zet multiple`).
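The line-ending and BOM rules in the notes above can be illustrated with a small Python sketch for the single-file case (`zet union x`). It is a simplified model under stated assumptions, not zet's code: input is assumed to be UTF-8 bytes (the UTF-16 translation is left out), and the names `split_bom`, `split_lines`, `output_terminator`, and `dedupe` are made up for this example.

```python
BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark


def split_bom(data):
    # Report whether the data starts with a BOM, and return the rest.
    if data.startswith(BOM):
        return True, data[len(BOM):]
    return False, data


def split_lines(body):
    # Split on LF, then drop one trailing CR per line, so that lines
    # ending in CRLF and lines ending in LF compare equal.
    pieces = body.split(b"\n")
    if pieces[-1] == b"":  # input ended with a terminator, or was empty
        pieces.pop()
    return [p[:-1] if p.endswith(b"\r") else p for p in pieces]


def output_terminator(body):
    # CRLF if the first line ends in CRLF; LF otherwise, including the
    # case of a single line with no terminator at all.
    newline_at = body.find(b"\n")
    if newline_at > 0 and body[newline_at - 1:newline_at] == b"\r":
        return b"\r\n"
    return b"\n"


def dedupe(data):
    # Model of the one-file union: unique lines in first-seen order,
    # a uniform terminator, and a BOM only if the input had one.
    had_bom, body = split_bom(data)
    terminator = output_terminator(body)
    unique = dict.fromkeys(split_lines(body))
    out = b"".join(line + terminator for line in unique)
    return BOM + out if had_bom else out
```

For instance, `dedupe(b"a\r\nb\na\nb\r\n")` returns `b"a\r\nb\r\n"`: the mixed endings do not make `a` or `b` count as distinct lines, and every output line takes the `\r\n` ending of the first input line.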
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.