#machine-learning #glob-pattern

bin+lib reorger

A tool for reorganizing a large number of files by splitting and unsplitting files into subdirectories

7 unstable releases (3 breaking)

0.5.2 Sep 27, 2020
0.5.1 Sep 27, 2020
0.3.1 Sep 27, 2020
0.2.0 Sep 27, 2020
0.1.0 Sep 27, 2020

#301 in Machine learning

25 downloads per month

MIT license

211 lines


A small collection of tools for manipulating large collections of files. Most of these were written to help deal with image data sets for Machine Learning.

  • split - Takes a glob pattern in the current directory and creates batches of 1000 files that match the glob pattern into subdirectories.
  • unsplit - Reverts the split command by looking into all directories (with no further depth traversal) and moves them into the current directory. If a directory that is traversed ends up being empty, it will be removed.
  • sample - Copies every n^th file that matches a glob pattern from the current directory into a new directory.


~48K SLoC