dsc is a CLI tool for finding and removing duplicate files on one or more file systems while respecting your .gitignore rules.

0.1.3 Jan 16, 2021
0.1.2 Jan 16, 2021
0.1.1 Dec 31, 2020
0.1.0 Dec 30, 2020

dsc is a command line tool for locating duplicate files. The project is heavily inspired by the fd tool.



To get a quick overview of how much data is duplicated, run dsc cmp in the folder you want to check.

➜ dsc cmp
Duplicate data        : 1.54GB
Total duplicates      : 2,479
Total duplicate files : 5,164
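For readers curious how duplicate detection of this kind can work, here is a minimal Python sketch. It is not dsc's implementation: the function names `find_duplicates` and `file_digest` are hypothetical, the size-then-SHA-256 strategy is a common approach chosen for illustration, and unlike dsc it does not read .gitignore rules.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, min_size=1):
    """Return groups (sorted lists of paths) of files with identical content.

    Hypothetical helper, not part of dsc. Files are grouped by size first,
    so only files that share a size are ever hashed.
    """
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size >= min_size:
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Grouping by size before hashing keeps the expensive step (reading file contents) limited to files that could plausibly match, which is also why a minimum-size filter reduces the work further.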

To speed this up, use dsc cmp --estimate, which gives a rough estimate of the duplicate data instead of an exact figure.

➜ dsc cmp --estimate --min-size 500KiB ~/git
Duplicate data        : 1.07GB
Total duplicates      : 234
Total duplicate files : 309


For a more detailed overview, use dsc report, which lists the duplicate files in CSV (default) or JSON format.

➜ dsc report --min-size 500KiB ~/Downloads
0,0,0,"/home/user/Downloads/talon-linux (1)/talon/resources/python/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so",291681296
0,2,0,"/home/user/Downloads/talon-linux (2)/talon/resources/python/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so",291681296
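The CSV output can be post-processed with a short script. The sketch below assumes, based only on the sample rows above, that the last two fields of each row are the file path and its size in bytes. The meaning of the first three fields is not described here, so the sketch ignores them. `read_report` is a hypothetical helper name, not part of dsc.

```python
import csv
import io


def read_report(text):
    """Parse CSV report text into a list of (path, size_in_bytes) tuples.

    Assumption: the last two fields of each row are the path and the size.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        rows.append((record[-2], int(record[-1])))
    return rows


if __name__ == "__main__":
    import sys

    # Read a saved report from standard input and list the largest files first.
    for path, size in sorted(read_report(sys.stdin.read()), key=lambda r: -r[1]):
        print(f"{size:>15,d}  {path}")
```

Under these assumptions, the report can be saved with ordinary shell redirection and then fed to the script on standard input.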


Use dsc link to reclaim disk space by creating hard links between duplicate files on the same device.

➜ dsc link ~/Downloads
Are you sure you want to link 5,164 files? [y/N]: y
Done. Reclaimed 1.54GB of disk space.

To see what will happen before running link, run dsc link --dry-run.

➜ dsc link --dry-run ~/Downloads
(dryrun) Are you sure you want to link 5,164 files? [y/N]: y
linking "/home/user/Downloads/bloomrpc-1.3.1-x86_64(1).AppImage" => "/home/user/Downloads/bloomrpc-1.3.1-x86_64.AppImage"
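Hard linking saves space because both file names end up pointing at the same data on disk, and it only works within a single file system. The Python sketch below illustrates the general technique under stated assumptions. It is not dsc's code: `link_duplicate` and the temporary file suffix are hypothetical, and the caller is expected to have verified that the two files have identical content.

```python
import os


def link_duplicate(original, duplicate):
    """Replace `duplicate` with a hard link to `original`.

    Returns False, changing nothing, if the files live on different devices
    or are already the same file. Hypothetical helper, not part of dsc.
    """
    src = os.stat(original)
    dup = os.stat(duplicate)
    if src.st_dev != dup.st_dev:
        return False  # hard links cannot cross file systems
    if src.st_ino == dup.st_ino:
        return False  # already linked, nothing to reclaim

    # Create the link under a temporary name, then rename it over the
    # duplicate, so the duplicate's path never disappears part-way through.
    temp_path = duplicate + ".link-sketch-tmp"  # hypothetical suffix
    os.link(original, temp_path)
    os.replace(temp_path, duplicate)
    return True
```

After a successful call, editing the file through either name changes both, since they are the same file. That is worth keeping in mind before linking files you expect to modify independently.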


Full help is available by typing dsc help <command>. All commands can be listed by typing dsc.


