bin+lib metadata-backup

Program to back up file system metadata

1 unstable release

0.1.0 Dec 25, 2019

Apache-2.0

metadata_backup: Back up your file system metadata

This project takes a location on your file system as input, recursively walks the directory structure, and saves metadata about the files it finds in a hopefully easy-to-navigate zip file. This could be useful in a number of situations, for example:

  1. You have a large amount of data backed up in a format that is not easily searchable (e.g. cold storage), and you want a smaller, local index of the files so you can know which backups to pull.

  2. You have data on your computer that you could re-create or get again from a public source, but you would like to keep a record of how you arranged it in your directory structure. For example, you have many e-books, movies, games, etc. from multiple providers organized in a specific way, but you still have access to the original sources and don't feel the need to back up the actual files.

  3. You would like to keep a log of your metadata over time, for whatever reason, at a higher frequency than you make full backups.

I am sure there are other reasons why you would want this, though I am not sure why I'm trying to convince anyone it's a useful thing. Use it if it solves a problem for you, otherwise don't.

Installation

This project is written in Rust and uses cargo to manage builds. To build it from source, install cargo, clone the project, and run cargo build --release from the repository root. You'll find the binary at target/release/metadata_backup. You can move it to somewhere on your path if desired.

Use

Once installed, run it as metadata_backup -r <filesystem_root> -o <output>. The full help output is:

metadata-backup 0.1.0

USAGE:
    metadata_backup [FLAGS] -o <output> -r <root>

FLAGS:
    -h, --help                    Prints help information
    --no-remove-on-failure        Leave the zip file in place on error
    -V, --version                 Prints version information

OPTIONS:
    -o <output>
    -r <root>

Output format

The program outputs a zip file with a directory FILESYSTEM_ROOT and a file FILE_MANIFEST at its root. The directory structure of the backup root is duplicated under FILESYSTEM_ROOT, but in each directory all files and symbolic links are replaced with a single contents.csv file, which contains information about that directory's contents. For example, backing up a directory like this:

$ tree root
root
├── a
│   └── a1.txt
├── a1.txt -> a/a1.txt
├── b
│   └── b1.txt
├── f1.txt
└── f2.txt

3 directories, 5 files

The file structure of the output would look like this:

backup
├── FILE_MANIFEST
└── FILESYSTEM_ROOT
    ├── a
    │   └── contents.csv
    ├── b
    │   └── contents.csv
    └── contents.csv
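
The mapping from source directories to zip entries can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the helper name zip_entry_names is hypothetical, and it only computes the expected entry names from a list of directory paths given relative to the backup root (with "" standing for the root itself).

```rust
/// Hypothetical helper: given directory paths relative to the backup root
/// ("" for the root itself), list the entry names expected in the zip file.
fn zip_entry_names(dirs: &[&str]) -> Vec<String> {
    // The manifest always sits at the top level of the archive.
    let mut entries = vec!["FILE_MANIFEST".to_string()];
    for dir in dirs {
        // Each directory is mirrored under FILESYSTEM_ROOT and holds a
        // single contents.csv in place of its files and symbolic links.
        let entry = if dir.is_empty() {
            "FILESYSTEM_ROOT/contents.csv".to_string()
        } else {
            format!("FILESYSTEM_ROOT/{}/contents.csv", dir)
        };
        entries.push(entry);
    }
    entries
}
```

Calling zip_entry_names(&["", "a", "b"]) for the example above yields FILE_MANIFEST plus one contents.csv entry per directory, matching the tree shown.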

contents.csv

The contents.csv file in each directory will contain as many of the following properties as are available (availability varies by platform):

  1. name: File name
  2. size: File size in bytes
  3. is_dir: Whether or not the name represents a directory
  4. atime: Access time
  5. mtime: Modified time
  6. ctime: Creation time
  7. st_mode: An integer representing the Unix mode of the file or directory
  8. st_mode_string: A string representing a human readable equivalent of st_mode, e.g. drwxr-xr-x.
  9. uid: The numeric user ID that owns the file
  10. gid: The numeric group ID that owns the file.
  11. link: Empty if the file is not a symbolic link, otherwise contains the path to the linked file, relative to the backup's root.
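
To illustrate how st_mode relates to st_mode_string, here is a minimal sketch of the conversion. It is not taken from the project's source: the function name mode_string is hypothetical, it ignores the setuid, setgid and sticky bits, and it only distinguishes directories and symbolic links from everything else.

```rust
/// Render a Unix mode as an `ls -l` style string.
/// Sketch only: setuid, setgid and sticky bits are ignored, and only
/// directories and symbolic links get a type character other than '-'.
fn mode_string(mode: u32) -> String {
    // The file type lives in the high bits selected by the 0o170000 mask.
    let file_type = match mode & 0o170000 {
        0o040000 => 'd',
        0o120000 => 'l',
        _ => '-',
    };
    let mut out = String::with_capacity(10);
    out.push(file_type);
    // Walk the nine permission bits from owner-read down to other-execute.
    for (i, ch) in "rwxrwxrwx".chars().enumerate() {
        let bit = 1u32 << (8 - i);
        out.push(if mode & bit != 0 { ch } else { '-' });
    }
    out
}
```

With this sketch, a directory with mode 0o040755 renders as drwxr-xr-x, the example given in the list above.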

FILE_MANIFEST

The FILE_MANIFEST file is intended to be an easy-to-search listing of the full path (relative to the backup root) of every file and directory listed in any of the contents.csv files in the zip file. For our example directory structure, the contents would look like this:

$ cat backup/FILE_MANIFEST
a
a/a1.txt
a1.txt
b
b/b1.txt
f1.txt
f2.txt
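
Because the manifest is one path per line, searching it needs nothing more than a line filter. The sketch below assumes the FILE_MANIFEST contents have already been read into a string (for example after extracting the zip file); the function name search_manifest is hypothetical.

```rust
/// Return every manifest line that contains `needle`.
/// Assumes `manifest` holds the text of FILE_MANIFEST, one path per line.
fn search_manifest<'a>(manifest: &'a str, needle: &str) -> Vec<&'a str> {
    manifest
        .lines()
        .filter(|line| line.contains(needle))
        .collect()
}
```

For the example manifest, searching for a1.txt returns both a/a1.txt and the top-level a1.txt.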

Errors during backup

metadata_backup skips directories that the user does not have permission to traverse, but any other error causes the backup to terminate early. By default, it also deletes the half-complete backup in the event of a failure. If you would like to leave the half-complete backup in place in the case of an error, pass the --no-remove-on-failure flag.
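
The behaviour described here can be sketched as two small pieces: a check for which errors are skippable, and a wrapper that removes the output file on failure. This is only an illustration of the pattern using the standard library. The names is_skippable and run_with_cleanup are hypothetical, and the project's real implementation may be structured differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical policy check: only permission errors are skipped,
/// every other error is treated as fatal.
fn is_skippable(err: &io::Error) -> bool {
    err.kind() == io::ErrorKind::PermissionDenied
}

/// Run `backup` against `output`. If it fails and `keep_on_failure` is
/// false, delete the partially written output before returning the error.
fn run_with_cleanup<F>(output: &Path, keep_on_failure: bool, backup: F) -> io::Result<()>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    let result = backup(output);
    if result.is_err() && !keep_on_failure && output.exists() {
        // Best effort: the original error is more useful than a cleanup error.
        let _ = fs::remove_file(output);
    }
    result
}
```

In this sketch, keep_on_failure plays the role of the --no-remove-on-failure flag.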

License

This project is licensed under Apache 2.0. All new files should include the following boilerplate:

   Copyright 2019 metadata-backup Authors (see AUTHORS.md)

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
