Backing Up $HOME Using rsync

Why should I back up?

As is clearly outlined in the FAQ, SDF backs up system files only. SDF does NOT maintain backups of userspace. If something stored in your home directory is important to you, then protecting it is your responsibility. Paragraph two of the SDF AUP also states that taking care of your files is your responsibility.

Requirements

To mirror your SDF home directory to another location you will need a computer with ssh and rsync installed and enough space to store your SDF files. The instructions below were run on an x86 machine running Linux, but should also be suitable for Mac OS X and for Windows running Cygwin. Please read the ssh and rsync documentation on those platforms for any differences.

Why rsync?

Rsync allows you to synchronize your data on SDF to an area on another computer. Rsync copies only new files and files that have changed, which saves bandwidth for both parties and greatly speeds up the operation. Rsync is actively developed, open source, and very flexible. It is possible to create a variety of different backups using rsync.

Basic rsync backup

Only a portion of rsync's options and capabilities will be discussed here. See the rsync man page for further details.

You may want to add the switch "-n" to your rsync commands while setting things up and testing. This option causes rsync to do a "dry run", executing the command and producing output without actually manipulating files.
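A dry run can be rehearsed locally before pointing rsync at SDF. The sketch below assumes rsync is installed on your machine; the throwaway directory from mktemp and the file names in it are made up for illustration.

```shell
#!/bin/sh
# Local rehearsal of a dry run; every path here is a throwaway example.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
echo "hello" > "$demo/src/notes.txt"

# -n: report what would be copied, but change nothing
rsync -avn "$demo/src/" "$demo/dst/"

# The destination is still empty after the dry run
ls -A "$demo/dst"
```

Remove the "-n" to perform the copy for real, and delete the demo directory when you are done.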

You will be using rsync on a machine to CONNECT TO SDF. What follows is NOT designed to be run from SDF. On the machine you wish to back up your SDF home directory to, issue the following:

"rsync -avz -e ssh username@sdf.lonestar.org:/path/to/home /local/backup/dir"

What does this mean?

  • "-a" : archive mode. This preserves permissions, timestamps, owner and group information, mirrors recursively, and copies symlinks as links. If you'd like to copy the link's referent, you will need to add the "-L" switch.
  • "-v" : verbose mode. The names of files being copied will be sent to STDOUT along with other information.
  • "-z" : compress data being transmitted.
  • "-e" : specify the remote shell to be used. This is being used here to rsync over a secure shell connection.
  • username : your SDF username.
  • "/path/to/home" : The path to the directory you wish to back up.
  • "/local/backup/dir" : Where you would like the files to be stored locally.

You will be prompted for your SDF password. Upon completion, rsync will report a few statistics to you, such as the amount of data transferred and your average throughput. When you run this command subsequently, rsync will only back up new and changed files.

Restoring

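Restoring a single file is as easy as using "sftp" or "scp" to transfer the file from your backup directory back to SDF. To restore your entire directory, issue the rsync command you used for backing up your SDF data with the source and destination reversed. Check the paths when you do: rsync treats a trailing slash on the source specially (see "Trailing directory slashes" below), so a plain swap of the two arguments can nest the restored files one directory too deep.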

Caveats and extras

Trailing directory slashes

Most often, the trailing slash when specifying a directory is optional. To rsync, however, the trailing slash has meaning when specifying the source directory. Omitting the trailing slash causes the source directory itself (the last component of its path) to be created inside the destination. Specifying the trailing slash copies only the contents of the source directory.

For example:

"rsync -avz -e ssh username@sdf.lonestar.org:/path/to/home /local/backup/dir"

The above would produce "/local/backup/dir/home" on the destination machine. Adding the trailing slash on the source directory ("/path/to/home/") would simply place the contents of the directory directly in "/local/backup/dir".
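The effect of the trailing slash is easy to check on local directories, where rsync follows the same rule. The sketch below assumes rsync is installed; the temporary directory and the names "without" and "with" are illustrative only.

```shell
#!/bin/sh
# Compare a source given without and with a trailing slash.
demo=$(mktemp -d)
mkdir -p "$demo/path/to/home" "$demo/without" "$demo/with"
echo "data" > "$demo/path/to/home/file.txt"

# No trailing slash: the directory "home" itself is created in the destination
rsync -a "$demo/path/to/home" "$demo/without"

# Trailing slash: only the contents of "home" are copied
rsync -a "$demo/path/to/home/" "$demo/with"

# List the resulting files in both destinations
find "$demo/without" "$demo/with" -type f
```

The first destination ends up with "home/file.txt", the second with "file.txt" at its top level.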

The "--delete" option

Rsync's default behavior is to copy and update files from the source to the destination. If a file is removed from the source directory, it is retained at the destination. Adding the "--delete" option causes files removed from the source directory to also be removed from the destination.
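The difference can be seen with two local destinations, one updated without the option and one with it. This is a sketch assuming rsync is installed; the directory names "keep" and "mirror" are invented for the example.

```shell
#!/bin/sh
# Show what happens to a file that disappears from the source.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/keep" "$demo/mirror"
echo "a" > "$demo/src/a.txt"
echo "b" > "$demo/src/b.txt"

# First backup: both destinations receive a.txt and b.txt
rsync -a "$demo/src/" "$demo/keep/"
rsync -a "$demo/src/" "$demo/mirror/"

# Remove a file from the source, then back up again
rm "$demo/src/b.txt"
rsync -a "$demo/src/" "$demo/keep/"             # b.txt is retained
rsync -a --delete "$demo/src/" "$demo/mirror/"  # b.txt is removed

ls "$demo/keep" "$demo/mirror"
```

Because this option removes files from your backup, it is worth combining it with "-n" the first time to see what would be deleted.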

As stated above, the default behavior of rsync's "-a" option is to copy symlinks as links. On SDF, $HOME/html and $HOME/gopher are links and will not be backed up unless you either specify the full path to those directories as separate backups or add the "-L" option after the "-a" option to your rsync backup.

Rsync and Cron

Backups work best if they are done regularly. A good way to make sure you don't forget is to have your computer remember for you by placing your rsync command in a cron job. Because a password is involved, this is not completely straightforward and some caution is required. For a very good first read on the subject, see Troy Johnson's writeup.
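One common way around the password prompt is SSH key authentication. The wrapper below is only a sketch under that assumption, and it is not necessarily the method described in the writeup mentioned above. The script name, key path, schedule, and log path are all hypothetical, and the key is assumed to be already authorized for your SDF account.

```shell
#!/bin/sh
# sdf-backup.sh (hypothetical name): wrapper intended to be called from cron.
# Assumption: a dedicated SSH key with no passphrase is already set up.
KEY="$HOME/.ssh/sdf_backup_key"                  # hypothetical key path
SRC="username@sdf.lonestar.org:/path/to/home/"   # trailing slash: contents only
DEST="/local/backup/dir"

sdf_backup() {
    # BatchMode makes ssh fail instead of waiting for a password prompt
    rsync -az -e "ssh -i $KEY -o BatchMode=yes" "$SRC" "$DEST"
}

# Example crontab entry (daily at 03:30), paths hypothetical:
# 30 3 * * * /path/to/sdf-backup.sh run >> /path/to/sdf-backup.log 2>&1
if [ "${1:-}" = "run" ]; then
    sdf_backup
fi
```

A key without a passphrase gives access to your account to anyone who obtains it, so keep it readable only by you and consider restricting what it may do on the server side.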

$Id: rsync-backup.html,v 1.3 2007/04/26 18:11:08 avoyager Exp $