Get a full local backup of your remote web server with some basic command-line interaction and rsync. Bonus for OS X users: clickable icon backup goodness.
This is one of those handy tricks I discovered way too late, which some of you may not already know.
Problem: You have a web server located somewhere not physically close to you. You use FTP to send and receive files. You’re generally okay with this setup, except for one little chink in the armor: backup. Even if you don’t run remote scripts that generate files on the server (I’m looking at Movable Type here), files you never remember to back up, sooner or later your local copy and the server will fall out of sync.
Solution: How about a way to back up a perfect copy of the remote server, incrementally, so that each new update only downloads the files that have changed (and not the whole multi-gigabyte site)? It’s as great as it sounds.
Caveat: Though your local computer can run any OS, this only works if the server itself is Unix-based, and you have shell access. If your site runs on IIS, it won’t work. If your host doesn’t provide you with a shell account, it won’t work. In theory, your shell login should be the same as your FTP account, but not necessarily. You may want to get in touch with your host to verify your settings.
Warning: The most important things you should pay attention to are the various path settings. If you get them wrong, and somehow end up moving files to or from the wrong spot, data could become corrupted awfully quickly. The first time you run this, make sure you also have an alternate method of recovering from data loss. Just in case.
To pull this off, we need to dip into some Unix hackery, which is a bit scary for those of us used to the cushy buttons and checkboxes of a GUI. If you’re on OS X or Linux, you’ve already got everything you need. Open up the Terminal in the former, or the command line in the latter. (If you’re using Linux, presumably you already know how to get a command line and I don’t have to explain this further, not that I could anyway.)
If you’re on Windows, you’re going to need some extra software, namely something called an “rsync client”. Though it’s probably overkill, grab Cygwin for now — which is a command line environment that comes with a set of powerful tools, all very much like what you get in a Unix-based OS — and you’ll get rsync with it. Install, then run Cygwin and you should be taken to a Unix-like command line.
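Whichever OS you’re on, it may be worth confirming that both tools can be reached from your prompt before going further. A minimal sketch, assuming a Bourne-style shell; the helper name have is made up for this example and simply wraps the standard command -v lookup:

```shell
# "have" is a hypothetical helper name, not a standard command.
# It succeeds if the named program can be found on the PATH.
have() { command -v "$1" >/dev/null 2>&1; }

for tool in rsync ssh; do
  if have "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

If either one is reported missing, install it (through the Cygwin installer on Windows, for instance) before continuing.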
Finding Your Local Backup Directory
So we should all be on the same page at this point, with a command prompt greeting us. If you already know how to get to your backup directory on the command line, skip ahead to the header “Running rsync”.
Now we want to find the directory that will house our backed-up site. This can be anywhere on your local system, and getting to it is going to depend largely on your computer’s configuration. In my case, I have a partition on my hard drive called ‘Shine’, which is mounted as a separate volume. This is equivalent to calling a partition the G: drive in Windows. So let’s begin at the root (otherwise known as /) of our system by issuing the “change directory” command: cd /.
We can take a look at what’s in the root by issuing the “list” command: ls.
Where exactly to go from here depends on your OS. On a Mac, partitions are mounted under /Volumes. Under Cygwin on Windows, drives typically appear under /cygdrive (so C: is /cygdrive/c), and the User Guide should help you figure out the rest. So if we’re working on a Mac, let’s change the directory with cd /Volumes and take a look at what’s in it by running ls again.
On my system we see two volumes, Sparkle and Shine, which correspond with my local partitions. I’m going to skip the ensuing directory drill-down to find my ultimate destination, but by continuing to use cd and ls to navigate your file system, find the directory you’ll be storing your backup in. (You can either create it ahead of time with the file manager in your OS, or use the Unix mkdir command once you’re in the parent directory.) Your prompt will likely show the current path; if not, you can display it by invoking the pwd command.
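As a rough sketch of that last step, assuming you have already navigated to the parent directory and want a folder named mb-backup (the name is only an example; use your own):

```shell
# Run this from the parent directory you reached with cd and ls.
mkdir -p mb-backup   # -p: create it, and don't complain if it already exists
cd mb-backup         # step into the new backup directory
pwd                  # print the full path of where we are now
```

The path printed by pwd is the one to note down; it is the absolute path you can use later in place of the period.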
Running rsync
Now we’re ready. I’ll cut to the chase and just show you right now what you’re going to be typing (more or less), and explain it afterward:
rsync -aze ssh username@22.214.171.124:/home/username/public_html/ .
Let’s break it down piece-by-piece.
rsync - the program name itself; this just causes it to run.
-aze - these are three options we’re specifying. a sets archive mode, which recurses into subdirectories and preserves things like permissions, timestamps and symbolic links. z compresses file data to speed up the transfer. e specifies the remote shell program used to connect to the server, and takes the next word on the command line as its value. There are more options available, but these are the essential ones for what we’re trying to accomplish.
ssh - ssh, or secure shell, is a method of securely connecting to a remote server. The previous e option told rsync that we wanted to do so, and ssh is the protocol we’re going to use to do it.
username@ - this is your username on the remote server. Again, this may be similar to your FTP program’s login, or it may not. You’ll want to contact your host if you don’t know what your shell login is.
22.214.171.124 - this is the IP address of your web server. You likely won’t be able to just enter yourdomain.com here, so using your IP address is the best bet. However, that’s a pain when you don’t have a static IP, so alternatively this can also be the name of your host’s server. I can use aristotle.multipattern.com in place of an IP address, for example.
:/home/username/public_html/ - this is the full server path to the root of the directory you want to back up. Note the preceding colon; it is what separates the IP address from the server path. The trailing slash matters too: it tells rsync to copy the contents of public_html rather than the directory itself. By full server path, I mean you need to know where your site sits within the filesystem of the remote server. You might be able to find this with your FTP program by continuing to navigate up in the hierarchy until you can go no further; then simply chain together the resulting directories you navigated through until you get a full path back down the hierarchy to your web site’s root. Otherwise, you may need to contact your host for the full path.
. - and finally, an important trailing space followed by a single period. This indicates the current local path, which is where we navigated to earlier. Alternatively you could skip the initial step of finding this on the command line and use an absolute path here instead of a period, e.g. /Volumes/Shine/Personal/mb-backup.
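To see how the pieces above fit together, here is a small sketch that assembles them from variables and prints the resulting command rather than running it. The variable names are invented for this illustration, and the values are just the placeholders used in the breakdown:

```shell
# All values are the placeholders from the walkthrough; substitute your own.
REMOTE_USER="username"
REMOTE_HOST="22.214.171.124"
REMOTE_PATH="/home/username/public_html/"
LOCAL_DIR="."   # the current directory

# user@host:path is how rsync addresses a directory on a remote machine.
SOURCE="${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}"

# Print the command instead of running it, so it can be checked first.
echo rsync -aze ssh "$SOURCE" "$LOCAL_DIR"
```

Once the printed command looks right, remove the echo to run it for real. If you are nervous about the paths, rsync’s -n option (as in rsync -naze ssh ...) performs a dry run that reports what would be transferred without copying anything.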
At this point, if you have the correct data entered, you should be ready to go. Hit return, and if the server is found, it will prompt you for your password. Enter it, then wait. The first sync will take quite a while.
If everything is working properly, it will appear that nothing is happening; when rsync has finished synchronizing, the command prompt will simply pop up again with no message one way or another, and you’ll be able to view the results by issuing an ls command. If you don’t see your entire remote server’s contents now on your local hard drive, something has gone wrong. (For some reason on OS X, I get a message informing me that “stdin: is not a tty”. It doesn’t seem to affect the backup though, and everything else runs as expected.)
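If you would like to watch rsync’s incremental behaviour before pointing it at a real server, a local rehearsal is one option. This is only a sketch: two throwaway directories stand in for the server and the backup, no ssh is involved, and the file names are made up.

```shell
# Throwaway directories standing in for the server and the local backup.
SRC=$(mktemp -d "${TMPDIR:-/tmp}/rsync-src.XXXXXX")
DST=$(mktemp -d "${TMPDIR:-/tmp}/rsync-dst.XXXXXX")
echo "first" > "$SRC/index.html"

if command -v rsync >/dev/null 2>&1; then
  rsync -a "$SRC/" "$DST/"            # initial copy: everything is transferred
  echo "second" > "$SRC/about.html"   # simulate a new file on the "server"
  rsync -av "$SRC/" "$DST/"           # -v lists what is transferred this time
  ls "$DST"                           # both files should now be present
else
  echo "rsync is not installed"
fi
```

Adding -v to the real command works the same way, and gives you something to look at during that long first sync.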
Aliasing your Backup
That’s about it if you don’t mind entering the command manually every time you want to back up. But you can also create an alias or a shell script for the entire command that will make life a little easier. In this case, make sure to use the full absolute path on your local machine instead of the period, so that the scripts are callable from anywhere.
Aliasing involves opening up your shell user profile. There are a bunch of different Unix shells, bash being one of the more common. Each will have its own profile naming scheme. In bash, this is .bash_profile, and creating an alias means adding a line like this with your own settings: (make sure it’s all on one line)
alias backup='rsync -aze ssh username@22.214.171.124:/home/username/public_html/ /Volumes/Shine/Personal/mb-backup'
The user profile file itself is stored in your home directory, which is most likely the directory that loads when you first open up the command line — if not, you can get to it with the command cd ~. It may be difficult to open a file with a preceding period in Windows (if Cygwin even uses this format); unfortunately I can’t really be of much more help here, so the User Manual is once again your friend.
Assuming you’ve managed to create the alias, you can now invoke the backup simply by typing backup on the command line.
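A shell function is a slightly more flexible alternative to an alias, since it can pass extra options through to rsync. A sketch, assuming bash and the same placeholder settings as above; the name backup_site is made up, and deliberately differs from the alias name so the two don’t collide in the same profile:

```shell
# Goes in ~/.bash_profile, alongside or instead of the alias.
# backup_site is a hypothetical name; pick whatever you like.
backup_site() {
  # "$@" forwards any extra options, e.g. backup_site -v or backup_site -n
  rsync -az "$@" -e ssh username@22.214.171.124:/home/username/public_html/ /Volumes/Shine/Personal/mb-backup
}
```

After opening a new terminal window (so the profile is re-read), typing backup_site runs the backup, and backup_site -n does a dry run first.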
OS X Shell Script
We can take it one step further in OS X though, and create a clickable icon for the backup. This involves opening a text editor and creating a new text file, which we’ll save as a shell script. Enter the following as the contents of the file, replacing with your own settings where appropriate:
#!/bin/bash
rsync -aze ssh username@22.214.171.124:/home/username/public_html/ /Volumes/Shine/Personal/mb-backup
Everything after the first line is identical to the command line we generated earlier, and should all be on one line. Save this file wherever you want it, but make sure to give it a “.command” extension. Also very important, make sure that the line break formats are Unix, not Macintosh or DOS.
Once you have this file saved, you’ll need to make sure you have executable permissions on the file. Open up the Terminal again and find the directory you’ve saved it in, then issue this command:
chmod 744 filename.command
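If you’d rather skip the text editor entirely, the file can also be created from the Terminal, which has the side benefit of guaranteeing Unix line breaks. A minimal sketch, assuming the file name backup.command in the current directory and the same placeholder settings as the script above:

```shell
# Write the two-line script into backup.command in the current directory.
# The quoted 'EOF' keeps the shell from altering anything inside.
cat > backup.command <<'EOF'
#!/bin/bash
rsync -aze ssh username@22.214.171.124:/home/username/public_html/ /Volumes/Shine/Personal/mb-backup
EOF

chmod 744 backup.command   # owner may read, write and execute; others read only
ls -l backup.command       # the permissions column should begin with -rwxr--r--
```

Replace the placeholders with your own settings before double-clicking the result.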
The very last step will probably be necessary, depending on your system configuration. In the Finder, right-click (Ctrl-click if you have to) on the file and select “Get Info”. In the “Open with” menu, select Terminal from the list. Close the dialogue, and you’re done.
Now whenever you wish to back up your server, all you need to do is double-click the icon and enter your password. If it’s not working as expected, check out this tutorial on executable scripts for more help.
Finally, if this simple set of Unix commands is brand new to you, you may also wish to look into the ability of ssh to lock down your mail, especially if you use a wireless internet connection of any kind.
There’s gold in the Unix command line. It’s worth learning.