This tutorial is equally applicable to Linux, Mac OS X and Windows.
Why keep backups?
Data is crucial for every organisation, but it can be lost or corrupted by computer crashes, virus infections, hard drive failures or human error. That's why it's paramount for businesses to secure their data via scheduled and audited backups. It's common practice to create multiple copies of critical data. This long-standing approach is known as the three-two-one backup rule:
- At least three copies,
- In two different formats,
- With one of those copies offsite.
Do you need an offsite location to back up your data?
Cloud Storage is our new product specifically built for the needs of Linux system administrators who are looking for a simple backup solution. It works out of the box with common Linux tools such as
SSHFS. In this tutorial, we will show you how to use ElasticHosts Cloud Storage for creating offsite backups.
If you don't already have an ElasticHosts account, you can create one here:
Offsite Backup Tutorial
Table of Contents
1/B. Windows: Installing Cygwin
- Authentication with SSH keys
- Creating backups with rsync
- Scheduling rsync backups
4/A. Linux & Mac: Setting up cron
4/B. Windows task scheduler
Scheduling offsite backups with rsync
In this tutorial, we'll be utilising the following robust utilities:
cron, as the technology behind the scheduled offsite backups.
SSH enables secure connections via a private-public key pair. The public key is placed on the systems that should allow access to the owner of the matching private key, while the private key is kept secret. OS X and most Linux distributions come with an SSH client, while Windows users need to install a third-party tool. Since we will be using Cygwin in this tutorial, we advise using the SSH client included in Cygwin (OpenSSH).
Rsync is pretty much the standard utility for creating offsite copies of that all too important data. The huge number of options does make it look fiendishly complex. Hopefully, the tutorial below will give you a firm grasp of the syntax by taking you slowly through the numerous options. Rsync does not come with Windows, but Cygwin includes it.
Cron is a time-based job scheduler in Unix-like computer operating systems. It's perfect for kicking off that nightly backup when your workstation's or server's load is low. If you are using Windows, you will need to fall back on the native task scheduler in combination with an rsync shell script to emulate cron. We'll be covering cron's syntax and usage shortly.
Note: Microsoft announced that the next major Windows 10 update (Anniversary Update) will add a native Linux environment plus packages to the Windows operating system. Windows Insider program participants can already test its beta version. This is important because Bash on Windows brings, among other tools, a native SSH client, rsync, and cron to Windows. For now, we recommend you install Cygwin. During the installation, please choose to install the rsync, OpenSSH, and Dos2Unix packages.
1/B. Windows: Installing Cygwin
You will need Cygwin to follow this tutorial.
If you don't have it yet, click here for our detailed installation guide. If you already have Cygwin, you're all set to work through the rest of the tutorial.
2. SSH Keys
We'll be using SSH keys as the authentication method for our backups. On the system that needs scheduled backups, open a terminal and enter
whoami. It will display the user you are currently working as. Linux/Mac: if you're not
root, enter the
sudo su command to become root. Then enter
cd and press enter to move to root's home directory (or your Windows user's home directory in Cygwin). Enter
pwd and press enter to print the home directory path. Please take note of this path; we'll be referring to it as ROUTE.
We will now use the
ssh-keygen command to create an SSH key pair, specifically for ElasticHosts. Enter:
ssh-keygen -t rsa -b 4096 -C "email@example.com"
When you're prompted to "Enter a file in which to save the key", enter this:
You will need the ROUTE value from earlier. Our key path looks like:
At the 'Enter passphrase' prompts, press enter twice to set no passphrase. Now you should have two ElasticHosts files in your ~/.ssh directory:
ls -l ~/.ssh/*elastic*
elastic_id_rsa
elastic_id_rsa.pub
Lock down your SSH files with the below command:
chmod 400 ~/.ssh/*elastic*
If you don't do this you will be faced with 'too open' permissions errors.
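If you would rather not answer the prompts by hand, the same key generation can be done non-interactively. The following is a minimal sketch, assuming a scratch directory instead of your real ~/.ssh folder. KEY_DIR and KEY_FILE are illustrative names, not part of the ElasticHosts setup.

```shell
# Hypothetical scratch directory so the sketch does not touch your real ~/.ssh.
KEY_DIR="$(mktemp -d)"
KEY_FILE="$KEY_DIR/elastic_id_rsa"

# -f sets the key file path, -N "" sets an empty passphrase, -q keeps it quiet.
ssh-keygen -q -t rsa -b 4096 -C "email@example.com" -f "$KEY_FILE" -N ""

# Make both files read-only for the owner, as in the chmod step above.
chmod 400 "$KEY_FILE" "$KEY_FILE.pub"
ls -l "$KEY_DIR"
```

To use this for real, point KEY_FILE at the elastic_id_rsa path inside your own .ssh directory.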
Adding your public SSH key to ElasticHosts.
Log into your ElasticHosts account, then click on your storage folder's cog symbol:
Click on the 'Track profile keys' drop-down and select 'Custom'.
Give your key a name using the 'Description' field.
Enter the below command as root, or the Windows user from earlier to print your public ElasticHosts SSH key to the screen:
Copy the whole key, from
ssh-rsa.. until you reach the end of your email address, and paste it into the
Click the 'Add Key' button at the bottom.
Save your changes and then click the 'Back to account overview' button.
Your storage folder will now accept SSH connections using your public key.
Now is a good time to test our Cloud Storage SSH authentication. Click on your folder's eye symbol to view its connection details including the 'Username' and 'SSH Hostname'.
Using a terminal as root or the Windows user, start an SSH session. We will specifically be using the keys generated earlier. Replace your Cloud Storage username and hostname in the command below, and then run it:
ssh -o IdentitiesOnly=yes -i ~/.ssh/elastic_id_rsa Username@SSH_Hostname
The output of the command looks like the below:
root@mymachine:~# ssh -i ~/.ssh/elastic_id_rsa Username@SSH_Hostname
The authenticity of host '8596fc6d-4d4b-4708-8000-0a1dcd0c8f6e.file-sto.re (126.96.36.199)' can't be established.
RSA key fingerprint is fd:38:df:b5:2e:a8:27:8a:6b:f6:36:54:27:b0:5c:7f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '8596fc6d-4d4b-4708-8000-0a1dcd0c8f6e.file-sto.re,188.8.131.52' (RSA) to the list of known hosts.
sh-4.0#
This confirms our SSH key based authentication is good. Type
exit to disconnect from your storage.
If you are using either a Linux or Mac system, there's a very good chance rsync is already installed.
Launch a terminal, type in
rsync --version and press enter. If rsync is installed, you should get some output that looks similar to the below:
root@ubuntu:~# rsync --version
rsync version 3.1.0 protocol version 31
Copyright (C) 1996-2013 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes, prealloc
rsync comes with ABSOLUTELY NO WARRANTY. This is free software, and you
are welcome to redistribute it under certain conditions. See the GNU
General Public Licence for details.
If you don't have rsync installed, please use your respective package managers (
YUM) to install it.
Rsync commands take the following format:
rsync options source destination
Rsync options alter the way rsync behaves. For example, the -n option puts rsync into dry-run mode: information about what would be transferred is printed to the screen, but no files are actually copied.
Let's look at a local example:
rsync -a /etc /tmp/
The above command will copy the
/etc directory itself, along with its contents, into the
/tmp directory, so the files end up under /tmp/etc.
If we subtly change the command (note the trailing slash on the end of '/etc/'):
rsync -a /etc/ /tmp/
The result will be different. The files and folders inside the
/etc folder will now be copied directly into the
/tmp directory, without an enclosing etc directory.
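The effect of the trailing slash is easy to try out safely on throwaway directories. This is a small sketch that assumes rsync is installed. TRAIL_DEMO and the file names are made up for illustration.

```shell
# Scratch area; nothing outside this temporary directory is touched.
TRAIL_DEMO="$(mktemp -d)"
mkdir -p "$TRAIL_DEMO/src/sub" "$TRAIL_DEMO/with_dir" "$TRAIL_DEMO/contents_only"
echo "alpha" > "$TRAIL_DEMO/src/a.txt"
echo "beta" > "$TRAIL_DEMO/src/sub/b.txt"

# No trailing slash: the src directory itself is created inside the destination.
rsync -a "$TRAIL_DEMO/src" "$TRAIL_DEMO/with_dir/"

# Trailing slash: only the contents of src are copied into the destination.
rsync -a "$TRAIL_DEMO/src/" "$TRAIL_DEMO/contents_only/"

# List the resulting files to compare the two layouts.
find "$TRAIL_DEMO/with_dir" "$TRAIL_DEMO/contents_only" -type f | sort
```

The first destination ends up with a src folder inside it, while the second receives a.txt and sub/b.txt directly.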
It's always a good idea to refer to rsync's manual page to brush up on the syntax. Yes, there are a huge number of options but we'll be distilling the relevant rsync options. Let's dive in:
-a or --archive: equates to archive mode and equals -rlptgoD (no -H,-A,-X)
As with most Linux commands, single-letter options can be grouped together, for example -rlptgoD above. Each of these letters is an individual option. Let's break it down:
- -r recurse into directories.
- -l copy symlinks as symlinks.
- -p preserve permissions.
- -t preserve modification times.
- -g preserve group.
- -o preserve owner (super-user only).
- -D same as --devices --specials.
- --devices preserve device files (super-user only).
- --specials preserve special files.
- -H preserve hard links.
- -A preserve ACLs (implies -p).
- -X preserve extended attributes.
Note: options that mention super-user only will need to be run as root.
rsync push and pull
Rsync can either push data to a remote store or pull data from a remote store. Let's look at examples of each.
rsync -a ~/dir1 username@remote_host:destination_directory
The above command will push the
dir1 directory into the
destination_directory on remote_host.
rsync -a username@remote_host:/home/username/dir1 directory_on_local_machine
The above command will pull the
dir1 directory from the remote server
remote_host into the local directory directory_on_local_machine.
Note: the trailing slash
/ behaviour described earlier applies to the above commands too.
Tips for using rsync
Useful rsync options
- --delete delete extraneous files from dest dirs.
By default, rsync does not delete anything from the destination directory. You will need to use the --delete option to keep the local and remote directories in sync.
- --protect-args no space-splitting; wildcard chars only.
If you use spaces in your file names, the --protect-args option will be applicable too. Especially useful for Windows users to ensure spaces are not split in file names and any non-wildcard special characters (such as
&, etc.) are not translated.
- -z or --compress compress file data during the transfer.
To save time and money use the -z or --compress option. This will speed up the backup and reduce bandwidth costs. The --skip-compress option should be used in tandem to skip already compressed files.
To discover more about rsync use the
man rsync command. It will display a long list of all the options. Use the up and down keyboard arrow keys to navigate the manual page. To search, type a
/ followed by the term you would like to search for and then press enter. Pressing
n on the keyboard will jump down the page highlighting successive matches. To exit the manual page, press q.
Create 7 day incremental backups
It's possible to use rsync to create remote incremental snapshots of your data. Please refer to this samba.org 7 day incremental backup guide for more details and an example script that will need to be tweaked.
Windows users will benefit from reading the below Linux & Mac guide. Skip ahead here: Windows task scheduler
4/A. Setting up cron - for Linux & Mac
Mac systems come pre-installed with cron. Linux, on the other hand, might need to have it installed.
As root, run the below command to see the list of scheduled cron jobs:
You should get output similar to the below:
crontab -l
no crontab for root
Or a number of lines of text starting with:
# Edit this file to introduce tasks to be run by cron.
Creating your rsync cron job
Our simple nightly rsync backup script could take the following format:
rsync -aH --numeric-ids --delete ~/sourceFolder Username@SSH_Hostname:/root/
To see the Username, SSH_Hostname and password, open the connection details for your Cloud Storage by clicking on the eye symbol on the control panel:
Adjust the rsync command above to your source folder and connection details, and run it as root. Now, launch a browser and visit your folder's storage web page - if you don't remember the URL and credentials, open the connection details again via the eye symbol.
Your files will be presented in the below format:
Let's add a nightly job in cron to kick off the rsync script we just used (we will time it for 01:07 am in the tutorial, but please choose another time for yourself). First, we will need to find out the file paths of rsync and ssh before composing the below cron job. Use the
which command to locate them respectively:
which ssh /usr/bin/ssh
which rsync /usr/bin/rsync
The next step is to add our cron job to the crontab, a text file that lists the scheduled commands set to run at specific times.
Below is our cron job, based on the rsync script:
07 01 * * * env -i sh -c "/usr/bin/rsync -a --delete -e '/usr/bin/ssh -o IdentitiesOnly=yes -i /root/.ssh/elastic_id_rsa -C -c blowfish' /root/sourceFolder 8596fc6d-4d4b-4708-8000-0a1dcd0c8f6e@8596fc6d-4d4b-4708-8000-0a1dcd0c8f6e.file-sto.re:/root/"
Let's go through each segment:
cron job breakdown
07 01 * * * is every day at 1:07 am, as per cron's syntax: the minute field comes first, then the hour.
The mm hh dd mt wd command syntax specifies the time and date for scheduled commands in the following way (any numeric value can be replaced by *, which means every value):
mm: minute 0-59;
hh: hour 0-23;
dd: day of month 1-31;
mt: month 1-12 and
wd: day of week 0-7 (Sunday = 0 or 7).
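Since the field order is easy to mix up, it can help to label the five fields of a schedule before adding it to the crontab. The helper below is a hypothetical convenience function written for this tutorial, not something cron provides.

```shell
# describe_cron is a made-up helper that labels the five schedule fields.
describe_cron() {
    # Disable globbing so that * fields are not expanded to file names.
    set -f
    # Split the schedule into its whitespace-separated fields.
    set -- $1
    set +f
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Every day at 02:30.
describe_cron "30 02 * * *"
# Every Sunday at 23:00.
describe_cron "00 23 * * 0"
```

Reading the labelled output back makes it obvious whether the minute and hour values ended up in the intended positions.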
env -i start with an empty environment.
sh -c use the system shell (dash on Debian and Ubuntu) to execute the rsync command.
" beginning of the command to be executed by the shell.
/usr/bin/rsync -a --delete -e full path to rsync with the archive and delete parameters, followed by -e, which specifies the remote shell (ssh) to use.
' beginning of the remote shell (ssh) command.
/usr/bin/ssh full path to ssh.
-o IdentitiesOnly=yes -i /root/.ssh/elastic_id_rsa -C -c blowfish
-o IdentitiesOnly=yes specifies that ssh should only use authentication identity files;
-i gives the path to the identity file;
-C requests compression; and lastly
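-c blowfish sets the cipher to blowfish, which is faster than 3DES, the default cipher of the older SSH protocol 1. OpenSSH 7.6 and later no longer include blowfish, so leave this option out on those versions.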
' end of the remote shell (ssh) command.
/root/sourceFolder source folder for rsync.
Username@SSH_Hostname remote username @ hostname.
:/root/ remote rsync destination path.
" end of the command to be executed by the shell.
To edit the crontab with the nano editor, run the following command:
EDITOR=nano crontab -e
Now find the last '#' symbol in the crontab, and enter the cron code in the line below. Don't forget to alter the code according to your SSH, rsync and SSH key paths:
07 01 * * * env -i sh -c "/usr/bin/rsync -a --delete -e '/usr/bin/ssh -o IdentitiesOnly=yes -i /root/.ssh/elastic_id_rsa -C -c blowfish' /root/sourceFolder Username@SSH_Hostname:/root/"
Click on the connection detail (eye symbol) on the control panel for your Cloud Storage username and hostname.
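To avoid quoting mistakes when adapting the cron line, you could assemble it from variables and print it for review before pasting it into the crontab. This is only a sketch: every value below is a placeholder, the variable names are invented for illustration, and the -c blowfish option from the tutorial's line is left out because newer OpenSSH releases no longer accept that cipher.

```shell
# All values are placeholders; substitute your own paths and connection details.
RSYNC_BIN="/usr/bin/rsync"            # from: which rsync
SSH_BIN="/usr/bin/ssh"                # from: which ssh
KEY_FILE="/root/.ssh/elastic_id_rsa"
SOURCE_DIR="/root/sourceFolder"
REMOTE="Username@SSH_Hostname:/root/"
SCHEDULE="07 01 * * *"                # minute 07, hour 01: every day at 01:07

# Options passed to ssh by rsync's -e parameter.
SSH_OPTS="-o IdentitiesOnly=yes -i $KEY_FILE -C"

# Outer double quotes wrap the sh -c command, single quotes wrap the ssh command.
CRON_LINE="$SCHEDULE env -i sh -c \"$RSYNC_BIN -a --delete -e '$SSH_BIN $SSH_OPTS' $SOURCE_DIR $REMOTE\""
printf '%s\n' "$CRON_LINE"

# To append the line to the current user's crontab instead of pasting it:
# ( crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE" ) | crontab -
```

The install command is left commented out on purpose, so that running the sketch only prints the line.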
4/B. Windows task scheduler and rsync
Open Notepad and create a file that looks similar to the below. You will need to adjust the ssh, rsync and SSH key paths and the remote username@hostname.
/usr/bin/rsync -a --delete -e '/usr/bin/ssh -o IdentitiesOnly=yes -i /root/.ssh/elastic_id_rsa -C -c blowfish' /root/sourceFolder Username@SSH_Hostname:/root/
Save your file as script.txt in your Cygwin home directory, for example C:\rhcygwin64\home\Administrator. Your path will differ depending on where you installed Cygwin.
Open a Cygwin terminal and cd into your home directory:
Rename script.txt to script.sh, set the line type to Unix and make it executable with the below commands:
mv script.txt script.sh
dos2unix script.sh
chmod +x script.sh
Please refer to the following short guide to convert your
script.sh file into a scheduled task: Running Cygwin Scripts as Scheduled Tasks in Windows - davidjnice.com
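The line-ending conversion matters because Notepad saves files with Windows (CRLF) line endings, which the shell cannot interpret correctly. The sketch below illustrates the problem on a throwaway file and strips the carriage returns with tr, which gives the same result as dos2unix for a simple script like this. CRLF_DEMO is a made-up name, and the shebang line and echo command are placeholders rather than the tutorial's rsync command.

```shell
# Scratch area for the demonstration.
CRLF_DEMO="$(mktemp -d)"

# Simulate a script saved by Notepad: every line ends in \r\n.
printf '#!/bin/sh\r\necho "backup would run here"\r\n' > "$CRLF_DEMO/script.txt"

# Remove the carriage returns and write the result as script.sh.
tr -d '\r' < "$CRLF_DEMO/script.txt" > "$CRLF_DEMO/script.sh"
chmod +x "$CRLF_DEMO/script.sh"

# Run the cleaned script through sh to confirm it works.
sh "$CRLF_DEMO/script.sh"
```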
You have created a scheduled backup of a folder to ElasticHosts' Cloud Storage using rsync. Are you looking for other ways to use Cloud Storage? Check out the list of tutorials!
- Commands to install cron:
sudo apt-get update
sudo apt-get install cron
For CentOS/Red Hat Linux, commands after the '#' character ensure cron starts automatically:
sudo yum update
sudo yum install vixie-cron crontabs
# ensure cron starts
sudo /sbin/chkconfig crond on
sudo /sbin/service crond start