Two years ago my web hosting crashed and I lost all my data and code. The sad news: I did not have a backup. I had spent many hours building my website and all that effort went to waste. It was a hard lesson in keeping my data backed up.

Now all my projects are backed up using either Dropbox or a revision control system like Git. Besides keeping your data backed up, a revision control system stores every revision of your files, so you can browse your data along a timeline. This is very useful when you are working on a project: you no longer have to keep multiple copies of your files by hand as you make progress.

In this post, I will discuss how to use Git to back up your data automatically at regular intervals. All the commands have to be run from a shell (I use gnome-terminal).

Step 1: Installing git

Many Linux systems come with git preinstalled. If you already have git, skip to Step 2. Otherwise, install it with your distribution's package manager.

Debian-based distributions like Ubuntu:

sudo apt-get install git

RPM-based distributions like Fedora:

sudo yum install git-core

If you are using any other OS, you can follow this installation guide
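
Whichever route you take, it is worth confirming that git is actually available before moving on. A minimal sketch of such a check follows; have_cmd is a hypothetical helper name chosen for this example, not a standard command.

```shell
#!/bin/sh
# have_cmd (hypothetical helper): succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd git; then
    # Print the installed version
    git --version
else
    echo "git is not installed yet; install it as described above"
fi
```

If a version number is printed, you can go straight to Step 2.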

Step 2: Set up Git global configuration

Next, set a few one-time global settings for Git. Git records this name and email address in every commit you make, so it is worth setting them before your first commit.

Set your name:

git config --global user.name "FIRST_NAME LAST_NAME"

Set your email address:

git config --global user.email "MY_NAME@example.com"
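
To check that both settings were saved, you can read them back. This is a small sketch; show_git_identity is a hypothetical function name, and the "<not set>" placeholder is my own choice for the case where a value is missing.

```shell
#!/bin/sh
# show_git_identity (hypothetical helper): print the global name and email,
# or a placeholder when a value has not been configured.
show_git_identity() {
    name=$(git config --global --get user.name 2>/dev/null || true)
    email=$(git config --global --get user.email 2>/dev/null || true)
    echo "user.name  = ${name:-<not set>}"
    echo "user.email = ${email:-<not set>}"
}

show_git_identity
```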

Step 3: Choose a Git Remote Hosting Service

You can host your project on any remote machine that has git. Here I describe only the method using dedicated git hosting providers. There are free git hosting providers like GitHub and BitBucket. I personally prefer BitBucket since it lets me create as many private repositories as I want, with no limit on data size, and all this for free :-). What more do you need? If data privacy isn't an issue, you could host your project publicly on either GitHub or BitBucket.

Register for a free account on BitBucket or GitHub.

Use the email address you specified in Step 2. By default, BitBucket/GitHub asks for your password every time you copy your project to it. To avoid this, you can use SSH authentication keys, which let you copy your projects without typing a password. This is essential since the backups are going to run automatically, with nobody there to type one.

To create an SSH authentication key on your Linux system, run

ssh-keygen -C "MY_NAME@example.com"

Press Enter at each prompt to accept the default settings. This creates a key pair, with the public key in the file ~/.ssh/id_rsa.pub. (Recent OpenSSH versions default to a different key type, in which case the file is named ~/.ssh/id_ed25519.pub instead.)

Go to your BitBucket/GitHub account settings and open the SSH Keys menu. Add a new key and paste in the contents of your public key file. You can give the key any name, e.g. MyOfficePC. Add one key for each computer you use.

Remember Steps 1, 2 and 3 are a one time thing for all your projects. You need not repeat these steps when you create new projects.
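
If you are not sure which public key file you have, the sketch below prints the first one it finds so that you can paste it into the SSH Keys page. first_pubkey is a hypothetical helper name, and the assumption that your keys live in ~/.ssh is the OpenSSH default, not something specific to BitBucket or GitHub.

```shell
#!/bin/sh
# first_pubkey (hypothetical helper): print the path of the first *.pub
# file in a directory (default: ~/.ssh), or fail if there is none.
first_pubkey() {
    dir="${1:-$HOME/.ssh}"
    for f in "$dir"/*.pub; do
        if [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    echo "no public key found in $dir" >&2
    return 1
}

# Show the key so it can be copied into the hosting provider's settings.
if key=$(first_pubkey); then
    cat "$key"
fi

# After adding the key, you can test the connection (needs network access):
#   ssh -T git@bitbucket.org
#   ssh -T git@github.com
```

The two ssh -T commands are left as comments because they only make sense once the key has been added to your account.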

Step 4: Host a project locally

Let us say your project is in /path/working_directory and you want to back it up. Before backing it up, we have to create a local git repository for this project. To do this, go to the directory /path/working_directory and run the following commands.

Initialise project

git init

Add contents of the folder to the repository

git add .

Note: This command adds everything inside the directory to the project. If you only want specific files to be added, you can name them with the command git add file1 file2 ...

Although the above command specifies what is to be added to the project, it only stages a snapshot of your contents; nothing is recorded in the repository yet. To commit the snapshot, use this command

git commit -m 'initial repository added'

Note: It is good to specify a message with -m; otherwise Git opens an editor and asks you to enter one.

So far, we have created a local repository for your project. In the next step we will create a copy of this repository on a remote machine.
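
If you would like to try Step 4 on a throwaway directory first, here is a small sketch. The scratch directory, the index.html file and the Demo identity passed with -c are all made up for the example; your own name and email from Step 2 would normally be used instead.

```shell
#!/bin/sh
# Work in a scratch directory so no real project is touched.
demo=$(mktemp -d)
cd "$demo"

git init -q
echo "hello" > index.html

git add .
# Lists index.html with an A: it is staged but not yet committed.
git status --short

# The -c options supply a throwaway identity for this one commit only.
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m 'initial repository added'

# One line per commit; there should be exactly one.
git log --oneline
```

Running git status and git log like this in your real project is a quick way to confirm that the commit went through.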

Step 5: Host the project remotely

Create a new repository on either BitBucket (BB) or GitHub (GH). Go to the respective website and choose "Create a Repo" from the menu. Specify a name for your project and make it either private or public depending on whether you want to share your project with others. Creating a new repository is straightforward. If you need help, see the providers' help pages on creating a BB repo or a GH repo.

Once your remote repository is created, you will be directed to the repository page. The URL should look something like this: http://hostingprovider.org/username/repository-name. Note the username and repository name since you will need them shortly. To copy files from local to remote, you have to point your local repository at the remote one. Run the commands below from your local repository directory, /path/working_directory.

Adding remote BB repository:

git remote add origin ssh://git@bitbucket.org/username/repository-name.git

Adding remote GH repository:

git remote add origin git@github.com:username/repository-name.git

Note: you will need your username and repository name, both of which appear in the URL of your remote repository page.

Now that we have pointed your local repository at the remote repository, we can copy a snapshot of your project to it. To do this, run the command

git push origin master

If this command gives you an error, try running git pull origin master first and then run git push origin master again. (If your branch is named main rather than master, use that name in both commands.) Practise the steps above until you are familiar with creating git repositories locally and remotely by hand.
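
You can also rehearse Step 5 without a hosting account. In this sketch a bare repository in a second scratch directory stands in for BitBucket/GitHub; that substitution, the file name and the Demo identity are assumptions made for the example only. The branch name is read from the repository instead of being hard-coded, since it may be master or main depending on your git configuration.

```shell
#!/bin/sh
work=$(mktemp -d)      # plays the role of /path/working_directory
remote=$(mktemp -d)    # plays the role of the hosting provider

# A bare repository has no working files; it only receives pushes.
git init -q --bare "$remote"

cd "$work"
git init -q
echo "hello" > index.html
git add .
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m 'initial repository added'

# Point the local repository at the stand-in remote and check the result.
git remote add origin "$remote"
git remote -v

# Push whatever the current branch is called.
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"
```

With a real provider, only the URL given to git remote add changes.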

Step 6: Setting up automatic backups periodically

To copy snapshots of your projects periodically without pushing them manually, you can set up cron jobs. Cron jobs offer an excellent way of automating repetitive tasks. On Linux, cron jobs are specified using the crontab command.

First let us create an executable file, say git.run, with all the commands required to copy data from local to remote.

Create a file named git.run in the directory /path/working_directory. Copy the commands below into the file and save it.

#!/bin/sh
# Specify the files to be backed up.
# The command below backs up everything inside the project folder.
git add .
# You can also add specific files using the command git add file1 file2 ...

# Commit to the local repository with a message containing the current time
curtime=$(date)
git commit -m "Automatic Backup @ $curtime"

# Push the local snapshot to the remote destination
git push origin master

Change the permissions of git.run to make it executable.

chmod +x git.run
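
One limitation of a script like git.run is that it attempts a commit and a push even when nothing has changed since the last run. A possible variant is sketched below. It is not the script I use: backup_repo is a hypothetical function name, and pushing the current branch rather than master is a change I made so the sketch works with either branch name.

```shell
#!/bin/sh
# backup_repo (hypothetical helper): stage, commit and push one repository,
# but only when there is something new to commit.
# The function body runs in a subshell, so the cd does not affect the caller.
backup_repo() (
    cd "$1" || exit 1
    git add .

    # Exit status 0 from this command means nothing is staged.
    if git diff --cached --quiet; then
        echo "nothing to back up"
        exit 0
    fi

    git commit -q -m "Automatic Backup @ $(date)"
    git push -q origin "$(git symbolic-ref --short HEAD)"
)

# Usage, assuming the project from Step 4:
#   backup_repo /path/working_directory
```

Skipping empty runs keeps the cron output quiet, which makes real errors easier to spot.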

Almost there! We have to create a cron job which runs the script git.run periodically. For example, to take a backup every week, open your crontab for editing:

crontab -e

It may ask you to choose an editor. I use vim (i to insert, Esc then :wq to save and quit). Add the entry below to the file.

0 0 * * 0 cd /path/working_directory && ./git.run

Save the entry and you are done. Your project is now backed up every week automatically! Great job if you have followed along this far. If you want daily, monthly or yearly backups instead, replace 0 0 * * 0 (which means weekly, on Sunday at 00:00) with one of the patterns in the table below.

Entry                  | Description                                                          | Equivalent to
@yearly (or @annually) | Run once a year at midnight in the morning of January 1              | 0 0 1 1 *
@monthly               | Run once a month at midnight in the morning of the first of the month | 0 0 1 * *
@weekly                | Run once a week at midnight in the morning of Sunday                 | 0 0 * * 0
@daily                 | Run once a day at midnight                                           | 0 0 * * *
@hourly                | Run once an hour at the beginning of the hour                        | 0 * * * *
@reboot                | Run at startup                                                       | n/a

*    *    *    *    *  command to be executed
┬    ┬    ┬    ┬    ┬
│    │    │    │    │
│    │    │    │    │
│    │    │    │    └───── day of week (0 - 6) (0 to 6 are Sunday to Saturday, or use names)
│    │    │    └────────── month (1 - 12)
│    │    └─────────────── day of month (1 - 31)
│    └──────────────────── hour (0 - 23)
└───────────────────────── min (0 - 59)

Source: Wikipedia
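
To see how a schedule and a project directory combine into a complete entry, here is a small sketch that prints crontab lines. cron_line is a hypothetical helper, and redirecting the output to git-backup.log in your home directory is my own addition so that errors from the script are kept somewhere; the log sits outside the project so it is not committed along with it.

```shell
#!/bin/sh
# cron_line (hypothetical helper): print one crontab entry.
#   $1 = schedule (five fields, or an @ shortcut from the table)
#   $2 = project directory containing git.run
# $HOME is printed literally on purpose; cron expands it when the job runs.
cron_line() {
    printf '%s cd %s && ./git.run >> $HOME/git-backup.log 2>&1\n' "$1" "$2"
}

cron_line "0 0 * * 0" /path/working_directory
cron_line "@daily" /path/working_directory

# To install an entry without opening an editor, append it to the current
# crontab (left commented out so nothing changes by accident):
#   ( crontab -l 2>/dev/null; cron_line "@daily" /path/working_directory ) | crontab -
```

Afterwards, crontab -l lists the installed entries so you can confirm the new line is there.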

Additionally, I take periodic backups of the SQL database used by my website. I add an extra entry to the git.run file, before the git add line, which dumps the SQL database to a file.

mysqldump --password=**** sql_sivareddy_in | gzip > website.sql.gz

Though this is not an efficient way of backing up a database, it is a simple solution, and since it runs only once a week it doesn't put much load on my hosting provider.
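
One thing to watch for: if mysqldump fails, gzip still writes a valid but empty archive, which would then overwrite the last good dump. The sketch below guards against that. save_dump is a hypothetical helper, and the temporary .tmp file name is an assumption of mine; it is not part of mysqldump or gzip.

```shell
#!/bin/sh
# save_dump (hypothetical helper): compress stdin into the file named by $1,
# but keep the previous file if the new archive is corrupt or empty.
save_dump() {
    out="$1"
    gzip > "$out.tmp" || return 1

    # gzip -t checks archive integrity; wc -c counts the uncompressed bytes.
    if gzip -t "$out.tmp" && [ "$(gzip -dc "$out.tmp" | wc -c)" -gt 0 ]; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"
        echo "dump failed; keeping previous $out" >&2
        return 1
    fi
}

# Usage inside git.run, replacing the plain gzip redirect:
#   mysqldump --password=**** sql_sivareddy_in | save_dump website.sql.gz
```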

Note: If you create a new project, you just have to repeat Steps 4, 5 and 6.

I feel safe and I never have to worry about losing any of my data :-). Do you face any similar problems or do you use different automatic backup solutions? Is this solution working for you?
