Automatically Back Up Projects or Data using Git
Two years back my hosting domain crashed and I lost all my data/code. The sad news: I did not have a backup. I spent many hours building my website and all the effort went to waste. This was a good lesson to keep my data backed up.
Now all my projects are backed up either using Dropbox or a revision control system like Git. In addition to having your data backed up, a revision control system stores revisions of your files, enabling you to browse your data over a timeline. This is very useful when you are working on a project, since you do not have to worry about storing multiple versions as you make progress.
In this post, I will discuss how to use Git to back up your data automatically and periodically. All the commands have to be run from a shell (I use gnome-terminal).
Step 1: Installing git
Most Linux systems have git by default. If you already have git, skip to Step 2. Otherwise, you can easily install git based on your distribution:
Debian based distributions like Ubuntu:
sudo apt-get install git
rpm-based distributions like Fedora:
sudo yum install git-core
If you are using any other OS, you can follow this installation guide
Step 2: Set up Git global configuration
You have to set one-time global settings for Git. This is not mandatory but it is helpful.
Set your Name:
git config --global user.name "FIRST_NAME LAST_NAME"
Set your email address:
git config --global user.email "MY_NAME@example.com"
Step 3: Choose a Git Remote Hosting Service
You can host your project on any remote machine that has git. Here I describe only the method using professional git hosting providers. There are free git hosting providers like GitHub and BitBucket. I personally prefer BitBucket since it allows you to create any number of private repositories and has no limit on data size. And all this for free :-). What else do you need? If data privacy isn't an issue, you could host it on either GitHub or BitBucket.
When you sign up, specify the email id you used in Step 2. BitBucket/GitHub asks for your password every time you try to copy your project. To avoid this, you can use SSH authentication keys, which allow you to copy your projects without a password. This is essential since you plan to take backups automatically once in a while.
To create an SSH authentication key on your Linux system, run
ssh-keygen -C "MY_NAME@example.com"
Press Enter to accept all the default settings. This creates a public key in the file .ssh/id_rsa.pub in your home directory.
Go to BB/GH account settings and go to the SSH Keys menu. You should see a page which looks like this or this. Add a new key and paste in the contents of
.ssh/id_rsa.pub. You can give the key any name, e.g. MyOfficePC. You can add as many keys as the number of computers you have.
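The key creation above can also be done without any prompts. The following is a minimal sketch, assuming you want to try it in a throwaway directory first: keydir is a hypothetical temporary location (not your real .ssh folder), and the empty passphrase given with -N corresponds to pressing Enter at the passphrase prompts.

```shell
#!/bin/sh
# Sketch: generate an RSA key pair non-interactively in a temporary
# directory, so that an existing key in ~/.ssh is never overwritten.
set -e
keydir=$(mktemp -d)

# -q quiet, -t key type, -N passphrase (empty), -C comment, -f output file
ssh-keygen -q -t rsa -N "" -C "MY_NAME@example.com" -f "$keydir/id_rsa"

# The .pub file is the part you paste into the SSH Keys page
public_key=$(cat "$keydir/id_rsa.pub")
echo "$public_key"
```

The private half (id_rsa, without the .pub extension) stays on your machine and is never pasted anywhere.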
Remember Steps 1, 2 and 3 are a one time thing for all your projects. You need not repeat these steps when you create new projects.
Step 4: Host a project locally
Let us say you have your project in
/path/working_directory and you want to back it up. Before backing it up, we will have to create a git repository locally specific to this project. To do this, go to the directory
/path/working_directory and run the following commands.
Initialize an empty git repository in the folder
git init
Add contents of the folder to the repository
git add .
Note: This command adds everything inside the directory to the project. If you have specific files to be added, you can specify them with the command
git add file1 file2 ..
Though the above command specifies what is to be added to the project, it only stages a snapshot of your contents without recording it in the repository. To commit the snapshot, use this command
git commit -m 'initial repository added'
Note: It is good to specify a message with -m, or else git opens an editor and asks you to enter one.
At this point, we have successfully created a local repository of your project. In the next step we will create a copy of this repository on a remote machine.
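If you would like to rehearse the local steps before touching a real project, here is a small sketch that runs them in a throwaway directory. The temporary directory and the file notes.txt are assumptions made for illustration; they stand in for /path/working_directory and your own files.

```shell
#!/bin/sh
# Sketch of the local steps in a throwaway directory (assumption: mktemp
# stands in for /path/working_directory so nothing real is touched).
set -e
workdir=$(mktemp -d)
cd "$workdir"
echo "hello" > notes.txt           # hypothetical project file

git init -q                        # create the local repository
# Repository-local identity, so the sketch runs even without Step 2
git config user.name "FIRST_NAME LAST_NAME"
git config user.email "MY_NAME@example.com"

git add .                          # stage everything in the folder
git commit -q -m 'initial repository added'

# Show what was recorded: the number of commits and the tracked files
commit_count=$(git rev-list --count HEAD)
tracked_files=$(git ls-files)
echo "commits: $commit_count, files: $tracked_files"
```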
Step 5: Host the project remotely
Create a new repository either on BitBucket (BB) or GitHub (GH). Go to the respective website and choose "Create a Repo" from the menu. Specify a name for your project and mark it as either private or public depending on whether you want to share your project with others. Creating a new repository is straightforward. If you need help, follow these links of creating BB repo or GH repo.
Once your remote repository is created, you will be directed to the repository page. The url should look something like this: http://hostingprovider.org/username/repository-name. Remember the repository name since you will need it later. You have to point your local repository to your remote repository to copy files from local to remote. Run the below commands from your local repository directory
Adding remote BB repository:
git remote add origin ssh://git@bitbucket.org/username/repository-name.git
Adding remote GH repository:
git remote add origin git@github.com:username/repository-name.git
Note: you will require your username and repository name which can be obtained from the url of your remote repository page.
Now that we have pointed your local repository to the remote repository, we can copy a snapshot of your project to the remote repository. To do this, run the command
git push origin master
If this command gives you an error, try running
git pull origin master first and then run
git push origin master. The above steps can be practised to get familiar with the manual steps involved in creating git repositories locally and remotely.
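The remote steps can also be practised without an account or a network connection. In the sketch below, a local bare repository plays the role of the BB/GH repository; this is an assumption made purely for rehearsal, and with a real host the path given to git remote add would be the SSH URL of your repository.

```shell
#!/bin/sh
# Sketch of the push workflow with a local bare repository standing in
# for the hosted remote (assumption: no network or account is used).
set -e
base=$(mktemp -d)
git init -q --bare "$base/remote.git"      # plays the role of the hosted repo

mkdir "$base/project"
cd "$base/project"
git init -q
git config user.name "FIRST_NAME LAST_NAME"
git config user.email "MY_NAME@example.com"
echo "hello" > notes.txt                    # hypothetical project file
git add .
git commit -q -m 'initial repository added'
git branch -M master                        # make sure the branch is called master

git remote add origin "$base/remote.git"   # with a real host: the SSH URL
git push -q origin master

# The remote should now hold the same commit as the local repository
local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$base/remote.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

The git branch -M master line is only there because newer git installations may name the first branch differently; on a setup where the branch is already called master it changes nothing.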
Step 6: Setting up automatic backups periodically
To copy snapshots of your projects periodically without manually pushing them, you can set up cron jobs. Cron jobs offer an excellent way of automating repetitive jobs. In Linux, cron jobs can be specified using the command
crontab -e
First let us create an executable file, say
git.run with all the commands required to copy data from local to remote.
Open a file named
git.run in the directory
/path/working_directory. Copy the below commands to the file and save it.
#!/bin/sh
# Record the current time for the commit message
curtime=$(date)
# Below command will backup everything inside the project folder
git add .
# You can also add specific files instead, using the command
# git add file1 file2 ..
# Committing to the local repository with a message containing the time details
git commit -m "Automatic Backup @ $curtime"
# Push the local snapshot to a remote destination
git push origin master
Change the permissions of
git.run to executable.
chmod +x git.run
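One thing to be aware of: when nothing has changed since the last backup, git commit exits with an error because there is nothing to commit. A possible variant of the script, sketched below, checks for staged changes first. The function name backup_repo is hypothetical (it is not a git command), and skipping the push when no origin remote exists is an assumption added so the sketch can be tried on a purely local repository.

```shell
#!/bin/sh
# Variant of git.run that commits only when something has changed.
# backup_repo is a hypothetical helper name, not part of git.
# The body runs in a subshell, so the caller's directory is unchanged.
backup_repo() (
    cd "$1" || exit 1
    git add .
    # git diff --cached --quiet exits 0 when nothing is staged
    if git diff --cached --quiet; then
        echo "nothing to back up"
        exit 0
    fi
    git commit -q -m "Automatic Backup @ $(date)"
    # Push only if a remote named origin is configured
    if git remote | grep -q '^origin$'; then
        git push -q origin master
    fi
)
```

Inside git.run you would call it as backup_repo /path/working_directory.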
Almost there! We have to create a cron job which runs the script
git.run periodically. For example, if you want to take a backup every week, run
crontab -e
It asks for your favourite editor. I use vim. Add the below entry to the file (i for insert, Esc followed by :wq to save).
0 0 * * 0 cd /path/working_directory && ./git.run
Save the entry and you are done. Your project is now backed up every week automatically! Great job if you have followed until this step. If you want daily, monthly or yearly backups, you can choose one of the patterns in the table below instead of 0 0 * * 0 (which represents a weekly update on Sunday at 00:00).
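||Entry||Equivalent to||Description||
||@yearly (or @annually)||0 0 1 1 *||Run once a year at midnight in the morning of January 1||
||@monthly||0 0 1 * *||Run once a month at midnight in the morning of the first of the month||
||@weekly||0 0 * * 0||Run once a week at midnight in the morning of Sunday||
||@daily||0 0 * * *||Run once a day at midnight||
||@hourly||0 * * * *||Run once an hour at the beginning of the hour||
||@reboot||(none)||Run at startup||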
* * * * *  command to be executed
┬ ┬ ┬ ┬ ┬
│ │ │ │ │
│ │ │ │ └───── day of week (0 - 6) (0 to 6 are Sunday to Saturday, or use names)
│ │ │ └────────── month (1 - 12)
│ │ └─────────────── day of month (1 - 31)
│ └──────────────────── hour (0 - 23)
└───────────────────────── min (0 - 59)
Source: Wikipedia
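To avoid typing the five fields by hand, the standard cron patterns for the frequencies listed above can be wrapped in a small helper. This is only a sketch: cron_entry is a hypothetical function name, and it simply prints a line that you would then paste into the editor opened by crontab -e.

```shell
#!/bin/sh
# cron_entry is a hypothetical helper: it prints a crontab line that runs
# git.run in the given project directory at the chosen frequency.
cron_entry() {
    frequency=$1
    project_dir=$2
    case "$frequency" in
        yearly)  pattern="0 0 1 1 *" ;;
        monthly) pattern="0 0 1 * *" ;;
        weekly)  pattern="0 0 * * 0" ;;
        daily)   pattern="0 0 * * *" ;;
        hourly)  pattern="0 * * * *" ;;
        *) echo "unknown frequency: $frequency" >&2; return 1 ;;
    esac
    # Quoting keeps the shell from expanding the * characters
    echo "$pattern cd $project_dir && ./git.run"
}

cron_entry weekly /path/working_directory
```

For the weekly case this prints the same entry used earlier in this step.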
Additionally, I take periodic backups of the SQL database required by my website. I add an additional entry in the
git.run file which dumps the SQL database to a file.
mysqldump --password=**** sql_sivareddy_in | gzip > website.sql.gz
Though this is not an efficient way of backing up a database, it is a simple solution, and since it runs only once a week it doesn't hurt my hosting provider.
Note: If you create a new project, you just have to repeat Steps 4, 5 and 6.
I feel safe and I never have to worry about losing any of my data :-). Do you face any similar problems or do you use different automatic backup solutions? Is this solution working for you?