I want to share a (relatively basic) shell script that solved a really important problem for us. As you may know, we provide warehouse/order management services via mystockd.com and job management via myservicd.com. Part of our service package for both sites is an independent database for each client, backed up nightly.
In times past, I backed up the relatively small client list for INSIGHT to a remote 3TB drive. That's no longer tenable with these new services. I needed something easily extensible and completely cloud-based, where our Amazon EC2 servers could back up to our Amazon S3 buckets.
A bit of research into the AWS Command Line Interface (AWS CLI) led to the answer:
#!/bin/bash
# very important to define the home directory and
# where to find the IAM credentials for access
export HOME=/root
# (use $HOME, since a quoted ~ is not expanded, and export it so aws can see it)
export AWS_CONFIG_FILE="$HOME/.aws/config"
# add today's source code
now=$(date +"%H.%M.%S")
echo "$now: adding current servicd source code before creating backups"
# stop if the working directory is missing, so the cleanup below never runs elsewhere
cd /root/backups/servicd/ || exit 1
rm -rf /root/backups/servicd/*
mkdir source
cp -R /var/www/html/servicd/ source/
# strip out files not specific to the SaaS part of servicd
rm -rf ~/backups/servicd/source/servicd/[long list of files that are on the public side of the site]
# prepare to dump
now=$(date +"%H.%M.%S")
echo "$now: starting backup of servicd clients"
# this is the list of active servicd clients
input="/root/backups/servicd_list"
# run through the list, getting one name at a time
while read -r line
do
now=$(date +"%H.%M.%S")
echo "$now: backup of $line"
# remove prior archives and dumps
rm -rf *.gz
rm -rf *.dmp
# dump databases (add to servicd_list as needed)
pg_dump "$line" > "$line.dmp"
# compress the archive
now=$(date +"%H.%M.%S")
echo "$now: compressing backup for $line"
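# archive only the dump and the source copy, so tar does not try to include its own output
tar -zcvf "/root/backups/servicd/$line.gz" -C /root/backups/servicd "$line.dmp" source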
now=$(date +"%H.%M.%S")
echo "$now: ready to back up to S3"
# dump to S3 archive
date=$(date +"%Y.%m.%d")
/root/bin/aws s3 cp "/root/backups/servicd/$line.gz" "s3://backups.myservicd.com/$line/$date.gz"
now=$(date +"%H.%M.%S")
echo "$now: $line backup is done"
done < "$input"
now=$(date +"%H.%M.%S")
echo "$now: all backups are done"
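The script repeats the same two-line timestamp pattern before every message. As a minimal sketch of one way to trim that, the pair could be folded into a small helper. The name `log` is hypothetical and not part of the original script; the time format is the one the script already uses.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print a message
# prefixed with the current time in the script's HH.MM.SS format.
log() {
    printf '%s: %s\n' "$(date +"%H.%M.%S")" "$*"
}

# Each "now=...; echo ..." pair then becomes a single line:
log "starting backup of servicd clients"
```

With this in place, a line such as `log "backup of $line"` would stand in for the matching `now=` and `echo` lines inside the loop.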
Here are the main points:
- The list of active clients is a single text file of their database names. Very easy to manage.
- When the script runs, it first grabs a copy of the source directory and removes all public-facing (non-SaaS) files.
- It then iterates over the active clients, dumping each client's database first.
- A gzip-compressed tar archive is made from the database dump and the current source code.
- The compressed archive is copied to an S3 bucket, under a prefix named after the client's database.
- To add a new backup, all we have to do is add the database name to the client list file.
- To stop backing up a client (now inactive), we remove their database name from the client list file.
- The script runs as a cron job every night.
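Since the client list is edited by hand, a slightly more forgiving reader can help. The sketch below is one possible approach, not what the script above does: it skips blank lines and lines starting with `#` (that comment convention is an assumption) and builds the same S3 destination the script uses. The function names `list_clients` and `s3_target` and the demo client names are made up for illustration.

```shell
#!/bin/bash
# Sketch: print the database names in a list file, one per line,
# ignoring blank lines and '#' comments (the comment convention is
# an assumption; the original list holds only database names).
list_clients() {
    local file=$1 line
    while IFS= read -r line || [ -n "$line" ]; do
        # trim leading and trailing whitespace
        line="${line#"${line%%[![:space:]]*}"}"
        line="${line%"${line##*[![:space:]]}"}"
        case $line in
            ''|'#'*) continue ;;
        esac
        printf '%s\n' "$line"
    done < "$file"
}

# Build the destination used in the script: one prefix per client,
# one object per day.
s3_target() {
    local client=$1 day=$2
    printf 's3://backups.myservicd.com/%s/%s.gz\n' "$client" "$day"
}

# Demo with a throwaway list file (client names are made up)
demo_list=$(mktemp)
printf 'acme\n\n# paused client\n  globex  \n' > "$demo_list"
list_clients "$demo_list" | while IFS= read -r client; do
    echo "$client -> $(s3_target "$client" "$(date +"%Y.%m.%d")")"
done
rm -f "$demo_list"
```

In the backup script itself, the loop could then read from `list_clients /root/backups/servicd_list` instead of the raw file, so a stray blank line never reaches `pg_dump`.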
That's it. We hope this is helpful to others. Regards!