This guide describes backup and recovery for a Rackspace Private Cloud installation.
Rackspace has developed Rackspace Private Cloud Software, a fast, free, and easy way to download and install a Rackspace Private Cloud powered by OpenStack in any data center. Rackspace Private Cloud Software is suitable for anyone who wants to install a stable, tested, and supportable OpenStack private cloud, and can be used for all scenarios from initial evaluations to production deployments.
Rackspace Private Cloud Software v 2.0 supports the Folsom release of OpenStack.
This document provides guidance for backing up the OpenStack cloud that is created with Rackspace Private Cloud Software, and for recovering the controller and compute nodes in the event of failure.
To use this document, you should be familiar with OpenStack and cloud computing, basic Linux administration, Opscode's Chef tool, and Rackspace Private Cloud Software.
This version of Rackspace Private Cloud Backup and Recovery replaces and obsoletes all previous versions. The most recent changes are described in the table below:
| Revision Date | Summary of Changes |
| August 15, 2012 | |
| November 15, 2012 | |
backup-manager man page
backup-manager user guide
For more information about sales and support, contact us at <opencloudinfo@rackspace.com>. For feedback on the documentation, contact us at <RPCFeedback@rackspace.com>, or leave a comment at the Knowledge Center.
After you have installed Rackspace Private Cloud Software, there are no backups configured for your cluster. If you want to back up your cluster, this document provides guidance on performing a backup.
Note that this document does not attempt to cover broader problems of instance failure and recovery, and it does not address backup policy questions, such as appropriate retention policies and rotations.
The following directories contain crucial components for backup:
/var/lib/glance/images
/var/lib/chef/backups
/etc/mysql
You must also back up the following MySQL databases:
Ubuntu includes a simple tool called backup-manager. Any backup tool compatible with Ubuntu 12.04 can be used, but backup-manager is relatively simple to use for users already familiar with UNIX and bash scripts.
The following procedure will configure backup-manager to use the script that you will create to back up the controller node.
Switch to root access with sudo -i. You will need root access for all of the procedures in this chapter.
Install backup-manager:
$ apt-get install -y backup-manager
When prompted, specify the directory for backup-manager archives. You may accept the default of /var/archives.
Specify root as the owner user of the repository.
Specify root as the owner group of the repository.
Specify the following directories to back up:
/var/lib/chef/backups
/var/lib/glance/images
/etc/mysql
In the /etc/backup-manager.conf file, edit the following variables to match these settings:
export BM_PRE_BACKUP_COMMAND="/usr/local/bin/chef-backup.sh"
export BM_MYSQL_DATABASES="nova glance keystone dash mysql"
export BM_MYSQL_ADMINPASS=[root password defined in /root/.my.cnf]
export BM_ARCHIVE_METHOD="tarball mysql"
You can now configure the backup-manager configuration file to suit your retention policy and upload requirements. For example, you can create a simple data redundancy plan by uploading the backup to a secondary server that is accessible via SSH. Refer to the backup-manager documentation for more information about configuring the backup.
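As an illustration of the SSH upload plan described above, the fragment below is a minimal sketch of settings that could be added to /etc/backup-manager.conf. The host name, user, key path, destination directory, and retention value are all hypothetical placeholders, and the variable names should be checked against the backup-manager.conf shipped with your installed version before you rely on them.

```shell
# Sketch only: every value below is a hypothetical placeholder.
export BM_ARCHIVE_TTL="5"                                   # days to keep local archives
export BM_UPLOAD_METHOD="scp"                               # upload archives over SSH
export BM_UPLOAD_SSH_USER="backup"                          # account on the secondary server
export BM_UPLOAD_SSH_KEY="/root/.ssh/id_rsa"                # private key used for the upload
export BM_UPLOAD_SSH_HOSTS="backup01.example.com"           # secondary server
export BM_UPLOAD_SSH_DESTINATION="/srv/backups/controller"  # remote directory for archives
```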
By default, backup-manager automatically executes nightly. You can also generate a backup manually.
Note: Because all image files are backed up, these backups can be quite large. Ensure that you have enough space.
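One way to check the available space before a backup run is sketched below. The helper name has_room_for is hypothetical. It compares the on-disk size of a source directory with the free space on the filesystem that holds the archive directory; because archives are normally compressed, this is a conservative estimate rather than an exact requirement.

```shell
#!/bin/bash
# has_room_for SRC DEST (hypothetical helper)
# Succeeds when the filesystem holding DEST has at least as many free
# kilobytes as SRC currently occupies on disk.
has_room_for() {
    local src=$1 dest=$2 need avail
    need=$(du -sk "$src" | awk '{print $1}')          # size of SRC in KB
    avail=$(df -Pk "$dest" | awk 'NR==2 {print $4}')  # free KB on DEST's filesystem
    [ "$avail" -ge "$need" ]
}

# Example, using paths from this chapter:
#   has_room_for /var/lib/glance/images /var/archives || echo "low space" >&2
```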
In a Rackspace Private Cloud installation, the controller node houses all the configuration information for the cluster, all OpenStack databases, and all images. To back up the configuration data, follow this procedure.
Create a script file in /usr/local/bin/chef-backup.sh and include the following:
#!/bin/bash
BACKUP_DIR=${BACKUP_DIR:-/var/lib/chef/backups}
set -e
topics="node environment" # "client role cookbook"
declare -A flags
flags=([default]=-Fj [node]=-lFj)
for topic in $topics; do
    # Start from an empty output directory for each topic
    OUT_DIR=${BACKUP_DIR}/${topic}
    flag=${flags[${topic}]:-${flags[default]}}
    rm -rf ${OUT_DIR}
    mkdir -p ${OUT_DIR}
    echo "Dumping $topic data"
    for item in $(knife ${topic} list | awk '{print $1; }'); do
        if [ "$topic" != "cookbook" ]; then
            # Save each item as JSON
            knife ${topic} show $flag $item > ${OUT_DIR}/${item}.js
        else
            knife cookbook download $item -N --force -d $OUT_DIR
        fi
    done
done
This script will place the configuration data for your cluster in the directory that you specify in the BACKUP_DIR environment variable. By default, it will choose /var/lib/chef/backups.
Configure backup-manager to execute automatically.
You can also set up a cron job to schedule backup-manager. Run the command crontab -u root -e and enter a cron job specifier. The following sample specifier would run backup-manager every night at midnight.
@midnight /usr/bin/backup-manager
The only information unique to the compute nodes is the disks for running instances. The instances can be backed up with the OpenStack snapshot tools in the Horizon dashboard and in the nova-compute API. You can also install backup tools inside the instances themselves.
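As a rough sketch of the snapshot approach from the command line, the helpers below wrap the nova client's image-create command. The function names and the naming scheme for snapshots are hypothetical, and the sketch assumes that the nova client is installed and that your OpenStack credentials are already exported in the environment.

```shell
#!/bin/bash
# snapshot_name INSTANCE [DATE] (hypothetical helper)
# Builds a snapshot name such as web01-snap-20121115.
snapshot_name() {
    local instance=$1 stamp=${2:-$(date +%Y%m%d)}
    echo "${instance}-snap-${stamp}"
}

# snapshot_instance INSTANCE (hypothetical helper)
# Asks nova to create an image from the named instance.
snapshot_instance() {
    nova image-create "$1" "$(snapshot_name "$1")"
}
```

Calling snapshot_instance with an instance name would request a snapshot named after that instance and the current date.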
For the recovery process, you will reinstall the components with the Rackspace Private Cloud Software ISO. Before you begin, ensure that you have the correct networking information:
... xxx.xxx.xxx.xxx or a CIDR range, and it must be able to access the internet.
... xxx.xxx.xxx.1 address.
... 10.1.0.0/16 and you specify a DMZ of 10.2.0.0/16, any devices or hosts in that range will be able to communicate with the instances on the nova fixed network.
... demo.
Switch to root access with sudo -i.
... /var/lib/chef/backups is restored:
#!/bin/bash
BACKUP_DIR=${BACKUP_DIR:-/var/lib/chef/backups}
# Restore Chef node attributes and environment overrides
for n in $(ls ${BACKUP_DIR}/{node,environment}/*.js |
    grep -v '_default.js$'); do
    knife $(basename $(dirname "${n}")) from file "${n}"
done
#!/bin/bash
BACKUP_DIR=${BACKUP_DIR:-/var/lib/chef/backups}
cd "$BACKUP_DIR"
# List the dumps with the one for the mysql database last, then load each dump
for sql in $(ls *.sql.bz2 | perl -ne '{ if (m/-mysql\./)
    { $line = $_} else {print;}} END {print $line}'); do
    db=$(echo $sql | cut -d- -f2 | cut -d. -f1)
    bzcat "$sql" | mysql "$db"
done
cd -
... /etc/chef/client.pem).
Before restoring compute nodes, you must remove existing compute node data from the controller node.
Switch to root access with sudo -i.
$ knife client delete name_of_compute_node
You can now re-create the instances. Note that when a compute node fails, all instance data is lost, so you must restore the instance data from configuration management, other backup recovery methods, or deployment of snapshots. IP addresses on instances will not be stable, so some reconfiguration may be necessary.
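As a footnote to the controller recovery procedure, the database restore loop shown earlier derives each database name from the dump file name and loads the dump for the mysql database last. The sketch below reproduces both steps as plain shell functions so they can be tried without touching a database. The function names and the file names in the examples are hypothetical, and the name parsing assumes that the part of the file name before the database name contains no hyphen.

```shell
#!/bin/bash
# db_name_from_dump FILE (hypothetical helper)
# Takes the text between the first "-" and the following "." as the
# database name, as the restore loop does with cut.
db_name_from_dump() {
    echo "$1" | cut -d- -f2 | cut -d. -f1
}

# order_dumps FILE... (hypothetical helper)
# Prints the file names one per line, moving the dump of the mysql
# database to the end of the list.
order_dumps() {
    local f last=""
    for f in "$@"; do
        case "$f" in
            *-mysql.*) last=$f ;;
            *) echo "$f" ;;
        esac
    done
    if [ -n "$last" ]; then
        echo "$last"
    fi
}

# Example with hypothetical file names:
#   db_name_from_dump controller01-nova.20121115.sql.bz2
#   order_dumps c-mysql.1.sql.bz2 c-nova.1.sql.bz2
```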
Rackspace Private Cloud Software installs all "control plane" services on the controller node, including, but not limited to:
You can reduce recovery times and potential data loss windows by configuring a standby server for the controller node. This does not constitute a true "high-availability" solution, but increases application layer resilience.
In this chapter, the main controller node will be referred to as the "active node", and the backup controller node as the "standby node". The configuration process includes the following stages:
In the event of failure, follow the steps in the Failover procedure to switch to the standby node.
Install Rackspace Private Cloud software on the device that you want to use as a standby node.
... Controller.
... admin user. You will use this admin username and password to access the API and the dashboard.
... demo or enter your own and provide a password at the prompt. This user will not have admin privileges, but will be able to perform basic OpenStack functions, such as creating instances from images. Creating the user will also automatically create a project (also known as a tenant) for this user.
Jane Doe
jdoe
mysecurepassword
At this point, it will take approximately 5-10 minutes for the Ubuntu operating system installation to complete.
... http://proxy_ip_address:proxy_ip_port. If you do not have a proxy, press enter to skip this step and leave the proxy information blank.
At this point, the installation process will run for approximately 30 minutes without the need for user intervention. The device will reboot during the installation process. You will see a screen with the Rackspace Private Cloud logo, followed by a screen that displays a progress bar; you can use Ctrl+Alt+F2 to toggle between the progress bar screen and a Linux TTY screen (Ctrl+Alt+Fn+F2 on a Mac). You can follow the log during installation by switching to the correct TTY screen and viewing the log in /var/log/post-install.log.
After the installation is complete, you can view the install log by logging into the operating system with the username and password that you configured in Step 9. The log is stored in /var/log/post-install.log.
Whenever a node is added to your cloud, you should back up the configuration data.
Create a script file in /usr/local/bin/chef-backup.sh and include the following:
#!/bin/bash
BACKUP_DIR=${BACKUP_DIR:-/var/lib/chef/backups}
set -e
topics="node environment" # "client role cookbook"
declare -A flags
flags=([default]=-Fj [node]=-lFj)
for topic in $topics; do
    # Start from an empty output directory for each topic
    OUT_DIR=${BACKUP_DIR}/${topic}
    flag=${flags[${topic}]:-${flags[default]}}
    rm -rf ${OUT_DIR}
    mkdir -p ${OUT_DIR}
    echo "Dumping $topic data"
    for item in $(knife ${topic} list | awk '{print $1; }'); do
        if [ "$topic" != "cookbook" ]; then
            # Save each item as JSON
            knife ${topic} show $flag $item > ${OUT_DIR}/${item}.js
        else
            knife cookbook download $item -N --force -d $OUT_DIR
        fi
    done
done
This script will place the configuration data for your cluster in the directory that you specify in the BACKUP_DIR environment variable. By default, it will choose /var/lib/chef/backups. Ensure that this script has executed. If it has done so successfully, you may now run backup-manager to back up your controller node to your desired backup destination.
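One possible way to confirm that the script has produced output before you run backup-manager is sketched below. The helper name chef_backup_populated is hypothetical. It only checks that the node and environment directories each contain at least one .js file, so it does not validate the contents of those files.

```shell
#!/bin/bash
# chef_backup_populated [DIR] (hypothetical helper)
# Succeeds when DIR/node and DIR/environment each hold at least one .js file.
chef_backup_populated() {
    local dir=${1:-/var/lib/chef/backups} topic
    for topic in node environment; do
        ls "$dir/$topic"/*.js >/dev/null 2>&1 || return 1
    done
    return 0
}

# Example:
#   chef_backup_populated && /usr/bin/backup-manager
```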
The most robust mechanism for ensuring the security of your images is to configure an OpenStack Storage (Swift) cluster and configure Glance to store images in the cluster. If this is not a viable option, you can use rsync to save the image to your standby node.
This section describes the rsync method in detail. For information about using OpenStack Storage, refer to Rackspace Private Cloud Software OpenStack Storage Installation.
Use sudo -i to switch to root access.
Install rsync with apt-get install rsync.
Use cat to obtain the contents of the public key.
$ cat /root/.ssh/id_rsa.pub
The command will output the public key in a string similar to the following:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDJUXnQwaTpw5Gj072PHF
jxD6Av3gSdDYx1blyNB/L3CA52tvRGWwwwFzbrbqHWE+VpYgeoiL6ePul5H
W4ENG1QYxkc6xTpWNHfM4lZNHOXEguxRDhM5W0MAAlO9tr62NETe4AvpUtI
NskwdCWkthyt0c+jG0pW4FxuHFfdrF2S55pL4Sfh1SkDGEicCbpPtcvFXc0
/aIRgB9/coDE2SEsCiMQDcCfKZR/tWmezDmTY0dAE2qsSPIw75QzCySujbs
4t+rP8/mrjYqo0urYbYlhV7zvcoZNrgbaxciZJ2NXzh253Yy2NN9Wp9QAix
lCOLAqChPoTZah9iwYHchwy+Q4d root@controller01
Save this output by whatever means you prefer (copy-pasting it to a text editor, etc.).
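Use sudo -i to switch to root access.
Install rsync with apt-get install rsync.
Open the /root/.ssh/authorized_keys file in your preferred editor and paste in the active node's id_rsa.pub key as a single line, with no line breaks.
On the active node, run the following command, replacing standby_node_IP with the IP address of the standby node. The trailing slash on the source path makes rsync copy the contents of the directory rather than the directory itself.
$ rsync -av /var/lib/glance/images/ root@standby_node_IP:/var/lib/glance/images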
The command should produce output similar to the following:
sending incremental file list
./
4ba6e796-a69d-4c61-b9f3-e8c398b1aa5b
58ee77f9-6028-47d3-8f3c-5ee1deec128b
929e52b3-d37f-4cfa-9578-73dab283bafd
This output shows that the synchronization between the active node and the standby node is successful. The initial transfer may be slow because every image must be copied; subsequent runs copy only new or changed images and complete more quickly. You will now need to create a cron job to automate the synchronization.
In the following procedure, rsync is configured to copy chef backup information in addition to the Glance images, and to run every five minutes.
$ mkdir -p /var/lib/chef/backups
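Create /usr/local/bin/rsync_job.sh with the following content, replacing standby_node_IP with the standby node's IP address.
#!/bin/bash
# Hold a lock so that overlapping cron runs do not start a second transfer.
# (A pgrep test on the script name can match this running script itself.)
exec 9>/var/lock/rsync_job.lock
if flock -n 9; then
    rsync -av /var/lib/glance/images/ root@standby_node_IP:/var/lib/glance/images
    rsync -av --delete /var/lib/chef/backups/ root@standby_node_IP:/var/lib/chef/backups
fi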
$ chmod +rx /usr/local/bin/rsync_job.sh
Run crontab -u root -e. Enter the following cron job specifier to make the rsync job run every five minutes:
*/5 * * * * /usr/local/bin/rsync_job.sh
In this configuration, images deleted on the active node are not deleted on the standby node. This means that you can retrieve an image from the standby node if it is accidentally deleted from the active node, but if you are frequently adding and deleting images, the undeleted images can take up a lot of disk space. To automatically remove images from the standby node when they are deleted from the active node, add a --delete flag to the rsync images command in the script file:
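#!/bin/bash
# Hold a lock so that overlapping cron runs do not start a second transfer.
# (A pgrep test on the script name can match this running script itself.)
exec 9>/var/lock/rsync_job.lock
if flock -n 9; then
    rsync -av --delete /var/lib/glance/images/ root@standby_node_IP:/var/lib/glance/images
    rsync -av --delete /var/lib/chef/backups/ root@standby_node_IP:/var/lib/chef/backups
fi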
To ensure that the latest chef state is backed up as well, add a chef backup script to rsync_job.sh.
#!/bin/bash
BACKUP_DIR=${BACKUP_DIR:-/var/lib/chef/backups}
set -e
topics="node environment" # "client role cookbook"
declare -A flags
flags=([default]=-Fj [node]=-lFj)
for topic in $topics; do
    # Start from an empty output directory for each topic
    OUT_DIR=${BACKUP_DIR}/${topic}
    flag=${flags[${topic}]:-${flags[default]}}
    rm -rf ${OUT_DIR}
    mkdir -p ${OUT_DIR}
    echo "Dumping $topic data"
    for item in $(knife ${topic} list | awk '{print $1; }'); do
        if [ "$topic" != "cookbook" ]; then
            # Save each item as JSON
            knife ${topic} show $flag $item > ${OUT_DIR}/${item}.js
        else
            knife cookbook download $item -N --force -d $OUT_DIR
        fi
    done
done
# Hold a lock so that overlapping cron runs do not start a second transfer.
# (A pgrep test on the script name can match this running script itself.)
exec 9>/var/lock/rsync_job.lock
if flock -n 9; then
    rsync -av --delete /var/lib/glance/images/ root@standby_node_IP:/var/lib/glance/images
    rsync -av --delete /var/lib/chef/backups/ root@standby_node_IP:/var/lib/chef/backups
fi
In addition to Glance and chef information, you must ensure that your MySQL state stays in sync between the active node and the standby node.
Note: Running the FLUSH TABLES command will be disruptive. The control plane of OpenStack will be inoperable until that step is complete.
... /etc/mysql is the most important element.
On the active node, create /etc/mysql/conf.d/replication.cnf and include the following content:
[mysqld]
log-bin=mysql-bin
server-id=1
Run restart mysql on the active node.
Run mysql on the active node and enter the following at the prompts, replacing standby_password and standby_node_IP with the password and IP address for the standby node.
mysql> CREATE USER 'repl'@'standby_node_IP' IDENTIFIED BY 'standby_password';
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'standby_node_IP';
On the standby node, create /etc/mysql/conf.d/replication.cnf and include the following content:
[mysqld]
log-bin=mysql-bin
server-id=2
Run restart mysql on the standby node. Leave the ssh session to the standby node open.
... mysql. At the prompt, enter the command FLUSH TABLES WITH READ LOCK;. Leave the session open on the mysql prompt.
mysql> FLUSH TABLES WITH READ LOCK;
mysql>
$ mysqldump --all-databases --master-data >dbdump.db
Enter UNLOCK TABLES; at the mysql prompt. Exit mysql.
mysql> UNLOCK TABLES;
mysql> exit
Copy the dbdump.db file to the standby node, replacing standby_node_IP with the standby node's IP address.
$ scp dbdump.db root@standby_node_IP:
$ grep 'CHANGE MASTER TO MASTER_LOG' dbdump.db
This command will return a statement that includes the filename of the master log file and a master log position.
... standby_password is the one used for the user repl in step 5, and the grep_log_file and grep_position variables are the filename and position returned by the grep command in step 13.
$ mysql < /root/dbdump.db
$ mysql -e "CHANGE MASTER TO MASTER_HOST='active_node_IP', \
    MASTER_USER='repl', MASTER_PASSWORD='standby_password', \
    MASTER_LOG_FILE='grep_log_file', MASTER_LOG_POS=grep_position"
$ knife node edit `hostname -f`
$ vim /root/.my.cnf # Replace password
$ mysql -e 'start slave;'
The MySQL replication is now configured on the active and standby nodes. For more information about MySQL replication, refer to "How to Set Up Replication" in the MySQL documentation.
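The log file name and position that the grep step returns can also be extracted with a short script, and the replication threads can be checked once the slave is started. The sketch below is one way to do both. The function names are hypothetical, and the parsing assumes that the dump contains a CHANGE MASTER TO line with MASTER_LOG_FILE and MASTER_LOG_POS fields, and that the status check is given the output of SHOW SLAVE STATUS\G on standard input.

```shell
#!/bin/bash
# master_log_file DUMP (hypothetical helper)
# Prints the binary log file name recorded in the dump.
master_log_file() {
    sed -n "s/.*MASTER_LOG_FILE='\([^']*\)'.*/\1/p" "$1" | head -n 1
}

# master_log_pos DUMP (hypothetical helper)
# Prints the binary log position recorded in the dump.
master_log_pos() {
    sed -n 's/.*MASTER_LOG_POS=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# replication_running (hypothetical helper)
# Reads SHOW SLAVE STATUS\G output on stdin and succeeds only when both
# the I/O thread and the SQL thread report Yes.
replication_running() {
    local status
    status=$(cat)
    echo "$status" | grep -q 'Slave_IO_Running: Yes' &&
        echo "$status" | grep -q 'Slave_SQL_Running: Yes'
}

# Example, run on the standby node:
#   master_log_file /root/dbdump.db
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_running && echo "replicating"
```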
In the event that the active node fails and you need to switch to the standby node, follow this procedure.
#!/bin/bash
BACKUP_DIR=${BACKUP_DIR:-/var/lib/chef/backups}
# Restore Chef node attributes and environment overrides
for n in ${BACKUP_DIR}/{node,environment}/*.js; do
    knife $(basename $(dirname "${n}")) from file "${n}"
done
... 172.16.137.10 and the hostname is standby.myhost.com, the line would look like this:
# 172.16.137.10 standby.myhost.com
Remove the client.pem file from all the nodes in the cluster and run chef-client on all compute nodes.
© 2011-2013 Rackspace US, Inc.
Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License
