Using Task States With Server Imaging


Some new task states have been exposed in the OpenStack Server Extended Status Extension that provide more fine-grained visibility into server state during the image-create (or "snapshot") process.  In this article, I'll tell you what they are and make some suggestions for how you can make use of them.

Before the OpenStack Grizzly release, when you requested a create image action on a server, the server would go into a special "task state" of image_snapshot, and it would stay in this task state until the image was completed.  This single task state hid the fact that there are three distinct phases to a snapshot operation:

  1. The hypervisor creates an image of the server's virtual hard disk.
  2. The hypervisor packages the image and prepares it for upload to the image store.
  3. The hypervisor uploads the packaged image to the image store.

By far, the phase that takes the longest is the third phase, in which the upload occurs.  Further, in both phases 2 and 3, while the hypervisor is working on your server's behalf, it isn't doing anything with your virtual hard disk.  So only during phase 1 is it important that you avoid any operations that would modify data on the server's virtual hard disk.  (Otherwise, the recorded snapshot might include inconsistencies from which certain application programs on your server--I'm thinking primarily of databases here--might not be able to recover when you boot from the image.)

With the OpenStack Grizzly release, the semantics of the image_snapshot task state were modified slightly and two new task states were added.  So now the task states that your server will go through while processing an image create action are:

  1. image_snapshot: the hypervisor is creating an image of the server's virtual hard disk
  2. image_pending_upload: the hypervisor is packaging the image and preparing it for upload
  3. image_uploading: the hypervisor is uploading the image to the image store

While your server is in any of these task states, you can't issue another create image action on that server.  As you can see from the task state descriptions, the hypervisor is involved in all three phases of the image create action, so all the extra bookkeeping resources the hypervisor has allocated to your server are in use.  You have to wait until the entire snapshot process has completed and these resources are released before you can create another snapshot.

Notice, however, that once the first phase is completed, you no longer have to worry about operations occurring on your server that might interfere with the effectiveness of your snapshot.  Unfortunately, server task states are not currently exposed in the control panel.  You can, however, easily check them by using the API or the python-novaclient.
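
To make the distinction concrete, here's a minimal sketch of a bash helper that sorts a task state into the situations described above.  The function name snapshot_phase and the labels it prints are my own invention for illustration; they aren't part of any API.  I'm also assuming that an idle server reports "None" (as in the samples later in this article), and I accept "null" or an empty string as equivalents just in case.

```bash
# Hypothetical helper: classify a task state string for snapshot purposes.
# Prints one of: quiesce, busy, idle, other
snapshot_phase() {
  case "$1" in
    image_snapshot)
      echo "quiesce" ;;  # phase 1: the disk is being imaged, keep writers stopped
    image_pending_upload|image_uploading)
      echo "busy" ;;     # phases 2 and 3: writes are safe, but no new snapshot yet
    None|null|"")
      echo "idle" ;;     # no task in progress, a new snapshot can be requested
    *)
      echo "other" ;;    # some task this article doesn't cover
  esac
}
```

For example, "snapshot_phase image_uploading" prints "busy": you can resume normal activity on the server, but you can't start another snapshot yet.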

Using the API to Check Server Task State

The task states are included in the server detail response:

GET /v2/{tenantId}/servers/{serverId}

Here's an abbreviated XML response:

<server xmlns="http://docs.openstack.org/compute/api/v1.1"
        xmlns:OS-EXT-STS="http://docs.openstack.org/compute/ext/extended_status/api/v1.1"
        ...
        status="ACTIVE"
        name="check-my-task-state"
        progress="100"
        id="933e803f-13b0-4698-a5c7-f74ec424fd38"
        OS-EXT-STS:vm_state="active"
        OS-EXT-STS:task_state="None"
        OS-EXT-STS:power_state="1">
  <!-- etc -->
</server>

The task state is a namespaced attribute of the 'server' element, in the namespace "http://docs.openstack.org/compute/ext/extended_status/api/v1.1".  (The namespace prefix is usually OS-EXT-STS, but if you've worked much with XML you know to rely on the namespace itself and to treat the namespace prefix as arbitrary.)  The task state is the next-to-last attribute listed in the sample response above.  Its value of "None" means that no task is in progress on this server.
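
If you want to pull the attribute out by namespace rather than by prefix, something like the following sketch should work.  It assumes you have xmllint (part of libxml2, not always installed by default) and that you've saved a complete server detail response to a file.  The function name task_state_from_xml and the variable EXT_STS_NS are names I made up for this example.

```bash
# Namespace URI of the Extended Status extension (taken from the response above).
EXT_STS_NS="http://docs.openstack.org/compute/ext/extended_status/api/v1.1"

# Hypothetical helper: print the task state found in the XML file named by $1.
# It matches on the namespace URI, so it works whatever prefix the response uses.
task_state_from_xml() {
  xmllint --xpath \
    "string(/*[local-name()='server']/@*[local-name()='task_state' and namespace-uri()='$EXT_STS_NS'])" \
    "$1"
}
```

Wrapping the expression in string() makes xmllint print just the attribute's value instead of the whole name="value" pair.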

Here's a sample JSON server detail response:

{
    "server": {
        "OS-EXT-STS:power_state": 1,
        "OS-EXT-STS:task_state": "image_pending_upload",
        "OS-EXT-STS:vm_state": "active",
        /* ... */
        "id": "c2d5da0a-80d7-4ca7-872c-505410ab55d0",
        /* ... */
        "name": "check-my-task-state",
        "progress": 100,
        "status": "ACTIVE"
    }
}

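In JSON, you're looking for the key named "OS-EXT-STS:task_state".  (Since a JSON object is unordered, it might not be right near the top of the response, as it is in this example.)  Here its value is image_pending_upload, so the hypervisor has finished creating the image of the server's virtual hard disk and is now packaging and preparing it for upload.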
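
Here's a small sketch of how you might extract that value from a saved JSON response.  It assumes you have jq, a separate command-line JSON tool that you may need to install, and that the file holds a complete response (the abbreviated sample above, with its "/* ... */" placeholders, isn't valid JSON as shown).  The function name task_state_from_json is hypothetical, and I'm assuming an idle server may report a JSON null, which the sketch prints as "None".

```bash
# Hypothetical helper: print the task state found in the JSON file named by $1.
# The key contains a colon, so it has to be quoted inside square brackets.
task_state_from_json() {
  jq -r '.server["OS-EXT-STS:task_state"] // "None"' "$1"
}
```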

Using the Python-Novaclient to Check Server Task State

The python-novaclient is a handy program you can run from the command line.  If you haven't used it before, the Knowledge Center articles on python-novaclient provide an overview and complete instructions for installing it on your operating system of choice.

To see the task state for a server using the python-novaclient, simply do a "show" on the server:

$ nova show {serverId}

Here's an abbreviated response:

+------------------------+---------------------------------------+
| Property               | Value                                 |
+------------------------+---------------------------------------+
| status                 | ACTIVE                                |
| OS-EXT-STS:task_state  | None                                  |
| OS-EXT-STS:vm_state    | active                                |
| id                     | 933e803f-13b0-4698-a5c7-f74ec424fd38  |
| name                   | check-my-task-state                   |
| OS-DCF:diskConfig      | MANUAL                                |
| progress               | 100                                   |
| OS-EXT-STS:power_state | 1                                     |
| metadata               | {}                                    |
+------------------------+---------------------------------------+

In this example, you can see that there's no task state for the server, so it could accept an image-create request.
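
If you'd rather not pick the row out by eye, a short awk filter can do it for you.  This is only a sketch: the function name task_state_from_table is my own, and it assumes the table layout shown above (property name in the first column, value in the second).

```bash
# Hypothetical helper: read "nova show" table output on stdin and print
# the value of the OS-EXT-STS:task_state row.
task_state_from_table() {
  awk -F'|' '
    {
      key = $2; val = $3
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the property name
      gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim padding around the value
      if (key == "OS-EXT-STS:task_state") print val
    }'
}

# Usage (assumes python-novaclient is installed and configured):
#   nova show {serverId} | task_state_from_table
```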

 

Polling to Check Server Task State

The use case for server task states we're discussing here is the ability to:

  1. Stop activities on the server that would affect the quality of the disk image (e.g., stop a database management system).
  2. Issue an image-create command (via API, novaclient, or control panel) for the server.
  3. Monitor the server to see when it exits the 'image_snapshot' task state.
  4. Restart the activities stopped before you took the snapshot (e.g., bring your database management system back up).
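
As a rough sketch, those four steps might be wired together like this.  Every function name here is hypothetical, and the bodies of stop_writers and start_writers are placeholders you'd replace with whatever stops and restarts your application.  I'm assuming python-novaclient is installed and configured, and that an idle server reports a task state of "None".  Note that the sketch restarts your application even if it gives up waiting; whether that's the right choice for you is a judgment call.

```bash
# Hooks for steps 1, 2 and 4. Replace the placeholder bodies to fit your setup.
stop_writers()  { echo "stop your database here"; }      # step 1 (placeholder)
request_image() { nova image-create "$1" "$2"; }         # step 2
start_writers() { echo "restart your database here"; }   # step 4 (placeholder)

# Print the current task state of server $1, e.g. "None" or "image_snapshot".
get_task_state() {
  nova show "$1" |
    awk -F'|' '$2 ~ /OS-EXT-STS:task_state/ { gsub(/[ \t]/, "", $3); print $3 }'
}

# snapshot_with_quiesce SERVER IMAGE_NAME [PAUSE_SECONDS] [MAX_POLLS]
snapshot_with_quiesce() {
  local server="$1" image_name="$2" pause="${3:-5}" max_polls="${4:-60}"
  local state started=0 polls=0

  stop_writers || return 1
  if ! request_image "$server" "$image_name"; then
    start_writers            # the request failed, so undo step 1
    return 1
  fi

  # Step 3: wait until the server has entered and then left image_snapshot.
  while [ "$polls" -lt "$max_polls" ]; do
    state=$(get_task_state "$server")
    case "$state" in
      image_snapshot) started=1 ;;                  # phase 1 is under way
      "")             ;;                            # lookup failed, try again
      None)           [ "$started" -eq 1 ] && break ;;  # whole snapshot finished
      *)              break ;;                      # phase 1 is over
    esac
    polls=$((polls + 1))
    sleep "$pause"
  done

  start_writers
  [ "$polls" -lt "$max_polls" ]   # non-zero exit status if we gave up waiting
}
```

The loop deliberately tolerates a "None" reading before it has seen image_snapshot, because the task state can take a moment to change after you issue the request.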

You can write a simple bash script to monitor your server.  How elaborate you want the script to be is up to you; here's a sample of the most relevant part.  (Please read through it and make sure you know what it's doing before using it.)  This script uses a program named "xmllint" to parse an XML server detail response; this program may not be installed by default on your system.  This fragment is pretty primitive: it keeps looping until the server disappears, so you have to press Control-C to stop it.

# set these vars
#
# the API endpoint, e.g., "https://iad.servers.api.rackspacecloud.com/v2/123456"
API_ENDPOINT=
# your API username, e.g., "fredco"
API_USER=
# your API auth token, obtained from the Rackspace cloud identity service
API_AUTH_TOKEN=
# the UUID of the server you want to monitor
API_SERVER=
# how long to pause in between requests, in seconds
SLEEP_TIME=30
# a temporary file, e.g., "/tmp/polling.xml"
DETAIL_FIL=

# verify that the server exists
API_RESP_CODE=$(curl -X GET \
 -k -s \
 -H "X-Auth-User: $API_USER" \
 -H "X-Auth-Token: $API_AUTH_TOKEN" \
 -H "Accept: application/json" \
 -w "%{http_code}" \
 -o "$DETAIL_FIL" \
 "$API_ENDPOINT/servers/$API_SERVER")
if [ "$API_RESP_CODE" != "200" ] ; then
  echo "[error] can't find server $API_SERVER"
  exit 1
fi

# poll until the server disappears (or you press Control-C)
while true ; do
  API_RESP_CODE=$(curl -s -k -X GET \
    -H "X-Auth-User: $API_USER" \
    -H "X-Auth-Token: $API_AUTH_TOKEN" \
    -H "Accept: application/xml" \
    -w "%{http_code}" \
    -o "$DETAIL_FIL" \
    "$API_ENDPOINT/servers/$API_SERVER")
  if [ "$API_RESP_CODE" = "404" ] ; then
    echo "[info] server $API_SERVER has disappeared!"
    break
  fi
  # pull each state attribute out of the XML; local-name() ignores the namespace prefix
  RAW_STAT=$(xmllint --xpath '/*[local-name()="server"]/@status' "$DETAIL_FIL")
  VM_STAT=$(xmllint --xpath '/*[local-name()="server"]/@*[local-name()="vm_state"]' "$DETAIL_FIL")
  TASK_STAT=$(xmllint --xpath '/*[local-name()="server"]/@*[local-name()="task_state"]' "$DETAIL_FIL")
  POW_STAT=$(xmllint --xpath '/*[local-name()="server"]/@*[local-name()="power_state"]' "$DETAIL_FIL")
  TIME=$(date +"%H:%M:%S")
  # ${VAR##*:} strips everything through the last colon, i.e., the namespace prefix
  echo "$TIME  $RAW_STAT  ${VM_STAT##*:}  ${TASK_STAT##*:}  ${POW_STAT##*:}"
  sleep "${SLEEP_TIME:-45}"
done

If you start a script that contains the above fragment and then take a server snapshot, you'll see something like this:

18:12:26   status="ACTIVE"  vm_state="active"  task_state="None"  power_state="1"
18:12:29   status="ACTIVE"  vm_state="active"  task_state="None"  power_state="1"
18:12:31   status="ACTIVE"  vm_state="active"  task_state="None"  power_state="1"
18:12:33   status="ACTIVE"  vm_state="active"  task_state="image_snapshot"  power_state="1"
18:12:36   status="ACTIVE"  vm_state="active"  task_state="image_snapshot"  power_state="1"
18:12:38   status="ACTIVE"  vm_state="active"  task_state="image_pending_upload"  power_state="1"
18:12:41   status="ACTIVE"  vm_state="active"  task_state="image_pending_upload"  power_state="1"
18:12:43   status="ACTIVE"  vm_state="active"  task_state="image_uploading"  power_state="1"
18:12:47   status="ACTIVE"  vm_state="active"  task_state="image_uploading"  power_state="1"
18:12:49   status="ACTIVE"  vm_state="active"  task_state="image_uploading"  power_state="1"
18:12:51   status="ACTIVE"  vm_state="active"  task_state="image_uploading"  power_state="1"
 ...
18:57:23   status="ACTIVE"  vm_state="active"  task_state="image_uploading"  power_state="1"
18:57:45   status="ACTIVE"  vm_state="active"  task_state="image_uploading"  power_state="1"
18:58:08   status="ACTIVE"  vm_state="active"  task_state="None"  power_state="1"
18:58:31   status="ACTIVE"  vm_state="active"  task_state="None"  power_state="1"
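
Most of those lines repeat the previous one, so you may prefer to see only the moments when the task state changes.  Here's a sketch of a filter for that, assuming the output format shown above (one line per poll, with a task_state="..." field).  The function name task_state_transitions is my own, and so is the script name poll.sh in the usage comment.

```bash
# Hypothetical filter: read the polling output on stdin and print only the
# lines whose task_state field differs from the previous line's.
task_state_transitions() {
  awk '
    {
      state = ""
      for (i = 1; i <= NF; i++)
        if ($i ~ /^task_state=/) state = $i
      if (state == "") next        # skip lines with no task_state field
      if (state != last) print
      last = state
    }'
}

# Usage, if the polling fragment lives in a script named poll.sh:
#   ./poll.sh | task_state_transitions
```

The timestamps on the lines that survive tell you roughly when each phase began, to within one polling interval.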


© 2011-2013 Rackspace US, Inc.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License

