
Cloud Files Python Binding - Script Writing


This article is a continuation of the Cloud Files Python Binding Cookbook article and shows how you can use the python-cloudfiles Python binding to write some simple Python scripts that interact with your Cloud Files account. The purpose is to show you how a few lines of code can make powerful tools for working with your data. It is intended for those who are new to Python and want some help getting started.

The Basics
To make use of this article you will need Python installed, the python-cloudfiles binding, and the ability to run a script. The previous article covers the basics of using the binding and running Python scripts, so please refer back to it if you are unsure how any of this works.

Scripting
This article poses three problems and shows you how Python scripts can be used to tackle them. Each script will introduce some particular language/binding features that will help you to make the most of the binding.

Listing more than 10,000 Objects:
The Cloud Files API will never return a list of more than 10,000 objects for any single API request. If you have containers with more objects than this you will need to understand how to use the binding to make multiple requests to get all the objects.

This capability is something you will require on a frequent basis if you are working with containers that hold large numbers of objects. The following script shows how to generate a list of all the objects in a container. Remember that any line starting with a # is a comment in the code.

#first we need to make the binding and the sys module, from the Python library,
#available to the script

import cloudfiles
import sys

#sys.argv is a list of the arguments passed to Python when you run the script
#on the commandline. The script expects the format of the command to be
#usage: python script_name.py username apikey container_name

username = sys.argv[1]
apikey = sys.argv[2]
cont_name = sys.argv[3]

#US-based Cloud Files accounts - uncomment if your account is in the US
#conn = cloudfiles.get_connection(username, apikey)
#UK-based Cloud Files accounts - comment out if your account is in the US

conn = cloudfiles.get_connection(username, apikey, authurl=cloudfiles.uk_authurl)

#connect to container
cont = conn.get_container(cont_name)


#create empty list to hold the complete list of objects
object_list = []

#this will get a list of the names of the first 10000 objects in the container
list_of_10000 = cont.list_objects()

#the while loop is used to keep getting the next 10000 objects. It only ends
#when list_of_10000 is empty

while list_of_10000:
    #extend adds the currently grabbed list_of_10000 to object_list

    object_list.extend(list_of_10000)

    #the API defines a marker as a way to select the first object in
    #the list to be returned. This sets the marker to the name of the last
    #object we have received

    mark = object_list[-1]

    #this grabs the next 10000 objects. The marker says that the list should
    #start at the first object greater than the marker

    list_of_10000 = cont.list_objects(marker=mark)


#prints the list of objects one per line

print '\n'.join(object_list)
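
The marker loop above can also be packaged as a generator, which hands back names as each page arrives instead of building the whole list first. The following is a minimal sketch, not part of the binding: iter_object_names and FakeContainer are hypothetical names, and FakeContainer only imitates the list_objects(marker=...) behaviour described above (with a small page size) so that the example runs without a Cloud Files account.

```python
def iter_object_names(container):
    """Yield every object name in a container, one page at a time.

    container is assumed to offer list_objects(marker=...), returning
    names that sort after the marker and an empty list when none remain.
    """
    page = container.list_objects()
    while page:
        for name in page:
            yield name
        # the last name of this page is the marker for the next request
        page = container.list_objects(marker=page[-1])


class FakeContainer(object):
    """Hypothetical stand-in for a container; not the real binding."""

    def __init__(self, names, page_size):
        self.names = sorted(names)
        self.page_size = page_size

    def list_objects(self, marker=None):
        # return the names that sort after the marker, up to one page
        if marker is None:
            remaining = self.names
        else:
            remaining = [n for n in self.names if n > marker]
        return remaining[:self.page_size]


fake = FakeContainer(['a.png', 'b.png', 'c.txt', 'd.png', 'e.txt'], page_size=2)
all_names = list(iter_object_names(fake))
print('\n'.join(all_names))
```

With a real account you would pass the cont object from the script in place of fake, assuming it behaves as the script above expects.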

Updating a list of objects:
One of the most powerful things about scripting is the ability to do the same thing to multiple objects, thereby reducing lots of repetitive manual work. With Python it's simple to take a list of objects and update each one in the list based on certain criteria.

Use Case Example
Imagine you have uploaded a list of objects to a container using a third-party tool, and the content type on all the objects is incorrect. To re-upload the objects with the correct content type may take a long time depending on the size and number of files. Trying to update the content type for each file individually would also be cumbersome.

Instead of a manual process, you could create a quick script that runs through the list of files and modifies the content type. Let's say that the uploaded files were PNG files, which means you would want to set a content type of image/png. Take a look at the script below and its comments to see how this can be done:


import cloudfiles
import sys

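#usage: python script_name.py username apikey region container_name
#note: region is read but not used; the connection lines below decide
#which auth URL is used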

username = sys.argv[1]
apikey = sys.argv[2]
region = sys.argv[3]
cont_name = sys.argv[4]

#US-based Cloud Files accounts - uncomment if your account is in the US
#conn = cloudfiles.get_connection(username, apikey)
#UK-based Cloud Files accounts - comment out if your account is in the US

conn = cloudfiles.get_connection(username, apikey, authurl=cloudfiles.uk_authurl)

#connect to container
cont = conn.get_container(cont_name)

#this grabs a list of the first 10,000 objects in the container. If there are
#more than 10,000 objects you might want to try combining this with the script
#above

objs = cont.get_objects()

#the for loop takes each object in turn and then runs the following block of code
for obj in objs:

    #the if statement first takes the last 4 characters of the object name and
    #converts them to uppercase then checks if they are equal to .PNG
    #if the name ends with .PNG the if statement then makes sure that the
    #content type isn't already correctly set.

    if '.PNG' in obj.name[-4:].upper() and obj.content_type != 'image/png':

        #this sets the correct content type on the Python object

        obj.headers['Content-Type'] = 'image/png'

        #this syncs the new content type with Cloud Files

        obj.sync_metadata()
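
The test in the if statement can be pulled out into a small function, which makes it easy to try against sample names before running the script on real data. This sketch is an illustration only: needs_png_type is a hypothetical helper, and it uses the string method endswith in place of the slice comparison in the script.

```python
def needs_png_type(name, content_type):
    """Return True if the object looks like a PNG but is not typed as one."""
    is_png = name.upper().endswith('.PNG')
    return is_png and content_type != 'image/png'


# (name, current content type) pairs standing in for objects in a container
samples = [
    ('logo.png', 'application/octet-stream'),
    ('photo.PNG', 'image/png'),
    ('notes.txt', 'text/plain'),
]
to_fix = [name for name, ctype in samples if needs_png_type(name, ctype)]
print(to_fix)
```

Only logo.png is selected here: photo.PNG already has the right content type and notes.txt is not a PNG file.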

Handling Errors:
A script can only work within the confines of the code that defines it. If something happens that the code cannot handle, it will raise an exception and exit. If you know that a particular exception may happen you can write your code so that the program can continue running, even after the exception is raised. The Python language defines many exceptions for all its different functionality. The python-cloudfiles binding also defines a number of exceptions that are specific to the capabilities it provides. These are found in the binding's cloudfiles.errors module.
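
As a small illustration of this idea, the scripts in this article read sys.argv by index, so running one with too few arguments raises the built-in IndexError. The sketch below catches that exception and prints a usage message rather than a traceback. The helper read_args and the sample argument values are hypothetical, and this is one possible approach, not something the binding requires.

```python
import sys


def read_args(argv):
    """Return (username, apikey, container_name), or None if any is missing."""
    try:
        return argv[1], argv[2], argv[3]
    except IndexError:
        # fewer than three arguments were given on the command line
        sys.stderr.write(
            'usage: python script_name.py username apikey container_name\n')
        return None


# a complete command line, then one that is missing two arguments
print(read_args(['script_name.py', 'myuser', 'mykey', 'mycontainer']))
print(read_args(['script_name.py', 'myuser']))
```

In a real script you would call read_args(sys.argv) and exit if it returns None.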

Use Case Example
Say you are tidying up your Cloud Files account and are looking to delete a number of containers. If you have a list of the containers, you could provide that to a script for it to run through and delete the containers one after another. You've already deleted all the objects from the containers, so it should just be a case of requesting that each container be deleted, right?

The problem with this is that, if you forgot to delete some objects from a container or one has had new objects created in it, an exception will get thrown when the script tries to delete the container. If an exception is thrown but not caught within the code, the script will exit. This means that you would have to keep watching the script to make sure that it completes. Alternatively, because you know the exception may get thrown you can have your code do something useful when this happens, and then proceed with deleting the next container. The script is full of comments that explain how it's put together:

import cloudfiles
import sys


#usage: python script_name.py username apikey region container_name_list

username = sys.argv[1]
apikey = sys.argv[2]
#region is read but not used; the connection lines below decide which
#auth URL is used
region = sys.argv[3]
cont_list_file = sys.argv[4]


#US-based Cloud Files accounts - uncomment if your account is in the US
#conn = cloudfiles.get_connection(username, apikey)
#UK-based Cloud Files accounts - comment out if your account is in the US

conn = cloudfiles.get_connection(username, apikey, authurl=cloudfiles.uk_authurl)


#grab the list of newline-separated container names

f = open(cont_list_file, 'r')
lines = f.readlines()
f.close()

#strips any whitespace from the beginning and end of the container names
cont_list = [line.strip() for line in lines if line.strip()]

#the for loop goes through the list of container names one at a time
for cont_name in cont_list:

    #the try statement contains the code that may throw the error to catch
    try:

        #this is the request to delete the container
        conn.delete_container(cont_name)

    #if the request to delete the container fails because it is not empty the
    #exception cloudfiles.errors.ContainerNotEmpty is thrown and the except
    #code is then run.

    except cloudfiles.errors.ContainerNotEmpty:

        #the name of the container is printed to the terminal
        print cont_name
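
The same try/except pattern can record the outcome for every container, so that the skipped ones can be dealt with afterwards. The following is a sketch under stated assumptions: delete_all, FakeConnection, and the local ContainerNotEmpty class are hypothetical stand-ins that let the example run without an account. With the real binding you would pass conn and cloudfiles.errors.ContainerNotEmpty instead.

```python
class ContainerNotEmpty(Exception):
    """Stand-in for cloudfiles.errors.ContainerNotEmpty (illustration only)."""


class FakeConnection(object):
    """Hypothetical connection that refuses to delete non-empty containers."""

    def __init__(self, object_counts):
        # maps container name -> number of objects it holds
        self.object_counts = dict(object_counts)

    def delete_container(self, name):
        if self.object_counts[name] > 0:
            raise ContainerNotEmpty(name)
        del self.object_counts[name]


def delete_all(conn, names, not_empty_error):
    """Try to delete each container; return (deleted, skipped) name lists."""
    deleted, skipped = [], []
    for name in names:
        try:
            conn.delete_container(name)
        except not_empty_error:
            # container still holds objects: note it and carry on
            skipped.append(name)
        else:
            deleted.append(name)
    return deleted, skipped


fake = FakeConnection({'logs': 0, 'images': 3, 'backups': 0})
deleted, skipped = delete_all(fake, ['logs', 'images', 'backups'],
                              ContainerNotEmpty)
print('deleted: ' + ', '.join(deleted))
print('not empty: ' + ', '.join(skipped))
```

The else clause runs only when no exception was raised, which keeps the bookkeeping for successful deletions out of the try block.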

Summary
The objective of this article was to build on the Cloud Files Python Binding Cookbook. Here we showed you some simple examples of how scripting can be used to improve productivity - specifically: how to work with large containers, selectively update objects in containers, and handle errors. If you are interested in learning more, the following links will likely help:

Cloud Files API documentation

Cloud Files API Python Binding

Official Python Website



© 2011-2013 Rackspace US, Inc.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License

