
Best Practices for Using Cloud Files


What is Cloud Files used for?

There are many ways to utilize Cloud Files, but it is strongest when functioning as unlimited object storage on the Cloud, or when used as a website accelerator with the Content Delivery Network (CDN).  Here we've compiled a few recommendations on ways to get the most out of Cloud Files.  

We'll follow a couple of use-case scenarios here: Cloud Files as an object storage solution, and Cloud Files for website acceleration through the CDN.

Object Storage 

Please remember that, at its core, Cloud Files is an object storage solution and is not designed for high IOPS (Input/Output Operations Per Second).  Instead, Cloud Files is designed for consistent reliability of data.  The primary function of Cloud Files is to ensure that your data is available when you ask for it.  This works best with relatively static files, as opposed to files that are frequently updated.  As a result, it is impractical to run a database out of Cloud Files.  You can't expect to write to the same object 20 times per second; Cloud Files is not designed for that.  It was designed so that when you write to an object in Cloud Files, that object will be there each and every time you call for it.

Organize Content for Web Acceleration

When using Cloud Files for web acceleration, a basic organizational structure would separate your content into different containers based on object type (for example): images, css, javascript, videos, uploaded content, etc.  This structure enables quick location of objects when you need them.  

 

Container Management

Using Multiple Containers

To improve organization and ease of access to your content, our strongest recommendation is to divide your files into multiple containers.  Each container has performance limitations on the number of writes per second, including uploads and updates to files.  If you use only one container to store all of your files, you will likely see performance suffer as your project scales, because every write is subject to that single container's write limit.  With multiple containers, it is also easier to locate files by their logical path, since listing times are reduced.  Remember, you can have up to 500,000 containers in Cloud Files.

How to Label Your Containers

When organizing your containers for an object storage solution, we recommend labeling each container by type of storage, perhaps based on the segment of the application that accesses it.  Attach an incremental number to the name so that you can plan ahead in case the number of objects gets too high, for example: personnel-00000.  A basic setup for web acceleration would be to organize your content in separate containers by object type, for example: images, css, javascript, videos, uploaded content, etc.  This structure enables quick location of objects when you need them.
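
As a rough illustration of the web-acceleration layout, the sketch below picks a container for a file based on its extension. The extension table and the fallback container name `uploads` are assumptions made for this example, not a Cloud Files convention.

```python
import os

# Hypothetical mapping from file extension to container name.
CONTAINER_BY_EXTENSION = {
    ".jpg": "images",
    ".png": "images",
    ".gif": "images",
    ".css": "css",
    ".js": "javascript",
    ".mp4": "videos",
    ".flv": "videos",
}


def container_for(filename, default="uploads"):
    """Return the container a file belongs in, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    return CONTAINER_BY_EXTENSION.get(extension, default)
```

Anything with an unrecognized extension falls through to the default container, which keeps user-supplied content apart from the site's own assets.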

Limit the Number of Objects in Your Container

For the best performance, we recommend limiting each container to 500,000 objects.  This means that any dynamically or organically growing system should have a fragmentation, or sharding, scheme built in.  You can keep track of the object count locally and verify it against the container by performing a HEAD on the container and checking the Object Count.  For any container that could exceed this limit, a common convention is to name the container with the objective name followed by a hyphen and a number.  Example: uploads-00000
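
A minimal sketch of that sharding convention follows. It assumes you have already performed the HEAD request and hold the response headers as a dictionary; the function names are hypothetical, and `X-Container-Object-Count` is the header in which the object count is expected to arrive.

```python
OBJECT_LIMIT = 500_000  # recommended maximum number of objects per container


def shard_name(base, index):
    """Format a container name such as 'uploads-00000'."""
    return f"{base}-{index:05d}"


def object_count(head_headers):
    """Read the object count from the headers of a container HEAD response."""
    normalized = {key.lower(): value for key, value in head_headers.items()}
    return int(normalized["x-container-object-count"])


def next_container(base, index, head_headers, limit=OBJECT_LIMIT):
    """Return (name, index) of the container the next upload should use."""
    if object_count(head_headers) >= limit:
        index += 1  # current shard is full, so move on to the next one
    return shard_name(base, index), index
```

For example, `next_container("uploads", 0, {"X-Container-Object-Count": "500000"})` moves on to `uploads-00001`, while a lower count keeps writing to `uploads-00000`.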

Keep a Local Database of Your Container Structure

For users with a larger number of files, another recommendation is to keep a local copy of the container structure and listing so that you are not waiting on the container to list all the objects, 10,000 at a time.  You can do this in a local database, which significantly reduces the chance of a naming conflict and the time needed to locate a specific object.  When an object needs to be updated, you already know which container holds it, so you can update it directly without listing all the containers, and updates take significantly less time.  This tip is best employed by customers using Cloud Files for an object storage solution, since these stores are frequently accessed programmatically and grow organically over time.  It also applies to any site that allows for additional content, such as an uploads section, which may quickly grow beyond the webmaster's expectations.
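
One way to keep such a local index is a small SQLite table, sketched here with Python's standard `sqlite3` module. The table layout and function names are assumptions for illustration; making the object name the primary key is a design choice that lets the database itself reject naming conflicts.

```python
import sqlite3


def open_index(path=":memory:"):
    """Create (or open) the local index of object locations."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " name TEXT PRIMARY KEY,"      # unique names, so duplicates are rejected
        " container TEXT NOT NULL)"
    )
    return db


def record_upload(db, name, container):
    """Record where an object was stored.

    Raises sqlite3.IntegrityError if the name is already in the index.
    """
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO objects (name, container) VALUES (?, ?)",
            (name, container),
        )


def find_container(db, name):
    """Return the container holding an object, or None if it is unknown."""
    row = db.execute(
        "SELECT container FROM objects WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

With this in place, an update looks up the container locally with `find_container` and then writes to that one container, with no listing calls against Cloud Files.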

 

Pathing

Remember, containers in Cloud Files do not nest, and all objects in a single container are subject to the same limitations.  There are applications which will fake a folder structure by adding the path to the beginning of the object name, which works for pathing in the CDN URLs as well.  This allows for virtual pathing if needed.  For object storage, this allows for better subdivision of slow-growing, closely grouped data, meaning that you're unlikely to need to divide it out again later.  For website acceleration, this allows pathing that displays in the browser, for example: http://c123456.r02.cf3.rackcdn.com/ducks/funny/duckling.jpg where the name of the object is ducks/funny/duckling.jpg.
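
The virtual pathing described above comes down to string handling on the object name. The helper names below are hypothetical, and the sketch assumes you already know the container's CDN base URL.

```python
from urllib.parse import quote


def object_name(*parts):
    """Join path segments into one object name, e.g. 'ducks/funny/duckling.jpg'."""
    return "/".join(part.strip("/") for part in parts if part.strip("/"))


def cdn_url(container_cdn_base, name):
    """Build the public URL for an object in a CDN-enabled container."""
    # quote() leaves '/' unescaped by default, so the virtual path stays readable.
    return container_cdn_base.rstrip("/") + "/" + quote(name)
```

Calling `cdn_url("http://c123456.r02.cf3.rackcdn.com", object_name("ducks", "funny", "duckling.jpg"))` produces the URL from the example above.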

 

Removing Containers and Container Data

All objects in a container must be deleted before the container itself can be deleted.  Multiple containers allow for better threading of the deletion scripts.  Content grows regardless of the use case, so it’s best to plan ahead for this.
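
The ordering constraint and the threading benefit can be sketched as follows. `InMemoryStore` is a stand-in written for this example, not a real Cloud Files client; a real script would replace its three methods with calls to whatever client library you use.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryStore:
    """Stand-in for a storage client: maps container name to a set of object names."""

    def __init__(self, containers):
        self.containers = {name: set(objs) for name, objs in containers.items()}

    def list_objects(self, container):
        return list(self.containers[container])

    def delete_object(self, container, name):
        self.containers[container].discard(name)

    def delete_container(self, container):
        if self.containers[container]:
            raise RuntimeError("container is not empty")
        del self.containers[container]


def purge_container(store, container):
    """Delete every object in a container, then the container itself."""
    for name in store.list_objects(container):
        store.delete_object(container, name)
    store.delete_container(container)


def purge_all(store, containers, workers=4):
    """Purge several containers concurrently, each handled by one worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so an exception raised in a worker surfaces here.
        list(pool.map(lambda c: purge_container(store, c), containers))
```

Because each worker touches only its own container, spreading content over several containers gives the deletion script independent units of work to run in parallel.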

 

TTL

In addition to the performance benefit of reduced listing times, using multiple containers lets you set the TTL (Time To Live value) for objects on the CDN appropriately.  This means that you could set longer TTL values on content that is unlikely to change often.  For website acceleration, you could use a longer TTL on files that are updated infrequently, such as your CSS, javascript, and videos.  At the same time you can set a shorter TTL on objects for which you need content control, such as user uploads.  Using an appropriate TTL for each object type will improve performance: the longer the TTL, the less often you will see performance drop while the CDN cache is renewed from your Cloud Files store.  It's unlikely you would use the CDN for an internal object storage solution, since there is no method of content control.
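
A sketch of per-container TTL selection follows. The TTL values are illustrative assumptions rather than recommendations, and the `X-TTL` header name reflects the Cloud Files CDN API as we understand it, so check it against the current API reference before relying on it.

```python
# Illustrative TTL choices in seconds; these values are assumptions.
TTL_BY_CONTAINER = {
    "css": 7 * 24 * 3600,         # rarely changes
    "javascript": 7 * 24 * 3600,  # rarely changes
    "videos": 7 * 24 * 3600,      # rarely changes
    "uploads": 15 * 60,           # needs tighter content control
}
DEFAULT_TTL = 24 * 3600


def ttl_headers(container):
    """Build the header dict for a request that sets a container's CDN TTL."""
    ttl = TTL_BY_CONTAINER.get(container, DEFAULT_TTL)
    return {"X-TTL": str(ttl)}
```

Keeping the table in one place makes it easy to review which containers trade freshness for fewer cache renewals.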

 



© 2011-2013 Rackspace US, Inc.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License



24 Comments

Hi,

If somebody has the direct url to a file in the CDN. Ex http://a1e2d66ad8fca0ec3b0c-d921e3174d8d249c32b31ae0d54f3033.r95.cf2.rackcdn.com/test.flv

Is there a mechanism in place that will ask the user for a username and password prior to serving the file ?

Thanks

Unfortunately there isn't a mechanism for that in place for CDN-enabled files at this time. You might add that as a suggestion on our product feedback forum, or vote for a similar request (I know there's a topic asking for temporary/expiring URLs, for example).

http://feedback.rackspace.com/forums/71021-product-feedback

So when an object is given new content, how fast is this content updated?

Hey Mient-Jan,

Your content should update instantly. Depending on your location, you may at times have to wait about 5 minutes.

Let us know if you have any other questions.
-Rae

I assume that if I use the Rackspace CDN to supply images to my website, setting up my .htaccess file to disallow hot linking will not work for the Rackspace files, since they are not located on my website's server.

Is there a way to prevent someone from viewing source and simply copying and pasting an image's rackcdn.com URL into their website HTML and using it for themselves?

If there is not a way to prevent it, is there a way to detect if it is being done and preventing their access?

You are correct in that the images are hosted on the CDN, so you wouldn't be able to manage access directly. There is some discussion of how to keep track of accesses and ways to obfuscate your page source in this article:

http://www.rackspace.com/knowledge_center/article/protect-your-cloud-files-cdn-bill-from-unexpected-usage-0

Thanks Jered. Being a really small time freelancer I probably won't use Cloud Files for anything that might be hot linked because in the very unlikely event of it happening, I can't afford to get hammered with a big bill, and it's not worth it to me to invest the time and effort to monitor the situation.

I'll figure out some use for Cloud Files though, because I'm a geek and this stuff is cool.

Is there any way to access cloud files from a cloud site (internally) without having to publish to CDN? IE: Access cloud files locally from my site? Also, if not, and I put a cloud files object (image/video) on my cloud site, am I charged for the bandwidth to copy the file from Cloud Files to my Cloud Sites site, and then again to the end user?

Right now there isn't an internal way to publish items in Cloud Files through Cloud Sites apart from linking to the file on the CDN. The file would be served directly from the CDN, so you would only be charged for the bandwidth once (the one time the visitor retrieves the file).

Are you sure? What I'm saying is that I have a Cloud Sites site that is getting a file from Cloud Files in 10K chunks and passing them on to the client. The only way I can refer to the file is by its public URL. So, Cloud Files CDN is going to charge me for bandwidth, and Cloud Sites is going to charge me for outbound bandwidth to pass it to the end user... unless Cloud Files doesn't charge for outbound bandwidth to its own IP addresses. I haven't seen anything that mentions that in the forums.

I'm sorry Jim, I misunderstood the circumstance. With what you describe, yes: You'd get charged for the bandwidth when the chunk is retrieved from the CDN, and Cloud Sites would track the bandwidth when the file is served from Sites to the requesting client.

An alternative approach would be to access Cloud Files, not through the CDN, but through a ServiceNet connection within the datacenter. Connections on the backend network like that wouldn't incur bandwidth expenses, so that initial request could avoid bandwidth charges.

Here's an article that describes using a PHP script to create a ServiceNet connection to Cloud Files. It focuses on Cloud Servers, but the information there should work for Cloud Sites as well:

http://www.rackspace.com/knowledge_center/article/connecting-to-cloudfiles

Not too long ago my CDN files were located at a short URL like
http://c710.r10.cf1.rackcdn.com/myimage.gif

Now, I find they are at a very long URL like this
http://381beee3d2ecfc0fbf3e-bfa204e8870a5beac2a15a1a9a44bd7a.r38.cf1.rackcdn.com/myimage.gif

How can I get the shorter URL back?

Hey Nicky,

At the moment we don't have the option to do short URLs. I'd recommend using a URL shortener like bit.ly. Sorry for the inconvenience. I would also recommend that you request this as a feature through our product feedback page here: http://feedback.rackspace.com/forums/71021-product-feedback/category/6849-cloud-files

Couldn't you use a cname instead? E.g. http://mydomain.com/myimage.gif

You can use a cname with Cloud Files, yes, though not for SSL connections. There is more information here:

http://www.rackspace.com/knowledge_center/frequently-asked-question/how-can-i-use-cnames-with-a-cloud-files-container

You really don't want to use a link shortener unless you have to, simply because for SEO reasons you want the original 301 redirected link. If you're using Twitter, I know that http://totally.awe.sm/ is the best URL shortener on the market.

The article says we can use "ducks/funny/duckling.jpg" to fake a file structure, but when uploading through the control panel the slashes are replaced with colons, so we get "ducks:funny:duckling.jpg".

I'll check into this in more detail tomorrow, but could your OS be making that substitution? I know the Mac OS has a habit of replacing colons and forward slashes in file names when they're being saved, it might be applying a similar filter to files being uploaded through a web browser.

Yes, you are correct Jered. The slash shows in the OS X gui but when I look at the files in Terminal they are actually named with colons. For anyone else who runs across this, you can rename the files using Cyberduck...

http://www.rackspace.com/knowledge_center/article/configuring-rackspace-cloud-files-with-cyberduck

Thanks for confirming the cause, iseem, and for posting your workaround for others. Much appreciated!

Hi

Can we cache images (uploaded on the Rackspace server) on the client side once they are fetched through the CDN, so that pages full of images can be loaded within a second?
Is there an attribute/property that is used to cache images on the client?

If yes, could you please tell us how I can achieve it?

Thanks.

Hi N.D.,
The concept of the CDN is that it caches on the server once it is displayed. On the client side, caching is a browser setting that remains in the user's control.

On your server side, you can adjust your Time To Live (TTL) to retain your files in the cache for the time you specify.

Hopefully that helps.

I'm having trouble identifying how (or whether) we can ensure that our containers' objects are replicated in more than one data center. I see in http://docs.rackspace.com/files/api/v1/cf-releasenotes/content/September25.html that multi-region support has been added, but have no idea whether that means we can associate a container with more than one region (or how to do it).

Cloud Files does support multiple regions, but containers and objects cannot be cross-region at this time. You'd need to copy items from one region to another to have them stored in more than one region.

That doesn't affect CDN availability, which still uses Akamai's network to distribute file availability across multiple geographic regions.
