Running Jekyll On Rackspace Cloud Files

Filed in Cloud Industry Insights by Philip Thomas | December 20, 2012 3:00 pm

This post is a guide to maintaining a Jekyll[1] website on the Akamai-powered Rackspace Cloud Files CDN[2] network. However, it may serve as a general guide for hosting any static website on the network.

Background
I decided to use the Akamai Content Delivery Network as a host after reading an article about the use of Jekyll and Akamai in the Obama fundraising campaign. In general, hosting a static website on a CDN gives extremely quick load times with nearly unlimited scalability. However, CDNs were built for delivering content, not whole websites, so making this setup work requires some finagling.

This guide assumes:

Container Creation
In the Rackspace Cloud Files interface, first create a Cloud Files container, then publish it to the CDN. Note the published URL, as it is used to configure the CNAME in DNS, below.

Also decide on a TTL (time to live) for the container; this is the maximum time items stay cached. I keep mine at seven days, which means that whole-site changes can take up to a week to propagate. A TTL can be overridden with Edge Purge[3], as discussed below.

DNS
To point your URL (e.g. www.example.com) at the CDN, set its CNAME record to the published URL of your Cloud Files container that was noted in the above step (omitting the “http://”). For www.example.com, this means setting the www CNAME to the published URL.
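
As a small illustration of the "omit the http://" step, the following Python sketch derives the CNAME target from a published URL. The function name cname_target and the example hostname are hypothetical; a real published URL comes from the Cloud Files control panel.

```python
from urllib.parse import urlparse


def cname_target(published_url):
    """Return the bare hostname to use as the CNAME target.

    Accepts the published URL with or without a scheme and drops any
    trailing path, since a CNAME record holds only a hostname.
    """
    url = published_url.strip()
    if "://" not in url:
        # urlparse only fills in the hostname when a scheme is present.
        url = "http://" + url
    return urlparse(url).hostname


if __name__ == "__main__":
    # Hypothetical published URL, for illustration only.
    print(cname_target("http://abc123.example-cdn.invalid/"))
```

The value it returns is what goes in the www CNAME record at your DNS provider.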

Note: Because Cloud Files requires a CNAME instead of an A record, in general we must use a subdomain. Therefore, the naked domain (e.g. example.com) cannot be used. While it is technically against standards, some nameserver-level services such as Cloudflare[4] allow naked domain CNAME records. I have had limited success with naked domain CNAME records and do not recommend them.

Uploading
After you have configured DNS, the next step is uploading your Jekyll site to the CDN. First, run the jekyll command to compile your site into the _site folder.

In order to upload my site to the CDN, I use this Jekyll to Rackspace Cloud Files Python script[5]. It requires having Python installed on your computer – if you are unsure of how to do this, Google the directions, because installation on each operating system differs.

Save the script to your root Jekyll directory, and edit it to put in your Rackspace username, API key and container name. To run the file, thus uploading your site to the CDN, use the command:

python cloudfiles_jekyll_upload.py

Warning: Add the uploading script to the exclude[6] in your _config.yml, and consider adding it to your .gitignore. Otherwise, the access credentials for your Rackspace account could be compromised.

Note: You may upload the contents of your _site folder however you wish; I find the above script most efficient, but using the web interface or Cyberduck[7] is equally viable.
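
Whichever upload method you choose, each file under _site becomes an object whose name is its path relative to _site. The Python sketch below only lists that mapping; it does not upload anything and is not taken from the linked script. The function name site_objects is hypothetical.

```python
import os


def site_objects(site_dir="_site"):
    """Yield (local_path, object_name) pairs for every file under site_dir.

    Object names use forward slashes and are relative to site_dir, so
    _site/folder/index.html maps to the object name folder/index.html.
    """
    for root, _dirs, files in os.walk(site_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, site_dir)
            yield local_path, relative.replace(os.sep, "/")


if __name__ == "__main__":
    # Prints nothing if _site does not exist yet.
    for local_path, object_name in site_objects():
        print(local_path, "->", object_name)
```

Running it from the root Jekyll directory after compiling the site shows which object names the container should end up with.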

Setting a Web Index
If you type in the root URL of your website now, e.g. www.example.com, it will show an error. However, accessing a specific file works – e.g. www.example.com/index.html. This is because, when accessing the root URL, you are not calling a specific file, just the container. We need to tell Akamai which file to show when somebody accesses just the container. Here, we will make index.html the web index of the container, so that visiting www.example.com will show the file at http://www.example.com/index.html.

We do this in two steps using the Rackspace API. First, we authenticate to receive an authentication token, which gives us access to the files for 24 hours, and to find the CDN URL where we will send our request. Second, we set the container’s web index to index.html.

API Authentication
This step authenticates with Rackspace and gives us an authentication token (valid for 24 hours) to access the container through the API. It also gives us the CDN URL for managing our containers.

Note: The CDN URL is different from the published URL for your container that we put in your CNAME record. The CDN URL is obtained through the API, and it gives you the ability to control all of your containers. The published URL is specific to a container, and is the public view of a container’s files. Use the CDN URL for cURL statements / management, and use the published URL to update your DNS records.

To obtain an authentication token and the CDN URL, you need your Rackspace username and Rackspace API key (available in your Rackspace account information). Run this cURL command:

curl -H "X-Auth-User: [Rackspace Username]" -H "X-Auth-Key: [Your Rackspace API Key]" https://auth.api.rackspacecloud.com/v1.0/ -v

In the response, there are two important lines for this guide. X-Storage-Url is your CDN URL, and X-Auth-Token is your authentication token. Record both of these, as we will use them in the next step.
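
For readers who prefer a script to the command line, here is a minimal Python sketch of the same authentication request using only the standard library. The function names are hypothetical, and it assumes the two values arrive as response headers named X-Storage-Url and X-Auth-Token, as described above.

```python
import urllib.request

# Endpoint taken from the cURL command above.
AUTH_URL = "https://auth.api.rackspacecloud.com/v1.0/"


def auth_headers(username, api_key):
    """Build the two request headers the authentication call expects."""
    return {"X-Auth-User": username, "X-Auth-Key": api_key}


def parse_auth_response(headers):
    """Return (cdn_url, auth_token) from a mapping of response headers.

    Header names are matched case-insensitively.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered["x-storage-url"], lowered["x-auth-token"]


def authenticate(username, api_key):
    """Send the request (needs network access and real credentials)."""
    request = urllib.request.Request(
        AUTH_URL, headers=auth_headers(username, api_key)
    )
    with urllib.request.urlopen(request) as response:
        return parse_auth_response(response.headers)
```

Calling authenticate("[Rackspace Username]", "[Your Rackspace API Key]") with real values would return the pair of strings to record for the next step.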

Setting Web Index
Now that we have the authentication token and CDN URL, we can set the web index file using the below cURL command. The command is specific to a container, so you need your container name, which you append to the CDN URL. Here, I am setting the web index to index.html, but you could hypothetically set it to any file in the container.

curl -X POST -H "X-Container-Meta-Web-Index: index.html" -H "X-Auth-Token: [Authentication Token]" [CDN URL]/[Container Name]/ -v

It should take a few minutes for this to propagate, but you then should be able to navigate to your root URL, e.g. www.example.com, and see the contents of index.html.
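
The same request can be sketched in Python. This is only an illustration of how the URL and headers fit together, assuming the CDN URL and token from the authentication step; the function name web_index_request is hypothetical.

```python
import urllib.request
from urllib.parse import quote


def web_index_request(cdn_url, container, token, index_page="index.html"):
    """Build (but do not send) the POST that sets a container's web index."""
    # The container name is appended to the CDN URL, with a trailing slash.
    url = "{}/{}/".format(cdn_url.rstrip("/"), quote(container))
    headers = {
        "X-Container-Meta-Web-Index": index_page,
        "X-Auth-Token": token,
    }
    return urllib.request.Request(url, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder values; substitute the ones recorded earlier.
    request = web_index_request("https://storage.invalid/v1/acct", "site", "token")
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to print and check the URL before touching the live container.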

Error Pages
Rackspace also allows 401 (unauthorized) and 404 (not found) error pages to be configured for a container. This means that if somebody navigates to www.example.com/nonexistent_file, you can show your own 404 error page instead of the default Rackspace error. This adds professionalism to your site.

To start, you need to add two static pages to Jekyll: 401error.html and 404error.html. Rackspace is a bit idiosyncratic in that you must have both a 401 page and a 404 page to use their custom error pages. The query for configuring error pages is almost exactly the same as creating a web index, and requires information from the preceding API authentication step. Once you have uploaded the error pages and obtained the authentication token and CDN URL, run this cURL statement:

curl -X POST -H "X-Container-Meta-Web-Error: error.html" -H "X-Auth-Token: [Authentication Token]" [CDN URL]/[Container Name]/ -v

Note that we set the web error page to error.html, even though we created the pages 401error.html and 404error.html. This is because Rackspace prepends the error code to the page we set. Thus, if you want the error pages to be 401request.html and 404request.html, you could update the above statement with request.html in place of error.html.
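
The naming rule is easy to get backwards, so here is a tiny Python sketch of it: given the value set in X-Container-Meta-Web-Error, it lists the file names that need to exist in the container. The function name error_page_names is hypothetical.

```python
def error_page_names(web_error="error.html", status_codes=(401, 404)):
    """Return the file names served for each error code.

    The error code is prepended to the configured page name.
    """
    return ["{}{}".format(code, web_error) for code in status_codes]


if __name__ == "__main__":
    print(error_page_names())                # ['401error.html', '404error.html']
    print(error_page_names("request.html"))  # ['401request.html', '404request.html']
```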

Pushing Changes with Edge Purge
Updating the site is a simple matter of running the upload script. However, updates on existing pages may not be visible for the time length of your TTL. That means that, if your TTL is one week, it could take one week for your change to propagate through the CDN. This section covers how to override the TTL in order to push changes immediately so that changes are visible in minutes, rather than days.

Updating a Page
If you are updating a single file in the root of your container (e.g. www.example.com/about.html), you can use a basic Edge Purge. Edge Purge overrides the TTL of a file, and resets it across the CDN in a couple of minutes.

Executing an Edge Purge is simple through the Rackspace Open Cloud Files control panel[8]. When viewing the contents of the container, click the sprocket on the left and select Refresh File (Purge). The update should then be visible in a few minutes.

If you use Cyberduck[7] to manage your cloud files, the software supports Edge Purge with its “invalidation” command[9].

Note: Rackspace limits you to 25 Edge Purges per day.

Updating a Page in a Directory
If you have a file at www.example.com/folder/index.html (for example a Jekyll blog post), you can access it three ways:
www.example.com/folder
www.example.com/folder/
www.example.com/folder/index.html

On a local file system, we see this as a single file. However, Akamai uses a flat file database[10]. This means that when you upload /folder/index.html, it creates three separate files, one for each of the above paths. If you Edge Purge the index.html file, only one of the three paths will be updated:
www.example.com/folder/index.html

The other paths will remain unchanged because Akamai sees them as separate files:
www.example.com/folder
www.example.com/folder/

This is a problem because the permalink structure of Jekyll posts will use the www.example.com/folder/ path in navigation, which does not reflect an update to www.example.com/folder/index.html. Thus, attempting to update a blog post can be difficult.

To overcome this, we need to use the API again. We run three separate purges, one for each of the three paths. However, these purges are destructive, meaning that you must upload the corrected file after running these queries.

Run these three queries, subbing in your authorization token, CDN URL and container name. They are structured to delete the file path /folder/index.html; modify the query for folders or sub-folders. In addition, you can insert your email address and Akamai will email you when the purge is complete. In my experience, deleting files through this method takes about 10 minutes.

Delete /folder

curl -D - -X DELETE -H "X-Auth-Token: [Auth Token]" -H "X-Purge-Email: [Your Email]" [CDN URL]/[Container Name]/folder -v

Delete /folder/

curl -D - -X DELETE -H "X-Auth-Token: [Auth Token]" -H "X-Purge-Email: [Your Email]" [CDN URL]/[Container Name]/folder/ -v

Delete /folder/index.html

curl -D - -X DELETE -H "X-Auth-Token: [Auth Token]" -H "X-Purge-Email: [Your Email]" [CDN URL]/[Container Name]/folder/index.html -v

At this point, you have successfully deleted the old file. Now, upload your corrected /folder/index.html file, and changes should be reflected online in about five minutes.
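
To adapt the three deletes to other folders or sub-folders, it can help to generate the paths rather than type them. The Python sketch below builds the three DELETE requests for a given folder without sending them. The function names are hypothetical, and the headers mirror the ones used in the commands above.

```python
import urllib.request


def directory_page_paths(folder):
    """Return the three paths under which folder/index.html is reachable."""
    base = folder.strip("/")
    return [base, base + "/", base + "/index.html"]


def delete_requests(cdn_url, container, token, folder, email=None):
    """Build (but do not send) one DELETE request per path."""
    headers = {"X-Auth-Token": token}
    if email:
        # Optional: address to notify when the purge completes.
        headers["X-Purge-Email"] = email
    prefix = "{}/{}/".format(cdn_url.rstrip("/"), container)
    return [
        urllib.request.Request(prefix + path, headers=dict(headers), method="DELETE")
        for path in directory_page_paths(folder)
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own before sending anything.
    for request in delete_requests("https://storage.invalid/v1/acct", "site", "token", "folder"):
        print(request.get_method(), request.full_url)
        # To actually send it: urllib.request.urlopen(request)
```

For a nested post, passing a folder such as "blog/my-post" yields the three paths for that post in the same way.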

Note: Each of these purges counts against your daily limit of 25 Edge Purges through Rackspace.
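
Since each page in a directory costs three purges, the daily limit is reached quickly. A short Python sketch of the arithmetic, using the limit of 25 stated above (the function name is hypothetical):

```python
def directory_pages_per_day(daily_limit=25, purges_per_page=3):
    """Return how many directory pages fit in the daily purge limit,
    and how many single purges are left over afterwards."""
    return divmod(daily_limit, purges_per_page)


if __name__ == "__main__":
    pages, left_over = directory_pages_per_day()
    print(pages, left_over)  # 8 1
```

In other words, at most eight blog posts can be refreshed this way per day, with one purge to spare.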

Site-Wide Change
If you are making a large-scale change to your Jekyll site (like updating the footer on every page), consider setting up a new container and starting fresh. After re-configuring the index and error pages, updating your CNAME will put the new site live. Rackspace does offer a whole-container edge purge that you may request through a support ticket, but I have found problems with the script (specifically related to the flat file structure) and am currently working with Rackspace to fix these problems.

A Note on SSL
Cloud Files CDN does support SSL delivery[11], but the certificate is valid only for the Rackspace domain. This means that you cannot use SSL on your own domain using this static site method.

Endnotes:
  1. Jekyll: http://jekyllrb.com
  2. Rackspace Cloud Files CDN: http://www.rackspace.com/cloud/public/files/
  3. Edge Purge: http://www.rackspace.com/knowledge_center/article/using-cdn-edge-purge
  4. Cloudflare: http://cloudflare.com
  5. this Jekyll to Rackspace Cloud Files Python script: https://github.com/nicholaskuechler/jekyll-rackspace-cloudfiles-clean-urls/blob/master/cloudfiles_jekyll_upload.py
  6. exclude: http://blog.patrickcrosby.com/2009/09/05/jekyll-exclude-files.html
  7. Cyberduck: http://cyberduck.ch
  8. Rackspace Open Cloud Files control panel: https://mycloud.rackspace.com/
  9. supports Edge Purge with its “invalidation” command: http://trac.cyberduck.ch/wiki/help/en/howto/akamai#Invalidation
  10. flat file database: http://en.wikipedia.org/wiki/Flat_file_database
  11. SSL delivery: http://www.rackspace.com/blog/rackspace-cloud-files-cdn-launches-ssl-delivery/

Source URL: http://www.rackspace.com/blog/running-jekyll-on-rackspace-cloud-files/