
Sync Your Magento eCommerce Site To Cloud Files And The Akamai CDN


A web site administrator constantly has to balance the resources a site requires against the resources of the configuration it runs on. In the end, this balance almost always dictates how many user requests can be handled at any point in time. A common tool in this battle is to find clever ways to offload requests for common, or static, content from the server, so that it can spend as much time as possible delivering dynamic content. Offloading does not increase the server’s capacity; it frees the capacity the server already has, so the server can answer more dynamic requests than it previously could.

Most administrators choose a fairly simple process, which is to add a Content Delivery Network (CDN) to their configuration. The premise is simple: upload all static content such as images, CSS, JavaScript, HTML or anything else that does not require dynamic server-side rendering to a cloud CDN provider, and change your site’s code to point to the CDN provider’s servers instead of yours. Many of the popular content management systems have plugins that sync these files for you to a product like Rackspace Cloud Files. From the management end, you then just enable CDN access, and the plugin does all of the work of changing the site code to point to your Cloud Files container. Recently a customer using Magento ran into trouble because they did not have a tried and true way of CDN-enabling their site’s static content. Magento has a community-driven plugin model and many popular third-party plugins, some of which have this capability. The most popular is called OnePica and is available from Magento Connect here: http://www.magentocommerce.com/magento-connect/6274.html.

This plugin is a great solution; however, it has a few drawbacks. First, it syncs only some of the key category and product images from your site. Anything beyond these is not synced, for instance other files in the skin, js or media folders. This might not achieve what you want, which is to offload all of this content from the server. Second, the plugin does not support HTTPS content with Rackspace Cloud Files, even though Cloud Files itself supports both HTTP and HTTPS, including some streaming protocols to boot. I believe this is just a feature that has yet to be added to the plugin.

Rather than leaving you at the mercy of a plugin, Magento offers a more native solution: it allows you to change the base URL of your site for these three areas (skin, js and media), a setting intended specifically for use with a CDN. This allows you to upload the js, media and skin folders in their entirety to a CDN provider, edit these values to point to the CDN container, and immediately have all of that content served from the CDN servers instead of from your server. From here, it is just a matter of keeping the files in these locations on your server in sync with what is in Cloud Files.
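As a rough illustration of the base URL change, the sketch below builds the settings that would point the three static areas at a CDN. It assumes the Magento 1.x configuration paths of the form web/unsecure/base_skin_url and web/secure/base_skin_url, and it assumes the skin, media and js folders were uploaded under those same names at the top of the container. The hostnames are hypothetical placeholders; substitute the HTTP and HTTPS URLs of your own CDN-enabled container.

```python
# Sketch: build the Magento 1.x base URL settings that point the static
# folders at a CDN. Hostnames used with this function are placeholders.

STATIC_AREAS = ("skin", "media", "js")


def cdn_base_urls(http_cdn, https_cdn):
    """Return {config_path: value} for the unsecure and secure base URLs."""
    settings = {}
    for scheme_key, cdn_root in (("unsecure", http_cdn), ("secure", https_cdn)):
        root = cdn_root.rstrip("/")
        for area in STATIC_AREAS:
            # Assumed path layout: web/<unsecure|secure>/base_<area>_url
            path = "web/%s/base_%s_url" % (scheme_key, area)
            # Magento base URLs are expected to end with a trailing slash.
            settings[path] = "%s/%s/" % (root, area)
    return settings


if __name__ == "__main__":
    # Hypothetical container URLs, for illustration only.
    urls = cdn_base_urls("http://cdn.example.com", "https://ssl-cdn.example.com")
    for path in sorted(urls):
        print(path, "=", urls[path])
```

The resulting values are what you would enter for the skin, media and js base URLs in the Magento admin; the sketch only generates them and does not write to Magento.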

For this problem we found an open source program called fileconveyor, which is hosted on GitHub and has a project page here: http://fileconveyor.org/. This application is written in pure Python and works excellently for syncing files from one place to another, specifically to cloud services like Rackspace Cloud Files. We forked the project here at Rackspace and worked out a base configuration that syncs specific directories in Magento with a Cloud Files container. The beauty of the program is that it uses inotify monitoring to watch the directories for changes, which means that as soon as a file is created, modified or deleted, the change is reflected right away in the Cloud Files container. This removes the mundane task of manually keeping these files in sync, and it moves the responsibility for doing so out of Magento or a plugin and down to the OS layer, where it can be managed better.
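To make the created, modified and deleted cycle concrete, here is a minimal sketch of one sync pass. It is not fileconveyor’s code: fileconveyor reacts to inotify events, while this sketch swaps in a simpler technique, comparing two directory snapshots, so that it runs with only the standard library. The upload and delete callbacks are hypothetical stand-ins for whatever pushes a change to the container.

```python
import os


def snapshot(root):
    """Map each file path (relative to root) to its (mtime, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.stat(full)
            except OSError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(full, root)] = (info.st_mtime, info.st_size)
    return state


def diff_snapshots(old, new):
    """Return sorted lists of created, modified and deleted relative paths."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, modified, deleted


def sync_once(root, previous, upload, delete):
    """Compare root with the previous snapshot and call the callbacks.

    upload and delete are hypothetical callables taking a relative path.
    Returns the new snapshot, to be passed in as previous on the next pass.
    """
    current = snapshot(root)
    created, modified, deleted = diff_snapshots(previous, current)
    for path in created + modified:
        upload(path)
    for path in deleted:
        delete(path)
    return current
```

Calling sync_once in a loop with a short sleep gives a polling version of the same idea; an event-driven watcher avoids rescanning the whole tree on every pass, which is the advantage of the inotify approach.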

This is also a big opportunity for the Rackspace support teams, who can support this solution more easily. The implications beyond Magento are also exciting, as this program can sync anything located on the local filesystem of your server to Cloud Files with ease. We see future use for any CMS that needs Cloud Files syncing to achieve the benefits of a CDN with less hassle.

If you are interested in implementing this solution for your Magento eCommerce site or CMS, check out this helpful Knowledge Center article.

About the Author

This is a post written and contributed by Justin Seubert.

Justin Seubert is a Linux Administrator in Rackspace's SMB Hybrid Linux segment and has been a Racker for over five years now. His responsibilities on the team include tackling the toughest customer issues along with assisting in the training and development of fellow Rackers. Justin enjoys working on servers, coding and even computer hardware outside of Rackspace. He truly loves diffusing his knowledge to others, which also helps him strengthen his skills.


  • Simon

    Hi Justin,

    This is good news – I have been hassling Rackspace Support about easier cloud files syncing deployment for a long time.

    You mention you forked FileConveyer – what features did you add and will this be released for Rackspace customers to use? Did you alter the behaviour of the Daemon itself or just modify the CloudFiles transporter?

    • Justin Seubert

      Hi Simon,

      I am glad to hear this could be helpful for you. You are right: until we found this application recently, the lack of a solution here had been a hassle for everyone.

      The fork of this application was made specifically to improve the daemon features. Fortunately, the transporters are from Django and are incredibly stable, so they literally just work. However, a bug was found in the inotify system that could, on rare occasions, prevent a file from being detected properly. This sometimes happens in Magento when it creates a cache for images and has to create a directory structure that uses the first two letters of the file name as individual directories. An example would be:

      /c/a/cat.jpg

      This turned out not to be a bug in fileconveyor but in inotify, which was failing to pick up on the recursive directory creation, so the file was not synced. I am in the process of filing a bug report with the inotify project, but in the meantime I made one modification to the code to force a recursive check on these directories and catch the files that are missed.

      Other than this, the program is as-is for now. We hope to make this into a more known and supportable solution with packaging and knowledge articles for technicians to use.
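
      The recursive check described above could look roughly like the sketch below. It is not the actual change made in the fork; the function names and the queue_for_sync callback are hypothetical. The idea is that when a new directory is reported, every file already inside it, at any depth, is queued, so a file such as /c/a/cat.jpg is not lost if the events for the nested directories were missed.

```python
import os


def files_under(new_dir):
    """List every file beneath new_dir, including nested subdirectories."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(new_dir):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


def on_directory_created(new_dir, queue_for_sync):
    """Queue files that already exist inside a newly reported directory.

    queue_for_sync is a hypothetical callable that accepts a file path.
    """
    for path in files_under(new_dir):
        queue_for_sync(path)
```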

  • http://www.magentobr.com Eric

    Great write up Justin!
    I tried the fileconveyor and it was using too much memory of the server.
    I had used the OnePICA extension before and it works fine.
    I only have to add the information there and then change the base URL for media, skin and js. I normally just for media and only images on cache. I already use Varnish and mod_pagespeed that creates a smaller size image and also adds the size to the image when it’s not added by the theme. But only doing the images from cache I already saw a 30% speed increase. I am doing some tests to use only CDN for everything. Only problem is the cache, with pagespeed and varnish I can purge the cache and I see the new content. With Cloud Files it takes some time. hehehe

    • Justin Seubert

      Hi Eric,

      Thanks for reading!
      In my testing, fileconveyor only used a lot of memory when it had a lot to monitor. This is directly related to the fact that the inotify code has to create a watch on everything in the directory you set. This can also potentially hit the maximum open file limits, which can be raised with a quick sysctl change, but it will use more memory, as you said.
      I have also used pagespeed and Varnish with great success. As you also said, though, if you set up the images on a CDN, purging or waiting for the TTL to expire can take a bit longer.
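
      To estimate how much there is to monitor before starting the daemon, a minimal sketch follows. It assumes a recursive inotify watcher places one watch on each directory, and it assumes the relevant kernel limit is fs.inotify.max_user_watches, which Linux exposes under /proc; the reply above does not name the exact setting, so treat that as an assumption.

```python
import os

# Assumed location of the per-user inotify watch limit on Linux
# (the sysctl key fs.inotify.max_user_watches).
WATCH_LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"


def count_directories(root):
    """Count root plus every directory beneath it (one watch each)."""
    return sum(1 for _ in os.walk(root))


def read_watch_limit(path=WATCH_LIMIT_FILE):
    """Return the watch limit as an int, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def watches_fit(root):
    """Return (watches needed, limit or None, whether the tree fits)."""
    needed = count_directories(root)
    limit = read_watch_limit()
    return needed, limit, (limit is None or needed <= limit)
```

      Pointing watches_fit at the media folder gives a quick sense of whether the image cache directories are likely to push the watcher toward the limit.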

  • nitecrwler

    This is a clear example of why Amazon CloudFront became and is still the most popular choice for most simple setups and Magento users who want the boost in performance without the hassle. No extra scripts or uploads are required. Just point the URL and the job is done. Cloudfront handles the rest.

    I personally have no desire to add more unnecessary daemons to my cloud server. Even the driveclient daemon that’s currently required to get cloudbackups working gobbles memory on my cloud server. I had to set up a cron job to make sure it’s only active for the amount of time it takes to complete the backup.

    I’d much rather use Rackspace’s solutions. I hate amazon on a personal level.. but it’s hard not to use their service when you just need to get a CDN setup quickly for a project.

    • Justin Seubert

      Hi nitecrwler,

      Thank you for your feedback on this solution.
      Over the years, I’ve noticed that different customers choose different solutions depending on their requirements. Many customers choose us for our Fanatical Support and simpler pricing model and are OK with running a simple script. I think what is also important about fileconveyor is that its implications go beyond syncing CMS content for a CDN: it can sync any content, from any system, to your Rackspace Cloud Files container. That is powerful well beyond CDN-enabling your site as easily as possible.
      In terms of the backup, I would open a ticket with your support team here so that we can investigate it with our developers if possible. You can always choose image backup instead of file backup as well, although I do realize that you may have a direct need for file-level backups. File backups by nature require an agent; this is not a limitation that Rackspace imposes, but we can always work with our customers to make that experience as good as possible.

      We are working on expanding our CDN capabilities all the time and we are always looking at how we can simplify tasks like this one. Thanks again for your comment.

  • http://www.magentobr.com Eric

    Hi there Justin, I am back here.
    I agree with what you said so far.
    I was having some trouble with a server running Magento for about a week.
    Webserver and database running under same instance of 2GB.
    I created an instance on cloud database of 1GB and moved database there. Still nothing, so I decided to create another server and use Load Balancer.
    Then came the problem, upload image here, but when load from other server it won’t open.
    Now I am using the CDN for all images (media folder) and the fileconveyor is working great.
    Servers were using about 50% CPU each (before just one was about 99% all day).
    Right now I am testing again with only 1 server of 2GB and I created another for the database with 1GB (cheaper this way lol).
    I also moved to a new server as I think the other one had some problem, even with http stopped, it was running at 99%.

    My simple question is, how to keep fileconveyor running all the time?
    Let’s say like a service?
    And it does not use much memory or cpu.
    Thanks for the tip.

  • http://www.software2.co.uk Ryan Heath

    Hi Justin,

    Great article. I stumbled upon it after resorting to Google when I got no suggestions on how to do this from the Rackspace support team (ironically)..

    Just a few questions on actually implementing this for Magento..
    1) Presumably you created a separate container on Cloud Files for all of the media files, would they then reside under container/catalog/product… etc?
    2) Do you include all the cached images in your sync to CDN? Presumably Magento will pull an image from CDN, cache it at the relevant size and then fileconveyor will upload that to CDN so it doesn’t need to be cached next time?
    3) How would you deal with CSS and JS merging? Do you disable it or again, have fileconveyor upload all the merged files? I imagine this being a problem if I update my CSS with the merged copy being on the CDN?

    Any advice would be appreciated and thanks again for the solution!

  • Karl Baumgartner

    Justin,

    Thank you for this write-up. It’s been very helpful for me with getting a workable solution in place for syncing Magento content, hosted on RS as it turns out, to the RS cloud.

    I’m curious what kind of success you’ve seen with your changes for inotify as I believe I’m running in to the same situation. I see Magento creating these cache directories under the media directory and it causes fileconveyor to throw exceptions when the file isn’t there shortly there after. Though that may be less a recursion problem and more to do with rate of filesystem updates from inotify.

    I’m also seeing a bug identified on the FC issues list where the database gets into a locked state. Very odd and I suspect it’s due to an exception being thrown, FC thinking it’s crashed, and then spawning a new instance, thus causing contention and borking the sqlite database. But I’m still trying to solidify this scenario.

    In any case, thanks for the write-up! Any thoughts on what I’m seeing would be appreciated.

    Karl

  • Piyush Sharma

    I tried to use OnePica and found that it does not work for the Hong Kong region. It always says “Container name i entered is wrong”.

©2014 Rackspace, US Inc.