Rackspace has experienced a service interruption during tonight’s scheduled maintenance on UPS Cluster G. We were testing phase rotation on a Power Distribution Unit (PDU) when a short occurred and caused us to lose the PDUs behind this Cluster. The phase rotation allows us to verify synchronization of power between primary and secondary sources.
All power has been restored and devices are being brought back online. The PDUs were down for a total of about 5 minutes. We have aborted the maintenance for the remainder of the evening and will reschedule this for another date.
Service to Cloud Sites has been restored and we are continuing to work with Cloud Sites customers to bring them online. We will continue to update the Cloud Server and Slicehost status on their respective websites, which you can access through these links: Cloud Status and Slicehost Status.
If you have any questions please feel free to contact a member of your support team.
Sincerely,
Rackspace
Hope everything goes better and closer to plan next time.
[...] 3:40 AM CST [UPDATE] Thank you for your patience as we work to restore service to our CloudFiles system this morning. The service interruption was related to tonight’s scheduled data center maintenance on UPS Cluster G in our DFW data center. We have posted an official Rackspace incident report here: http://www.rackspace.com/blog/?p=690 [...]
Are comments here censored…?
I’m at a loss for words. The only word that keeps popping up – “unacceptable” – after what we went through in June/July. Get your act together, please. We have invested a lot of time switching over to RackspaceCloud.
The PDUs may have been down for 5 minutes, but customer servers were unavailable for well over an hour. Let’s not whitewash this, please.
I agree that this blog post misrepresents the scope of the outage. In my case, my slice was unreachable for about an hour and a half, and I was never proactively notified of it (my own monitoring running elsewhere caught it). The blog post instead makes it sound like the outage was only five minutes, which is not correct.
Why did a 5-minute PDU outage escalate into a widespread 90-minute outage that seemingly took down an entire data center?
If you are a Rackspace Cloud customer, you will be receiving a Post Mortem via email outlining the details of the service interruption that occurred last night/early morning. We appreciate your patience.
Best,
Angela Bartels
@rackcloud
I would still like an explanation as to why customers were not proactively notified.
I am a SliceHost customer. Will we also receive the email?
Yeah, all these constant issues with Cloud Sites are a very big concern for my company and me. We have spent a lot of time migrating over to Cloud Sites and we have discovered that it is not as reliable as the 99.9% uptime Rackspace had promised.
Was this to do with the UPS’s internal bypass failing, and thus an out-of-phase moment causing a complete loss of power?
-n
curious – is MGE your PDU vendor?
[...] today there was an outage at the DFW Rackspace data center, which unfortunately meant the BanditDeals web site was knocked offline as well. Things look to be [...]
Doesn’t the customer gear have redundant power supplies that operate against separate UPS Clusters? Or does each side share a single point of failure?
I was not greatly affected by 90 minutes of downtime, but was significantly bothered by the lack of information. I received notification of the outage from my own monitor about 2 hours after it was over and went to the website to find out what happened (primarily if it was fixed and if there was anything I needed to do) and clicked on “cloud status.” It had something about network connectivity, but nothing about servers rebooting or maintenance misfortunes.
I truly appreciate the full explanation by email 20 hours later, but some hint in real time would have been better.
Also, as I’ve seen pointed out elsewhere, Rackspace should report times in UTC, not CST. CST is probably inapplicable to the majority of customers. It’s a pain converting.
[...] @Rackspace: Update on Scheduled Maintenance/Interruption: http://www.rackspace.com/blog/?p=690 #RackStatus 22 hrs [...]
We had a downtime of 3 hours in our case (not 5 min). This is not the first time this has happened. In this instance we were not even notified that there was going to be a maintenance window.
We actively encourage organizations to migrate to the cloud. These types of incidents are deal breakers.
I too was down for 1 hour, 12 minutes. I was never made aware there was going to be maintenance nor was I aware that my slice went down (besides my external Nagios monitoring).
I should have been warned…all of us should have been. A followup email should have been sent to the “affected” customers with an RFO. This is disappointing.
It’s stormy on the cloud.
The unique selling point for the Cloud Sites product is debunked.
Shame, as I was looking forward to using it.
