Rackspace Cloud Monitoring is an API-driven cloud service built for infrastructure monitoring. It offers a simple yet powerful feature set, allowing extreme flexibility in configuration and execution. Very simply, we help answer the question, “Is my service up and running?” To do that, we have two simple objectives: to alert the resource owner before their customer knows, and to take measures to keep the system from going down in the first place.
Cloud Monitoring: FAQs
- Getting Started -
There are five monitoring zones:
- Blacksburg (IAD)
- Dallas-Fort Worth (DFW)
- Chicago (ORD)
- London (LON)
- Hong Kong (HKG)
Using multiple monitoring zones eliminates the need for maintenance and upgrade downtime and ensures that your monitoring services remain uninterrupted even in the event of a datacenter failure. You can find an up-to-date list of monitoring zones in the Rackspace Cloud Monitoring Developer Guide.
Looking into the future, we are laying a solid foundation for building scalable, high-performance, and reliable services for making sure your infrastructure is up. We’re now ready to start experimenting with some very interesting features. Look for aggregate alarms, historical data, and, most importantly, the high-performance, low-memory agent to power automation and information gathering.
For more information, please see the following link here.
Rackspace Cloud Monitoring is an API-based system. You can access it using the following methods:
To set a monitoring check using the Cloud Control Panel, log in to the Control Panel and go to the Server Details page of a Cloud Server. Then scroll down to the Monitoring Checks section and click Create Check.
If you would like to practice setting up a monitoring check using the CLI, consult the Cloud Monitoring Getting Started Guide.
For a complete list of Cloud Monitoring API endpoints, see the Cloud Monitoring Developer's Guide.
Right now we support email and webhook notifications. The most up-to-date information is here.
Anything with a URL or an IP address that is not blocked by a firewall. Details can be found here.
As the complexity of your business increases with the number of products, customers, websites, etc., there is an increasing possibility of one or more of your resources suffering from a variety of system failures. Learning of a problem from your customers means that you’ve already lost business, and your customers are already having a negative experience using your website or application. Preventing this is what Rackspace Cloud Monitoring is all about.
Yes, you can use both US and UK accounts. This is a global system that works with both identities. Use the identity server where your tenant lives and pass that token and tenantId to the Cloud Monitoring system.
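As a rough illustration of passing the token and tenantId, the sketch below builds (but does not send) a request to Cloud Monitoring. The production URL is the one given in this FAQ. The `/v1.0/<tenantId>/entities` path, the `X-Auth-Token` header name, and the placeholder token and tenant values are assumptions to verify against the Developer Guide.

```python
from urllib.request import Request

# Production URL as stated in this FAQ.
MONITORING_URL = "https://monitoring.api.rackspacecloud.com"


def build_monitoring_request(token, tenant_id, resource="entities"):
    """Build (without sending) a GET request scoped to one tenant."""
    # Assumed URL layout: <base>/v1.0/<tenantId>/<resource>
    url = f"{MONITORING_URL}/v1.0/{tenant_id}/{resource}"
    # Assumed header name for the token issued by your identity server.
    headers = {"X-Auth-Token": token, "Accept": "application/json"}
    return Request(url, headers=headers)


# Placeholder values for illustration only.
req = build_monitoring_request("example-token", "123456")
print(req.get_method(), req.full_url)
```

The same function works for US and UK accounts, because only the token and tenantId differ; they come from whichever identity server your tenant lives on.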
Not yet, but we are planning on starting in the near future.
High availability is a feature unique to Rackspace Cloud Monitoring. Because we provide Monitoring as a Service hosted in the cloud, we are able to keep that service up and running without any downtime. When implementing improvements to the system, we can take a region down for an upgrade, and even lose another datacenter due to a localized disaster or event, and Rackspace Cloud Monitoring will continue to monitor your resources and send you notifications.
Please let us know of any features you would like added or any issues you find via the Feedback forums located here. For more immediate assistance with a time-sensitive issue, please file a ticket or contact your account team.
Monitoring and Troubleshooting
We will only notify you when there is a problem with our infrastructure that impacts your Cloud resource. We do not run monitoring checks on your individual Cloud Servers, Cloud Load Balancers, or websites to verify functionality. This leaves a lot of exposure in situations where the infrastructure is working fine, but a given application or resource in your Cloud setup has stopped functioning correctly, leaving your site performance degraded, or totally inaccessible.
It can take a lot of administrative overhead in order to correctly configure and maintain a robust monitoring application. If there is no built-in redundancy for the monitoring application, you could lose all monitoring services if the server hosting your monitoring application runs into trouble. A simpler option, with built-in redundancy, is to use the newly available Rackspace Cloud Monitoring service. It provides you with four key pieces of information that can help you manage your business as well as address and prevent infrastructure problems before they affect your customers.
Rackspace Cloud Monitoring is released through the cloud without need for software installation on servers or computers. By eliminating the need for installation, Rackspace can upgrade the monitoring service without involving the customer. This process hides the complexity of the upgrade and maintenance processes from the customer, giving them a simple and reliable experience.
As with any commands you will submit to your Cloud resources through the API, you must first authenticate through the API in order for the commands to be correctly processed.
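A minimal sketch of the authentication step is shown below. It only constructs the token request and does not send it. The identity endpoint, the `RAX-KSKEY:apiKeyCredentials` payload layout, and the placeholder credentials are assumptions; confirm them against the identity documentation for your account before use.

```python
import json
from urllib.request import Request

# Assumed identity endpoint; use the identity server where your tenant lives.
IDENTITY_URL = "https://identity.api.rackspacecloud.com/v2.0/tokens"


def build_auth_request(username, api_key):
    """Build (without sending) a token request for the identity service."""
    # Assumed credential layout for API-key authentication.
    payload = {
        "auth": {
            "RAX-KSKEY:apiKeyCredentials": {
                "username": username,
                "apiKey": api_key,
            }
        }
    }
    return Request(
        IDENTITY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder credentials for illustration only.
auth_req = build_auth_request("example-user", "example-api-key")
print(auth_req.get_method(), auth_req.full_url)
```

The token returned by the identity service is what later requests to your Cloud resources carry.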
In our Cloud Monitoring Developer's Guide, we show the detailed configuration options available with this service offering, and the necessary components to build functioning monitoring checks. This will require you to work with the following components:
- Account - An account contains attributes describing your account. This description contains mostly read only data; however, a few properties can be modified with the API.
- Entities - An entity is a resource that you want to monitor. Some examples are a server, a website, or a service.
- Checks - Checks explicitly specify how you want to monitor an entity.
- Check Types - These are the available service monitoring checks that you can configure, such as PING, HTTPS, SMTP (and many more).
- Monitoring Zones - A monitoring zone is the "launch point" of a check. You can launch checks from multiple monitoring zones.
- Alarms - An alarm contains a set of rules that determine when a notification is triggered.
- Notifications - A notification is an informational message sent to one or more addresses when an alarm is triggered.
- Notification Plans - A notification plan is a set of notification rules to execute when an alarm is triggered.
See our Cloud Monitoring Developer's Guide for more detailed information about submitting commands through the Cloud Monitoring API.
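To make the relationships between these components concrete, here is a small illustrative model. The class and field names are hypothetical and are not the API's actual attribute names; the sketch only shows how an entity owns checks, a check runs from monitoring zones, and an alarm points at a notification plan.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NotificationPlan:
    label: str
    # Addresses to notify when an alarm triggers (hypothetical field).
    notify_on_critical: List[str] = field(default_factory=list)


@dataclass
class Alarm:
    label: str
    rule: str  # plain-English rule; the real alarm language is in the Developer's Guide
    plan: NotificationPlan


@dataclass
class Check:
    check_type: str              # for example "PING" or "HTTPS"
    monitoring_zones: List[str]  # launch points for the check
    alarms: List[Alarm] = field(default_factory=list)


@dataclass
class Entity:
    label: str
    address: str
    checks: List[Check] = field(default_factory=list)


plan = NotificationPlan("ops-email", ["ops@example.com"])
alarm = Alarm("site-down", "CRITICAL when the HTTPS check fails", plan)
check = Check("HTTPS", ["DFW", "ORD", "LON"], [alarm])
entity = Entity("storefront", "www.example.com", [check])

# Walk the chain: entity -> check -> alarm -> notification plan -> addresses
for c in entity.checks:
    for a in c.alarms:
        print(entity.label, c.check_type, a.label, a.plan.notify_on_critical)
```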
What is a notification plan?
A notification plan contains a set of actions that Rackspace Cloud Monitoring executes when triggered by an alarm. You may have multiple notification plans in your cloud account.
How are notification plans used?
Each monitoring check can reference one notification plan. When an alarm for that check triggers the critical state, the notification plan associated with the check is used.
What does the default notification plan “Technical Contacts - Email” do?
If you do not set up a custom notification plan, then by default, email is sent to all of the technical contacts on your account. If your account lists no technical contacts, then the primary contact is emailed. You can view the list of contacts for your account on the User Management page at https://mycloud.rackspace.com/account#users.
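The fallback described above can be sketched as a small function. This is only an illustration of the rule as stated in this answer, not the service's implementation; the function name and the example addresses are hypothetical.

```python
def default_recipients(technical_contacts, primary_contact):
    """Return who is emailed under the default plan.

    Technical contacts are used when any exist; otherwise the
    primary contact is the only recipient.
    """
    if technical_contacts:
        return list(technical_contacts)
    return [primary_contact]


print(default_recipients(["tech1@example.com", "tech2@example.com"], "owner@example.com"))
print(default_recipients([], "owner@example.com"))
```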
What does the “Rackspace Managed Notifications” plan do?
It creates a support ticket within your account. This feature is available only to customers with a Managed Operations Service Level.
How do I set up a custom notification plan?
You can use the Cloud Monitoring API. For instructions, see the Cloud Monitoring API Developer Guide.
Where can I download the Rackspace monitoring CLI?
Download the CLI at https://pypi.python.org/pypi/rackspace-monitoring-cli/0.4.5.
Where can I find the API documentation for monitoring notification plans?
Please contact our sales department in order to make a custom service plan that will both cover your monitoring needs and fit your budget.
It is targeted at replacing these types of tools. Right now we don’t offer all the features of Nagios; however, Cloud Monitoring is hosted as a service, API-driven, and built for the cloud. Cloud Monitoring also provides geographically redundant checks, which are generally very difficult to get with any solution. Customers get to leverage our big datacenter footprint, incredible scalability, and our continuous release feature. Future improvements to the service will be released as they become functional (no need to wait for an upgrade package, and no need for downtime).
Yes, but you will need a Cloud account. You configure your own notifications, so alerts may only go to you (the user). A Racker will not respond to your alarms unless they are included in the notifications.
Listing monitoring zones will give you the CIDRs of the set of collectors in that zone. To get the most accurate list possible, use this API call.
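Once you have a zone's CIDR list, one common use is checking whether a source address belongs to a collector, for example when building firewall allow rules. The sketch below uses Python's standard `ipaddress` module. The CIDR values are placeholders taken from reserved documentation ranges, not real collector ranges, and the function name is hypothetical.

```python
import ipaddress


def ip_in_zone(ip, zone_cidrs):
    """Return True if the address falls inside any CIDR block of the zone."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in zone_cidrs)


# Placeholder CIDRs (documentation ranges), NOT real collector ranges.
example_zone_cidrs = ["192.0.2.0/24", "198.51.100.0/25"]

print(ip_in_zone("192.0.2.17", example_zone_cidrs))
print(ip_in_zone("203.0.113.5", example_zone_cidrs))
```

Because collector ranges can change, refresh the list from the API rather than hard-coding it.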
This is a global product which is supported in both the US and UK. We have a primary datacenter in the UK that can process alerts on its own if the link between the US and UK goes down, as well as other datacenters that act as safety nets in case of localized datacenter failure; no matter what, your monitoring service will remain functional.
Not today, but we will have more flexibility in setting policies in the future.
Most of the interesting patterns are represented here. Let us know in the comments if there are additional examples you’d like to contribute.
A monitoring zone is the “launch point” of a check. You can launch checks from multiple monitoring zones.
An Alarm is a set of rules that determine what status is returned based on the result of the check.
Email notifications are sent via Mailgun. They make sure that email will be sent properly and not placed into SPAM folders.
At this time, we don’t do “synthetic transactions” (a simulated set of actions). However, we do support checking the HTML of the response. We follow redirects but don’t check content within a frame/iframe.
A Notification Plan is a set of actions undertaken when a certain status is returned by the check.
Right now we have a Getting Started Guide, an evolving document that is updated on a regular basis. The guide includes example setups as well as an overview of how to use some of the tools to start getting value. We are working on setting up a feedback site for handling customer feature requests.
A Notification defines how the customer wants to be contacted in the case of a system failure.
We wanted to build a state of the art monitoring platform. This requires the data collection to be separate from the thresholding. On the CLI/API level, this is a more complex user experience, but provides the most flexibility. The dashboard GUI simplifies the process for those users who don’t want to work with a CLI.
Rackspace’s Cloud Monitoring system is extremely secure. Before releasing the product, an independent firm assessed the level of security of the Cloud Monitoring's systems and API, and all reported issues have been addressed. You can have confidence that Rackspace Cloud Monitoring is both safe and secure.
A Check specifies what aspect of the resource you wish to monitor.
- Check if your website is going to expire in less than 1 month
- Check for a 404
- Check for something in the HTML response body and set critical if it exists
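The three patterns above can be sketched in ordinary Python to show the thresholding logic. This is not the Cloud Monitoring alarm language, and the metric names (`code`, `body_match`, `expires_in_seconds`) are hypothetical stand-ins for whatever a check reports. Treating the expiry case as a WARNING rather than CRITICAL is also an assumption made for the example.

```python
THIRTY_DAYS = 30 * 24 * 60 * 60  # "1 month" approximated as 30 days, in seconds


def evaluate(metrics):
    """Return a (status, message) pair for one check result."""
    # Pattern: check for a 404.
    if metrics.get("code") == "404":
        return ("CRITICAL", "page returned 404")
    # Pattern: something in the HTML body exists, so set critical.
    if metrics.get("body_match"):
        return ("CRITICAL", "unwanted text found in response body")
    # Pattern: expiry in less than 1 month.
    expires_in = metrics.get("expires_in_seconds")
    if expires_in is not None and expires_in < THIRTY_DAYS:
        return ("WARNING", "expires in less than 1 month")
    return ("OK", "all checks passed")


print(evaluate({"code": "404"}))
print(evaluate({"code": "200", "body_match": "Error"}))
print(evaluate({"code": "200", "expires_in_seconds": 10 * 24 * 60 * 60}))
print(evaluate({"code": "200", "expires_in_seconds": 90 * 24 * 60 * 60}))
```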
The user experience. At Rackspace, we strive to make your interactions with us and our products as easy as possible, and we work towards improving that experience every day. Cloud Monitoring was built first and foremost with the user in mind - we want your experience to be even easier than ours when using the product.
An Entity is the resource (website, server, etc) you wish to monitor.
We have an on-demand simulation feature which allows the user to test the functionality of their monitoring system by simulating a normal operating situation.
The production URL is https://monitoring.api.rackspacecloud.com.
Yes. Check out the raxmon project. It uses the Apache Libcloud framework for building a reliable API implementation that functions well.
To avoid repeating the raxmon installation on each new Cloud Server, install it on your workstation and not your server.
Note: raxmon requires Python 2.5, 2.6 or 2.7.
Rackspace Cloud Monitoring was built as a means, not an end. Although it is just as advanced as other monitoring services (and in many aspects, more so), Cloud Monitoring has favored reliable functionality and a solid foundation over superficial “whizz bang” features. This is by no means the final product - we plan to continue innovating and improving on what we have already, which is a solid, reliable foundation for the future. Customer feedback remains a very valuable source of input concerning the direction of Cloud Monitoring.
Yes - please visit our Cloud Monitoring Getting Started Guide, which can guide you through the steps in creating your Rackspace Cloud Monitoring setup from scratch.
Rackspace Cloud Monitoring tries very hard to make sure that if a change fails, it lets the customer know. Deliberate action is necessary in a monitoring system. Along with that, scalability has been one of our priorities from the beginning. Even if a customer adds thousands of Cloud Servers in minutes, Rackspace Cloud Monitoring can instantly begin monitoring all of them.
Monitoring from multiple monitoring zones allows you to monitor the experience of customers from many locations, which is important for companies that do business in more than one region. Consider this: your website might be working fine in the Western United States, but users from the Eastern half of the country are experiencing high load times and nonresponsive web pages. Monitoring from just a western datacenter would report an OK status, but if you also monitor from the eastern datacenter, you will be alerted to this problem before it affects your customers.
If you’re only monitoring from a single zone, and for some reason there is an error with a check, you may get false notifications that your site has a problem, when in reality it is working fine. Using multiple monitoring zones helps prevent these false alarms by verifying a system failure from multiple sources before alerting you. Or, you can have it send you a message if even just one check returns a failure status - the level of customization possible through our alarm language is incredible.
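The difference between the two behaviors can be sketched as follows. The policy names `one`, `quorum`, and `all` and the function itself are illustrative only; the actual options are expressed in the alarm language described in the Developer Guide.

```python
def alarm_state(zone_results, policy="quorum"):
    """Combine per-zone results into a single alarm state.

    zone_results maps a zone name to True when the check failed from that zone.
    """
    failures = sum(1 for failed in zone_results.values() if failed)
    total = len(zone_results)
    if policy == "one":
        confirmed = failures >= 1            # alert on any single failure
    elif policy == "quorum":
        confirmed = failures > total / 2     # alert only if most zones agree
    elif policy == "all":
        confirmed = total > 0 and failures == total
    else:
        raise ValueError(f"unknown policy: {policy}")
    return "CRITICAL" if confirmed else "OK"


# One zone sees a failure, the other two do not.
results = {"DFW": True, "ORD": False, "LON": False}
print(alarm_state(results, "quorum"))
print(alarm_state(results, "one"))
```

With a majority rule, a single zone's transient error does not page anyone; with the "one" rule, the same result raises an alert immediately.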