At this year’s O’Reilly Strata event we will showcase our support for the newest release of the Hortonworks Data Platform (2.0), a release that we believe represents a paradigm shift in what you can use Hadoop to accomplish.
A few weeks ago we told you about our two Data Services for Hadoop-based applications, the Managed Big Data Platform service (in Unlimited Availability) and the Cloud Big Data Platform (in Early Access). Working hand in hand with Hortonworks, we are giving you a choice of architectures for your Hadoop applications, whether you need a custom-built Hadoop architecture based on specific dedicated hardware, or a dynamic, API-driven programmable Hadoop cluster in our public cloud.
As we close out another big year for adopters of data-driven applications, the emphasis on delivering the right business insights has never been greater. The companies that have embraced data analytics have quickly tapped into new revenue streams and innovated faster than their competitors. The conversation is no longer about when and why an organization should implement a robust data strategy, but which strategy to focus on and how to get valuable insights more quickly. This may be best represented by the overwhelming attendance at this year’s fall Strata Conference focusing on the world of Apache Hadoop™, October 28 through October 30 in New York.
Not all data is created equal. Each company’s data strategy needs to be laid out with careful consideration of what the future demands on the system might be. Although Hadoop has its roots firmly planted in the JBOD and bare-metal camps, users are increasingly trying to find ways to split up data processing based on workload requirements and the nature of the queries they are running.
Most companies today still make business decisions based on guesses, hunches and intuition rather than leveraging data-driven analytics platforms. Many don’t understand how to capture the insights, or they lack the mathematical expertise to make their data meaningful in a way that helps describe what they should look for. As companies transition away from traditional ways of doing business and start capturing the relevant data that allows senior management to make informed decisions, they encounter an explosion of data growth that legacy database systems and information management processes can no longer handle.
The world of big data is vast. And organizations trying to prepare for success by harnessing the power of data analytics can sometimes fall victim to confusion and tricky marketing. But one thing that is often left out of the conversation is the awareness of the big data problem. With the heightened expectations that big data promises, it’s important to put your data considerations in the context of a realistic discussion around your organization’s needs, use cases and tools.
With a mission-based focus on scientific research, universities and research institutes have long been at the forefront of technology innovation that embraces open standards and ease of collaboration. By nature, the academy well understands that when hundreds of researchers contribute to a shared purpose and solve a shared problem in open and transparent ways, everyone benefits. The pace of innovation is accelerated, and the diversity of solutions and approaches ensures that good solutions persist while not-so-good ones are quickly identified. Some might argue that the relative success of open-source platforms suggests that proprietary technologies often preclude necessary innovation. The growth of the cloud has further challenged both researchers and industry. The scale economies of cloud-based solutions bring attendant challenges for the research community, beginning with the question of how to scale a conversation that brings both sides of an open community together to ensure that the full benefits of the cloud can be realized.
Big Data and OpenStack are probably the two most mentioned terms in the data center today. It has become increasingly apparent that organizations must deal with such massive amounts of data that traditional methods of scaling up with a bigger server, more memory or larger disks are no longer feasible.