The world of big data is vast, and organizations trying to prepare for success by harnessing the power of data analytics can sometimes fall victim to confusion and tricky marketing. But one thing that is often left out of the conversation is the awareness of the big data problem. Given the heightened expectations that big data creates, it’s important to put your data considerations in the context of a realistic discussion around your organization’s needs, use cases and tools.
With a mission-based focus on scientific research, universities and research institutes have long been at the forefront of technology innovation that embraces open standards and ease of collaboration. By nature, the academy well understands that when hundreds of researchers contribute to a shared purpose and solve a shared problem in open and transparent ways, everyone benefits. The pace of innovation is accelerated, and the diversity of solutions and approaches ensures that good solutions persist and not-so-good ones are quickly identified. Some might argue that the relative success of open-source platforms suggests that proprietary technologies often preclude necessary innovation. The growth of cloud has further challenged both researchers and industry. The scale economies of cloud-based solutions bring with them attendant challenges for the research community – beginning with the question of how to scale a conversation that brings both sides of an open community together to ensure that the full benefits of the cloud can be realized.
Big Data and OpenStack are probably the two most mentioned terms in the data center today. It’s become increasingly apparent that organizations have to deal with such massive amounts of data that traditional methods of scaling up with a bigger server or bigger memory or bigger disk are no longer feasible.
This is a guest post written and contributed by Erik Severinghaus, founder and CEO at SimpleRelevance, a Rackspace Cloud Tools partner. SimpleRelevance is an email optimization platform that uses internal and external data to automatically execute smarter, individualized email campaigns.
Last month, I joined three leading tech executives at Cloud Expo New York to talk about how cloud computing and Big Data are transforming enterprise IT. In addition to myself, the panel featured Bruce E. Otte, director of platform and workload services at IBM; Greg O’Connor, president and CEO of AppZero; and Kevin Brown, CEO of Coraid.
The question “do you need a data scientist?” came up a lot when I was a management consultant for a global firm that successfully incubated data science within a few enterprise organizations. It’s hard. The discussion is hard and the culture clash for data scientists is hard. Many approach data science as some dark magic from Hogwarts. It’s not. Investigating a hypothesis takes time. Spontaneously generating data and building a model against that data doesn’t work. Understanding who you need and how they will fit into your organization is challenging. Where do we put them? Who do they interact with? What is the hand-off? Who do we structure around the project? How do you execute a project? Even better, how do we make MONEY? Yet, before we go there, perhaps we should step back a bit and think of this as a strategic question. Because maybe you do need a data scientist and maybe you don’t.
As momentum around big data solutions builds, demand for a deep understanding of technologies like Apache Hadoop™ has become top of mind for our customers. One of the focal events for training and thought leadership around Apache Hadoop is the annual Hadoop Summit hosted by Hortonworks. As you may remember, we partnered with Hortonworks in the fourth quarter of 2012 to bring an on-demand Apache Hadoop product to the open cloud. Currently in private beta, the open cloud offering aims to provide customers with an easy platform to learn, develop and validate both the application and toolset.
This is a guest post written and contributed by Mike Prince, Founder at Corporation Wiki, a Rackspace Hybrid Cloud customer that archives historical corporate data to become the go-to spot for information on corporations and executives.