Not all data is created equal. Each company’s data strategy needs to be laid out with consideration and thought as to what the future demands of the system might be. Although Hadoop™ has its roots firmly planted in the JBOD and bare metal camps, users are increasingly trying to find ways to split up data processing based on workload requirements and on the type of query they are running.
Most companies today still make business decisions based on guesses, hunches and intuition rather than leveraging data-driven analytics platforms. Many don’t understand how to capture the insights, or they don’t have the mathematical expertise to make their data meaningful in a manner that helps describe what they should look for. As companies transition away from traditional ways of doing business and start capturing the relevant data that allows senior management to make informed decisions, they encounter an explosion of data growth that legacy database systems and information management processes can no longer manage.
The world of big data is vast. And organizations trying to prepare for success by harnessing the power of data analytics can sometimes fall victim to confusion and tricky marketing. But one thing that is often left out of the conversation is the awareness of the big data problem. With the heightened expectations that big data promises, it’s important to put your data considerations in the context of a realistic discussion around your organization’s needs, use cases and tools.
With a mission-based focus on scientific research, universities and research institutes have long been at the forefront of technology innovation that embraces open standards and ease of collaboration. By nature, the academy well understands that when hundreds of researchers contribute to a shared purpose and solve a shared problem in open and transparent ways, everyone benefits. The pace of innovation is accelerated, and the diversity of solutions and approaches ensures that good solutions persist and not-so-good ones are quickly identified. Some might argue that the relative success of open-source platforms suggests that proprietary technologies often preclude necessary innovation. The growth of cloud has further challenged both researchers and industry. The scale economies of cloud-based solutions bring with them attendant challenges for the research community, beginning with the question of how to scale a conversation that brings both sides of an open community together to ensure that the full benefits of the cloud can be realized.
Big Data and OpenStack are probably the two most mentioned terms in the data center today. It’s become increasingly apparent that organizations have to deal with such massive amounts of data that traditional methods of scaling up with a bigger server or bigger memory or bigger disk are no longer feasible.
This is a guest post written and contributed by Erik Severinghaus, founder and CEO at SimpleRelevance, a Rackspace Cloud Tools partner. SimpleRelevance is an email optimization platform that uses internal and external data to automatically execute smarter, individualized email campaigns.
Last month, I joined three leading tech executives at Cloud Expo New York to talk about how cloud computing and Big Data are transforming enterprise IT. In addition to myself, the panel featured Bruce E. Otte, director of platform and workload services at IBM; Greg O’Connor, president and CEO of AppZero; and Kevin Brown, CEO of Coraid.
The question “do you need a data scientist?” came up a lot when I was a management consultant for a global firm that successfully incubated data science within a few enterprise organizations. It’s hard. The discussion is hard and the culture clash for data scientists is hard. Many approach data science as some dark magic from Hogwarts. It’s not. Investigating a hypothesis takes time. Spontaneously generating data and building a model against that data doesn’t work. Understanding who you need and how they will fit into your organization is challenging. Where do we put them? Who do they interact with? What is the hand-off? Who do we structure around the project? How do you execute a project? Even better, how do we make MONEY? Yet, before we go there, perhaps we should step back a bit and think of this as a strategic question. Because maybe you do need a data scientist and maybe you don’t.