What is DataOps? It depends who you ask. The concept is evolving, there’s lots of chatter and everyone has their own view. There’s a tendency to concentrate on tooling and mechanics — but DataOps is far more than those things. Get lost in buzzwords and it’s easy to lose the true meaning of a term.
So what is DataOps? It’s not software or a tool you can buy, although technology is definitely part of it. I believe you must think more holistically. DataOps is much more about people and processes: those producing data, those consuming it, how they communicate and how you make the data available. It can incorporate agile, DevOps, total quality management and lean manufacturing principles, and apply all of them to analytics.
At Rackspace Technology we define DataOps as a “process framework for the entire data lifecycle that incorporates DevOps style deployments, agile development and statistical process controls and governance to apply towards people, processes and technology.” If I had to condense that into one key message, it would be: DataOps is about how to make data available for consumers, in the most efficient way. It’s the orchestration of people, processes and technology to deliver trusted, high-quality data to the right users, wherever and whenever they need it.
Why we need DataOps now
DataOps provides answers to key challenges in data management with data warehouses and data lakes, business intelligence with dashboards and reporting, and data science with data preparation and pipelines. It streamlines processes, improves collaboration, increases agility and speed, and establishes a culture of continuous improvement. It bridges the gap between data engineering, IT and businesses, ensures relevancy with short development cycles and leads to the creation of analytics pipelines that keep the data science process flexible.
A much shorter answer to why we need DataOps now: COVID-19.
The pandemic upended every model and prediction IT companies once held. It made stark the realization that things can change rapidly, and that without data at your fingertips you cannot adapt quickly enough to catch up with reality. Over the past year, many companies discovered that, although they had the data they needed somewhere, their systems did not allow them to use it.
It takes companies weeks or months to get the data they need and then analyze it. That’s not good enough. We need to figure out how to get from a couple of months to a day. That’s the big challenge. Otherwise, by the time data gets back to a user or stakeholder, it’s no longer useful and relevant. The original purpose behind wanting the data is lost.
The combination of people, processes and technology can help solve these problems. The aim must be to move to self-service, where data is readily available. Again, this is DataOps not as a tool or a technology but as a concept. It’s about getting data from source systems to destination users at speed, while it’s relevant, in an easily consumable manner.
How to make DataOps real
Although some organizations might claim otherwise, DataOps does not magically spring into being. There are key challenges to overcome. It’s crucial to do the following:
1. Put a strategy in place
This one sounds obvious, but many companies lack a DataOps strategy. In fact, they don’t even have the vision to start thinking about DataOps. Without that in place, and without buy-in from leadership, a company runs the risk of responding to hype rather than business needs. It’s vital for each company to take a holistic approach and define a DataOps strategy that works specifically for its own organization.
2. Efficiently manage change
With DataOps, you need to be able to change, and keep up with change, pretty quickly. You’ll be able to do that only if you have proper change management for data in place. Even in companies deep into automation, change management practices and CI/CD pipelines for analytics are still relatively new. So ensure effective change management is properly embedded in your organization, and that it’s instituted all the way from leadership.
3. Integrate security from the start
You cannot allow the wrong people to gain access to and monitor data. So you must figure out whether your data is secure, how it’s secured, where and how it’s available and whether the right people have the right levels of access. All of this needs to be integrated from the start.
4. Ensure you have quality data
It’s crucial to be able to observe data and deal with issues when it changes. That takes a lot of instrumentation, and this is where automation comes into play. You need to ensure data quality doesn’t drop and that different people don’t produce conflicting interpretations of the same data. Be disciplined about this at a foundational, organizational level: build policies and shared definitions to reduce errors, and keep striving to reduce cycle time, which streamlines and accelerates your efforts.
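To make the idea of instrumentation concrete, here is a minimal sketch of an automated quality gate for an incoming batch of data. It is only an illustration, not a description of any particular product or of how Rackspace implements this. The function names, the 5% null-rate threshold and the three-sigma control limits on row count are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def control_limits(history, sigmas=3.0):
    """Return (lower, upper) control limits from past values of a metric."""
    centre = mean(history)
    spread = stdev(history)
    return centre - sigmas * spread, centre + sigmas * spread


def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def check_batch(rows, required_columns, row_count_history, max_null_rate=0.05):
    """Return a list of readable issues; an empty list means the batch passed.

    Thresholds here are illustrative assumptions, not recommendations.
    """
    issues = []

    # Volume check: is today's row count within the historical control limits?
    lower, upper = control_limits(row_count_history)
    if not lower <= len(rows) <= upper:
        issues.append(
            f"row count {len(rows)} outside [{lower:.0f}, {upper:.0f}]"
        )

    # Completeness check: does any required column have too many nulls?
    for column in required_columns:
        rate = null_rate(rows, column)
        if rate > max_null_rate:
            issues.append(
                f"{column}: null rate {rate:.0%} exceeds {max_null_rate:.0%}"
            )
    return issues


if __name__ == "__main__":
    history = [1000, 1020, 980, 1010, 990]  # hypothetical daily row counts
    batch = [{"id": i, "amount": 1.0} for i in range(1000)]
    print(check_batch(batch, ["id", "amount"], history))
```

A check like this would typically run automatically every time data lands, so that a drop in quality is flagged before consumers see it rather than discovered weeks later.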
5. Identify your DataOps maturity
Few companies are mature in this field, though some are further ahead than others. When you break it down, each company will be in a unique place in automation, management, collaboration and other factors. Broadly, more companies are ahead in automation; many fall behind on cross-team collaboration, that is, on making sure data is shared across the organization and that the right people have access. Figure out where your company’s shortcomings are and focus on fixing them.
Get ahead in DataOps
What are the key takeaways in making DataOps work for you? What advice should you take to embed this concept and culture into everything you do?
Build awareness. Start with the data that exists and encourage everyone within your organization, not just business analysts, to use it. That’s the first step. Then use platform automation to reduce the time it takes to get data into place.
Throughout, ensure users understand data and the value of data. Then make it real, so they can go and do this for themselves. Move beyond those initial hurdles and your company is well on its way.
Above all, forget the hype surrounding DataOps and the many buzzwords being thrown around. Instead, tie everything back to that single message about getting data to your end users. Focus on outcomes. Ask yourself: What business purposes are we trying to solve for? Everything else will fall into place once you have a good vision of how you want to use the data.
About the Authors
Nirmal Ranganathan is a Principal Architect and Product Manager at Rackspace, responsible for building Rackspace's Data Analytics and Artificial Intelligence/Machine Learning solutions as part of Rackspace's Data Solutions portfolio. Nirmal works closely with our Alliances, Partners and Customers to create effective and efficient analytics and machine learning solutions that let our customers focus on driving a data-driven culture within their organizations and become leaders in their respective segments. Prior to this, Nirmal was a consultant in our Professional Services organization, providing recommendations and solutions across a wide variety of industry verticals for large-scale databases and data processing, data analytics, data warehousing in the cloud and machine learning/artificial intelligence. Nirmal has a strong background in cloud and distributed systems, having contributed to various open-source projects from Cassandra to OpenStack.