Don’t Throw Your Code Over The Wall: 5 Ways To Work With Ops Engineers

While the tech world is shifting toward a DevOps structure, in many places the invisible wall that separates developers and operations engineers is still pretty high. Communication between the two teams is necessary to make sure the application is deployed correctly and achieves high availability for end users. Here are some suggestions from different Rackers (Rackspace employees) on how best to hand off your app to the ops team instead of just throwing the code over the wall.

The Devil is in the Documentation Details

Documenting a well-defined list of dependencies is one of the most important things you can do when handing off your application. Rather than being open-ended about what the app needs to run properly, be specific. Let’s say you’ve developed an app that relies on MongoDB for storing and retrieving data.

“Saying that your app needs MongoDB leaves many things to interpretation. For instance, what is your scalability strategy? Should this application need to be sharded? Do you want replica sets? What version do you want installed? What are your indexing needs?” Rackspace Deployment Services engineer BK Box says. “If your app requires a MySQL backend, do you want Master-Master replication or Master-Slave? How much memory and disk space do you need? These are a couple examples of the types of details operations engineers need to have. Be sure to keep the communication open to talk about requirements needed to get the application running.”
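One way to make those details hard to skip is to write them down as structured data instead of prose. The following is only a minimal sketch: the `DatastoreRequirement` class, its field names, and the example values are hypothetical illustrations, not a format used by Rackspace or described by Box.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DatastoreRequirement:
    """One backing service the app depends on (all fields are illustrative)."""
    name: str
    version: str
    replication: str  # e.g. "replica-set" or "master-slave"
    sharded: bool
    memory_gb: int
    disk_gb: int
    indexes: list = field(default_factory=list)


def missing_details(req: DatastoreRequirement) -> list:
    """Return the names of fields left empty, so gaps surface before handoff."""
    return [key for key, value in asdict(req).items()
            if value in ("", None, [])]


# Hypothetical handoff entry where the version and indexes were never decided.
mongo = DatastoreRequirement(
    name="MongoDB", version="", replication="replica-set",
    sharded=False, memory_gb=16, disk_gb=200)
print(missing_details(mongo))  # prints ['version', 'indexes']
```

An empty result from `missing_details` does not mean the requirements are right, only that each question has at least been answered.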

Set Up Descriptive Logging

Having detailed logs is one way to help operations engineers quickly diagnose and fix a problem when an alert comes in. “Looking at the application logs is one of the first ways you can tell if a developer came from an ops background,” says Farid Saad, Cloud Servers engineer. Saad says that ops engineers have to quickly determine what is going on if an application begins behaving strangely, and having to constantly debug logs that are not meaningful can cause frustration and delays in getting the app back online. Overcome this issue by ensuring that your logs adequately describe the errors that the app is encountering.
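As a rough illustration of the difference between a vague and a descriptive log line, here is a small sketch using Python's standard `logging` module. The logger name, host name, operation name, and the `describe_failure` helper are all made up for the example.

```python
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
    level=logging.INFO,
)
log = logging.getLogger("orders.db")  # hypothetical component name


def describe_failure(operation: str, host: str, attempt: int, exc: Exception) -> str:
    """Build a log message that says what failed, where, and why."""
    return (f"{operation} failed host={host} attempt={attempt} "
            f"error={type(exc).__name__}: {exc}")


try:
    raise TimeoutError("no response after 5s")  # stand-in for a real failure
except TimeoutError as exc:
    # A vague alternative would be: log.error("Something went wrong")
    log.error(describe_failure("save_order", "db01.example.internal", 3, exc))
```

The descriptive version tells the on-call engineer which operation failed, against which host, on which attempt, and with what error, without opening the source code.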

Fully Tested Code

Major Hayden, the Rackspace Chief Security Architect, advises that developers fully test the code to help mitigate any surprises on launch day. Hayden emphasizes that the testing should extend beyond the typical unit testing that happens in most QE processes. “When testing, it is important to understand what happens when you put all the pieces together,” Hayden says. “With integration testing, you should make sure that the code works well with other dependent applications, such as an authentication system.” Be certain to verify the integration points between your code and other applications.

Have a Rollback Plan

Everyone expects code deployments to go smoothly, but you have to assume that something could go wrong. “We see it all the time where a new piece of code is rolled out and the application starts behaving poorly,” says Kenny Gorman, co-founder of ObjectRocket, a MongoDB solution by Rackspace. “Many times it could be a simple missing database index, but other times the new application could actually be logically corrupting the database. In these cases it’s great to have a rollback plan, which not only mitigates the risk of a bad deployment but also minimizes the amount of downtime if something goes awry.”

Communicate Availability Expectations

Rackspace Cloud Servers engineer Richard Maynard wants to know what the developer’s expectations are for the app. “How often should it be available? How often will it be updated? What is the anticipated usage?” Maynard asks. “To me, it is about requirements for availability and understanding what the operations team can do to meet those requirements.”

There is a difference between 99.99 percent and 99.999 percent uptime. While it may not sound like a lot, the number of 9s can affect how you architect the infrastructure for the app (in the above example, it is the difference between roughly 52 minutes 34 seconds and 5 minutes 15 seconds of downtime per year). “What it comes down to is an understanding of the cost versus complexity and the impact of downtime to the business,” Maynard says. “Are you developing a payment gateway and could lose money out the window if it goes down, or is it a non-critical app that could simply upset a handful of users?” Communicating this expectation to the ops team can help ensure that the hardware is architected to achieve the required level of uptime.
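The downtime budget behind an availability target is simple arithmetic, and it can help to compute it before the conversation with ops. This sketch assumes a 365-day year and counts all downtime equally; the function name is made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes per year the app may be down at the given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return HOURS_PER_YEAR * 60 * unavailable_fraction


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% -> {minutes / 60:.2f} hours ({minutes:.1f} minutes) per year")
```

Under that assumption, 99 percent allows about 87.6 hours of downtime per year, 99.9 percent about 8.76 hours, 99.99 percent about 52.6 minutes, and 99.999 percent about 5.3 minutes. Each additional 9 cuts the budget by a factor of ten.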

While industries are beginning to tear down the wall between operations and developers, these tips can help in the meantime, until your organization arrives at a true DevOps structure.

About the Author

This is a post written and contributed by Garrett Heath.

Garrett Heath is a Racker who works in Rackspace Marketing and has experience as a technical project manager in the Cloud. He enjoys writing about how the cloud is spurring innovation for startups, small businesses and enterprises. On his personal blog, he writes about where he likes to eat in San Antonio.

15 Comments

790 words, 2 conclusions: document and test it.

PeterFV on November 30, 2013

Actually it was 816 words, or 4,154 characters without spaces, or 4,955 characters with spaces, and 15 paragraphs. This count excludes the title, byline, social badges, tags, and about the author. Peter, if you’re going to bash the post, get your facts right. Also, this post is great information. As a developer, I find people are often passive and choose not to criticize others’ work, which makes knowing your faults impossible. Although this may seem like common sense, as a developer I often feel that when I say things like ‘MongoDB is required,’ the answers to those questions require nothing more than common sense. It all depends on which side of the wall you’re working from.

Ian P. on March 16, 2014

Document -> test

Nothing new here.

fpleti2 on January 24, 2014

Fast, cheap and good. Pick any two. Human nature/ the adminsphere do not seem to like more than two choices. This creates the “Nothing tests like production” scenario.

Red_Ruffansore on February 8, 2014

Back in the day, I had a simple technique that helped with documentation. People were allowed to ask me questions directly but I had to respond in the documentation. The assumption is that if someone has to ask, the docs are insufficient to the task. I would either enhance the docs or re-engineer the user interface to make things clearer. Me answering directly might solve the problem for that user but there would be others.

The other technique I used was what I called incremental development. I would meet with my customer and we would bang out some very basic requirements that we could all agree on. We would leave the more minor functions for later. I would deliver the basic application very quickly and that would give us all a frame of reference for filling in the other features. My customers loved it. They didn’t have to try to think of everything they wanted all at once and then wait until it was all done to see if it was what they really wanted.

Interestingly, I was part of an acquisition and my new customers were used to a very regimented specification process. They expected to have to write a full specification and then wait a year for delivery. I come along and, after an initial discussion, I deliver a rough version in about 2 weeks. They freak out thinking that the rough version is what I expect them to use. Once they get into my way of doing things, they are thrilled. I can get them going in under a month and we can collaborate and prioritize the rest over the next month or so.

Then my job was outsourced to a group of 20 programmers in Bangkok and they laid down the edict that all software changes would be subject to a strict specification process and would take a year to complete. I guess the company saved money. I moved on.

rdfInOP on February 10, 2014

Good stuff! Sometimes process and red tape get in the way of true progress. Sad to hear the higher echelons were more focused on the budget, and not their users/customers…

CupOfJoe on February 14, 2014

This.

Happy Haxor on March 4, 2014

DO throw it over the wall, just make sure it is well documented and tested. You’re supposed to throw functional, well-documented product over the wall, not hand-grenades. Lowering the wall creates a situation where ops does not take responsibility for execution and maintenance of systems. I’ve seen it first hand.

jonbme on February 14, 2014

I believe having open doors along the wall helps. Some of the information above is pertinent when initially starting a project, or when major upgrades need to occur. It would be essential to sit down with Ops to go over the requirements. Most of the time, they have a better view of the services than Devs do. But I agree, the wall should still stay, as a way of defining responsibility.

CupOfJoe on February 14, 2014

In our case, our customers (internal) didn’t know exactly what they wanted. Incremental development was the best thing for them. They could very quickly describe the basic framework of what they wanted and I could give them that just as quickly. Then the application became the basis for our refinements.

But these were fairly small applications for an internal group that was buried in work. When I started supporting them, they were struggling to tread water. It would have been too much for them to take time out to create a detailed specification. By giving them something that worked, I immediately relieved some of their pain.

You do what works.

rdfInOP on February 15, 2014

So the engineer who wrote it shouldn’t have to deal with the consequences of his/her decisions?

Andy Rotering on February 16, 2014

In a way, I agree. Design it right so that it’s very easy to use and often it doesn’t even need documentation. (“We’ve just released our ‘hello world’ program and the 300 page manual…”)

Paul Robinson on February 26, 2014

As a software engineer who graduated with a BS in computer science... I have no idea what an operations engineer is. Is that the monkey who hits next until a Windows program is installed?

Andy Rotering on February 16, 2014

I think they mean “production”. The term baffled me as well.

chrisclement on February 27, 2014

Fundamentally, what is the functional space of Documentation, Engineering, Operations Engineering, and Software Engineering? After working for 7 years in IT on numerous projects, with no direct knowledge that this was the space (it is not my educational background), I have seen it all: waterfall, incremental, iterative, small team, managed service, custom local extreme programming. Can we get some definitions of what these teams are supposed to be doing? Managing every team/business model is a unique challenge, but realistically, what are the team descriptions and makeups ideally expected in your discipline?

AGravagile on March 3, 2014

©2014 Rackspace, US Inc.