Once upon a time, a team got into a cycle of putting out fires that escalated to the point where productivity practically ground to a halt. When one issue was fixed, another popped up. It was like a game of whack-a-mole. It never stopped. This team had been in this cycle for so long that people considered it normal.
“That’s life, it happens.”
But this doesn’t have to be life. This doesn’t have to “happen.”
There’s a way out.
No no, not that radical way out…
Let’s look at a couple of examples of this type of issue, and then at ways to break out.
1. Manual QA Regression Cycles
Let’s say one of your products requires a two-week manual regression cycle before a release. Over the years a huge suite of manual tests has accumulated in order to detect regressions. Obviously, such regression cycles are expensive, slow, and far from inspiring for those who perform them. In an attempt to address this, your company replaced manual testers with QAs who can automate these tests. Good idea!
However, due to escalations and constant maintenance releases, the newly hired automation QAs never get to actually write any automated tests. Instead, they spend all their time testing manually so that releases are not blocked.
Every time they finish, so many new issues have been escalated by customers and fixed by developers that the QAs have a full-time job manually verifying the fixes, followed by yet another regression cycle.
Then, everybody keeps their fingers crossed, hoping that this time they will get some time to actually automate stuff.
Yeah, that doesn’t happen. After half a year, barely any real progress has been made. The result: people hired to do automation do manual testing instead, underutilizing their skill set, to put it mildly.
2. Escalation Cycle of Doom
Cycle number two: you have a very exciting product roadmap, but then trouble hits. For some reason your last maintenance release introduced some serious new bugs. Customers en masse are reporting serious problems.
So, all hands on deck! All developers are reallocated to work on fixing escalated bugs. For weeks, rather than making progress on the roadmap, developers spend their time fixing all the things.
Unfortunately, after this new release, new and different issues appear, and the cycle of playing whack-a-mole continues.
Not only is no progress made on the product roadmap, developers also get very demotivated by always putting out fires.
3. Fire as the Status Quo
Welcome to Ops. Things being on fire is normal. It’s part of the job. No reason to get upset. We put out a fire here, and another appears there.
If this is what your Ops department looks like, it must be burning through people (no pun intended) rapidly. People can only take this sort of stress for so long before they have to get out.
Breaking the Cycle
So what to do? Here’s how you can attempt to break such cycles:
- Accept that you have a problem. This seems obvious, but in many cases, because cycles have been in place so long, people no longer recognize they’re in one — it has become the status quo.
- Decide that you have to invest in getting the problem fixed. This will cost you in the short term, but should quickly start paying for itself in the long term.
- And now the real meat:
- Have retrospectives. Have a meeting where the whole team brainstorms on how to break the cycle and how to keep the same thing from occurring again and again.
- Learn from what happened. Take time to pause, step back, and think about what you’re doing. Ideas for improvement can be small, but should result in making meaningful progress. If brainstorming ideas results in nothing useful, consider inviting an outsider. A fresh perspective can help.
- Implement at least one small idea. Radical changes of strategy are often not realistic, or even wise. Focus on small steps. Only if that doesn’t work, consider more radical approaches.
Here are some examples of things that may be proposed during a retrospective.
Addressing “QA regression cycles”
- While counter-intuitive, one thing to consider may be to hire some additional manual testers. While these (cheaper) people do the manual regression, the automation QAs can actually start automating part of the process. Perhaps in the first cycle they manage to shave only an hour or two off the manual regression, but eventually all or most of the regression cycle is automated and little or no manual testing is required.
- If hiring more people is not an option, consider assigning a non-negotiable minimum automation budget, let’s say 25% of time, that should be spent on actual automation. You will have to accept that this will slow down releases initially.
- Review all tests in the test scripts to see if they can be addressed more efficiently (faster, more reliably) with other types of testing. User acceptance testing (desktop automation, Selenium) should generally be a last resort for finding bugs. Perhaps more aspects can be verified at lower levels, like integration testing or even unit testing.
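To make that last point concrete, here is a minimal sketch of a check that might otherwise live in a manual or UI-driven script, written as plain unit tests instead. The discount rule and the `order_total` function are invented for illustration; substitute whatever rule your own regression script is really verifying.

```python
# Hypothetical business rule (an assumption, not from any real product):
# orders of 100.00 or more get a 10% discount. Amounts are in cents.
DISCOUNT_THRESHOLD_CENTS = 10_000


def order_total(subtotal_cents: int) -> int:
    """Return the payable total in cents after any volume discount."""
    if subtotal_cents < 0:
        raise ValueError("subtotal cannot be negative")
    if subtotal_cents >= DISCOUNT_THRESHOLD_CENTS:
        return subtotal_cents * 90 // 100
    return subtotal_cents


# Each test replaces one step a tester would otherwise click through by hand.
def test_no_discount_just_below_threshold():
    assert order_total(9_999) == 9_999


def test_discount_applies_at_threshold():
    assert order_total(10_000) == 9_000


def test_negative_subtotal_is_rejected():
    try:
        order_total(-1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative subtotal")
```

Tests like these run in milliseconds with a runner such as pytest, so the boundary cases get checked on every commit rather than once per two-week cycle.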
Addressing “escalation cycles”
Have release retrospectives in which you review all escalations addressed in that release (and perhaps previous ones), looking for patterns:
- Is it always the same piece of the product that’s breaking? Perhaps it is worth covering this part with more tests to detect regressions more easily. This will also make rewriting the whole thing easier later on.
- Is it always some other piece of the product that’s breaking? Perhaps components are too intertwined, resulting in whack-a-mole problems. Consider extracting pieces and rearchitecting parts of the product, piece by piece.
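If your escalations are tracked anywhere, even in a spreadsheet, spotting these patterns can be as simple as counting. The sketch below assumes a hypothetical log of (release, component) pairs; the release numbers and component names are made up, and in practice you would export them from your own issue tracker.

```python
from collections import Counter, defaultdict

# Hypothetical escalation log: one (release, component) pair per escalation.
escalations = [
    ("4.1", "billing"), ("4.1", "billing"), ("4.1", "reports"),
    ("4.2", "billing"), ("4.2", "search"),
    ("4.3", "billing"), ("4.3", "billing"), ("4.3", "import"),
]


def escalations_per_component(log):
    """Count how many escalations each component received overall."""
    return Counter(component for _, component in log)


def releases_per_component(log):
    """Count in how many distinct releases each component broke."""
    seen = defaultdict(set)
    for release, component in log:
        seen[component].add(release)
    return {component: len(releases) for component, releases in seen.items()}


def repeat_offenders(log, min_releases=2):
    """Components that broke in at least `min_releases` different releases."""
    counts = releases_per_component(log)
    return sorted(c for c, n in counts.items() if n >= min_releases)


if __name__ == "__main__":
    print(escalations_per_component(escalations).most_common())
    print(repeat_offenders(escalations))
```

A short list of repeat offenders points at the first pattern (the same piece keeps breaking). If nearly every component shows up exactly once, you are more likely looking at the second pattern, where changes in one place keep breaking something somewhere else.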
Addressing “fire fighting in Ops”
Do root-cause analyses using 5 Whys. This powerful technique, which originated at Toyota, is designed to get to the root of the problem: you keep asking “why” until you reach it, and at every “why” you come up with specific actions to ensure that the same issue cannot recur.
Don’t just sit there — retrospect and act. Break the cycle. Learn and improve.
If you only ever have one type of meeting, have retrospectives. Seriously.