How do we measure software bugs in a healthier way?
I’m a software engineer. I’ve worked under a metric called Zero Bug Balance, which is intended to increase software quality.
Periodically, we compute:
bug_balance[this_time] = bugs_found - bugs_fixed + bug_balance[last_time]
If bug_balance[this_time] > 0, that’s kind of bad.
If bug_balance[this_time] > bug_balance[last_time], that’s bad for real.
This is a convenient metric that management uses to gauge how each project is going, but it could be improved.
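The recurrence and its two warning signs can be sketched in a few lines of Python. The `Period` type, the function names, and the sample numbers are hypothetical and only there for illustration; they do not come from any real tracker.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """Bug counts for one reporting period (hypothetical structure)."""
    bugs_found: int
    bugs_fixed: int


def running_balances(periods, starting_balance=0):
    """Apply bug_balance[this_time] = bugs_found - bugs_fixed + bug_balance[last_time]."""
    balances = []
    last = starting_balance
    for p in periods:
        last = p.bugs_found - p.bugs_fixed + last
        balances.append(last)
    return balances


def verdict(this_time, last_time):
    """Classify a period the way the metric is read above."""
    if this_time > last_time:
        return "bad for real"
    if this_time > 0:
        return "kind of bad"
    return "fine"


# Made-up numbers for illustration only.
history = [Period(5, 3), Period(2, 4), Period(6, 1)]
balances = running_balances(history)
print(balances)
```

Note that the balance only knows counts. Nothing in it records what kind of bug was found or fixed.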
Goodhart’s Law states:
“When a measure becomes a target, it ceases to be a good measure.”
Zero Bug Balance as a scalar metric promotes an unhealthy work culture.
With Zero Bug Balance…
Luckily in our case, nothing very bad happens if our bug balance is nonzero, aside from the public shame of admitting we created more than zero bugs (sometimes a lot more).
The shame largely works: we’ve felt it when filing a bug. Unfortunately, I have been cautioned against labeling issues as “bugs” because we’re chasing Zero Bug Balance, and last_time we had bad numbers. Alternatives have included filing a card without labeling it as a bug, or just mentally noting its existence and forgetting it until it happens again.
By shaming teams for reporting bugs, Zero Bug Balance promotes an unhealthy work environment, and in that environment the metric itself loses accuracy.
I have an arsenal of negative experiences with SAFe, but I’ll share something good from my time with it: we learned risk assessment 101. Severity can be defined as the product of probability and impact.
Additionally, teams I’ve worked with typically prefer sleep over midnight emergency pages, which we can sum up as:
severity = probability * impact = fear_induced_insomnia
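To make the formula concrete, here is a small sketch. The 0-to-1 probability scale, the 1-to-10 impact scale, and the example bugs are all assumptions for illustration; the formula itself does not prescribe units.

```python
def severity(probability, impact):
    """severity = probability * impact, per the formula above."""
    return probability * impact


# Hypothetical bugs: (name, probability of hitting it, impact if it hits).
bugs = [
    ("misaligned button", 0.9, 1),
    ("data corruption on save", 0.2, 10),
    ("crash on leap day", 0.01, 8),
]

# Rank by severity, worst first.
ranked = sorted(bugs, key=lambda b: severity(b[1], b[2]), reverse=True)
for name, probability, impact in ranked:
    print(f"{name}: {severity(probability, impact):.2f}")
```

With these made-up numbers, the rarely hit data corruption bug outranks the button almost everyone sees, because its impact dominates.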
This is the severity chart Zero Bug Balance encourages:
In contrast to Zero Bug Balance, reality yields the following chart:
Equally penalizing bugs of any severity means Zero Bug Balance treats trivial annoyances the same as nightmares. Fixing a nightmare into an annoyance counts for nothing.
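That last point can be shown with numbers. The bugs and scales below are made up, and the summed severity is only a way to show what a raw count hides; it is not a metric this post proposes.

```python
def severity(probability, impact):
    """severity = probability * impact."""
    return probability * impact


def bug_count(bugs):
    """What Zero Bug Balance sees: every open bug counts as 1."""
    return len(bugs)


def total_severity(bugs):
    """Sum of severities across open bugs (illustration only)."""
    return sum(severity(p, i) for p, i in bugs.values())


# Hypothetical open bugs: name -> (probability, impact).
before = {"nightmare": (0.5, 10), "annoyance": (1.0, 1)}

# A partial fix lowers the nightmare's impact, but the bug stays open.
after = {"nightmare": (0.5, 1), "annoyance": (1.0, 1)}

print(bug_count(before), bug_count(after))            # 2 2
print(total_severity(before), total_severity(after))  # 6.0 1.5
```

The count is unchanged at 2, so the partial fix earns no credit under Zero Bug Balance even though the team's exposure dropped considerably.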
It would be a dream to keep no bugs around. However, here in reality, teams weigh the opportunity cost of solving a bug. Bugfixes compete with other priorities. Erik Bernhardsson describes this scenario, as seen by outsiders:
“Why is bug Z still present? Hasn’t it been known for a really long time? I don’t understand why they aren’t fixing it?”
And his answer:
“Of course, why these things never happened is that something else was more important.”
A visual bug in a UI probably matters less than a data corruption bug in an API. Similarly, a bug in a component may not be worth fixing if a new component will obsolete it soon. Or perhaps a bug is worth fixing sometime, except the person representing the entire bus factor for that part is on vacation.
In these situations, the team may accept the technical debt of solving the low-severity bugs later. Allowing some technical debt is good: it affords the option to push higher-priority features or fixes.
Zero Bug Balance does not account for acceptably low-severity bugs. It ignores opportunity cost and discourages the team from leveraging technical debt.
Microsoft has a concept called Zero Bug Bounce. It is a state of product maturity that teams strive for, described in Mike Torres’ opening paragraph in his article applying it to life:
“… all active bugs in the software have been looked at and either punted or fixed – and the team’s fix rate (or the rate at which they’re able to fix bugs) is greater than the team’s incoming rate (or the rate at which new bugs are being opened).”
Zero Bug Bounce differs from Zero Bug Balance in two ways: punting a bug (deferring a fix until later) counts as fixing it, and the focus is on the bug-finding rate relative to team capacity. However, Zero Bug Bounce is only meant to account for new bugs, and is unaffected by shrinking a running bug balance.
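Read literally, the quoted definition has two conditions, which can be sketched as below. The `Bug` type, its status strings, and the rate parameters are hypothetical names for illustration; they are not taken from Microsoft's tooling.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    """Hypothetical bug record. Status is "new", "punted", or "fixed"."""
    title: str
    status: str


def at_zero_bug_bounce(bugs, fix_rate, incoming_rate):
    """True when every bug has been punted or fixed and fixes outpace new reports."""
    all_triaged = all(b.status in ("punted", "fixed") for b in bugs)
    return all_triaged and fix_rate > incoming_rate


# Made-up backlog: one bug fixed, one deliberately deferred.
backlog = [Bug("crash on save", "fixed"), Bug("misaligned icon", "punted")]
print(at_zero_bug_bounce(backlog, fix_rate=5, incoming_rate=3))
```

Under this reading the punted icon bug does not block the milestone, whereas under Zero Bug Balance it would still count against the team.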
I propose a friendlier alternative to Zero Bug Balance that I’m dubbing Bugs Found and Fixed, or BFF.
Recall Zero Bug Balance’s formula:
bug_balance[this_time] = bugs_found - bugs_fixed + bug_balance[last_time]
BFF makes an operator more positive, and stops caring about bug balances. Behold:
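Taking that description literally (flip the minus to a plus, drop the running balance), BFF can be sketched next to the old formula as follows. This is a reading of the sentence above, and the function names and sample counts are illustrative.

```python
def bug_balance(bugs_found, bugs_fixed, last_balance):
    """Zero Bug Balance: found bugs hurt, fixed bugs help, history carries over."""
    return bugs_found - bugs_fixed + last_balance


def bff(bugs_found, bugs_fixed):
    """Bugs Found and Fixed: finding and fixing both count in the team's favor."""
    return bugs_found + bugs_fixed


# A period where the team finds 6 bugs and fixes 4, starting from a balance of 3.
print(bug_balance(6, 4, 3))  # 5
print(bff(6, 4))             # 10
```

In this sketch, reporting a bug raises the score instead of lowering it, so there is no incentive to leave an issue unlabeled.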
This absolutely groundbreaking formula is actually a better alternative to Zero Bug Balance, and even Zero Bug Bounce:
Admittedly, BFF alone can be gamed by a bored engineer filing many small bugs in place of one, but a picture paints itself when combined with other metrics:
Deciding what we measure plays into delivering a healthy culture. Particularly when combined with other metrics, Bugs Found and Fixed is a healthier alternative to the intolerance of Zero Bug Balance.
Plus, who wouldn’t want a good BFF?