Confidence is a Key Metric

Whether you’re the CTO, an engineering manager, a team lead, or an individual contributor, when you’re building systems and processes, whether for release or otherwise, a key metric to keep in mind is confidence.

What does that mean? It means that you need to build a system that is trusted. For example, a CI system that let every build through wouldn’t inspire much confidence. Whether or not users would call that “broken,” how could you ever trust the system to catch your mistakes or to keep your product from going off the rails?

What happens to systems that don’t inspire confidence? Eventually, nothing. That’s the problem with systems that don’t instill confidence; more often than not, their use will decline until they’re not used at all.

In the absence of a trusted process, individuals will seek to replace that system with something they can be more confident in.

These replacements are often “hope-driven” systems.

Hope-driven systems are usually less productive. Think of manually checking and rechecking the commits in a release, or reading and rereading a patch in the hope that a bug that didn’t pop up on the 4th pass (or the 3rd, or the 2nd) will jump out on the 5th. Or, lacking confidence in a result (say, a test run or pipeline flow), retrying it in the hope that the result will be different or more revealing.

Does this mean double-checking things is bad? No, not at all. It simply means that ad hoc “security blanket” systems, which put humans in place of computers, instead of the reverse, for things computers are good at, like checking tests and confirming outcomes, are usually inefficient.

They are certainly less efficient than, for example, rolling out to a single box in prod and giving it 1% of traffic, something a system may make possible in a matter of minutes, as opposed to the hours or even days that panicked reading and rereading takes.
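As a rough illustration of how small the mechanical part of a 1% rollout can be, here is a minimal sketch of deterministic traffic splitting. The function name `routes_to_canary`, the `request_id` key, and the hash-bucket approach are assumptions made for this example, not a description of any particular load balancer or deploy tool.

```python
import hashlib


def routes_to_canary(request_id: str, canary_percent: float = 1.0) -> bool:
    """Return True if this request should go to the single canary box.

    Hypothetical helper: hashing the request ID means the same ID always
    gets the same answer, so the split is repeatable rather than random.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto 10,000 buckets (0.01% steps).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < canary_percent * 100


if __name__ == "__main__":
    sample = [f"request-{i}" for i in range(100_000)]
    share = sum(routes_to_canary(r) for r in sample) / len(sample)
    print(f"canary share: {share:.2%}")
```

Because the decision depends only on the request ID, a surprising result on the canary can be traced back to specific requests instead of being retried and hoped away.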

If your system isn’t producing confidence, then it’s likely being subverted or causing time to be wasted.

Systems should produce more than the end result. They should produce knowledge and confidence in the process itself and in the operators’ skill along the way. It doesn’t have to be a lot at once. I’m not saying that someone should run through a checklist or other process once and feel like they’ve mastered it, but over time this should be a noticeable effect.

Here are some things that erode confidence:

  • Intermittent outcomes
  • Random failures
  • Unreliability
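One way to keep intermittent outcomes from quietly eroding trust is to record them instead of retrying past them. The sketch below is only an illustration under assumed names: `classify_runs` and its three labels are hypothetical, and it assumes you already collect pass/fail results from repeated runs of the same check against the same commit.

```python
def classify_runs(outcomes: list[bool]) -> str:
    """Label repeated runs of one check against one unchanged commit.

    Hypothetical helper: True means a run passed, False means it failed.
    """
    if not outcomes:
        raise ValueError("need at least one recorded run")
    if len(set(outcomes)) > 1:
        # Same input, different results: the check itself is unreliable.
        return "intermittent"
    return "stable-pass" if outcomes[0] else "stable-fail"


if __name__ == "__main__":
    print(classify_runs([True, True, True]))
    print(classify_runs([True, False, True]))
    print(classify_runs([False, False]))
```

A check labelled "intermittent" is a candidate for fixing or quarantining, which is a more productive response than rerunning it and hoping.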

Things that instill and improve confidence:

  • Accurate, helpful error messages
  • Date of last accuracy check

Note that not all of these need be applied in all situations.

What processes or systems do you have in place on your team that build or destroy confidence?