Building Confidence

Siddharth Ram
3 min read · May 4, 2022

In Managing Debt, I shared my opinion on the four traits that leaders need to solve for: Cohesiveness, Independence, Ownership and Confidence. I will address Confidence in this post and share how to solve for it.

The first step in measuring confidence is understanding the current state. I like to simplify it with a single question that I ask engineers every quarter:

Would you recommend this code base to a friend?

There are other ways of asking this question. It could be much broader, for example: “Would you recommend this company/engineering team to a friend?” But that starts overlapping with employee engagement scores and dilutes the focus on technology. So I prefer to keep it very focused.

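This NPS question reveals the confidence engineers have in making changes. As a reminder, NPS = % Promoters - % Detractors. On a scale of 0–10, scores of 9 and 10 are promoters and scores of 6 or below are detractors. Scores of 7 and 8 are passives: they count toward the total number of responses but toward neither group.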
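As a quick illustration of the arithmetic, here is a minimal sketch in Python. The function name `nps` and the sample scores are made up for this example; they are not taken from any real survey.

```python
def nps(scores):
    """Return the Net Promoter Score for a list of 0-10 survey answers.

    Promoters score 9 or 10, detractors score 6 or below.
    Passives (7 and 8) only count toward the total.
    """
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical quarter: 2 promoters, 6 detractors, 2 passives out of 10.
quarter = [9, 10, 2, 3, 4, 5, 7, 8, 6, 1]
print(nps(quarter))  # -40.0
```

A negative result, as in this made-up quarter, simply means detractors outnumber promoters.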

It is quite likely that you will get a low score (and possibly a negative score, which means there are more detractors than promoters). Engineers tend to be somewhat discontented with the state of affairs (and that is mostly a good thing). What matters more than the score is whether you can improve it quarter over quarter. Doing so means understanding what gets in the way of engineers: why did the detractors give the score they did? And what do you need to do to get to a target state that addresses their concerns? To make this happen, you need to have conversations with the CEO and the executive leadership team and establish a tech modernization budget.

Feature Flags

I have found that a key tactic in improving confidence is to use feature flags, even as you chip away at modernizing your stack (e.g. you may be planning to move to microservices). A well-managed feature flag system allows engineers to release to production with confidence, because you can target the audience who will see changes. At Inflection, feature flags were used to release code to production after automated tests were run post-commit in the CI/CD pipeline. This allowed major new changes to be available only to internal users. Once there was confidence that the change worked well, the team (often the Product Manager) would turn up the dial to expose the changes to more customers. On occasion, we used feature flags to make a feature available only to a particular customer who really needed it.
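To make the "dial" concrete, here is a minimal sketch of a percentage-based flag in Python. It is not the system used at Inflection; the `FeatureFlag` class, its fields, and the flag and customer names are all hypothetical. The sketch assumes each user has a stable ID and hashes it into one of 100 buckets, so the same user gets the same answer on every request and turning the dial up only ever adds users.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical flag: internal users first, then a percentage dial."""
    name: str
    rollout_percent: int = 0            # 0-100, share of external users
    internal_enabled: bool = True       # internal users always see it
    allowed_customers: set = field(default_factory=set)  # per-customer opt-in

    def _bucket(self, user_id: str) -> int:
        # Stable bucket in [0, 100) derived from flag name + user ID.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def is_enabled(self, user_id: str, customer_id: str = "",
                   is_internal: bool = False) -> bool:
        if is_internal and self.internal_enabled:
            return True
        if customer_id in self.allowed_customers:
            return True
        return self._bucket(user_id) < self.rollout_percent


# Released to production, but visible only to internal users.
flag = FeatureFlag(name="new-report-page")
print(flag.is_enabled("emp-1", is_internal=True))   # True
print(flag.is_enabled("user-42"))                   # False

# Later: one customer who needs it, then everyone.
flag.allowed_customers.add("acme")
print(flag.is_enabled("user-42", customer_id="acme"))  # True
flag.rollout_percent = 100
print(flag.is_enabled("user-7"))                    # True
```

Including the flag name in the hash is a design choice: it keeps the same users from always being the first to see every new feature.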

The downside to feature flags is that they make the code worse if they are not removed once they have served their purpose. We took a disciplined approach: whenever we added a backlog item that introduced a feature flag, we added another story for its removal at a future time (between 3 weeks and 3 months is typical).

CI/CD and Automated Testing

How do you get better at something you are not good at? By practicing it frequently. Teams often become fearful of releases, and with good reason: a key cause of system instability is change. The odds of a production incident are much higher right after a release. So what is the instinctive reaction? Release as infrequently as you can. This is fear in action.

If you want to get better — and hence, more confident — release more often, not less often.

At a startup I was at, releases happened every 3–4 weeks. Each one was a huge production. Testing was manual and took a week. Come release time, tensions were high. The SRE and DevOps teams were on notice, as was customer support.

This is where you do not want to be. Part of the reason for the lack of confidence is drift — over 4 weeks, the amount of change can be substantial. Releasing smaller changes more often results in confidence.

To do this, over the course of 1.5 years (it took that long because of the complexity of a monolith with 4M+ lines of code) we automated all regression testing, built a unit testing framework, and added standards for code coverage and code acceptance. We built out a CI/CD pipeline that could deploy the monolith in 5 minutes, even as we were moving chunks out of the monolith into serverless, Lambda-based microservices.

We went from an NPS of -38 to +13 over 2 years. Focus more on the trajectory of the scores than on the absolute value. It is hard work, but it must be done. And engineers appreciated the company's focus on making their work more productive and less painful.
