written by
Luke Szyrmer

6 traps when choosing operational metrics for software or digital teams


Because it's highly conceptual, software development is notoriously difficult to manage with data:

  • On one hand, it's very clear when something that's been built works, and when it doesn't. And even though it's digital, it follows a logic almost as objective as the laws of physics: gravity, thermodynamics, etc. It will be painfully obvious to pretty much anyone with a pulse that something's off.
  • On the other hand, building software is a creative process. There are many ways to approach building a piece of code. At a high level, it's about solving a business problem or addressing a user need. But when you get into the details, there can be big variations based on what's easy to make, what's possible, and what's attractive to the user.

In an effort to get a bit more quantitative, I started peering into our source code control system. For each person on the team, I noted down the number of commits, or changes to the code, they made over the course of a month.

I wasn't really looking to create some kind of monitoring system. I just wanted to identify patterns in the work being completed, and to find anything that was obviously wrong. It turned out that a few developers were averaging one small change per week, whereas the rest were managing to submit an equivalent number of changes over the course of a day. So it kicked off some useful discussions.
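If you're curious, here's a minimal sketch of how that kind of tally can be pulled straight out of git with plain `git log` (not our actual tooling; the repo path and date window are placeholders):

```python
import subprocess
from collections import Counter

def commits_per_author(repo_path, since="1 month ago"):
    """Tally commits per author over a time window, using plain `git log`."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}", "--format=%an"],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(name for name in log.splitlines() if name)

# Example: a quick review table for the current repo
for author, count in commits_per_author(".").most_common():
    print(f"{author:30} {count:4}")
```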

But that said, if this became the only metric to evaluate developer performance, it would ultimately distort overall productivity. I think the fastest way to increase productivity is not to make the individual contributors as productive as possible, but to make sure the team is productive. And commits per developer would drive a local optimization, not improve the system as a whole. In addition to commits, factors like how quickly and how well code is reviewed matter. How quickly it's tested after it's done also matters a lot. Without that kind of data, it would be hard to truly operationalize anything, because a single metric would probably skew behavior in the wrong direction.

At the moment, in order to address this problem, we're implementing GitPrime at the client site. It sounds good on paper: we'll have greater visibility into team dynamics, as well as systemic metrics based on what's going on in the code base. If the primary output we need is working software, then tracking metrics around how it arises is a high-leverage activity. Will report back when I have a clearer view of how it helped (or didn't).

Here are a number of traps I've fallen into in the past when choosing operating metrics for software teams. This is for my own benefit, and hopefully you'll also find it useful:

Hard to observe/measure

If it's possible to generate a metric, but producing the number takes a lot of effort, then it's not a good metric to use. Alternatively, if you can only observe the metric on a monthly basis, then it's useful strategically, but not operationally, because the team won't have fast feedback loops to guide their efforts.

If you are using a manual process, the gold standard for this is what I found in the EOS approach:

Source of data given by a link, which clearly defines how a metric is "calculated"

Have a spreadsheet with a number of metrics which are recorded on a weekly basis. See above for a Google Docs example. For each of those metrics, document exactly how that metric is generated. It should be so clear and easy that you could have an intern or receptionist follow the instructions. In my case, since I was mostly gathering stats around my writing, this included URLs/links to specific reports in Google Analytics, and then describing which boxes needed to be transcribed into the operating spreadsheet.
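To illustrate the idea, here's a minimal sketch of what such a scorecard definition could look like if you encoded it rather than wrote it in prose. The metric names, URLs, and instructions are all made-up placeholders:

```python
# A sketch of an EOS-style scorecard definition: every metric records
# exactly where its number comes from and how to transcribe it, so that
# anyone can follow the instructions. All names and URLs are placeholders.
METRICS = {
    "weekly_pageviews": {
        "source": "https://analytics.google.com/...",  # link to the exact report
        "instructions": "Copy the Pageviews total for Mon-Sun into column B.",
        "cadence": "weekly",
    },
    "commits_merged": {
        "source": "https://github.com/org/repo/graphs/commit-activity",
        "instructions": "Copy last week's commit count into column C.",
        "cadence": "weekly",
    },
}

for name, spec in METRICS.items():
    print(f"{name}: {spec['instructions']} (source: {spec['source']})")
```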

This was also why I started aggregating commit data manually from git. Being forced to look at each team member's profile on a regular basis meant that I became aware of what was going on. And the spreadsheet, which I updated weekly, gave me a cross-team view over time.

Focusing on efficient output while losing sight of outcome

This is a really common one in the context of waterfall project management. The three underlying variables that are optimized there are:

  1. % utilization: how much time each person spends working
  2. % completion: a (usually highly subjective) estimate of how much of the work is done
  3. schedule adherence: how this relates to predetermined dates that were agreed at the beginning of the project

While in theory all of this sounds like a great idea, these metrics are devoid of any measure of output and, more importantly, outcomes. Don Reinertsen (@DReinertsen) has a concise summary of why this is madness: "In product development, our problem is virtually never motionless engineers. It is almost always motionless work products." All of the above metrics focus on what the people are doing or not doing, not on whether the product is getting built.

If you are interested in efficiency, motionless work products matter more than motionless engineers.
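One way to put a number on "motionless work products" is flow efficiency: the share of a work item's total lead time during which someone was actively working on it. A minimal sketch, with made-up timestamps:

```python
from datetime import datetime

# Flow efficiency = time actively worked / total elapsed (lead) time.
# A low number means the work product sat motionless in queues.
def flow_efficiency(started, finished, active_hours):
    lead_hours = (finished - started).total_seconds() / 3600
    return active_hours / lead_hours

# Made-up example: a ticket open for 10 days, but only touched for 6 hours
eff = flow_efficiency(datetime(2019, 3, 1), datetime(2019, 3, 11), active_hours=6)
print(f"flow efficiency: {eff:.1%}")  # ~2.5% -- mostly waiting
```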

Also notice how all of the above apply to projects in general; there is nothing business- or industry-specific in them. According to these numbers, what matters is whether or not the project finishes on time. And if that is what matters, that is what should be tracked directly, not efficiency. Because efficiency can easily dominate everyone's attention, until you lose sight of what you're trying to accomplish.

Having worked with both larger companies and entrepreneurs, I suspect founders and startups get this intuitively. But beyond a certain size, bigger companies get so obsessed with efficiency that they end up spending very little time talking about effectiveness and strategy.

Independent of customer needs

This is one of the common traps I tried to address with the Hero Canvas. In established companies, the internal focus increases over time. A lot happens, but customers don't see it. And to be fair, they might not want to see it all anyway. :) Any work which doesn't contribute directly to what a customer or prospect might want is "waste". Some of this waste is necessary to produce the intended outcome. But at least then it's added deliberately.

The Hero Canvas

For example, in a larger company, there are quite a few organizational onion layers between the product teams and customers. There can be lots of reasons for this. One of the most common is functional silos. Because sales owns the customer relationships (and the CRM), they are unwilling to risk sharing access with anyone else. Conversely, the technical staff might prefer to deal with technology, and to let the sales and marketing folks deal with the messiness of people & relationships.

As a result, there is a disconnect between the market-facing side of the company and the technology-facing side. And lots of un-sellable features get developed. Or the timing is off, and the sales teams want a faster time to market. To some extent, a well-curated roadmap can help alleviate this problem. But even then, the point of the roadmap is not the roadmap. It's the conversations and openness that lead up to creating it. The roadmap itself is just a side effect.

Metrics as simple as the number of customers affected, or some kind of measure of how they are affected, would be really useful for staying on track with any new products or changes to existing products.

Multicollinearity, e.g. spending vs. the project calendar

Multicollinearity is a pesky attribute of many complex systems. Basically, it means that a number of variables or metrics you think are independent of one another actually vary together. In particular, this matters most for the variables that help define what the output of the system is.

The most famous example in financial markets relates to what happens during market crashes. When the market falls like a knife, all of the variables you were trying to use to figure out where the market will go... fall together. So suddenly all of your fancy math and statistics stop being useful, right when you need them to work the most.

On a project level, there are a lot of metrics that just don't change much. They simply go up the same amount, month by month. For example: total development cost seems like a good metric to track. Yet as long as the team composition stays stable over the life of the project, you already have a pretty good predictor of what your product development cost will be. Each month will cost you a certain run rate. Beyond that, spending time and effort analyzing it, e.g. on a monthly basis, has much less value.

Because the project budget is colinear with the calendar, and with the percent of the original plan completed. If any one of these variables (total development cost, percent scope completed, earned value) is 63% complete, all of them will be. So you can just track one of them and know it's roughly similar for the rest. Or even better, make sure your metrics are really related to something that matters.
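To make the colinearity concrete, here's a minimal sketch with made-up numbers: under a constant run rate, cumulative spend and the calendar are perfectly correlated, so tracking both tells you nothing extra.

```python
import statistics  # statistics.correlation requires Python 3.10+

# With a stable team, monthly spend is a constant run rate, so cumulative
# cost is a straight line against the calendar. All figures are made up.
months = list(range(1, 13))
run_rate = 80_000  # hypothetical monthly team cost
cumulative_cost = [m * run_rate for m in months]

r = statistics.correlation(months, cumulative_cost)
print(f"correlation(month, cumulative cost) = {r:.3f}")  # prints 1.000
```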


Or as Chris Condron (@clcondron) put it recently, you're "measuring progress based on how much gas you have left in the tank." It's better to measure progress based on what's happening in the near or immediate future, not based on a view you took months earlier, when you knew very little about what needed to be done. Which leads me to the next observation...

Budgets and earned value are up-front guesses, which assume nothing changes over the life of a project

In anything high-tech related, most of the assumptions you make at the beginning of a project will be guesses. Many don't matter, but a few might matter a lot. This includes business, operating, and technical assumptions. And most financial metrics are just not useful when you have such low confidence in their inputs in the first place.

Given that's the case, you shouldn't be surprised that 35-50% more effort or scope is added to projects on average, relative to the best efforts of the product team to define what needs to be done. This is a really tricky one. Because you're knee-deep in the "fuzzy" side of innovation, you don't really know exactly how to build the product. You want the team to have enough flexibility to explore the problem domain. At the same time, if you don't have a clear scope, your QA team won't know what to test, because you don't know what you'll deliver at a more detailed level.

All of this scope creep wreaks havoc on more traditional financial metrics for the project. Budgets are defined on an annual cadence. Product development work happens on a weekly to monthly cadence. So by month 3, what the product team does might be radically different from what was in the budget, because the original budget has turned out to be beside the point.

Source: Bogsnes, Beyond Budgeting

Budgets themselves also tend to fall into two categories, according to Bjarte Bogsnes of Beyond Budgeting fame, based on the demand and market context in which they operate:

  1. Same weather tomorrow as today: if this happens, you can pretty much just repeat what you did this year, plus an annual inflation adjustment. Not much will change anyway, so it's not worth spending much executive time and effort on planning and politicking.
  2. Different weather tomorrow than today: if this happens, pretty much everything you did this year is irrelevant, and you need to redesign your budget from scratch (zero-based budgeting).

And finally, earned value is even more pointless for truly high-tech products. It multiplies a subjective (and usually unvalidated) estimate of sales activity by a subjective estimate of % completion. Both of those are easy to game, hard to measure objectively, and the people doing the measurement have incentives to skew the numbers in their favor.
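As a sketch of why this is so gameable, here's the textbook earned-value arithmetic (EV = budget at completion x % complete) with made-up figures. Notice how much the result swings between equally defensible claims of % complete:

```python
# Textbook earned-value arithmetic: EV = budget at completion x % complete.
# Both inputs are subjective estimates, so the output inherits whatever
# bias the estimators bring. All figures are made up.
budget_at_completion = 1_200_000  # the up-front guess
for pct_complete in (0.50, 0.63, 0.75):  # equally defensible claims
    ev = budget_at_completion * pct_complete
    print(f"claimed {pct_complete:.0%} complete -> earned value ${ev:,.0f}")
```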

Not related to a decision either immediately or in the future

Finally, tracking metrics for general-purpose knowledge is not a good use of anyone's time. A great litmus test for whether or not a metric is useful is whether it would trigger action or a decision at a threshold value. If it wouldn't, then realize that you are paying a cost to monitor it (one which most likely exceeds the value you get from it).

It doesn't really matter if it's actionable for the product team or for the stakeholders around the project. The main point is that it should be really obvious when you need to intervene, so that you don't intervene otherwise, and you minimize management meddling.
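Here's a minimal sketch of what that litmus test could look like in practice: every tracked metric declares a threshold and the decision it triggers. The metric names, values, and actions are purely illustrative.

```python
# If you can't fill in the "action" field for a metric, it probably
# isn't worth its monitoring cost. All entries below are illustrative.
WATCHLIST = [
    {"metric": "open_bugs", "value": 47, "threshold": 40,
     "action": "pause feature work, schedule a bug-fix sprint"},
    {"metric": "review_wait_days", "value": 1.5, "threshold": 2.0,
     "action": "reassign reviewers"},
]

for item in WATCHLIST:
    if item["value"] > item["threshold"]:
        print(f"{item['metric']} breached threshold -> {item['action']}")
```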

Key Takeaways

  • Metrics need to drive a decision or change of behavior. Otherwise they're pointless.
  • Metrics are ideally tied to how you deliver customer (or prospect) outcomes.
  • Financial numbers such as budgets give an air of precision, but are typically not useful in the context of new product development.