I gave a talk a couple of years ago about five common metrics that are misused for measuring developer productivity:
- Commits
- Lines of code
- Number of pull requests
- Velocity points
- Code impact
These metrics still come up often, despite growing consensus that they are ineffective. This post summarizes key points from my talk to serve as a quick reference for why each of these metrics should not be used for measuring productivity.
Commits. The number of commits doesn’t tell you anything about the size, value, or quality of those commits. In addition, it is impossible to measure commits with any semblance of accuracy, since developers each have their own commit patterns and typically squash their commits before pushing to a remote branch.
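To make this concrete, here is a minimal Python sketch. The `Commit` record, the author names, and the line counts are all hypothetical, and `squash` is a simplification that just sums line counts rather than computing a real net diff. It only illustrates how two developers can ship the same amount of change with very different commit counts, and how squashing erases that difference.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    author: str          # hypothetical author name
    lines_changed: int   # hypothetical size of the commit


def commit_count(commits, author):
    """Count the commits attributed to one author."""
    return sum(1 for c in commits if c.author == author)


def lines_changed(commits, author):
    """Total lines changed by one author across all their commits."""
    return sum(c.lines_changed for c in commits if c.author == author)


def squash(commits, author):
    """Collapse one author's commits into a single commit (simplified)."""
    total = lines_changed(commits, author)
    others = [c for c in commits if c.author != author]
    return others + [Commit(author, total)]


# Made-up history: both developers deliver the same 120-line change.
history = [
    Commit("alice", 120),
    Commit("bob", 30), Commit("bob", 30), Commit("bob", 30), Commit("bob", 30),
]

for name in ("alice", "bob"):
    print(name, commit_count(history, name), lines_changed(history, name))

# After bob squashes, his commit count drops to 1 with no change in work done.
print("bob after squash:", commit_count(squash(history, "bob"), "bob"))
```

By commit count, bob looks four times as productive as alice until he squashes, at which point the two look identical. Nothing about the delivered work changed in either case.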
Lines of code. This metric has lingered for decades, despite criticism from notable programmers like Martin Fowler and Bill Atkinson. More lines of code do not equate to more value delivered, since software is best written using the fewest lines of code possible in order to increase readability and maintainability.
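As a small illustration, the sketch below compares two made-up implementations of the same function (the function `total_even` and both source strings are invented for this example). They behave identically, yet a lines-of-code metric would credit the longer one with three times the output.

```python
# Two hypothetical implementations of the same behavior, kept as source
# strings so their line counts can be compared directly.
VERBOSE = """
def total_even(numbers):
    result = 0
    for n in numbers:
        if n % 2 == 0:
            result = result + n
    return result
"""

CONCISE = """
def total_even(numbers):
    return sum(n for n in numbers if n % 2 == 0)
"""


def count_lines(source):
    """Count non-blank lines of source code."""
    return sum(1 for line in source.splitlines() if line.strip())


def load(source):
    """Execute a source string and return the total_even function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["total_even"]


sample = [1, 2, 3, 4]
for label, source in (("verbose", VERBOSE), ("concise", CONCISE)):
    print(label, count_lines(source), load(source)(sample))
```

Both versions return the same result for the sample input; only the line count differs.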
Velocity points. Velocity points are a popular way of estimating future work. Unfortunately, they’ve also become a popular way of measuring how much work has been delivered. There are two problems with this. First, velocity points are estimates made before the work begins, and these estimates are incorrect more often than not. Second, turning velocity into a measure of “work delivered” incentivizes teams to inflate their point estimates, which undermines the core purpose of the points: estimation.
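The inflation problem comes down to simple arithmetic. In the sketch below, the story estimates and the inflation factor are made up for illustration: the team completes exactly the same stories in both cases, but the reported velocity doubles.

```python
def velocity(estimates):
    """Velocity for a sprint: the sum of point estimates of completed stories."""
    return sum(estimates)


def inflate(estimates, factor):
    """Re-estimate the same stories with every estimate scaled up."""
    return [points * factor for points in estimates]


# Hypothetical sprint: three completed stories with honest estimates.
honest = [3, 5, 8]
# The same three stories, re-pointed after velocity became a target.
inflated = inflate(honest, 2)

print("stories completed:", len(honest), "vs", len(inflated))
print("reported velocity:", velocity(honest), "vs", velocity(inflated))
```

The number of stories delivered is identical, so the higher velocity reflects nothing but the change in estimates.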
Code impact. You may not have heard of this metric unless you’ve explored one of the many engineering metrics vendors out there. Here’s a definition of this metric from one such vendor: "Impact attempts to answer the question, 'Roughly how much cognitive load did the engineer carry when implementing these changes?' through the severity of edits to the codebase, as compared to repository history."
Or from another: "The magnitude of changes to the codebase over a period of time based on whether code is added or changed, the proximity of the edits to one another, and the specific nature of the change on a line by line basis."
I’m hesitant to call this metric snake oil because I’m sure it is well-intentioned. But anyone who’s ever written a line of code knows that there’s no legitimacy to this measure.