Abstract
Rhythmic experimentation defines Scrum. A good Sprint experiment seeks to improve important metrics, such as increasing velocity or decreasing bug count. Some managers claim consistent velocity is important. Percent velocity deviation, σ(V)/E(V), is a reasonable metric to compare teams’ consistency. However, software companies usually look for innovation and profitability. Staid, old companies recreating boring stuff can get very consistent velocity. When innovating teams are asked for consistent velocity, some may game the metrics by padding, or stop innovating. Therefore, use consistency only as one of a collection of metrics that describe a team’s behavior, not as a target.
Client: Our company is looking into how to best represent the velocity variance per collection of Scrum Teams. Do you think that relative standard deviation is the best representation, or would you recommend another way to show velocity variance?
Dan Greening: If you are interested in velocity variance, you’ll want to use the percent standard deviation of the velocity, which is the same thing as the relative standard deviation you mention. Standard deviation is the square root of the variance, and percent standard deviation is (standard deviation) / (mean). Think of standard deviation as roughly the typical difference between a sprint’s velocity and the mean sprint velocity. Since story point scales vary per team, you want this standard deviation expressed as a percentage of the team’s mean. In this way, you can compare velocity deviation between teams.
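To make the formula concrete, here is a minimal Python sketch of percent velocity deviation. The function name and the sample velocities are hypothetical, and the choice of population standard deviation (`pstdev`) rather than sample standard deviation (`stdev`) is an assumption; either works as long as you use the same one for every team.

```python
from statistics import mean, pstdev


def percent_velocity_deviation(velocities):
    """Return sigma(V)/E(V) as a percentage for a list of sprint velocities."""
    avg = mean(velocities)
    if avg == 0:
        raise ValueError("mean velocity is zero; percent deviation is undefined")
    # Assumption: population standard deviation over the sprints supplied.
    return 100 * pstdev(velocities) / avg


# Hypothetical data: team B's story point scale is 10x team A's.
team_a = [20, 24, 18, 22, 26]
team_b = [200, 240, 180, 220, 260]

print(round(percent_velocity_deviation(team_a), 1))  # 12.9
print(round(percent_velocity_deviation(team_b), 1))  # 12.9
```

Both teams get the same figure even though their raw standard deviations differ by a factor of ten, which is why dividing by the mean makes the metric comparable across story point scales.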
However, when managers have asked teams to “reduce the velocity standard deviation,” the outcome has been bad. Velocity deviation measures estimation accuracy, assuming the team doesn’t game the metric (which is easy to do, even innocently). There are many reasons that estimation can be inaccurate. One, of course, is an inexperienced or undisciplined team. However, another reason for the inaccuracy might be a team that is trying a new process such as pair programming. Or the team may be innovating. Or the team might be learning new skills. These “good” things, which all help our client succeed in the market, will decrease estimation accuracy and increase velocity deviation.
So, like virtually all metrics, we don’t want to give teams goals to “reduce the velocity deviation”, because it becomes a perverse incentive for behaviors we don’t want. Here’s what we’ve seen when “consistent velocity” is rewarded: sandbagging (padding the estimate), lack of innovation, no process changes, etc. I’ve seen teams that actually thought that their managers wanted them to pad estimates so they could achieve their commitments, and so they did.
Velocity deviation is valuable as an indicator, and along with a collection of other metrics it can help spark a conversation with team members. It can be helpful in finding teams that need help. Incentivizing it is the part I’m warning against. Even the common perception at some of our clients that “consistent velocity is good” seems like the wrong thing to worry about.
Why would we even care about estimation accuracy? Well, the main reason might be that we want better forecasting. However, most clients have a much bigger problem. Only roughly half the Scrum teams I’ve coached have a Backlog Forecast Horizon larger than zero. In other words, for the rest, so what if our estimation accuracy is good? We don’t have an estimated backlog to which we could apply the team’s average velocity. We have nothing to apply this consistent velocity to, other than helping the team decide how much to take into its Sprint. But if that’s the only thing left, I’d say, “Hey team, how about just focusing on the top stuff on the backlog?”
Backlog Forecast Horizon is an important metric. As product owners seek to increase their Backlog Forecast Horizon, their teams get a better vision of what they are building. They may come up with seemingly “easy” ways to increase the forecast horizon, such as getting the team to estimate big chunky stories. The funny thing is that this form of “gaming” actually improves the team’s longer-term perspective. More power to them!
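This post does not spell out how Backlog Forecast Horizon is computed. One plausible reading, assumed here purely for illustration, is the number of sprints of estimated backlog the team could forecast at its mean velocity. The function name and all numbers below are hypothetical.

```python
from statistics import mean


def backlog_forecast_horizon(estimated_story_points, recent_velocities):
    """Sprints of estimated backlog at the team's mean velocity.

    Assumption: horizon = (sum of estimated backlog points) / (mean velocity).
    Unestimated backlog items are simply left out of the list.
    """
    avg_velocity = mean(recent_velocities)
    if avg_velocity <= 0:
        raise ValueError("mean velocity must be positive")
    return sum(estimated_story_points) / avg_velocity


velocities = [20, 24, 18, 22, 26]      # hypothetical, mean is 22

no_estimates = []                      # nothing estimated beyond the Sprint
detailed_only = [5, 8, 3, 6]           # 22 points of small stories
with_chunky = detailed_only + [40, 48] # plus two big chunky stories

print(backlog_forecast_horizon(no_estimates, velocities))   # 0.0
print(backlog_forecast_horizon(detailed_only, velocities))  # 1.0
print(backlog_forecast_horizon(with_chunky, velocities))    # 5.0
```

Under this reading, a team with no estimated backlog has a horizon of zero no matter how consistent its velocity is, and estimating a couple of big chunky stories is enough to push the horizon out several sprints.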
I hope this sheds some light on the velocity deviation metric, the forecast horizon metric, and their nuances.
If you have questions, send us an email for private reply: info@senexrex.com.
One reply on “Velocity Variance: Should we seek consistent velocity?”
Some follow-up comments, based on feedback in a LinkedIn group:
Managers, in my experience, and even some Certified Scrum Trainers, have advocated that teams seek “consistent or predictable velocity”. In the article, I argue that seeking predictable velocity can cause your team to stop innovating or to game the metrics, and that consistency isn’t a good thing to advocate. However, the metric might have value in comparing teams, motivating deeper investigation, and learning about them.
There’s a metric for inconsistent velocity, and that’s σ(V)/E(V), the percent standard deviation of the velocity. This can be used to compare teams, even those with different story point scales.
Why would you want to compare teams? Well, one reason to compare teams is because you want to judge, punish or reward them. OK, we all agree that’s bad: it demotivates and stifles innovation. However, the other reason you want metrics is to study teams.
Scrum is an experimental framework for studying and improving team performance. Those Sprints are experiments. We teach teams to measure velocity with relative estimation, in large part so they can see the effect of process changes on their production rate, and adopt the ones that work.
When considering Scrum in the large, other metrics are interesting for larger experiments. With lots of Scrum teams, research comparing teams is possible and very useful, but useful experiments have objective, portable metrics. I think σ(V)/E(V) is NOT likely to be a metric correlated with productivity, but I bet we’ll see interesting differences between teams that have high σ(V)/E(V) and those that have low.
The low ones, in my experience, are often internally dysfunctional. Imagine a team with σ(V)/E(V)=0; ridiculous right? Like a fraudulent scientist forging the unruly data to fit the desired curve, someone on the team, or the team as a whole, is forcing production rate to fit the mean velocity, E(V), by padding, sandbagging or outright falsification. And it happens, way too much.
The high ones usually have major external randomizing impediments or are exploring strange new worlds (innovating). You can even get inconsistent velocity by regularly increasing your velocity, and what manager really wants to stop that? Regardless, if you’re an executive or coach in a large company, and one of your teams has a wildly varying velocity, you might want to visit them. “Can I help with an impediment?” or “What crazy thing are you people doing over here (that I might learn from)?”
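The contrast between the low and high cases can be sketched in a few lines of Python. The velocities are made up, the function name is hypothetical, and population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def percent_velocity_deviation(velocities):
    """sigma(V)/E(V) as a percentage (population standard deviation assumed)."""
    return 100 * pstdev(velocities) / mean(velocities)


# Hypothetical teams with the same mean velocity of 25.
flat_line = [25, 25, 25, 25, 25, 25]   # suspiciously perfect
improving = [20, 22, 24, 26, 28, 30]   # gains 2 points every Sprint

print(percent_velocity_deviation(flat_line))            # 0.0
print(round(percent_velocity_deviation(improving), 1))  # 13.7
```

The steadily improving team scores worse on “consistency” than the flat-line team, even though it is the one a manager would presumably rather have.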
Regardless, I can’t imagine experienced coaches have never heard a manager or another coach say “We have to get this velocity to be more predictable!” This is equivalent to saying, “We have to get σ(V)/E(V) closer to zero!” People are talking about it. We should have a thoughtful answer when it comes up, not just saying “Don’t measure that!”
Let me here express my appreciation to Steve Gordon, Philip Ledgerwood, Gary Bamberger, Glen Wang, Alan Dayley and Esther Derby for their thoughtful feedback.