"There were only an estimated two to five thousand humans alive in Africa sixty thousand years ago. We were literally a species on the brink of extinction! And some scientists believe (from studies of carbon-dated cave art, archaeological sites, and human skeletons) that the group that crossed the Red Sea to begin the great migration was a mere one hundred fifty. Only the most innovative survived, carrying with them problem-solving traits that would eventually give rise to our incredible imagination and creativity." --Marc Sisson
Imagination fed into our uniquely human ability to cooperate flexibly in large numbers. So fast forward to today. Our most valuable and exciting work, particularly in the context of innovation, still relies on our ability to imagine what needs to be done, start, and continuously course correct.
In this case, we're using imagination to structure and agree how work needs to happen, and to map that to a subjective estimate of effort.
First, we imagine what needs to be built, why it needs to be built, and how it needs to work. Then, we subdivide the big overall vision into lots of little pieces, and divvy it up among a group of people who go execute on the vision. Before they do that, though, these people imagine, analyze, and discuss doing the work involved on each specific piece of the overall vision. They all need to agree how much effort it will take to complete that task. If there are differences of opinion, they should be ironed out up front.
If done successfully, this generate full buy-in and alignment from everyone involved. Even if the end product isn't a physical thing, this approach works. The benefits of trusting people and harnessing all their energy and imagination far outweigh the inherent risks. It's already done by tens of thousands of teams around the world already in various digital industries including software.
Relative Cognitive Effort is what we're imagining.
The key number that is used for tracking this is a measure of how much "cognitive effort" was completed over a predetermined unit of time. Agile and scrum use the concept of a story instead of tasks, in order to help describe complex needs in a narrative form if needed. Usually this includes elements such as: user problem, what's missing, acceptance criteria for the solution required. Therefore, the unit of measure for the cognitive effort expected to complete a story is called a story point.
Each story is sized in terms of story points. Story points themselves are quite abstract. They relate to the relative complexity of each item. If this task more complex than that one, then it should have more story points. Story points primarily refer to how difficult the team expects a specific task to be.
For example, it's more precise to track the number of story points completed in the last 2 weeks, than just the raw number of stories completed. As stories can be of different sizes.
Now it's time for a few disclaimers...
1. Story points are not measures of developer time required.
Cognitive complexity isn't necessarily the same thing as how time consuming it will be to achieve. For example, a complex story may require a lot of thought to get right, but once you figure out how to do it, it can be a few minor changes in the codebase. Or it could be a relatively simple change that need to be done many times over, which in and of itself increases potential complexity and risk of side effects.
The main purpose of story points is to help communicate--up front--how much effort a given task will require. To have meaning for the team, it should be generated by the team who will actually be doing the work. These estimates can then be used by non-technical decisionmakers to prioritize, order, and plan work accordingly. They can then take into account the amount of effort and trade it off with expected business value for each particular story.
2. Story points related to time are a lagging indicator for team performance.
The key, though, is that story points shouldn't be derived as 1 story point = half day, so this item will be 3 story points because I expect it will take 1.5 days. This type of analysis can only be done after the fact, and related to entire timeboxes like a 2 week period. Instead, the team should be comparing the story they are estimating to other stories already estimated on the backlog:
- Do you think it will be bigger than story X123? Or smaller?
- What about X124?
The team needs to get together regularly and estimate the relative size of each story, compared to every other story.
This generates a lot of discussion. It takes time. And therefore estimation itself has a very real cost. Some technical people view it as a distraction from "doing the work". Rightly so.
3. Story Points assume you fix all bugs & address problems as you discover them.
Only new functionality has a story point value associated. This means that you are incentivized to creating new functionality. While discovering and fixing problems takes up time, it doesn't contribute to the final feature set upon release. Or the value a user will get from the product.
Anything that is a bug or a problem with existing code needs to be logged and addressed as soon as possible, ideally before any new functionality is started, to be certain that anything "done" (where the story points have been credited) is actually done. If you don't do this, then you will have a lot of story points completed, but you won't be able to release the product because of the amount of bugs you know about. What's worse, bugfixing can drag on and on for months, if you delay this until the end. It's highly unpredictable how long it will take a team to fix all bugs, as each bug can take a few minutes or a few weeks. If you fix bugs immediately, you have a much higher chance of fixing them quickly, as the work is still fresh in the team's collective memory.
Fixing bugs as soon as they're discovered is a pretty high bar in terms of team discipline. And a lot will depend on the organizational context where the work happens. Is it really ok to invest 40% more time to deliver stories with all unit testing embedded, and deliver less features that we're more confident in? Or is the release date more important?
4. One team's trash is another team's treasure.
Finally, it's worth noting that story points themselves will always be team-specific. In other words, a "3" in one team won't necessarily be equal to a "3" in another team. Each team have their own relative strengths, levels of experience with different technologies, and levels of understanding how to approach a particular technical problem.
Moreover, there are lots of factors which can affect both estimates and comparability. It wouldn't make sense to compare the story point estimates of a team working on an established legacy code base with a team who is building an initial prototype for a totally new product. Those have very different technical ramifications and "cognitive loads".
Conversely, you can compare story points over time within one team, as it was the same team who provided the estimates. So you can reason about how long it took to deliver a 3 story point story now vs. six months ago--by the same team only.
Wait, can't Story Point estimation be gamed?
As a system, story points gamify completing the work. Keen observers sarcastically claim they will just do a task to help the team "score a few points".
But then again, that's the idea behind the approach of measuring story points. To draw everyone's attention to what matters the most: fully specifying, developing, and testing new features as an interdependent delivery team.
Moreover, all of this discussion focuses on capacity and allocation. The key measure of progress (in an agile context) is working software. Or new product features in a non-software context. Not story points completed. If you start to make goals using story points, for example for velocity, you introduce trade-offs usually around quality:
- Should we make it faster or should we make it better?
- Do we have enough time for Refactoring?
- Why not accumulate some Technical Debt to increase our Velocity?
Story points completed are only a proxy for completed features. They come in handy in scenarios where you don't have a clear user interface to see a features in action. For example, on an infrastructure project with a lot of back-end services, you might not be able to demo much until you have the core
Example: Adding technical scope to an already tight schedule
On a client project, I had a really good architect propose and start a major restructuring of the code base. It was kicked off by his frustration with trying to introduce a small change. A fellow developer tried to add something that should have taken an hour or two, but it took a few days. The architect decided the structure of the project was at fault.
Yet, this refactoring started to go into a few weeks. The majority of the team were blocked on anything important. He was working on an important part of the final deliverable. While the work he was doing was necessary, it would have been good to figure out any elapsed time impact on the overall deliverable, so that it could be coordinated with everyone interested.
As the sprint ended, I proposed we define the work explicitly on the backlog, and estimate it as a team. This way, the architectural work would be a bit more "on the radar". There were around nine tasks left. The team said the remaining work was comparable across all of them, and collectively decided it was about a 5 story point size per item. So we had added roughly 45 story points of effort.
Knowing that the team was averaging around 20 story points per elapsed week, it became clear we had suddenly added 2 weeks worth of work--without explicitly acknowledging what this might do to the final delivery date. While the architect was quite productive, and claimed he could do it faster, there was still an opportunity cost. He wasn't working on something else that was important.
In this case, story points helped come up with a realistic impact to schedule that senior stakeholders and sponsors needed to know about. The impact on the initial launch date was material. So the estimation with story points helped provide an "elapsed time" estimate of an otherwise invisible change.
While not perfect, Story Points are primarily a tool for capacity planning, not whip cracking.
So to step back, you can see that story points are a useful abstraction which gets at the core of what everyone cares about: new product features. While subjective, for the same task--as long as it's well defined--most of the members of a team usually come up with pretty close estimates. It's kind of surprising at first, but eventually you get used it. And you look forward to differences, because that means there may be something that needs to be discussed or agreed first. That is the primary purpose of story points. As a side effect, it can help get to grips with a much larger backlog. And plan roughly how many teams need to be involved.
However, this approach only works within relatively strict parameters and disclaimers if you want the numbers to mean anything. It is at a level of resolution that proxies progress, but makes external micromanagement difficult. If you want the team to self-manage and self-organize, this is a feature not a bug of the whole approach. Ultimately the core goal is correctly functioning new features. Best not to lose sight of that.