Agile Backlog Estimation and Sizing

An estimate by definition is: “an approximate calculation or judgment of the value, number, quantity, or extent of something.”

It is fundamentally a guess, a probability distribution. When estimating an Agile backlog, no promise is made. No warranty should be expressed or implied. (If you want a guarantee, buy a toaster. ;)

Missing an estimate is not in any way dishonorable. That is the reason we call it an “estimate” – because we don’t know everything we need to know, and honestly it’s hard to be accurate. (Missing commitments is a topic for another day…)

While estimating a backlog is optional in Kanban, it is a prescription of Scrum and Scrum teams “estimate” in a couple of discrete ways:

They estimate (commonly called “sizing”) their backlog of epics and stories – usually during grooming. Some teams use Story Points (discussed in more detail below), other teams simply go by count of Story cards. Either approach can be used for velocity-based commitment.
During iteration planning (or optionally during grooming, which is not “just in time” and may be wasteful), teams estimate, in Dev Hours, the tasks/subtasks needed to meet the acceptance tests of a Story. These estimates can then be used for capacity-based sprint commitment (but not for release planning).
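As a rough illustration of capacity-based commitment, here is a minimal sketch. The function name, the story names, the task hours, and the 45-hour capacity are all made up for the example; a real team would plug in its own numbers and rules.

```python
def plan_sprint(stories, capacity_hours):
    """Commit to stories in priority order until the next one no longer fits.

    stories: list of (name, [task hours]) pairs, highest priority first.
    capacity_hours: Dev Hours the team expects to have this sprint (assumed input).
    """
    committed, used = [], 0.0
    for name, task_hours in stories:
        needed = sum(task_hours)
        if used + needed > capacity_hours:
            break  # stop at the first story that would exceed capacity
        committed.append(name)
        used += needed
    return committed, used


# Hypothetical backlog slice with task-level Dev Hour estimates.
stories = [
    ("Login", [6, 10, 4]),
    ("Search", [12, 8]),
    ("Export", [16, 10]),
]
committed, used = plan_sprint(stories, capacity_hours=45)
print(committed, used)
```

Stopping at the first story that does not fit (rather than skipping ahead to a smaller one) is a design choice here, made to respect the backlog's priority order.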

About Story Points

Story Points are a unit-less, relative measure of size. Agile backlog estimation using Story Points (or card count), combined with a team’s historical velocity (remember: Estimate size, derive duration) allows for some sanity and confidence in answering the questions:

How much work can the team likely commit to in a sprint?
How long will “this part” of the backlog likely take to deliver?
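"Estimate size, derive duration" can be sketched in a few lines. This is only an illustration: the function name, the backlog size, and the velocity history are invented, and using the best, average, and worst historical sprint as the range is one possible convention among many.

```python
import math


def forecast_sprints(backlog_points, velocities):
    """Return (optimistic, likely, pessimistic) sprint counts.

    backlog_points: total size of "this part" of the backlog
                    (Story Points, or a plain card count).
    velocities: points (or cards) completed in recent sprints.
    """
    best, worst = max(velocities), min(velocities)
    average = sum(velocities) / len(velocities)
    return (
        math.ceil(backlog_points / best),
        math.ceil(backlog_points / average),
        math.ceil(backlog_points / worst),
    )


# Hypothetical history: five sprints, 120 points left to deliver.
forecast = forecast_sprints(120, [18, 22, 20, 25, 15])
print(forecast)  # (5, 6, 8)
```

Reporting a range rather than a single number keeps the answer honest about being an estimate.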

There are two fundamentally different approaches in regard to Story Points and sizing:

Story Points can be used to represent Ideal Effort (a measure of value-producing effort, or benefit) – hence, benchmark stories are fixed, and intangible classes of work (e.g., bugs, tech debt) typically would get no points.
Story Sizing / Story Points can be used to represent Actual Effort (a measure of cost) – consequently, sizes can change over time, and intangibles would probably get points.
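The practical difference between the two approaches shows up in how velocity is counted. The sketch below is illustrative only; the item kinds, point values, and function name are assumptions made for the example.

```python
def velocity(done_items, convention):
    """Sum the points for one sprint under a given sizing convention.

    done_items: list of (kind, points), where kind is "story", "bug",
                or "tech_debt" (labels made up for this sketch).
    convention: "ideal" counts only value-producing stories;
                "actual" counts every kind of work.
    """
    if convention == "ideal":
        return sum(points for kind, points in done_items if kind == "story")
    if convention == "actual":
        return sum(points for _, points in done_items)
    raise ValueError(f"unknown convention: {convention!r}")


# Hypothetical sprint: two stories, one bug fix, one tech-debt item.
sprint = [("story", 5), ("story", 3), ("bug", 2), ("tech_debt", 3)]
print(velocity(sprint, "ideal"))   # 8
print(velocity(sprint, "actual"))  # 13
```

Under Ideal Effort the bug and tech-debt work lowers velocity (it consumed time but produced no points); under Actual Effort it is simply part of the total.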

Which is better? It depends. The one you use should be chosen based on what you hope to learn by measuring it and on your expectations about the future. If you aren’t sure, I highly recommend that your team start with Ideal Effort.

What Exactly is “Ideal Effort?”

From chapter 3.7, “Velocity and Sizing,” of Exploring Scrum: The Fundamentals, Second Edition, by Dan Rawsthorne with Doug Shimp:

Ideal Effort is the effort it would take to develop the Story (meet both the Acceptance and Doneness Criteria) if everything were as it should be.
What does ‘as it should be’ mean?
Your Team has the people it needs, and they are all top-notch; the Team has the domain knowledge, the skills, and the development environment it needs to do its job – there are no impediments due to a lack of Team Ability.
Your Code is Clean, protected by both Unit and Functional Tests, and the Technical Documentation is both minimal and sufficient – there is no Technical Debt.
The Organization provides a safe, learning environment; Team Members are allowed to focus on their Team’s work and are not sidetracked by excessive meetings and other distractions – there is no Organizational Noise.
There are Subject Matter Experts available who have the knowledge or expertise you need, when you need it – there are no impediments based on SME Availability.

Agile Backlog Estimation – A Game the Whole Team Can (and Should) Play

Backlog sizing activities should include the entire Scrum team. The product owner participates in sizing of User Stories to clarify requirements, user stories, stakeholder expectations, etc. but does not “size” a story. Everyone else on the team participates in the actual sizing.

Stakeholders, Business Owners, and Gold Owners, on the other hand, should not be included in the agile backlog estimation party. All their input should have been obtained previously – at grooming, when a card is confirmed to meet the team’s Definition of Ready. If significant questions come up during sizing, then the PO should follow up with those stakeholders separately.

Why include the full scrum team?

At the time of sizing, we normally don’t know exactly who will be implementing which parts of which stories. Stories normally involve several people and different types of expertise (user interface design, coding, testing, etc). In order to provide a meaningful, relative size, a team member needs some kind of understanding of what the story is about. By asking everybody to participate we make sure that each team member understands what each item is about. This increases the likelihood that team members will help each other out during the sprint. It’s not the estimate, it’s the estimating…

When asking the full team to participate, we often discover discrepancies where two different team members have wildly different sizes for the same story. That kind of thing is better to discover and discuss sooner rather than later. This is part of the power of group dialog.

Sizing Methods for Story Pointing

Method 1: Planning Poker

Planning Poker was first described by James Grenning in 2002 and later popularized by Mike Cohn. It is based on an estimation technique known as Wideband Delphi, itself a variant of the Delphi method developed at the RAND Corporation in the 1950s. (I assume by Delphi they’re referring to the ancient, gas-huffing priestesses at Delphi.)

Here’s how it works:

Each team member gets a deck of 13 cards.

Whenever a story is to be sized, each team member selects a card that represents their estimate (in story points) and places it face-down on the table.

When all team members are done, the cards on the table are revealed simultaneously. (That way each team member is forced to think independently rather than lean on somebody else’s estimate.)

When sizes differ significantly, the high and low estimators explain their reasoning. It’s important that this does not come across as attacking those estimators. Instead, you want to learn what they were thinking about. The team then discusses the differences and tries to build a common picture of what work is involved in the story. Afterwards, the team estimates again. This loop is repeated until the estimates converge, i.e. all estimates are approximately the same for that story.
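The reveal-and-discuss loop can be sketched as a small check on each round of votes. Two things here are assumptions, not part of the description above: the numeric card values (a commonly used Planning Poker deck) and the rule that votes within one card of each other count as "approximately the same". The names are hypothetical.

```python
# Assumed numeric cards of a common Planning Poker deck.
DECK = (0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100)


def review_round(votes, deck=DECK):
    """Check one round of revealed cards.

    votes: dict mapping team member name -> card value.
    Returns (converged, low_estimators, high_estimators).
    """
    positions = {name: deck.index(card) for name, card in votes.items()}
    lowest, highest = min(positions.values()), max(positions.values())
    # Assumption: adjacent cards are close enough to stop discussing.
    if highest - lowest <= 1:
        return True, [], []
    low = sorted(n for n, p in positions.items() if p == lowest)
    high = sorted(n for n, p in positions.items() if p == highest)
    return False, low, high


# Round 1: Ana and Chi are asked to explain their reasoning.
round1 = review_round({"Ana": 3, "Ben": 5, "Chi": 13, "Dee": 5})
# Round 2, after discussion: the estimates have converged.
round2 = review_round({"Ana": 5, "Ben": 5, "Chi": 8, "Dee": 5})
print(round1)
print(round2)
```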

It is important to keep in mind that we are sizing the total amount of work involved in getting the story to “dev complete.” By “dev complete” we mean “DONE” – so include CRs to +2 and passing all tests. (Testing should not be decoupled from the programming task.)

Note that the number sequence on the cards is non-linear.

For example, there is nothing between 40 and 100. Why? This is to avoid a false sense of accuracy for large estimates. If a story is estimated at approximately 20 story points, it is not relevant to discuss whether it should be 20 or 18 or 21. All we know is that it is a large story and that it is hard to estimate. So 20 is our ballpark guess.

Want more detailed estimates? Split the story into smaller stories and size the smaller stories instead! And no, you can’t cheat by combining a 5 and a 2 to make a 7. You have to choose either 5 or 8, there is no 7.
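The "there is no 7" rule amounts to snapping a gut-feel number to the neighbouring cards. The sketch below assumes the numeric values of a commonly used Planning Poker deck; the function name is hypothetical.

```python
import bisect

# Assumed numeric cards of a common Planning Poker deck.
DECK = [0, 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100]


def card_choices(raw):
    """Return the card(s) a raw number has to be rounded to."""
    if raw in DECK:
        return (raw,)
    i = bisect.bisect_left(DECK, raw)  # index of first card larger than raw
    if i == 0:
        return (DECK[0],)
    if i == len(DECK):
        return (DECK[-1],)
    return (DECK[i - 1], DECK[i])


print(card_choices(7))   # (5, 8)
print(card_choices(18))  # (13, 20)
print(card_choices(60))  # (40, 100)
```

The widening gaps between the returned pairs are the point: the larger the story, the coarser the choice.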

Some Special Cards to Note

0 = “this story is already done” or “this story is pretty much nothing, just a few minutes of work”.
? = “I have absolutely no idea at all.”
Infinity = “This item is too big or too complicated to estimate”

Method 2: Triangulation

When sizing by triangulation, a team compares the story/epic they want to size with some previously sized ones. They then decide if the story/epic is about the same size, smaller or bigger than the references.

Using Triangulation with story sizing based on Actual Effort will cause fluctuation, and the need for “rebenchmarking” on a periodic basis.
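Triangulation can be thought of as narrowing down the possible sizes with each comparison against a reference story. The following is a minimal sketch; the deck values, the relation labels, and the function name are assumptions made for illustration.

```python
def triangulate(comparisons, deck=(1, 2, 3, 5, 8, 13)):
    """Return the sizes still consistent with every comparison.

    comparisons: list of (reference_points, relation), where relation says
                 how the NEW story compares to that reference:
                 "smaller", "same", or "bigger".
    """
    candidates = list(deck)
    for ref, relation in comparisons:
        if relation == "smaller":
            candidates = [c for c in candidates if c < ref]
        elif relation == "same":
            candidates = [c for c in candidates if c == ref]
        elif relation == "bigger":
            candidates = [c for c in candidates if c > ref]
        else:
            raise ValueError(f"unknown relation: {relation!r}")
    return candidates


# Bigger than a 2-point reference, smaller than an 8-point reference.
two_refs = triangulate([(2, "bigger"), (8, "smaller")])
# A third reference (5 points, "about the same") settles it.
three_refs = triangulate([(2, "bigger"), (8, "smaller"), (5, "same")])
print(two_refs, three_refs)  # [3, 5] [5]
```

An empty result would mean the comparisons contradict each other, which is a useful prompt for discussion (or a sign the references have drifted).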

Method 3: Bob Shatz’s Story Sizing/Value Game

Start with all the cards to be sized visible and benchmark stories available for reference.
First person picks a card (any card) and places it under their estimated size (based on level of effort, not time, nor complexity).
Second person picks a new card, compares it to the first card, and places it under the number they estimate it to be.
Third person and every one after that…gets a CHOICE of MOVE…place a new card down, OR move an existing card.
Game continues until all cards are sized and the team has come to a consensus, which means nobody is in violent opposition to moving forward with the relative size estimates.

Variation of #3: Order the Stories First and Then Story Point (Modified Bockman Technique)

Start with one sticky per story and place these on the side of the whiteboard.
For each story, the team discusses what the story is about and what it might involve. This should avoid deep diving into solutions though.
Once everyone has had enough of that: Someone picks a story, any story, and puts it in the middle of a blank area on the white board.
The next person picks a different story and decides if it is bigger or smaller than the story already on the board. If it is bigger, they put the sticky to the right of what is already there. If it is smaller, they put it to the left. “Exactly the same size” is OK, but should be avoided if possible. (I like to do this step silently.)
The next person can move an already placed story, pass, or place a new story relative to the ones already placed. (A bit of rearranging happens as needed to make room)
Iterate over all the stories, adding them in the relative size order.
Everyone gets a chance to go over the stories for a quick check that they are in the correct relative order. Some limited discussion happens.

At this point we have relative sizing. Now, we need to map those relative sizes to story points. This is done by a ‘shout out’ method.

Pick a new color of sticky note if you can.
Write 1 on it.
Hold it at the left of the list and slowly move it right until someone shouts out ‘too big’ or ‘stop’ or whatever.
At this point the team can have a chat about the exact point at which the stories get bigger than one point.
Stick the 1 point sticky above the largest story which was agreed to be 1 point.
Now move to 3 points and start again where you left off, moving right until someone shouts “stop.”
Repeat with story points 5, 8 and 13 until all the stories are sized.
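The ‘shout out’ mapping can be sketched as assigning each point value to a contiguous run of the ordered stories. The story names, the cutoffs, and the function name below are invented for illustration.

```python
def assign_points(ordered_stories, cutoffs):
    """Map relatively ordered stories to story points.

    ordered_stories: story names, smallest to largest (left to right).
    cutoffs: list of (points, last_story) in increasing point order, where
             last_story is the largest story the team agreed has that size.
    Stories to the right of the final cutoff are left unsized.
    """
    sizes = {}
    start = 0
    for points, last_story in cutoffs:
        end = ordered_stories.index(last_story) + 1
        if end < start:
            raise ValueError("cutoffs must move left to right")
        for story in ordered_stories[start:end]:
            sizes[story] = points
        start = end
    return sizes


# Hypothetical board, ordered smallest to largest.
board = ["A", "B", "C", "D", "E", "F"]
sizes = assign_points(board, [(1, "B"), (3, "C"), (5, "E"), (8, "F")])
print(sizes)
```

Because each cutoff starts where the previous one stopped, a story can never receive fewer points than one to its left, which is what the ordering step is meant to guarantee.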

To avoid story point inflation or drift, it is essential for the team to have benchmark stories available for reference.

(Note: with this variation, I’ve been able to facilitate the sizing of about 50 stories in under one hour.)

The Right Amount of Discussion

Regardless of method, some amount of discussion is necessary and appropriate when sizing a backlog. However, spending too much time on discussions is often wasted effort. The point of sizing is not absolute precision but reasonableness.

Here’s an effective way to encourage some amount of discussion but make sure that it doesn’t go on too long.

Use a two-minute timer, and place it in the middle of the table where sizing is taking place.
Anyone in the meeting can turn the timer over at any time. When the sand runs out (in two minutes), if agreement hasn’t been reached, the team decides whether the discussion should continue.
But someone can immediately turn the timer over, again limiting the discussion to a second two minutes.

The timer rarely needs to be turned over more than twice. Over time this helps teams learn to size more rapidly as well as learn that at some point additional discussion does not lead to improved accuracy.

Avoid wasting time decomposing work “too much” in an effort to increase accuracy.
Be consistent in your sizing – establish a set of baselines / benchmarks. (Being consistent is more important than being “accurate.”)
Size level of relative (actual/ideal) effort, but not duration, and not complexity…

Other Ways

There are probably as many ways to approach agile backlog estimation as there are colors in the rainbow. As mentioned above, some teams don’t track velocity via story points at all – they track based simply on the number of story cards themselves, having sliced the cards down into “all the same size.”

And there are many who swear by no estimates at all….

As always, if what you are doing works, keep doing it. On the other hand, if your team is not hitting its sprint goals on a consistent basis, try something else…
