Concept: My posts on AgileES have thus far focused on “what” and “why”. That is, I’ve described the concepts behind AgileES and the benefits of using ES on Agile projects. It’s now time to shift the focus to “how to”. The next posts describe how to apply ES to Agile projects. There are several steps. Some of the steps, especially the initial ones, are controversial. In explaining each step, I will identify and address the issues and then describe the actions to be taken.
Practice: Agile projects commonly use relative estimating techniques such as T-shirt sizing, Planning Poker, and Fibonacci bucketing. Relative estimates are viewed as replacements for Work Hour estimates. They are relatively painless to develop, and, most important, they provide the information needed to plan the next Sprint. Right?
1a. Estimate velocity: the limits of relative estimates.
Relative estimates use comparisons to identify the size of Product Backlog Items. For example, Item A is larger than Item B which is larger than Item C. But, what dimension is being sized? That is not so clear. Some say it is the amount of complexity; others say it is cost; still others say it is uncertainty.
In the end, relative estimates are viewed as replacements for absolute Work Hour estimates. So, complexity, cost, and uncertainty are only considerations used to size work effort.
Relative estimates use Release Points (aka, Story Points) to differentiate sizes. One common way of doing so is to assign each Item a number of Release Points drawn from a geometric series or from the Fibonacci sequence. That gives the appearance of quantifying the differences. For instance, an Item assigned an 8 is larger than one assigned a 2.
Many proponents of Agile do not stop there. They say that relative estimates also indicate how much larger one Item is than another. For instance, an Item assigned an 8 is four times larger than one assigned a 2.
We have found that teams often agree on the ordering of Items. But, we have observed frequent disagreement over how much difference there is between Items. Is an Item labeled with a Fibonacci number of “21” really 7 times larger than one labeled as “3”? Is an Item labeled with a geometric series number “3” really one-third the effort of a “9”?
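The arithmetic behind those questions can be sketched in a few lines of Python. This is only an illustration of what the labels imply if they are read as cardinal quantities; the function name implied_ratio is hypothetical and is not part of AgileES.

```python
from fractions import Fraction


def implied_ratio(label_a: int, label_b: int) -> Fraction:
    """Ratio of sizes implied if point labels are treated as cardinal numbers."""
    return Fraction(label_a, label_b)


# Fibonacci labels: a "21" compared with a "3"
print(implied_ratio(21, 3))  # 7

# Geometric-series labels: a "3" compared with a "9"
print(implied_ratio(3, 9))   # 1/3
```

The code computes the ratios without difficulty. Whether the team actually believes them is the open question.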
The Likert Scale
The disagreements reflect divergent beliefs about size. The situation is similar to one that occurs in social science and marketing research. There, Likert scales measure psychological states such as levels of satisfaction (e.g., with a product or service).* [1] The comparative levels are often associated with numbers from 1 to 5, as depicted in Figure 1.
Figure 1
There is a long-standing debate on whether the numbers have an objective numerical basis.* [2] There are also concerns that the intervals between levels are not equal.* [3] The upshot is that the scale represents subjective states and that such states cannot be objectively measured. So, we cannot be sure that a difference between assigned numbers corresponds to an equal difference in the attribute they represent.* [4]
Similarly, Agile team members use the number of Points to express beliefs about the relative size of work effort. But, what one person believes is twice as much effort may differ from what another person believes is twice the effort. So, although “4” is twice the size of “2”, we cannot be sure that everyone on the team means the same thing when they assign “4” to an Item. The most that we can be sure of is that the Item is more or less twice the size of a “2”.
Figure 2
Why “fuzziness” matters.
Does the “fuzziness” matter? It matters because the estimates are used in Sprint planning. Sprint planning, in turn, is important for meeting time and budget constraints. Items are selected to fit into a Sprint based in part on the expected velocity of the team. Without that guide, the Sprint goal might be set too high or too low. Either way, project commitments would be undermined.
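To make the dependency concrete, here is a minimal sketch of selecting Items against an expected velocity. It assumes a simple rule (walk the backlog in priority order and take every Item that still fits), and the backlog, point values, and function name select_for_sprint are all hypothetical. Real Sprint planning involves more judgment than this.

```python
def select_for_sprint(backlog, expected_velocity):
    """Take Items in priority order as long as they fit the remaining capacity.

    backlog: list of (item_name, points) pairs, highest priority first.
    Returns the selected item names and the unused capacity.
    """
    selected = []
    remaining = expected_velocity
    for name, points in backlog:
        if points <= remaining:
            selected.append(name)
            remaining -= points
    return selected, remaining


# Hypothetical backlog and velocity, for illustration only
backlog = [("Item A", 8), ("Item B", 5), ("Item C", 13), ("Item D", 3)]
selected, slack = select_for_sprint(backlog, expected_velocity=20)
print(selected, slack)  # ['Item A', 'Item B', 'Item D'] 4
```

The selection is only as good as the numbers fed into it. If an “8” might really be a 6 or a 10, the Sprint may be overfilled or underfilled without anyone noticing until it is under way.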
For selecting Items of the right size, relative estimates are inadequate. Using fuzzy estimates is like driving through a town that has posted its speed limit as “25ish”. If you are not worried about keeping to a schedule—just drive at 15, and you should be OK. Or, if you do not care about money, drive as fast as you want, but be prepared to pay a fine.
If time and money matter, you need to know how fast you can go without breaking the limit but still getting through as quickly as possible. That is what a cardinal estimate tells you. It goes beyond the subjective state to something that can be objectively measured. It provides a clear baseline for assessing performance on past Sprints and planning future ones.
For AgileES, the cardinal estimate is based on underlying data about effort. The underlying data is in Work Breakdown Structures (WBSs). The WBSs represent the intermediate deliverables and associated tasks that are required to produce each Product Backlog Item. Work Hour estimates are an essential part of the WBSs and are used to represent the number of Release Points for an Item.
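One way to picture this is a small data structure in which Work Hours roll up from tasks to deliverables to the Item. The sketch below is an assumption about shape only: the class names, the sample tasks, and especially the HOURS_PER_POINT conversion are hypothetical and are not taken from AgileES.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    work_hours: float


@dataclass
class Deliverable:
    name: str
    tasks: list = field(default_factory=list)

    def work_hours(self) -> float:
        # Hours for a deliverable are the sum of its task estimates
        return sum(task.work_hours for task in self.tasks)


@dataclass
class BacklogItem:
    name: str
    deliverables: list = field(default_factory=list)

    def work_hours(self) -> float:
        # Hours for an Item are the sum over its intermediate deliverables
        return sum(d.work_hours() for d in self.deliverables)


HOURS_PER_POINT = 8.0  # hypothetical conversion, chosen only for this example


def release_points(item: BacklogItem, hours_per_point: float = HOURS_PER_POINT) -> float:
    """Express an Item's Work Hours as Release Points (assumed linear conversion)."""
    return item.work_hours() / hours_per_point


item = BacklogItem("Item A", [
    Deliverable("Design", [Task("Draft screens", 6), Task("Review", 4)]),
    Deliverable("Implementation", [Task("Code", 16), Task("Test", 6)]),
])
print(item.work_hours())     # 32
print(release_points(item))  # 4.0
```

Because every number in the structure is a count of hours, two Items with the same total mean the same thing regardless of who estimated them.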
Relative Estimates and Statistics
A benefit of using cardinal numbers is unconditional access to statistics in Sprint planning. With cardinal estimates, size is measured in equal units of Work Hours. Statistics such as mean velocity and standard deviation can be calculated without stipulation.
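As a minimal sketch, the two statistics can be computed with Python's standard statistics module. The Work Hour figures below are invented for illustration and do not come from a real project.

```python
import statistics

# Hypothetical Work Hours completed in each of five past Sprints
hours_completed_per_sprint = [310, 290, 335, 305, 320]

mean_velocity = statistics.mean(hours_completed_per_sprint)
std_dev = statistics.stdev(hours_completed_per_sprint)  # sample standard deviation

print(mean_velocity)      # 312
print(round(std_dev, 1))  # 16.8
```

Both results are in Work Hours, so they can be compared directly with the hours planned for the next Sprint.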
By contrast, the use of statistics for ordinals is conditional.* [5] For statistics to be meaningful, the difference between ordinals must be measurable and equal. With beliefs about size, neither condition is met. Statistics on previous relative estimates must therefore depend on cardinal numbers.
Whether effort estimates are ordinal or cardinal, actual effort is always in cardinal numbers. Team members are not polled after the fact for their opinion on the size of work effort expended. Instead, the hours spent are simply counted, and statistics are calculated.
The statistics can then be used to adjust current beliefs about size. If the mean velocity was greater than they believed the size of effort was going to be, team members should increase their estimate of size. If it was less, they should decrease their estimate of size.
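The adjustment can be sketched as a simple proportional correction. This is an assumption made for illustration, not the AgileES procedure: it compares total estimated hours with total actual hours on completed Items and scales the remaining estimates by that ratio. The function name calibration_factor and all figures are hypothetical.

```python
def calibration_factor(estimated_hours, actual_hours):
    """Ratio of actual to estimated effort across completed Items."""
    return sum(actual_hours) / sum(estimated_hours)


# Hypothetical completed Items: what was believed versus what was counted
estimated = [40, 24, 16]
actual = [50, 30, 20]

factor = calibration_factor(estimated, actual)
print(factor)  # 1.25

# Apply the factor to the estimates for Items not yet started
upcoming_estimates = [32, 8]
recalibrated = [hours * factor for hours in upcoming_estimates]
print(recalibrated)  # [40.0, 10.0]
```

A factor above 1 means the work took more effort than believed, so estimates go up. A factor below 1 means they come down.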
In AgileES, we start with relative estimates (beliefs) and then calibrate them using WBSs (hours). After each Sprint, we assess the results using statistics and recalibrate our estimates as required. Steps 5–7 discuss the assessment and its consequences in detail.* [6]
References and Notes
[1] Technically, there is a difference between a Likert scale and a Likert item. The item is what a survey respondent is asked to evaluate, e.g., level of satisfaction with a recently purchased product. In well-formed surveys, there are multiple items that reveal the respondent’s underlying psychological state. The Likert scale is the sum of all the items. So, in a survey with 5 items and responses ranging from 1=Very Dissatisfied to 5=Very Satisfied, the Likert scale is 5 to 25. Figure 1 therefore represents the format of a Likert-type item. For more information, see What Is a Likert Scale? and Likert Scale vs Likert Item.
[2] The debate goes back at least 70 years. See Stevens, S. S. (7 June 1946). "On the Theory of Scales of Measurement". Science. 103 (2684): 677–680 and Michell, J. (1986). "Measurement scales and statistics: a clash of paradigms". Psychological Bulletin. 100 (3): 398–407.
[3] Ibid.
[4] Ordinal vs Interval Scales.
[5] For a negative view, see Why stats do not apply to ordinals. For a positive view, see Use stats with ordinals, but carefully.
[6] AgileES use of mean and standard deviation will be covered later. See posts on AgileES Advanced Metrics.
