Concept: All of the ES metrics described thus far (the ES, SPIt, EACt, TSPI) are point values. As such, they do not fully express the natural variation in performance that occurs over time. For that, a probability distribution is needed, and for that, statistical analysis is required. This post shows why and how statistical analysis applies to Agile projects.
Practice: It sounds like an odd pairing: Agile and stats. To many, it was equally incongruous to pair baseball and stats...until Moneyball.
The book by that name (and, later, the movie) documented how the Oakland Athletics baseball team and its General Manager, Billy Beane, used an analytical, data-based, statistical approach to assembling a competitive major-league baseball club. The team's success changed the way professional baseball was viewed.
The same can be said of Agile projects and statistics. The drama inherent in an Agile project is obvious to anyone who has faced a looming “drop dead” date. What’s not so obvious is why stats matter to an Agile project.
The short answer is that there are metrics in every project, including Agile projects, that lend themselves to statistical analysis and that analysis carries value for managing the project. In this post, I will first address what metrics are in play. Then, I will show the value that the stats add to managing the project. Throughout, I will avoid the esoteric jargon common to math books and offer intuitive explanations. At the same time, I promise not to patronize you. When necessary, I will introduce math concepts to help you understand the stats. Let me know how well I have done.
Which metrics?
As explained by the New York Times: “Many real-world observations can be approximated by, and tested against, the same expected pattern: the normal distribution. In this familiar symmetric bell-shaped pattern, most observations are close to average, and there are fewer observations further from the average.” [1] (See Figure 1.)
Figure 1
So, if a metric exhibits normal distribution, it lends itself to statistical analysis. Ideally, in our case, the metric would also be one that expresses schedule performance. We could then use statistics to model its behaviour over time and, in turn, use that information to manage schedule performance.
The best candidate for analysis would seem to be the SPIt, as its specific role is to measure schedule performance efficiency. Research has shown, however, that the SPIt does not have a normal distribution. [2] Instead, its behaviour is skewed. (See Figure 2.)
Figure 2
Fortunately, there are ways to transform the SPIt into values that exhibit a normal distribution. The “natural logarithm” is one such mathematical transformation. A logarithm tells us how many times one number must be multiplied by itself to equal another number. The natural logarithm tells us how many times the number known as “e” must be multiplied by itself to equal another number.
The number “e” (aka, 2.71828…), like the number “pi”, is special. It enables us to measure the amount of change in a system against a constant rate. [3]
Because the SPIt tends to hover around 1.0, the number of “e’s” to be multiplied is roughly the same for most instances of the SPIt. But, there are generally a few cases where the SPIt varies widely from 1.0.
So, if we count the number of “e’s” required to produce each SPIt, the counts will cluster around a central value with a few scattered at greater distance from it. And, that is what a normal distribution looks like. (See Figure 3.)
Figure 3
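To make the transformation concrete, here is a minimal Python sketch. The SPI(t) values are hypothetical, chosen to mimic the skew described above: most hover near 1.0, with a couple of outliers.

```python
import math
import statistics

# Hypothetical SPI(t) values from past Sprints: most cluster near 1.0,
# with a few that stray further from it (the skew described above).
spi_t = [0.95, 1.02, 0.98, 1.05, 0.97, 1.01, 0.66, 1.45, 0.99, 1.03]

# The natural log of each SPI(t) counts the "e's" needed to produce it.
ln_spi_t = [math.log(x) for x in spi_t]

# Values near 1.0 map to values near 0, so the transformed data
# clusters around a central value, as a normal distribution does.
print(round(statistics.mean(ln_spi_t), 3))  # close to 0
```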
Given a normal distribution, we can calculate the spread of the results, also known as the standard deviation. We can then calculate how far each individual value is from the central value. Finally, we can calculate the likely high and low values at any given Sprint.
The ultimate result of the calculations is a high and low estimate for completion dates. The estimates are derived from the range of past performance and are expressed as a spread of values given a certain level of confidence.
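That chain of calculations can be sketched in Python. All numbers here are hypothetical, and the sketch uses a plain normal-theory interval; the full ES method adds finite-sample adjustments (see the Notes) that are omitted here.

```python
import math
import statistics

# Hypothetical SPI(t) history (illustrative values only).
spi_t = [0.95, 1.02, 0.98, 1.05, 0.97, 1.01, 0.92, 1.08, 0.99, 1.03]

ln_values = [math.log(x) for x in spi_t]
mean_ln = statistics.mean(ln_values)
sd_ln = statistics.stdev(ln_values)   # the spread (standard deviation)

z = 1.645                             # two-sided 90% confidence z-value
margin = z * sd_ln / math.sqrt(len(ln_values))

# Transform the interval back with exp() to get SPI(t) limits, then
# divide the planned duration by each limit to get EACt boundaries.
pd_sprints = 40                       # hypothetical planned duration
eac_t_high = pd_sprints / math.exp(mean_ln - margin)  # pessimistic finish
eac_t_low = pd_sprints / math.exp(mean_ln + margin)   # optimistic finish
```

Note how a lower schedule efficiency maps to a later (higher) EACt, which is why the low SPI(t) limit yields the high completion estimate.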
What value do stats add to Agile projects?
Basic ES metrics provide a point estimate for project duration, i.e., the EACt. The credibility of that point estimate derives from its pedigree: the amount of schedule actually earned and the level of schedule efficiency achieved thus far on the project. Although historical performance on a specific project offers prima facie support for the EACt, it does not tell a full story.
The point value is only one of many possible values. What’s missing is a fix on the underlying distribution of values in the problem space. With knowledge of the variance, we know how confident to be in the number.
ES statistical analysis provides the variance in EACt, given a particular confidence level. Our confidence in the estimate is an input into schedule decisions. Beyond that, ES statistical analysis helps us assess the project’s allowances for uncertainty.
Once the upper (EACt High) and lower (EACt Low) boundaries stabilize, we compare them with Contingency and Reserve, i.e., the allowances for uncertainty. If the boundaries exceed the uncertainty allowances, the target end date is at risk. The severity of the risk depends on the amount of deviation from the allowances.
If both Contingency and Reserve are breached, the risk of missing the target date is high, and we say the project status is "Red". Within Reserve, the committed date is still intact, but the baseline end date is at risk. The project status is deemed "Yellow". A "Green" status is awarded to projects operating within both Contingency and Reserve. (See Figure 4.)
Figure 4
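A minimal sketch of that traffic-light logic, with hypothetical thresholds (durations in Sprints):

```python
# Hypothetical schedule milestones, in Sprints.
BASELINE_END = 40     # bare-plan finish
CONTINGENCY_END = 44  # baseline + Contingency
RESERVE_END = 48      # committed date: baseline + Contingency + Reserve

def schedule_status(eac_t_high: float) -> str:
    """Compare the high-side completion estimate with the allowances."""
    if eac_t_high <= CONTINGENCY_END:
        return "Green"    # operating within both Contingency and Reserve
    if eac_t_high <= RESERVE_END:
        return "Yellow"   # Contingency breached; committed date intact
    return "Red"          # both allowances breached

print(schedule_status(50))  # Red
```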
We also observe trends in the boundaries. The boundaries inevitably converge because of the calculations that are performed. [4] But, even as the gap between them narrows, individual boundaries may rise, fall, or move in a straight line.
A rise indicates that the estimated finish is growing later. Schedule performance is lagging. The opposite trend indicates that the estimated finish date is drawing closer. Schedule performance is improving over time.
The example below (from Lipke (2010), p 22) documents a case in which the lower boundary rose over time while the upper boundary remained relatively flat. Schedule performance lagged.
Figure 5
Walt Lipke has observed that when a rise occurs, the high boundary is slightly higher than the final duration and tracks it as almost a horizontal line. Similarly, when a fall occurs, the low boundary is slightly lower than the final duration and tracks it more or less horizontally. [5]
Finally, when the pair remain symmetrical around the nominal value (which is the EACt point estimate), the nominal forecast is close to the final duration.
In summary, ES statistical analysis tells us how confident to be in completion estimates, how well uncertainty allowances are operating, and what the trends tell us about meeting our deadlines.
Warning Label
The use of statistical analysis on Agile projects comes with several caveats.
- Minimum Number of Sprints: At ProjectFlightDeck, we have found that ES stats are most useful on longer-running projects with weekly Sprints. The reason is that it takes time to build a sufficient amount of data for the stats to stabilize.
We use 30 Sprints as the minimum, reflecting a common rule-of-thumb for normal distributions. Although there are calculations for smaller sample sizes, we have found that the resulting spreads tend to be very wide and make assessments using them less effective. [6]
- Preferred Confidence Level: Similarly, we use a 90% confidence level, rather than the more rigorous 95%. Again, the spreads at 95% confidence are wider than at 90%. With 90%, we get a strong level of confidence with a smaller spread between the high and low boundaries, a combination that fits well with the way we commonly use ES stats to manage project schedules.
- Tool Use: Finally, the calculations required to produce the high and low EACt are manually intensive. They require a tool if they are to be used regularly on a project. Building a tool demands not only time for design, construction, and testing but also a detailed knowledge of the method. [7] Fortunately, there are both freeware and commercial tools available for doing the calculations. [8]
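The confidence-level trade-off above comes straight from the z-values of the standard normal distribution; a quick check:

```python
# Standard-normal z-values for two-sided confidence intervals.
z_90, z_95 = 1.645, 1.960

# Interval half-width scales directly with z, so for the same data the
# 95% spread is roughly 19% wider than the 90% spread.
print(round(z_95 / z_90, 3))  # about 1.19
```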
An Example
A recent project illustrates how statistical analysis is used to manage schedule performance on an Agile project.
The sample project was part of a large, multi-year, nine-figure program. The program was a heterogeneous mix of plan-driven, Agile, and hybrid projects. ES metrics were used as standard schedule performance measures on all projects in the program.
The sample project was executed by a large team of technical writers tasked with documenting all of the business processes and procedures for a package implementation. There were dozens of processes and hundreds of procedures.
Although all team members were experienced in technical writing, few had business-domain knowledge. The scope of the work was dynamic, reflecting the learning curve of the team. At the same time, the duration of the project was fixed at two years, including uncertainty allowances. Contractually, the documentation had to be in place prior to implementation of the package. The fluid scope and the firm end date made an Agile approach a good fit.
All projects in the program had access to ES, SPIt, and EACt. Until schedule performance stabilized, the Sample Project used these basic ES metrics to help manage the timeline. Once the stats stabilized, ES statistical analysis was added to the Sample Project’s available metrics.
By Sprint 30, the stats were stable, and a clear pattern had emerged. The high estimate (Green Triangles) and low estimate (Blue Squares) had started off far beyond the uncertainty allowances (Purple x’s and Light Blue Asterisks). At Sprint 30, they remained well beyond the limits. (See Figure 6.)
Figure 6
Although the boundaries were converging, the target date was assessed to be at risk. In fact, it appeared that a rebaseline might be necessary. The project status was Red.
Six weeks later, at Sprint 36, the boundaries finally came within uncertainty allowances. The project seemed likely to continue drawing down the allowances. That put at risk the “bare plan” finish date, i.e., the date excluding Contingency and Reserve. (See Figure 7.)
On the other hand, the upper and lower boundaries were symmetric around the nominal EACt, and the nominal EACt was itself hovering around the bare plan date. The project was assessed as likely to finish within the commitment, i.e., within Reserve, but it would consume much, if not all, of Contingency. The project status was Yellow.
Figure 7
In the end, the project finished beyond Contingency but within Reserve. The committed delivery date was met. As Billy Beane once said, “The math works.” [9]
Notes:
[1] New York Times as quoted in Paret (2013).
[2] Lipke (2011), p 1 and Figure 1. And, Lipke (2006), p 5.
[3] Azad (n.d.). Kalid Azad's blog, Better Explained, is a good source of intuitive explanations of mathematical concepts.
[5] For shorter duration projects and for Sprints early in a long-running project, techniques such as Monte Carlo simulation can be used.
[6] The details of the method include adjustments that must be made to the natural log of the SPIt to account for the fact that projects have a finite size whereas statistical analysis assumes an infinite sample size. Further discussion is beyond the scope of this post. See Lipke (2009), p 143 for details.
[7] Lipke (2009), Chapter 12. Also, Lipke (2006).
[8] ProjectFlightDeck offers tools that implement statistical analysis: currently, the Schedule Performance Analyzer of MS Project and the forthcoming upgrade to the AgileESM MS Excel tool. See http://www.projectflightdeck.com/.
[9] BrainyQuotes (n.d.).
References:
Azad, K. (n.d.) Demystifying the Natural Logarithm (ln). Retrieved from https://betterexplained.com/articles/demystifying-the-natural-logarithm-ln/.
BrainyQuotes. (n.d.) Retrieved from https://www.brainyquote.com/authors/billy_beane.
Lewis, M. (2003). Moneyball: The Art of Winning an Unfair Game. W.W. Norton: New York and London.
Lipke, W. (2006). Statistical Methods Applied to EVM…the Next Frontier. CrossTalk. June, 9, 6, pp 20-23.
Lipke, W. (2009). Earned Schedule. Raleigh, North Carolina: Lulu Publishing.
Lipke, W. (2010). Applying Statistical Methods to EVM Reserve Planning and Forecasting. The Measurable News, 3, 17-24.
Lipke, W. (2011). Further Study of the Normality of CPI and SPI(t). PM World Today. October.
Paret, M. (2013). Explaining the Central Limit Theorem. The Minitab Blog. Retrieved from http://blog.minitab.com/blog/michelle-paret/explaining-the-central-limit-theorem-with-bunnies-and-dragons-v2.
