Predicting the Unpredictable

I came to this book anticipating a coherent discussion of estimating on modern software development projects. The subtitle, Pragmatic Approaches to Estimating Project Schedule and Cost, says I'd find practical advice on how to estimate software development.

The Introduction opens with the standard project questions – how much will this cost and when will we be done? These questions are critical to any business spending money to produce a product – since time is money. You can buy technical solutions, you can get more resources, but you can't buy back lost time.

In the second paragraph there is an obvious statement: the problem with these questions is that they are predictions. Then follows I don't know about you, but my crystal ball is opaque. It's (sic – should be I've) never been good at predictions.

This indicates to me that the author actually doesn’t know how to estimate, but intends to tell the readers how to estimate, starting with a misunderstanding of what an estimate is and how that estimate is produced.

There are more observations about estimates changing and estimates expiring. This is correct. Estimates get updated with actual information, changes in future plans, discovered risks, etc. Estimates age out with this new information.

Chapter 2 starts with the naïve definition of estimating software projects: estimates are guesses. The dictionary definition of a guess is used for an estimate. The trouble is that a dictionary is usually not a good source for probability and statistics terms. Estimating is part of probabilistic decision-making.

One useful definition of an estimate is finding a value that is close enough to the right answer, usually with some thought or calculation involved. Another is an approximate calculation of some value of interest. For example, how much will this cost and when will we be done?

A broad definition is:

Estimation (or estimating) is the process of finding an estimate, or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable. The value is nonetheless usable because it is derived from the best information available. Typically, estimation involves using the value of a statistic derived from a sample to estimate the value of a corresponding population parameter. The sample provides information that can be projected, through various formal or informal processes, to determine a range most likely to describe the missing information. An estimate that turns out to be incorrect will be an overestimate if the estimate exceeded the actual result, and an underestimate if the estimate fell short of the actual result.
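To make that sample-to-population idea concrete, here's a minimal sketch in Python – with made-up cycle-time data, not anything from the book – of estimating a population mean from a sample:

```python
import math
import statistics

# Hypothetical sample: cycle times (days) for ten completed stories.
sample = [3.5, 5.0, 2.0, 4.5, 6.0, 3.0, 7.5, 4.0, 5.5, 3.5]

mean = statistics.mean(sample)      # point estimate of the population mean
sd = statistics.stdev(sample)       # sample standard deviation
se = sd / math.sqrt(len(sample))    # standard error of the mean

# 90% confidence range, normal approximation (z = 1.645)
low, high = mean - 1.645 * se, mean + 1.645 * se
print(f"Estimated mean cycle time: {mean:.1f} days (90% range {low:.1f} to {high:.1f})")
```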

This chapter confuses the accuracy of an estimate with the precision of an estimate. Accuracy and precision are defined in terms of systematic and random errors. The more common definition associates accuracy with systematic errors and precision with random errors. Another definition, advanced by ISO, associates trueness with systematic errors and precision with random errors and defines accuracy as the combination of both trueness and precision.

Accuracy and Precision

According to ISO 5725-1, the general term accuracy is used to describe the closeness of a measurement to the true value. When the term is applied to sets of measurements of the same measurand, it involves a component of random error and a component of systematic error. In this case, trueness is the closeness of the mean of a set of measurement results to the actual (true) value and precision is the closeness of agreement among a set of results.

To make this really formal, here's the math for accuracy and precision. Accuracy is the proportion of true results (both positive and negative) among the total number of cases examined.

Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives)

Precision is the proportion of true positives among all positive results.

Precision = True Positives / (True Positives + False Positives)
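In code, with hypothetical confusion-matrix counts, those two formulas look like this:

```python
# Hypothetical confusion-matrix counts, just to show the two formulas.
tp, tn, fp, fn = 40, 30, 10, 20   # true/false positives/negatives

accuracy = (tp + tn) / (tp + tn + fp + fn)   # true results over all cases
precision = tp / (tp + fp)                   # true positives over all positives

print(f"accuracy = {accuracy:.2f}, precision = {precision:.2f}")
```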

Chapter 2 ends with an example of bad management. Making and accepting estimates without assessing their precision and accuracy – the variances of the estimated value – is simply bad management. It's actually naïve management that goes bad when actions are taken with this naïve estimate.

Just an aside: the optimistic, most likely, and pessimistic estimates are OK, but they are actually not allowed in the domain I work in. They are subject to optimism bias and produce large variances depending on the order in which the questions are asked.
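For reference, the standard three-point (PERT) arithmetic looks like the following – a sketch of the textbook formulas with hypothetical values, not the author's method:

```python
# Textbook PERT (beta) approximation from a three-point estimate.
# o, m, p are hypothetical optimistic, most likely, and pessimistic values (days).
o, m, p = 5.0, 10.0, 25.0

mean = (o + 4 * m + p) / 6    # expected duration
sigma = (p - o) / 6           # rough standard deviation

print(f"mean = {mean:.1f} days, sigma = {sigma:.1f} days")
# The pessimistic tail pulls the mean above the most likely value - the
# asymmetry that makes single-point answers misleading.
```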

Chapter 3 starts with another misused term. An Order of Magnitude estimate is a number within 10X of the actual value. This is an estimate that is ±100% – not very useful in practice. The term rough can cover tight or broad ranges.

The core issue here is the question: why do we estimate? The popular software developer point of view is because we have been asked, or because we have always done estimates.

The actual answer is

because making decisions in the presence of uncertainty about the future outcomes of those decisions is the subject of microeconomics. Making these decisions requires – mandates, actually – estimating those impacts.

So in the end, estimates are for those providing the money. Any sense that estimates are a waste needs to be confirmed with those providing the money. This is not to say that estimates and estimating are not done poorly, manipulated, misused, or used to abuse people. But from the development point of view, it’s not your money.

Chapter 4 starts with the common lament of software developers – we’ve never done this before so how can we possibly estimate our work?

Well, if you haven't done this before, go find someone who has. It's that simple and that hard. The notion on the first page of Chapter 4 about tools like SLIM and COCOMO seems a bit narrow-minded. As one who uses those tools, as well as SEER and QSM, I can attest to their measurable value, accuracy, and precision. I wonder if the author has applied them in a mature development environment. These tools require skill, experience, and most of all calibration. The conjecture that they require substantial (no units of measure) time that takes away from the team learning to work together begs the question: what is the value at risk? Applying SEER to $1.2B of national asset software development is much different from applying a tool to $200K of web site development. The problem with this book – at least up to Chapter 4 – is that there is no domain defined in which the advice is applicable.

Next comes the notion that estimates in software are like estimates in construction. Providing a single point estimate is again bad management – don't do that. By the way, construction is innovative as well. Estimating new and innovative ways to pour concrete foundations for nuclear power stations is estimating in the presence of unknowns. Estimating the construction of fusion power development is actually rocket science. I've done both in recent years. These types of phrases appear to come from people who have not actually worked in those domains, and are being used as Red Herrings.

At the end of Chapter 4, we're back to sound practices. Inch-pebbles and small stories all fit into a critical success factor for all projects, no matter the domain.

How long are you willing to wait before you find out you’re late?

The answer to this question defines the sample time for producing outcomes. Not too long, not too short. This is the Nyquist sampling interval: sample at least twice as often as the delay you can tolerate in discovering a change.
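A toy illustration of that sampling rule – my analogy in code, assuming the Nyquist factor of two carries over directly to status reporting:

```python
# Toy version of the Nyquist-style rule: to detect a slip within a
# tolerable delay, sample status at least twice that often.
def status_interval(max_tolerable_delay_days: float) -> float:
    """Longest reporting interval that still detects a slip in time."""
    return max_tolerable_delay_days / 2.0

print(status_interval(14.0))  # willing to wait 2 weeks -> report weekly
```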

Chapter 5 starts out with some generalizations that just aren't true. Estimating time and budget is always possible – we do it all the time. The real question about budget and time is how much margin is needed. Answering it requires a model of the work and its uncertainties – both reducible and irreducible. This is the standard launch date problem. It can be a literal launch date – a date that can't be missed. Either a product launch or a physical system launch – we can fly to Mars in a 3-week window roughly every 26 months – be there, with your newly developed, never-been-done-before autonomous rendezvous and dock software.

The 4 steps in §5.1 are logical steps, except for the last sentence in #4 – that estimates are guesses. They can be a guess, but that's bad estimating. Don't guess; apply good estimating practices. The rest of Chapter 5 gets better, although the notion of exclusive tradeoffs in §5.2 ignores the purpose of margin. It's not a trade between features, time, cost, and quality. The very purpose of estimating is to provide schedule margin, cost margin, management reserve, and technical reserve, and to establish the probabilistic values for each of those and for the combinations of those.
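Here's a minimal sketch of how margin falls out of a probabilistic model – hypothetical task distributions, not the book's numbers. Schedule margin is the gap between a high-confidence percentile and the median:

```python
import random

random.seed(7)

# Hypothetical project: ten tasks, each right-skewed (most likely 10 days,
# pessimistic tail out to 20). 10,000 Monte Carlo trials of the total.
totals = sorted(
    sum(random.triangular(8, 20, 10) for _ in range(10))
    for _ in range(10_000)
)

p50 = totals[len(totals) // 2]           # median plan
p80 = totals[int(len(totals) * 0.80)]    # 80% confidence completion

print(f"median = {p50:.0f} days, 80th percentile = {p80:.0f} days")
print(f"schedule margin protecting the date: {p80 - p50:.0f} days")
```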

For simple projects, that is too much. For complex enterprise software intensive systems, that kind of analysis, planning, and execution is mandatory.

The enterprise project class needs to show up on time, with the needed capabilities, for the planned cost, and the planned effectiveness, performance, reliability, and all the other …illities needed for success.[2]

Chapter 6 starts with advice on how to actually estimate. Make stories small is good advice anywhere. In our domain, we have the 44-day rule: no work can cross more than one accounting period. This limits exposure to not knowing what done looks like. In the small agile world, 44 days (2 working months) sounds huge. In a software intensive system – even one using agile – it's a short time. Building DO-178 compliant flight software is not the same as building web pages for the shoe store ordering application. So yes, decompose the work into visible chunks that can be sized in one of several manners.

The author uses the term SWAG (Scientific Wild Ass Guess). SWAGs are not estimates; SWAGs are bad estimates. There are much easier ways, with much more accurate results, than Guessing out of your Ass. One starting point is the Binary Search method shown in How to Estimate Almost Any Software Deliverable. This way you can stop guessing and start estimating using proven methods.
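A minimal sketch of the binary-search idea – my reading of the method, with a hypothetical seems_bigger_than question standing in for the expert's judgment:

```python
# Binary-search estimating: converge on a size by repeatedly asking
# "does the work feel bigger than the midpoint?" instead of guessing outright.
# seems_bigger_than is a hypothetical stand-in for the expert's judgment.
def binary_search_estimate(low, high, seems_bigger_than, tolerance=1.0):
    while high - low > tolerance:
        mid = (low + high) / 2
        if seems_bigger_than(mid):
            low = mid    # feels bigger than mid: raise the floor
        else:
            high = mid   # feels smaller: lower the ceiling
    return (low + high) / 2

# Example: the team judges the deliverable bigger than anything under 13 days.
print(binary_search_estimate(1, 40, lambda mid: mid < 13))
```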

In §6.4.1 there is the notion of using past performance and past velocity as a measure of the future. There is a fundamental flaw in this common agile approach to estimating.

Using past velocity for the future only works if the future is like the past. Secondly, it only works if the variance in that past velocity is narrow enough to provide a credible forecast of the future velocity. Below is a simple example of some past performance numbers. They can be stories or anything else you want to forecast into the future. Care is needed to assess how the variances in the past will likely be expressed in the future. This is called The Flaw of Averages, and there is a book of the same title. The colored bands on the right are 80% and 90% confidence ranges of the possible outcomes. These, from the past data, are 45% swings from the Mean (Average). Not good confidence when spending other people's money.

The chart below, produced by a simple R script from past performance data, shows the possible ranges of the future, given the past.
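For readers without R, here is a comparable sketch in Python – a bootstrap resample of hypothetical past velocities, my construction rather than the script behind the chart:

```python
import random

random.seed(42)

# Hypothetical past velocities (stories per iteration).
past = [14, 9, 17, 11, 20, 8, 15, 12]

# Bootstrap the total throughput of the next six iterations.
trials = sorted(
    sum(random.choice(past) for _ in range(6))
    for _ in range(10_000)
)

def band(conf):
    tail = (1 - conf) / 2
    return trials[int(len(trials) * tail)], trials[int(len(trials) * (1 - tail))]

print("80% band:", band(0.80))
print("90% band:", band(0.90))
# Wide bands from a noisy history are the Flaw of Averages in action.
```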

[ARIMA forecast chart of the past performance data, with confidence bands]

Chapter 6 also covers one of my favorite topics. Rolling Wave Planning is a standard process on our Software Intensive Systems. [1] We can't know what's going to happen beyond the planning horizon, so detailed planning is not possible. This, of course, is a fallacy when you have past performance.

Chapter 7 speaks to various estimating models. The cone of uncertainty is the first example. The sentence This is a Gaussian distribution is not mathematically true. No cost or schedule variance model can be normally distributed. To be normally distributed, all the random variables in the population represented by the distribution must be I.I.D. – Independent and Identically Distributed. This means there is no coupling between the random variables. For any non-trivial project that can never be the case. Cost and schedule distribution functions are long-tailed – asymmetric.

The suggestion that projects stay at 90% complete for a long time has nothing to do with the shape of the probability distribution of the possible durations or costs. Like a few other authors in agile, this author may not be familiar with the underlying statistical mathematics of estimates, so it's forgivable that concepts are included without consideration of what math is actually taking place. Projects are coupled stochastic processes – networks of dependent activities, where each node in the network is a stochastic process: a collection of random variables representing the evolution of some system of random values over time. These random variables interact with each other and may evolve over time. This is why estimating is hard. But it is also why Monte Carlo Simulation tools are powerful solutions to the estimating problem.
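Here's a minimal sketch of that coupled-network behavior – hypothetical right-skewed task durations in a tiny dependency network, showing where the long right tail comes from:

```python
import random
import statistics

random.seed(1)

def task():
    # Right-skewed duration: most likely 10 days, pessimistic tail to 30.
    return random.triangular(7, 30, 10)

def project():
    # Tiny dependency network: A feeds B and C in parallel; both feed D.
    a, b, c, d = task(), task(), task(), task()
    return a + max(b, c) + d   # the slower parallel branch gates D

runs = sorted(project() for _ in range(20_000))
mean = statistics.mean(runs)
median = runs[len(runs) // 2]
print(f"mean = {mean:.1f}, median = {median:.1f}")  # mean > median: long right tail
```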

In Chapter 8, §8.1 gives me heartburn. This notion that we're inventing software says we've never done this before. Which really says we don't know what we're doing and we'll have to invent the solution. Would you hire anyone to solve a complex problem for you who has not done it, or something like it, before? Or would you go get a reference design? What happened to all the Patterns books and reference designs? Model View Controller was my rock when building process control systems and their user interfaces. So this section is a common repeat of the current agile approach – all software development is new. Which means it's new to me, so I don't know how to estimate. Go Find Someone Who Does.

There are some good suggestions in this section. The multiple references to Troy Magennis's book Forecasting and Simulating Software Development Projects lead me to believe his book is what you should read, rather than this one. But that aside, Chapter 8 has the beginnings of a description of estimating processes. There is a wealth of information at USC's Center for Systems and Software Engineering on estimating processes. Read some of those first.

Chapter 9 starts with the same red herring – the perfect estimate. There is no such thing. All estimates are probabilistic, with measures of accuracy and precision, as described above. This phrase is common among those who don't seem to have the math skills for actually making estimates. This is not a criticism, it is an observation.

The notion of stop estimating and start producing begs the question – produce what? Does the poor agile project manager described here have any idea of what done looks like? If no one has any idea of what done looks like, how is he (I assumed a he) going to get to done? What problems will be encountered along the way? How will progress be measured – other than the passage of time and the spending of money? This is not only bad project management, it is bad business management.

This is where this book becomes disconnected from the reality of the business of writing software for money.

Management is obligated to know how much this will cost and when it will be done to some level of confidence determined by the governance process of the business.

Open-ended spending is usually not a good way to stay in business. Having little or no confidence about when the spending will stop is not good business. Having little or no confidence in when the needed capabilities for that spend will arrive is not good business. What the author is describing on page 40 is low-maturity, inexperienced management of other people's money.

Chapter 10 opens with the loaded question Do your estimates provide value? And it repeats the frequent doublespeak of the #NoEstimates advocates: no estimates doesn't literally mean no estimates. Well, explain this, Lucy – from the original poster of the #NoEstimates hashtag:

[Tweet from the original poster of the #NoEstimates hashtag]

It seems pretty clear that No Estimates means we can make decisions with No Estimates. At this point in the book, the author has lost me. This oxymoron – no estimates means estimates – has reduced the conversation to essentially nonsense.

The approach of breaking down the work into atomic pieces (singular outcomes) is a nice way to decompose all the work. But this takes effort. It takes analysis. It sometimes takes deep understanding of the future. This is physical estimating: revealing all the work down to an atomic unit, or maybe the second-level unit. With that in hand you don't need to estimate. You've got a visible list of all the work, sized in singular measures. Just add them up and that's the Estimate to Complete.
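The roll-up itself is trivial once the decomposition is done – a sketch with hypothetical work items:

```python
# Hypothetical atomic work items, each sized in days.
work_items = {
    "login form": 2,
    "password reset": 1,
    "session handling": 3,
    "audit logging": 2,
}

# Every item visible and singularly sized, so the roll-up is just a sum.
print(f"Estimate to Complete: {sum(work_items.values())} days")
```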

But how long will that take? Can it even be done? I really wanted to finish the book and the review, so I skipped the rest of Chapter 10.

Chapter 11 starts to lay out how to produce an estimate but falls back into the naïve mathematics of those unfamiliar with the probabilistic processes of projects. The use of 90% confidence in June 1 and 100% confidence in August 1 tells me there is no understanding of probability and statistics here. There is no such thing as 100% confidence in anything in the software business. A 90% confidence number is unheard of. And without margins and management reserve, those numbers are – and I'll be considered rude here – essentially pure nonsense.

This is the core problem with estimating even in our domain. There is little understanding of the mathematics of project work. Troy’s book is a good start, but it has some issues as well. But those of us who work in a domain that lives and many times dies by estimates have all learned that naïve approaches, like those described here, are at the root of the smell of dysfunction so popularly used by the #NoEstimates advocates.

Chapter 12 finally arrives at the core process of all good estimating – probabilistic scheduling. Unfortunately, an example used by the author is not a software development project but a book-writing project.

The core concept in estimating cost and schedule of a software project is the probabilistic behavior of the work and the accumulation of the variance in that work to produce a confidence of completing on or before a need date.

The notion at the end of this chapter – it doesn't matter what kind of life cycle you use, the further out the dates are, the less you know – is not actually true in practice. In the probabilistic scheduling model, the far future is less known, but that future can be modeled in many ways. In all cases, a Monte Carlo Simulation is used to model this future and show the probability of completing on or before the need date.
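A minimal sketch of that calculation – hypothetical task distributions again, showing how a Monte Carlo run turns into a confidence of meeting a need date:

```python
import random

random.seed(3)

# Hypothetical Monte Carlo of total duration: twelve right-skewed tasks.
totals = [
    sum(random.triangular(7, 30, 10) for _ in range(12))
    for _ in range(20_000)
]

need_date = 200  # hypothetical must-make date, in days from start
confidence = sum(t <= need_date for t in totals) / len(totals)
print(f"P(complete on or before day {need_date}) = {confidence:.0%}")
```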

While the author used Troy's book in an earlier chapter, it would have been useful to use it here as well, where he shows how to model the entire project, not just the close-in work.

Chapter 13 opens with a short description of the principal failure mode of all projects: you don't know what done looks like. The agile notion of starting work and letting the requirements emerge is seriously flawed in the absence of some tangible description of Done in terms of capabilities. What is this project trying to produce in terms of capabilities for the customer? Don't know? Don't start. The only way to start in this condition is to have the customer pay to discover the answer to what does done look like?

Even in the pure science world – and I know something about pure science in the particle physics world – there is a research goal: some planned outcome for the discovery efforts, to convince the funding agency it will get something back in the future. Research grants have a Stated Goal. Building software for money is rarely research. It is development. It's the D in R&D.

So before you can know something about when you'll be done, you must know what done looks like in units of measure meaningful to the decision makers. With that information, you can start asking and answering questions about the attributes of done. How fast, how reliable, how big? What measures of effectiveness, measures of performance, key performance parameters, technical performance measures, and all the other …illities must be satisfied before starting, during execution, and when everything changes all the time – which it will?

In Chapter 13 there is a good checklist (pg. 50) that is connected to other project management methods we use in our domain. Integrated Master Plan / Integrated Master Schedule, where physical percent complete uses Quantifiable Backup Data to state where we are in the plan from the tangible evidence of progress to that plan. So Chapter 13 is a winner all around.

[1] A Software Intensive System is one where software contributes essential influences to the design, construction, deployment and evolution of the system as a whole. http://www2.cs.uni-paderborn.de/cs/ag-schaefer/Lehre/Lehrveranstaltungen/Vorlesungen/SoftwareEngineeringForSoftwareIntensiveSystems/WS0506/SEfSIS-I.pdf

[2] …illities are all the things that the resulting system needs to do to be successful beyond the tangible requirements http://www.dtic.mil/ndia/2011system/13166_WillisWednesday.pdf
