Risk identification during early design phases of complex systems is commonly implemented but often fails to identify events and circumstances that challenge program performance. Inefficiencies in cost and schedule estimates are usually held accountable for cost and schedule overruns, but the true root cause is often the realization of programmatic risks. A deeper understanding of frequent risk identification trends and biases pervasive during system design and development is needed, for it would lead to the improved execution of existing identification processes and methods.
Risk management means building a model of the risk, a model of the impact of the risk on the program, and a model for handling the risk. Because it is still a risk, the corrective or preventive action has not yet occurred.
Probabilistic Risk Assessment (PRA) is the basis of these models and provides the Probability of Project Success. Probabilities result from uncertainty and are central to the analysis of the risk. Scenarios, model assumptions, and model parameters are based on current knowledge of the behavior of the system under a given set of uncertainty conditions.
The source of uncertainty must be identified and characterized, and its impact on program success modeled and understood, so decisions can be made about the corrective and preventive actions needed to increase the Probability of Project Success.
Since risk is the outcome of Uncertainty, distinguishing between the types of uncertainty in the definition and management of risk on complex systems is useful when building risk assessment and management models.
- Epistemic uncertainty ‒ from the Greek ἐπιστήμη (episteme), is uncertainty from the lack of knowledge of a quantity or process in the system or its environment. Epistemic uncertainty is represented by a range of values for parameters, a range of workable models, the level of model detail, multiple expert interpretations, or statistical confidence. The accumulation of information and the implementation of actions reduce epistemic uncertainty, eliminating or reducing the likelihood and/or impact of the risk. This uncertainty is modeled as a subjective assessment of the probability of our knowledge and the probability of occurrence of an undesirable event.
Incomplete knowledge about some characteristics of the system or its environment is the primary source of Epistemic uncertainty.
- Aleatory uncertainty ‒ from the Latin alea (a single die) is the inherent variability associated with a physical system or environment. Aleatory uncertainty comes from an inherent randomness, natural stochasticity, environmental or structural variation across space and time in the properties or behavior of the system under study. The accumulation of more data or additional information cannot reduce aleatory uncertainty. This uncertainty is modeled as a stochastic process of an inherently random physical model. The projected impact of the risk produced by Aleatory uncertainty can be managed through cost, schedule, and/or technical margin.
Naturally occurring variations associated with the physical system are primary sources of Aleatory uncertainty.
There is a third uncertainty found on some projects.
- Ontological Uncertainty ‒ is attributable to the complete lack of knowledge of the states of a system. This is sometimes labeled an Unknowable Risk. Ontological uncertainty cannot be measured directly.
Separating Aleatory and Epistemic Uncertainty for Risk Management
Knowing the percentage of reducible versus irreducible uncertainty is needed to construct a credible risk model.
Without the separation, not knowing which uncertainty is reducible and which is irreducible inhibits the design of the corrective and preventive actions needed to increase the probability of program success.
Separating the uncertainty types increases the clarity of risk communication, making it clear which type of uncertainty can be reduced and which types cannot be reduced. For the latter (irreducible risk), only margin can be used to protect the program from the uncertainty.
As uncertainty increases, the ability to precisely measure the uncertainty is reduced to where a direct estimate of the risk can no longer be assessed through a mathematical model. While a decision in the presence of uncertainty must still be made, deep uncertainty and poorly characterized risks lead to the absence of data and risk models in many domains.
Epistemic Uncertainty Creates Reducible Risk
The risk created by Epistemic Uncertainty represents resolvable knowledge, with elements expressed as a probabilistic uncertainty of a future value related to a loss in a future period of time. Awareness of this lack of knowledge provides the opportunity to reduce this uncertainty through direct corrective or preventive actions.
Epistemic uncertainty, and the risk it creates, is modeled by defining the probability that the risk will occur, the time frame in which that probability is active, the probability of an impact or consequence from the risk when it does occur, and finally the probability of the residual risk once the handling of that risk has been applied.
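The quantities named above can be combined into a simple expected-impact calculation. The following is a minimal sketch, not a prescribed method: the EventRisk class, its field names, and the example figures are hypothetical, and it assumes the expected impact is the product of the probability of occurrence, the probability of a consequence given occurrence, and a fixed schedule impact in days.

```python
from dataclasses import dataclass


@dataclass
class EventRisk:
    """Hypothetical record for one event-based (epistemic) risk."""
    name: str
    p_occurrence: float    # probability the risk occurs while it is active
    active_window: tuple   # (start_day, end_day) in which that probability applies
    p_impact: float        # probability of a consequence, given the risk occurs
    impact_days: float     # assumed schedule impact if the consequence is realized
    p_residual: float      # probability of occurrence remaining after handling

    def expected_impact(self, handled: bool = False) -> float:
        """Expected schedule impact in days, before or after handling."""
        p = self.p_residual if handled else self.p_occurrence
        return p * self.p_impact * self.impact_days


risk = EventRisk(
    name="Subcontractor status data arrives late",
    p_occurrence=0.40,
    active_window=(30, 90),
    p_impact=0.50,
    impact_days=20.0,
    p_residual=0.10,
)

before = risk.expected_impact()
after = risk.expected_impact(handled=True)
print(f"Expected impact before handling: {before:.1f} days")
print(f"Expected impact after handling:  {after:.1f} days")
```

The difference between the two figures is what the handling effort buys, and it can be weighed against the cost and time of that effort.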
Epistemic uncertainty statements define and model these event‒based risks:
- If‒Then ‒ if we miss our next milestone then the program will fail to achieve its business value during the next quarter.
- Condition‒Concern ‒ our subcontractor has not provided enough information for us to status the schedule, and our concern is the schedule is slipping and we do not know it.
- Condition‒Event‒Consequence ‒ our status shows there are some tasks behind schedule, so we could miss our milestone, and the program will fail to achieve its business value in the next quarter.
For these types of risks, an explicit or an implicit risk handling plan is needed. The word handling is used with special purpose. “We Handle risks” in a variety of ways. Mitigation is one of those ways. In order to mitigate the risk, new effort (work) must be introduced into the schedule. We are buying down the risk, or we are retiring the risk by spending money and/or consuming time to reduce the probability of the risk occurring. Or we could be spending money and consuming time to reduce the impact of the risk when it does occur. In both cases, actions are taken to address the risk.
Reducible Cost Risk
Reducible cost risk is often associated with unidentified reducible Technical risks and with changes in technical requirements whose propagation impacts cost. Understanding the uncertainty in cost estimates supports decision making for setting targets and contingencies, risk treatment planning, and the selection of options in the management of program costs. Before reducible cost risk can be analyzed, the cost structure must be understood. Cost risk analysis goes beyond capturing the cost of WBS elements or the content of the Product Roadmap in the Basis of Estimate and the Cost Estimating Relationships. This involves:
- Development of quantitative modeling of integrated cost and schedule, incorporating the drivers of reducible uncertainty in quantities, rates and productivities, and the recording of these drivers in the Risk Register.
- Determining how cost and schedule uncertainty can be integrated into the analysis of the cost risk model.
- Performing sensitivity analysis to provide an understanding of the effects of reducible uncertainty and the allocation of contingency amounts across the program.
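As an illustration of the sensitivity analysis step, the sketch below varies one cost driver at a time across its range while holding the others at their base values. The cost model (quantity divided by productivity, multiplied by rate), the driver ranges, and all names are assumptions made for this example; a real program would take them from its Basis of Estimate and Risk Register.

```python
def cost(quantity, rate, productivity):
    """Assumed cost model: units of work / units per hour * cost per hour."""
    return quantity / productivity * rate


# Hypothetical base values and low/high ranges for each driver.
BASE = {"quantity": 1000.0, "rate": 120.0, "productivity": 2.0}
RANGES = {
    "quantity": (900.0, 1300.0),
    "rate": (110.0, 140.0),
    "productivity": (1.6, 2.2),
}


def one_at_a_time(base, ranges):
    """Cost swing when each driver moves across its range, others held at base."""
    swings = {}
    for name, (low, high) in ranges.items():
        results = [cost(**{**base, name: value}) for value in (low, high)]
        swings[name] = max(results) - min(results)
    # Largest swing first, so the dominant driver is listed at the top.
    return dict(sorted(swings.items(), key=lambda item: item[1], reverse=True))


swings = one_at_a_time(BASE, RANGES)
for name, swing in swings.items():
    print(f"{name:<13} swing in cost: {swing:,.0f}")
```

Drivers with the largest swing are the natural candidates for risk reduction effort and for a larger share of the contingency.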
Reducible Schedule Risk
While there is significant variability, for every 10% in Schedule Growth there is a corresponding 12% Cost Growth.
Schedule Risk Analysis (SRA) is an effective technique to connect the risk information of program activities to the baseline schedule, to provide sensitivity information of individual program activities to assess the potential impact of uncertainty on the final program duration and cost.
Schedule risk assessment is performed in 4 steps:
- Baseline Schedule ‒ Construct a credible activity network compliant with GAO‒16‒89G, “Schedule Assessment Guide: Best Practices for Project Schedule.”
- Define Reducible Uncertainties ‒ for activity durations and cost distributions from the Risk Register and assign these to work activities affected by the risk and/or the work activities assigned to reduce the risk.
- Run Monte Carlo simulations ‒ for the schedule using the assigned Probability Distribution Functions (PDFs), using the Min/Max values of the distribution, for each work activity in the Integrated Master Schedule.
- Interpret Simulation Results ‒ using the data produced by the Monte Carlo Simulation.
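The four steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the activities, their minimum, most likely, and maximum durations, and the use of triangular distributions are hypothetical, and the activities are treated as a strictly serial chain rather than a full activity network.

```python
import math
import random

# Hypothetical serial activities: (name, min, most likely, max) durations in days.
ACTIVITIES = [
    ("Design", 8, 10, 15),
    ("Build", 18, 20, 30),
    ("Test", 9, 10, 18),
]


def simulate_finish(activities, trials=10_000, seed=1):
    """Sample every activity from its triangular distribution and sum the chain."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode) in that order.
        total = sum(rng.triangular(lo, hi, mode) for _, lo, mode, hi in activities)
        totals.append(total)
    return sorted(totals)


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, with p in (0, 100]."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


totals = simulate_finish(ACTIVITIES)
baseline = sum(mode for _, _, mode, _ in ACTIVITIES)  # sum of most likely durations
p50 = percentile(totals, 50)
p80 = percentile(totals, 80)
print(f"Baseline (most likely) duration: {baseline} days")
print(f"50th percentile finish: {p50:.1f} days")
print(f"80th percentile finish: {p80:.1f} days")
```

Because the distributions here are skewed toward longer durations, the simulated percentiles sit above the sum of the most likely values, which is the kind of result the interpretation step is meant to expose.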
Reducible Technical Risk
Technical risk is the impact on a program, system, or entire infrastructure when the outcomes from engineering development do not work as expected, do not provide the needed technical performance, or create higher than planned risk to the performance of the system. Failure to identify or properly manage this technical risk results in performance degradation, security breaches, system failures, increased maintenance time, a significant amount of technical debt, and additional cost and time for the end item deliverables of the program.
Reducible Cost Estimating Risk
Reducible cost estimating risk is dependent on technical, schedule, and programmatic risks, which must be assessed to provide an accurate picture of program cost. Cost risk estimating assessment addresses the cost, schedule, and technical risks that impact the cost estimate. To quantify these cost impacts from the reducible risk, sources of risk need to be identified. This assessment is concerned with three sources of risk and ensures that the model calculating the cost also accounts for these risks:
- The risk inherent in the cost estimating method, expressed through the Standard Error of the Estimate (SEE), confidence intervals, and prediction intervals.
- The risk inherent in technical and programmatic processes, including the technology’s maturity, design and engineering, integration, manufacturing, schedule, and complexity.
- The risk inherent in the correlation between WBS elements, which determines to what degree one WBS element’s change in cost is related to another’s and in which direction. WBS elements within the project have positive correlations with each other, and the cumulative effect of this positive correlation increases the range of the costs.
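The effect of positive correlation on the range of total cost can be shown with the standard formula for the variance of a sum of correlated quantities. The sketch below is illustrative only: the WBS standard deviations are hypothetical, and a single common pairwise correlation is assumed for every pair of elements.

```python
import math


def total_cost_std(stds, rho):
    """Standard deviation of a sum of costs sharing one pairwise correlation rho."""
    variance = sum(s * s for s in stds)
    for i in range(len(stds)):
        for j in range(i + 1, len(stds)):
            # Each correlated pair adds 2 * rho * sigma_i * sigma_j to the variance.
            variance += 2 * rho * stds[i] * stds[j]
    return math.sqrt(variance)


# Hypothetical standard deviations of three WBS element costs (same currency unit).
wbs_stds = [10.0, 20.0, 15.0]

for rho in (0.0, 0.3, 0.6):
    print(f"rho = {rho:.1f}: total cost standard deviation = "
          f"{total_cost_std(wbs_stds, rho):.1f}")
```

Treating the elements as independent (rho = 0) gives the narrowest spread, so a cost model that ignores positive correlation understates the range of total cost.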
Unidentified reducible Technical Risks are often associated with Reducible Cost and Schedule risk.
Aleatory Uncertainty Creates Irreducible Risk
Aleatory uncertainty and the risk it creates come not from the lack of information, but from the naturally occurring processes of the system. For aleatory uncertainty, more information cannot be bought, nor can specific risk reduction actions be taken, to reduce the uncertainty and the resulting risk. The objective of identifying and managing aleatory uncertainty is to be prepared to handle the impacts when the risk is realized.
The method for handling these impacts is to provide margin for this type of risk, including cost, schedule, and technical margin.
Using the standard project management definition, Margin is the difference between the maximum possible value and the maximum expected value, and is separate from Contingency. Contingency is the difference between the current best estimate and the maximum expected estimate. For systems under development, the technical resources and the technical performance values carry both margin and contingency.
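The two definitions reduce to simple arithmetic. The sketch below applies them to a hypothetical technical resource (a mass budget in kilograms); the function name and the figures are assumptions for illustration only.

```python
def margin_and_contingency(current_best_estimate, max_expected, max_possible):
    """Return (margin, contingency) using the definitions in the text.

    margin      = maximum possible value - maximum expected value
    contingency = maximum expected value - current best estimate
    """
    if not current_best_estimate <= max_expected <= max_possible:
        raise ValueError("expected: current best estimate <= max expected <= max possible")
    margin = max_possible - max_expected
    contingency = max_expected - current_best_estimate
    return margin, contingency


# Hypothetical mass budget, in kilograms.
margin, contingency = margin_and_contingency(
    current_best_estimate=820.0,
    max_expected=900.0,
    max_possible=1000.0,
)
print(f"Margin: {margin:.0f} kg, Contingency: {contingency:.0f} kg")
```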
Schedule Margin should be used to cover the naturally occurring variances in how long it takes to do the work. Cost Margin is held to cover the naturally occurring variances in the price of something being consumed in the program. The technical margin is intended to cover the naturally occurring variation of technical products.
Aleatory uncertainty and the resulting risk are modeled with a Probability Distribution Function (PDF) that describes the possible values the process can take and the probability of each value. The PDF for the possible durations of the work in the program can be determined. Knowledge about the aleatory uncertainty can be gained through Reference Class Forecasting and past performance modeling. This new information allows us to update (adjust) the PDF, since our past performance on similar work provides information about our future performance. But the underlying processes are still random, and our new information simply creates a new aleatory uncertainty PDF.
The first step in handling Irreducible Uncertainty is the creation of Margin: schedule margin, cost margin, and technical margin, to protect the program from the risk of irreducible uncertainty. Margin is defined as the allowance in the budget, programmed schedule … to account for uncertainties and risks.
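One simple way to turn past performance into a distribution is to collect the ratio of actual to planned duration for similar completed work and read percentiles from it. The sketch below assumes such a reference class exists; the ratios, the function name, and the 20-day planned duration are hypothetical.

```python
import statistics

# Hypothetical reference class: actual / planned duration for 12 similar past tasks.
PAST_RATIOS = [0.95, 1.00, 1.05, 1.10, 1.10, 1.15,
               1.20, 1.25, 1.30, 1.40, 1.50, 1.80]


def reference_class_forecast(planned_days, ratios):
    """Scale a planned duration by the empirical P50 and P80 of past ratios."""
    deciles = statistics.quantiles(ratios, n=10)  # nine cut points: P10 ... P90
    return {
        "P50": planned_days * deciles[4],
        "P80": planned_days * deciles[7],
    }


forecast = reference_class_forecast(20.0, PAST_RATIOS)
for level, days in forecast.items():
    print(f"{level}: {days:.1f} days")
```

Adding more history refines the shape of the distribution but does not remove the spread, which is why this uncertainty is covered with margin rather than with risk reduction actions.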
Margin needs to be quantified by:
- Identifying WBS elements that contribute to margin.
- Identifying uncertainty and risk that contributes to margin.
Irreducible Schedule Risk
Programs are over budget and behind schedule, to some extent because uncertainties are not accounted for in schedule estimates. Research and practice are now addressing this problem, often by using Monte Carlo methods to simulate the effect of variances in work package costs and durations on total cost and date of completion. However, many such program risk approaches ignore the significant impact of probabilistic correlation on work package cost and duration predictions.
Irreducible schedule risk is handled with Schedule Margin which is defined as the amount of added time needed to achieve a significant event with an acceptable probability of success. Significant events are major contractual milestones or deliverables.
With minimal or no margins in schedule, technical, or cost present to deal with unanticipated risks, successful acquisition is susceptible to cost growth and cost overruns.
The Project Manager owns the schedule margin. It does not belong to the client, nor can it be negotiated away by the business management team or the customer. This is the primary reason to CLEARLY identify the Schedule Margin in the Integrated Master Schedule. It is there to protect the program deliverable(s). Schedule margin is not allocated to over‒running tasks, but rather is planned to protect the end item deliverables.
The schedule margin should protect the delivery date of major contract events or deliverables. This is done with a Task in the IMS that has no budget (BCWS). The duration of this Task is derived from Reference Classes or Monte Carlo Simulation of aleatory uncertainty that creates a risk to the event or deliverable.
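The duration of the margin Task can be sized as the gap between the baseline duration and the duration that is met with an acceptable probability. The sketch below is one possible approach under stated assumptions: the 80% confidence level, the 120-day baseline, and the triangular distribution standing in for simulation output are all hypothetical.

```python
import math
import random


def schedule_margin(simulated_durations, baseline_duration, confidence=0.80):
    """Days to add so the event is met with the given probability (nearest rank)."""
    ordered = sorted(simulated_durations)
    rank = max(1, math.ceil(confidence * len(ordered)))
    # No margin is needed if the baseline already meets the confidence level.
    return max(0.0, ordered[rank - 1] - baseline_duration)


# Stand-in for Monte Carlo or reference class output: durations to the milestone
# ranging from 110 to 160 days, with 120 days most likely (hypothetical values).
rng = random.Random(7)
samples = [rng.triangular(110, 160, 120) for _ in range(5_000)]

margin_days = schedule_margin(samples, baseline_duration=120, confidence=0.80)
print(f"Schedule margin task duration: {margin_days:.1f} days")
```

The resulting duration would be carried in the IMS as a zero-budget Task placed in front of the milestone it protects.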
The Integrated Master Schedule (or Product Roadmap and Release Plan), with margin to protect against the impact of aleatory uncertainty, represents the most likely and realistic risk‒based plan to deliver the needed capabilities of the program.