Every flight test professional is familiar with the following graphic:
The Air Force, and other test organizations, use the simple chart above to help characterize flight test safety risk. It allows risk managers to assemble disparate failure modes of various systems and subsystems into a digestible and quantifiable risk metric. The axes are defined by the following "legends":
The Air Force uses this system to gauge safety risk, but the method applies equally well to other kinds of risk. Processes such as Failure Mode and Effects Analysis and System-Theoretic Process Analysis can be used to populate the severity and probability axes. The chart is simple, and intentionally so, to give decision makers a high-level qualitative view of test risk. That simplicity, however, can mask underlying variance and an incomplete understanding of the system. Too often the values that populate a risk assessment are drawn from the center of the bell curve and ignore the shape of the distribution.
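To make the two-axis lookup concrete, here is a minimal sketch of how such a matrix can be encoded. The category names echo MIL-STD-882-style severity and probability levels, but the specific level assignments in the table are hypothetical placeholders, not any organization's official chart.

```python
# Illustrative only: a minimal severity x probability lookup in the spirit of a
# risk matrix. The level assignments below are hypothetical placeholders.

SEVERITY = ["Negligible", "Marginal", "Critical", "Catastrophic"]
PROBABILITY = ["Improbable", "Remote", "Occasional", "Probable", "Frequent"]

# Rows indexed by probability, columns by severity (hypothetical assignments).
MATRIX = [
    # Negligible, Marginal,  Critical,  Catastrophic
    ["Low",      "Low",     "Medium",  "Medium"],    # Improbable
    ["Low",      "Low",     "Medium",  "Serious"],   # Remote
    ["Low",      "Medium",  "Serious", "High"],      # Occasional
    ["Medium",   "Serious", "High",    "High"],      # Probable
    ["Medium",   "Serious", "High",    "High"],      # Frequent
]

def risk_level(severity: str, probability: str) -> str:
    """Map a (severity, probability) pair to a qualitative risk level."""
    return MATRIX[PROBABILITY.index(probability)][SEVERITY.index(severity)]

print(risk_level("Critical", "Remote"))  # -> "Medium"
```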
A third axis is required to safely quantify flight test risk: confidence. Confidence, or the lack thereof, clouds everything in flight test, particularly when flying immature systems. With limited flight or bench hours, and relatively low pilot exposure, how sure can you be of those failure rates or modes? How well do you really know what you think you know, and how does that affect your risk assessment? Failure rates determined under lab conditions can be illustrative, but understanding how they translate to a system under operational flight conditions is imperative for assessing risk. Statistically rigorous component and system testing is vital, but not necessarily exhaustive. Applying a demonstrated failure rate or consequence derived from analysis or a lab environment to a risk mitigation strategy can lead to underappreciated flight test risk, and potentially to mishaps. Understanding how those determinations were made, and how confident we are that the data correlate to the total system in realistic environments, is paramount to an accurate risk assessment. How sure are you that a 10^-9 event is really 10^-9 in the real world?
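As a rough illustration of why confidence belongs on the chart, consider what limited test exposure can actually support statistically. The sketch below, assuming a constant-failure-rate (Poisson) model, computes a one-sided upper confidence bound on a failure rate from observed failures and accumulated test hours; the hours and failure counts are made-up examples, not real program data.

```python
# A minimal sketch: upper confidence bound on a failure rate under a
# constant-failure-rate (Poisson/exponential) assumption. The test hours and
# failure counts below are hypothetical examples.
from scipy.stats import chi2

def failure_rate_upper_bound(failures: int, hours: float, confidence: float = 0.95) -> float:
    """One-sided upper bound on failures per hour (chi-squared method)."""
    return chi2.ppf(confidence, 2 * (failures + 1)) / (2.0 * hours)

# Zero failures in 500 bench/flight hours still only supports roughly 6e-3
# failures per hour at 95% confidence -- many orders of magnitude from 10^-9.
print(failure_rate_upper_bound(failures=0, hours=500))
# Bounding the rate at ~1e-9 with the same confidence would take on the order
# of billions of failure-free hours.
print(failure_rate_upper_bound(failures=0, hours=3.0e9))
```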
Moreover, a vital component of risk assessment confidence is the human system. While test pilots are trained professionals who should be intimately familiar with their systems, I'll be the first to admit that we are not perfect. Variability in human reactions to a failure, including recognition time, the time to execute emergency procedures (and whether those procedures are executed correctly), and the availability of chase or control room support, will directly affect the outcome of a situation. Human performance is, after all, the reason we engage control rooms and develop scripted risk mitigation or response procedures: we attempt to transfer as much of the rapid-reaction decision making as possible to those at ground speed zero. Simulator time is invaluable for mitigating human system risk and should be an integral component of test buildup. That said, the consequences across the spectrum of human reactions must be accounted for to produce a valid risk assessment. Assessing the risk of hardware and software is fairly deterministic; assessing the human operating that hardware and software is less so. This pillar of risk assessment is particularly important when transitioning from the controlled test environment to operational implementation. Issues that were dutifully corrected by experienced test pilots may prove catastrophic to inexperienced aircrew. A systems-level hazard analysis should include impacts of the human system, but the range of behaviors is not always adequately addressed.
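One way to fold that variability into an assessment is to treat aircrew response as a distribution rather than a single number. The sketch below is a crude Monte Carlo with made-up recognition and procedure-execution time distributions; it estimates how often total response time exceeds the time available before a hypothetical unrecoverable condition. None of the parameters are measured human-factors data.

```python
# A crude Monte Carlo sketch: treat aircrew response as a distribution rather
# than a point value. All distribution parameters and the time available are
# hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical lognormal times (seconds): recognizing the failure, then
# executing the critical emergency procedure steps.
recognition = rng.lognormal(mean=np.log(2.0), sigma=0.5, size=N)
execution = rng.lognormal(mean=np.log(4.0), sigma=0.6, size=N)

time_available = 10.0  # hypothetical seconds before the situation is unrecoverable

exceed = (recognition + execution) > time_available
print(f"P(response too slow) ~= {exceed.mean():.3f}")
```

Even with optimistic median times, the tail of the combined distribution is what drives the hazard, which is exactly what a single "expected reaction time" hides.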
Flight test risk management processes have been honed over decades, but they are only as good as the assumptions used to populate them. Without understanding the quality of the data, a risk assessment is a general feeling at best. Perhaps more component- or system-level testing is required to characterize failure modes more accurately, or more simulator time to gauge human reaction. This is no time to be optimistic; realism and pessimism can save lives. Bear in mind that low risk is not always achievable; it is up to the program to decide how much risk to accept. Low statistical confidence almost always means higher risk. Decision makers need to ask of the test team: How well do you really know what you know? And how prepared are you to handle the unknown event?