NASA Accident Precursor Analysis Handbook

National Aeronautics and Space Administration
Office of Safety and Mission Assurance

NASA/SP-2011-3423

Provided by IHS. Not for Resale. No reproduction or networking permitted without license from IHS.

NASA STI Program in Profile

Since its founding, NASA has been dedicated to the advancement of aeronautics and space science. The NASA scientific and technical information (STI) program plays a key part in helping NASA maintain this important role.

The NASA STI program operates under the auspices of the Agency Chief Information Officer. It collects, organizes, provides for archiving, and disseminates NASA's STI. The NASA STI program provides access to the NASA Aeronautics and Space Database and its public interface, the NASA Technical Report Server, thus providing one of the largest collections of aeronautical and space science STI in the world. Results are published in both non-NASA channels and by NASA in the NASA STI Report Series, which includes the following report types:

Technical Publication: Reports of completed research or a major significant phase of research that present the results of NASA Programs and include extensive data or theoretical analysis. Includes compilations of significant scientific and technical data and information deemed to be of continuing reference value. NASA counterpart of peer-reviewed formal professional papers but has less stringent limitations on manuscript length and extent of graphic presentations.

Technical Memorandum: Scientific and technical findings that are preliminary or of specialized interest, e.g., quick release reports, working papers, and bibliographies that contain minimal annotation. Does not contain extensive analysis.

Contractor Report: Scientific and technical findings by NASA-sponsored contractors and grantees.

Conference Publication: Collected papers from scientific and technical conferences, symposia, seminars, or other meetings sponsored or co-sponsored by NASA.

Special Publication: Scientific, technical, or historical information from NASA programs, projects, and missions, often concerned with subjects having substantial public interest.

Technical Translation: English-language translations of foreign scientific and technical material pertinent to NASA's mission.

Specialized services also include creating custom thesauri, building customized databases, and organizing and publishing research results.

For more information about the NASA STI program, see the following:

Access the NASA STI program home page at http://www.sti.nasa.gov
E-mail your question via the Internet to help@sti.nasa.gov
Fax your question to the NASA STI Help
Desk at 443-757-5803
Phone the NASA STI Help Desk at 443-757-5802
Write to:
NASA STI Help Desk
NASA Center for AeroSpace Information
7115 Standard Drive
Hanover, MD 21076-1320

NASA/SP-2011-3423

NASA ACCIDENT PRECURSOR ANALYSIS HANDBOOK
Version 1.0

National Aeronautics and Space Administration
Office of Safety and Mission Assurance
Washington, D.C. 20546

December 2011

To request print or electronic copies or provide comments, contact the Office of Safety and Mission Assurance. Electronic copies are also available from

NASA Center for AeroSpace Information
7115 Standard Drive
Hanover, MD 21076-1320
http://ntrs.nasa.gov

Table of Contents

Acknowledgments
1 Introduction
1.1 Background
1.2 Summary of Accident Precursor Analysis
1.3 History of NASA's Precursor Program
1.4 Handbook Overview
2 Accident Precursor Analysis Overview
2.1 The Accident Precursor Concept
2.2 Accident Precursor Analysis and its Role in System Safety
2.3 NASA Accident Precursor Analysis Process Overview
3 Accident Precursor Analysis Process Steps
3.1 Building a Caseload
3.1.1 Data Sources of Interest
3.1.2 Choosing Accident Precursor Analysis Relevant Data
3.1.3 Timetable for Evaluation of Anomaly Data
3.1.4 Anomaly Screening Methods
3.2 Anomaly Failure Mechanism Identification

The MSFC team led by Rob Ring (BTI); The ISS precursor team led by Alicia Carrier (SAIC); The ARC Mission Assurance Systems team including Irene Tollinger, Christian Ratterman, Don Kalar, and
Alex Eiser; Clay Smith (APL)

1 Introduction

1.1 Background

Catastrophic accidents are usually preceded by precursory events that, although observable, are not recognized as harbingers of a tragedy until after the fact. In the nuclear industry, the Three Mile Island accident was preceded by at least two events portending the potential for severe consequences from an underappreciated causal mechanism [1]. Anomalies whose failure mechanisms were integral to the losses of Space Transportation Systems (STS) Challenger and Columbia had been occurring within the STS fleet prior to those accidents. Both the Rogers Commission Report [2] and the Columbia Accident Investigation Board report [3] found that processes in place at the time did not respond to the prior anomalies in a way that shed light on their true risk implications. This includes the concern that, in the words of the NASA Aerospace Safety Advisory Panel (ASAP) [4], “no process addresses the need to update a hazard analysis when anomalies occur.” At a broader level, the ASAP noted in 2007 [5] that NASA “could better gauge the likelihood of losses by developing leading indicators, rather than continue to depend on lagging indicators”.

These observations suggest a need to revalidate prior assumptions and conclusions of existing safety (and reliability) analyses, as well as to consider the potential for previously unrecognized accident scenarios, when unexpected or otherwise undesired behaviors of the system are observed. This need is also discussed in NASA's system safety handbook [6], which advocates a view of safety assurance as driving a program to take steps that are necessary to establish and maintain a valid and credible argument for the safety of its missions. It is the premise of this handbook that making cases for safety more experience-based allows NASA to be better informed about the safety performance of its systems, and will ultimately help it to manage safety in a more effective manner.

1.2 Summary of Accident Precursor Analysis

The APA process described in this handbook provides a systematic means of analyzing candidate accident precursors by evaluating anomaly occurrences for their system safety implications and, through both analytical and deliberative methods used to project to other circumstances, identifying those that portend more serious consequences to come if effective corrective action is not taken. APA builds upon existing safety analysis processes currently in practice within NASA, leveraging their results to provide an improved understanding of overall system risk. As such, APA represents an important dimension of safety evaluation; as operational experience is acquired, precursor information is generated such that it can be fed back into system safety analyses to risk-inform safety
improvements. Importantly, APA uses anomaly data to predict risk, whereas standard reliability and probabilistic risk assessment (PRA) approaches rely on failure data, which are often scarce because failures are rare.

The purpose of the APA process is to identify and characterize potential sources of safety risk for which indications are received in the form of anomalous events which, although not necessarily presenting an immediate safety impact, may indicate that an unknown or insufficiently understood potential risk-significant condition
exists in the system. Such anomalous events are considered to be potential accident precursors because they signal the potential for more severe consequences that may occur in the future, due to failure mechanisms that are discernible from their occurrence today. Their early identification allows them to be fully scrutinized and the results to be used to inform decisions relating to safety.

Starting from the anomalous event that was actually observed, the NASA process adds an “imaginative” element: a structured brainstorming session is used to identify similar anomalous conditions that could have more severe consequences than the observed event. In the context of NASA systems, the term severe consequences typically refers to loss of crew (LOC), loss of vehicle (LOV), loss of mission (LOM), or loss of science (LOS). It is up to the particular program employing the approach to define severe consequences appropriate to its objectives and apply the technical approach accordingly.

The APA process presented in this document has been applied to Earth-to-orbit transportation systems and crewed orbital science platforms, although the fundamental process steps are valid for other mission classes (e.g., crewed and uncrewed orbital platforms, crewed lunar and planetary outposts, deep-space robotic missions, and other human space exploration missions), and may be tailored to the specific needs of each class. Programs at NASA that have benefited from the APA process presented in this document include the Space Shuttle and the ISS. In addition, NASA is continuing to exercise a robust terrestrial and solar system science agenda based on satellites and robotic missions that could benefit from a systematic APA process. In this case, an accident precursor process could provide valuable information to guide the design of future scientific missions as well as indicate when corrective actions are required during the mission to preclude potential mission-ending failures. Finally, APA plays an important role in extending NASA's anomaly management process to provide additional
screening and assessment of anomalies for their risk significance.

1.3 History of NASA's Precursor Program

In February 2007 the NASA Office of Safety and Mission Assurance (OSMA) hosted a “Precursor Analysis Working Group Kick-off Meeting” to discuss the development of an Accident Precursor Analysis (APA) process at NASA. Shortly after, an APA team was formed with the intention of utilizing the U.S. Nuclear Regulatory Commission's (NRC) Accident Sequence Precursor (ASP) process [7] as a point of departure for the development of a NASA-specific process, augmented as necessary based on fundamental
differences in the nature of the two organizations. In particular, the process presented in this document makes use of NASA's data-rich environment and is tailored to the high-performance space systems that the agency designs and operates. A first version of an APA approach tailored to NASA's needs, derived from the NRC ASP process and contributions by Dr. Bill Vesely [8, 9], was completed in 2008 [10].

The approach was tested and refined based on a number of preliminary and on-site pilot exercises. First, using an early draft of the process, a retrospective APA assessment was conducted on the significant Thermal Protection System (TPS) damage and the major External Tank (ET) foam loss incidents that occurred prior to Columbia that were identified by the Columbia Accident Investigation Board [3].
Second, a number of APA working sessions were conducted at the Johnson Space Center (JSC) to serve as pilot applications, in collaboration with the Space Shuttle and International Space Station programs [11]. Following those pilot exercises, both programs independently conducted precursor exercises. This handbook captures the experiences and lessons learned from the above activities.

1.4 Handbook Overview

Section 2 - Accident Precursor Analysis Overview, presents a summary background and overview of the NASA APA process. This section outlines the sequence of steps involved in screening, generalization, grading, risk modeling, and reporting of findings. It presents the technical and risk management rationale behind the approach, and the benefit that APA brings to risk management.

Section 3 - Accident Precursor Analysis Process Steps, details the sequence of tasks required to conduct a full APA cycle. It is divided into the following sub-sections:

3.1 - Building a Caseload, addresses the collection of anomaly source data. This section touches on the use of existing problem reporting data sources, the use of multiple data sources, and the timing of caseload assembly with respect to the initial
reporting of the anomalies and subsequent investigatory activities. It also addresses the use of screening methods to filter out anomalies with little or no potential for more severe consequences.

3.2 - Anomaly Failure Mechanism Identification

this feedback mechanism allows the real-world behavior of the system to be reflected back into the risk and safety analyses of the system. In this way APA uses examples of off-nominal behavior in a proactive rather than a reactive fashion. The off-nominal event is not simply resolved so that operation can continue; it is analyzed and used strategically by gleaning information from it to help understand and control risk for the future. As test and operational experience accumulates, the APA process helps to support a convergence between the system's assessed risk and its actual as-operated risk. In the absence of an APA process, convergence between a risk model and the occurring events and phenomena of the system modeled may occur in response to system failure, which for NASA systems is all too often catastrophic.

1 “In a timely manner” is a matter to be determined by the organization overseeing the system (as will be discussed in subsequent sections) but basically is defined by the end result, which is the avoidance of an accident due to a recurring failure mechanism.

In order for NASA to conclude that a system is sufficiently safe, the information that demonstrates the system's ability to meet those levels of safety must be documented. This will consist of a consolidated set of technical and programmatic activities and standards that define and implement safety processes and requirements, and record operational performance, and system and o