Lessons Learned Entry: 2049

Lesson Info:
• Lesson Number: 2049
• Lesson Date: 2009-04-21
• Submitting Organization: JPL
• Submitted by: David Oberhettinger
• POC Name: Lorraine M. Fesq; John McDougal (MSFC)
• POC Email: Lorraine.M.Fesq@jpl.nasa.gov; John.M.Mcdougal@nasa.gov
• POC Phone: 818-393-7224 (Fesq); 256-961-7481 (McDougal)

Subject: Improving Fault Management for Spaceflight Missions

Abstract: Fault management subsystems reveal pervasive architecture, design, and verification/validation (V&V) problems both during technical reviews of spaceflight missions and in flight. An industry-wide Spacecraft Fault Management Workshop was held in April 2008 to characterize fault management practices, identify trends, and provide a roadmap for improvements. A final report on the workshop provides 12 sets of recommendations in the areas of requirements definition, design, and test practices for fault management.

Description of Driving Event: Fault management is the capability of a spacecraft system to detect, isolate, and recover from in-flight events that may hinder nominal mission operations. Autonomous fault management (aka “fault protection,” “Fault Detection/Isolation/Recovery,” “safing,” etc.) is especially critical for deep space and planetary missions where the lightspeed communications delay may prevent timely intervention by ground control. However, increasingly challenging science objectives imposed upon deep space missions are taxing the ability of onboard spacecraft resources and control logic to manage in-flight fault events.

Technical reviews of spaceflight missions by NASA and its contractors encounter pervasive fault management architecture, design, and verification/validation (V&V) problems, including:
• Fault management design changes required late in the
life-cycle (that often necessitate secondary changes elsewhere in the system),
• Insufficient project insight into the required system-level fault management testing, and unexpected test results that require resolution,
• Spacecraft operational limitations because restrictions are placed on the
use of untested functions (in compliance with the “fly-as-you-test” principle).

In addition, complex fault management subsystems are subject to in-flight anomalies like those described in References (1) through (6). Fault management requirements definition, design, and test practices used by NASA, the Department of Defense, and government contractors are not consistent or well defined. The terminology, engineering processes, tools, and training for fault management are not standardized. An industry-wide Spacecraft Fault Management Workshop was held in April 2008 to characterize fault management practices, identify trends, and provide a roadmap for improvements. For example, the workshop affirmed the benefits of ingraining fault management into the system architecture instead of the more common practice of attaching the completed fault management code to the flight software. Reference (7) summarizes the findings and recommendations from the workshop.

Provided by IHS. Not for resale. No reproduction or networking permitted without license from IHS.

References:
1. “Autonomous Transfer to Reaction Wheel Control May Lead to Safing Instability,” NASA Lesson Learned No. 2048, NASA Engineering Network, April 14, 2009. http://www.nasa.gov/offices/oce/llis/imported_content/lesson_2048.html
2. “MRO Articulation Keep-Out Zone Anomaly,” NASA Lesson Learned No. 2044, NASA Engineering Network, April 7, 2009. http://www.nasa.gov/offices/oce/llis/imported_content/lesson_2044.html
3. “MRO Spaceflight Computer Side Swap Anomalies Export Version,” NASA Lesson Learned No. 2041, NASA Engineering Network, December 16, 2008. http://www.nasa.gov/offices/oce/llis/imported_content/lesson_2041.html
4. “Mars Global Surveyor (MGS) Spacecraft Loss of Contact,” NASA Lesson Learned No. 1805, NASA Engineering Network, September 4, 2007. http://www.nasa.gov/offices/oce/llis/imported_content/lesson_1805.html
5. “Erroneous Onboard Status Reporting Disabled IMAGE's Radio,” NASA Lesson Learned No. 1799, NASA Engineering Network, July 10, 2007. http://www.nasa.gov/offices/oce/llis/imported_content/lesson_1799.html
6. “Anomalous Flight Conditions May Trigger Common-Mode Failures in Highly Redundant Systems,” NASA Lesson Learned No. 1778, NASA Engineering Network, March 6, 2007. http://www.nasa.gov/offices/oce/llis/imported_content/lesson_1778.html
7. Lorraine M. Fesq, “White Paper Report: Spacecraft Fault Management Workshop Results for the Science Mission Directorate,” Planetary Sciences Division, NASA/Caltech Jet Propulsion Laboratory, March 2009.

Lesson(s) Learned: See “Recommendation(s)”

Recommendation(s):
The complete alt text for this table exceeds system limits. Please contact David Oberhettinger at davido@nasa.gov for the full alt text.

Table 1. Top-level fault management (FM) lessons learned and recommendations from the Spacecraft Fault Management Workshop (Reference (7))

Evidence of Recurrence Control Effectiveness: JPL has referenced this lesson learned as additional rationale and guidance supporting Paragraph 6.14.2.7 (“Engineering Practices: Project and System Level Functional Verification
and Validation - Verification and Validation”) in the Jet Propulsion Laboratory standard “Flight Project Practices, Rev. 7,” JPL DocID 58032, September 30, 2008. In addition, JPL has referenced it supporting Paragraph 4.9 (“Flight System Design: System Fault Protection Design”) in the JPL standard “
Design, Verification/Validation and Operations Principles for Flight Systems (Design Principles),” JPL Document D-17868, Rev. 3, December 11, 2006.

Documents Related to Lesson: N/A

Mission Directorate(s):
• Science
• Space Operations

Additional Key Phrase(s):
• Mission concepts and life-cycle planning
• Review boards
• Engineering design and project processes and standards
• Level II/III requirements definition
• Long term sustainability and maintenance planning
• Mission and systems trade studies
• Mission definition and planning
• Planning of requirements verification processes
• Robotics
• Simulators and Training Systems
• Software Engineering
• Spacecraft and Spacecraft Instruments
• Mission control Planning
• Mission operations systems
• Training and simulation systems
• Early requirements and standards definition
• Reliability
• Review systems and boards
• Flight Equipment
• Flight Operations
• Independent Verification and Validation
• Information Technology/Systems
• Payloads
• Software
• Standard
• Test & Verification

Additional Info:
• Project: various NASA, DoD, and contractor projects

Approval Info:
• Approval Date: 2009-07-10
• Approval Name: mbell
• Approval Organization: HQ
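
Illustrative note: The Description of Driving Event defines fault management as the capability to detect, isolate, and recover from in-flight events. The Python sketch below shows one simple way such a sequence might be structured. It is not taken from the workshop report or from any flight software. Every name in it (Monitor, isolate, recover, run, the "RWA-1" unit labels, the persistence count, and the limits) is a hypothetical assumption chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Monitor:
    """Hypothetical monitor: trips after `persistence` consecutive out-of-limit samples."""
    name: str
    low: float
    high: float
    persistence: int = 3
    _count: int = field(default=0, init=False)

    def detect(self, value: float) -> bool:
        # Detection: count consecutive violations so one noisy sample does not trip.
        if self.low <= value <= self.high:
            self._count = 0
        else:
            self._count += 1
        return self._count >= self.persistence

    def reset(self) -> None:
        # Clear the violation count once a response has been issued.
        self._count = 0


def isolate(monitor_name: str, unit_map: Dict[str, str]) -> Optional[str]:
    # Isolation: map the tripped monitor to the unit it implicates (assumed 1:1 here).
    return unit_map.get(monitor_name)


def recover(unit: Optional[str], spares: Dict[str, List[str]]) -> str:
    # Recovery: swap to a spare if one remains; otherwise fall back to safing.
    if unit is not None and spares.get(unit):
        return f"swap {unit} -> {spares[unit].pop(0)}"
    return "enter safe mode"


def run(samples: List[float], monitor: Monitor,
        unit_map: Dict[str, str], spares: Dict[str, List[str]]) -> List[str]:
    # Process telemetry samples in order and collect the responses issued.
    actions: List[str] = []
    for value in samples:
        if monitor.detect(value):
            actions.append(recover(isolate(monitor.name, unit_map), spares))
            monitor.reset()
    return actions


if __name__ == "__main__":
    wheel = Monitor("wheel_speed", low=0.0, high=100.0, persistence=2)
    print(run([50, 120, 130, 60, 150, 160], wheel,
              {"wheel_speed": "RWA-1"}, {"RWA-1": ["RWA-2"]}))
```

In this sketch the response runs entirely onboard, which reflects the point in the Description that communications delay may prevent timely ground intervention. A real design would involve far more than a single limit check, and the workshop's recommendation to build fault management into the system architecture is not captured by a standalone loop like this one.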