REG NASA-LLIS-1799-2007 Lessons Learned Erroneous Onboard Status Reporting Disabled IMAGE-s Radio.pdf


Provided by IHS. Not for Resale. No reproduction or networking permitted without license from IHS.

Lessons Learned Entry: 1799

Lesson Info:
- Lesson Number: 1799
- Lesson Date: 2007-07-10
- Submitting Organization: JPL
- Submitted by: David Oberhettinger
- POC Name: Michael Prior (IMAGE FRB Chair), Richard J. Burley (GSFC Flight Director for IMAGE)
- POC Email: mprior@pop500.gsfc.nasa.gov, rburley@pop600.gsfc.nasa.gov
- POC Phone: 301-286-1418 (M. Prior), 301-286-2864 (R. Burley)

Subject: Erroneous Onboard Status Reporting Disabled IMAGE's Radio

Abstract: The loss of the IMAGE satellite was attributed to a Single Event Upset-induced “instant trip” of the Solid State Power Controller (SSPC) that supplies power to the single-string Transponder. The circuit breaker was not reset because this hybrid device incorrectly reported the circuit breaker as closed, and ground could not command a reset because the satellite's single telemetry receiver had been disabled by the SSPC. The SSPC's problematic state reporting characteristic was an intentional design feature that was not reflected in any part documentation, and three similar “instant trips” on other NASA satellites had not been reported in the GIDEP system. Consider hardwiring receiver power to the power bus, or build redundancy into the power switching or into the operational status sensing. Ensure that GIDEP reports or NASA Alerts are written and routed to mission operations (as well as to hardware developers), and that flight software responds to command loss with a set of timed spacecraft-level fault responses.

Description of Driving Event: The NASA Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) spacecraft became non-responsive to ground commands in December 2005, after almost 6 years of successful on-orbit operation. Designed for a two-year mission, IMAGE was the first satellite dedicated to imaging the Earth's magnetosphere. The only likely cause of the IMAGE failure is a Single Event Upset (SEU) induced “instant trip” (i.e., from a short duration, high current transient) of the Solid State Power Controller (SSPC) that supplies power to the single-string Transponder (Reference (1)). Because the SSPC device that powers the satellite Transponder (receiver/transmitter) also performs a circuit breaker function, the instant trip severed both uplink and downlink communications.

An SSPC trip should have been reported on its status telemetry lines, which are continuously monitored by onboard Error Detection and Correction (EDAC) logic in the Power Distribution Unit (PDU). This allows the PDU EDAC to command the SSPC to close, reapplying power to the Transponder. However, due to a design oversight in
the device, instant trip events are not reported in the status telemetry lines (see Figure 1). This allowed the circuit breaker to be in an open state while still reporting a closed state. The result is that the Transponder remains OFF because the EDAC logic detects the SSPC to still be closed (due to the erroneous status line indication).

[Figure 1 image: a simplified schematic of the Solid State Power Controller (SSPC), in which three differently shaded blocks distinguish three sections of the schematic.]

Figure 1. Simplified schematic of the circuit breaker (SSPC). The circuit breaker protects itself against a current spike (e.g., caused by shorting or an SEU) by means of a trip function that directly turns the MOSFET (1) off. Because the design improperly allows the trip function to sidestep the status line (2), fault detection logic mistook the breaker as ON and would
not force a reset.

The SSPC hybrid device could be susceptible to radiation-induced upsets, depending on the year of manufacture, and the ones used on this cost-constrained project were not screened. In September 2001 (well after IMAGE was launched) it was learned that the lack of proper status reporting following instant trip events was actually inherent in the part's design, but had not been reflected in any SSPC part documentation provided to SSPC users (Reference (2)). This prevented the PDU EDAC designers from incorporating a logic design that could compensate for this device characteristic. (By design, the PDU software is not patchable in flight.)

With the inability of EDAC to detect and reset the tripped breaker, secondary failure recovery measures that would have saved the satellite were not available:
1. The loss of uplink triggered an automatic “warm reboot” of the flight computer, but a warm boot does not reset the breaker. Had the designers of the onboard fault recovery logic provided for a complete reset (“cold boot”) of the computer after several unsuccessful restart attempts, the circuit breaker would have been commanded ON.
2. Manual ground commanding of a reset was not feasible because the satellite's receiver was already disabled, and IMAGE lacked a redundant Transponder.

Prior to the IMAGE mishap, three on-orbit SEU-induced instant trips of this SSPC part series from the same manufacturer occurred aboard the Earth Observing 1 (EO-1) and Wilkinson Microwave Anisotropy Probe (WMAP) satellites. Recovery by means of ground-commanded resets was successful because the EO-1 and WMAP configurations did not permit an SEU-disabled breaker to cut power to the command receiver. None of these three anomalies resulted in issuance of a GIDEP report or a NASA Alert.

Reference(s):
1. IMAGE Failure Review Board Final Report, Flight Programs [...]

[...] distribution to operations personnel would allow early consideration of possible mitigation action.
4. Protection against the loss of ground commanding was needed at the flight system level. A system-level command loss fault response from the flight software in IMAGE's main computer that reached to the root level (i.e., including the SSPC) would have reset the telecom power supply SSPC as well as reasserting attitude control and other subsystem and system functions. “Command loss fault protection” typically resets its timer periodically over the course of normal operations; after the flight software loses ground
commanding (for whatever reason), the fault protection will (after a period of days or months) cycle through different system configurations and options until contact is restored.

Recommendation(s):
1. Recognizing that loss of the ability to command the vehicle is a total loss of mission, consider
a design rule that would mandate permanently connecting (hardwiring) receiver power to the power bus, with only the transmitter switched.
2. If a switched receiver design is chosen, provide additional redundancy to prevent the switching device itself from becoming a single failure point in the design. Build redundancy into the receiver's power switching or into the sensed operational status so that on-board EDAC logic can correct an inadvertent mis-configuration.
3. Generate appropriate GIDEP reports or NASA Alerts as a standard element in all investigations of mission operational anomalies. Route these part anomaly alerts to operations personnel on those flight projects that use the part so that workarounds, if feasible, can be implemented.
4. Assure that the flight software has a fault protection feature, such as a set of timed spacecraft-level command loss fault responses, that safeguards against command loss by triggering a system interrupt, return to a prior state, or restart. Assure that the set of fault responses addresses not just the command processor, but all components that may need to be reset.

Evidence of Recurrence Control Effectiveness: JPL has referenced this lesson as
additional rationale and guidance supporting Paragraph 6.2.1 (“Telecommunication Design”), Paragraphs 7.5.4 and 7.5.5 (“Electronic Parts Reliability, Application, and Acquisition”), and Paragraph 7.6.2 (“Problem Reporting”) in the Jet Propulsion Laboratory standard “Flight Project Practices, Rev. 6,” JPL DocID 58032, March 6, 2006, and supporting Paragraph 4.3.2.3 (“Power/Pyrotechnics Design: Power Distribution Load Shedding Architecture”), Paragraph 4.4.4.4 (“Information System Design: Commanding and Sequencing Assured commanding”), Paragraph 4.9.2.2 (“System Fault Protection Design: Fault Protection Response Flight System Safing”), Paragraph 4.9.3.4 (“System Fault Protection Design: Flight-Ground Interface Spacecraft State Information”), Paragraph 4.9.4.2 (“System Fault Protection Design: Fault Detection Deviation from Expected Behavior”), Paragraph 4.11.2.1 (“Flight Software System Design: Initialization Nominal Flight Software Initialization”), Paragraph 4.11.4.7 (“Flight Software System Design: Design Robustness Use of Time-Outs”), Paragraph 4.12.1.8 (“Flight Electronics Hardware System Design: Use of Fuses Fuse Utilization”), and Paragraph 4.12.7.3 (“Flight Electronics Hardware System Design: Power-on Reset Design Power-On Reset Circuits”) in the JPL standard “Design, Verification/Validation and Operations Principles for Flight Systems (Design Principles),” JPL Document D-17868, Rev. 3, December 11, 2006.

Documents Related to Lesson:
1. IMAGE Failure Review Board Final Report
2. EO-1 Anomaly Resolution Report for ACE Anomaly of 9-14-01

Mission Directorate(s):
- Space Operations
- Science
- Exploration Systems

Additional Key Phrase(s):
- Program Management.Communications between different offices and contractor personnel
- Program Management.Cross Agency coordination
- Engineering Design (Phase C/D).
- Engineering Design (Phase C/D).Power
- Mission Operations and Ground Support Systems.Mission control Planning
- Safety and Mission Assurance.Product Assurance
- Additional Categories.Communication Systems
- Additional Categories.Computers
- Additional Categories.Flight Equipment
- Additional Categories.Flight Operations
- Additional Categories.Hardware
- Additional Categories.Information Technology/Systems
- Additional Categories.Mishap Reporting
- Additional Categories.Parts, Materials, & Processes
- Additional Categories.Safety & Mission Assurance
- Additional Categories.Software
- Additional Categories.Spacecraft

Additional Info:
- Project: IMAGE

Approval Info:
- Approval Date: 2007-08-31
- Approval Name: ghenderson
- Approval Organization: HQ
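
Illustrative sketch: the status-line fault described in this lesson, and the timed command-loss response named in Recommendation 4, can be shown with a small Python toy model. It is a sketch under stated assumptions, not the IMAGE flight logic. The names `Sspc`, `edac_check`, `CommandLossTimer`, and `simulate`, the one-day step, and the three-day timeout are all hypothetical and chosen only for illustration. The model assumes that recovery driven by the status line acts only when the status line reports "open", and that a command-loss timer re-commands the breaker ON regardless of the reported status.

```python
from dataclasses import dataclass

DAY_S = 86_400  # seconds per simulated day (illustrative step size)


@dataclass
class Sspc:
    """Toy solid state power controller (hypothetical, not the flight part)."""
    switch_closed: bool = True   # actual switch state: True means the load is powered
    status_closed: bool = True   # what the status telemetry line reports

    def overcurrent_trip(self) -> None:
        # Ordinary trip: the switch opens and the status line follows it.
        self.switch_closed = False
        self.status_closed = False

    def instant_trip(self) -> None:
        # Instant trip as described in the lesson: the switch opens,
        # but the status line is bypassed and keeps reporting "closed".
        self.switch_closed = False

    def command_on(self) -> None:
        self.switch_closed = True
        self.status_closed = True


def edac_check(sspc: Sspc) -> bool:
    """Status-driven recovery: re-close only if the status line reports open.

    Returns True when a reset command was issued.
    """
    if not sspc.status_closed:
        sspc.command_on()
        return True
    return False


@dataclass
class CommandLossTimer:
    """Counts time since the last ground command (assumed design).

    On expiry it commands the SSPC ON without consulting the status line.
    """
    timeout_s: int
    elapsed_s: int = 0

    def command_received(self) -> None:
        self.elapsed_s = 0

    def tick(self, dt_s: int, sspc: Sspc) -> bool:
        self.elapsed_s += dt_s
        if self.elapsed_s >= self.timeout_s:
            sspc.command_on()
            self.elapsed_s = 0
            return True
        return False


def simulate(use_timer: bool, days: int = 4) -> bool:
    """Instant-trip the receiver's SSPC, then run `days` without ground contact.

    Returns True if the receiver is powered at the end.
    """
    sspc = Sspc()
    timer = CommandLossTimer(timeout_s=3 * DAY_S)
    sspc.instant_trip()
    for _ in range(days):
        edac_check(sspc)
        if use_timer:
            timer.tick(DAY_S, sspc)
    return sspc.switch_closed


if __name__ == "__main__":
    print("status-driven recovery only:", simulate(use_timer=False))
    print("with command-loss timer:", simulate(use_timer=True))
```

In this model the status-driven check recovers an ordinary trip but never acts after an instant trip, because the reported state never changes. The timer restores power because it does not depend on the reported state, which is the design point behind a fault response that reaches every component that may need to be reset.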
