Lessons Learned Entry: 1764

Lesson Info:
- Lesson Number: 1764
- Lesson Date: 2006-07-27
- Submitting Organization: JPL
- Submitted by: David Oberhettinger
- POC Name: Wilson Harkins
- POC Email: wilson.b.harkins@nasa.gov
- POC Phone: 202-358-0584

Subject: Critical Facilities Maintenance Assessment

Abstract: CFMA is an ongoing NASA activity that was initiated following the March 2000 HESSI spacecraft overtest incident that severely damaged the spacecraft. CFMA is a comprehensive assessment of NASA critical facilities and equipment to identify inadequacies in ground facility readiness that
could harm people or NASA hardware. It involves an inventory of critical facilities and equipment, identification of equipment failure modes, establishment of appropriate reliability centered maintenance (RCM) methods, and related activities. This lesson captures a NASA Preferred Practice that was drafted but did not complete a NASA-wide review cycle.

Description of Driving Event: This lesson learned documents a NASA Preferred Practice for Design and Test that was drafted shortly before the cancellation of the Preferred Practice task. Hence, the draft was not subjected to review by the NASA field centers and should not be viewed as a formally accepted NASA-wide practice.

Practice

Prepare and implement an institutional plan for the comprehensive assessment of NASA critical facilities and equipment that includes:
1. An inventory of critical facilities and equipment.
2. Comprehensive assessment of failure modes for critical equipment.
3. Establishment of appropriate Reliability Centered Maintenance (RCM) methods.
4. Acquisition of necessary Predictive Testing and Inspection (PT&I) equipment.
5. Implementation of RCM using a Computerized Maintenance Management System (CMMS) and appropriate performance metrics.
6. Center-wide training in RCM and Critical Facilities Maintenance Assessment procedures.

Provided by IHS. Not for resale. No reproduction or networking permitted without license from IHS.

Benefit

Critical Facilities Maintenance Assessment (CFMA) was first implemented by NASA
following the March 2000 High Energy Solar Spectroscopic Imager (HESSI) spacecraft overtest incident. Inadequate maintenance of the test equipment was one of the principal causes of the major structural damage sustained by HESSI during this JPL test. Other incidents have since occurred at NASA centers, such as contamination of an assembly facility that housed a spacecraft, caused by rainwater from a leaky roof. CFMA identifies inadequacies in ground facility readiness that could affect the safety of the public, the NASA workforce, flight hardware, and other critical equipment and property.

Implementation Method

To simulate the extreme environments of space, test equipment may subject spaceflight hardware to stress levels that approach the test article's limits. Because testing often involves repeated test-analyze-and-fix cycles with few or no flight spares, it is essential that testing be conducted with high reliability and fidelity. The potential cost and schedule impact of a test failure or facility failure increases disproportionately with proximity to the system's launch date. The March 2000 HESSI spacecraft overtest incident alerted NASA to the substantial risks of utilizing aging industrial facilities for the development of high value, one-of-a-kind products. The assessment and management of these programmatic risks, as well as safety risks, require data on the facilities' condition, characteristic failure modes, and maintenance practices. Hence, NASA initiated a review of all critical ground facilities at each of the NASA centers shortly after the HESSI mishap. A comprehensive program was begun to define RCM gaps, complete a critical facility inventory, perform failure assessments, define and document maintenance procedures, and implement a CMMS. This process is depicted in Figure 1.
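
As an illustration of the inventory step named in the program above, the following is a minimal Python sketch of how a center might record and rank inventoried items. It is not a NASA tool or a prescribed method. The `InventoryItem` fields, the 1 to 5 `criticality` score, the `rank_inventory` function, and the example equipment names are all hypothetical. The sketch assumes a simple rule: technical equipment is ranked separately from general-purpose equipment, and equipment that processes flight hardware ranks ahead of equipment that does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItem:
    """One entry in a hypothetical critical-facility inventory."""
    name: str
    technical: bool                  # True for flight-system development or operations equipment
    processes_flight_hardware: bool  # True if flight hardware passes through this equipment
    criticality: int                 # hypothetical score, 1 (low) to 5 (high)


def rank_inventory(items):
    """Return (technical, general) lists, each ordered from most to least critical."""
    def sort_key(item):
        # Flight-hardware equipment first, then higher score, then name for a stable order.
        return (not item.processes_flight_hardware, -item.criticality, item.name)

    technical = sorted((i for i in items if i.technical), key=sort_key)
    general = sorted((i for i in items if not i.technical), key=sort_key)
    return technical, general


# Illustrative entries only; names and scores are invented for this sketch.
inventory = [
    InventoryItem("Building HVAC", technical=False, processes_flight_hardware=False, criticality=3),
    InventoryItem("Vibration table", technical=True, processes_flight_hardware=True, criticality=4),
    InventoryItem("Space simulator chamber", technical=True, processes_flight_hardware=True, criticality=5),
    InventoryItem("Calibration bench", technical=True, processes_flight_hardware=False, criticality=5),
]

technical, general = rank_inventory(inventory)
for position, item in enumerate(technical, start=1):
    print(position, item.name)
```

In this sketch the calibration bench ranks below both items that handle flight hardware despite its high score, because the assumed rule gives flight-hardware equipment precedence. A real assessment would set its own ranking criteria.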

NASA Policy Directive NPD 8831.1, Maintenance of Institutional and Program Facilities and Related Equipment (Reference (1)), was revised to require self-assessments of facilities maintenance programs and utilization of accepted standards as a guideline for determining facilities maintenance funding requirements.

Figure 1 is a color flow chart of the CFMA process. The first block is labeled “Business Model,” within which an “Identify Critical Systems” box flows into a “Determine Key Performance Indicators” box. The first block flows into an “FMEA & RCM” block, which contains a single “Generate RCM Based Maintenance Strategy” box. This second block flows into a “CMMS” block, which contains a single “Implement Strategy” box. This third block flows into the last block, “Reliability Engineering,” containing a single “Reliability Based Analysis” box. There is a feedback loop from the last block flowing back to the “FMEA & RCM” block, with the feedback loop labeled “Continuous Improvement.”

Figure 1. NASA Vision for Asset Management

CFMA is conducted on NASA facilities only. However, given the opportunities for damage to NASA flight systems in facilities owned by system contractors, contractor proposals and operations should be reviewed for the maturity of their maintenance practices. For on-site contractors, opportunities for collaborative use of maintenance resources (e.g., sharing FMEA data
) should be pursued.

Facility Inventory and Audit:

CFMA should be conducted by a working group composed of maintenance specialists competent to review the health of the center's current maintenance programs and identify any needed improvements. The CFMA Working Group (CWG) should also include those individuals responsible for key facilities such as major integration and test laboratories, with support by reliability engineers and other specialists as needed.

The first task for the CWG is to complete a comprehensive inventory of equipment and facilities that are critical to the mission of the center. Critical equipment may be defined as “equipment, which if not operated or maintained correctly, could endanger the operating personnel or the product being processed.” The inventoried items are then ranked in order of relative criticality. Technical equipment with primary functions related to flight system development and operation is identified and ranked separately from non-technical, general-purpose equipment such as HVAC facilities. The purpose is to separate civil engineering related items with well-understood failure modes and maintenance needs from specialized engineering facilities that are candidates for failure mode analysis and RCM. Technical equipment that processes flight hardware, defined as “product items deliverable to a customer or deliverable to a significantly high-level of integration,” is given a high ranking. High ranking items are given priority for further assessment within the CFMA process, and this prioritization may also impact capital equipment improvement plans and other institutional processes.

Once the critical facilities are identified, they are audited for compliance with maintenance policies and procedures. The following information is elicited from facility operators through interview or survey (i.e., questionnaire) techniques:
1. NASA and center-specific policies and procedures are adequate to provide the necessary level of protection,
and they are currently in use by the facility operator for the inspection, calibration, maintenance, repair, and control of technical equipment. Procedures for “control” include those issued to ensure proper operation, industrial safety, system safety, emergency response, etc.
2. Statutory and regulatory requirements that must be complied with (including safety, environmental, physical security, and information technology security) and NASA and center-specific compliance policies and procedures tailored to these requirements.
3. Practices employed independently by operators to remain current on regulatory requirements.
4. Schedule of third-party ISO audits, internal audits, self-assessments, and other continuing activities in use to monitor that policies and procedures are understood, adhered to, and effective.
5. Formal training provided to center staff on policies, requirements, and procedures affecting facility maintenance and operation. Training provided to staff on maintenance techniques, including RCM.

The audit also includes CWG visits to the facilities to observe activities, examine procedures and records, observe housekeeping conditions, and interview facility managers and
personnel. The results of this audit are evaluated by the CWG to determine whether there is sufficient objective evidence, including maintenance and training records, to verify compliance with maintenance policies and procedures. Deficiencies and remedial plans are documented in an institutional closed-loop corrective action system such as JPL's Corrective Action Notice (CAN)/Preventive Action Notice (PAN) system. The CWG issues a report that documents the facility inventory and audit and summarizes the current maintenance programs used by the Field Center to assure that critical facilities and equipment are kept in a good state of repair.

Failure Mode Assessment for Critical Equipment:

Each Field Center conducts an informal assessment to determine and document the principal latent failure modes and the consequences of failure for the inventoried facilities. Identification of failure modes and failure causes resembles a high-level fault tree, annotated to indicate the potential impact of maintenance. The purpose is to identify the principal failure modes of concern, the likely cause
of failure, the possible types of damage to flight hardware, and the extent to which the failure modes are preventable by maintenance. Hence, test facilities that are intended to apply calibrated levels of stress to flight hardware would be prime candidates for assessment. Table 1 illustrates this summary level of failure mode assessment for such a candidate.

Table 1. Failure Mode Assessment for a Large Space Simulator

Failure mode: Loss of Chamber Vacuum
  Causes:
    Pump or valve failure: preventable by maintenance
    Loss of electrical power: only partially preventable by maintenance
    Broken window or feed-through penetration: not preventable by maintenance
  Possible types of damage to flight hardware:
    Corona discharge
    Contamination
  Protection of flight hardware:
    Vacuum failure alarm system
    Spacecraft power cut-off pressure sensor system
    Emergency power generator

Failure mode: Loss of Chamber Solar Simulation
  Causes:
    Cooling water failure: preventable by maintenance
    Loss of electrical power: only partially preventable by maintenance
    Lamp explosion: not preventable by maintenance
  Possible types of damage to flight hardware:
    Exceed lower temperature limits
  Protection of flight hardware:
    Over/under temperature alarm system
    Spacecraft safing heaters
    Emergency power generator

Failure mode: Loss of Chamber Shroud Temperature Control
  Causes:
    Controller failure: preventable by maintenance
    Loss of electrical power: only partially preventable by maintenance
  Possible types of damage to flight hardware:
    Exceed upper temperature limits
    Exceed lower temperature limits
  Protection of flight hardware:
    Over/under temperature alarm system
    Over-temperature heater power cut-off system
    Spacecraft safing heaters
    Emergency power generator

Particularly for facilities and equipment that have implemented an RCM process, the recent reliability performance of candidate facilities should also be reviewed to identify the facilities and failure modes to be included in the informal assessment. This initial assessment is intended to identify major areas of concern and does not cover all ground support equipment (GSE),
review failure modes in great detail, penetrate to root causes of failure, or assess the statistical likelihood of failure. Such comprehensive failure analysis is the province of RCM, which seeks to optimize the distribution of maintenance resources by application of standard reliability analysis techniques to determine where the greatest benefits may accrue. Over the course of an RCM program, Failure Mode and Effects Analysis (FMEA) will be conducted to identify lower tier equipment failures that could cause catastrophic failure of critical equipment and facilities. In tandem, root cause analysis will explore all possible causes for a postulated machine failure. Because these techniques are labor-intensive and conducted over the life cycle of the facilities, the CWG may propose equipment candidates for FMEA and Root Cause Failure Analysis (RCFA) based on their criticality and failure history.

Establishment of RCM Methods:

NASA policy promotes the use of RCM techniques to assure that maintenance resources are applied cost-effectively and where they can best mitigate mission risk. CFMA evaluates the extent to which RCM and PT&I practices are being incorporated into the development,
improvement, operation, and maintenance of critical facilities and equipment to minimize life-cycle maintenance and repair costs, maintain facilities and equipment at the desired level of reliability and availability, and maximize safety. More information on RCM methodology is available in a related preferred practice (Reference (3)), Preventative Maintenance Strategies Using Reliability Centered Maintenance (RCM). The NASA RCM Guide (Reference (4)) serves as the basis for the NASA RCM assessment process. This guide suggests specific components (see Reference (4), Chapter 3) for benchmarking the maturity of an organization's RCM program within each of the following seven program elements:
1. Maintenance Philosophy
2. Program Organization
3. Performance Measurements and Indicators (Metrics)
4. Proactive Maintenance
5. Predictive Test and Inspection (PT&I) Technologies
6. Preventive Maintenance
7. Training and Personnel Development

The CWG, as part of the CFMA audit but with specialized assistance as required, assesses the current status of the RCM program within each maintenance organization that is responsible for critical facilities. Interview