TECHNICAL REPORT

ISA-TR18.2.5-2012
Alarm System Monitoring, Assessment, and Auditing

Approved 26 October 2012

ISA-TR18.2.5-2012, Alarm System Monitoring, Assessment, and Auditing
ISBN: 978-1-937560-60-7

Copyright 2012 by the International Society of Automation. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), without the prior written permission of the publisher.

ISA
67 Alexander Drive
P.O. Box 12277
Research Triangle Park, North Carolina 27709
E-mail: standards@isa.org

Preface

This preface, as well as all footnotes and annexes, is included for information purposes only and is not part of ISA-TR18.2.5-2012. This technical report
has been prepared as part of the service of ISA, the International Society of Automation, toward a goal of helping in the understanding and use of ANSI/ISA-18.2-2009, Management of Alarm Systems for the Process Industries. To be of real value, this document should not be static but should be subject to periodic review. Toward this end, the Society welcomes all comments and criticisms and asks that they be addressed to the Secretary, Standards and Practices Board; ISA, 67 Alexander Drive; P.O. Box 12277; Research Triangle Park, NC 27709; Telephone (919) 549-8411; Fax (919) 549-8288; E-mail: standards@isa.org.

This ISA Standards and Practices Department is aware of the growing need for attention to the metric system of units in general, and the International System of Units (SI) in particular, in the preparation of instrumentation standards, recommended practices, and technical reports. The Department is further aware of the benefits to USA users of ISA standards of incorporating suitable references to the SI (and the metric system) in their business and professional dealings with other countries. Toward this end, this Department will endeavor to introduce SI-acceptable metric units
in all new and revised standards, recommended practices, and technical reports to the greatest extent possible. Standard for Use of the International System of Units (SI): The Modern Metric System, published by the American Society for Testing [...]

[...] systematically compare alarms to the alarm philosophy;
and determine the alarm setpoint, consequence, operator action, priority, and class. Activities include, but are not limited to, identification, justification, prioritization, classification, and documentation.

TR3 Basic Alarm Design - provides guidance on basic alarm design. TR3 focuses on the scope of ANSI/ISA-18.2-2009 Clause 10 and may include other clauses as needed (e.g., operations and maintenance). Basic alarm design covers the selection of alarm attributes (e.g., types, deadbands, and delay times) and may be specific to each control system.

TR4 Enhanced and Advanced Alarm Methods -
provides guidance on advanced and enhanced alarm methods. TR4 focuses on the scope of ANSI/ISA-18.2-2009 Clause 12. Enhanced alarm design covers guidance on additional logic, programming, or modeling used to modify alarm behavior. These methods may include dynamic alarming, state-based alarming, adaptive alarms, logic-based alarming, and predictive alarming, as well as most of the designed suppression methods.

TR5 Alarm Monitoring, Assessment, and Audit - provides guidance on monitoring, assessment, and audit of alarms. TR5 focuses on the scope of ANSI/ISA-18.2-2009 Clauses 16 and 18. Monitoring, assessment, and audit cover the continuous monitoring, periodic performance assessment, and recurring audit of the alarm system.

TR6 Alarm Systems for Batch and Discrete Processes - provides guidance on the application of ANSI/ISA-18.2-2009 alarm life cycle activities to batch and discrete processes, expanding on multiple clauses of ANSI/ISA-18.2-2009.

Each technical report is written to be a standalone document. In an effort to minimize repetition, the technical reports have cross references. The guidance as presented in this document is general in nature, and should be applied to each system as appropriate by personnel knowledgeable in the manufacturing process and control systems to which it is being applied.

Introduction

ANSI/ISA-18.2-2009 gives requirements that address alarm systems for facilities in the process
industries to improve safety, quality, and productivity. The general principles and processes in ANSI/ISA-18.2-2009 are intended for use in the lifecycle management of an alarm system based on programmable electronic controller and computer-based human-machine interface (HMI) technology. These requirements are presented in the standard using the alarm management lifecycle shown in ANSI/ISA-18.2-2009, Figure 1.

[...] it is a human cognitive process involving thought, analysis, and action.

Figure 4-1: Feedback model
of operator process interaction (ISA-18.2, Figure 6)

4.2 Operator response to alarms

The steps involved in alarm response require time. It is the human factors that limit the alarm rate that can be successfully handled by an operator. The time it takes to respond to an alarm is the sum of several steps. Figure 4-1 illustrates a model for the human response to an alarm. To understand and illustrate this, consider the mental and physical steps an operator takes during effective alarm response. These generic steps are the same regardless of industry segment, process type, or control system type.
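
As a rough illustration of response time being a sum of steps, the following Python sketch adds up per-step durations and derives how many alarms could be worked in a 10-minute window if each alarm were handled strictly one after another. The step names are borrowed from the Detect, Diagnose, and Respond labels of the Figure 4-1 model. The durations, the function names, and the strictly sequential assumption are hypothetical placeholders chosen for the example, not values or methods taken from ISA-18.2 or this report.

```python
# Hypothetical step durations in seconds (placeholders, not measured data).
STEP_SECONDS = {
    "detect": 15,
    "diagnose": 120,
    "respond": 90,
}


def response_time_seconds(steps):
    """Total response time is the sum of the individual step durations."""
    return sum(steps.values())


def max_sequential_alarms(window_seconds, steps):
    """Whole number of alarms that fit in the window when worked one at a time."""
    return window_seconds // response_time_seconds(steps)


if __name__ == "__main__":
    total = response_time_seconds(STEP_SECONDS)
    per_window = max_sequential_alarms(600, STEP_SECONDS)
    print(f"response time: {total} s, alarms per 10 minutes: {per_window}")
```

With these placeholder durations the total is 225 seconds, so at most two such alarms fit in a 10-minute window. The numbers only show the shape of the calculation; actual step times depend on the alarm, the process, and the HMI.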
The operator's steps are as follows.

a) Detection of the alarm. This requires capturing the attention of the operator from other tasks and may include silencing or acknowledging the alarm.
b) Diagnosis of the situation. This may include navigation to the appropriate screen(s) to obtain contextual information from the process of which the alarm is a part, followed by analysis of the process situation to determine the alarm's cause.
c) Determination of the action. The operator decides upon the proper action(s) to take in response to the alarm based upon the diagnosis. This may involve consulting reference information or other people.
d) Taking the action. The operator implements the chosen action(s), generally through manipulation of the control system, contacting and directing other people to perform tasks, leaving the console to take action that cannot be accomplished without doing so, or a combination of all of these.
e) Monitoring of the result. The operator monitors to confirm that the action(s) performed correct the situation that caused the alarm, or to determine whether further action is needed.

This sequence of operator steps indicates that alarm response cannot be instantaneous. Several of these
steps can only be accomplished sequentially. It is possible to address several alarms at once, and some of the steps for the different alarms can be accomplished in parallel.

From consideration of the human factors, it is obvious that an alarm handling rate of one alarm per second is untenable, and one alarm per hour is certainly possible. The maximum rate that can be handled lies somewhere in between. The reference clause contains results of research indicating that the handling of one alarm in 10 minutes can generally be accomplished without the significant sacrifice of other operational duties, and is termed “very likely to be acceptable.” More than this rate (about 150 per day) may be problematic for the operator.

[Figure 4-1 labels: Reference/Objective, Measurement, Action, Operator Sub-System, Detect, Diagnose, Respond, Process/System, Disturbance/Malfunction, Deviation]

A rate of up to 2 alarms per 10 minutes is considered “manageable” (about 300 alarms per day). A higher rate may be “unmanageable.” Higher numbers than these represent thresholds above which proper alarm response becomes less likely; alarms are likely to be missed, and operational performance is potentially affected. Between 2 and 5 alarms per 10 minutes can be characterized as “possibly over-demanding.” More than 5 but less than 10 alarms per 10 minutes becomes “likely to be over-demanding.” It has been demonstrated that alarm response rates of 10 alarms per 10 minutes can
possibly be achieved for short periods of time. More than 10 alarms in 10 minutes is considered “very likely to be unacceptable.”

The depiction of alarm rates in the form of hourly and daily charts greatly aids in the visualization of performance. The examples in this technical report are designed to illustrate proper analysis principles and depiction of analysis results. Reporting only the single-number averages of 10-minute, hourly, daily, weekly, or monthly alarm rates is likely to be misleading.

It is worth repeating the qualification from ISA-18.2, Subclause 16.5: “The target metrics in the following sections are approximate and depend upon many factors, (e.g. process type, operator skill, HMI, degree of automation, operating environment, types and significance of the alarms produced). Maximum acceptable numbers could be significantly lower or perhaps slightly higher, depending upon these factors. Alarm rate alone is not an indicator of acceptability.”

4.3 The nature of averages in alarm rate analyses

The use of averages can be misleading, and care must be taken in interpreting averaged data. Averages are insufficient to indicate alarm system performance. As an example, an analysis might show that in the prior month, a daily average of 130 alarms per day was achieved with an alarm-per-10-minute average of 0.9. Upon first examination, this might seem like good performance. However, a more detailed look at the data is needed. The average alarm rate might not indicate that
there were many shorter periods of time when the alarm rate greatly exceeded 10 alarms per 10 minutes, yielding times when many alarms were likely to be missed. Other performance metrics such as the percent time in flood or maximum alarm rate should be used in conjunction with average alarm rate when evaluating alarm system performance.

The issue is illustrated by an example analysis showing the total quantity of alarms produced that exceeded 10 in 10 minutes. (If a single 10-minute period produced 29 alarms, the quantity for that period would be 19.) The summation can produce a metric such as: During the week beginning xx/xx/xx, 280 alarms were likely to have been missed. Such an analysis can assist management in understanding the degree to which the alarm system is contributing to non-optimal operation or is in a condition unhelpful to the operator.

4.4 Impact of alarm differences on operator response

In examining acceptable alarm rates for small periods of time (such as 10 minutes or an hour), the specific nature of the alarms involved becomes much more of a determining factor than does the simple number of alarms. Specific alarm response is highly variable in terms of demand upon the operator's time. No single number or average can accurately represent the time required for an operator to respond to a typical alarm.

As an example, consider a simple tank with three inputs and three outputs, controlled either automatically or manually. A high level alarm occurs. There are dozens of combinations of inlet and outlet flow rates and instrumentation problems that could result in the high tank level. Diagnosing the situation can take some time, and involve looking at trends or readings of the flows and comparing them to the proper numbers for the current process situation. The correct action to take varies highly with the proper determination of the cause(s).

Some HMI implementations make the problem diagnosis quite easy, while others make it much more difficult or highly variable based upon the experience of the operator. The HMI directly affects the ability of the operator to quickly detect an alarm, diagnose the cause, and determine and accomplish the corrective action. The quality and capabilities of an operator's HMI vary widely throughout industry. The result is that the diagnosis and response
to a simple “high tank level” alarm may be quite complicated. Given the tasks involved, certainly many fewer than 10 such alarms can be handled in a 10-minute period.

Compare and contrast the above “simple high level tank alarm” to another, different simple alarm saying, “Pump 14 is supposed to be running but has stopped.” The action for that alarm is very direct: “Restart the pump or, if it won't, start the spare.” Operators can handle several such alarms as these in 10 minutes. The time required to figure out the situation is much less.

4.5 Alarm rates, process types, and operator staffing issues

Certain process types, facility issues, and operator staffing practices give rise to questions regarding alteration of alarm system performance targets based on such factors. The situations are as follows.

a) Alarms in non-continuously manned control centers, where the operator has to both manage the control console and perform duties away from that console for significant periods of time.
b) Alarms in facilities where the nature of the process is that overall alarm response can take significant time, such as pipeline operations that may dispatch people for alarm response, involving travel over some distance.
c) Alarms in operations with variable staffing, such as significantly reduced on-site staff on nights and weekends.
d) Alarms in facilities where a single console is sometimes or always manned by more than one person, but the alarms are not segregated.

The general principle that applies to these situations is: When alarms are properly created to indicate abnormal situations needing a response to prevent an undesirable consequence, then receiving “too many alarms” and missing some will generally result in the defined consequence for each alarm that is missed. The alarm rate reflects the capability of the control system to keep the process in bounds where operator intervention to avoid a consequence is not necessary. In a well rationalized alarm system, missing an alarm will generally result in the defined consequence. The process type or the staffing scenarios do not affect this principle.

4.5.1 Staffing practices and alarm rates

A company may arrange its facilities, staffing levels, and operator duties so that there could be significant delays in initial detection of an alarm. In such circumstances, any safety, environmental, quality, or financial consequences associated with such delayed alarm detection and response should be acceptable. This may be a valid assessment and work arrangement, given the specific considerations of the process, the alarms, and the consequences. These staffing practices and scenarios are not uni