Alarm Management – the strategy and process of striving for a well-managed operation

Chris Bamber, Yokogawa Middle East & Africa B.S.C.(c)

Introduction

Maintaining a safe and stable plant is the objective of everyone involved in the manufacturing process. As Peter Drucker once said, "A well-managed plant is silent and boring," but it actually takes a lot of work and effort to ensure this is the case.

Alarm management in the plant is not just another project to be executed; it is a philosophy, a way of life, just like safety. We never enter the process area without wearing PPE, so why work in an environment where there is no strategy for alarm handling? The alarm management system is one of the most important aspects of the plant and, like everything else, it must be maintained to meet the ever-changing needs of the plant.

In the early days of control systems, before the Distributed Control System (DCS) became commonplace, alarms were configured through mechanical means such as annunciators and light boxes. With the advent of the DCS, the cost of making extra alarms available has dropped significantly, because most of the work is done in software. However, the operator can still be overwhelmed with unnecessary alarms if the control system design is not approached correctly.

To fully understand the purpose of the alarm management system, we must look at the basic meaning of what an alarm actually is.

  • Anything that requires an operator to take an action to maintain safety and integrity of the process
  • An alarm is designed to prevent or mitigate process upsets and disturbances

Most alarm problems exist because the above criteria are not met. Understanding this definition is key to implementing a successful alarm management system. Alarm rationalization is a process of optimizing the alarm system for safe operation by reducing the number of alarms, reviewing their priority, and validating their alarm limits. By undertaking such steps, we help reduce the workload of the operators and promote a safer working environment within the plant, and when a plant upset does occur, more visibility is available on the alarms that really matter.

As highlighted previously, alarm management is not just a project that has a start and end date; it's a continuous cycle. Once the alarm system has been reviewed and improvements have been identified, we must check that controls are in place to ensure the alarm system remains functional. The key is to ensure that the system is continuously monitored and any changes are fully documented. It is essential that any alarm management initiative has management support; otherwise, little progress will be made in reducing the alarm counts and improving the overall safety of the process.

Alarm Management

 

Seven Key Steps for Alarm Management

There are seven key steps for alarm management. Rationalization is one of those critical steps. 

  1. Alarm Philosophy Creation
    The alarm philosophy document is critical; without it, there is no way to implement a successful alarm management system. This document forms the basis of the overall design guidelines and records all the expected KPIs that will be used to measure the success of the alarm management system. The alarm philosophy should also cover the design of the operator interface, so that the graphics are clear and upsets are easy to spot.
  2. Alarm Performance Benchmarking
    To measure the success of any alarm management system, we must know how big the current alarm problem is. How many alarms are being generated per day? How many alarms does the operator handle per hour? What deficiencies do we currently have in the control system? These are all valid questions, and benchmarking is the starting point. Performing a HAZOP-like study at this stage may also be advantageous.
  3. Bad Actors Resolution
    Most alarms in the control system come from relatively few sources, and checking and fixing these will make a big difference to the overall alarm count. Reviewing the Top-10 list regularly helps to keep it under control. Yokogawa's Exaquantum/ARA software can provide this list on a daily basis by email or, by using Yokogawa's Exaplog alarm/event analysis tool, we can manually extract the bad actors.
  4. Documentation/Rationalization
    The most important step of the alarm rationalization process is to ensure that each change is documented and the alarm changes comply with the alarm philosophy. Alarms can be eliminated completely by re-engineering in the DCS or adopting suppression techniques.  
  5. Audit/Enforcement
    Once the rationalization is done, the hard work is not over! Without proper change management controls in place, the alarm system will slip back into its old ways. Consider adopting a Management of Change (MOC) approach to the alarm system to ensure all changes are tracked. Exaquantum/AMD can also help by identifying changes to the alarm settings and, if required, the optimal settings can be enforced automatically.
  6. Real Time Alarm Management
    For day-to-day operations, we should adopt alarm management techniques that support rather than hinder the operator, such as alarm shelving, state-based alarming or other alarm suppression technologies.
  7. Control & Maintain Performance
    Continued compliance with the alarm philosophy is crucial. It is achieved by continuously monitoring the alarm KPIs and making any required changes through an MOC-type procedure. Nominate an "alarm champion" who will oversee and manage day-to-day issues. Remember that alarm management is not a one-step process.
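The audit step (step 5) amounts to comparing the rationalized alarm settings against what is currently configured in the control system. The following is a minimal Python sketch of that comparison only. It is not how Exaquantum/AMD works internally, and the tag names, the field names (`hh`, `priority`) and all values are hypothetical.

```python
def find_setting_deviations(master, current):
    """Compare current alarm settings against the rationalized master list.

    Both arguments map a tag name to a dict of settings (hypothetical layout).
    Returns a list of (tag, field, approved_value, actual_value) tuples,
    one for every setting that differs from the master.
    """
    deviations = []
    for tag, approved in sorted(master.items()):
        actual = current.get(tag, {})  # a tag missing from the DCS shows up as None values
        for field, approved_value in sorted(approved.items()):
            actual_value = actual.get(field)
            if actual_value != approved_value:
                deviations.append((tag, field, approved_value, actual_value))
    return deviations


# Hypothetical rationalized (master) settings and current DCS settings.
master = {
    "FIC101": {"hh": 95.0, "priority": "high"},
    "TIC205": {"hh": 180.0, "priority": "medium"},
}
current = {
    "FIC101": {"hh": 99.0, "priority": "high"},    # limit changed without MOC
    "TIC205": {"hh": 180.0, "priority": "low"},    # priority changed without MOC
}

deviations = find_setting_deviations(master, current)
for tag, field, approved_value, actual_value in deviations:
    print(f"{tag}.{field}: approved={approved_value}, actual={actual_value}")
```

Each reported deviation would then either be reverted to the approved value or raised as a formal change through the MOC procedure.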

 

Alarm Rationalization: Finding the Bad Actors

Before Alarm Rationalization Table

A general approach to alarm management and the steps required to implement a successful alarm management strategy were addressed in Part 1. Now, we explore the concept of alarm rationalization. As discussed earlier, the best starting point is to look at how big an alarm problem we actually have. We can also use this as a baseline to track future progress. The first item to address is our "bad actors", that is, the alarms that are causing the most issues within the process. Eliminating the top ten of these alarms will make a big improvement in the overall alarm count in a short period of time.

The bad actors can be obtained easily by using Yokogawa's alarm/event analysis software tool, Exaplog, or its alarm reporting and analysis software, Exaquantum/ARA. These tools should be run and the results reviewed on a regular basis. In Exaplog, a report can be run manually when needed; in ARA, a report can be generated automatically and sent via email. The bad actor list in the Before Alarm Rationalization table is an example from a plant before alarm rationalization was started.
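For readers who want to see what a bad actor list boils down to, here is a minimal Python sketch that counts alarms per tag in an exported event list and returns the most frequent ones. It does not use Exaplog or Exaquantum/ARA; the `(tag, condition)` event layout, the tag names and the counts are assumptions made for illustration.

```python
from collections import Counter


def top_bad_actors(events, n=10):
    """Return the n tags with the highest alarm counts as (tag, count) pairs.

    `events` is an iterable of (tag, condition) tuples, one per alarm
    occurrence (a hypothetical export format).
    """
    counts = Counter(tag for tag, _condition in events)
    return counts.most_common(n)


# Hypothetical alarm occurrences taken from an event log export.
events = (
    [("PT301", "IOP")] * 5
    + [("PT302", "IOP")] * 4
    + [("LIC110", "HI")] * 2
    + [("TIC205", "LO")]
)

bad_actors = top_bad_actors(events, n=3)
total = len(events)
share = sum(count for _tag, count in bad_actors) / total

for tag, count in bad_actors:
    print(f"{tag}: {count} alarms")
print(f"Top {len(bad_actors)} tags account for {share:.0%} of all alarms")
```

Running the same count on a regular basis gives a simple way to track whether the bad actors are actually being eliminated.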

The alarm counts for the first three tags in this list were exceptionally high and were all found to be caused by an input open (IOP) error, which in most cases is related to a communication issue in the field, a hardware issue with the transmitter itself, or possibly an incorrect alarm threshold setting.

In this case, all of the concerned transmitters were connected to a faulty fieldbus segment. Replacing a fieldbus component cleared the problem and suddenly there were no more alarms. This immediately made a big impact on the alarm count.

The following table can be used as a general reference for help in troubleshooting different alarm types in a Yokogawa CENTUM system.

Reference for Troubleshooting Alarms Table

Alarm by Condition

Remember that a high alarm count for a particular tag may have a logical explanation, so the tag should not simply be suppressed because it is a nuisance to the operators. This first stage of alarm rationalization is called "Fundamental Nuisance Alarm Reduction" (FNAR).

Running a report for the bad actors and displaying the condition is recommended, as it can be filtered for the different conditions, plant areas and even down to an individual unit.
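The kind of filtering described here can be sketched in a few lines of Python. This is only an illustration of grouping alarms by condition with optional plant area and unit filters; the record fields (`tag`, `condition`, `area`, `unit`) and their values are hypothetical and do not reflect the export format of any Yokogawa tool.

```python
from collections import Counter


def count_by_condition(events, area=None, unit=None):
    """Count alarm occurrences per condition, optionally filtered by area and unit."""
    selected = [
        e for e in events
        if (area is None or e["area"] == area)
        and (unit is None or e["unit"] == unit)
    ]
    return Counter(e["condition"] for e in selected)


# Hypothetical alarm records.
events = [
    {"tag": "PT301", "condition": "IOP", "area": "Utilities", "unit": "U1"},
    {"tag": "PT301", "condition": "IOP", "area": "Utilities", "unit": "U1"},
    {"tag": "LIC110", "condition": "HI", "area": "Utilities", "unit": "U2"},
    {"tag": "TIC205", "condition": "LO", "area": "Reactor", "unit": "R1"},
]

print(count_by_condition(events))                               # whole plant
print(count_by_condition(events, area="Utilities"))             # one plant area
print(count_by_condition(events, area="Utilities", unit="U2"))  # one unit
```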

Alarm by Condition

After looking at the bad actors, we can also look at the "chattering alarms." The EEMUA#191 alarm standard specifies that a chattering alarm is a tag that goes into alarm and back to normal more than five times in a 60-second period. In most cases, chattering alarms are caused by incorrect alarm limits. As part of the rationalization, the chattering alarms should be looked at closely and their limits reviewed accordingly.
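The chattering criterion quoted above (more than five alarm activations within 60 seconds) can be checked with a sliding window over the activation times of a single tag. The sketch below is a minimal Python illustration under two assumptions: each activation is followed by a return to normal before the next one, and timestamps are given in seconds. The function name and the sample data are hypothetical.

```python
def is_chattering(activation_times, max_count=5, window_s=60.0):
    """Return True if more than `max_count` activations fall within any window.

    `activation_times` holds the times (in seconds) at which one tag went
    into alarm. The window is half-open: two activations exactly `window_s`
    apart are not counted as being in the same window.
    """
    times = sorted(activation_times)
    start = 0
    for end, t in enumerate(times):
        # Drop activations that are too old to share a window with time t.
        while t - times[start] >= window_s:
            start += 1
        if end - start + 1 > max_count:
            return True
    return False


# Hypothetical activation times for three tags.
print(is_chattering([0, 5, 10, 15, 20, 25]))    # six activations in 25 s
print(is_chattering([0, 15, 30, 45, 60, 75]))   # at most four per 60 s window
print(is_chattering([0, 10, 20, 30, 40]))       # exactly five, not more than five
```

Tags flagged this way are candidates for a limit review rather than for immediate suppression.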

Familiarization with the EEMUA#191 guideline and the ISA18.2 standard is important to understanding alarm rationalization, alarm management and the key performance indicators. The EEMUA#191 guideline is a detailed specification of alarm management; it goes as far as providing guidance on how DCS mimic displays should look and what type of furniture to use in the control room to make the operators more comfortable during their shifts. All Yokogawa alarm management products were initially based on the EEMUA#191 guideline and are now being applied to the ISA18.2 standard. In the ideal world, EEMUA#191 recommends no more than one alarm per operator every 10 minutes. That would be quite an achievement and is rarely seen in practice. A big difference can be made by working through the bad actors list: identifying the bad actors and eliminating them. Making the review of the bad actors list part of the daily activities is a work process well worth the effort. Otherwise, the rationalization effort will be wasted and those alarms will soon come back again!
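The "one alarm per operator every 10 minutes" figure can be turned into a simple KPI check by bucketing alarm timestamps into 10-minute intervals. The following Python sketch is one possible way to do this for a single operator position; the function names, the use of seconds since the start of the reporting period, and the sample timestamps are all assumptions for illustration.

```python
from collections import Counter


def alarms_per_interval(timestamps_s, interval_s=600):
    """Count alarms in each interval; keys are interval indexes (0 = first 10 min)."""
    return Counter(int(t // interval_s) for t in timestamps_s)


def intervals_over_target(timestamps_s, target=1, interval_s=600):
    """Return the indexes of intervals with more than `target` alarms."""
    buckets = alarms_per_interval(timestamps_s, interval_s)
    return sorted(i for i, count in buckets.items() if count > target)


def average_per_interval(timestamps_s, period_s, interval_s=600):
    """Average number of alarms per interval over the whole reporting period."""
    return len(timestamps_s) / (period_s / interval_s)


# Hypothetical alarm times in seconds since the start of a 50-minute period.
timestamps = [30, 700, 720, 1500, 1510, 1520, 2500]

over = intervals_over_target(timestamps)
average = average_per_interval(timestamps, period_s=3000)
print(f"Intervals above target: {over}")
print(f"Average alarms per 10 minutes: {average}")
```

Tracking both the average and the number of intervals above target shows whether the alarm load is high in general or concentrated in a few bursts.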

