Alarm Management – the strategy and process of striving for a well-managed operation

Chris Bamber, Yokogawa Middle East & Africa B.S.C.(c)

Introduction

Maintaining a safe and stable plant is the objective of everyone involved in the manufacturing process. As Peter Drucker once said, "A well-managed plant is silent and boring," but it actually takes a lot of work and effort to ensure this is the case.

Alarm management in the plant is not just another project to be executed; it is a philosophy, a way of life, just like safety. We never enter the process area without wearing PPE, so why work in an environment where there is no strategy for alarm handling? The alarm management system is one of the most important aspects of the plant and, like everything else, it must be maintained to meet the ever-changing needs of the plant.

In the early days of control systems, before the Distributed Control System (DCS) became commonplace, alarms were configured through mechanical means such as annunciators and light boxes. With the advent of the DCS, the cost of making extra alarms available has fallen significantly, because most of the work is done in software. However, the operator can still become overwhelmed with unnecessary alarms if the control system design is not approached correctly.

To fully understand the purpose of the alarm management system, we must look at the basic meaning of what an alarm actually is.

  • Anything that requires an operator to take an action to maintain safety and integrity of the process
  • An alarm is designed to prevent or mitigate process upsets and disturbances

Most alarm problems exist because the above criteria are not met. Understanding this definition is key to implementing a successful alarm management system. Alarm rationalization is a process of optimizing the alarm system for safe operation by reducing the number of alarms, reviewing their priority, and validating their alarm limits. By undertaking such steps, we help reduce the workload of the operators and promote a safer working environment within the plant, and when a plant upset does occur, more visibility is available on the alarms that really matter.

As highlighted previously, alarm management is not just a project that has a start and end date; it's a continuous cycle. Once the alarm system has been reviewed and improvements have been identified, we must check that controls are in place to ensure the alarm system remains functional. The key is to ensure that the system is continuously monitored and any changes are fully documented. It is essential that any alarm management initiative has management support; otherwise, little progress will be made in reducing alarm counts and improving the overall safety of the process.

Seven Key Steps for Alarm Management

There are seven key steps for alarm management. Rationalization is one of those critical steps. 

  1. Alarm Philosophy Creation
    The alarm philosophy document is critical; without it, a successful alarm management system cannot be implemented. This document forms the basis of the overall design guidelines and records all the KPIs that will be used to measure the success of the alarm management system. The alarm philosophy should also cover the design of the operator interface, so that the graphics are clear and upsets are easy to spot.
  2. Alarm Performance Benchmarking
    To measure the success of any alarm management system, we must know how big the current alarm problem is. How many alarms are being generated per day? How many alarms does the operator handle on an hourly basis? What deficiencies do we currently have in the control system? These are all valid questions, and benchmarking is the starting point. Performing a HAZOP-like study at this stage may also be advantageous.
  3. Bad Actors Resolution
    Most alarms in the control system come from relatively few sources, and checking and fixing these will make a big difference to the overall alarm count. Reviewing the Top-10 list helps to keep it under control. Yokogawa's Exaquantum/ARA software can provide this list on a daily basis by email, or we can extract the bad actors manually using Yokogawa's Exaplog alarm/event analysis tool.
  4. Documentation/Rationalization
    The most important step of the alarm rationalization process is to ensure that each change is documented and the alarm changes comply with the alarm philosophy. Alarms can be eliminated completely by re-engineering in the DCS or adopting suppression techniques.  
  5. Audit/Enforcement
    Once the rationalization is done, the hard work is not over! Without proper change management controls in place, the alarm system will slip back into its old ways. Consider adopting a Management of Change (MOC) approach to the alarm system to ensure all changes are tracked. Exaquantum/AMD can also help by identifying changes to the alarm settings and, if required, the optimal settings can be enforced automatically.
  6. Real Time Alarm Management
    For day-to-day operations, we should adopt alarm management techniques that support rather than hinder the operator, such as alarm shelving, state-based alarming, or other alarm suppression technologies.
  7. Control & Maintain Performance
    Continued compliance with the alarm philosophy is crucial. It is achieved by continuously monitoring the alarm KPIs and making any required changes through an MOC-type procedure. Nominate an "alarm champion" who will oversee and manage day-to-day issues. Remember that alarm management is not a one-step process.
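
The audit/enforcement idea in step 5 can be illustrated with a minimal sketch. It is not how Exaquantum/AMD works internally; it only assumes that the approved settings (the "master" list) and the settings currently in the DCS can each be exported as simple records. The `AlarmSetting` record, the `find_drift` function, and the tag names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmSetting:
    """One alarm setting; the field names are illustrative assumptions."""
    tag: str
    alarm_type: str  # e.g. "HI" or "LO"
    limit: float
    priority: str


def find_drift(master, current):
    """Compare current settings with the approved master list.

    Returns (tag, alarm_type, field, master_value, current_value) tuples,
    one for every value that differs from the master list.
    """
    master_by_key = {(s.tag, s.alarm_type): s for s in master}
    drift = []
    for s in current:
        ref = master_by_key.get((s.tag, s.alarm_type))
        if ref is None:
            # Alarm exists in the DCS but was never approved.
            drift.append((s.tag, s.alarm_type, "unapproved", None, None))
            continue
        if s.limit != ref.limit:
            drift.append((s.tag, s.alarm_type, "limit", ref.limit, s.limit))
        if s.priority != ref.priority:
            drift.append((s.tag, s.alarm_type, "priority", ref.priority, s.priority))
    return drift


master = [AlarmSetting("FIC101", "HI", 80.0, "high")]
current = [AlarmSetting("FIC101", "HI", 95.0, "high")]
print(find_drift(master, current))
```

Each reported difference would then either be reverted to the master value or be approved through the MOC procedure and written back to the master list.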

 

Alarm Rationalization: Finding the Bad Actors

A general approach to alarm management, and the steps required to implement a successful alarm management strategy, were addressed in Part 1. Now we explore the concept of alarm rationalization. As discussed earlier, the best starting point is to look at how big an alarm problem we actually have. We can also use this as a baseline to track future progress. The first item to address is our "bad actors," that is, the alarms that are causing the most issues within the process. Eliminating the top ten of these alarms will make a big improvement in the overall alarm count in a short period of time. The bad actors can be obtained easily by using Yokogawa's alarm/event analysis software tool, Exaplog, or its alarm reporting and analysis software, Exaquantum/ARA. These tools should be run and the results reviewed on a regular basis. In Exaplog, a report can be run manually when needed; in ARA, a report can be generated automatically and sent via email. The bad actor list in the "Before Alarm Rationalization" table is an example from a plant before alarm rationalization was started.

The alarm counts for the first three tags in this list were exceptionally high and were all found to be caused by an input open (IOP) error, which in most cases is related to a communication issue in the field, a hardware issue with the transmitter itself, or possibly an incorrect alarm threshold setting.

In this case, all of the concerned transmitters were connected to a faulty fieldbus segment. Replacing a fieldbus component cleared the problem and suddenly there were no more alarms. This immediately made a big impact on the alarm count.
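
For readers without access to those tools, the bad actor ranking itself is simple to reproduce. The sketch below assumes the alarm log has been exported as a list of `(tag, condition)` pairs; that format, the `top_bad_actors` function, and the tag names are assumptions made for illustration, not an Exaplog or Exaquantum/ARA interface.

```python
from collections import Counter


def top_bad_actors(events, n=10):
    """Rank tags by alarm count.

    `events` is an iterable of (tag, condition) pairs. Returns a list of
    (tag, count, share_of_all_alarms) tuples for the n most frequent tags.
    """
    counts = Counter(tag for tag, _condition in events)
    total = sum(counts.values())
    return [(tag, count, count / total) for tag, count in counts.most_common(n)]


# Hypothetical log: one transmitter dominates the alarm count.
events = [("PT100", "IOP")] * 3 + [("FT200", "HI")]
for tag, count, share in top_bad_actors(events, n=10):
    print(f"{tag}: {count} alarms ({share:.0%} of total)")
```

Reporting the share of the total alongside the raw count makes it easier to see how much the overall alarm load would drop if a single tag were fixed.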

The following table can be used as a general reference for help in troubleshooting different alarm types in a Yokogawa CENTUM system.

Reference for Troubleshooting Alarms Table

Alarm by Condition

Remember that a high alarm count for a particular tag may have a logical explanation, so the tag should not simply be suppressed because it is a nuisance to the operators. This first stage of alarm rationalization is called "Fundamental Nuisance Alarm Reduction" (FNAR).

Running a report for the bad actors and displaying the condition is recommended, as it can be filtered for the different conditions, plant areas and even down to an individual unit.
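
A report of this kind amounts to filtering the events and counting them per condition. The following sketch assumes each exported event carries a tag, a condition, and a plant area; the `AlarmEvent` record, the `counts_by_condition` function, and the sample values are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class AlarmEvent(NamedTuple):
    """One alarm event; the field names are illustrative assumptions."""
    tag: str
    condition: str  # e.g. "IOP", "HI", "LO"
    area: str       # plant area or unit


def counts_by_condition(events, area=None):
    """Count alarms per condition, optionally restricted to one plant area."""
    selected = (e for e in events if area is None or e.area == area)
    return Counter(e.condition for e in selected)


events = [
    AlarmEvent("PT100", "IOP", "Unit1"),
    AlarmEvent("PT101", "IOP", "Unit1"),
    AlarmEvent("FT200", "HI", "Unit2"),
]
print(counts_by_condition(events))                # all areas
print(counts_by_condition(events, area="Unit1"))  # one unit only
```

A large count for a single condition in one area, such as IOP in the example above, points to a common cause rather than to many unrelated alarm problems.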

After looking at the bad actors, we can also look at the "chattering alarms." The EEMUA#191 alarm standard specifies that a chattering alarm is a tag that goes into alarm and returns to normal more than five times in a 60-second period. In most cases, chattering alarms are caused by incorrect alarm limits. As part of the rationalization, the chattering alarms should be looked at closely and their limits reviewed accordingly.
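
The chattering criterion quoted above (more than five activations in a 60-second period) can be checked with a sliding window over the activation times of a single tag. This is a minimal sketch that assumes the activation times are available in seconds; the `is_chattering` function and its parameter names are hypothetical.

```python
from collections import deque


def is_chattering(activation_times, max_count=5, window_s=60.0):
    """Return True if more than `max_count` activations of one tag
    fall within any period shorter than `window_s` seconds."""
    window = deque()
    for t in sorted(activation_times):
        window.append(t)
        # Drop activations that are a full window or more older than t.
        while t - window[0] >= window_s:
            window.popleft()
        if len(window) > max_count:
            return True
    return False


print(is_chattering([0, 5, 10, 15, 20, 25]))     # six activations in 25 s
print(is_chattering([0, 20, 40, 60, 80, 100]))   # at most three per 60 s
```

The threshold and window are parameters so that the check can follow whatever values the plant's alarm philosophy specifies.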

Familiarization with the EEMUA#191 guideline and the ISA18.2 standard is important for understanding alarm rationalization, alarm management, and the key performance indicators. The EEMUA#191 guideline is a detailed specification of alarm management; it goes as far as providing guidance on how DCS mimic displays should look and what type of furniture to use in the control room to make the operators more comfortable during their shifts. All Yokogawa alarm management products were initially based on the EEMUA#191 guideline and are being applied to the ISA18.2 standard. In an ideal world, EEMUA#191 recommends no more than one alarm per operator every 10 minutes. Reaching that level would be quite an achievement and is rare. A big difference can be made by working through the bad actors list: identify the bad actors and eliminate them. Making the review of the bad actors list part of the daily activities is a work process well worth the effort. Otherwise, your effort will be wasted and soon those alarms will be coming back again!
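
The one-alarm-per-10-minutes figure is easy to track as a KPI once the total alarm count for a period is known. The sketch below assumes alarms are shared evenly between operators, which is a simplification; the function names and the sample numbers are hypothetical.

```python
def alarms_per_ten_minutes(alarm_count, period_hours, operators=1):
    """Average number of alarms per operator per 10-minute interval."""
    if period_hours <= 0 or operators <= 0:
        raise ValueError("period_hours and operators must be positive")
    ten_minute_slots = period_hours * 6  # six 10-minute intervals per hour
    return alarm_count / (ten_minute_slots * operators)


def meets_target(rate, target=1.0):
    """True if the rate is within the recommended one alarm per 10 minutes."""
    return rate <= target


# Hypothetical benchmark: 1,440 alarms in 24 hours, shared by two operators.
rate = alarms_per_ten_minutes(1440, 24, operators=2)
print(rate, meets_target(rate))
```

An average hides alarm floods during upsets, so this figure is best read alongside the bad actor and chattering alarm reviews rather than on its own.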
