Using Alarm and Event Analysis to Achieve Batch Productivity Improvements


Jim Parks, Instrument Engineer, Lonza Inc., Conshohocken, PA 19428
Yoshitaka Yuki, Manager, Yokogawa Corporation of America, Carrollton, TX 75006

With the help of the methods discussed in this article, one pharmaceutical manufacturer was able to reduce spurious alarms by 30% and smooth out operations procedures.

Lonza's Riverside plant manufactures bulk active ingredients for the pharmaceutical industry. A new, fully automated multipurpose reactor train was installed in early 1996. The train includes ten 500 to 1500 gallon vessels used for reaction, distillation, phase separation, and crystallization; two centrifuges for isolating finished products; and two dryers. Automated support equipment includes a bulk metering station, five vacuum pumps, four process scrubbers, and four weigh scales. Roughly 20 to 30 batch processes are running concurrently on this train, giving the facility the ability to produce a variety of products.

To help improve operational efficiency, the company investigated the relationship between alarm notification and operator reactions. To analyze this relationship numerically, it measured the frequency of message notifications and operator actions in the historical event database. It found that the frequency-increase patterns of alarm notification and operator reaction are interrelated. Furthermore, some increase patterns coincide with specific recipe operations or phases. A frequency-increase pattern that recurs across batches points to a possible area for improvement. A different pattern in the same phase of a batch points to a batch-specific problem area.

Evil of Excessive Alarm Notification

A fine chemical plant is producing a variety of products concurrently. Although all the batch procedures are automated and controlled by distributed control systems (DCSs), operators still need to watch the process and manually initiate each sequence. The control system notifies operators with alarm messages to direct their attention to unexpected situations. Guidance messages are also often used to prompt operators to start the next step in a procedure. Because a number of recipes are loaded on the system and cycle times are critical, precise and timely operator reactions are essential to maintain the plant's productivity. Also important are the controllers' alarm and guidance message notifications. But as the system grows and the process becomes still more complex, the number of alarm detection points increases, and operators can quickly become overloaded with alarm notifications. The problem is, operators tend to ignore excessive alarm notifications and, as a result, may miss an important alarm message. It is difficult to maintain an adequate message frequency level for operators in order to maintain a high level of productivity.

Event Balance of Process Request and Operator Actions

Operators receive alarm message notifications to get them to react in a certain way. In a sense, the notifications are a "process request," whereby the production process requests the operators' reaction. Because operators start their actions by watching the console messages, the number of operator actions should be closely related to the number of message notifications. However, in some cases, alarm message notifications will increase when the operator's action is not adequate. In both cases, process request and operator actions are interrelated.

FIG. 1: Event balance trend graph. 


Figure 1 shows an event balance trend chart of a single day in the train. By counting message notifications for each divided time period, the message frequency can be quantified and visualized as a trend graph. The frequency of operator action also can be quantified by counting the operation records. By comparing the frequency of message notifications with that of operator action, you can see whether or not the operator was busy dealing with alarms.

The trend graph in Fig. 1 shows the event balance for each 10-minute period of a day. The bars above the zero baseline show the process request message (alarm, guidance) count per minute for each 10-minute period. The bars below the zero baseline show the frequency of manual operations per minute for each 10-minute period.
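The counting behind such a chart can be sketched in a few lines of Python. This is only an illustration, not the tool the company used: the event format (a timestamp plus a kind of "request" or "action"), the function name event_balance, and the sample records are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

BIN_MINUTES = 10  # period length described for Fig. 1


def event_balance(events, bin_minutes=BIN_MINUTES):
    """Return {bin_start_minute_of_day: (request_rate, action_rate)} per minute.

    `events` is an iterable of (timestamp, kind) pairs, where kind is
    "request" (alarm or guidance message) or "action" (manual operation).
    This record layout is hypothetical.
    """
    requests, actions = Counter(), Counter()
    for ts, kind in events:
        minute_of_day = ts.hour * 60 + ts.minute
        bin_start = minute_of_day - minute_of_day % bin_minutes
        if kind == "request":
            requests[bin_start] += 1
        elif kind == "action":
            actions[bin_start] += 1
    bins = sorted(set(requests) | set(actions))
    return {b: (requests[b] / bin_minutes, actions[b] / bin_minutes) for b in bins}


# Made-up sample records (the year is arbitrary).
sample = [
    (datetime(2000, 3, 5, 7, 2), "request"),
    (datetime(2000, 3, 5, 7, 4), "request"),
    (datetime(2000, 3, 5, 7, 8), "action"),
    (datetime(2000, 3, 5, 11, 1), "request"),
]
balance = event_balance(sample)
for start, (req, act) in balance.items():
    print(f"{start // 60:02d}:{start % 60:02d}  request={req:.1f}/min  action={act:.1f}/min")
```

Plotting the request rate upward and the action rate downward from a shared zero baseline would reproduce the layout described for Fig. 1.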


FIG. 2 (top): Expanded daily event balance trend graph to batch start/end period.

The event balance trend shows that the frequencies of process requests and operator actions are interrelated. It is easy to find out when the operators were busy (7:00-9:00, for example). At 11:00 there was a peak in process requests, but the operators' reaction frequency did not increase. By investigating message records in this period, we found that high-pressure alarm messages were repeatedly generated because the nitrogen pressure controller's range setting was not adequate. Thus, meaningful messages and spurious messages can be distinguished easily by comparing message frequency with operator reaction frequency.
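One way to automate this comparison is to flag periods in which many process requests arrived but few operator actions followed. The sketch below assumes per-period counts are already available; the thresholds min_requests and max_ratio, the function name, and the counts are illustrative values, not figures from the plant.

```python
def unanswered_peaks(counts, min_requests=5, max_ratio=0.2):
    """Return periods with a request peak that drew little operator reaction.

    `counts` maps a period label to (request_count, action_count).
    Both thresholds are assumptions chosen for this example.
    """
    flagged = []
    for period, (requests, actions) in counts.items():
        if requests >= min_requests and actions <= max_ratio * requests:
            flagged.append(period)
    return flagged


# Made-up counts: a busy but answered period, a quiet one, and an unanswered peak.
counts = {"07:00": (12, 10), "09:00": (3, 2), "11:00": (15, 1)}
suspects = unanswered_peaks(counts)
print(suspects)
```

A flagged period is only a candidate for review; the message records for that period still have to be inspected to confirm that the alarms were spurious.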

Event Balance Analysis of a Batch Process

The company next applied this balance analysis method to a batch process. Figure 2 shows how an event balance trend graph for a batch process is made. As previously mentioned, Fig. 1 shows the event balance of process request and operator action trend for a day. The batch graph in Fig. 2 expands the time range from a day to the start/end time of the entire batch process. The center box in Figure 2 shows a single day event balance on 3/5, and its start/end period is expanded to the entire 22-9DAP batch period (from 3/2 13:00 to 3/7 17:00). Only 22-9DAP batch related events (event messages tagged with the batch-ID) are retrieved to calculate the process request and operator action frequencies.
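The retrieval step can be pictured as a simple filter on the batch-ID tag and the batch window. The following Python sketch assumes a hypothetical Event record with time, batch_id and kind fields; only the batch ID 22-9DAP and the window from 3/2 13:00 to 3/7 17:00 come from the article, and the year is arbitrary.

```python
from datetime import datetime
from typing import NamedTuple


class Event(NamedTuple):
    time: datetime
    batch_id: str  # batch-ID tag carried by the event message
    kind: str      # "request" or "action" (hypothetical labels)


def batch_events(events, batch_id, start, end):
    """Keep only events tagged with batch_id that fall inside the batch window."""
    return [e for e in events if e.batch_id == batch_id and start <= e.time <= end]


YEAR = 2000  # arbitrary: the article gives month and day only
start = datetime(YEAR, 3, 2, 13, 0)
end = datetime(YEAR, 3, 7, 17, 0)

# Made-up log entries.
log = [
    Event(datetime(YEAR, 3, 2, 14, 5), "22-9DAP", "request"),
    Event(datetime(YEAR, 3, 5, 11, 0), "OTHER-BATCH", "request"),
    Event(datetime(YEAR, 3, 5, 17, 0), "22-9DAP", "action"),
    Event(datetime(YEAR, 3, 8, 9, 0), "22-9DAP", "request"),
]
selected = batch_events(log, "22-9DAP", start, end)
print(len(selected))
```

The selected events can then be counted per time period in the same way as for the daily chart.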


FIG. 3: (bottom) Batch event balance trend graph 1.

This batch has four unit procedures. To compare the relative timing relationship with the batch procedure, four box bars are added to the graph to show the unit procedure durations as a Gantt chart, as shown in Figure 3. By looking at this figure, you can see that the process request peak and the operation peak are synchronous with a specific operation of a batch procedure. Figure 4 shows the 9DAP recipe structure and part of the sequential function chart (SFC) for unit procedures No. 1 and No. 2. These are also marked as "Operation R160 and R260" in the balance chart. During that period, the balance chart shows that there were some process request messages and operator reactions. In this phase, the operators must confirm the material status via visual inspection, and manually open or close the transfer valves.
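The same overlay can be expressed numerically by attributing each event to the unit procedure that was running when it occurred. In this Python sketch the unit procedure names, their start and end times (in hours since batch start), and the events are invented for illustration; only the count of four unit procedures and the 124-hour batch length (3/2 13:00 to 3/7 17:00) follow the article.

```python
def counts_per_unit_procedure(events, procedures):
    """Count requests and actions that fall inside each unit procedure.

    `events` is a list of (hours_since_batch_start, kind) with kind
    "request" or "action"; `procedures` maps a name to (start_h, end_h).
    Unit procedures may overlap, so one event can count toward several.
    """
    totals = {name: {"request": 0, "action": 0} for name in procedures}
    for t, kind in events:
        for name, (start, end) in procedures.items():
            if start <= t < end:
                totals[name][kind] += 1
    return totals


# Hypothetical durations for the four unit procedures of one batch.
procedures = {
    "unit_procedure_1": (0, 30),
    "unit_procedure_2": (25, 60),
    "unit_procedure_3": (60, 100),
    "unit_procedure_4": (100, 124),
}
events = [(5, "request"), (5.5, "action"), (27, "request"),
          (70, "action"), (71, "action"), (110, "request")]
totals = counts_per_unit_procedure(events, procedures)
for name, c in totals.items():
    print(name, c)
```

Narrowing the intervals from unit procedures to individual operations or phases would work the same way.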

At 17:00 on the balance chart, there was a rush of manual operations. The operator has to manually adjust the plow position to remove the cake, which is dried in a centrifuge.

FIG. 4: 9DAP recipe structure and its SFC (partial).


As you can see in Fig. 4, the combination of a batch event balance chart and unit procedure duration blocks makes it easier to review a batch process.

FIG. 5 (top): Batch event balance trend graph 2.


Figure 5 shows another batch balance trend graph. Because this batch uses the same recipe as the one shown in Figure 3, the balance peak patterns are similar. By comparing several balance charts, you can see that similar process request peaks and operation peaks occur synchronously with a certain operation/phase of the batch procedure.

FIG. 6 (bottom): Batch event balance trend graph 3.


Figure 6 shows another batch event balance trend graph of the same recipe, but some differences can be seen here. Most importantly, there was a rush of answerback error messages at 4:00 on 3/11. This was caused by a problem with a non-critical field device leading to a large number of answerback alarms. Also, there was a larger number of manual operations during the execution of Unit No. 3. This was due to a more difficult cake removal operation in the centrifuge.
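Comparing batches of the same recipe can also be done on counts rather than by eye. The sketch below flags a phase of a batch whose event count is well above the median for that phase across batches; the factor of 2, the batch and phase names, and the counts are assumptions for illustration and are not taken from the plant data.

```python
from statistics import median


def deviating_phases(batches, factor=2.0):
    """Flag (batch, phase) pairs whose count far exceeds the typical count.

    `batches` maps a batch ID to {phase_name: event_count}. A pair is
    flagged when its count is more than `factor` times the median count
    for that phase (with a floor of 1 to avoid flagging tiny counts).
    """
    phases = sorted({p for counts in batches.values() for p in counts})
    flagged = []
    for phase in phases:
        typical = median(counts.get(phase, 0) for counts in batches.values())
        for batch_id, counts in batches.items():
            count = counts.get(phase, 0)
            if count > factor * max(typical, 1):
                flagged.append((batch_id, phase, count, typical))
    return flagged


# Made-up manual-operation counts for three batches of one recipe.
batches = {
    "batch_A": {"unit_1": 5, "unit_3": 20},
    "batch_B": {"unit_1": 6, "unit_3": 22},
    "batch_C": {"unit_1": 5, "unit_3": 60},
}
outliers = deviating_phases(batches)
print(outliers)
```

A median is used rather than a mean so that one unusual batch does not shift the baseline it is compared against.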

Those repeating answerback errors were not as meaningful to the operators as other messages during that period. The answerback timer can be adjusted in the controller function block for the field device. By making this adjustment, these spurious answerback errors were eliminated so that the operators had a better chance to react to important alarms.

Approach to a Problem Area

To find a way to approach a given problem area, it's important to first find a key event among what is likely to be a huge number of event records. To find the problems discussed in this article, the company used the following filter keys in the event database.

Event Time Stamp
Batch ID
Operation/Phase ID
Physical Equipment ID (Tag, Unit)
Plant Hierarchical ID (Area, Cell)
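Filtering on keys like these can be sketched as follows in Python. The field names (batch_id, phase, tag), the phase labels, and the records are hypothetical; only the tag F4272.PDI36 and the batch ID 22-9DAP appear in the article. The sketch handles equality keys only, so a time stamp filter would need an additional range comparison.

```python
from collections import Counter


def filter_events(events, **keys):
    """Keep events whose fields equal every given key, e.g. batch_id="...", tag="..."."""
    return [e for e in events if all(e.get(k) == v for k, v in keys.items())]


# Made-up event records; real records would come from the event database.
events = [
    {"batch_id": "22-9DAP", "phase": "transfer", "tag": "F4272.PDI36"},
    {"batch_id": "22-9DAP", "phase": "transfer", "tag": "F4272.PDI36"},
    {"batch_id": "22-9DAP", "phase": "transfer", "tag": "TAG-B"},
    {"batch_id": "22-9DAP", "phase": "stirring", "tag": "F4272.PDI36"},
    {"batch_id": "OTHER", "phase": "transfer", "tag": "F4272.PDI36"},
]

hits = filter_events(events, batch_id="22-9DAP", phase="transfer")
top = Counter(e["tag"] for e in hits).most_common(1)
print(top)
```

Ranking the remaining events by tag after each narrowing step shows which device contributes most of the messages in the selected batch and phase.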


FIG. 7: Example of spurious alarm generating pressure differential indicator.

By narrowing the balance trend graph with these filter keys, a specific problem area can be found quite easily. For example, as shown in Fig. 7, F4272.PDI36 (a pressure differential indicator) was tuned in a way that generated spurious alarms in certain situations. Through our investigation, we found the following five problem areas:

  1. Inadequate alarm setpoint (high/low limit): When the high/low limits for an indicator are not set properly, high/low limit alarms can be generated frequently, many of which are not essential. There are also cases where the high limit and high high (extreme high) limit values are set too close to each other, which causes high and high high alarms to be generated together. This problem was mainly detected in the material transfer phase between reactor units.
  2. Invalid scale range: If the instruments are operated in a range close to their limit value, "Input Open" errors are often generated. In this case, it is desirable to rescale the indicator's range.
  3. Invalid timer setting: In the stirring phases, there were frequent answerback error notifications in response to changes in motor speed. This was due to an invalid timer setting in the controller, which was corrected by adjusting the response timer settings.
  4. Redundant alarms: When some equipment indicators/controllers are integrated as one unit, a unit alarm may represent the alarm state of the individual indicators/controllers. But there are some cases in which the underlying blocks generate an alarm message together with the unit alarm messages.
  5. Manual operations: Some operations required a lot more operator intervention than expected (perhaps a machine or field device is not working properly, a loop is poorly tuned, or a phase was poorly designed). The event balance trend graph quickly revealed such operations and led to improvement actions.
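As an illustration of the first problem area, high and high high alarms that fire together on the same tag can be found by looking for a high high alarm shortly after a high alarm. In this Python sketch the alarm labels "HI" and "HH", the five-second window, and the tags are assumptions for the example, not names from the plant's control system.

```python
from collections import Counter


def paired_alarms(alarms, window_s=5):
    """Count, per tag, HH alarms raised within window_s seconds of a HI alarm.

    `alarms` is a list of (time_in_seconds, tag, alarm_type) tuples.
    """
    last_hi = {}
    pairs = Counter()
    for t, tag, kind in sorted(alarms):
        if kind == "HI":
            last_hi[tag] = t
        elif kind == "HH" and tag in last_hi and t - last_hi[tag] <= window_s:
            pairs[tag] += 1
    return dict(pairs)


# Made-up alarm records for two hypothetical tags.
alarms = [
    (0, "PI101", "HI"), (2, "PI101", "HH"),
    (100, "PI101", "HI"), (101, "PI101", "HH"),
    (50, "TI205", "HI"), (400, "TI205", "HH"),
]
together = paired_alarms(alarms)
print(together)
```

Tags that show up repeatedly in such a result are candidates for reviewing how far apart the two limits are set.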

By eliminating these spurious alarms, alarm messages were reduced by up to roughly 30% (from about 350 alarms/day to 250-300). Because the spurious alarms had been confusing the operators, operating procedures became smoother after the improvement efforts.


It isn't easy to keep the productivity of a batch plant consistent because many batch procedures are running concurrently and their recipes are frequently changed. Even when an improvement measure is installed, its effectiveness may not last for very long. The key to increased batch productivity is to find the problem area as quickly as possible and apply countermeasures immediately. An event balance trend graph of a batch process showed the exact problem areas that needed to be fixed. Process request and operator action frequencies should be balanced; where they aren't, a problem can typically be found. Overlaying the unit procedure execution period on a batch event balance trend can narrow down the exact operation/phase in the procedure to be improved. The analysis can lead to ideas for automation schemes that have a dramatic impact on the amount of operator interaction required to operate plant equipment. Improvement is possible from both directions: a reduction in spurious alarms/messages and a reduction in unprompted, unexpected manual operations.

(This article was originally presented in April 2000 at the World Batch Forum North American Conference, Atlantic City, NJ. It is presented here by arrangement with the World Batch Forum.)

About the Authors

Jim Parks is an instrument engineer at Lonza Inc., Conshohocken, PA. Lonza Group is a leading supplier of active ingredients, chemical intermediates and biotechnology solutions to the pharmaceutical and agrochemical industries.

Yoshitaka Yuki received his bachelor's degree in electrical engineering in 1983 from Waseda Univ., Tokyo, Japan. In 1983 he joined Yokogawa as a system software development engineer in its System Division. He became software development manager for Advanced Operation Assistance Packages (Yokogawa's Exapilot, Exaplog). In 2001, Yuki was appointed to his present position, manager, Business Planning & Administration Dept., System Division, Yokogawa Corp. of America.

Control Solutions February, 2003 Author(s): Jim Parks, Yoshitaka Yuki

