The quest for zero downtime

Jay Lee
Masoud Ghaffari
National Science Foundation
Industry/University Cooperative
Research Center for Intelligent
Maintenance Systems (IMS)
University of Cincinnati
Cincinnati, Ohio

Edited by Leland Teschler

A Watchdog Agent today physically takes the form of a computer platform that installs on a machine or process of interest. IMS Center researchers set it up to run algorithms from its toolbox that gauge the health of the machine, monitor degradation, and predict when service will become necessary.

In one test bed, IMS Center researchers installed the Watchdog Agent on a Grob Inc. aluminum-cutting machine at Harley-Davidson. The Watchdog Agent automatically converted sensor data to health information and predicted how the machine operation would degrade.

Imagine a machine that can tell you its current state of health — whether key parts are about to fail, whether it needs lubrication, and so forth. Further imagine the machine can tell you the degree to which its operations are likely to deviate from the norm in the near future, though it is producing in-spec parts today.

A machine with this level of intelligence could truly change a product's life cycle for the better. Moreover, the ability of machines to assess themselves could raise the field of maintenance to more than a low-level topic of study. Machine self-assessment could wean whole industries away from a reliance on predetermined intervals of preventive maintenance: checking performance, replacing parts, and lubricating on a set schedule whether there's a true need for these activities or not.

Is there a realistic alternative to scheduled maintenance? This is the question that drives the National Science Foundation Industry/University Cooperative Research Center for Intelligent Maintenance Systems (IMS). Over the last six years, IMS Center researchers developed a toolbox of algorithms, collectively dubbed the Watchdog Agent, capable of converting sensor data and operating conditions to useful information. This information is indicative of a machine's level of performance and health. The Watchdog Agent monitors the degradation rate of components, assesses the health of a machine, and predicts the likelihood of failure. This approach is called predictive maintenance, or prognostics.

Prognostics is a lot more than merely collecting the kind of data that, today, might typically be used for statistical-process control. It truly is a paradigm shift. It relies on three levels of intelligence:

Machine intelligence — Many new industrial machines already come equipped with remote monitoring, data-acquisition, and diagnostic systems. There's a similar trend in the automotive and aviation industries. For example, consider systems such as OnStar. It collects car-health data and transmits it to a remote site. In addition, GE is currently able to collect in-flight status for thousands of its jet engines through its Power-By-The-Hour service business model.

However, both of these services lack degradation assessment — or what is called "health information intelligence." In fact, degradation assessment is also absent from today's maintenance practices. Health intelligence includes health assessment, performance prediction, failure prediction, and the ability to interact with functional intelligence. A machine is considered to have both functional and health intelligence if it is capable of reconfiguring itself, compensating for any degradation it notices, and performing some self-maintenance.

Operations intelligence — It is relatively easy to make decisions about maintenance in the case of a single machine whose health is obviously deteriorating or which is near failure. But what should be done when there are hundreds of machines? The ability to prioritize problems, optimize responses, and responsively schedule maintenance according to the developing situation is referred to as operations intelligence. Researchers have yet to implement this particular paradigm effectively.

Synchronization intelligence — The third maintenance gap directing the IMS Center's activities is synchronization intelligence. Maintenance needs to be a part of asset utilization and business enterprise systems. Synchronization intelligence will facilitate the automatic conversion of data to information between machine and business systems.

DO WE NEED TO MONITOR EVERYTHING?
"Well, who's going to monitor the monitors of the monitors?" A fair question posed by Carla Dean in the movie "Enemy of the State."

Sensors are costly and not every component or system of a machine need be monitored. In fact, only the critical components which cause downtime should be monitored. For components that fail frequently but which have a low cost of failure, it might be enough to keep an adequate number of spare parts on hand. Conversely, costly components that fail frequently may require a change in design. Generally speaking, the system should move toward an ideal of low frequency of failure and low cost. Often a feasibility study using Computerized Maintenance Management Systems (CMMS) and other techniques can determine which critical components to monitor.
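The frequency-versus-cost reasoning above can be sketched as a simple triage rule. This is only an illustration: the `Component` class, the `recommend` function, the thresholds, and the example figures are all hypothetical, and treating rare-but-costly failures as the ones worth monitoring is an assumption drawn from the remark that only critical, downtime-causing components merit sensors.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    failures_per_year: float
    cost_per_failure: float  # repair plus downtime, arbitrary currency units


def recommend(c, freq_threshold=4.0, cost_threshold=10_000.0):
    """Map a component to an action using two hypothetical thresholds."""
    frequent = c.failures_per_year >= freq_threshold
    costly = c.cost_per_failure >= cost_threshold
    if frequent and costly:
        return "redesign"        # costly and frequent: change the design
    if frequent:
        return "stock spares"    # cheap but frequent: keep spares on hand
    if costly:
        return "monitor"         # assumption: rare but costly justifies sensors
    return "no action"           # already near the low-frequency, low-cost ideal


# Invented example data, e.g. as it might be summarized from CMMS records.
components = [
    Component("spindle bearing", 0.5, 40_000.0),
    Component("coolant filter", 12.0, 150.0),
    Component("tool changer", 6.0, 25_000.0),
    Component("door switch", 0.2, 80.0),
]
recommendations = {c.name: recommend(c) for c in components}
```

In practice the thresholds would come out of the feasibility study rather than being fixed constants.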

In this context, monitoring is more than just collecting data. The methodology for monitoring critical components, as practiced at the IMS Center, emphasizes the following steps:

Use sensors to collect relevant signals. This often entails installation of additional sensors over and above those needed for normal machine operation.
Determine the machine or component's current level of health. Relate it to the relevant signals collected.
Streamline and preprocess the raw data collected from sensors. This step is designed to eliminate extraneous information and help machine intelligence concentrate on data that is most relevant.
Extract features or health indicators from this version of the data. Note that a feature, in this context, might be multidimensional.
Use these features to determine operating conditions, fault history, and normal and problematic working zones.
Use the previous information to calculate Confidence Values (CV). A CV near one represents a healthy condition, while a CV near zero indicates significant degradation.
Estimate the likelihood of failure in the near future by establishing the rate-of-change and trend of CVs.
Use all this information to present machine users with a simple chart displaying the target machine's level of health in a clear, understandable form. Transforming data into usable, timely information lets machine operators take action instead of guessing about the status of their machines.
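A minimal sketch of the feature, CV, and trend steps follows. It is not the Watchdog Agent's actual method: the RMS feature, the Gaussian-style mapping from a feature to a CV, the straight-line trend, and every function name are assumptions chosen only to make the idea concrete.

```python
import math
from statistics import mean


def rms(window):
    """Root-mean-square of one window of raw samples (an example feature)."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def confidence_value(feature, base_mean, base_std):
    """Assumed mapping: 1.0 at the healthy baseline, falling toward 0
    as the feature moves away from it (distance measured in std units)."""
    z = abs(feature - base_mean) / base_std
    return math.exp(-0.5 * z * z)


def linear_trend(values):
    """Least-squares slope and intercept of values against their index."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return slope, v_mean - slope * t_mean


def steps_until(values, threshold):
    """Steps after the last sample until the fitted line reaches threshold.
    Returns None when the CV is not trending downward."""
    slope, intercept = linear_trend(values)
    if slope >= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))


# Invented feature values drifting away from a baseline of 1.0 +/- 0.1.
cv_history = [confidence_value(f, 1.0, 0.1) for f in (1.0, 1.1, 1.2, 1.3)]
remaining = steps_until([1.0, 0.9, 0.8, 0.7], threshold=0.5)
```

A real deployment would likely replace the straight-line fit with one of the prediction tools in the toolbox, but the flow is the same: raw signal, feature, CV, then the trend of the CV.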

Currently it is not practical to build prognostics algorithms into the same controllers that run normal machine operations. Prognostics as practiced at the IMS Center takes place through the Watchdog Agent. Today the Watchdog Agent takes the form of a dedicated computer running a toolbox of algorithms. These algorithms can autonomously assess and predict the performance degradation and remaining life of machines and components. The resulting information can go to a closed-loop product life-cycle management system.

The Watchdog Agent provides machine-level intelligence. It includes modules for signal processing and feature extraction, diagnosis, performance prediction, and performance assessment.

Each module includes several algorithms that can be reconfigured according to application parameters. Specifically, the toolbox includes well-known algorithms such as self-organizing maps (SOM, a special kind of neural network), neural networks, Fourier transforms, support vector machines (a way of classifying input data to facilitate machine learning), fuzzy logic, logistic regression (a statistical regression model for variables that follow a Bernoulli distribution), hidden Markov models (a statistical model of a Markov process having some unknown parameters), Bayesian belief networks (models of variables and their probabilistic tendencies), match matrices (a classification scheme for showing whether one class of problem is being confused for another), autoregressive moving average (a way of understanding time-series data), and time-frequency analysis, in addition to many others.

The Watchdog Agent follows an open-architecture design. New tools can be added easily and, depending upon the application, some can even be disabled. Based on restrictions such as memory, processing power, and power consumption, tools can be reconfigured to suit many conditions.

Today, the tasks a Watchdog Agent performs are optimized for the situation at hand by a researcher who has studied the application. User input can be captured with a Quality Function Deployment (QFD) selection tool. The researcher selects the best tools from each module based on the user-defined application and on expert knowledge integrated into the QFD selection tool.

THE NEXT STEP: SELF-MAINTENANCE
The ability of a machine to adjust its own functions according to its health is an integral part of a self-maintenance paradigm. Self-maintenance requires both functional intelligence and health intelligence. Functional intelligence provides information about the current operating conditions for the health assessment system. Health intelligence evaluates current health and degradation rate and predicts the likelihood of failure. This information can be fed into a functional intelligence module — e.g., the machine controller — which can adjust the machine's operation accordingly.

Adjusting the machine's operating parameters is not the only way in which self-maintenance can be perceived. Another way to realize some degree of self-maintenance is by activating a self-tuning or self-service function. A copy machine is a typical example used in self-maintenance literature. It can adjust and recalibrate its use of toner when it detects performance degradation.

Self-maintenance need not be fully autonomous in the context of the manufacturing industry. There, the purpose of self-maintenance is to keep the machine running long enough for maintenance personnel to become available and schedule downtime. In a manufacturing setting, the maintenance crew can then fix the underlying problem. But there are other applications, such as in the aerospace industry, where human intervention is not possible. Here the goal is a higher degree of self-maintenance.

Sensors are, of course, key inputs for self-maintenance systems. But what if a sensor fails? Such a failure could be complete or incipient, and either kind will affect the sensor readings. There are currently two main approaches to solving the problem of sensor failures: hardware redundancy and analytical redundancy. The hardware redundancy approach is common in safety-critical systems such as nuclear power plants or airplanes. It uses additional sensors to detect problems in one sensor, or to provide a more accurate reading when one sensor fails by effectively replacing it. The major downsides to this approach are the additional cost and the additional space for extra sensors.
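One common way to realize hardware redundancy is median voting across three or more sensors measuring the same quantity. The sketch below is a generic illustration, not something described by the IMS Center; the `vote` function, the tolerance, and the readings are hypothetical.

```python
from statistics import median


def vote(readings, tolerance):
    """Fuse redundant readings and flag sensors that disagree with the rest.

    Returns (fused_value, suspect_indices). The median tolerates a single
    wild reading among three sensors, so it serves as the trusted value.
    """
    fused = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - fused) > tolerance]
    return fused, suspects


# Three invented temperature readings in degrees C; the third sensor has drifted.
fused, suspects = vote([71.8, 72.1, 95.4], tolerance=2.0)
```

With only two sensors a disagreement can be detected but not attributed, which is one reason the approach adds cost and space.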

The analytical redundancy approach starts with an analytical or experimental model of a system. It uses this model to detect inconsistencies in the system's behavior, which could be caused by sensor failure. Fewer sensors are needed to build an analytically redundant system; however, it is not always feasible to identify an analytical model.
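As a rough sketch of analytical redundancy, the example below compares a measured temperature against what a simple model predicts from machine load and reports where a sustained disagreement begins. The linear model, its coefficients, the threshold, and the data are all invented for illustration.

```python
def expected_temperature(load_fraction, ambient_c=25.0, rise_at_full_load_c=40.0):
    """Hypothetical model: temperature rises linearly with machine load."""
    return ambient_c + rise_at_full_load_c * load_fraction


def find_inconsistency(samples, threshold_c=5.0, min_consecutive=3):
    """Return the index where a sustained model/measurement mismatch starts.

    samples is a list of (load_fraction, measured_temp_c) pairs. Requiring
    several consecutive large residuals avoids reacting to a single noisy
    sample. Returns None if no sustained mismatch is found.
    """
    run = 0
    for i, (load, measured) in enumerate(samples):
        residual = measured - expected_temperature(load)
        if abs(residual) > threshold_c:
            run += 1
            if run >= min_consecutive:
                return i - min_consecutive + 1
        else:
            run = 0
    return None


# Invented data: readings agree with the model, then jump about 9 degrees C.
samples = [(0.5, 45.5), (0.6, 49.2), (0.6, 58.0),
           (0.7, 61.5), (0.7, 62.0), (0.8, 66.0)]
fault_start = find_inconsistency(samples)
```

A mismatch like this says only that the sensor and the model disagree; deciding whether the sensor or the machine is at fault takes further evidence.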

APPLICATIONS IN INDUSTRY
Currently the IMS Center has over 30 global company members and sponsors. They include AMD, Boeing, Borg-Warner, Bosch, Caterpillar, Chevron, DaimlerChrysler, Festo, Ford, GE Aviation, GM, Harley-Davidson, Honeywell, Komatsu, McKinsey & Co., National Instruments Corp., Omron, Parker Hannifin, Procter & Gamble, Rockwell Automation, Siemens, Toshiba, Toyota, and others.

The Watchdog Agent has been successfully implemented on several company test-beds. For instance, Harley-Davidson installed the Watchdog Agent on a Grob Inc. aluminum-cutting machine. The Watchdog Agent automatically converted sensor data to health information and predicted machine degradation, as well as remaining machine life.

In another project for Toyota, a Watchdog Agent predicted compressor surge. This helped Toyota save time and avoid downtime costs. All in all, the IMS Center's practices and test beds have demonstrated the viability of predictive maintenance as an emerging field with many promising applications.

MAKE CONTACT
IMS Center, www.imscenter.net

A Watchdog Agent in action typically runs algorithms selected from a toolbox. These algorithms are sophisticated routines that look at incoming sensor data and draw conclusions about the health of the machine. The results are used to predict the machine's degradation rate and trends, and to optimize maintenance actions based on the impact of failure.