They've come a long way, but do these machines have what it takes to make it in the real world?
In 1949, IBM had high hopes for its new line of digital computers. The company envisioned selling perhaps 10 to 15 machines in all. Today computers run our homes, offices, factories, hospitals, and airports. They operate our cars, control our appliances, entertain us, and help us make decisions. But the further computers push into our everyday lives, the more obvious it becomes that they are ill-equipped to deal with the real world.
The problem with computers is that they just don't think like we do. In fact, they don't think at all – they calculate. They solve problems and make decisions by plugging numbers into equations. And since much of the world is outside the realm of equations, it is also beyond the reach of computers.
A new breed of software, however, lets computers process information without explicit equations. Free at last, computers are beginning to make sense of the vagaries of the real world, and are learning to solve problems where even the brightest mathematicians fear to tread. Early applications of the new technology include more accurate medical diagnostic systems, smoother transmissions for cars, quieter vacuum cleaners, and cameras that recognize colors and shapes.
Ironically, the new software is based on programming techniques from decades ago, when computers were hulking number crunchers confined to environmentally controlled rooms. At that time, researchers were hoping to build machines that could "think" using artificial intelligence. Although they failed miserably, the researchers came up with several new techniques which, when combined with conventional software, provide a flexible set of tools that allow computers to tackle some of the problems occurring in everyday life.
Does not compute
We are surrounded by computers, yet many of us know little about them. How do they really work? What goes on inside? And why can they solve some problems, but not others?
It's not as if computers are all that complicated. Take away their modems, screens, and disk drives, and what's left is basically a calculator. These calculators, or microprocessors, do essentially one thing: they combine numbers using addition, subtraction, and Boolean logic operations. The fact that the operations can be programmed in software makes computers extremely useful tools.
Over the years, people have programmed computers to automate just about everything, from vehicle ignition systems to aircraft diagnostics. In practically every case, the heart of the software is a mathematical model that represents the decision or process being controlled. While this approach works well in many instances, sometimes it simply won't do.
There are many things that people would like to automate, but for which the math doesn't exist. Programmers cannot open a book and find an equation to predict the onset of cancer, for example. Nor is there an equation to calculate the quality of welds produced by a welding machine. In general, the more variables involved in a decision, the tougher it is to define in terms of math.
Math is also of little help where ambiguity and uncertainty are involved, and when decisions require judgment or instinct. But such is the real world, and the reason computers have yet to master it.
Then there are the practical limits of computers themselves. Even the highly touted Pentiums and PowerPCs have only so many bits and megahertz that they can throw at an equation. Consider the software for a factory robot. Suppose the robot must drill a hole every half second to within 0.001 in. The equations of motion, though complex, can probably be found in an engineering handbook. For argument's sake, say the robot fails its first field test because it takes too long to calculate the position of each hole. To speed up the machine, you reduce the number of bits in the equations from 16 to 12. Now the robot is fast enough, but the holes are out of spec because the machine is less precise.
Tradeoffs between speed and precision are common in automation, and are often related to a computer's clock frequency and the number of bits in its data paths. While faster, more expensive computers may be the answer in some cases, more often it turns out that automation is simply not an option.
Beyond the realm of math
Automation engineers are not the first to encounter such walls. Several years ago, scientists ran into similar problems trying to develop software that simulates human thinking. But rather than fall into the math trap, where computers are viewed as calculators, the researchers began to see computers as information processors. So instead of modeling thought using explicit equations, they developed new techniques to represent, process, and store information using the computer's ability to manipulate data.
From the start, research associated with this new field, called artificial intelligence (AI), was highly controversial. It triggered many debates, especially over what constitutes intelligence and whether machines should be allowed to think. Even the researchers disagreed. One group felt strongly that symbolic processing based on Boolean logic was the answer, while another group believed that the way to make computers think was by modeling biological neurons and the functions they perform.
The belief behind symbolic processing is that computers can simulate human intelligence by combining logic rules using Boolean operations. For example, a computer can act as if it knows when someone has a fever by executing two simple rules:
IF temperature is above 99° THEN fever
IF temperature is 99° or less THEN no fever
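In modern terms, those two rules boil down to a few lines of code. Here is a minimal sketch in Python; the 99° threshold comes from the rules above, and the function name is purely illustrative:

```python
def has_fever(temperature):
    """IF temperature is above 99 THEN fever; otherwise no fever."""
    return temperature > 99

print(has_fever(101.2))  # True
print(has_fever(98.6))   # False
```

However trivial, this is the essence of symbolic processing: knowledge encoded as crisp Boolean tests.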
This sort of programming was ideal for its time because it requires relatively simple hardware. Indeed, many computer circuits perform Boolean operations. Another plus is that writing software is largely a matter of thinking up logic rules, or gleaning the rules from qualified experts. Programmers may interview a doctor, for example, to develop software for diagnosing appendicitis.
Today, such programs, called expert systems, are used in tens of thousands of applications. They're so common, in fact, that you could probably go to your local software store and find a package that would let you set up and run your own.
Sorting and classifying are among the primary functions of an expert system, and have many practical applications. In cars, the capability may be used to decide whether an engine component needs servicing. The expert system would "ask questions" by reading data from a variety of sensors. It would then plug the data into a rulebase derived from an experienced technician. If the condition falls into the "needs service" category, the computer could take necessary action, advising the driver to get the car checked.
Defining the "service" category is not as easy as it may sound. In this case, a few dozen logic rules may suffice. But more often, categories into which we would like to sort things are quite complicated, requiring more than a few rules.
Suppose the software has to monitor the entire engine, not just one component. If anything happens, the car's computer must consider all possible causes. More than likely, a programmer would sit down with a technician and discuss common engine malfunctions on a case-by-case basis. The software probably would reflect this approach, comprising several modules each representing a different case.
Now when a problem occurs, the computer looks at all possible cases and determines which is the closest match. Such programs, called case-based reasoning systems, are just an extension of expert systems. They are widely used in many industries, particularly in customer-service departments, where service agents have to troubleshoot a variety of products over the phone.
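The "closest match" step can be sketched in a few lines. In this hypothetical example, each stored case is a vector of normalized sensor readings, and the reasoner simply returns the case nearest (by Euclidean distance) to the current readings; the fault names and numbers are invented for illustration:

```python
import math

# Hypothetical fault library: each case is a vector of sensor readings.
cases = {
    "worn spark plugs": [0.9, 0.2, 0.1],
    "clogged injector": [0.3, 0.8, 0.2],
    "low oil pressure": [0.1, 0.1, 0.9],
}

def closest_case(readings):
    """Return the stored case nearest to the current sensor readings."""
    return min(cases, key=lambda name: math.dist(cases[name], readings))

print(closest_case([0.85, 0.25, 0.15]))  # worn spark plugs
```

Real case-based reasoners use richer case representations and similarity measures, but the nearest-match idea is the same.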
Like expert systems, case-based reasoners (CBRs) are handy tools for processing information. Not only can they search through volumes of data in a fraction of the time it would take humans, they are also accurate and highly repeatable. What's more, both types of software can run interactively, playing the role of advisor, or they can just as easily be embedded in conventional software and run transparently.
Smoothing over the bumps
Although expert systems were supposed to make computers less dependent on math, in many cases they did just the opposite. The reason is that Boolean logic rules assume a polarized world, where everything is true or false, black or white, on or off.
But the world we know is colored by many shades of gray. To navigate these areas, we use a process called approximate reasoning. To endow computers with a similar ability, programmers resort to math.
Consider the reasoning involved in opening a window to cool a room. One possible rule might be, "IF I'm hot THEN I open the window wide." Another rule could be, "IF I'm warm THEN I open the window a crack." Using 10 to 20 such rules, most people can stay reasonably comfortable.
Now try automating a power window to perform similar reasoning. Although the same logic rules apply, they must be quantified to speak the language of the motors and sensors in the window. A typical rule might read, "IF temp > 75.0° (hot) THEN window opening = 10.0 in. (wide)."
But what happens when the temperature is 74.9°? No one can tell the difference between 74.9° and 75.0°, yet the control software is set to give a completely different response. Likewise, what would it matter if the window opens nine and a half inches instead of 10.0 in.? Depending on the motor, programming it to move exactly 10.0 in. may require additional sensors and is certain to chew up precious computer time.
Programmers needed a better way to quantify the meaning of human perceptions such as hot, warm, cold, and so on. The traditional approach, using equations or lookup tables to smooth each data point, is like taking a step backwards. A better approach, one with the smoothing function already built into the variable, is fuzzy logic.
Fuzzy logic stems from "fuzzy set" theory proposed by UC-Berkeley professor Lotfi Zadeh in 1965. Unlike conventional (crisp) set theory, where objects are either in or out of the set, Zadeh's theory allows objects to have partial membership. Each object has a membership value from zero to one, zero meaning non-membership and one meaning full membership. Crisp sets are just a special case of fuzzy sets.
It turns out that human perceptions like hot are more naturally defined by fuzzy sets than crisp ones. The crisp definition, any temperature above 75°, is simply unrealistic. A fuzzy set, on the other hand, is more consistent with what people have in mind when they say hot.
According to fuzzy nomenclature, hot is a "fuzzy variable." Measured temperatures relate to hot through a membership function, a distribution of membership or truth values from zero to one. Temperatures above 80°, for example, would have a membership value of one (definitely hot), and temperatures below 70° would have a membership value of zero (definitely not hot). A linear or curved segment joins the two lines. Other fuzzy variables for temperature may include very cold, cold, warm, and very hot.
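Using the 70° and 80° figures above and a linear segment between them, the membership function for hot might be sketched like this:

```python
def hot_membership(temp):
    """Membership in the fuzzy set 'hot': 0 at or below 70 degrees,
    1 at or above 80 degrees, with a linear ramp in between."""
    if temp <= 70:
        return 0.0
    if temp >= 80:
        return 1.0
    return (temp - 70) / 10.0

print(hot_membership(65))  # 0.0 (definitely not hot)
print(hot_membership(75))  # 0.5 (somewhat hot)
print(hot_membership(85))  # 1.0 (definitely hot)
```

A temperature of 75° is thus "half hot" rather than flipping abruptly from not-hot to hot at a single threshold.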
Fuzzy logic also defines arithmetic operations that can be used to process fuzzy variables. "Inference" methods determine rule outcome, and "defuzzification" methods convert the outcomes to an analog value. Using fuzzy logic, computers can finally make sense of statements like, "IF I'm hot THEN I open the window wide."
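Returning to the power window, one full inference cycle might look like the sketch below. Two rules fire to partial degrees, and a weighted-average (centroid-style) defuzzification blends their outputs into one analog opening. The membership ramps and the 2-in. and 10-in. outputs are illustrative assumptions, not figures from any real controller:

```python
def warm(t):
    # Ramps up from 65 to 75 degrees, back down from 75 to 80 (assumed shape).
    return max(0.0, min(1.0, (t - 65) / 10.0, (80 - t) / 5.0))

def hot(t):
    # Ramps up from 70 to 80 degrees (assumed shape).
    return max(0.0, min(1.0, (t - 70) / 10.0))

def window_opening(t):
    """Blend two rules: IF warm THEN open a crack (2 in.);
    IF hot THEN open wide (10 in.)."""
    truths = [warm(t), hot(t)]
    outputs = [2.0, 10.0]  # inches, one crisp output per rule
    total = sum(truths)
    if total == 0:
        return 0.0  # neither rule fires: leave the window closed
    return sum(w * o for w, o in zip(truths, outputs)) / total

print(window_opening(74.9))  # nearly identical to the 75.0 result
print(window_opening(75.0))
```

Note that 74.9° and 75.0° now produce almost the same opening, exactly the smooth behavior that crisp thresholds could not deliver.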
Many of the benefits of fuzzy logic are due to the fact that the software models the decision maker, not the process. This is particularly an advantage if the process you're trying to control is nonlinear or subject to constant variation. And since accuracy doesn't depend on a process model and fuzzy math tolerates imprecision, sensor feedback is less critical and noise and component variations are not as big a concern as in conventional control.
In many respects, fuzzy logic is a return to the past. Proponents say that it achieves analog control using digital means. Apparently camcorder makers would agree. When they added fuzzy logic to autofocusing motors, the number of focusing speeds jumped from three to 127. What's even more incredible is that the motors and chips didn't change, only the software. Now the autofocusing mechanism is faster, quieter, and nearly undetectable.
Not everyone in the '60s bought into the concept of symbolic processing. A small but dedicated group believed that the only way to build a thinking machine was by modeling biological neurons. They became known as "connectionists," and are recognized today as the founders of a special branch of computing called neural networks.
Neural networks or "nets" are best known for their use in pattern recognition. On Wall Street, analysts use neural nets to predict stock patterns, for example. The software is also used in palm-top PCs to interpret handwritten characters, and on assembly lines at Ford Motor Co. to diagnose engines.
Neural nets are ideal for these applications because they don't require explicit equations, and because they can produce reasonable answers even when input data are noisy or incomplete. The latter quality, called robustness, allows programmers to automate more and more applications in diagnostics and control.
So how do neural nets work? In most cases, a neural network is just a program running on an ordinary computer. The software consists of hundreds of simple equations that process information. Each equation – often called a processing element or neuron – sums dozens of weighted inputs and calculates an output between zero and one. The outputs may be fed to any number of elements in the network, creating a highly interconnected grid.
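A single processing element is simple enough to write out. This sketch uses the common logistic (sigmoid) squashing function to keep the output between zero and one; the weights and inputs are arbitrary:

```python
import math

def neuron(inputs, weights, bias):
    """One processing element: a weighted sum of inputs,
    squashed into the range (0, 1) by a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

print(round(neuron([0.5, 0.2], [1.0, -1.0], 0.0), 3))  # 0.574
```

A network is just hundreds of these elements, with each output feeding the inputs of others.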
These networks go through a "learning" process before they can do useful work. What happens is that a programmer feeds data samples into the network, and certain variables in the equations change to reflect correlations in the data. The variables, called connection weights, are simply multipliers on each input connection.
In essence, connection weights are the coefficients of a multivariable system of equations that evolves to map input data to output data. As a network "learns," the connection weights are incrementally adjusted to get a better match between calculated and known data.
Neural nets have different properties, depending on how they are trained. Some train only on input data and are particularly good at spotting similarities. The dynamics of these networks are sensitive to repetition, allowing them to evolve equations influenced by natural clustering in the data.
Clustering can determine how input samples match up with average signals or distributions, for example. Such comparisons are particularly helpful in data compression and validation.
Consider a neural net used to compress electrocardiogram data. The motivating factor here is that patients often wear heart monitors for 24 hours at a time, necessitating large memory chips to store data from over 100,000 heartbeats. By contrast, the neural data compressor stores only deviations from an average heartbeat, packing 30 times more information into the same space.
Another way to train a network is to present input samples with corresponding outputs. For each data pair, the network reads the input and estimates the output. It then compares its estimate with the correct answer and uses the error to adjust the connection weights. Most networks train this way, a process called supervised learning.
The goal of this is to teach networks how to calculate outputs for any related input. Networks that get to this point are said to "generalize." This means they can interpolate between training cases and find approximate answers for new inputs.
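Supervised learning on even a single neuron shows the mechanism. In this delta-rule sketch, a sigmoid unit is trained on a handful of temperature/label pairs (invented for illustration) and then generalizes to a temperature it never saw; the learning rate, scaling, and epoch count are arbitrary choices:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative training pairs: temperature -> 0 (not hot) or 1 (hot).
pairs = [(60, 0), (65, 0), (70, 0), (80, 1), (85, 1), (90, 1)]

w, b, rate = 0.0, 0.0, 0.01
for _ in range(5000):
    for temp, target in pairs:
        x = (temp - 75) / 10.0          # scale inputs to a small range
        out = sigmoid(w * x + b)
        err = target - out              # compare estimate with the answer
        w += rate * err * x             # incrementally adjust the weight
        b += rate * err                 # ... and the bias

# Generalization: 82 degrees was never in the training set.
print(sigmoid(w * ((82 - 75) / 10.0) + b))  # well above 0.5, approaching 1
```

The network never stores the training cases themselves; the knowledge lives entirely in the adjusted weights.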
Networks that generalize may seem like they're exercising intelligence, but what they're really doing is classifying data. Classification – matching signals to learned patterns – is at the root of most neural net applications.
In addition to clustering and classifying data, neural nets are also adept at generating complex signals. Applications such as speech synthesis and robotic control take advantage of this skill.
Neural programmers typically attack pattern-generation problems by breaking them into smaller parts. Networks that "talk," for example, connect letters with phonemes in rapid succession to form words. Think of the network as a speech piano: the letters are the keys (inputs) and the phonemes are the chords (outputs). Control applications are "played" in a similar manner, except that the inputs (sensor signals) stimulate motion (output) instead. Speed is the main advantage in these applications because neural nets make connections almost instantaneously.
Neural networks are not without drawbacks, however. Even the experts agree that the technology is still more of an art than a science. One problem is that neural nets conform to the data on which they train. If the data are bad, the results will be bad.
There's also some uncertainty about how long a network should train. If a network is undertrained it won't give good answers. If it's overtrained, it "memorizes" the training cases and loses its ability to generalize. Fortunately, most of these questions can be answered through trial and error.
It's in the genes
Another new branch of computing inspired by biological science is genetic algorithms. As its name implies, this new type of software is based on the principles of recombinational genetics. Emulating the natural selection process, genetic algorithms search through data (possible answers to problems), find the best solutions, and combine their outstanding features to evolve optimum solutions. Applications range from routing and scheduling to system design.
Genetic algorithms take advantage of a computer's ability to tirelessly search through mountains of data and grind out repetitive calculations. In return, the software endows computers with the power to generate new solutions from existing ones, in a sense, "thinking" creatively.
Perhaps the most successful application of genetic algorithms so far is its use in engine design at General Electric. Essentially, the software automates the "what-if" process that engineers go through as they search for better designs.
In engine design, optimization is usually measured in terms of fuel efficiency, weight, cost, and structural integrity. Optimizing these attributes simultaneously is a complex process, requiring engineers to consider hundreds of design variables, including dimensions, material types, machining processes, and so on. Typically, engineers start with a known design then add variations to gauge their effect. They keep the good variations and drop the bad variations. The idea is to look at as many variations as possible (within an allotted time) to find the best combinations.
GE's genetic algorithm, dubbed EnGENEous, takes a similar approach. By changing design variables and measuring the effect on engine performance – calculated using conventional CAD programs – the software considers thousands of possible designs. In the process, it captures the most outstanding elements, combining them in an optimum solution. In a recent test based on the design of turbine components, the software beat out an experienced engineer, coming up with a design 0.92% more efficient.
Although there are many variations on genetic algorithms, they all share certain qualities. The fact that they all run on computers, for example, means they can only deal with information if it's reduced to computer code. Most algorithms represent information as binary strings of fixed length. The strings usually represent cause-and-effect conditions, such as design variables and calculated performance measures.
To emulate natural selection, genetic algorithms must have some sort of criteria by which to rank binary strings. In other words, they need to know how to recognize a good solution from a bad one. The criteria may be nothing more than an equation comparing two numbers, or it could be the qualitative judgment of a human observer registering his or her preference by clicking a mouse. Once the genetic algorithms find the good strings, they also need a systematic way to combine them to form new ones with better qualities.
The entire "evolutionary" process involves three steps: selection, crossover, and mutation. In the selection phase, the software pulls the best strings out of a finite population. Typically, strings compete against each other in tournament fashion. After a few rounds, only the fittest strings remain. From these strings come new ones through a process called crossover.
Crossover is analogous to what takes place over generations of reproduction. It involves two strings from the preferred "mating" population. The strings exchange one or more bits from random sites, creating two new strings. The new strings then form a new population, possibly for another round of selection and crossover. By performing such number games, genetic algorithms are able to consider myriad solutions to any problem.
Mutation, the final step, is necessary to avoid falling into a common trap associated with automated searches. A computer searching for an optimum solution is like a hiker walking blindfolded through a mountain range, searching for the highest elevation. Each step has to be evaluated on the basis of whether or not it increases elevation. Eventually, the hiker reaches a point where every subsequent step decreases elevation, implying that he has found a peak. Unless the hiker is very lucky, however, there are likely to be even higher peaks nearby.
Computers face a similar problem. It's quite possible for an algorithm to converge on a local optimum while higher peaks exist elsewhere in the search space. Mutation prevents this from happening by occasionally selecting a string and altering one of its bits. This simple step not only keeps genetic algorithms out of local optima, it also promotes diversity while preserving preferred features.
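All three steps can be sketched on a toy problem. Here the "fitness" of a binary string is simply its count of 1 bits, standing in for whatever ranking criterion a real application would use; the population size, rates, and string length are arbitrary:

```python
import random

random.seed(1)                     # fixed seed for a repeatable run
LENGTH, POP, GENERATIONS = 20, 30, 60

def fitness(s):
    """Toy criterion: more 1 bits is better."""
    return s.count(1)

pop = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENERATIONS):
    new_pop = []
    while len(new_pop) < POP:
        # Selection: each parent wins a two-string tournament.
        p1 = max(random.sample(pop, 2), key=fitness)
        p2 = max(random.sample(pop, 2), key=fitness)
        # Crossover: swap all bits beyond a random site.
        site = random.randrange(1, LENGTH)
        child = p1[:site] + p2[site:]
        # Mutation: occasionally flip one bit to preserve diversity.
        if random.random() < 0.05:
            i = random.randrange(LENGTH)
            child[i] ^= 1
        new_pop.append(child)
    pop = new_pop

print(max(fitness(s) for s in pop))  # best string found
```

After a few dozen generations, the best strings approach the all-ones optimum without the algorithm ever being told what the optimum looks like.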
These qualities are probably best demonstrated by a genetic algorithm developed to help crime witnesses recall a perpetrator's face. The software, called Faceprints, begins by displaying several faces on a computer screen, asking the witness to rank each one. Next, using selection, crossover, and mutation, the program generates new faces, emphasizing outstanding features from the previous set. The algorithm uses five different input strings, representing the mouth, hair, eyes, nose, and chin. Researchers at New Mexico State University, where the software was developed, claim that Faceprints is faster and more accurate than traditional reconstruction techniques. Note, however, that like all forms of computer intelligence, it still takes a human touch to make it work.