The self-aiming camera at the breadboard stage in a UI lab. The foreground camera and microphones supply input to the camera mounted on a rotating table, which tracks targets.
A leaf rustles and a flicker of light catches your eye. Instantly, your head rotates and your gaze shifts to lock on the target. Human brains do such reckoning automatically, but now researchers at the University of Illinois at Urbana-Champaign can mimic the reaction electronically.
The group steers a self-aiming camera with a computer neural network modeled after the superior colliculus of the human brain. "The superior colliculus serves as the visual reflex center," explains Sylvian Ray, a UI professor of computer science and a researcher at the Beckman Institute for Advanced Science and Technology. "It is the primary agent for deciding which direction to turn the head in response to sensory stimuli such as visual and auditory cues."
Here's how the system works: One camera looks for motion by comparing successive video frames while a pair of omnidirectional microphones monitors audio signals. A program separates reflected sound from the desired direct sound waves by listening only for wave-energy onset (after the echoes from the previous signal die out), explains Ray. A sound-location algorithm analyzes the sounds and sends the information to the neural network. The network then computes the target position and rotates a second camera, mounted on a servomotor-driven table, to acquire the target. Sound need not be located precisely. Ballpark estimates are OK, says Ray. The target image is transmitted to a human operator for further analysis. A computer with a 200-MHz processor runs the system.
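To make the two sensing steps concrete, here is a minimal Python sketch of frame comparison and of a two-microphone bearing estimate. It is an illustration under stated assumptions, not the researchers' code: the function names, the brightness threshold, the 60-degree field of view, and the far-field delay formula are all assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def motion_centroid_column(prev_frame, curr_frame, threshold=30):
    """Return the mean column of pixels that changed by more than
    `threshold` between two grayscale frames, or None if none changed.
    Frames are lists of rows of integer brightness values."""
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for col, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) > threshold:
                total += col
                count += 1
    return total / count if count else None


def column_to_azimuth_deg(column, frame_width, field_of_view_deg=60.0):
    """Map a column index to an approximate azimuth in degrees.
    Zero is the image center and positive is to the right.
    A linear mapping is assumed for simplicity."""
    center = (frame_width - 1) / 2
    return (column - center) / frame_width * field_of_view_deg


def bearing_from_delay_deg(delay_s, mic_spacing_m):
    """Estimate a bearing from the arrival-time difference between two
    microphones, assuming a distant source (far-field approximation)."""
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))
```

Both functions return only a rough direction, which fits Ray's remark that ballpark estimates are good enough to steer the second camera.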
Though the self-aiming camera can be attracted by either sight or sound, the combination of the two offers a much stronger stimulus. Neuroscientists say this is how the brain processes multiple signals from the same direction. Even weak signals elicit a large response, provided they are coincident. Coincident (superimposed) signals make for a more interesting target, one that tells the brain where to turn the head. And like a human brain, the neural network learns what to respond to and what to ignore.
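The effect of coincident signals can be sketched with a toy model. In the Python example below, each list holds the stimulus strength per direction bin, and a multiplicative term boosts bins where sight and sound agree. The combination rule, the `gain` value, and the `threshold` are assumptions made for illustration; the article does not describe how the researchers' network combines its inputs.

```python
def combined_salience(visual, auditory, gain=4.0):
    """Combine per-direction visual and auditory strengths.
    The product term is an assumed stand-in for multisensory
    enhancement: it is nonzero only where both cues coincide."""
    return [v + a + gain * v * a for v, a in zip(visual, auditory)]


def pick_target(visual, auditory, threshold=0.5, gain=4.0):
    """Return the index of the most salient direction bin, or None
    if no bin reaches the response threshold."""
    salience = combined_salience(visual, auditory, gain)
    if not salience:
        return None
    best = max(range(len(salience)), key=salience.__getitem__)
    return best if salience[best] >= threshold else None


# Two weak cues from the same direction (bin 1) versus one stronger
# visual cue on its own (bin 3).
visual = [0.0, 0.3, 0.0, 0.4]
auditory = [0.0, 0.3, 0.0, 0.0]
target_bin = pick_target(visual, auditory)
```

In this toy setup, the two weak coincident cues in bin 1 outscore the stronger lone visual cue in bin 3, and the visual input by itself stays below the response threshold.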
For instance, during infancy, the superior colliculus helps a baby's brain associate external direction with an internal visual reference grid, mapping a mother's moving lips to the sound of her voice. Similarly, the researchers' model learns to align its sound-source location processing with an embedded visual map. "As the system learns to correctly locate both sound and visual sources, it also learns what types of objects are preferred targets," says Ray. "We want to teach it to ignore common objects and focus on unusual sounds or visual motions." Look-up libraries of sights and sounds could let the system differentiate between an aircraft on the horizon and a flock of birds, for example. And stimuli aren't limited to light and sound. With the proper sensors, the system could be programmed to respond to heat, infrared, or other inputs.
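One simple way to picture the alignment step is an error-correcting fit that nudges sound-based bearings toward the directions confirmed by vision. The Python sketch below uses a least-mean-squares update on a linear correction. The linear form, the learning rate, the scaling constant, and the function names are assumptions for illustration, not a description of the researchers' neural network.

```python
SCALE_DEG = 90.0  # assumed scale: keeps inputs near [-1, 1] for stable updates


def train_alignment(pairs, learning_rate=0.5, epochs=500):
    """Fit visual_deg ~= gain * auditory_deg + bias_deg from paired
    observations of the same target, using least-mean-squares updates.
    `pairs` is a list of (auditory_deg, visual_deg) tuples."""
    gain, bias = 1.0, 0.0
    for _ in range(epochs):
        for auditory_deg, visual_deg in pairs:
            x = auditory_deg / SCALE_DEG
            y = visual_deg / SCALE_DEG
            error = y - (gain * x + bias)  # mismatch between the two maps
            gain += learning_rate * error * x
            bias += learning_rate * error
    return gain, bias * SCALE_DEG


def corrected_bearing(auditory_deg, gain, bias_deg):
    """Apply the learned correction to a raw sound-based bearing."""
    return gain * auditory_deg + bias_deg
```

Each time a target is both seen and heard, the mismatch between the two estimates shrinks the error for later sound-only detections, loosely mirroring how the model is said to align its sound-source processing with the visual map.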
One possible application for self-aiming cameras is distance learning. Here, one camera could follow the speaker while another points at the audience, automatically zeroing in on a student raising a hand to ask a question. The funding agency, the Office of Naval Research, sees self-aiming cameras one day serving as intelligent cyber sentinels in military applications.