Products everywhere have gotten so small that many require automated systems designed specifically to feed, align, and fasten small, complex 3D parts. A key tool in these systems is vision guidance, the use of machine vision to locate parts for a robot to access. Consumer demand for digital and cell phone camera technology has greatly reduced the cost of high-resolution industrial cameras, so they can be economically integrated into automated systems. However, these cameras can be put to use in different ways.

Traditional vision guidance is an excellent way to find parts, but its usefulness in small, high-precision assembly operations is limited by the accuracy of the robot it controls. For example, placing a 1-mm² laser diode or a semiconductor disk-drive read head requires accuracy down to one or two microns. For placement accuracies in this range, traditional machine vision requires extremely expensive, high-accuracy robots. What's more, the robot's ability to move to an absolute XYZ position commanded by traditional vision is limited by manufacturing variations, thermal expansion, and a host of other mechanical effects.

With visual servoing, which closes the robot's position loop using visual feedback, robot placement accuracies are based on encoder resolution rather than absolute accuracy. So, applications such as placing laser diodes into DVD read heads or teaching wafer slot positions in a semiconductor wafer carrier can be effectively automated using low-cost robots with limited intrinsic absolute accuracy. The only requirement is good robot move resolution.

Threading the needle

In a traditional vision guidance application, the vision system captures a single image and locates a target's position in world coordinates. The system then transmits these coordinates to a robot and relies on the mechanism to move accurately to the target position. This is analogous to a person looking at a part and then grabbing it with eyes closed: if the part is big enough, the method may succeed. For finer operations, however, this approach is not effective.

So how else can machine vision operate? Just as a human performs finer tasks, such as threading a needle: by looking at both objects simultaneously, determining relative distance and direction, and moving the thread toward the eye of the needle accordingly. In fact, this is how robotic visual servoing works. The vision system takes a picture and analyzes the relative positions of actuator and target. Instead of sending the robot a single motion command in world coordinates, the software sends a series of incremental distance and direction motion commands. As the motion is executed, more pictures are taken. The software continuously analyzes the new images and updates the motion commands accordingly, until the vision system confirms that the task is accomplished.
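The needle-threading loop described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: the camera, the robot interface, the part and target locators, and the calibration and tolerance constants are assumptions for illustration, not a real vision API.

```python
# Minimal sketch of a visual-servoing loop. The camera model, part/target
# locators, and robot interface are hypothetical stand-ins, not a real API.

MM_PER_PIXEL = 0.01   # assumed camera calibration scale (mm per pixel)
TOLERANCE_MM = 0.002  # stop when part is within ~2 microns of target

def servo_to_target(robot, camera, locate_part, locate_target):
    """Drive the robot with incremental moves until vision confirms arrival."""
    while True:
        image = camera.capture()
        part_px = locate_part(image)      # (x, y) of part in pixels
        target_px = locate_target(image)  # (x, y) of target in pixels
        dx = (target_px[0] - part_px[0]) * MM_PER_PIXEL
        dy = (target_px[1] - part_px[1]) * MM_PER_PIXEL
        if (dx * dx + dy * dy) ** 0.5 <= TOLERANCE_MM:
            return  # vision confirms the task is accomplished
        robot.move_relative(dx, dy)  # incremental distance-and-direction command
```

Note that `move_relative` need not execute perfectly; the next image reveals any shortfall, and the following command corrects it.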

Resolution, not accuracy

In a traditional vision guidance system, it's the robot's responsibility to move to the commanded location accurately and in a repeatable manner. The success of this process relies on many factors:

  • An accurate knowledge of the camera's position relative to the robot
  • An accurate knowledge of where the camera is looking (field of view)
  • The robot's ability to accurately move to a commanded XYZ position after taking into account the effects of thermal expansion, runout, backlash, drive train wear, and manufacturing tolerances in the straightness and perpendicularity of its links

Because there's no verification that the motion is executed correctly, the system must assume the process is successful and progress to the next step. However, even the most accurate mechanisms require periodic recalibration of the entire system. Visual servoing replaces a single world-coordinate command with a series of distance and direction commands, so the need for absolute accuracy is greatly reduced.

Here's how it works. The robot executes a motion by traveling a specified incremental distance. As the motion progresses, the system continuously takes pictures, updating the motion command with every image. Since the motion continues until the vision system confirms that the target has been reached, it doesn't matter if the robot is unable to move exactly to each requested position. The visual servoing loop adjusts for any differences. Thus, as long as the resolution of the camera capturing the images is as good as (or better than) the resolution of the robot's encoders, the system can achieve placement accuracies equal to the encoder resolution. Visual servoing also confirms that motions are properly executed before advancing to the next step.
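This resolution-versus-accuracy argument can be illustrated with a toy simulation: a robot whose every move overshoots by 2% (poor absolute accuracy) but whose encoders resolve 1-micron steps. The scale error and encoder resolution below are illustrative assumptions, not measured values.

```python
# Toy model: a robot with a 2% scale error (poor absolute accuracy)
# but 1-micron encoder resolution. Values are illustrative assumptions.

ENCODER_RES_MM = 0.001  # robot resolves 1-micron increments
SCALE_ERROR = 1.02      # every move overshoots by 2%

def move(pos, commanded):
    """Execute a move: scale error applied, result quantized to encoder steps."""
    actual = commanded * SCALE_ERROR
    steps = round(actual / ENCODER_RES_MM)
    return pos + steps * ENCODER_RES_MM

def open_loop(target):
    """Traditional guidance: one world-coordinate command, no feedback."""
    return move(0.0, target)

def servoed(target, max_iters=50):
    """Visual servoing: repeatedly correct the remaining error seen by vision."""
    pos = 0.0
    for _ in range(max_iters):
        error = target - pos
        if abs(error) <= ENCODER_RES_MM:
            break
        pos = move(pos, error)  # incremental correction
    return pos
```

For a 5-mm target, the single open-loop command misses by roughly the 2% scale error (about 0.1 mm), while the servoed motion converges to within about one encoder step.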

End effectors made easy

Because traditional vision-guidance systems observe only target locations, and not parts as they are picked up, it falls to the robot to grip every part in an identical manner. This is nearly impossible with low-cost end effectors such as vacuum grippers, which is why traditional systems must rely on expensive or custom grippers to hold parts repeatably.

In contrast, visual servoing systems track instantaneous relative position, so parts can be gripped in a range of positions or orientations. This is particularly useful where parts and targets change in real time, such as during the insertion of sutures into suture-needle holes. As a part or target changes position, the vision system detects the variation and adjusts accordingly.

Variation in camera position

Traditional vision guidance requires an accurate calibration between the camera's field of view and the robot's coordinates. If the camera is mounted on a stand, the stand is susceptible to the same thermal expansion and aging effects as a robotic mechanism. Likewise, even on-robot mounts can shift over time.

With visual servoing, as long as the relationship between pixels and real distance stays relatively constant, small disturbances in camera mounting do not affect system accuracy. Visual servoing provides dynamically adaptive calibration, so the need to recalibrate is minimized.
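One way to realize this kind of adaptive calibration is to occasionally re-estimate the pixel-to-distance scale from a short probe move: command a known small motion, observe how far the part shifts in the image, and update the scale. The sketch below assumes hypothetical robot and vision interfaces.

```python
# Sketch of dynamically re-estimating the pixel-to-distance scale.
# The robot, camera, and locator interfaces are illustrative assumptions.

def recalibrate_scale(robot, camera, locate_part, probe_mm=0.1):
    """Return mm-per-pixel estimated from a short probe move along X."""
    before = locate_part(camera.capture())  # (x, y) of part in pixels
    robot.move_relative(probe_mm, 0.0)      # known small commanded move
    after = locate_part(camera.capture())
    shift_px = after[0] - before[0]         # observed image-space shift
    return probe_mm / shift_px              # updated calibration scale
```

Running such a probe periodically keeps the pixel-to-distance relationship current without halting production for a full calibration.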

Lens and lighting

Visual servoing requires that both the target location and the part in the end effector be viewed simultaneously. Because these two objects are often in different Z planes, the choices of lens and lighting are very important. A lens with a large depth of field is useful, but it's also necessary to choose a focal distance that brings the part, the target, or an area between the two into focus. In any case, setups in which one or both objects go slightly out of focus necessitate vision software with an extensive toolkit.

Cycle time critical

The cycle time from image capture to robot motion is critical. Because systems must process multiple images for every assembly operation, they should be able to complete an image-processing-to-robot-motion cycle in less than 100 ms. Faster processors and improved vision algorithms make these speeds possible. With a fully integrated system, communication delays are as low as 2 ms, and (with efficient messaging) as many as 17 frames/sec can be processed. On the other hand, a visual servoing system created from mixed-and-matched components typically suffers vision-to-motion-system communication delays of about 50 ms. These delays, added to processing and motion time, reduce throughput to 5 to 10 frames/sec.
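The quoted frame rates follow from a simple cycle-time budget. In this sketch, the per-frame processing-plus-motion time of 57 ms is an assumed round number chosen so that the integrated-system case matches the 17 frames/sec figure; only the 2-ms and 50-ms communication delays come from the text.

```python
# Back-of-envelope frame-rate budget: an integrated system with ~2 ms
# communication delay vs. a mixed-and-matched one with ~50 ms. The 57 ms
# processing-plus-motion time per frame is an assumed value.

def frames_per_sec(comm_delay_ms, processing_ms):
    """Achievable servo rate given per-cycle communication and processing time."""
    cycle_ms = comm_delay_ms + processing_ms
    return 1000.0 / cycle_ms

print(frames_per_sec(2, 57))   # integrated system: ~17 frames/sec
print(frames_per_sec(50, 57))  # mixed-and-matched: falls into the 5-10 range
```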

To learn more about Precise Automation, visit preciseautomation.com. To read more about machine vision systems, how they work, and how to select critical components, visit motionsystemdesign.com's Knowledge FAQtory and look for links that will connect you to related articles and information.