New to machine vision? Here’s a quick primer.
A standard machine-vision system consists of five basic parts: the camera, optics, illumination or lighting, the image-acquisition hardware, and the machine-vision software. While some people tend to group optics along with the camera, it really is a stand-alone topic. However, the choice of camera plays a role in the choice of optics and vice versa.
The selection of a camera is determined by the object the camera views and the type of image. Most installations use visible-light cameras, but special applications may demand infrared sensitivity. Infrared cameras typically handle heat-mapping applications but also work well when ambient light may vary widely and analysis is sensitive to the change. Though designed for visible-spectrum light, many CCD-based cameras also work in the infrared spectrum.
The selection of imaging sensor or imager used in the camera is not as critical as it once was. The old vidicon tubes have given way to solid-state imaging by charge-coupled devices (CCD) and complementary metal-oxide semiconductor (CMOS) technology. While CCDs held sway over CMOS in the past, neither sensor today is superior to the other. Both have strengths and weaknesses that give each advantages in specific situations. CCDs are the venerable workhorse of solid-state imagers, creating the standard for image quality in photographic, scientific, and industrial applications. CMOS imagers are rapidly gaining ground through the integration of processing circuitry on the same chip as the imager and lower power needs.
More critical than the choice of imager type is its resolution. Each imager is made up of an XY grid of photosites. Each photosite corresponds to a picture element or pixel, the smallest part of an image. Typical pixel resolution today is 640 × 480 pixels, but imagers can run from 128 × 96 to 7,216 × 5,412 pixels. Cameras with higher resolutions see greater details over larger areas. The trade-off is in the speed of processing the image. Larger images take longer to process, thus vision response slows as image resolution grows.
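To put the resolution trade-off in numbers, the short Python sketch below counts the pixels in the three resolutions quoted above and compares each with the typical 640 × 480 imager. It assumes, as a simplification, that processing effort grows roughly in proportion to pixel count; real algorithms may scale differently. The function name is illustrative only.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of photosites (pixels) in a width x height imager."""
    return width * height


# Resolutions quoted in the text.
resolutions = {
    "smallest": (128, 96),
    "typical": (640, 480),
    "largest": (7216, 5412),
}

baseline = pixel_count(*resolutions["typical"])
for name, (w, h) in resolutions.items():
    n = pixel_count(w, h)
    # The ratio is a rough proxy for relative processing load
    # (assumption: work scales linearly with pixel count).
    print(f"{name:>8}: {w} x {h} = {n:,} pixels ({n / baseline:.2f}x typical)")
```

Under that assumption, the largest imager carries roughly 127 times the data of the typical one for every frame.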
Camera outputs designed to the EIA RS-170 video standard used analog signals viewable on any standard video monitor. But RS-170 was developed for the U.S. broadcast-television industry, and thus was not optimized for machine-vision applications. Video output was limited to 30 frames/sec, and the images were not synchronized with the operation of the machine.
To overcome the RS-170 limitations, early machine-vision systems used proprietary standards established by the vision-system manufacturer. This locked users of these early systems into that manufacturer for upgrades and service. A change in conditions or parameters of the vision system could force a move to an entirely new system from a different maker.
Cameras today use industry-standard digital outputs for image acquisition. The most common digital connection is IEEE-1394, also known as FireWire. FireWire provides a 400-Mbps bidirectional digital connection between the camera and vision hardware. Because FireWire is a true digital interface, commands can be sent back to the camera to synchronize the capture of a video image at a specific time, say, as a part moves into focus. Other digital interfaces include USB 2.0 and Ethernet, though the latter is more closely associated with smart cameras than with image acquisition.
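As a rough check on what a 400-Mbps link can carry, the sketch below estimates an upper bound on frame rate for uncompressed images. It assumes 8-bit monochrome pixels and ignores all protocol overhead, so real throughput will be lower. The function name and defaults are illustrative only.

```python
def max_frame_rate(link_bps: float, width: int, height: int,
                   bits_per_pixel: int = 8) -> float:
    """Upper bound on frames/sec for uncompressed images over a link.

    Assumes the whole link rate is available for pixel data
    (no protocol overhead), which is optimistic.
    """
    bits_per_frame = width * height * bits_per_pixel
    return link_bps / bits_per_frame


FIREWIRE_BPS = 400e6  # 400-Mbps link rate quoted in the text

for w, h in [(640, 480), (1280, 960)]:
    fps = max_frame_rate(FIREWIRE_BPS, w, h)
    print(f"{w} x {h}, 8-bit mono: at most {fps:.1f} frames/sec")
```

Doubling both image dimensions quadruples the data per frame and cuts the achievable frame rate to one quarter.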
The height and width of the area seen by the camera define the camera’s field of view. Its size is controlled by the lens or optics used with the camera. Some cameras come with a fixed-lens system, in which the lens is permanently attached and has a fixed magnification or zoom. These cameras are purchased for specific applications that are not expected to change. More versatile cameras have removable lenses so the optics can change to match differing conditions.
Image quality can at best be only as good as the lens that imaged the scene. The basic parameters for selecting a lens are the f-stop range, the focal length, the zoom or magnification factor, and whether to use a telecentric or conventional lens.
The f-stop rating specifies the amount of light that can pass through the lens. The lower the f-stop number, the more light the lens lets through to the imager. Low-light operations demand low f-stop numbers, while areas with high brightness get by with a higher f-stop. F-stop also comes into play with the camera image acquisition speed. Cameras imaging many times per second need a lower f-stop value to admit more light over a shorter period of time. Likewise, zoom lenses with long focal lengths also require lower f-stops as they have less light entering the lens.
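The link between f-stop and light can be made concrete with the standard photographic relationship that the light reaching the imager varies with the inverse square of the f-number. The sketch below applies that rule; the function name is illustrative, and the result ignores lens transmission losses.

```python
def relative_light(f_old: float, f_new: float) -> float:
    """Factor by which light at the imager changes when the f-stop
    moves from f_old to f_new (light varies as 1 / f_number**2)."""
    return (f_old / f_new) ** 2


# Opening the lens from f/4 to f/2 admits four times the light,
# so the same exposure needs only one quarter of the time.
gain = relative_light(4.0, 2.0)
print(f"f/4 -> f/2: {gain:.1f}x the light, exposure time x {1 / gain:.2f}")

# Stopping down from f/2.8 to f/8 cuts the light substantially.
print(f"f/2.8 -> f/8: {relative_light(2.8, 8.0):.3f}x the light")
```

This is why a camera that images many times per second, and so has less time per exposure, needs a lower f-stop.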
Lens focal length defines the image field-of-view. This is the size of the scene the camera sees at a given distance away. A longer focal length means scenes have a smaller height and width for a given distance. For example, switching from a 50-mm to a 200-mm focal length shrinks picture height and width to 25% of their original values at a given distance. Put another way, the longer lens sees the original scene from 4× the distance.
The choice of telecentric versus conventional lenses is only critical when size is measured optically. A telecentric lens reduces the viewing angle error and magnification error common in conventional lenses. With conventional lenses, an object appears to grow in size as it gets closer to the camera. A telecentric lens lets light enter the camera straight in, so object size does not change with distance to the camera. This makes setup and calibration easier as there is no requirement for parallax-error compensation.
The key to lighting is good contrast in the image, which lets the camera detect changes in the object. That’s easier said than done.
Most vision systems use a dedicated light source for illumination. Dedicated lighting optimizes the contrast between the object viewed and its background. It also provides a more uniform lighting condition that reduces the effect of ambient light. Some factors to look at include the optical properties of the target, such as its shape, texture, color, or translucency; the geometry of the lighting system; whether the system uses backlit, reflected, or direct illumination; and the type of light, such as LED, fluorescent, or halogen.
Directional lighting includes high-pressure sodium and quartz halogen lamps. They can produce sharp shadows and don’t illuminate uniformly. But directional lighting is good for finding irregularities in surfaces, scratches, and other imperfections.
Diffuse lighting gives the most uniform illumination. It minimizes glare and shadows and provides the best illumination for curved surfaces. However, it tends to hide surface features under a uniform glow, making it difficult to detect irregularities in the surface. The area of illumination must be at least 3× larger than the area of inspection, so inspection usually takes place under a light dome, a large round dome that distributes light evenly in all directions.
Ring lights are a form of diffuse lighting. A ring of light surrounds the camera lens and is oriented with the lens axis through the center opening of the ring. The ring evenly lights an object in the camera’s field-of-view from all sides. Ring lights reduce shadows on objects with protrusions, but large objects tend to lose illumination on the corners. If too close to an object, a ring light can create a dark spot right in the center where light intensity falls off.
Backlighting is used to more reliably detect shapes and make dimensional measurements. It’s also used to detect foreign material on a clear web or inspect for cracks and holes in opaque materials like sheet metal. Because backlight places the viewed surface in shadow, it does not reveal surface colors or textures. Like diffuse lighting, the illumination area must be larger than the area of inspection.
Strobe lights are necessary with area cameras in high-speed applications. The strobe effect freezes the image to prevent blurring. All forms of lighting, such as diffuse, point, or backlight, can have strobe actions associated with them.
An image-acquisition board or frame grabber brings image data into the vision system for interpretation and analysis. Signals from analog cameras must first be converted to a digital format by a frame grabber. The frame grabber then sends the image information to the analysis software and video display card.
In contrast to analog cameras, digital cameras use a digital image-acquisition board that converts the image into the form needed by the vision system. Digital cameras have the advantage of lower noise, higher potential frame rates, and higher potential resolution. While digital cameras have been more expensive than analog cameras, today the price difference is minimal.
Most contemporary image-acquisition boards support the multiplexing of two to four cameras. The multiple cameras operate simultaneously and independently, through use of multithreading software. In many instances, a PC with multiple FireWire inputs is all that’s necessary, although true acquisition boards off-load the computer CPU to speed capture and processing of images.
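The idea of several cameras running simultaneously and independently under multithreaded software can be sketched in Python with one thread per camera feeding a shared queue. The `SimulatedCamera` class is a hypothetical stand-in that only fakes frames with a delay; it does not represent any real camera driver or acquisition-board API.

```python
import queue
import threading
import time


class SimulatedCamera:
    """Hypothetical stand-in for a camera driver; returns fake frames."""

    def __init__(self, name: str, frame_period_s: float):
        self.name = name
        self.frame_period_s = frame_period_s
        self._count = 0

    def grab(self):
        time.sleep(self.frame_period_s)  # pretend to wait for exposure
        self._count += 1
        return (self.name, self._count)  # (camera name, frame number)


def acquire(camera: SimulatedCamera, n_frames: int, out: queue.Queue) -> None:
    """Grab n_frames from one camera and push them to the shared queue."""
    for _ in range(n_frames):
        out.put(camera.grab())


def run_cameras(cameras, n_frames: int):
    """Run every camera in its own thread and collect all frames."""
    frames = queue.Queue()
    threads = [threading.Thread(target=acquire, args=(cam, n_frames, frames))
               for cam in cameras]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    collected = []
    while not frames.empty():
        collected.append(frames.get())
    return collected


if __name__ == "__main__":
    cams = [SimulatedCamera("cam_a", 0.01), SimulatedCamera("cam_b", 0.015)]
    for frame in run_cameras(cams, n_frames=3):
        print(frame)
```

Because each camera has its own thread, a slow camera does not hold up a fast one; frames arrive in the queue in whatever order they complete.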
Image-acquisition boards typically have trigger inputs to time and synchronize image acquisition. Some boards also support configurable strobe-light control and synchronization with the proper software. Digital outputs interface with other devices and controllers to signal inspection processes and results such as cycle-completed, inspection-passed, or inspection-failed.
Application software for machine vision is typically created using one of three approaches. The most time-consuming is to build applications from scratch using machine-vision libraries with custom code developed in Visual C/C++, Visual Basic, or Java. However, the availability of third-party libraries and more comprehensive analysis tools has made this approach less painful in recent years. Development and debugging environments have also improved with greater feature flexibility. Even so, the custom software that must be developed still requires expertise and is often left to specialized system integrators and equipment manufacturers.
Graphical-programming environments are typically easier to learn and develop with than traditional programming methods. However, most were designed for general-purpose data acquisition, not machine vision. This imposes architectural limitations and other constraints that hamper application development and performance efficiency. As with all software, though, the situation improves almost daily. One limit shared with traditional programming is that each new application needs its own programming.
The third approach, configurable machine-vision application software, gives equipment manufacturers, system integrators, and end users a point-and-click teaching environment to define vision-system functions. If executed effectively, configurable machine-vision software can be set up quickly with little operator training to execute different functions. To keep setup that simple, the software typically supports only a basic subset of machine-vision functions. While viable for many machine-vision applications, it does not suit them all.
The smart camera
A new type of machine-vision tool, the smart camera, is starting to appear. With smart cameras, the entire vision system (except lighting) resides within the camera body. The camera holds the analysis software and outputs results for use by other equipment. Programming is simplified, as are installation and operation. The system basically becomes another sensor in the production line, yet performs a sophisticated series of image-capture and analysis steps. A review of smart cameras appeared in the April 12, 2007, edition of Machine Design.
Machine vision has traditionally been used for inspection applications. But it also possesses power as a tool that dynamically adjusts manufacturing processes. One or more cameras integrated into the design of a manufacturing machine can find and measure features on objects and guide the operation of the machine accordingly. Information from cameras can determine position and orientation of objects for robotic pick-and-place actions, align components into the proper positions and orientations, and determine the proper motion paths for use by automated laser-processing or liquid-dispensing systems. One such example is the use of machine vision to guide a laser-welding process to compensate for variations in weld-component geometry.
A manufacturer of special batteries for space and deep-sea applications needs high-quality seals on its battery casings. The seal has to withstand the high pressures of deep ocean work as well as the vacuum of outer space. Laser welding the seal meets those needs, but it’s essential to position the weld seam accurately along the inner wall of the casing where it meets the outer perimeter of the casing cap.
Automation Engineering Inc., of Wilmington, Mass., (AEi, www.aeiboston.com) tackled the problem by developing a vision-guided laser-welding system. AEi is an automation-equipment developer that uses machine vision on about 90% of the machines it designs and deploys.
AEi adapted its Vision Guided Laser Assembly Tool (VGLAT) station to use its Flexauto machine-vision software to find the exact position of the seam. With the laser off, the station rotates the battery casing under the laser-welding head while a coaxial digital camera acquires an image of the seam for every degree of rotation to track the variation in seam position. A least-squares fit of the measured positions is then aligned mathematically to the part to control the position of the laser. Finally, the laser is turned on and the part rotated again to produce an accurate laser-welded seam.
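The article does not say which model AEi fits to the seam measurements. As one plausible illustration, the sketch below assumes the seam is a circle slightly off the rotation axis, so its radial position varies as r0 + a·cos(θ) + b·sin(θ). For samples evenly spaced over one full rotation, the least-squares coefficients of that model reduce to simple averages. All names here are hypothetical and are not part of Flexauto or VGLAT.

```python
import math


def fit_seam(offsets_mm):
    """Least-squares fit of r(theta) = r0 + a*cos(theta) + b*sin(theta).

    Assumes the measurements are evenly spaced over one full rotation
    (for example, one per degree). In that case the three basis
    functions are orthogonal and the least-squares solution is the
    closed form used below.
    """
    n = len(offsets_mm)
    if n < 3:
        raise ValueError("need at least 3 evenly spaced measurements")
    angles = [2.0 * math.pi * k / n for k in range(n)]
    r0 = sum(offsets_mm) / n
    a = 2.0 / n * sum(r * math.cos(t) for r, t in zip(offsets_mm, angles))
    b = 2.0 / n * sum(r * math.sin(t) for r, t in zip(offsets_mm, angles))
    return r0, a, b


def seam_position(angle_deg, coeffs):
    """Fitted seam position (mm) at a given rotation angle in degrees."""
    r0, a, b = coeffs
    t = math.radians(angle_deg)
    return r0 + a * math.cos(t) + b * math.sin(t)


if __name__ == "__main__":
    # Synthetic measurements: one per degree, assumed values only.
    measured = [10.0 + 0.3 * math.cos(math.radians(d))
                - 0.2 * math.sin(math.radians(d)) for d in range(360)]
    coeffs = fit_seam(measured)
    print("fitted (r0, a, b):", tuple(round(c, 6) for c in coeffs))
    print("laser target at 90 degrees:", round(seam_position(90, coeffs), 6), "mm")
```

On the second rotation, a controller could call `seam_position` at each angle to place the laser on the fitted seam and not on a single noisy measurement.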
The batteries the manufacturer makes vary significantly in height and diameter. The VGLAT station uses a second, side-looking camera with both vertical and horizontal machine-vision edge-detection algorithms to verify that a battery casing of the correct height and diameter is loaded in the machine for the currently selected process.
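A size check of this kind can be illustrated with a very simple edge detector run along one line of pixels. The sketch below flags an edge wherever brightness jumps by more than a threshold between neighboring pixels, then converts the distance between the outermost edges to millimeters. It is a generic illustration under assumed values, not the algorithm AEi uses; the threshold, pixel scale, and tolerance are all hypothetical.

```python
def find_edges(profile, threshold):
    """Indices i where brightness jumps by at least `threshold`
    between pixel i-1 and pixel i along a single line of pixels."""
    return [i for i in range(1, len(profile))
            if abs(profile[i] - profile[i - 1]) >= threshold]


def measure_extent_mm(profile, threshold, mm_per_pixel):
    """Distance in mm between the first and last detected edges,
    or None if fewer than two edges are found."""
    edges = find_edges(profile, threshold)
    if len(edges) < 2:
        return None
    return (edges[-1] - edges[0]) * mm_per_pixel


def casing_matches(measured_mm, expected_mm, tolerance_mm):
    """True when a measurement exists and is within tolerance."""
    return measured_mm is not None and abs(measured_mm - expected_mm) <= tolerance_mm


if __name__ == "__main__":
    # Synthetic horizontal scan line: dark background, bright casing.
    scan = [10] * 20 + [200] * 60 + [10] * 20
    diameter = measure_extent_mm(scan, threshold=100, mm_per_pixel=0.5)
    print("measured diameter:", diameter, "mm")
    print("correct casing loaded:", casing_matches(diameter, 30.0, 1.0))
```

Running the same check along a vertical scan line gives the casing height, so the two measurements together can confirm the part against the selected process.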