The iniLabs Dynamic Vision Sensor solves the sensor-bandwidth-processor problem by mimicking the combination of human eye, optic nerve, and brain.
A group called “iniLabs” in Zurich, Switzerland—funded in part by the U.S. Defense Advanced Research Projects Agency (DARPA)—has developed a new kind of optical sensor, more akin to a human eye than to a camera. Its first application is in mini and micro unmanned aerial vehicles (UAVs), but it has many other potential uses. The Swiss project is part of a larger movement tapping the natural world to solve technological problems. It is also part of a broader trend in sensors.
The human eye solves a difficult problem. What you see is a complex scene—were it a computer image, it would consist of many millions of picture elements (pixels)—and it is seamless as you move your head. Yet the optic nerve leading from the retina (where the image is focused) to the brain (which uses it) has a limited capacity. In computer terms, the optic nerve has limited bandwidth. How does nature solve this problem, and can the solution be translated to technology?
Nature does it by creating the picture that matters not on the retina but in the brain. The retina registers only changes in the image, moment to moment. It transmits those changes—not full images—to the part of the brain that maintains the image. In computer terms, nature uses the part of the system with maximum processing power (the brain) to create the picture on which you rely. It accepts limited bandwidth between the processor (the brain) and the sensor (the retina). Placing the image in the brain has considerable advantages. You act based on what you see, and the center of action is in the same brain as the compiled image. You see as though the image is stabilized to a considerable extent (it seems to jump around only if you yourself are shaken). On the other hand, experiments have shown that people often miss important details. That happens because the image that matters is the one the brain creates, not the image the eyes actually see. What you see seems to be reality, but it is not quite.
As this picture of a tennis player demonstrates, only the pixels that change are transmitted from image to image.
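To make the analogy concrete, the sketch below is a hypothetical illustration, not iniLabs code: the processor, standing in for the brain, holds the full picture and updates it from nothing but a stream of per-pixel change events, much as the brain maintains the image from the retina’s change reports. The array size and the event format are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical sketch: the "brain" (processor) holds the full picture;
# the "optic nerve" (link) carries only change events, never whole frames.

HEIGHT, WIDTH = 120, 160

# The processor starts with no picture at all, like the eye on waking.
picture = np.zeros((HEIGHT, WIDTH), dtype=np.uint8)

def apply_event(picture, event):
    """Apply one change event of the assumed form (row, col, new_brightness)."""
    row, col, new_value = event
    picture[row, col] = new_value

# A short, made-up event stream: only the pixels that changed are sent.
event_stream = [(10, 20, 255), (10, 21, 255), (50, 80, 40)]

for event in event_stream:
    apply_event(picture, event)

# Bandwidth used: 3 events instead of the 120 x 160 = 19,200 pixels of a full frame.
print(f"events sent: {len(event_stream)}; pixels in a full frame: {picture.size}")
```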
An electronic camera is very different. Typically, it employs a light sensor that turns light level into an electronic signal. The camera may have one sensor or a matrix of them. It creates an image by scanning the sensor(s) over a scene, collecting data for every pixel. The system is limited by scanning and sensor sampling time, which lengthen as the picture becomes more complex. In a dynamic situation, each picture must be compared with the one taken immediately after it to detect differences, which in turn must be interpreted by a processor of some kind. In effect, pixels that do not change between the two pictures are wasted, and handling them eats processor time and power.
The human eye reacts more quickly to changes in the picture. It takes time, however, to create a usable picture in the first place. The eye begins with no picture at all, so everything it sees is a change. The more changes it must process, the slower it must go. That is probably why we do not see clearly upon waking; the brain-imager is collecting changes (from zero). As it collects the data, the picture clears and sharpens.
The idea of a processor collecting bits of information to create a dynamic picture should be familiar. It is how command-and-control (tactical picture-keeping) systems often work. The vocabularies of their data links are orders to make particular changes in, for example, a track picture. When the system is turned on, its picture is blank. As it receives commands to insert or change data, it quickly creates an elaborate picture. Typically, data comes from many sources, so the system has to do more than just keep up with changes. It has to make sure that they are not contradictory. For anyone involved with the old Naval Tactical Data System, that includes the gridlock problem: making sure every message refers to the same coordinates. Without that, the picture soon becomes confused: two ships seeing the same airplane, for example, may report it in different terms. The system the ships share may therefore double-count the airplane. This is not an abstract problem. Part of the reason the USS Vincennes (CG-49) shot down an Iranian Airbus airliner in 1988 was an error in sorting out data from different sources. The Airbus was confused with a warplane landing about a hundred miles away, so those on board the cruiser thought incorrectly that the airliner was diving toward them.
If the pictures from your two eyes do not quite match, which is common, your brain has to reconcile them, associating what comes from one eye with what the other sees. The classic solution is corrective glasses. Understanding that what you see is a processed image raises another possibility. To what extent can the brain be trained to correct what it receives from two different eyes? To what extent is it hard-wired (so that correction has to come from outside), and to what extent can it be programmed? In recent years, there have been claims that training can correct vision.
The eye-brain combination does not have to create large images and then compare them bit by bit to understand what is happening. Instead, it creates a dynamic picture that is relatively easy for the brain to interpret quickly. The iniLabs system is analogous. The designers call changes in the image “events.” Their small drone combines a conventional imager with an event monitor; the combination is called the Dynamic Vision Sensor. Presumably the conventional imager is used mainly to provide the drone with an initial image as a basis for changes. Without it, the drone needs time to create such an image out of perceived changes: to wake up, in effect. To force the drone to rely on events, the Swiss team switched a light on and off, creating dramatic events. When the light was on, the drone could refresh its base image using a conventional camera. The drone seems to have been programmed so that it could remember the database it received in the light, and could hence navigate to a considerable extent in the dark. In an experiment, the hybrid was 130 percent better than one relying only on events and 85 percent better than a standard camera. The drone probably would have been much better in a situation in which the scene around it was changing rapidly, as it would be if the drone were flying at high speed.
DARPA is interested in micro UAVs to operate in dense urban areas. The smaller the UAV, the less elaborate its sensors are likely to be and the less bandwidth there is between sensors and the central processor. The eye-like approach developed by iniLabs minimizes the bandwidth requirement. At the same time, it should allow the UAV to react faster, for example when it suddenly approaches a wall or a closing door.
The Swiss approach has a further implication. Perhaps the most profound fact of current technology is Moore’s Law: computer processing power doubles about every 18 months. No such exponential growth continues forever, but it is not clear what will stop Moore’s Law. The question for developers is how to harness inexpensive computing power. Sensors generally are a different matter. The devices that convert light into electronic signals, for example, are not improving at anything like the rate of computer-processing chips. The Swiss approach emphasizes the reality that what comes out of a sensor is a product of both the sensor proper (in this case an imager) and the processor. Doing more with the processor may make a lot more sense than trying to improve the sensor.
To an extent, the Swiss were attacking a problem similar to one the U.S. Navy faced at the end of the Cold War—how to solve a sensing problem given limitations inherent in the sensor itself. The Soviet Navy was closing the acoustic gap—its submarines were becoming quieter and quieter and hence more difficult to detect. The solution attempted late in the Cold War, in the U.S. Navy’s Seawolf (SSN-21)-class submarines, was to increase sensor gain to soak up more of the available signal by using much larger sonar arrays. To accommodate the larger arrays, the submarine had to be larger than its predecessors. With the Cold War over, such large submarines were considered unaffordable. The submarine community was compelled to rethink its approach to sensing, recognizing that it was using a sensor-processor combination. Then, as now, better processing was becoming more affordable. Even without any change in sonar arrays, better signal processing made an enormous difference. The program was called Acoustic Rapid COTS Insertion (A-RCI). COTS—commercial off the shelf—was the vital element, because it enabled the Navy to exploit Moore’s Law. As A-RCI has developed, the Navy has changed computer hardware every few years to exploit what it calls the state of practice, rather than the state of the art—a dramatic improvement, but not experimental and not very expensive. These regular hardware upgrades support frequent changes in software. A key part of the program was the fiber-optic connection between sensor and processor, to carry the received signals in as much detail as possible, so that as much as possible could be extracted from them. In A-RCI, the fiber-optic bus serves some of the functions of the optic nerve in a human eye.
As it happened, it also was possible to improve submarine sensors at an affordable cost, without requiring a larger and more costly submarine. This improvement also seems to have been associated with the revolution in digital processing. The new submarine arrays—a big passive bow array, low-cost lightweight flank arrays, and a lightweight long-line towed array—all use fiber-optic sensors, which convert pressure or velocity into interferometer readings that can be read as numbers and fed directly into digital processors. Thus, current U.S. submarine-sensing capability is a combination of much improved sensors and A-RCI. The reality, however, remains that once a sensor has been installed, its performance is difficult to improve. Processing is a different matter.
Norman Friedman is the author of The Naval Institute Guide to World Naval Weapon Systems and Fighters Over the Fleet: Naval Air Defence from Biplanes to the Cold War, both available from the Naval Institute Press (www.usni.org).