Optical Flow for Industrial Applications
What is optical flow and why is it useful? In this article, I am going to answer these two questions and discuss related topics such as the requirements for its application and the state of the art in research.
An optical flow algorithm estimates the displacement of each pixel in an image with respect to a reference image. Before taking a closer look at this definition, let me describe some of its uses. Although you are probably most curious about industrial applications, I cannot give many examples. The main reason is that optical flow is only on the verge of being applied more widely. My examples therefore concentrate on other commercial fields, in the hope of inspiring you to find your own, new ways of applying this method.
So what can we do with optical flow fields? In machine vision, Bill Silver of Cognex believes that optical flow is a vital ingredient for sub-pixel accuracy: low-resolution high-speed cameras could be applied to virtually increase the resolution of the image while saving money on the system setup.
In large companies such as Robert Bosch and Daimler, camera-based driver assistance systems rely on stereo vision, on optical flow, or on both. Distances to obstacles can be measured, moving objects can be detected, collisions can be predicted and much more. Another huge application area is medical imaging. Companies such as Siemens and General Electric do a great deal of corporate research in this field. Here, optical flow is often called "registration". It is applied, for example, to time series for the analysis of tumor growth or for denoising X-ray image sequences.
Another important field, occupied for example by Sony, Microsoft and Canon, is entertainment: innovative game controllers use the motion trajectories of humans as an input device, video and photo cameras compensate for motion, and TV sets virtually create 100 Hz videos from frame rates as low as 25 Hz.
Companies such as Industrial Light and Magic employ similar optical methods for movie and special effects production. For example, the famous "Trinity Scene" in the movie The Matrix (1999) was first shot with a camera array.
The view between each camera pair was interpolated by creating intermediate frames based on an optical flow field. During stereo postproduction, flow fields are used to create depth maps which can be used to vary the baseline of the stereo rig synthetically. This is for example done in the software "Nuke" (The Foundry Visionmongers Ltd.).
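To make the idea of an intermediate frame concrete, here is a minimal sketch in Python with NumPy. It assumes an integer flow that is the same for every pixel, so no two pixels land on the same target; real interpolators also have to handle sub-pixel flow, occlusions and holes. The helper name interpolate_midframe is hypothetical and not taken from any product mentioned above.

```python
import numpy as np

def interpolate_midframe(frame0, flow_x, flow_y):
    """Forward-warp frame0 by half of an integer flow field (hypothetical helper)."""
    h, w = frame0.shape
    mid = np.zeros_like(frame0)
    ys, xs = np.mgrid[0:h, 0:w]
    # Each pixel travels half of its displacement to reach the in-between view.
    tx = xs + flow_x // 2
    ty = ys + flow_y // 2
    # Pixels that leave the image are dropped; unfilled targets stay zero.
    inside = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    mid[ty[inside], tx[inside]] = frame0[inside]
    return mid

rng = np.random.default_rng(0)
frame0 = rng.random((6, 10))
flow_x = np.full((6, 10), 4)            # assumed flow: 4 px to the right everywhere
flow_y = np.zeros((6, 10), dtype=int)
mid = interpolate_midframe(frame0, flow_x, flow_y)
```

In this toy case the intermediate frame is simply the first frame moved two pixels to the right, with the two leftmost columns left empty.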
As the flow usually changes sharply at object boundaries, it can also be used to separate objects from each other. This information is regularly used in video compression standards such as MPEG. If the recorded scene is static, or the camera stands still and only a single object moves, we can create 3D reconstructions of the world and therefore turn our camera into a 3D scanner. The result is often not highly accurate, but this can be very useful in scenarios where a large number of devices is needed, as might be the case in B2C products. Of course you know Google Street View, which uses motion and stereo vision to reconstruct entire cities in 3D.
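As a rough illustration of how flow discontinuities can mark object boundaries, the following Python sketch thresholds the magnitude of the spatial flow gradient. The function name motion_boundaries and the threshold value are assumptions chosen for this toy example; practical segmentation methods are considerably more elaborate.

```python
import numpy as np

def motion_boundaries(flow_x, flow_y, threshold=0.5):
    """Mark pixels where the flow changes abruptly between neighbours."""
    # np.gradient returns the derivative along rows first, then along columns.
    du_dy, du_dx = np.gradient(flow_x.astype(float))
    dv_dy, dv_dx = np.gradient(flow_y.astype(float))
    magnitude = np.sqrt(du_dx**2 + du_dy**2 + dv_dx**2 + dv_dy**2)
    return magnitude > threshold

# Toy flow field: a square object moving 3 px to the right on a static background.
flow_x = np.zeros((10, 10))
flow_x[3:7, 3:7] = 3.0
flow_y = np.zeros((10, 10))
mask = motion_boundaries(flow_x, flow_y)
```

The mask is set along the outline of the square and stays clear both inside the object and in the background.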
Very often, optical flow is used as a scientific measurement device. For example, blood pumps are analyzed by seeding small, reflecting particles into a fluid which is inserted into the device. The flow of the fluid is recorded with a high-speed camera. The characteristics of the motion are extracted from flow fields. Another fluid is air. Here, flow fields are used both scientifically and commercially for weather forecasting and the optimization of car and airplane shapes.
Another promising but not yet fully commercialized application is measuring motion characteristics in surveillance scenarios. Here, suspicious behavior can be detected based, for example, on the gait of a person, which is contained in the flow field this person generates.
Challenges in Optical Flow Estimation
This list of applications is not complete. There is a host of applications both in science and business. At first sight, this technology has enormous potential. Yet there are a number of challenges. In order to see if and how you can apply optical flow in your environment, you should know its limits. So what are the most important questions for determining how to implement optical flow?
First, there is the so-called brightness model. To measure the optical flow, one usually takes two consecutive images of a sequence. A typical assumption is that the intensity remains constant along each pixel's motion trajectory. When objects typically change their distance to the light source, it might be useful to assume a quadratic change of intensity from one frame to the other. If the illumination changes wildly, you will most likely opt not to use optical flow.
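The constant-intensity assumption can be checked directly for a candidate flow field. The sketch below, in Python with NumPy, is only an illustration under simplifying assumptions: the flow is integer-valued and the helper brightness_residual is a hypothetical name. It compares each pixel of the first frame with the pixel it is supposed to reach in the second frame.

```python
import numpy as np

def brightness_residual(frame0, frame1, flow_x, flow_y):
    """Residual frame1(x + u, y + v) - frame0(x, y) for an integer flow (u, v).

    Pixels whose target falls outside frame1 are set to NaN.
    """
    h, w = frame0.shape
    ys, xs = np.mgrid[0:h, 0:w]
    tx = xs + flow_x
    ty = ys + flow_y
    inside = (tx >= 0) & (tx < w) & (ty >= 0) & (ty < h)
    residual = np.full((h, w), np.nan)
    residual[inside] = frame1[ty[inside], tx[inside]] - frame0[inside]
    return residual

rng = np.random.default_rng(0)
frame0 = rng.random((8, 8))
frame1 = np.roll(frame0, shift=2, axis=1)     # the scene moves 2 px to the right

zero = np.zeros((8, 8), dtype=int)
good = brightness_residual(frame0, frame1, np.full((8, 8), 2), zero)
bad = brightness_residual(frame0, frame1, zero, zero)
```

With the correct flow the residual vanishes wherever it is defined, while the zero flow leaves a clearly non-zero residual.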
The second question is about the so-called regularizer: let us assume constant illumination and consider a single pixel in the first frame. There are many noisy pixels with the same intensity in the second frame which might correspond to the one in the first frame. At first sight, all motions of this pixel are equally probable. Without prior knowledge about which motions are more likely to occur, we cannot determine the correct flow.
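A small experiment shows how ambiguous a single pixel is. The Python sketch below uses a synthetic image with only 16 gray levels (an assumption made to keep the effect visible) and lists every pixel in the second frame that has the same intensity as one chosen pixel in the first frame.

```python
import numpy as np

rng = np.random.default_rng(0)
frame0 = rng.integers(0, 16, size=(32, 32))   # synthetic image, 16 gray levels
frame1 = np.roll(frame0, shift=1, axis=1)     # true motion: 1 px to the right

y, x = 10, 10
# Every pixel in frame1 with the same intensity is a possible match.
candidates = np.argwhere(frame1 == frame0[y, x])
true_match_found = any((cy, cx) == (y, x + 1) for cy, cx in candidates)
```

The true match is among the candidates, but intensity alone gives no reason to prefer it over the others.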
A solution to this problem is to either assume or learn this prior knowledge and use it as a constraint for the motion estimation process. For example, a simple prior is that many pixels are part of the same object and therefore neighboring pixels tend to have the same flow. Thus, we could penalize any deviation from a constant flow with an energy term that is large for non-constant motion vectors. There are local and global motion models. If you already know a lot about your motions or can produce ground truth easily, optical flow can be implemented both accurately and computationally efficiently.
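One simple way to write down such a penalty is to sum the squared differences between neighboring flow vectors. The following Python sketch is one possible choice among many, not a prescribed one; the function name smoothness_energy is hypothetical.

```python
import numpy as np

def smoothness_energy(flow_x, flow_y):
    """Sum of squared differences between vertically and horizontally adjacent flow vectors."""
    energy = 0.0
    for component in (flow_x, flow_y):
        energy += np.sum(np.diff(component, axis=0) ** 2)
        energy += np.sum(np.diff(component, axis=1) ** 2)
    return float(energy)

constant_x = np.ones((5, 5))
constant_y = np.zeros((5, 5))

outlier_x = constant_x.copy()
outlier_x[2, 2] = 3.0                      # a single vector that disagrees with its neighbours

e_constant = smoothness_energy(constant_x, constant_y)
e_outlier = smoothness_energy(outlier_x, constant_y)
```

A constant flow costs nothing, whereas the single deviating vector is penalized once for each of its four neighbors.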
The third question is about optimization. Once you have decided on a brightness model and a motion model, you can formulate an energy describing how well a given flow field fits your models. The optimization scheme will tell you how to change the flow field to better fit your models. This choice is crucial for deciding whether your algorithm will be fast enough for your application. Depending on the required accuracy and density of the flow field (and, of course, on the system hardware), computing it for a 640x480 image can take from a few milliseconds to several minutes.
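To show how the three ingredients fit together, here is a deliberately small one-dimensional sketch in Python. It assumes a linearized constant-brightness model, a quadratic smoothness term with weight alpha, and plain gradient descent as the optimization scheme. All of these are illustrative choices, and the function names are hypothetical; practical methods use far more efficient solvers.

```python
import numpy as np

def energy(u, Ix, It, alpha):
    """Data term (linearized brightness constancy) plus smoothness term."""
    return float(np.sum((Ix * u + It) ** 2) + alpha * np.sum(np.diff(u) ** 2))

def energy_gradient(u, Ix, It, alpha):
    """Derivative of the energy with respect to every flow value."""
    grad = 2.0 * Ix * (Ix * u + It)
    d = np.diff(u)
    grad[:-1] -= 2.0 * alpha * d
    grad[1:] += 2.0 * alpha * d
    return grad

def minimize_flow(Ix, It, alpha=0.1, iterations=1000):
    u = np.zeros_like(Ix)
    # Step size chosen from an upper bound on the curvature of the energy,
    # so that every iteration decreases it.
    step = 1.0 / (2.0 * np.max(Ix ** 2) + 8.0 * alpha)
    for _ in range(iterations):
        u = u - step * energy_gradient(u, Ix, It, alpha)
    return u

# Synthetic 1D "image": a sine pattern that moves by 0.3 pixels.
x = np.arange(128, dtype=float)
omega = 2.0 * np.pi / 32.0
true_shift = 0.3
signal0 = np.sin(omega * x)
signal1 = np.sin(omega * (x - true_shift))

Ix = np.gradient(signal0)      # spatial derivative
It = signal1 - signal0         # temporal derivative
alpha = 0.1
u = minimize_flow(Ix, It, alpha)
```

Starting from a zero flow field, the iterations lower the energy and the estimated flow settles close to the true shift of 0.3 pixels.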
Another factor is the image itself. Optical flow can only be measured reliably if the moving structures are sufficiently large or the flow is sufficiently small. A typical rule of thumb is that the largest flow vector should be around one pixel in magnitude. At the cost of computation time, this limit can be relaxed to around five pixels; alternatively, the camera frame rate can be increased. On the other hand, one can sacrifice accuracy (to a considerable extent) in order to estimate even larger flows. Whenever both very small and very large flows occur in the sequences, one has to decide which of the two should be measured more accurately.
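The reason for this rule of thumb can be seen in a toy experiment. The Python sketch below estimates a single global shift of a one-dimensional sine pattern from its spatial and temporal derivatives, which is a least-squares variant of the constant-brightness assumption. The 32-pixel wavelength and the function name estimate_shift are assumptions made for this illustration.

```python
import numpy as np

def estimate_shift(signal0, signal1):
    """Least-squares solution of Ix * u + It = 0 for one global shift u."""
    Ix = np.gradient(signal0)
    It = signal1 - signal0
    return float(-np.sum(Ix * It) / np.sum(Ix ** 2))

x = np.arange(128, dtype=float)
omega = 2.0 * np.pi / 32.0          # structure with a 32-pixel wavelength

def shifted(d):
    """The pattern moved d pixels to the right."""
    return np.sin(omega * (x - d))

small = estimate_shift(shifted(0.0), shifted(0.5))    # sub-pixel motion
large = estimate_shift(shifted(0.0), shifted(16.0))   # half a wavelength
```

The half-pixel shift is recovered closely, while the 16-pixel shift is estimated completely wrongly because the linear approximation no longer holds.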
Very often, vision algorithms are applied to images which have not been carefully captured for the sole purpose of being evaluated by a computer. In fact, many problems of the algorithms can be solved by simple modifications to the image acquisition setup. Here are a few examples.
The texture of the image and the light sources play a crucial role. Ideally, objects should have Lambertian materials and their position should be fixed relative to the light source. Shadows cause brightness changes which are often not handled correctly by the brightness model. If you have the option to spray some removable color onto your objects or to control both the shape and the spectrum of the light source without spending considerable amounts of money, you should do so.
Occlusions are also difficult to handle. Fast methods usually cannot deal with occlusions reliably, and therefore motion segmentation is still very slow. The main challenge here lies in the fact that very often whole groups of pixels are analyzed instead of single pixels. A typical assumption for this group of algorithms is that all pixels move in the same direction, which is not true at occlusion boundaries. If you can control the motion of objects in a way that helps to devise stricter motion models, you can dramatically speed up your optimization schemes while increasing the accuracy of flow estimates at the same time.
Sometimes, you need to be sure about the accuracy of the resulting flow fields. Hence, another important point for consideration is how to benchmark the outcome of the optical flow algorithm. Today, in the four most important journals (IJCV, PAMI, CVIU, IP) there are more than 1500 publications on optical flow. Surprisingly, there are only four evaluation papers which compare a few of the suggested techniques. It remains an open problem whether the proposed benchmark datasets are meaningful for industrial applications. As there are also very few example implementations of optical flow methods, one cannot simply test the results on other data. The implementation of a relatively complex algorithm from scratch takes a graduate student around half a year. Hence, it is very difficult to decide which method could be useful for your application.
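If you can produce ground truth for your own setup, an application-specific test can be as simple as comparing the estimated flow with the known one. One commonly used measure is the average endpoint error, the mean Euclidean distance between estimated and true flow vectors. The Python sketch below assumes this metric; it is only one option and other measures may fit your application better.

```python
import numpy as np

def average_endpoint_error(flow_x, flow_y, gt_x, gt_y):
    """Mean Euclidean distance between estimated and ground-truth flow vectors."""
    return float(np.mean(np.hypot(flow_x - gt_x, flow_y - gt_y)))

gt_x = np.zeros((4, 4))
gt_y = np.zeros((4, 4))

# An estimate that is off by 3 px horizontally and 4 px vertically everywhere.
est_x = np.full((4, 4), 3.0)
est_y = np.full((4, 4), 4.0)

perfect = average_endpoint_error(gt_x, gt_y, gt_x, gt_y)
off = average_endpoint_error(est_x, est_y, gt_x, gt_y)
```

A perfect estimate scores zero, and the constant offset of (3, 4) pixels scores its vector length.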
All Things Considered
The art of optical flow estimation consists of five important parts. First, one has to model how the intensity of a pixel changes along its motion trajectory. Second, prior knowledge about the resulting flow field needs to be incorporated. Third, the correct flow field has to be estimated based on these two models. Fourth, a modification of the image acquisition scheme should be considered to facilitate motion estimation. Fifth, as reliable benchmarks on existing techniques are unavailable, application-specific tests should be devised to validate the full setup.
As of today, I believe that many motion estimation problems can be solved quickly and reliably once all these questions are answered. Yet optical flow is no all-in-one device suitable for every purpose, and it is likely to fail completely in equally many scenarios. The choice of components depends very much on the application requirements; they should be selected carefully rather than replaced by black-box or general-purpose solvers.