The USC Andrew and Erna Viterbi School of Engineering USC Signal and Image Processing Institute USC Ming Hsieh Department of Electrical and Computer Engineering University of Southern California

Technical Report USC-SIPI-182

“3-D Motion Estimation Using a Sequence of Noisy Images”

by Gem-Sun Jason Young

May 1991

In this dissertation we study several aspects of three-dimensional (3-D) motion and structure estimation from a sequence of noisy images. A high-order motion model is proposed based on representing the constant acceleration translational motion and constant precession rotational motion in the form of a bilinear state space model using standard rectilinear states for translation and quaternions for rotation. The measurements are noisy perturbations of 3-D feature points represented in an inertial coordinate system. The structure of the moving object is estimated from the stereo image pairs prior to the estimation of motion parameters. Owing to the nonlinearity in the state model, nonlinear filters are then designed for the estimation of motion parameters. The Cramer-Rao performance bounds for motion parameter estimates are computed. A constructive proof for the uniqueness of motion parameters is given. It is shown that with uniform sampling in time, three noncollinear feature points in five consecutive binocular image pairs contain all the spatial and temporal information required for motion estimation.
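As a rough illustration of the kind of state model described above, the following sketch propagates a translation state with constant acceleration and a quaternion rotation state. It is not the model from the report: for simplicity it assumes a constant angular velocity expressed in the inertial frame rather than constant precession, and the function names and state layout are hypothetical. The rotation update multiplies an angular-velocity-dependent quaternion by the current quaternion, which is where the product of states (the bilinear structure) shows up.

```python
import math

def quat_multiply(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def quat_from_angular_velocity(omega, dt):
    """Unit quaternion for a rotation of |omega|*dt about the axis omega/|omega|."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if abs(angle) < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)  # negligible rotation: identity quaternion
    s = math.sin(angle / 2.0) / rate
    return (math.cos(angle / 2.0), wx * s, wy * s, wz * s)

def rotate_point(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_multiply(quat_multiply(q, (0.0, v[0], v[1], v[2])), conj)
    return (rx, ry, rz)

def propagate(state, dt):
    """Advance (position, velocity, acceleration, quaternion, omega) by dt.

    Assumption for this sketch: acceleration and omega stay constant,
    and omega is expressed in the inertial frame.
    """
    p, v, a, q, omega = state
    p_next = tuple(pi + vi * dt + 0.5 * ai * dt * dt for pi, vi, ai in zip(p, v, a))
    v_next = tuple(vi + ai * dt for vi, ai in zip(v, a))
    q_next = quat_multiply(quat_from_angular_velocity(omega, dt), q)
    return (p_next, v_next, a, q_next, omega)
```

Calling `propagate` repeatedly with the sampling interval as `dt` gives the noise-free trajectory that a nonlinear filter would use as its prediction step; `rotate_point` maps object-frame feature points into the inertial frame.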

If 3-D motion and structure are estimated simultaneously, we show that the standard 3 x 3 rotation matrix is more suitable for representing the rotational motion than quaternions. Using this representation, a new motion model for constant velocity translation and constant angular velocity rotation is given. Both monocular and binocular systems are considered. Batch and recursive algorithms are designed to search for and track, respectively, the moving rigid object. Occlusion among image frames as well as between each stereo pair in the binocular case is also dealt with. A nonlinear least squares method is used to formulate the batch estimation of motion and structure parameters. Nonlinear Kalman filters are used for the recursive algorithms. Owing to the parameterization of translational motion using a time-invariant global normalization factor, linear plant models are used in the recursive filters to update the motion in closed form, thus avoiding the time-consuming numerical integration that commonly arises in nonlinear filtering problems.
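To make the closed-form update concrete, the sketch below evaluates a constant angular velocity rotation with Rodrigues' formula and combines it with constant velocity translation of a rotation center. This is an illustrative assumption, not the report's parameterization: the decomposition into a moving center plus a rotated offset, and the names `rotation_matrix` and `feature_position`, are hypothetical, and the global normalization factor mentioned above is not modeled.

```python
import math

def rotation_matrix(omega, t):
    """Rodrigues' formula: 3 x 3 rotation for constant angular velocity omega over time t."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    theta = rate * t
    if abs(theta) < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = wx / rate, wy / rate, wz / rate  # unit rotation axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def feature_position(offset, center0, velocity, omega, t):
    """Position at time t of a feature point on the rigid object.

    Assumed model for this sketch: the rotation center moves as
    center0 + velocity * t, and the point's offset from the center is
    rotated by rotation_matrix(omega, t).
    """
    R = rotation_matrix(omega, t)
    rotated = [sum(R[i][j] * offset[j] for j in range(3)) for i in range(3)]
    return tuple(center0[i] + velocity[i] * t + rotated[i] for i in range(3))
```

Because the position at any time `t` is available directly, no step-by-step numerical integration of the rotation is needed, which is the practical benefit the paragraph above refers to.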

The optical flow field is another commonly used image measurement for motion estimation. However, flow fields generated by existing algorithms are noisy and thus induce ambiguities in motion parameters. In this work, the inherent ambiguities in recovering 3-D motion information from a single optical flow field are studied using a statistical model. These ambiguities are quantified using the Cramer-Rao lower bound (CRLB), which is a lower bound for the error variances of motion parameter estimates. This performance bound is independent of the motion estimation algorithm, and can always be computed for any arbitrary 3-D motion of a rigid surface by inverting a 5 x 5 matrix. As a special case, the performance bound for the motion of 3-D rigid planar surfaces is studied in detail. For the general motion of an arbitrary surface, it turns out that not every pixel gives information regarding 3-D motion estimation. It is shown that the aperture problem in computing optical flow restricts the nontrivial information about the 3-D motion to a sparse set of pixels at which both components of the flow velocity are observable. Effects of two smoothing schemes on estimation accuracy are analyzed. It is shown that introducing a smoothness constraint by fitting local patches to 3-D depths gives lower CRLBs. Surprisingly, this reduction in CRLBs is very small. Further, fitting local patches also relaxes the aperture problem, since the motion information is no longer restricted to points at which both optical flow components are observable. In contrast, imposing smoothness on the optical flow by regularization does not lower the CRLBs. Although these results are all derived using a simple Gaussian noise model and should be interpreted with caution, they nevertheless give some new insights into the inherent ambiguities.
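The general recipe behind such a bound can be sketched for a linearized measurement model with independent Gaussian noise of equal variance: accumulate the Fisher information matrix from the measurement sensitivities, invert it, and read the bounds off the diagonal. The sketch below is generic and is not the report's 5 x 5 matrix; the sensitivity rows passed in are placeholders that a real analysis would derive from the optical flow equations, and all function names are hypothetical.

```python
def fisher_information(sensitivity_rows, sigma):
    """Fisher information for scalar measurements with i.i.d. Gaussian noise.

    Each row holds the partial derivatives of one scalar measurement with
    respect to the parameters; sigma is the noise standard deviation.
    """
    n = len(sensitivity_rows[0])
    info = [[0.0] * n for _ in range(n)]
    for h in sensitivity_rows:
        for i in range(n):
            for j in range(n):
                info[i][j] += h[i] * h[j] / (sigma * sigma)
    return info

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting; raises ValueError if singular."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular information matrix: a parameter is unobservable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cramer_rao_bounds(sensitivity_rows, sigma):
    """Lower bounds on the error variances of unbiased parameter estimates."""
    inverse = invert(fisher_information(sensitivity_rows, sigma))
    return [inverse[i][i] for i in range(len(inverse))]
```

In this simplified picture, a measurement whose sensitivity row is zero in some parameter adds no information about that parameter. If no measurement constrains a parameter, the information matrix is singular and the bound is undefined, which is one way to read the ambiguities discussed above.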

To download the report in PDF format click here: USC-SIPI-182.pdf (6.7Mb)