580 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 19, NO. 6, JUNE 1997

In Defense of the Eight-Point Algorithm

Richard I. Hartley

Abstract: The fundamental matrix is a basic tool in the analysis of scenes taken with two uncalibrated cameras, and the eight-point algorithm is a frequently cited method for computing the fundamental matrix from a set of eight or more point matches. It has the advantage of simplicity of implementation. The prevailing view is, however, that it is extremely susceptible to noise and hence virtually useless for most purposes. This paper challenges that view, by showing that by preceding the algorithm with a very simple normalization (translation and scaling) of the coordinates of the matched points, results are obtained comparable with the best iterative algorithms. This improved performance is justified by theory and verified by extensive experiments on real images.

Index Terms: Fundamental matrix, eight-point algorithm, condition number, epipolar structure, stereo vision.

1 INTRODUCTION

The eight-point algorithm for computing the essential matrix was introduced by Longuet-Higgins in a now classic paper [1]. In that paper the essential matrix is used to compute the structure of a scene from two views with calibrated cameras. The great advantage of the eight-point algorithm is that it is linear, hence fast and easily implemented. If eight point matches are known, then the solution of a set of linear equations is involved. With more than eight points, a linear least squares minimization problem must be solved. The term eight-point algorithm will be used in this paper to describe this method whether only eight points, or more than eight points are used.

The essential property of the essential matrix is that it conveniently encapsulates the epipolar geometry of the imaging configuration.
One notices immediately that the same algorithm may be used to compute a matrix with this property from uncalibrated cameras. In this case of uncalibrated cameras it has become customary to refer to the matrix so derived as the fundamental matrix. Just as in the calibrated case, the fundamental matrix may be used to reconstruct the scene from two uncalibrated views, but in this case only up to a projective transformation [2], [3]. Apart from scene reconstruction, the fundamental matrix may also be used for many other tasks, such as image rectification [4], computation of projective invariants [5], outlier detection [6], [7], and stereo matching [8].

Unfortunately, despite its simplicity the eight-point algorithm has often been criticized for being excessively sensitive to noise in the specification of the matched points. Indeed this belief has become the prevailing wisdom. Consequently, because of its importance, many alternative algorithms have been proposed for the computation of the fundamental matrix. See [9], [10] for a description and comparison of several algorithms for finding the fundamental matrix. Without exception, these algorithms are considerably more complicated than the eight-point algorithm. Other iterative algorithms have been described (briefly) in [11], [12].

It is the purpose of this paper to challenge the common view that the eight-point algorithm is inadequate and markedly inferior to the more complicated algorithms. The poor performance of the eight-point algorithm can probably be traced to implementations that do not take sufficient account of numerical considerations, most specifically the condition of the set of linear equations being solved.

(The author is with G.E. CRD, Schenectady, NY 12301.)
It is shown in this paper that a simple transformation (translation and scaling) of the points in the image before formulating the linear equations leads to an enormous improvement in the condition of the problem and hence in the stability of the result. The added complexity of the algorithm necessary to do this transformation is insignificant.

It is not claimed here that this modified eight-point algorithm will perform quite as well as the best iterative algorithms. However, it is shown by thousands of experiments on many images that the difference between the modified eight-point algorithm and iterative techniques is not very great. Indeed, the eight-point algorithm does better than some of the iterative techniques.

2 OUTLINE OF THE EIGHT-POINT ALGORITHM

2.1 Notation

Vectors are represented by bold lower case letters, such as $\mathbf{u}$, and all such vectors are thought of as being column vectors unless explicitly transposed (for instance, $\mathbf{u}^\top$ is a row vector). Vectors are multiplied as if they were matrices. In particular, for vectors $\mathbf{u}$ and $\mathbf{v}$, the product $\mathbf{u}^\top\mathbf{v}$ represents the inner product, whereas $\mathbf{u}\mathbf{v}^\top$ is a matrix. The norm of a vector $\mathbf{f}$ is equal to the square root of the sum of squares of its entries, that is, the Euclidean length of the vector. Similarly, for matrices, we use the Frobenius norm, which is defined to be the square root of the sum of squares of the entries of the matrix.

(Manuscript received 26 Jan. 1996; revised 11 Apr. 1997. Recommended for acceptance by J. Connell. IEEECS Log Number 104913.)

2.2 Linear Solution for the Fundamental Matrix

The fundamental matrix is defined by the equation
$$\mathbf{u}'^\top F \mathbf{u} = 0 \tag{1}$$
for any pair of matching points $\mathbf{u}' \leftrightarrow \mathbf{u}$ in two images.
Given sufficiently many point matches $\mathbf{u}_i' \leftrightarrow \mathbf{u}_i$ (at least eight), equation (1) can be used to compute the unknown matrix $F$. In particular, writing $\mathbf{u} = (u, v, 1)^\top$ and $\mathbf{u}' = (u', v', 1)^\top$, each point match gives rise to one linear equation in the unknown entries of $F$. The coefficients of this equation are easily written in terms of the known coordinates $\mathbf{u}$ and $\mathbf{u}'$. Specifically, the equation corresponding to a pair of points $(u, v, 1)$ and $(u', v', 1)$ will be
$$uu'F_{11} + uv'F_{21} + uF_{31} + vu'F_{12} + vv'F_{22} + vF_{32} + u'F_{13} + v'F_{23} + F_{33} = 0 \tag{2}$$
The row of the equation matrix may be represented as a vector
$$(uu', uv', u, vu', vv', v, u', v', 1) \tag{3}$$
From all the point matches, we obtain a set of linear equations of the form
$$A\mathbf{f} = 0 \tag{4}$$
where $\mathbf{f}$ is a nine-vector containing the entries of the matrix $F$, and $A$ is the equation matrix. The fundamental matrix $F$, and hence the solution vector $\mathbf{f}$, is defined only up to an unknown scale. For this reason, and to avoid the trivial solution $\mathbf{f} = 0$, we make the additional constraint
$$\|\mathbf{f}\| = 1 \tag{5}$$
where $\|\mathbf{f}\|$ is the norm of $\mathbf{f}$.

Under these conditions, it is possible to find a solution to the system (4) with as few as eight point matches. With more than eight point matches, we have an overspecified system of equations. Assuming the existence of a non-zero solution to this system of equations, we deduce that the matrix $A$ must be rank-deficient. In other words, although $A$ has nine columns, the rank of $A$ must be at most eight. In fact, except for exceptional configurations [13], the matrix $A$ will have rank exactly eight, and there will be a unique solution for $\mathbf{f}$.

This previous discussion assumes that the data is perfect, and without noise. In fact, because of inaccuracies in the measurement or specification of the matched points, the matrix $A$ will not be rank-deficient; it will have rank nine. In this case, we will not be able to find a non-zero solution to the equations $A\mathbf{f} = 0$. Instead, we seek a least-squares solution to this equation set.
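As a minimal sketch of equations (2) to (4), the following Python snippet builds one row of the equation matrix per match and compares the product of each row with $\mathbf{f}$ against a direct evaluation of $\mathbf{u}'^\top F \mathbf{u}$. The function names (`equation_row`, `flatten`, `bilinear`), the sample matrix, and the sample matches are illustrative assumptions and do not come from the paper; the ordering of $\mathbf{f}$ follows equation (2).

```python
def equation_row(u, u_prime):
    """Row (3) of the equation matrix for one match u' <-> u.

    u and u_prime are (u, v) and (u', v') pairs; the third homogeneous
    coordinate is taken to be 1.
    """
    x, y = u
    xp, yp = u_prime
    return [x * xp, x * yp, x, y * xp, y * yp, y, xp, yp, 1.0]


def flatten(F):
    """Stack F in the order of equation (2): F11, F21, F31, F12, ..., F33."""
    return [F[i][j] for j in range(3) for i in range(3)]


def bilinear(F, u, u_prime):
    """Evaluate u'^T F u directly, for comparison with the row-times-f product."""
    uh = [u[0], u[1], 1.0]
    uph = [u_prime[0], u_prime[1], 1.0]
    return sum(uph[i] * F[i][j] * uh[j] for i in range(3) for j in range(3))


# Hypothetical values, chosen only to exercise the code.
F_example = [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6], [-0.7, 0.8, 0.9]]
matches = [((1.0, 2.0), (3.0, 4.0)), ((-2.0, 0.5), (1.5, -1.0))]

A = [equation_row(u, up) for u, up in matches]  # one row per match
f = flatten(F_example)
residuals = [sum(a * b for a, b in zip(row, f)) for row in A]
direct = [bilinear(F_example, u, up) for u, up in matches]
```

Each entry of `residuals` should agree with the corresponding entry of `direct` up to rounding, which is the statement that row (3) times $\mathbf{f}$ reproduces the left-hand side of equation (2).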
The least-squares solution sought is the vector $\mathbf{f}$ that minimizes $\|A\mathbf{f}\|$ subject to the constraint $\|\mathbf{f}\|^2 = \mathbf{f}^\top\mathbf{f} = 1$. It is well known (and easily derived using Lagrange multipliers) that the solution to this problem is the unit eigenvector corresponding to the smallest eigenvalue of $A^\top A$ (see footnote 1). Note that since $A^\top A$ is positive semi-definite and symmetric, all its eigenvalues are real and non-negative. For convenience (though somewhat inexactly), we will call this eigenvector the least eigenvector of $A^\top A$. An appropriate algorithm for finding this eigenvector is the algorithm of Jacobi [14] or the Singular Value Decomposition [14], [15].

1. An alternative is to set $F_{33} = 1$ and solve a linear least squares minimization problem. The general conclusions of this paper are equally valid for this version of the algorithm.

2.3 The Singularity Constraint

An important property of the fundamental matrix is that it is singular, in fact of rank two. Furthermore, the left and right null-spaces of $F$ are generated by the vectors representing (in homogeneous coordinates) the two epipoles in the two images. Most applications of the fundamental matrix rely on the fact that it has rank two. The matrix $F$ found by solving the set of linear equations (4) will not in general have rank two, and we should take steps to enforce this constraint. The most convenient way to enforce this constraint is to correct the matrix $F$ found by the solution of (4). Matrix $F$ is replaced by the matrix $F'$ that minimizes the Frobenius norm $\|F - F'\|$ subject to the condition $\det F' = 0$. A convenient method of doing this is to use the Singular Value Decomposition (SVD). In particular, let $F = UDV^\top$ be the SVD of $F$, where $D$ is a diagonal matrix $D = \mathrm{diag}(r, s, t)$ satisfying $r \ge s \ge t$. We let $F' = U\,\mathrm{diag}(r, s, 0)\,V^\top$. This method was suggested by Tsai and Huang [16] and has been proven to minimize the Frobenius norm of $F - F'$, as required.
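The linear solution and the singularity correction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in the paper: the least eigenvector is found with a small cyclic Jacobi routine written here so that the example needs only the standard library (in practice a library SVD would normally be used), and the correction is computed as $F' = F - (F\mathbf{v})\mathbf{v}^\top$, where $\mathbf{v}$ is the least eigenvector of $F^\top F$, which is algebraically the same as $U\,\mathrm{diag}(r, s, 0)\,V^\top$. The function names, the synthetic cameras, and the point set are all hypothetical.

```python
import math
import random


def least_eigenvector(m, sweeps=60):
    """Unit eigenvector for the smallest eigenvalue of a symmetric matrix (cyclic Jacobi)."""
    n = len(m)
    a = [row[:] for row in m]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off <= 1e-30 * sum(a[i][i] ** 2 for i in range(n)):
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation (c, s) that zeroes a[p][q], using the smaller rotation angle.
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
                c = 1.0 / math.hypot(1.0, t)
                s = t * c
                for k in range(n):  # columns p, q of a; accumulate eigenvectors in v
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rows p, q of a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    j = min(range(n), key=lambda i: a[i][i])
    return [v[k][j] for k in range(n)]


def eight_point(matches):
    """Unnormalized eight-point algorithm; matches are ((u, v), (u', v')) pairs."""
    rows = [[x * xp, x * yp, x, y * xp, y * yp, y, xp, yp, 1.0]
            for (x, y), (xp, yp) in matches]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(9)] for i in range(9)]
    f = least_eigenvector(ata)  # step 1: least eigenvector of A^T A
    F = [[f[3 * j + i] for j in range(3)] for i in range(3)]  # ordering of equation (2)
    ftf = [[sum(F[k][i] * F[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    v = least_eigenvector(ftf)  # right singular vector of the smallest singular value
    Fv = [sum(F[i][k] * v[k] for k in range(3)) for i in range(3)]
    # step 2: remove the smallest singular component, F' = F - (F v) v^T
    return [[F[i][j] - Fv[i] * v[j] for j in range(3)] for i in range(3)]


def epipolar_residual(F, u, u_prime):
    uh, uph = (u[0], u[1], 1.0), (u_prime[0], u_prime[1], 1.0)
    return abs(sum(uph[i] * F[i][j] * uh[j] for i in range(3) for j in range(3)))


# Synthetic, noise-free matches from two hypothetical cameras (illustration only).
rng = random.Random(0)
c, s = math.cos(0.3), math.sin(0.3)
matches = []
for _ in range(20):
    X, Y, Z = rng.uniform(-3, 3), rng.uniform(-3, 3), rng.uniform(3, 9)
    Xp, Yp, Zp = c * X + s * Z + 2.0, Y + 0.5, -s * X + c * Z + 0.3
    matches.append(((X / Z, Y / Z), (Xp / Zp, Yp / Zp)))

F = eight_point(matches)
residuals = [epipolar_residual(F, u, up) for u, up in matches]
det_F = (F[0][0] * (F[1][1] * F[2][2] - F[1][2] * F[2][1])
         - F[0][1] * (F[1][0] * F[2][2] - F[1][2] * F[2][0])
         + F[0][2] * (F[1][0] * F[2][1] - F[1][1] * F[2][0]))
```

With noise-free matches the residuals $|\mathbf{u}_i'^\top F \mathbf{u}_i|$ should be close to zero and the corrected matrix should be numerically singular. The synthetic coordinates here are already of order one, so this sketch says nothing about the conditioning issues with raw pixel coordinates.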
Minimizing the difference between $F$ and $F'$ in Frobenius norm has little theoretical justification, and in fact there are other methods of enforcing the singularity constraint a posteriori which have more theoretical basis (for instance, [17]). However, as will be seen, this method gives good results.

Thus, the eight-point algorithm for computation of the fundamental matrix may be formulated as consisting of two steps, as follows:
1) Linear solution: Given point matches $\mathbf{u}_i' \leftrightarrow \mathbf{u}_i$, solve the equations $\mathbf{u}_i'^\top F \mathbf{u}_i = 0$ to find $F$. The solution is the least eigenvector, $\mathbf{f}$, of $A^\top A$, where $A$ is the equation matrix.
2) Constraint enforcement: Replace $F$ by $F'$, the closest singular matrix to $F$ under Frobenius norm. This is done using the Singular Value Decomposition.

The algorithm thus stated is extremely simple and rapid to implement, assuming the availability of a suitable linear algebra library (for instance, [14]).

3 TRANSFORMATION OF THE INPUT

Image coordinates are sometimes given with the origin at the top-left of the image, and sometimes with the origin at the center. The question immediately occurs whether this makes a difference to the results of the eight-point algorithm for computing the fundamental matrix. More generally, to what extent is the result of the eight-point algorithm dependent on the choice of coordinates in the image? Suppose, for instance, the image coordinates were changed by some affine or even projective transformation before running the algorithm. Will this materially change the result? That is the question that we will now consider.

Suppose that coordinates $\mathbf{u}$ in one image are replaced by $\hat{\mathbf{u}} = T\mathbf{u}$, and coordinates $\mathbf{u}'$ in the other image are replaced by $\hat{\mathbf{u}}' = T'\mathbf{u}'$. Substituting in the equation $\mathbf{u}'^\top F \mathbf{u} = 0$, we derive the equation $\hat{\mathbf{u}}'^\top T'^{-\top} F T^{-1} \hat{\mathbf{u}} = 0$, where $T'^{-\top}$ is the inverse transpose of $T'$.
This relation implies that $T'^{-\top} F T^{-1}$ is the fundamental matrix corresponding to the point correspondences $\hat{\mathbf{u}}' \leftrightarrow \hat{\mathbf{u}}$. An alternative method of finding the fundamental matrix is therefore suggested, as follows.
1) Transform the image coordinates according to transformations $\hat{\mathbf{u}}_i = T\mathbf{u}_i$ and $\hat{\mathbf{u}}_i' = T'\mathbf{u}_i'$.
2) Find the fundamental matrix $\hat{F}$ corresponding to the matches $\hat{\mathbf{u}}_i' \leftrightarrow \hat{\mathbf{u}}_i$.
3) Set $F = T'^\top \hat{F} T$.

The fundamental matrix found in this way corresponds to the original untransformed point correspondences $\mathbf{u}_i' \leftrightarrow \mathbf{u}_i$. What choice should be made for the transformations $T$ and $T'$ will be left unspecified for now. First, we need to determine whether carrying out this transformation has any effect whatever on the result.

As verified above, $\mathbf{u}'^\top F \mathbf{u} = \hat{\mathbf{u}}'^\top \hat{F} \hat{\mathbf{u}}$, where $\hat{F}$ is defined by $\hat{F} = T'^{-\top} F T^{-1}$. Thus, if $\mathbf{u}'^\top F \mathbf{u} = \epsilon$, then also $\hat{\mathbf{u}}'^\top \hat{F} \hat{\mathbf{u}} = \epsilon$. Thus, there is a one-to-one correspondence between $F$ and $\hat{F}$ giving rise to the same error. It may appear therefore that the matrices $F$ and $\hat{F}$ minimizing the error $\epsilon$ (or more exactly, the sum of squares of errors corresponding to all points) will be related by the formula $\hat{F} = T'^{-\top} F T^{-1}$, and hence one may retrieve $F$ as the product $T'^\top \hat{F} T$. This conclusion is false, however. For, although $F$ and $\hat{F}$ so defined give rise to the same error $\epsilon$, the condition $\|F\| = 1$, imposed as a constraint on the solution, is not equivalent to the condition $\|\hat{F}\| = 1$. In particular, there is no one-to-one correspondence between $F$ and $\hat{F}$ giving rise to the same error $\epsilon$, subject to the constraint $\|F\| = \|\hat{F}\| = 1$.

This is a crucial point, and so we will look at it from a different point of view. A set of point correspondences $\mathbf{u}_i' \leftrightarrow \mathbf{u}_i$ gives rise to a set of equations of the form $A\mathbf{f} = 0$. If now we make the transformation $\hat{\mathbf{u}}_i = T\mathbf{u}_i$ and $\hat{\mathbf{u}}_i' = T'\mathbf{u}_i'$, then the set of equations will be replaced by a different set of equations of the form $\hat{A}\hat{\mathbf{f}} = 0$. One may verify, in particular, that the matrix $\hat{A}$ may be written in the form $\hat{A} = AS$, where $S$ is a $9 \times 9$ matrix that may be written explicitly in terms of the entries of $T$ and $T'$ (but it is not very important exactly how). Therefore one is led to consider the two sets of equations $A\mathbf{f} = 0$ and $AS\hat{\mathbf{f}} = 0$. One may guess that the least-squares solutions to these two sets of equations will be related according to $\hat{\mathbf{f}} = S^{-1}\mathbf{f}$. If this were so, then replacing $\hat{\mathbf{f}}$ by $S\hat{\mathbf{f}}$ one once more retrieves the original solution $\mathbf{f}$. The mapping $\hat{\mathbf{f}} \mapsto S\hat{\mathbf{f}}$ corresponds precisely to the matrix mapping $\hat{F} \mapsto T'^\top \hat{F} T$.

However, things are not that simple. Perhaps the least-squares solutions to the two sets of equations $A\mathbf{f} = 0$ and $AS\hat{\mathbf{f}} = 0$ are not so simply related. The solution $\mathbf{f}$ to the system $A\mathbf{f} = 0$ is the least eigenvector of the matrix $A^\top A$. Is it so that $\hat{\mathbf{f}} = S^{-1}\mathbf{f}$ is the least eigenvector of $(AS)^\top(AS)$? Letting $\lambda$ be the least eigenvalue of $A^\top A$, we verify:
$$S^\top A^\top A S \hat{\mathbf{f}} = S^\top A^\top A S S^{-1} \mathbf{f} = S^\top A^\top A \mathbf{f} = S^\top \lambda \mathbf{f} = \lambda S^\top S \hat{\mathbf{f}} \ne \lambda \hat{\mathbf{f}}$$
Thus, in fact, $S^{-1}\mathbf{f}$ is not the least eigenvector of $(AS)^\top AS$. In fact it is not an eigenvector at all.

Let us see how significant this effect is. We take the example that $T$ and $T'$ are simply scalings of the coordinates, in fact, multiplication of the coordinates by a factor of 10. These transformations are represented by diagonal matrices of the form $T = T' = \mathrm{diag}(10, 10, 1)$ acting on homogeneous coordinates. In this case, the matrix $S$ is also a diagonal matrix of the form $S = \mathrm{diag}(10^2, 10^2, 10, 10^2, 10^2, 10, 10, 10, 1)$, assuming that the vector $\mathbf{f}$ represents the elements of $F$ in the row-major order $f_{11}, f_{12}, f_{13}, f_{21}, f_{22}, f_{23}, f_{31}, f_{32}, f_{33}$. The matrix $S^\top S$ equals $\mathrm{diag}(10^4, 10^4, 10^2, 10^4, 10^4, 10^2, 10^2, 10^2, 1)$. In this case, we see $(AS)^\top AS\hat{\mathbf{f}} = \lambda S^\top S \hat{\mathbf{f}}$, and so $\hat{\mathbf{f}}$ is very far from being an eigenvector of $(AS)^\top AS$.

We conclude that the method of transformation leads to a different solution for the fundamental matrix.
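The two facts used in the argument above can be checked numerically for the scaling example $T = T' = \mathrm{diag}(10, 10, 1)$: the transformed equation row equals the original row times $S$, so the algebraic error is preserved under $\hat{\mathbf{f}} = S^{-1}\mathbf{f}$, while the unit-norm constraint is not. The following Python sketch uses a hypothetical match and an arbitrary unit vector $\mathbf{f}$; the names are illustrative and not from the paper.

```python
import math


def equation_row(u, u_prime):
    """Row of the equation matrix for one match, as in equation (3)."""
    x, y = u
    xp, yp = u_prime
    return [x * xp, x * yp, x, y * xp, y * yp, y, xp, yp, 1.0]


# Diagonal of S for T = T' = diag(10, 10, 1).
S = [100.0, 100.0, 10.0, 100.0, 100.0, 10.0, 10.0, 10.0, 1.0]

u, u_prime = (3.0, 4.0), (5.0, 6.0)  # hypothetical match
row = equation_row(u, u_prime)
row_hat = equation_row((10 * u[0], 10 * u[1]), (10 * u_prime[0], 10 * u_prime[1]))
row_times_S = [a * s for a, s in zip(row, S)]  # corresponding row of A S

f = [1.0 / 3.0] * 9                        # an arbitrary vector with ||f|| = 1
f_hat = [fi / si for fi, si in zip(f, S)]  # f_hat = S^{-1} f

error = sum(a * b for a, b in zip(row, f))              # row of A times f
error_hat = sum(a * b for a, b in zip(row_hat, f_hat))  # row of A S times f_hat
norm_f = math.sqrt(sum(x * x for x in f))
norm_f_hat = math.sqrt(sum(x * x for x in f_hat))
```

Here `error` and `error_hat` agree, but `norm_f_hat` differs from `norm_f`, so a minimizer constrained to $\|\mathbf{f}\| = 1$ does not map to a minimizer constrained to $\|\hat{\mathbf{f}}\| = 1$.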
This is a rather undesirable feature of the eight-point algorithm as it stands, that the result is changed by a change of coordinates, or even simply a change of the origin of coordinates. A similar problem was observed by Bookstein [18] in the problem of fitting conics to sets of points. To correct this, it seems advisable to normalize the coordinates of the points in some way by expressing them in some fixed canonical frame, as yet unspecified.

4 CONDITION OF THE SYSTEM OF EQUATIONS

The linear method consists in finding the least eigenvector of the matrix $A^\top A$. This may be done by expressing $A^\top A$ as a product $UDU^\top$ where $U$ is orthogonal and $D$ is diagonal. We assume that the diagonal entries of $D$ are in non-increasing order. In this case, the least eigenvector of $A^\top A$ is the last column of $U$. Denote by $\kappa$ the ratio $d_1/d_8$ (recalling that $A^\top A$ is a $9 \times 9$ matrix). The parameter $\kappa$ is the condition number (see footnote 2) of the matrix $A^\top A$, well known to be an important factor in the analysis of stability of linear problems [19]. Its relevance to the problem of finding the least eigenvector is briefly explained next.

2. Strictly speaking, $d_1/d_9$ is the condition number, but $d_1/d_8$ is the parameter of importance here.

The bottom right hand $2 \times 2$ block of matrix $D$ is of the form $\begin{pmatrix} d_8 & 0 \\ 0 & 0 \end{pmatrix}$, assuming that $d_9 = 0$, which ideally will be the case. Now, suppose that this block is perturbed by the addition of noise to become $\begin{pmatrix} d_8 & \epsilon \\ \epsilon & 0 \end{pmatrix}$. In order to restore this matrix to diagonal form we need to multiply left and right by $V^\top$ and $V$, where $V$ is a rotation through an angle $\theta = (1/2)\arctan(2\epsilon/d_8)$ (as the reader may verify). If $\epsilon$ is of the same order of magnitude as $d_8$ then this is a significant rotation. Looking at the full matrix, $A^\top A = UDU^\top$, we see that the perturbed matrix will be written in the form $U\bar{V}D'\bar{V}^\top U^\top$ where
$$\bar{V} = \begin{pmatrix} I_{7 \times 7} & 0 \\ 0 & V \end{pmatrix}$$
Multiplying by $\bar{V}$ replaces the last column of $U$ by a combination of the last two columns. Since the last column of $U$ is the least eigenvector of the matrix, this perturbation will drastically alter the least eigenvector of the matrix $A^\top A$. Thus, changes to $A^\top A$ of the order of magnitude of the eigenvalue $d_8$ cause significant changes to the least eigenvector. Since multiplication by an orthogonal matrix does not change the Frobenius norm of a matrix, we see that
$$\|A^\top A\| = \left(\sum_{i=1}^{9} d_i^2\right)^{1/2}$$
If the ratio $\kappa = d_1/d_8$ is very large, then $d_8$ represents a very small part of the Frobenius norm of the matrix. A perturbation of the order of $d_8$ will therefore cause a very small relative change to the matrix $A^\top A$, while at the same time causing a very significant change to the least eigenvector. Since $A^\top A$ is written directly in terms of the coordinates of the points $\mathbf{u} \leftrightarrow \mathbf{u}'$, we see that if $\kappa$ is large, then very small changes to the data can cause large changes to the solution. This is obviously very undesirable. The sensitivity of invariant subspaces is discussed in greater detail in [19, p. 413], where more specific conditions for the sensitivity of invariant subspaces are given.

We now consider how the condition number of the matrix $A^\top A$ may be made small. We consider two sorts of transformation, translation and scaling. These methods will be given only an intuitive justification, since a complete analysis of the condition number of the matrix is too complex to undertake here.

The major reason for the poor condition of the matrix $A^\top A$ is the lack of homogeneity in the image coordinates. In an image of dimension $200 \times 200$, a typical image point will be of the form $(100, 100, 1)$. If both $\mathbf{u}$ and $\mathbf{u}'$ are of this form, then the corresponding row of the equation matrix will be of the form $\mathbf{r}^\top = (10^4, 10^4, 10^2, 10^4, 10^4, 10^2, 10^2, 10^2, 1)$. The contribution to the matrix $A^\top A$ is of the form $\mathbf{r}\mathbf{r}^\top$, which will contain entries ranging between $10^8$ and one. For instance, the diagonal entries of $A^\top A$ will be $(10^8, 10^8, 10^4, 10^8, 10^8, 10^4, 10^4, 10^4, 1)$. Summing over all point correspondences will result in a matrix $A^\top A$ for which the diagonal entries are approximately in this proportion.

We may now use the Interlacing Property [19, p. 411] for the eigenvalues of a symmetric matrix to get a bound on the condition number of the matrix. Suppose that the diagonal entries of $X = A^\top A$ are equal to $(10^8, 10^8, 10^4, 10^8, 10^8, 10^4, 10^4, 10^4, 1)$. We denote by $X_r$ the trailing $r \times r$ principal submatrix (that is, the last $r$ columns and rows) of the matrix $A^\top A$, and by $\lambda_i(X_r)$ its $i$th largest eigenvalue. Thus, $X_9 = A^\top A$ and $\kappa = \lambda_1(X_9)/\lambda_8(X_9)$. First we consider the eigenvalues of $X_2$. Since the sum of the two eigenvalues is $\mathrm{trace}(X_2) = 10^4 + 1$, we see that $\lambda_1(X_2) + \lambda_2(X_2) = 10^4 + 1$. Since the matrix is positive semi-definite, both eigenvalues are non-negative, so we may deduce that $\lambda_1(X_2) \le 10^4 + 1$. From the interlacing property, we deduce that $\lambda_8(X_9) \le \lambda_7(X_8) \le \cdots \le \lambda_1(X_2) \le 10^4 + 1$. On the other hand, also from the interlacing property, we know that the largest eigenvalue of $A^\top A$ is not less than the largest diagonal entry. Thus, $\lambda_1(X_9) \ge 10^8$. Therefore, the ratio $\kappa = \lambda_1(X_9)/\lambda_8(X_9) \ge 10^8/(10^4 + 1)$. Usually, in fact, $\lambda_8(X_9)$ will be much smaller than $10^4 + 1$ and the condition number will be far greater.

This analysis shows that scaling the coordinates so that the homogeneous coordinates are on the average equal to unity will improve the condition of the matrix $A^\top A$.

4.1 Translation

Consider a case where the origin of the image coordinates is at the top left hand corner of the image, so that all the image coordinates are positive. In this case, an improvement in the condition of the matrix may be achieved by translating the points so that the centroid of the points is at the origin.
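As a small numerical aside on the scaling argument of Section 4, the interlacing bound on $\kappa$ can be evaluated directly from the diagonal of $\mathbf{r}\mathbf{r}^\top$. The sketch below does only this arithmetic; it does not compute eigenvalues, and the function names and the choice of sample points are assumptions made for illustration.

```python
def diagonal_pattern(point, point_prime):
    """Diagonal of r r^T for one match, with r the equation row (3)."""
    u, v = point
    up, vp = point_prime
    r = [u * up, u * vp, u, v * up, v * vp, v, up, vp, 1.0]
    return [x * x for x in r]


def kappa_lower_bound(diag):
    """Interlacing bound: lambda_1(X_9) >= largest diagonal entry and
    lambda_8(X_9) <= trace(X_2), the sum of the last two diagonal entries."""
    return max(diag) / (diag[-2] + diag[-1])


# Typical point in a 200 x 200 image, matched with a point of the same form.
diag_raw = diagonal_pattern((100.0, 100.0), (100.0, 100.0))
bound_raw = kappa_lower_bound(diag_raw)    # 1e8 / (1e4 + 1)

# After scaling so that a typical point looks like (1, 1, 1).
diag_unit = diagonal_pattern((1.0, 1.0), (1.0, 1.0))
bound_unit = kappa_lower_bound(diag_unit)  # the bound no longer forces kappa to be large
```

For the unscaled point the bound is close to $10^4$, whereas for the scaled point it drops below one and so places no restriction on $\kappa$. This only shows that the guaranteed ill-conditioning disappears; it does not by itself show that the scaled problem is well conditioned.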
The claim that centering the points improves the condition will be verified by experimentation, but can also be explained informally by arguing as follows. Suppose that the first image coordinates (the $u$-coordinates) of a set of points are $\{1001.5, 1002.3, 998.7, \ldots\}$. By translating by 1,000, these numbers may be changed to $\{1.5, 2.3, -1.3, \ldots\}$. Thus, in the untranslated values, the significant values of the coordinates are obscured by the coordinate offset of 1,000. The significant part of the coordinate values is found only in the third or fourth significant figure of the coordinates. This has a bad effect on the condition of the corresponding matrix $A^\top A$. A more detailed analysis of the effect of translation is not provided here.

5 NORMALIZING TRANSFORMATIONS

The previous sections concerned with the condition number of the matrix $A^\top A$ indicate that it is desirable to apply a transformation to the coordinates before carrying out the eight-point algorithm for finding the fundamental matrix. This normalization has been implemented as a prior step in the eight-point algorithm with excellent results.

5.1 Isotropic Scaling

As a first step, the coordinates in each image are translated (by a different translation for each image) so as to bring the centroid of the set of all points to the origin. The coordinates are also scaled. In the discussion of scaling, it was suggested that the best results will be obtained if the coordinates are scaled so that on the average a point $\mathbf{u}$ is of the form $\mathbf{u} = (u, v, w)^\top$, with each of $u$, $v$, and $w$ having the same average magnitude. Rather than choose different scale factors for each point, an isotropic scaling factor is chosen so that the $u$ and $v$ coordinates of a point are scaled equally. To this end, we choose to scale the coordinates so that the average distance of a point $\mathbf{u}$ from the origin is equal to $\sqrt{2}$. This means that the "average" point is equal to $(1, 1, 1)^\top$.
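A minimal Python sketch of the isotropic normalization just described is given below. It returns both the normalized points and the $3 \times 3$ matrix $T$ acting on homogeneous coordinates, so that the fundamental matrix can later be mapped back with $F = T'^\top \hat{F} T$ as in Section 3. The function name and the sample points are hypothetical.

```python
import math


def isotropic_normalization(points):
    """Translate the centroid to the origin and scale uniformly so that the
    average distance from the origin is sqrt(2).

    points is a list of (u, v) pairs; returns (T, normalized_points), where T
    is the 3 x 3 matrix acting on homogeneous coordinates (u, v, 1).
    """
    n = len(points)
    cu = sum(p[0] for p in points) / n
    cv = sum(p[1] for p in points) / n
    mean_dist = sum(math.hypot(p[0] - cu, p[1] - cv) for p in points) / n
    s = math.sqrt(2.0) / mean_dist
    T = [[s, 0.0, -s * cu],
         [0.0, s, -s * cv],
         [0.0, 0.0, 1.0]]
    normalized = [(s * (p[0] - cu), s * (p[1] - cv)) for p in points]
    return T, normalized


# Hypothetical pixel coordinates with a large offset from the origin.
pts = [(1001.5, 512.0), (1002.3, 498.1), (998.7, 505.4), (1010.0, 520.9)]
T, norm_pts = isotropic_normalization(pts)

centroid = (sum(p[0] for p in norm_pts) / len(norm_pts),
            sum(p[1] for p in norm_pts) / len(norm_pts))
mean_distance = sum(math.hypot(p[0], p[1]) for p in norm_pts) / len(norm_pts)
```

The same function would be called once per image, giving $T$ for the first image and $T'$ for the second.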
In summary, the transformation is as follows:
1) The points are translated so that their centroid is at the origin.
2) The points are then scaled so that the average distance from the origin is equal to $\sqrt{2}$.
3) This transformation is applied to each of the two images independently.

5.2 Non-Isotropic Scaling

In non-isotropic scaling, the centroid of the points is translated to the origin as before. After this translation the points form a cloud about the origin. Scaling is then carried out so that the two principal moments of the set of points are both equal to unity. Thus, the set of points will form an approximately symmetric circular cloud of points of radius one about the origin.

Both translation and scaling can be done in one step as follows. Let $\mathbf{u}_i = (u_i, v_i, 1)^\top$ for $i = 1, \ldots, N$ and form the matrix $\sum_i \mathbf{u}_i\mathbf{u}_i^\top$. Since this matrix is symmetric and positive definite, we may take its Choleski factorization [15], [14] to get $\sum_{i=1}^{N} \mathbf{u}_i\mathbf{u}_i^\top = NKK^\top$, where $K$ is upper triangular. It follows that $\sum_i K^{-1}\mathbf{u}_i\mathbf{u}_i^\top K^{-\top} = NI$, where $I$ is the identity matrix. Setting $\hat{\mathbf{u}}_i = K^{-1}\mathbf{u}_i$, we have $\sum_i \hat{\mathbf{u}}_i\hat{\mathbf{u}}_i^\top = NI$. Consequently, the set of points $\hat{\mathbf{u}}_i$ have their centroid at the origin and the two principal moments are both equal to unity, as desired. Note that $K^{-1}$ is upper triangular, and so it represents an affine transformation.

To summarize, the points are transformed so that
1) Their centroid is at the origin.
2) The principal moments are both equal to unity.

6 SCALING IN STAGE 2

So far we have discussed the effect of a normalizing transformation on the first stage of the eight-point algorithm, namely the solution of the set of linear equations to find $F$. The second step of the algorithm is to enforce the singularity constraint that $\det F = 0$. The method described above of enforcing the singularity constraint gives the singular matrix $\hat{F}$ nearest to $F$ in Frobenius norm. The trouble with this method is that it treats all entries of the matrix equally, regardless of their magnitude. Thus, entries of $F$ small in absolute value may be expected to undergo a perturbation much greater relative to their magnitude than the large entries.

Suppose that a set of matched points is normalized so that on the average all three homogeneous coordinates have the same magnitude. Thus, a typical point will look like $(1, 1, 1)^\top$. The fundamental matrix computed from these normalized coordinates may be expected to have all its entries approximately of the same magnitude. This may not be true if applied to specific classes of cameras, but it will be true for fundamental matrices computed from arbitrarily selected matched points, as the following argument shows. A permutation of the three homogeneous coordinates in either or both the images will result in another set of realizable matched points. The corresponding fundamental matrix will be obtained from the original one by permuting the corresponding rows and/or columns of the matrix. In doing this, any entry of $F$ may be moved to any other position. This means that no entry of the fundamental matrix is qualitatively different from any other, and hence on the average (over all possible sets of matched points) all entries of $F$ will have the same average magnitude.

Now, consider what happens if we scale the coordinates of points $\mathbf{u}_i$ and $\mathbf{u}_i'$ by a factor which we will assume is equal to 100. Thus, a typical coordinate will be of the order of $(100, 100, 1)^\top$. The corresponding fundamental matrix $F$ will be obtained from the original one by multiplying the first two rows, and the first two columns, by $10^{-2}$. Entries in the top left $2 \times 2$ block will be multiplied by $10^{-4}$. We conclude that a typical fundamental matrix derived from coordinates of magnitude $(100, 100, 1)^\top$ will have entries of the following order of magnitude:
$$F = \begin{pmatrix} 10^{-4} & 10^{-4} & 10^{-2} \\ 10^{-4} & 10^{-4} & 10^{-2} \\ 10^{-2} & 10^{-2} & 1 \end{pmatrix} \tag{6}$$
To verify this conclusion, below is the fundamental matrix (see footnote 3) for the pair of house images in Fig. 1.
$$F = \begin{pmatrix} -9.796\mathrm{e}{-08} & 1.473\mathrm{e}{-06} & -6.660\mathrm{e}{-04} \\ -6.346\mathrm{e}{-07} & 1.049\mathrm{e}{-08} & 7.536\mathrm{e}{-03} \\ 9.107\mathrm{e}{-04} & -7.739\mathrm{e}{-03} & -2.364\mathrm{e}{-02} \end{pmatrix} \tag{7}$$

3. The notation -9.766e-08 means $-9.766 \times 10^{-8}$.

In comparing (7) with (6), one must bear in mind that $F$ is defined only up to nonzero scaling. The imbalance of the matrix (7) is even worse than predicted by (6) because the image has dimension $512 \times 512$.

Now, in taking the closest singular matrix, all entries will tend to be perturbed by approximately the same amount. However, the relative perturbation will be greatest for the smallest entries. The question arises whether the small entries in the matrix $F$ are important. Consider a typical point $\mathbf{u} \approx (100, 100, 1)^\top$. In computing the corresponding epipolar line $F\mathbf{u}$, we see that the largest entries in the vector $\mathbf{u}$ are multiplied by the smallest, and hence least relatively stable, entries of the matrix $F$. Thus, for computation of the epipolar line, the smallest entries in $F$ are the most important. We have the following undesirable condition: The most important entries in the fundamental matrix are precisely those that are subject to the largest relative perturbation when enforcing the singularity constraint without prior normalization.

This condition is corrected if normalization of the image coordinates is carried out first, for then all entries of the fundamental matrix will be treated approximately equally, and none is more important than another in computing epipolar lines.

7 EXPERIMENTAL EVALUATION

The eight-point algorithm with prior transformation of the coordinates, as described here, will be called the normalized eight-point algorithm. This algorithm was tested on a large number of real images to evaluate its performance. In carrying out these tests, the eight-point algorithm with prenormalization as described above was compared with several other algorithms for finding the fundamental matrix. For the most part the implementations of these other algorithms were provided by other researchers, whom I will acknowledge later. In this way the results were not biased in any way by my possibly inefficient implementation of competing algorithms. In addition, the images and matched points that I have tested the algorithms on have been supplied to me. Methods of obtaining the matched points therefore varied from image to image, as did methods for eliminating bad matches (outliers). In all cases, however, the matched points were found by automatic means, and usually some sort of outlier detection and removal was carried out, based on least-median squares techniques (see [6], [7], [8]).

The general procedure for evaluation was as follows:
1) Matching points were computed by automatic techniques, and outliers were detected and removed.
2) The fundamental matrix was computed using a subset of all points.
3) In the case of algorithms, such as the eight-point algorithm, that do not automatically enforce the singularity constraint (that is, the constraint that $\det F = 0$), this constraint was enforced a posteriori by finding the nearest singular matrix to the computed fundamental matrix. This was done using the Singular Value Decomposition (as in [16], [20]).
4) For each point $\mathbf{u}_i$, the corresponding epipolar line $F\mathbf{u}_i$ was computed and the distance of the line $F\mathbf{u}_i$ from the matching point $\mathbf{u}_i'$ was calculated. This was done in both directions (that is, starting from points $\mathbf{u}_i$ in the first image and also from $\mathbf{u}_i'$ in the second image). The average distance of the epipolar line from the corresponding point was computed, and used as a measure of quality of the computed fundamental matrix. This evaluation was carried out using all matched points, except outliers, and not just the ones that were used to compute $F$.

7.1 Other Algorithms

A brief description of the algorithms tested follows, but first some notation. Given fundamental matrix $F$ and point $\mathbf{u}_i$, the epipolar line in the second image corresponding to point $\mathbf{u}_i$ is $F\mathbf{u}_i$. Similarly, $F^\top\mathbf{u}_i'$ is the epipolar line corresponding to $\mathbf{u}_i'$. Point $\mathbf{u}_i'$ lies on epipolar line $F\mathbf{u}_i$ if and only if $\mathbf{u}_i'^\top F \mathbf{u}_i = 0$. However, the quantity $\mathbf{u}_i'^\top F \mathbf{u}_i$ does not correspond to any meaningful geometric quantity, certainly not to distance between the point $\mathbf{u}_i'$ and the epipolar line $F\mathbf{u}_i$. Writing $F\mathbf{u}_i = (l, m, n)^\top$, the distance $d(\mathbf{u}_i', F\mathbf{u}_i)$ is equal to $|\mathbf{u}_i'^\top F \mathbf{u}_i| / \sqrt{l^2 + m^2}$, provided $\mathbf{u}_i' = (u_i', v_i', 1)^\top$. Similarly, denoting $F^\top\mathbf{u}_i'$ by $(l', m', n')^\top$, one has $d(\mathbf{u}_i, F^\top\mathbf{u}_i') = |\mathbf{u}_i'^\top F \mathbf{u}_i| / \sqrt{l'^2 + m'^2}$.

7.1.1 The Eight-Point Algorithm

In this algorithm, the points were used as is, without pretransformation, to compute the fundamental matrix. The algorithm minimizes the quantity $\sum_i (\mathbf{u}_i'^\top F \mathbf{u}_i)^2$. The singularity constraint was enforced.

7.1.2 The Eight-Point Algorithm With Isotropic Scaling

The eight-point algorithm was used with the translation and isotropic scaling method described in Section 5.1. The singularity constraint was enforced.

7.1.3 The Eight-Point Algorithm With Non-Isotropic Scaling

This is the same as the previous method, except that the non-isotropic scaling method described in Section 5.2 was used.

7.1.4 Minimizing the Epipolar Distances

An implementation by Zhengyou Zhang of an algorithm described in [9], [6], [10] was used. This is an iterative algorithm that uses a parametrization of the fundamental matrix with seven parameters. Thus the singularity constraint is enforced as part of the algorithm. The cost function being minimized is the sum of squared distances of the points from epipolar lines. The point-line distances in both images are taken into account.
Thus this algorithm minimizes

$$\sum_i \left[ d(u'_i, Fu_i)^2 + d(u_i, F^{\top} u'_i)^2 \right] = \sum_i (u_i'^{\top} F u_i)^2 \left( \frac{1}{l_i^2 + m_i^2} + \frac{1}{l_i'^2 + m_i'^2} \right).$$

Two versions of this algorithm were tested, in which respectively the unnormalized and normalized versions of the eight-point algorithm were used for initialization.

7.1.5 A Gradient-Based Technique

This algorithm is related to the previous method, but it minimizes a slightly different cost function, namely

$$\sum_i \frac{(u_i'^{\top} F u_i)^2}{l_i^2 + m_i^2 + l_i'^2 + m_i'^2}.$$

This cost function is a first-order approximation to the point-displacement error discussed in the next method (below). The implementation tested was by Zhengyou Zhang, and the algorithm is discussed in [9], [10]. Here also this algorithm was initialized using either the normalized or unnormalized eight-point algorithm.

Fig. 1. Houses images. The epipoles are a long way from the image centers.

Fig. 2. Statue image. An outdoor scene with the epipoles well away from the center.

7.1.6 Minimizing Point Displacement

This algorithm (my own implementation) is an iterative algorithm. It finds the fundamental matrix $F$ and points $\hat{u}_i$ and $\hat{u}'_i$ such that $\hat{u}_i'^{\top} F \hat{u}_i = 0$ exactly, $\det F = 0$, and the squared pixel error $\sum_i \left[ d(\hat{u}_i, u_i)^2 + d(\hat{u}'_i, u'_i)^2 \right]$ is minimized. The details of how this is done are described in [11], [21]. Under the assumption of Gaussian noise in the placement of the matched points (an approximation to the truth), this algorithm gives the fundamental matrix corresponding to the most likely true placement of the matched points (the estimated points $\hat{u}_i \leftrightarrow \hat{u}'_i$). For this reason, I have generally considered this algorithm to be the best available. The experiments generally bear out this belief, but it is not the purpose of this paper to justify this point. This algorithm is referred to as the "optimal algorithm" in this paper.
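The point-to-epipolar-line distances defined in Section 7.1, and the two cost functions of Sections 7.1.4 and 7.1.5 that are built from them, can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used in the experiments: the function name `epipolar_measures`, the array layout, and the choice to average the two directions with equal weight are assumptions made here.

```python
import numpy as np


def epipolar_measures(F, u, u_prime):
    """Return (mean_distance, symmetric_cost, gradient_cost) for matches u <-> u_prime.

    F is a 3x3 matrix; u and u_prime are (N, 2) arrays of pixel coordinates.
    The name and return layout are hypothetical, chosen for this sketch.
    """
    ones = np.ones((len(u), 1))
    uh = np.hstack([u, ones])            # rows (u_i, v_i, 1)
    uph = np.hstack([u_prime, ones])     # rows (u'_i, v'_i, 1)

    lines2 = uh @ F.T                    # row i is F u_i = (l_i, m_i, n_i)
    lines1 = uph @ F                     # row i is F^T u'_i = (l'_i, m'_i, n'_i)
    r = np.sum(uph * lines2, axis=1)     # algebraic residual u'_i^T F u_i

    s2 = lines2[:, 0] ** 2 + lines2[:, 1] ** 2   # l_i^2 + m_i^2
    s1 = lines1[:, 0] ** 2 + lines1[:, 1] ** 2   # l'_i^2 + m'_i^2

    # Unsigned distances in both directions, averaged over all matches
    # (equal weighting of the two directions is an assumption).
    d2 = np.abs(r) / np.sqrt(s2)
    d1 = np.abs(r) / np.sqrt(s1)
    mean_distance = 0.5 * float(np.mean(d1 + d2))

    symmetric_cost = float(np.sum(r ** 2 * (1.0 / s2 + 1.0 / s1)))   # Section 7.1.4
    gradient_cost = float(np.sum(r ** 2 / (s2 + s1)))                # Section 7.1.5
    return mean_distance, symmetric_cost, gradient_cost


# Toy data: this F maps (x, y, 1) to the horizontal line v' = y.
F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
u = np.array([[0.0, 0.0], [1.0, 2.0]])
u_prime = np.array([[5.0, 3.0], [7.0, 2.0]])
print(epipolar_measures(F, u, u_prime))
```

Both costs share the same algebraic residual and differ only in how it is weighted, which is consistent with the observation later in the paper that the two iterative methods behave almost identically.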
7.1.7 Approximate Calibration

The results of an algorithm of Beardsley and Zisserman [12] were provided for comparison. This algorithm does an approximate normalization of the coordinates by selecting the origin of coordinates at the centre of the image, and by scaling by division by the approximate focal length of the camera (measured in pixels; that is, the scaling factor in the calibration matrix). Since this method employs a normalization similar to the isotropic scaling algorithm, one expects it to give similar results. It does, however, rely on some approximate knowledge of camera calibration.

7.1.8 Iterative Linear

Another algorithm provided by Beardsley and Zisserman is representative of a general approach to improving the performance of linear algorithms. This same approach can be applied to many different linear algorithms, such as camera pose and calibration estimation [22], projective reconstruction from lines [23], and reconstruction of point positions in space [24]. In this approach, the eight-point algorithm is run a first time. From this initial solution a set of weights for the linear equations is computed. The set of linear equations is multiplied by these weights and the eight-point algorithm is run again. This may be repeated several times. The weights are chosen in such a way that the linear equations express a meaningful measurable quantity. In this case, to minimize point-epipolar line distance, each equation $u_i'^{\top} F u_i = 0$ is multiplied by the weight

$$w_i = \left( \frac{1}{l_i^2 + m_i^2} + \frac{1}{l_i'^2 + m_i'^2} \right)^{1/2},$$

where the values $l_i$, $m_i$, $l'_i$, and $m'_i$ are computed from the previous iteration. The advantage of this type of algorithm is that it is simple to implement compared with iterative parameter estimation methods, such as Levenberg-Marquardt [14].

7.2 The Images

The various algorithms were tried with five different pairs of images. The images are presented in Figs. 1-5 to show the diversity of image types, and the placement of the epipoles. A few of the epipolar lines are shown in the images. The intersection of the pencil of lines is the epipole. There was a wide variation in the accuracy of the matched points for the different images, as will be indicated later.

Fig. 3. Grenoble museum. The epipoles are close to the image.

Fig. 4. Corridor scene. In the corridor scene the epipoles are right in the image.

Fig. 5. Calibration jig. In this calibration jig, the matched points were known extremely accurately.

7.3 Graphical Presentation of the Results

Figs. 6-11 show the results of several runs of the algorithms, with different numbers of points being used. The number of points used to compute the fundamental matrix ranged from eight up to three-quarters of the total number of matched points. For each value of N, the algorithms were run 100 times using randomly selected sets of N matching points. The average error (point-epipolar line distance) was computed using all available matched points. The graphs show the average error over the 100 runs for each value of N. The error shown is the average point-epipolar line distance measured in pixels.

7.3.1 Effect of Normalization on the Condition Number

Fig. 6 shows a plot of the base-10 logarithm of the condition number of the linear equation set in the case of the house images, for varying numbers of points (the x-axis). The upper curve is without normalization, the lower one with normalization. The improvement is approximately $10^8$.

Fig. 6. Effect of normalization on the condition number.

7.3.2 Effect of Normalization on the Two Stages of the Algorithm

Fig. 7 shows the effect of normalization in the two stages of the eight-point algorithm. To explain this, four algorithmic steps may be identified:
• Normalization: Transformation of the image coordinates using transforms $T$ and $T'$.
• Solution: Finding matrix $F$ by solving a set of linear equations.
• Constraint enforcement: Replacing $F$ by the closest singular matrix.
• Denormalization: Replacing $F$ by $T'^{\top} F T$.

It is possible to take these steps in a different order to show the effect of normalization on the Solution (stage 1) and Constraint enforcement (stage 2) steps of the algorithm. Thus, the four curves shown correspond to the following algorithm steps:
1) No normalization: Solution → Constraint enforcement.
2) Stage 1 normalization: Normalization → Solution → Denormalization → Constraint enforcement.
3) Stage 2 normalization: Solution → Normalization → Constraint enforcement → Denormalization.
4) Both stages of normalization: Normalization → Solution → Constraint enforcement → Denormalization.

As may be seen, normalization has the greatest effect on stage 1 (the Solution stage), but normalization for stage 2 has a significant effect as well. The best results are obtained by doing normalization in both stages. Note how for N = 8 the normalization has no effect on stage 1, since in this case we are finding the solution to a set of equations, and not a least-squares solution to a redundant set. This explains why the two pairs of curves show the same results for N = 8. For these experiments, the house images were used.

Fig. 7. Effect of normalization on the two stages of the algorithm.

7.3.3 Comparison of Normalized and Unnormalized Eight-Point Algorithms

Fig. 8 shows the improvement achieved by normalization. The images used are: house, statue, museum, calibration, and corridor. Note the differences in Y-scale for the different plots. For some of the images the matched points were known with extreme accuracy (calibration image, corridor scene), whereas for others, the matches were less accurate (museum image). In all cases the normalized algorithm performs better than the unnormalized algorithm. In the cases of the statue and corridor images the effect is not so great.
In the case of the images with less accurate matches, the advantage of normalization is dramatic.

Fig. 8. Comparison of normalized and unnormalized eight-point algorithms.

7.3.4 Comparison of the Eight-Point Algorithm With the Optimal Algorithm

Fig. 9 is the same as Fig. 8, except that it compares the normalized eight-point algorithm with the optimal (minimized point displacement) algorithm. In all cases the normalized eight-point algorithm performs almost as well as the optimal algorithm.

Fig. 9. Comparison of the eight-point algorithm with the optimal algorithm.

7.3.5 Isotropic vs. Non-Isotropic Scaling

The eight-point algorithm with isotropic scaling and with non-isotropic scaling are compared in Fig. 10. The two are almost indistinguishable.

Fig. 10. Isotropic vs. non-isotropic scaling.

7.3.6 Comparison With Other Algorithms

The papers [9], [10] give details of several good algorithms, and the normalized eight-point algorithm was carefully compared with some of these. Two algorithms were tried:
1) the iterative algorithm, minimizing the symmetric point-epipolar line distance in the two images;
2) the gradient-based method.

See Section 7.1 for more details. For the tests, implementations of these algorithms supplied by Zhang in executable format were used. These are among the best algorithms available for computing the fundamental matrix. On theoretical grounds, the second of these methods may be preferable, but in our experiments they performed almost identically. This is confirmed by [10]. Consequently, only the results of the comparisons with the gradient-based method are shown in the following graphs, which compare the normalized eight-point algorithm with the gradient-based method (see Fig. 11). Results are shown in Fig. 11 for three of the data sets.
In the other two cases (statue and corridor), the results of the two algorithms were almost indistinguishable. In fact, it is a curious thing that all algorithms (even the unnormalized eight-point algorithm) give very similar performance on these two data sets.

Fig. 11. Comparison with other algorithms.

In the three graphs shown, the normalized eight-point algorithm performs distinctly better than the iterative algorithms on the house data set, worse on the museum data set, and just slightly worse on the calibration set. In this comparison the iterative algorithms were initialized using the unnormalized eight-point algorithm. Comparison with Fig. 9 shows that they do not perform as well as the optimal algorithm. If the normalized eight-point algorithm is used for initialization, then the results improve and are not significantly different from those of the optimal algorithm. Once more, Zhang's implementation was used for this test. Thus, in carrying out an iterative algorithm to find the fundamental matrix, good initialization seems to be more important than exactly which cost function is being minimized.

Fig. 12. Reconstruction error. (a) A comparison of the unnormalized and the normalized eight-point algorithms. (b) The normalized eight-point and optimal algorithms.

The normalized eight-point algorithm was also compared with the Least Median of Squares algorithm of Zhang, but the latter algorithm did not perform so well on our tests. This is probably because it is weeding out outliers. Outlier rejection has already been performed on the data sets using the techniques of [7], and all remaining points are used in evaluating the fit, including points that Zhang's Least Median of Squares algorithm may have rejected.

The normalized eight-point algorithm was also compared with two algorithms supplied by Andrew Zisserman and Paul Beardsley.
These are, respectively, the algorithms referred to as "Approximate Calibration" and "Iterative Linear" in Section 7.1. The results of all three algorithms were roughly comparable, though too few tests were run to reach a firm conclusion. The results of this test are reported in [25].

7.3.7 Reconstruction Error

To test the performance of the various algorithms for reconstruction accuracy, experiments were done to measure the degradation of accuracy as noise levels increase. The calibration images (Fig. 5) were used for this purpose. Since reconstruction error is most appropriately measured in a Euclidean frame, a Euclidean model was built for the calibration cube, initially by inspection and then by refinement using the image data. This model served as ground truth. Next, the image coordinates were corrected (by an average of 0.02 pixel) to agree exactly with the Euclidean model. Varying amounts of zero-mean Gaussian noise were added to the image coordinates, a projective reconstruction was carried out, and a projective transformation was computed to bring the projective reconstruction most nearly into agreement with the model. The average 3D displacement of the reconstructed points from the model was measured. The plotted values are the average over all points (128 in all) for 10 trials. The reconstruction error is measured in units equal to the length of the side of one of the black squares in the image.

Fig. 12a compares the unnormalized and the normalized eight-point algorithms. Fig. 12b shows the normalized eight-point and optimal algorithms. The results show that the normalized eight-point algorithm is almost indistinguishable from the optimal algorithm, but that the unnormalized algorithm performs very much worse.
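The four algorithmic steps identified in Section 7.3.2 (Normalization, Solution, Constraint enforcement, Denormalization), and the four orderings compared in Fig. 7, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used for the experiments: the function names and the `stage1`/`stage2` flags are hypothetical, the isotropic scaling is assumed to move the centroid to the origin and make the mean distance from the origin equal to $\sqrt{2}$, and "Normalization" applied after the Solution step is interpreted here as mapping $F$ into the normalized frame via $T'^{-\top} F T^{-1}$.

```python
import numpy as np


def homogeneous(pts):
    """Append a column of ones to an (N, 2) array of points."""
    return np.hstack([pts, np.ones((len(pts), 1))])


def isotropic_transform(pts):
    """Assumed isotropic scaling: centroid to origin, mean distance sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - c, axis=1))
    return np.array([[s, 0.0, -s * c[0]], [0.0, s, -s * c[1]], [0.0, 0.0, 1.0]])


def solve_linear(uh, uph):
    """Solution step: unit-norm least-squares solution of u'^T F u = 0."""
    # Row i holds the products u'_i[a] * u_i[b], matching row-major vec(F).
    A = np.einsum("ni,nj->nij", uph, uh).reshape(len(uh), 9)
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)   # right singular vector of the smallest singular value


def enforce_singularity(F):
    """Constraint enforcement: closest singular matrix in Frobenius norm."""
    U, d, vt = np.linalg.svd(F)
    d[2] = 0.0
    return U @ np.diag(d) @ vt


def eight_point(u, u_prime, stage1=True, stage2=True):
    """Eight-point algorithm with optional normalization of each stage.

    stage1: normalize before the Solution step.
    stage2: perform Constraint enforcement in the normalized frame.
    """
    uh, uph = homogeneous(u), homogeneous(u_prime)
    T, Tp = isotropic_transform(u), isotropic_transform(u_prime)

    if stage1:
        F_hat = solve_linear(uh @ T.T, uph @ Tp.T)    # solve in normalized coordinates
    else:
        F = solve_linear(uh, uph)                     # solve in pixel coordinates
        F_hat = np.linalg.inv(Tp).T @ F @ np.linalg.inv(T)

    if stage2:
        F = Tp.T @ enforce_singularity(F_hat) @ T     # enforce, then denormalize
    else:
        F = enforce_singularity(Tp.T @ F_hat @ T)     # denormalize, then enforce
    return F / np.linalg.norm(F)
```

With `u` and `u_prime` given as `(N, 2)` arrays of at least eight matches, `eight_point(u, u_prime)` corresponds to ordering 4 (both stages normalized), and `eight_point(u, u_prime, stage1=False, stage2=False)` is equivalent to ordering 1 up to rounding.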
8 CONCLUSIONS

With normalization of the coordinates in order to improve the condition of the problem, the eight-point algorithm performs almost as well as the best iterative algorithms. On the other hand, it runs about 20 times faster and is far easier to code. There seems to be little advantage in choosing the non-isotropic scaling scheme for the normalization transform, since the simpler isotropic scaling performs just as well. Without normalization of the inputs, however, the eight-point algorithm performs quite badly, often with errors as large as 10 pixels, which makes it virtually useless. It would seem to follow that the reason that other researchers have had such poor results with the eight-point algorithm is that they have not carried out any preliminary normalization step as discussed here.

Even if extra accuracy is needed and an iterative algorithm is used, it is best to use the normalized, rather than the unnormalized, eight-point algorithm to provide a starting point for iteration. Difficulties with stopping criteria, as well as the risk of finding a local minimum, mean that the quality of the iteratively estimated result depends on the initial estimate.

The technique of data normalization described here is widely applicable to other problems. Among others it is directly applicable to the following problems: computing the projective transformations between point sets; estimating the trifocal tensor [26]; and determining the camera matrix of a projective camera using the DLT algorithm [27].

ACKNOWLEDGMENTS

I wish to thank all those people who supplied data and algorithms to me for the running of these tests. This includes most specifically Andrew Zisserman and Paul Beardsley, who gave me the corridor and calibration jig image sets and matched points. Zhengyou Zhang supplied implementations of other methods used for comparison; these are available at: http://www.inria.fr/robotvis/personnel/zzhang/zzhangeng.html. Jean-Claude Cottier gave me the museum and statue images and matched points, and Long Quan and Boubakeur Boufama gave me the house images and matched points and the use of their algorithm. In addition, Gerard Medioni supplied a coding of Berthold Horn's algorithm [28] for reconstruction in the calibrated case. Evaluation of these methods in the calibrated case is a project for possible future work. Finally, thanks to Roger Mohr for making possible my sojourn in Grenoble, allowing me the possibility to do this work.

REFERENCES

[1] H.C. Longuet-Higgins, "A Computer Algorithm for Reconstructing a Scene From Two Projections," Nature, vol. 293, pp. 133–135, Sept. 1981.
[2] O.D. Faugeras, "What Can Be Seen in Three Dimensions With an Uncalibrated Stereo Rig?" Computer Vision-ECCV '92, LNCS-Series vol. 588. New York: Springer-Verlag, 1992, pp. 563–578.
[3] R. Hartley, R. Gupta, and T. Chang, "Stereo From Uncalibrated Cameras," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1992, pp. 761–764.
[4] R. Hartley and R. Gupta, "Computing Matched-Epipolar Projections," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1993, pp. 549–555.
[5] S. Carlsson, "Multiple Image Invariants Using the Double Algebra," Proc. Second Europe-U.S. Workshop on Invariance, pp. 335–350, Ponta Delgada, Azores, Oct. 1993.
[6] R. Deriche, Z. Zhang, Q.-T. Luong, and O. Faugeras, "Robust Recovery of the Epipolar Geometry for an Uncalibrated Stereo Rig," Computer Vision-ECCV '94, vol. 1, LNCS-Series vol. 800, Springer-Verlag, 1994, pp. 567–576.
[7] P.H.S. Torr and D.W. Murray, "Outlier Detection and Motion Segmentation," Sensor Fusion VI, P.S. Schenker, ed., 1993, pp. 432–443, SPIE vol. 2059, Boston.
[8] Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong, "A Robust Technique for Matching Two Uncalibrated Images Through the Recovery of the Unknown Epipolar Geometry," Artificial Intelligence J., vol. 78, pp. 87–119, Oct. 1995.
[9] Q.-T. Luong, R. Deriche, O.D. Faugeras, and T. Papadopoulo, "On Determining the Fundamental Matrix: Analysis of Different Methods and Experimental Results," Technical Report RR-1894, INRIA, 1993.
[10] Z. Zhang, "Determining the Epipolar Geometry and Its Uncertainty: A Review," Technical Report RR-2927, INRIA, 1996.
[11] R.I. Hartley, "Euclidean Reconstruction From Uncalibrated Views," Proc. Second Europe-U.S. Workshop on Invariance, pp. 187–202, Ponta Delgada, Azores, Oct. 1993.
[12] P.A. Beardsley, A. Zisserman, and D.W. Murray, "Navigation Using Affine Structure From Motion," Computer Vision-ECCV '94, vol. 2, LNCS-Series vol. 801, Springer-Verlag, 1994, pp. 85–96.
[13] S.J. Maybank, "The Projective Geometry of Ambiguous Surfaces," Phil. Trans. R. Soc. Lond., vol. A 332, pp. 1–47, 1990.
[14] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing. Cambridge, England: Cambridge Univ. Press, 1988.
[15] K.E. Atkinson, An Introduction to Numerical Analysis, 2nd ed. New York: John Wiley and Sons, 1989.
[16] R.Y. Tsai and T.S. Huang, "Uniqueness and Estimation of Three Dimensional Motion Parameters of Rigid Objects With Curved Surfaces," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, pp. 13–27, 1984.
[17] R.I. Hartley, "Minimizing Algebraic Distance," Proc. DARPA Image Understanding Workshop, 1997.
[18] F.L. Bookstein, "Fitting Conic Sections to Scattered Data," Computer Graphics and Image Processing, vol. 9, pp. 56–71, 1979.
[19] G.H. Golub and C.F. Van Loan, Matrix Computations, 2nd ed. Baltimore, Md.: Johns Hopkins Univ. Press, 1989.
[20] R.I. Hartley, "Estimation of Relative Camera Positions for Uncalibrated Cameras," Computer Vision-ECCV '92, LNCS-Series vol. 588, Springer-Verlag, 1992, pp. 579–587.
[21] R.I. Hartley, "Projective Reconstruction and Invariants From Multiple Images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, pp. 1,036–1,041, Oct. 1994.
[22] I.E. Sutherland, "Sketchpad: A Man-Machine Graphical Communications System," Technical Report 296, MIT Lincoln Laboratories, 1963, also published by Garland Publishing Inc., New York, 1980.
[23] R.I. Hartley, "Projective Reconstruction From Line Correspondences," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1994, pp. 903–907.
[24] R.I. Hartley and P. Sturm, "Triangulation," Proc. ARPA Image Understanding Workshop, 1994, pp. 957–966.
[25] R.I. Hartley, "In Defence of the 8-Point Algorithm," Proc. Int'l Conf. Computer Vision, 1995, pp. 1,064–1,070.
[26] R.I. Hartley, "A Linear Method for Reconstruction From Lines and Points," Proc. Int'l Conf. Computer Vision, 1995, pp. 882–887.
[27] I.E. Sutherland, "Three Dimensional Data Input by Tablet," Proc. IEEE, vol. 62, no. 4, pp. 453–461, Apr. 1974.
[28] B.K.P. Horn, "Relative Orientation," Int'l J. Computer Vision, vol. 4, pp. 59–78, 1990.

Richard I. Hartley received the BSc degree in mathematics from the Australian National University and MSc and PhD degrees also in mathematics from the University of Toronto. He also holds an MS degree in computer science from Stanford University. He has held various research positions in mathematics at the University of Frankfurt, Germany; Columbia University, New York; and Melbourne University, Australia, carrying out research in the area of 3D geometric and algebraic topology. Since 1985, he has been employed at GE's Corporate Research and Development Center, where he has carried out research in the areas of VLSI CAD, rapid prototyping of electronic systems, DSP circuit design, and computer vision. His interests include automated techniques for DSP chip and multichip module design. He was the lead developer of the Parsifal silicon design system, which was used extensively within GE, and he has recently published a book on digit-serial computation, describing that work. In recent years, Dr. Hartley has concentrated much of his research effort in the areas of computer vision and photogrammetry, particularly related to camera modeling and structure from motion from uncalibrated images, as well as industrial and medical applications of computer vision techniques. He is the author of over 70 research papers in the areas of photogrammetry, geometric topology, geometric voting theory, computational geometry, and computer-aided design. He holds 30 U.S. patents in the areas of CAD, circuit design, DSP design, and industrial and medical imaging.
