CSC2503: Foundations of Computer Vision, Fall 2015
Assignment 4: Detecting Human Eyes
Due: 2pm, Wednesday, December 8

This assignment uses a learning approach for view-based object detection. To get started, first download A4Handout.zip from the course web page.

Figure 1: Results of an eye detector roughly similar to the detector constructed in this assignment. Green circles indicate detected eyes. Your results will vary.

What to hand in. Write a short report in PDF format addressing each of the itemized questions below (LaTeX or Word reports are preferred, but a scanned handwritten report is also acceptable). You can assume that the marker knows the context of the questions, so do not spend time repeating material from the handout or the class notes. Include the output from your programs in your report. Pack your report into the file A4.zip, along with any Matlab files you altered or created and any additional result files you want to show. Submit this zip file electronically via CDF.

Training Data. One large set of eye and non-eye images has been split into two disjoint sets, one in trainSet.mat and the other in testSet.mat. We will use the former to train various eye detectors and, once training is complete, the latter to test their performance. The images of eyes have been warped to have roughly constant position, orientation, and scale (the scale was set by specifying an inter-eye distance of about 40 pixels). These warped images were then cropped to size 20 × 25 and are represented as 500-dimensional columns. The training and test sets also contain non-eye images, which are used to tune and test the detectors.

1. Weak Classifiers. We demonstrate an eye classifier built using the AdaBoost learning algorithm. This is a boosted classifier, based on many weak classifiers. Your task in this question is to complete trainStump.m, which sets some of the parameters for a weak classifier.
The input for this M-function is a projection vector $\vec{f}$, a flag useAbs, an $N \times K$ data matrix $X$, a $1 \times K$ class label vector $\vec{y}$ (with $y_k$ equal to 0 or 1, indicating that the $k$th column of $X$ corresponds to a non-target or a target, respectively), and a $1 \times K$ vector of non-negative weights $\vec{w}$. We seek a “weak classifier” of the form

$$h(\vec{x}, \vec{\theta}) = \begin{cases} I(u(\vec{f}^T \vec{x}) \le \theta_1), & \text{for } \theta_2 = 1, \\ I(u(\vec{f}^T \vec{x}) > \theta_1), & \text{for } \theta_2 = -1, \end{cases} \tag{1}$$

where $u(z)$ is either $z$ or $|z|$, depending on whether the flag useAbs is false or true, respectively. Also, $I(b)$ in (1) is equal to 1 when the boolean value $b$ is true, and 0 otherwise. Note that this weak classifier $h(\vec{x}, \vec{\theta})$ is similar to the weak classifier $f(\vec{x}, \vec{\theta})$ on pp. 14-17 of the Object Recognition lecture notes, except that there the classifier outputs values +1 or −1, while here it outputs +1 or 0.

The only unknowns for trainStump.m are the threshold value $\theta_1$ and the parity $\theta_2$. Complete the M-function trainStump.m so that it returns the values described in the function comment, including the optimum parameters $\vec{\theta}$. Denoting the $k$th column of $X$ by $\vec{x}_k$, the optimum $\vec{\theta}$ should minimize the weighted error

$$\mathrm{err} = \frac{\sum_k w_k \, I(y_k \ne h(\vec{x}_k, \vec{\theta}))}{\sum_k w_k}. \tag{2}$$

Hint: You only need to consider discrete values of $\theta_1$, namely the values $u(\vec{f}^T \vec{x}_k)$ where $\vec{x}_k$ is a positive or negative training example. You can then find $\theta_1$ and $\theta_2$ in Matlab without writing your own loop by using the built-in functions cumsum, sum, max, and/or min.

Complete a short Matlab script, testStump.m, to demonstrate your M-function trainStump.m. In particular, build several weak classifiers based on projection vectors $\vec{f}$ obtained from derivatives of a Gaussian kernel (i.e., generated by buildGaussFeat.m in the util directory of the handout code). You can assume (for now) that the weights in $\vec{w}$ are all equal (although it is important that your implementation of trainStump.m works correctly for any non-negative weights with $\sum_k w_k \ne 0$).
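
The threshold search suggested in the hint might be sketched as follows. This is only an illustration on made-up data, not the handout's trainStump.m: the variable names, the toy values, and the output layout theta = [theta1, theta2] are assumptions, and the actual return values are the ones specified in the handout's function comment. The vector r stands in for the responses $u(\vec{f}^T \vec{x}_k)$.

```matlab
% Hypothetical toy data (not from trainSet.mat): K = 6 responses r_k = u(f' * x_k).
r = [0.9 0.2 0.4 0.7 0.2 0.5];   % feature responses, 1 x K
y = [1   0   0   1   0   1  ];   % labels: 1 = target, 0 = non-target
w = [1   1   2   1   1   1  ];   % non-negative weights with sum(w) > 0

[rs, idx] = sort(r);             % candidate thresholds in ascending order
ys = y(idx);
ws = w(idx);

% Parity +1 predicts 1 when r <= theta1. With theta1 = rs(j), the mistakes are
% the non-targets at sorted positions 1..j plus the targets at positions j+1..K.
negBelow = cumsum(ws .* (1 - ys));
posBelow = cumsum(ws .* ys);
errPos = (negBelow + (sum(ws .* ys) - posBelow)) / sum(ws);
errNeg = 1 - errPos;             % parity -1 flips every prediction

% With tied responses, only the last position of each run of equal values
% corresponds to a consistent threshold, so the others are excluded.
valid = [diff(rs) > 0, true];
errPos(~valid) = Inf;
errNeg(~valid) = Inf;

% Pick the best (threshold, parity) pair over both parities at once.
[errBest, j] = min([errPos, errNeg]);
if j <= numel(rs)
    theta = [rs(j), 1];                  % assumed layout: [theta1, theta2]
else
    theta = [rs(j - numel(rs)), -1];
end
```

On this toy data the search selects a threshold of 0.4 with parity −1 and zero weighted error, because every target has a response above 0.4 and every non-target does not.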
In your report, show histograms of the continuous-valued function $u(\vec{f}^T \vec{x})$ for both target $\vec{x}$'s and non-targets. Report the true positive rate and the false positive rate of each of your weak classifiers using the optimal $\vec{\theta}$.

2. AdaBoost. Browse the script file trainAdaGauss.m. The next thing you need to do to complete the training part of this script is to write evalBoosted.m as described in the handout code's comments. Given a list of weak classifiers $\{h_m(\vec{x}, \vec{\theta}_m)\}_{m=1}^{M}$, which have been trained as in question 1 above and found to have errors $\mathrm{err}_m$ for $m = 1, \ldots, M$, the strong classifier is given by

$$S_M(\vec{x}) = \sum_{m=1}^{M} \alpha_m \left( 2 h_m(\vec{x}, \vec{\theta}_m) - 1 \right), \tag{3}$$

where $\alpha_m = \log((1 - \mathrm{err}_m)/\mathrm{err}_m)$. (This expression is different from the one on p. 16 of the lecture notes, since our weak classifiers have discrete values in {0, 1} instead of {−1, 1}.)

The first part of the script file trainAdaGauss.m can then be run to train a boosted classifier for eyes. This iteratively adds one weak classifier to the additive linear model (3), trying many different weak classifiers to determine which one to add. This search over many weak classifiers may take a minute or so per classifier, so you might try using only nFeatures = 20 weak classifiers. If you have more CPU resources, try training nFeatures = 50 or 100.

As each weak classifier is selected, it is particularly instructive to observe the histograms of the feature before thresholding, that is, histograms of $u(\vec{f}^T \vec{x})$, for both the targets and the non-targets. (The M-file trainStump.m in the handout code is set up to make these plots.) If there is an edge in most of the training target images at a particular scale, sign, and orientation, what might you expect a selected feature to be? If the sign of the edge varies (i.e., either light to dark or dark to light), what might you expect a selected feature to be?
If the training target images are all relatively smooth (i.e., no edges) in a particular region, what might you expect a selected feature to be?

Add code to the end of trainAdaGauss.m to plot the DET curves (see p. 7 of the Object Recognition lecture notes) obtained by thresholding the strong classifier in (3), that is, $S_M(\vec{x}) > \tau$, for different values of $\tau$. Draw separate DET curves for various values of $M$. Do this for both the training set of eyes and non-eyes, and the test set. To evaluate $S_M(\vec{x})$ for some $M$ less than the maximum number of features you trained, you do not need to rerun the training. Why? Simply limit the sum in (3) to the first $M$ terms.

Comment on the results. Are the testing and training errors for a given strong classifier (i.e., the same $M$) similar? If not, explain what the cause for this might be.

3. Application to an Image Patch. The script file tryEyeDetector.m reads in an image of several people, none of whom appear in the previous training or test sets. Write Matlab code to select a rectangle from this image (using ginput(2)) and run an eye detector centered at every pixel within the selected rectangle. Show the resulting detections overlaid on top of the original image. The reason for not running the eye detector over the whole image is that these Matlab implementations are slow.

You can write a relatively efficient Matlab implementation by storing the image as one long column vector I, and then forming X = I(J), where J is a 500 × K matrix of indices into I. If you choose the indices in J correctly, each column of X will be the pixels from a 20 × 25 patch within I. The detector can then be applied to all the columns in X quite efficiently. This trick won't work for the whole image, since you will run out of memory, but it will be able to process many patches at a time (i.e., K patches).

Small Print.
The eye detection results in Figure 1 were obtained using a positive set similar to the one used here, but with a much larger negative set. The main point of showing these results is to demonstrate that a plausibly useful result can be obtained with the techniques described in this assignment.
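
For the index-matrix construction X = I(J) described in question 3, a minimal sketch might look like the following. It rests on several assumptions that the handout does not state: the patch is taken to be 20 rows by 25 columns and vectorized in column-major order, patches are addressed by their top-left corner rather than their centre, and the small synthetic image and the corner positions are made up for illustration.

```matlab
% Hypothetical small image so the result is easy to inspect (not the handout image).
im = reshape(1:(30 * 40), 30, 40);    % 30 rows x 40 columns, value = linear index
I  = im(:);                           % image stored as one long column vector
nr = size(im, 1);                     % number of image rows
ph = 20;                              % assumed patch height (rows)
pw = 25;                              % assumed patch width (columns)

% Linear-index offsets of the 500 patch pixels relative to the patch's
% top-left pixel, listed in column-major order.
[dr, dc] = ndgrid(0:ph-1, 0:pw-1);
offsets  = dr(:) + nr * dc(:);        % 500 x 1

% Top-left corners (r0, c0) of K = 3 patches, chosen arbitrarily so that
% every patch stays inside the image.
r0 = [1 5 11];
c0 = [1 10 16];
base = r0 + nr * (c0 - 1);            % 1 x K linear indices of the corners

J = bsxfun(@plus, offsets, base);     % 500 x K matrix of indices into I
X = I(J);                             % each column of X is one vectorized patch
```

Reshaping a column, for example reshape(X(:, 2), ph, pw), recovers the corresponding image window im(5:24, 10:34), which is a convenient check that J was built correctly. For patches centred at a pixel, the corner positions would be shifted by roughly half the patch size, and K would be chosen small enough for X to fit in memory.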
