
**4 Datasets and Evaluation Framework **

This chapter describes the practical thesis work, including the definition of a general testing framework and measures, the implementation of scale-space methods, and the evaluation of the methods and their results.

The first few parts describe and define the general preparatory processes and measurements involved in one or more of the scale-space methods implemented. These include selection of test data, image pre-processing, definition of evaluation measures, description of the edge detection method and calculation of initial evaluation measures. These introductory parts end with a section describing the general framework defined and utilised for testing and evaluating all the scale-space methods investigated in this thesis.

This is followed by information pertaining to implementation, testing and evaluation of the different scale-space methods considered. Included in these sections are motivation, method-specific implementation details, a summary of overall results and more detailed samples of representative results. Three different scale-space methods are implemented: linear scale-space, nonlinear isotropic diffusion and nonlinear anisotropic diffusion.

*4.1 Selection of Test Data*

The fingerprint database used for testing has been made available by the company Fingerprint Cards AB, and the images have been acquired using their “livescan” fingerprint scanner FPC1010. Throughout the thesis fingerprint images show ridges as dark and furrows as bright pixels.

Since the measurement of a “livescan” fingerprint scanner is done by direct contact with the finger it is possible to define its inner and outer scale. The characteristics of the FPC1010 are presented in Table 4-1, which shows an inner scale of 70x70 µm and an outer scale of 10.64x14.00 mm.

| Pixel cell size | 70x70 µm |
| --- | --- |
| Number of pixels | 152x200 pixels |
| Active sensing area | 10.64x14.00 mm |
| Pixel resolution | 8 bits |

**Table 4-1: Characteristics of “livescan” fingerprint scanner FPC1010 [38]. **

The evaluation of scale-space methods is intended to be qualitative rather than quantitative, which is why the selection of images used for testing has been limited both in number and size. The main aim is to see how scale-space methods affect fingerprint images at the detail level, which is why the selected test images are small 53x53-pixel sub-areas of a fingerprint (thus reducing the outer scale to 3.71x3.71 mm). A maximum of two sub-areas from the same fingerprint have been selected. A total of 44 unique sub-areas are used, of which 34 have been acquired at five different occasions, while the remaining 10 have been acquired at three different occasions.

This gives a total of 200 images, depicting 44 unique sub-areas of fingerprints. All three or five images representing the same sub-area will be referred to as a sub-area group. The sub-area images have been selected to represent different probable occurrences of template ageing problems. The defined selection of images will be used for all testing to make comparison between results from the different scale-space methods possible.


All fingerprint images provided to the author by Fingerprint Cards AB had been manually matched (aligned) so that no translation or rotation appears between images of the same fingerprint (i.e. a feature in one image appears at the same position in all other images of the same fingerprint). This effectively excludes the problem of misregistration between two fingerprint images, a problem which is outside the scope of this thesis.

The fingerprint images have all been acquired over a period of 27 weeks with a minimum of one week between acquisitions. During testing and evaluation no consideration will be given to the actual time difference between the acquisitions of two fingerprints. Instead the focus lies on the amount of change or dissimilarities between the images in a sub-area group. The level of dissimilarity varies considerably between different sub-area groups, for one sub-area

other sub-group types are included for reference and for conclusions to be as general as possible.

**Figure 4-1: Sub-area group with images very similar to each other. **

**Figure 4-2: Sub-area group including images with essential dissimilarities. **

In each sub-area group the centre area of size 12x12 pixels is selected as a distinct feature, hereafter referred to as feature segment or feature tile. The feature tiles have been selected mainly based on how they are affected by the template ageing problem. Since the images are aligned, the feature is at the same position (a small amount of deviation is allowed) in all images of a sub-area group. The criterion for a distinct feature is that it is unique compared to the rest of the sub-area image. A distinct feature is typically, but not exclusively, a minutia. No judgements have been made as to whether a feature tile is distinct enough for identification purposes.

*4.2 Evaluation Measures*

For testing to be worthwhile, the performance of the methods and the concluded results must be evaluated; hence it is necessary to define measures for evaluation. An evaluation measure should be adapted to accurately represent the quality of the property which it is to assess. This section will define, and motivate the use of, the different evaluation measures that are applicable for this thesis.


**4.2.1 Modified Normalised Correlation Coefficient **

The aim of this thesis is to enhance fingerprint images in the sense that images depicting the same finger become more alike. Consequently some kind of similarity/matching measure would be apt to use. This will give a direct measure showing the quality of the enhancement without depending on a minutiae extraction algorithm or the like.

Selecting a matching measure involves the decision of matching criteria. Since two images depicting the same object will probably look slightly different it is important to define the matching criteria, that is, which dissimilarities are allowed while still considering two image segments a match. The intensity matching measure chosen for this thesis is the normalised correlation coefficient (NCC) [31].

$I_1$ and $I_2$ are the windows (or tiles) selected for correlation, and $\overline{I_1}$ and $\overline{I_2}$ represent the corresponding window means. Correlation of a window across an image is calculated by, for each pixel in the image, extracting a window of the same size as the correlation window and calculating the NCC.

$$NCC = \frac{\sum_{m}\sum_{n}\left(I_1(m,n)-\overline{I_1}\right)\left(I_2(m,n)-\overline{I_2}\right)}{\sqrt{\sum_{m}\sum_{n}\left(I_1(m,n)-\overline{I_1}\right)^2\;\sum_{m}\sum_{n}\left(I_2(m,n)-\overline{I_2}\right)^2}}$$
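As a concrete reference, the NCC can be sketched in a few lines of NumPy (an illustration only, not the thesis's actual implementation; the function name is ours):

```python
import numpy as np

def ncc(I1, I2):
    """Normalised correlation coefficient of two equal-size windows."""
    a = I1 - I1.mean()   # deviations from the window mean of I1
    b = I2 - I2.mean()   # deviations from the window mean of I2
    # Numerator: sum of products of deviations; denominator: square root
    # of the product of the two sums of squared deviations.
    return (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())
```

By construction the result is invariant to linear brightness and contrast changes, so `ncc(t, 2*t + 3)` equals `ncc(t, t)`.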

Although some other distance metrics were considered, the normalised correlation coefficient was ultimately selected since it is invariant to linear brightness and contrast variations [31]. Such dissimilarities are likely to stem from causes other than differing fingerprint patterns.

The NCC value ranges between –1 and 1, where a negative result indicates similarity to the inverted pattern. Applications that only consider intensity differences in a pattern, and thus regard an inversion of the original image as a perfect match, typically use the absolute value of the NCC, which gives values between 0 and 1, where 0 is a mismatch and 1 a perfect match. In fingerprint matching, however, the actual intensity values are important since this is what distinguishes ridges from furrows. A ridge ending is always surrounded by a furrow bifurcation and a ridge bifurcation surrounds a furrow ending. Allowing inverse patterns would in other words result in a high correlation score between a ridge ending and a ridge bifurcation of the same direction, which is very misleading. Therefore the original NCC value is used; however, to get a more intuitive result the NCC measure is mapped to lie between 0 and 1 instead of the initial range from –1 to 1. This measure will be referred to as the modified normalised correlation coefficient (MNCC).

$$MNCC = \frac{NCC + 1}{2}$$

Correlating a feature segment over each position in a target image creates a correlation image, which at each pixel contains the correlation result computed from the corresponding position in the target image.
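A sketch of how such a correlation image could be produced (boundary pixels where the window would leave the image are simply left at zero here, which is our assumption; the thesis does not state its boundary handling):

```python
import numpy as np

def mncc_image(tile, target):
    """Correlate `tile` over every valid position of `target` and map
    each NCC value from [-1, 1] to the MNCC range [0, 1]."""
    th, tw = tile.shape
    a = tile - tile.mean()
    na = np.sqrt((a * a).sum())
    out = np.zeros(target.shape)      # out-of-reach border stays 0 (assumption)
    for y in range(target.shape[0] - th + 1):
        for x in range(target.shape[1] - tw + 1):
            w = target[y:y + th, x:x + tw]
            b = w - w.mean()
            nb = np.sqrt((b * b).sum())
            ncc = (a * b).sum() / (na * nb) if na * nb > 0 else 0.0
            # Store the MNCC value at the window's centre position.
            out[y + th // 2, x + tw // 2] = (ncc + 1.0) / 2.0
    return out
```

A tile cut from the target itself yields an MNCC of 1 at its own position, matching the perfect-match interpretation above.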


Figure 4-3 shows the original image, an extracted tile, the image it is correlated over and the computed MNCC image. The correlation image contains values between zero (black) and one (white); thus black means a total mismatch while white represents a perfect match.

**Figure 4-3: (I) original image, (II) extracted feature tile, (III) target image, (IV) MNCC image. **

Of interest to this thesis is the value in the MNCC image at the position where the feature was extracted from the original image. Even though the images have been aligned, some misregistration is likely to exist; hence the resulting correlation value is interpreted to be the maximum value of a 5x5 pixel area centred at the feature position in the MNCC image. This area is referred to as the ground truth region and the value is referred to as the feature tile modified normalised correlation coefficient, *MNCC_feat*.

**4.2.2 Uniqueness Measure **

Since the scale-space methods suppress details, a smoothed image will at coarser scales lose intricate information and the image pattern will be flattened. This especially holds for linear scale-space methods, whereas scalar diffusion may be an exception since it actually enhances edges. It is thus reasonable to assume that a selected feature tile will become more and more similar to the whole sub-area image it is correlated to when advancing up the scale-space ladder, given that both images are scale-space filtered. Consequently similarity by itself is not sufficient as a measure for evaluating the enhancement algorithms. Some form of relative similarity would instead be more informative. For this purpose the uniqueness measure (*UM*) is introduced. It is defined as the difference between the *MNCC_feat* and the highest correlation measure throughout the rest of the sub-area image. This means that a positive uniqueness measure is only achieved when the *MNCC_feat* is the highest correlation value in the sub-area image, meaning that the feature tile is correctly matched.

Regarding the implementation process, what is interesting is the positions where the highest correlation values are found, or more accurately, whether they are within the 5x5 centre area or not. However, a correlation value is often related to the nearby correlation values, which means that a high correlation value will have several equally, or nearly as, high correlation values adjacent. Thus the only interesting correlation values are the maxima peaks. A maxima filtering process has been implemented to extract the maxima peaks across the MNCC image. Such a method is often referred to as non-maximum suppression. Each pixel in the image that has the highest value of a surrounding centred area of 5x5 pixels keeps its value, while every other pixel is set to zero. Of note is that this does not correspond to finding all maxima in a mathematical sense, since if two maxima are close enough to each other the lower of them will be ignored.
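A minimal sketch of this 5x5 non-maximum suppression (an illustration; the function name and the border handling via padding are ours):

```python
import numpy as np

def suppress_non_maxima(mncc, size=5):
    """Keep a pixel only if it is the maximum of the size x size
    neighbourhood centred on it; all other pixels are set to zero."""
    r = size // 2
    # Pad with -inf so border pixels compete only with real neighbours.
    padded = np.pad(mncc, r, mode='constant', constant_values=-np.inf)
    out = np.zeros_like(mncc)
    h, w = mncc.shape
    for y in range(h):
        for x in range(w):
            # Neighbourhood of (y, x) in padded coordinates.
            if mncc[y, x] >= padded[y:y + size, x:x + size].max():
                out[y, x] = mncc[y, x]
    return out
```

As the text notes, a peak lying within 2 pixels of a higher peak is suppressed, so not every mathematical local maximum survives.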


The result of the maxima filtering process is shown in Figure 4-4; the left image is the MNCC image, the centre is the maxima-extracted image and the right is an image where the maxima peaks have been added to a dimmed version of the MNCC image.

**Figure 4-4: (I) MNCC image, (II) extracted maxima, (III) maxima peaks shown on the dimmed MNCC image. **

Although the most important factor is to decide whether the maximal value of an MNCC image appears at the actual position of the feature tile, that is, whether the uniqueness value is positive, it is also interesting to look at the actual value of the uniqueness measure. The higher the uniqueness measure, the more unquestionable the selection of the feature tile position would be as a correct match. As previously stated, it is assumed that as the image pattern flattens at coarser scales the correlation value across the whole sub-area image is likely to increase, which in turn implies that the uniqueness measure will decrease at coarser scales.

The uniqueness measure, as previously explained, depends on two values: the *MNCC_feat* and the highest correlation measure throughout the rest of the sub-area fingerprint image. In other words, how high a correlation value is obtained at the correct position (that is, how similar the feature is in the two images), and the highest correlation value throughout the rest of the image (that is, how unique the feature is compared to the rest of the sub-area image). The uniqueness measure computed at different scales depends on the initial uniqueness of the feature; however, no initial uniqueness measure has yet been defined. To gain a more general uniqueness measure, which may be compared to uniqueness measures computed from another correlation pair, a relative uniqueness measure (*UM_rel*) is defined.

$$UM_{rel} = \frac{UM}{UM_{init}}$$

Where *UM* is the uniqueness measure and *UM_init* is the initial uniqueness measure. The initial uniqueness measure is only computed for the normalised original images and is calculated as follows: the feature tile is extracted from the sub-area fingerprint image and is then correlated over the same image, resulting in an MNCC image. Since the feature tile originates from the same image that it is correlated over, the highest *MNCC_feat* will be 1; hence *UM_init* will be one minus the highest correlation measure throughout the rest of the fingerprint image. By defining the initial uniqueness measure it is possible to calculate the relative uniqueness measure and thereby achieve a measure useful for comparison between results from different correlation pairs.
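The two measures can be sketched directly from their definitions (the 5x5 ground-truth region is per the thesis; the function name and argument layout are ours):

```python
import numpy as np

def uniqueness_measures(mncc, feat_pos, um_init, half=2):
    """UM = MNCC_feat (max of the 5x5 ground-truth region) minus the
    highest correlation elsewhere; UM_rel = UM / UM_init."""
    y, x = feat_pos
    region = mncc[y - half:y + half + 1, x - half:x + half + 1]
    mncc_feat = region.max()
    rest = mncc.copy()
    # Exclude the ground-truth region when searching the rest of the image.
    rest[y - half:y + half + 1, x - half:x + half + 1] = -np.inf
    um = mncc_feat - rest.max()
    return um, um / um_init
```

A positive *UM* means the feature position holds the highest correlation value, i.e. a correct match.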

*4.3 Image Normalisation*

Normalisation of an image can be done quite differently depending on what image feature is to be normalised. In this thesis image normalisation refers to image intensity values, and an image normalisation results in an 8-bit grey-scale image with pixel values between 0 and 1.

The acquisition of a fingerprint is a mapping of the three-dimensional ridge and valley pattern to a two-dimensional signal. The signal is usually visualised as a grey-scale image where ridges are shown as dark pixels and valleys as bright. Apart from the ridge/valley pattern, the actual pixel intensity values depend on skin wetness, the pressure of the finger on the sensor and the properties of the measurement device. Consequently two acquired images of the same finger might appear quite different (see Figure 4-5); hence some kind of grey-scale transformation is needed. Considering the mentioned problem, a sufficient solution would be linear histogram stretching to make the images use the full grey-scale range.

However, varying pressure and inexact measuring by the sensor across the finger may give rise to shifting ranges of pixel intensity across the fingerprint image, meaning that two spatially separated ridges (or furrows) of the same height in the fingerprint may not have the same intensities in the acquired image. This problem will not be helped by a global operation like a linear histogram stretch; instead some local operation must be implemented. The aim is to normalise the image so that all ridges at their peaks have value 0 and furrows at their bottoms have value 1, no matter what their spatial position. Because of the uncertainty in absolute ridge/valley height of a scanned fingerprint, the pixel intensity alone cannot be used for identification purposes and therefore no vital information will be lost by the normalisation procedure.

**Figure 4-5: Two different images of the same fingerprint. **

The method for normalisation proposed in this thesis involves a local linear histogram stretch. However, if such a grey-scale transformation is calculated area-wise, unwanted block patterns may appear. Instead the computation is done pixel-wise, implemented as follows. For every pixel in the image the normalised tile (linear histogram stretch) for an area of MxM, centred at the current pixel, is computed. The newly calculated value of the current pixel is saved in the new image. The procedure is repeated for all pixels in the image. For implementation efficiency the calculation can be done using only the minimum and the maximum pixel intensities in the selected area around the current pixel. The new pixel value is then calculated by the following formula:

$$I_{norm}(x,y) = \frac{I(x,y) - \min\left(W_{x,y}\right)}{\max\left(W_{x,y}\right) - \min\left(W_{x,y}\right)}$$

Where $I(x,y)$ is the image pixel value at position $(x,y)$, $I_{norm}$ is the normalised image, and $\min(W_{x,y})$ and $\max(W_{x,y})$ are the minimum and maximum pixel values of the window $W$ centred at $(x,y)$.

The only parameter to be defined is the size of *W* over which the grey-value transformation should be calculated. It should be large enough to contain a ridge maximum and a furrow minimum, and small enough for the computation to be considered local. A ridge/valley period (i.e. the width of one ridge and one valley) is typically between 6 and 10 pixels in the test images, which is why a window size of 11x11 has been chosen.
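The pixel-wise local stretch with the 11x11 window can be sketched directly (an unoptimised illustration; border handling by edge replication is our assumption, as the thesis does not specify it):

```python
import numpy as np

def normalise_local(img, M=11):
    """Pixel-wise local linear histogram stretch over an M x M window."""
    r = M // 2
    # Replicate the border so every pixel has a full window (assumption).
    padded = np.pad(img.astype(float), r, mode='edge')
    out = np.zeros(img.shape)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            win = padded[y:y + M, x:x + M]   # window centred on (y, x)
            lo, hi = win.min(), win.max()
            if hi > lo:                      # avoid division by zero in flat areas
                out[y, x] = (img[y, x] - lo) / (hi - lo)
    return out
```

On a linear intensity ramp, every interior pixel sits midway between its local minimum and maximum and therefore maps to 0.5, illustrating how the stretch cancels slow intensity drift.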


**Figure 4-6: Normalised images of the two fingerprints from Figure 4-5, using local linear histogram stretch. **

Throughout the rest of the report all initial images used should be presumed normalised, if not stated otherwise.

*4.4 Edge Detection*

As described in Chapter 2.2.3, the most commonly used conductivity coefficient, or diffusivity, for nonlinear isotropic diffusion is the gradient magnitude ($\left|\nabla L\right|$). The gradient magnitude works as an edge detector and allows the diffusion process to adapt the amount of smoothing to the edge response. There are however certain issues when using the gradient magnitude as an edge detector, especially at small scales, which motivates the implementation of an alternative method.

The gradient magnitude at scale *t* is calculated by:

$$\left|\nabla L_t\right| = \sqrt{\left(\frac{\partial L_t}{\partial x}\right)^2 + \left(\frac{\partial L_t}{\partial y}\right)^2} = \sqrt{L_{t,x}^2 + L_{t,y}^2}$$

The value of the gradient magnitude depends on two measures: the first derivatives in the x and y directions. For a horizontal (vertical) edge of maximum strength (i.e. highest possible contrast) the gradient magnitude will be equal to the maximum response of the first derivative in the y (x) direction, since the response of the derivative parallel to the edge will be zero and the derivative perpendicular to the edge will give maximum response. Considering Gaussian first-derivative convolution kernels which are normalised to provide a maximum response of 1, the gradient magnitude for a horizontal or vertical edge will be 1.
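A rough worked example makes the orientation dependence concrete. Under the simplifying assumption that, at small scales, a tilted edge still presents a full-contrast step profile along both the x- and y-axes, a 45° edge drives both normalised derivatives close to their maximum of 1, so

$$\left|\nabla L\right| = \sqrt{L_x^2 + L_y^2} \approx \sqrt{1^2 + 1^2} = \sqrt{2} \approx 1.41,$$

which is consistent with the white level of 1.4 used in Figure 4-8.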

Figure 4-7 depicts an edge of maximum strength with varying orientation.

**Figure 4-7: Edge of maximum strength with varying orientation. **


For edges with an orientation that is neither perpendicular nor parallel to any of the two axes, both derivatives will have values larger than zero, and the gradient magnitude may result in values larger than one. This effect is demonstrated in Figure 4-8, where the gradient magnitude of the image in Figure 4-7 has been calculated at five different scales. The scales have been selected to generate Gaussian derivative kernels of sizes 3x3, 5x5, 7x7, 9x9 and 11x11. The intensity values of the images have been adjusted to represent values between 0 (black) and 1.4 (white) to be able to depict the gradient magnitude values.


**Figure 4-8: Top row: Gradient magnitude of Figure 4-7 at scales (I) t = 0.0625, (II) t = 0.25, (III) t = 0.56, (IV) t = 1.0, (V) t = 1.56. Bottom row: Plots of maximum pixel value of each image column. **

Concluded from the examples in Figure 4-8 is that there are three disadvantages to using the gradient magnitude as an edge detector;

• it is difficult to predict the response of the gradient magnitude for a certain type of edge, since it depends on two separate measures (i.e. the horizontal and vertical derivatives),

• the gradient magnitude is not only dependent on the edge strength, but also on the orientation,

• the response of the gradient magnitude is dependent on the scale at which it is calculated.

These problems are most evident at smaller scales, but are still present at rather large scales, as can be seen in Figure 4-8. Ridge structures in fingerprint images are fairly small; therefore an accurate response from an edge detector at detailed scales is essential.

A proposed alternative to the gradient magnitude is the first derivative along the Gauge axis perpendicular to the isophote. This is the measure implemented and used as the conductivity coefficient for the scalar-driven diffusion in this thesis. Since this method only involves a single measure, it is easy to control the result by normalising the maximum response. To get intuitive values for the Gauge derivative when used as an edge detector, it is normalised to give a maximum response of 1.

The Gauge derivative is implemented by calculating the Gaussian derivative in different directions at each position and selecting the maximum response. Since it is only the strength of the edge that is considered, and not the direction, it is sufficient to calculate the Gaussian derivative over 180 degrees of rotation and take the absolute value of the result. The number of directions for which the Gaussian derivative should be calculated is dependent on the desired accuracy of the result.
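A sketch of this procedure: rotated Gaussian first-derivative kernels sampled every 5 degrees over 180 degrees, each normalised so an ideal perpendicular unit step edge yields a response of 1, keeping the maximum absolute response per pixel. The kernel construction details (support radius, normalisation by the sum of positive coefficients, symmetric boundary reflection) are our assumptions, not the thesis's stated implementation:

```python
import numpy as np

def conv_same_symm(img, k):
    """2-D convolution, 'same' output size, symmetric boundary reflection."""
    r = k.shape[0] // 2
    p = np.pad(img, r, mode='symmetric')
    kf = k[::-1, ::-1]                       # flip kernel for true convolution
    out = np.zeros(img.shape)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = (p[y:y + 2*r + 1, x:x + 2*r + 1] * kf).sum()
    return out

def gauge_edge_response(img, sigma=1.0, step_deg=5, radius=3):
    """Maximum |response| over rotated Gaussian first-derivative kernels."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    best = np.zeros(img.shape)
    for deg in range(0, 180, step_deg):
        th = np.radians(deg)
        u = xs * np.cos(th) + ys * np.sin(th)            # coordinate along direction
        k = -u * np.exp(-(xs**2 + ys**2) / (2.0 * sigma**2))
        k /= k[k > 0].sum()   # unit response to a perpendicular unit step edge
        best = np.maximum(best, np.abs(conv_same_symm(img, k)))
    return best
```

With each kernel normalised separately, the maximum response for an ideal step edge stays at 1 for the sampled orientations, at the cost of one convolution per direction.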


In Figure 4-9 the absolute value of the Gauge derivative of Figure 4-7 has been calculated at the same scales as the gradient magnitudes found in Figure 4-8. The absolute value of the Gauge derivative has been calculated with an accuracy of 5 degrees. The intensity values of the images in Figure 4-9 have been adjusted to represent values between 0 (black) and 1.4 (white), to allow comparison with the images in Figure 4-8. The actual maximum value in the images is however 1, which is apparent when observing the plots in the bottom row.


**Figure 4-9: Top row: Absolute value of Gauge derivative perpendicular to isophotes of Figure 4-7 at scales (I) t = 0.06, (II) t = 0.25, (III) t = 0.56, (IV) t = 1.0, (V) t = 1.56. Bottom row: Plots of maximum pixel value of each image column. **

Comparing the images in Figure 4-9 to those in Figure 4-8 shows that the Gauge derivative provides a more even edge response. The three disadvantages of the gradient magnitude are no longer apparent when using the Gauge derivative. The drawback of the Gauge derivative is that it is computationally more complex. Computational efficiency is however not the focus of this thesis, hence this aspect is ignored.

*4.5 Initial Evaluation Measures*

To be able to appraise the quality of a fingerprint image enhancement method, it is necessary to compare evaluation measures for enhanced images with evaluation measures for original images. The latter are termed initial evaluation measures, and they are presented in this section.

As acknowledged earlier, the selection of fingerprints used in the tests includes a total of 44 unique sub-areas, of which 34 were acquired at five different occasions, while the remaining 10 were acquired at three different occasions. For each sub-area group, every image is correlated over the remaining images, resulting in 20 correlation pairs for a sub-area group with five images. The number of correlation pairs for the whole data set is 740. The initial evaluation measures are calculated for every correlation pair, using the normalised images.
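As a consistency check: each group of $n$ images yields $n(n-1)$ ordered correlation pairs, since every image serves once as template against each of the others, which reproduces the stated total:

$$34 \cdot (5 \cdot 4) + 10 \cdot (3 \cdot 2) = 680 + 60 = 740.$$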

Included in the initial evaluation measures are *MNCC*, *MNCC_feat*, *UM* and *UM_rel*. Figure 4-10 shows a histogram of the initial *UM_rel* values. Of the 740 correlation pairs, there are initially 666 whose feature tiles are accurately matched (i.e. *UM_rel* > 0).


**Figure 4-10: Histogram of *UM_rel* values for initial fingerprint images. **

Figure 4-11 shows a histogram of the initial *MNCC_feat* values. The bars in the histogram have been divided into two groups: correlation pairs where the feature tile has been correctly matched (dark grey bars) and where they have been mismatched (light grey bars).


**Figure 4-11: Histogram of *MNCC_feat* values for initial fingerprint images. Light grey bars are correlation pairs with *UM_rel* ≤ 0 and dark grey bars represent correlation pairs with *UM_rel* > 0. **

It is evident that there is a greater risk of a mismatch for lower *MNCC_feat* values; however, one interesting aspect is that even *MNCC_feat* values above 0.9 may give false matches. The fact that no single *MNCC_feat* value exists that separates correct and false matches demonstrates the necessity of the uniqueness measures (i.e. *UM* and *UM_rel*) as a complement to the *MNCC_feat* measure.

*4.6 Framework for Testing and Evaluating Scale-Space Methods*

Many of the steps used when implementing, testing and evaluating the different scale-space methods are very similar. This motivates defining a general framework to be used for all methods tested within this thesis. It will simplify the comparison between results of the different methods, as well as make the descriptions of the methods easier for the reader to follow. This section details the different parts of this general framework.

The framework is divided into three areas: preparation, implementation and evaluation. The preparatory steps consist of the definition of evaluation measures, data set selection, normalisation of images and calculation of initial evaluation measures, all of which have been described in detail above. The implementation part includes the different steps of implementing each scale-space method in practice, for example parameter definitions and parameter boundary specifications. The final part is the evaluation, which includes a summary of overall results as well as more detailed samples of representative results.

**4.6.1 Preparation Framework **

The preparatory framework includes the steps performed prior to the implementation of the scale-space methods, and they will not be included in the sections describing the implementations.

*I) Definition of Evaluation Measures *

First the measures used to evaluate the resulting effects of the scale-space methods are defined. These measures consist of the modified normalised correlation coefficient (*MNCC*), the feature tile modified normalised correlation coefficient (*MNCC_feat*), the uniqueness measure (*UM*) and the relative uniqueness measure (*UM_rel*). These measures have all previously been described in detail.

*II) Data Set Selection *

All tests are performed on the same data set. The selection of the images included in this data set is described in 4.1 Selection of Test Data.

*III) Image Normalisation *

Image normalisation is explained in 4.3 Image Normalisation.

*IV) Calculation of Initial Evaluation Measures *

Calculation of initial evaluation measures is detailed in 4.5 Initial Evaluation Measures.

**4.6.2 Implementation Framework **

This section includes the parts involved in implementing the different scale-space methods. Only the steps shared by all methods are included in the implementation framework; additional method-specific elements are described within the corresponding sections.

*I) Description and Motivation of Method *

Each implemented method is outlined and motivated. Especially noted are deviations from basic scale-space methods, as described in 2.2 Image Representation at Different Scale.

*II) Definition of Method Parameters and Boundaries *

Parameters which affect the scale-space method are defined. The effect of the different parameters is investigated and boundaries are set to include values which provide meaningful results for fingerprint image enhancement. Parameter boundaries not possible to decide by theoretical means are instead chosen through practical testing and visual assessment. A parameter will be set to a fixed value if no other logical choice exists, or if other reasonable choices of the parameter value render very similar results, hence not affecting the final result of the scale-space smoothing method.


*III) Specification of Parameter Sample Values *

This part specifies sample values (i.e. values that should be tested during scale-space smoothing) for all parameters that were provided with boundaries in the previous step. For each parameter considered, the effect of the boundary values is investigated. The difference in the effect of the boundaries decides how many values, or samples, should be chosen for that parameter. For example, if the results from the boundary values differ greatly, then many sampling points should be included when testing the scale-space method. If the boundary values render similar results, fewer sample values are required. The number of sample points selected and their actual values are specified through practical testing and visual assessment. Parameter values will be sampled more frequently around values that are likely to render more interesting results. It is important to mention that the purpose is to evaluate the effect different parameters have on the results and to try to find patterns which explain the behaviour. Hence fine-tuning the parameters to provide the highest possible evaluation measures is not the focus of the thesis.

There are three different aspects to consider when selecting parameter sample values;

• to include a sufficient amount of values to be able to detect patterns in the effect the parameter has on the result,

• for sample positions to represent interesting values (i.e. closer sampling at interesting values),

• to limit the number of values so that the generated amount of data is comprehendible.

The latter statement contradicts the first two; therefore the actual selection of the number of values must be a balance between these three aspects.

*IV) Scale-Space Smooth Images *

In this step all 200 selected input images are scale-space smoothed by the current method, using every possible combination of the parameters previously defined. This means that each variable parameter defined adds a new axis to the dimensionality of the result images. Consider for example the images in Figure 2-19 and Figure 2-22, which in both cases produce a two-dimensional result map, since they depend on two parameters.

*V) Calculate Evaluation Measures *

This step is performed for each combination of parameters. For each sub-area group the first image is selected as template image and the feature tiles of the remaining images in the sub-area group are correlated over the template image. This results in four (for sub-area groups with five images) or two (for sub-area groups with three images) *MNCC* images. The *MNCC_feat* value is extracted and *UM* and *UM_rel* are calculated, as previously described. In the next step the second image is selected as template image and the procedure is repeated. The evaluation measures are calculated using all images in a sub-area group as template image, and this is repeated for all sub-area groups. Thereafter images scale-space smoothed with the next combination of parameter values are selected and the whole procedure is repeated from scratch. The process is presented in pseudo-code below for a more comprehensible description of the calculation of evaluation measures.


**for** each defined value of variable parameter 1, **par_1**
&nbsp;&nbsp;** … **
&nbsp;&nbsp;**for** each defined value of variable parameter n, **par_n**
&nbsp;&nbsp;&nbsp;&nbsp;*Select all 200 scale-space images calculated with the current parameter values (par_1, …, par_n)*
&nbsp;&nbsp;&nbsp;&nbsp;**for** each sub-area group
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**for** each sub-area image, **templImg = 1 to 5** (or **3**)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Select current image (templImg) as template*
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**for** remaining images, **corrImg = 1 to 4** (or **2**)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Extract feature tile from current image (corrImg), correlate it over the template and calculate MNCC_feat, UM and UM_rel*
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**end**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**end**
&nbsp;&nbsp;&nbsp;&nbsp;**end**
&nbsp;&nbsp;**end**
&nbsp;&nbsp;** … **
**end**
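The loop structure above can also be sketched as runnable code. The helper functions below are stand-in stubs, not the thesis implementation; only the loop nesting and the group sizes follow the text.

```python
# Runnable sketch of the evaluation loop; the three helpers are stubs.

def smooth_group(group, params):
    return group  # stub: would scale-space smooth each image with params

def extract_feature_tile(img):
    return img    # stub: would cut out the feature tile

def mncc_feat(tile, template):
    return 0.0    # stub: would correlate and extract the MNCC_feat value

def evaluate(parameter_grid, sub_area_groups):
    results = []
    for params in parameter_grid:                    # par_1, ..., par_n
        for group in sub_area_groups:                # groups of 5 (or 3) images
            smoothed = smooth_group(group, params)
            for t, template in enumerate(smoothed):  # each image as template
                for c, img in enumerate(smoothed):
                    if c == t:
                        continue                     # only the 4 (or 2) others
                    tile = extract_feature_tile(img)
                    results.append((params, t, c, mncc_feat(tile, template)))
    return results

# One parameter setting and one group of five images gives 5 * 4 correlation pairs.
pairs = evaluate([(1.0,)], [[f"img{i}" for i in range(5)]])
print(len(pairs))  # 20
```

With five images per group each image serves as template once, giving 20 correlation pairs per group and parameter setting, matching the four *MNCC* images per template described above.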

**4.6.3 Evaluation Framework **

The evaluation of each method starts by finding the parameter settings that generate the best and worst overall results, estimated by counting the number of accurate matches. The parameter settings with the lowest number of correct matches are considered worst, and vice versa. This is the only point at which the result for the whole data set is considered; the following evaluation steps consider separate correlation pairs in the context of their sub-area groups.
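Ranking the settings by match count is a simple maximum/minimum selection. In the sketch below the parameter settings and match counts are invented for illustration.

```python
# Hypothetical number of accurate matches per parameter setting;
# the settings and counts are invented, not thesis results.
matches_per_setting = {
    ("t=2", "lambda=1.0"): 701,
    ("t=4", "lambda=1.0"): 688,
    ("t=8", "lambda=2.0"): 640,
}

# Best = most correct matches, worst = fewest.
best = max(matches_per_setting, key=matches_per_setting.get)
worst = min(matches_per_setting, key=matches_per_setting.get)
print(best, worst)
```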

Every method generates a large amount of data, so it is impossible to analyse each correlation pair of each parameter setting in detail. Two starting points are therefore defined from which the result data is analysed, and the amount of data (i.e. the number of correlation pairs and sub-area groups) investigated thoroughly depends on whether a pattern is evident in the results or not. For each correlation pair the difference between the *UM_rel* of the enhanced images and the initial *UM_rel* is calculated. The values are then sorted in a list, where positive values indicate a higher *UM_rel* for the image enhancement method. The beginning and the end of this list are the two starting points from which the in-depth analysis is executed. The detailed evaluation thus focuses on the cases where the scale-space enhanced images have performed best and worst.
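Building that sorted list amounts to computing the *UM_rel* difference per pair and sorting in descending order. The pair names and values below are invented for illustration.

```python
# Hypothetical (initial UM_rel, enhanced UM_rel) per correlation pair.
pairs = {
    "pair_a": (0.30, 0.55),  # enhancement raised UM_rel
    "pair_b": (0.40, 0.35),  # enhancement lowered UM_rel
    "pair_c": (0.20, 0.20),  # no change
}

# Positive differences mean a higher UM_rel for the enhanced images.
diffs = sorted(
    ((enhanced - initial, name) for name, (initial, enhanced) in pairs.items()),
    reverse=True,
)

best = diffs[0][1]    # start of the list: one starting point for analysis
worst = diffs[-1][1]  # end of the list: the other starting point
print(best, worst)    # pair_a pair_b
```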

One of the main focuses of the evaluation is to examine how each scale-space method performs for fingerprints affected by the template ageing problem. Template ageing is however a general concept describing deviations in a fingerprint over time, and it is difficult to define a measure of the amount of difference between two instances of a fingerprint. Trying to estimate how strongly a correlation pair is affected by template ageing is therefore not really viable. Instead, the fingerprint images considered template ageing cases in this thesis are the correlation pairs that initially fail correlation. The calculation of the initial measures (Chapter 4.5) shows that 666 correlation pairs are initially accurately matched. The remaining 74 correlation pairs are thus considered significantly affected by the template ageing problem, and any of these pairs that succeed correlation when scale-space smoothed will be investigated in detail.
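The selection of template ageing cases is then a simple filter on the initial correlation outcome. The counts (666 matched of 740 pairs) follow the text above, but the per-pair outcome flags below are simulated.

```python
# Simulated initial correlation outcomes: True = accurate match.
# Flags are arranged so that exactly 74 of the 740 pairs fail, as in the text.
initial_match = {pair_id: pair_id >= 74 for pair_id in range(740)}

# Pairs that initially fail correlation are the template ageing cases;
# those that later succeed after smoothing get a detailed follow-up.
ageing_cases = [p for p, matched in initial_match.items() if not matched]
print(len(ageing_cases))  # 74
```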

Aspects considered when evaluating the effect of the image enhancement methods are the size of the fingerprint feature, the amount of difference between the correlation pair images, the clearness of the fingerprint structure, and the similarity or difference compared to results from other correlation pairs of the same sub-area group.

To make it easier to go through the large amount of data generated for each method, a graphical user interface (GUI) was created in Matlab (see Figure 4-12). The GUI gives a quick and easy overview of the initial and enhanced fingerprint images, as well as of the correlation results for a certain template image and specific parameter settings. A plot window was implemented to visualise the effect a specific parameter has on an evaluation measure.

**Figure 4-12: GUI for visualisation of generated result data. **

The evaluation framework described is used as a basis when evaluating the scale-space methods. Deviations might however occur since the evaluation of each method to a certain extent will depend on, and be adapted to, the differences and similarities between anticipated and actual results for that method.

