
**2 Background **

The main purpose of this chapter is to contextualize the thesis and give a brief introduction to the different areas investigated within this work. Chapter two will familiarise its reader with the theory and terminology needed to understand the rest of the thesis. Areas described within this chapter include the characteristics of fingerprints, the structure of Automatic Fingerprint Identification Systems, the basic theory of linear scale-space, and nonlinear isotropic and anisotropic scale-spaces.

*2.1 Fingerprints*

**2.1.1 Biometrics and Fingerprints **

Personal identification is usually divided into three types: by what one has (e.g. a credit card or keys), by what one knows (e.g. a password or a PIN code), or by one's physiological or behavioural characteristics. The last method is referred to as biometrics, and the six most commonly used features are the face, voice, iris, signature, hand geometry and, of course, the fingerprint [1].

| Method | Examples |
| --- | --- |
| What you know | password, PIN code, user id |
| What you have | cards, keys, badges |
| What you are (biometrics) | fingerprint, face, voice, iris, signature, hand geometry |

**Table 2-1: Identification methods**

It has been established, and is commonly known, that everyone has a unique fingerprint [2] which does not change over time [3]¹. Each person's finger has its own unique pattern; hence any finger could be used to successfully identify a person.

**2.1.2 A Fingerprint Autopsy **

A fingerprint’s surface is made up of a series of ridges and furrows. It is the exact pattern of these ridges and furrows (or valleys) that makes the fingerprint unique. The features of a fingerprint can be divided into three different scales of detail, of which the coarsest is the classification of the fingerprint.

The classification of fingerprints can be traced back to 1899, when Sir Edward Richard Henry, a British policeman, introduced the Henry Classification System [4, 5, 6], which classifies fingerprints into five different types: right loop, left loop, whorl, arch and tented arch. This classification system is still in use today but has been extended to include more types, for example the double loop, central pocket loop and accidental.

¹ That a fingerprint does not change over time is actually a qualified truth; it is described further in the section on template ageing.

**Figure 2-1: Different fingerprint types of the Henry Classification System. Top to bottom, left to right: right loop, left loop, whorl, arch, tented arch [40]**

Fingerprint databases, which tend to be very large, often index fingerprints based on their classification type [7]. Before searching, a quick classification of the fingerprint will help exclude most of the database, which consequently reduces the search time. This classification-based indexing is also often adopted by automatic fingerprint identification systems, where short search times are essential.

The second scale of fingerprint details consists of features at ridge level. The discontinuities (endings, bifurcations, etc.) that interrupt the otherwise smooth flow of ridges are called minutiae, and the analysis of them, their position and direction, is what identifies a person. Many courts of law consider a match with 12 concurring points (*the 12-point rule*) in a clear fingerprint adequate for unique positive identification [3].

Some of the most common minutiae are presented in Table 2-2. The more unusual a minutia is, the more significance it has when used for identification.

- Ending (or termination)
- Bifurcation
- Independent (or short) ridge
- Dot
- Bridge
- Spur (or hook)
- Eye (or island)

**Table 2-2: Examples of minutiae types**


A special category of minutiae is the type usually referred to as singularity points, which include the core and the delta. A core is defined as the topmost point on the innermost upward-turning ridge [6], and a delta is defined as the centre of a triangular region where ridges flowing from three different directions converge [6]. The number of cores and deltas depends on the type of fingerprint; there are even fingerprint types completely lacking singularity points. However, using the number of singularity points, their location and type is a common way to decide the fingerprint type in automatic fingerprint identification systems [8, 9], and it can also be used to calculate the rotation and translation between two images of the same fingerprint [10].

**Figure 2-2: Core (Ο) and delta (∆) points in three different types of fingerprints**

The third detail level is the finest level at which fingerprints can be analysed. Features at this scale include, for example, ridge path deviation, ridge width, ridge edge contour and pores. Analysis of third-level detail requires that the method or device used for acquisition of the fingerprint pattern is highly detailed and accurate. Historically, pores have been used to assist in forensic identification; however, most matching methods mainly use minutiae comparisons, while pore correlation can sometimes be used as a secondary identification method [11].

**Figure 2-3: Level 3 detail of fingerprint [41] **

**2.1.3 Template Ageing and Fingerprint Quality **

It is commonly known that the two foremost advantages of fingerprint identification are that fingerprints are unique (individuality) [2] and remain unchanged over time (persistence) [3]. The latter statement is, however, a qualified truth, as a fingerprint may actually vary significantly during a short period of time. The main pattern will not change, but at a smaller, more detailed scale differences may occur due to wear and tear, scars, wetness of the skin, etc. This is referred to as the problem of template ageing.

The reading of a fingerprint on two separate occasions may give relatively different results. For the minutiae extraction to be as accurate as possible, the quality of the fingerprint needs to be adequate. Apart from the method or technology used, the quality of an acquired fingerprint depends highly on the condition of the skin. The characteristic most likely to differ between two readings is the wetness of the skin.


Dry prints can appear broken or incomplete to electronic imaging systems, and with a broken ridge structure identification becomes harder due to the appearance of false minutiae. Too wet a fingerprint, on the other hand, causes adjacent features to blend together.

Scar tissue is also highly affected by the wetness of the skin: with a dry finger the scar will not print well, and with a wet finger the scar takes on the appearance of a puddle. Since scar appearance is even more sensitive to the level of skin wetness than the ordinary ridge structure, a scar that is not permanent can affect the accuracy of the minutiae extraction tremendously.

**Figure 2-4: Fingerprints of different qualities; (I) too dry, (II) too wet, (III) just right and (IV) scarred. Copyrighted images by BIO-key International, Inc., used with permission [42].**

**2.1.4 Automatic Fingerprint Identification System (AFIS) **

For contemporary applications the fingerprint identification/verification process is undertaken automatically. Such a system is commonly known as an Automatic Fingerprint Identification System (AFIS).

A generic AFIS consists of five different stages: fingerprint acquisition, image enhancement, feature extraction, matching and decision, as illustrated in Figure 2-5.

**Figure 2-5: A generic Automatic Fingerprint Identification System [based on 1, 12]**

**2.1.4.1 Identification and Verification **

An AFIS distinguishes between two different types of algorithm: identification and verification. For identification, also known as 1:N matching, a match for the acquired fingerprint is searched for in a database containing many different fingerprints; a match is achieved when a person is identified. For verification, also known as 1:1 matching, a single fingerprint template is available for comparison. In this case a match verifies that the person leaving the fingerprint is the same person from whose fingerprint the template was originally created. An example could be a smart card containing the template, in which case a verification would prove that the person who left the fingerprint is the actual owner of the smart card; hence it could, for instance, be used to replace the PIN code of a cash card.

**2.1.4.2 Enrolment **

To be identified as a valid user of a system, a person first needs to be registered on that system. For an AFIS this means the enrolment of one or more fingerprints. However, a fingerprint image requires relatively large storage space and contains a lot of unnecessary information, which is why only the specific information used for identification purposes is stored in the system. The stored fingerprint information is usually referred to as a feature template (or feature vector).

**2.1.4.3 Acquisition **

The acquisition of a fingerprint is achieved via a fingerprint scanner, of which several different types exist. These scanners are known as "livescan" fingerprint scanners, since they do not use ink but direct finger contact to acquire the fingerprint. They can be divided into five groups depending on the technique by which they acquire the fingerprint: optical, capacitive, thermal, ultrasound and non-contact methods [1]. The characteristics of the image a scanner returns depend on the type of scanner used. For example, optical and capacitive scanners tend to be sensitive to the dryness/wetness of the skin, and thermal scanners, although overcoming the wetness problem, give images with poor grey values [1].

**2.1.4.4 Image Enhancement **

After acquisition, a fingerprint image usually contains noise and other defects due to poor quality of the scanning device or similar reasons; therefore image enhancement is required. The performance of a feature extraction algorithm relies heavily on the quality of the input fingerprint images, so the typical purpose of image enhancement in an AFIS is to prepare for feature extraction by improving the clarity of ridges and furrows [13] and suppressing noise [14].

It is, however, difficult to suppress noise and other spurious information without corrupting the actual fingerprint pattern. Various image processing techniques have been proposed, and which to use depends on what type of image defects need to be suppressed. Some examples include normalisation [13], clipping [8] and compensation for non-uniform inking or illumination characteristics of an optical scanner [2].

A further example of image processing, closely related to image enhancement, is the segmentation of fingerprint images [15]. A segmentation algorithm is used to decide which part of the image is the actual fingerprint and what part is the background (i.e. the noisy area at the borders of the image). Discarding the background will reduce the number of false features detected.

Also often used is some type of quality measure, which has a goal similar to that of a segmentation algorithm: to define the part of the image that contains fingerprint pattern of adequate quality. This is accomplished by determining the fingerprint image quality locally over the whole image, and then discarding the parts of the fingerprint not reaching the required quality value. Examples include the coherence measure [16] and the certainty level of the orientation field [12].
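As an illustration of such a local quality measure, the following is a minimal sketch of a coherence-style measure computed blockwise from local gradient (structure tensor) statistics. The blockwise formulation and the function name are illustrative, not the exact measure of [16]:

```python
import numpy as np

def coherence_map(img, block=8):
    """Local coherence of gradient orientations, one value per block.

    Coherence near 1: strongly oriented ridge pattern (good quality);
    near 0: isotropic or noisy region (low quality).
    """
    gy, gx = np.gradient(img.astype(float))
    h, w = img.shape
    out = np.zeros((h // block, w // block))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            sx = gx[i*block:(i+1)*block, j*block:(j+1)*block]
            sy = gy[i*block:(i+1)*block, j*block:(j+1)*block]
            gxx = np.sum(sx * sx)          # structure tensor entries
            gyy = np.sum(sy * sy)
            gxy = np.sum(sx * sy)
            denom = gxx + gyy
            if denom < 1e-12:
                out[i, j] = 0.0            # flat block: no orientation at all
            else:
                out[i, j] = np.sqrt((gxx - gyy)**2 + 4 * gxy**2) / denom
    return out
```

Blocks whose coherence falls below a chosen threshold would then be discarded before feature extraction.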

**2.1.4.5 Feature Extraction **

The fingerprint signal in its raw form contains the necessary data for successful identification hidden amongst a lot of irrelevant information.


Thus image enhancement processes remove noise and other clutter before the next step of localising and identifying distinct features, so-called feature extraction. Today's AFISes commonly identify only ridge endings and bifurcations as distinct features [1, 12], mainly because all other types of minutiae can be expressed using only these two main types, and because they are by far the most common [19]. Algorithms often return too many features, some of which are not actual minutiae; hence some kind of post-processing to remove these spurious minutiae is necessary.

A typical feature extraction algorithm is shown in Figure 2-6, and is explained more thoroughly in [12]. It involves five operations: (I) orientation estimation, with the purpose of estimating local ridge directions; (II) ridge detection, which separates ridges from valleys using the orientation estimate, resulting in a binary image; (III) thinning/skeletonisation, giving the ridges a width of 1 pixel; (IV) minutiae detection, identifying ridge pixels with three ridge-pixel neighbours as ridge bifurcations and those with one ridge-pixel neighbour as ridge endings; and (V) post-processing, which removes spurious minutiae.

**Figure 2-6: Example of minutiae extraction algorithm; (I) input fingerprint, (II) orientation field, (III) extracted ridges, (IV) skeletonized image and (V) extracted minutiae. **
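The neighbour-counting rule of step (IV) can be sketched as follows for a binary skeleton image (a minimal illustration; real implementations typically use the crossing-number formulation to handle diagonal-adjacency artifacts):

```python
import numpy as np

def find_minutiae(skel):
    """Step (IV) of the pipeline: classify skeleton pixels by the number
    of 8-connected ridge-pixel neighbours (1 -> ending, 3 -> bifurcation)."""
    endings, bifurcations = [], []
    h, w = skel.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if not skel[y, x]:
                continue
            # count ridge pixels in the 3x3 neighbourhood, minus the centre
            n = skel[y-1:y+2, x-1:x+2].sum() - 1
            if n == 1:
                endings.append((y, x))
            elif n == 3:
                bifurcations.append((y, x))
    return endings, bifurcations
```

For a straight 1-pixel ridge this reports exactly its two end points and no bifurcations.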

**2.1.4.6 Matching **

The matching module determines whether two different fingerprint representations (extracted features from test finger and feature template) are impressions of the same finger [12, 17].

There are six possible differences between the extracted template and the reference template that need to be compensated for [1]: (I) translation, (II) rotation, (III) missing features, (IV) additional features, (V) spurious features and (VI) elastic distortion between a pair of feature sets. Missing and additional features may be due to an overlap mismatch caused by translation between the two fingerprint readings.

Fingerprint matching algorithms usually adopt a two-stage strategy: first the correspondence between the feature sets is established, and then the actual matching is performed [17, 12]. The matching algorithm defines a metric (the match score) of the similarity between the two fingerprint feature sets, and a comparison with a system-defined decision threshold results in a match or a non-match. The value of the decision threshold decides the security level of the system: a high value will give a more secure system but will also result in more false rejections, while a lower value may give additional false acceptances and hence be less secure.

An example of a match score is the Goodness Index [18] which takes into consideration the number of spurious, missing, and paired minutiae and weighs them with a local quality factor.


The effect of the quality factor is that spurious and missing minutiae in a high-quality area of the fingerprint affect the Goodness Index more than those in a low-quality area.

**2.1.4.7 Performance Evaluation **

There are four possible outcomes of an identification or verification attempt; a valid person being accepted (true positive), a valid person being rejected (false negative or false rejection), an impostor being rejected (true negative) and an impostor being accepted (false positive or false acceptance). The accuracy of an AFIS is defined by the relative number of false acceptances (false acceptance rate, FAR) and false rejections (false rejection rate, FRR).
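Empirically, FAR and FRR can be estimated from recorded match scores of genuine and impostor attempts at a given decision threshold. A minimal sketch, using synthetic score distributions purely for illustration:

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR: fraction of impostor attempts scoring at or above the threshold
    (false acceptances); FRR: fraction of genuine attempts scoring below it
    (false rejections)."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    far = np.mean(impostor >= threshold)
    frr = np.mean(genuine < threshold)
    return far, frr

# Synthetic, well-separated score distributions (illustrative only).
rng = np.random.default_rng(1)
gen = rng.normal(0.8, 0.05, 1000)   # genuine attempts tend to score high
imp = rng.normal(0.3, 0.05, 1000)   # impostor attempts tend to score low
far, frr = far_frr(gen, imp, threshold=0.55)
```

Raising the threshold can only decrease the FAR and increase the FRR, which is exactly the security/convenience trade-off described above.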

Figure 2-7 shows a plot of the impostor (*H1*) and genuine (*H0*) distribution curves, with the match score (*s*) on the horizontal axis and a decision threshold (*Td*) defined as a specific match score.

A matching attempt giving a match score higher than the decision threshold will result in user acceptance, and a match score lower than the decision threshold will give a rejection. The area under the genuine distribution to the left of the decision threshold is the FRR, and the area under the impostor distribution to the right of the decision threshold is the FAR. An optimal situation would be for the distribution curves to be completely separated, since that would allow for a decision threshold resulting in zero FAR and FRR. In reality, however, no AFIS is that accurate, and the threshold must be decided depending on the sought characteristics of the AFIS. The value of the decision threshold is a trade-off between security and user inconvenience. For example, a high-security access application for obvious reasons uses a high decision threshold to get a low FAR, whereas a less secure system may use a lower decision threshold to avoid unnecessary false rejections that could disturb a user of the system. Further examples of applications that use a low decision threshold include forensic applications, which want to make sure that the AFIS does not overlook a potential suspect; thus, at the cost of more false acceptances, a low decision threshold is preferred.

The FAR and FRR distribution curves are usually used when evaluating an AFIS. The two measurements give information on different characteristics of an AFIS. FAR analysis focuses on the individuality of fingerprints (i.e. how unique a fingerprint or fingerprint representation actually is), as the reason for a high FAR is a high level of similarity between non-matching fingerprints or fingerprint representations. FRR analysis centres on the template ageing problem, because a high FRR is due to dissimilarities between two different acquisitions of the same fingerprint [11].

**Figure 2-7: Impostor (H1) and genuine (H0) distribution curves**


*2.2 Image Representation at Different Scale*

**2.2.1 The Notion of Scale **

As stated by Tony Lindeberg in [20], *"An inherent property of real-world objects is that they only exist as meaningful entities over certain ranges of scale"*. Every day, humans view many objects over a large range of scales without reflecting on it. To better describe the concept of scale we need to go outside the scale range perceivable by human vision. These scale ranges are less intuitive from the human vision point of view but will, for that very reason, hopefully make the notion of scale more comprehensible. A perfect example is the Powers of 10 series [39], where images at scales of integer powers of 10 meters are shown. A sample is shown in Figure 2-8, where the leftmost image, representing the scale of 10²¹ m, illustrates a swirl of billions of stars within the Milky Way galaxy, the middle one, at scale 10⁰ m, shows a man resting at a picnic, and the rightmost picture shows the individual neutrons and protons that make up the structure of the carbon atom at the scale of 10⁻¹⁴ m.

**Figure 2-8: Powers of 10. Images with scales of powers of 10 meters. (I) 10²¹ m, a swirl of a hundred billion stars in our Milky Way galaxy, (II) 10⁷ m, the Earth, (III) 1 m, a man, (IV) 10⁻³ m, just below the skin of the man's hand and (V) 10⁻¹⁴ m, an atom with individual neutrons and protons visible. [39]**

Each image in Figure 2-8 represents a certain scale range. For example, the image of Earth has a defined scale of 10⁷ m; hence objects of larger scale are not visible in this image. This is referred to as the *outer scale* of the image. When observing the next image, of the man at scale 1 m, we realise that the man (or any other human being, for that sake) is not visible in the image of Earth. This means that there is also a smallest scale of what is being depicted in an image. This is referred to as the *inner scale* of the image, and it is defined by the resolution of the image. In the case of Figure 2-8 all images have a resolution of 198 × 198 pixels, which gives an inner scale of approximately 51 mm for the image of the man. This means that we are able to see the fingers of the man, since the scale range of a grown man's finger is within the scale range of 51 mm to 1 meter, but we are not able to identify the hands of his watch, since their scale range is outside the image scale range.

The scale range of an object is closely connected to the process of observation. An observation is the measurement of a physical property made by an *aperture*. In a mathematical sense a measurement can be made infinitely small (sampling), however for physical measurements the aperture must for obvious reasons be of finite size. The physical property is integrated (weighted integration) over the size of the aperture, and the size of the aperture defines the resolution of the resulting signal.


Take for instance a digital camera, where each element (aperture) of the CCD² integrates light over a spatial area, resulting in a pixel value in the final image. A measurement device (like an object) is also limited by a scale range, which is defined by the smallest (inner scale) and the largest (outer scale) size of measurable objects or features. The outer scale is thus bounded by the size of the detector (e.g. the whole CCD for a digital camera) and the inner scale is limited to the integration size of the smallest aperture (e.g. a pixel for a digital camera) [35]. However, the scale boundaries of, for example, a camera are not fixed, since they depend on the distance between the camera and the object of interest; instead the ratio between the outer scale and the inner scale is commonly used to define the dimension of a measurement device [35].

An example of the use of scale and measurement-device limitations is image dithering. When printing a grey-scale image on a black and white laser printer, the grey-scale intensity values are achieved by adjusting the frequency of the black dots on the paper: the higher the frequency of printed black dots, the darker the colour. This is called dithering. It works because the dots applied to the paper by a laser printer are smaller than what a human eye is able to perceive. The eye will thus integrate the intensity values over a small area (defined by the inner scale of the eye), and the relative coverage of black respectively white within such an area defines the intensity value perceived by the eye. Figure 2-9 illustrates an example of image dithering.

**Figure 2-9: (I) Original image, (II) dithered image and (III) magnification of a detail. **
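The dot-frequency idea above can be sketched with ordered dithering against a tiled threshold matrix. This is a minimal sketch; the 2×2 Bayer matrix is an illustrative choice (real printers use larger matrices or error diffusion):

```python
import numpy as np

def ordered_dither(gray):
    """Binarise a grey-scale image (values in [0, 1]) by comparing each
    pixel against a tiled 2x2 Bayer threshold matrix; darker areas end
    up with a denser pattern of black dots."""
    bayer = (np.array([[0, 2],
                       [3, 1]]) + 0.5) / 4.0        # per-pixel thresholds
    h, w = gray.shape
    thresh = np.tile(bayer, (h // 2 + 1, w // 2 + 1))[:h, :w]
    return (gray > thresh).astype(np.uint8)          # 1 = white, 0 = black dot
```

For a uniform mid-grey input, half of the pixels in every 2×2 tile come out white, so the eye integrates the pattern back to mid grey, exactly as described above.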

**2.2.2 Linear Scale-Space **

One of the most basic and important tasks in the field of image analysis is deriving useful information about the structure of a 2D signal. To extract data of any type of representation from an image, an operator is used to interact with the data. Two general questions always need to be answered when developing a system for automatic image analysis: firstly, what kind of operator to utilise, and secondly, what size it should have. Which type of operator should be used depends on what feature or sort of structure is being detected in the image. Examples of image features commonly of interest within the field of image analysis include edges, corners and ridges (i.e. lines).

The second question, concerning the size of the operator, depends on the expected size of the features to detect. However, sometimes the size of the sought features is not known, in which case it may be of interest to search for features at different scales. This section of the thesis will describe how the notion of scale has been incorporated into the mathematics of uncommitted observation, resulting in the framework of linear scale-space.

² A CCD (charge-coupled device) is a small chip with several hundred thousand individual picture elements. It is commonly used in digital cameras, where at each pixel position it absorbs incident light and converts it to an electric signal. Each picture element on the CCD results in a pixel in the digital image.

The initial scale-space idea is to be able to represent an image at arbitrary scales. An image is initially bound by its inner and outer scale, which limit the scale ranges that may be represented (i.e. there is no information in the 2D signal about objects or features outside the scale range of the image). In other words, it is impossible to derive an image of a finer scale than the inner scale of the original image without additional information. What is possible, however, is to describe the image at coarser scales by raising the inner scale. A very common practical example where this is useful is noise suppression: since noise usually appears at fine scales (often at pixel level), it will be effectively suppressed if the inner scale of the image is raised beyond the scale of the noise.

Representing an image at a coarser scale can be compared to observing it through an aperture of larger width than its inner scale. Practically this means that the image must be filtered by an operator (which represents the aperture). The basic question to ask is what operator to use.

To derive an operator that can be used to represent images at coarser scales, some requirements on its behaviour must be specified. Linear scale-space can be compared to the visual front-end of human vision. The visual front-end is defined as the first stage of vision, where an uncommitted observation is made: no prior knowledge about, and no feedback to, the image is available at this stage of the vision process. From this starting point several papers have defined a number of similar axioms, or requirements, from which they all in different ways have derived the Gaussian kernel as the unique kernel for a linear scale-space framework [36].

Using the concept of an uncommitted observation as a prerequisite the following axioms may be used to derive the Gaussian [21]:

• *linearity*, no a priori knowledge about, or model of, the image exists,

• *spatial shift invariance (homogeneity)*, no spatial position is preferred (i.e. the whole image is treated equally),

• *isotropy*, no preferred direction, features of all directions are treated equally (this axiom automatically results in a circular operator in 2D, and spherical in 3D [35])

• *scale invariance*, no specific scale is emphasized.

The Gaussian kernel supplies a one-parameter family of kernels with which images can be described at arbitrary (coarser) scales. The two-dimensional Gaussian kernel is

$$g(x, y; t) = \frac{1}{2\pi t}\, e^{-\frac{x^2 + y^2}{2t}}$$

where *x* and *y* are the spatial coordinates and *t* is the scale parameter. The relation between the scale parameter and the standard deviation of the Gaussian is *t* = σ². The factor in front of the exponential function is a normalising factor, which makes the integral of the Gaussian always exactly one:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, dx\, dy = 1$$

This is an important feature of the Gaussian function when used within the scale-space concept, since it ensures that the average grey-level of the image remains the same when blurring with the Gaussian kernel [35]. The normalisation must not be forgotten when discretising the function. Figure 2-10 shows the Gaussian kernel both as a 2D image and a 3D mesh.

**Figure 2-10: 2D Gaussian kernel; (I) 2D image, (II) 3D mesh. **
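A minimal sketch of discretising this kernel, with the explicit renormalisation step the text warns about (the truncation radius of four standard deviations is an illustrative choice):

```python
import numpy as np

def gaussian_kernel(t, radius=None):
    """Sampled 2D Gaussian g(x, y; t) = exp(-(x^2 + y^2) / (2t)) / (2*pi*t),
    renormalised so that the discrete samples sum exactly to one, which
    preserves the average grey-level under convolution."""
    if radius is None:
        radius = int(np.ceil(4 * np.sqrt(t)))   # t = sigma^2, so sigma = sqrt(t)
    ax = np.arange(-radius, radius + 1)
    x, y = np.meshgrid(ax, ax)
    g = np.exp(-(x**2 + y**2) / (2.0 * t)) / (2.0 * np.pi * t)
    return g / g.sum()                           # the discretisation renormalisation
```

Without the final division the truncated, sampled kernel would sum to slightly less than one and convolution would darken the image.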

An image scale-space (a term developed by Witkin, 1983 [22], and Koenderink, 1984 [21]) is defined as the stack of images created by including the original image and all subsequent images resulting from convolution with the Gaussian kernel of increasing width, with the scale parameter bounded by the inner and outer scale of the image. Although Witkin's and Koenderink's articles are considered to have pioneered the concept of linear scale-space in the western world, it is also necessary to point out that similar results had already been achieved in 1959 in Japan by Taizo Iijima [23]. However, these results, and the research following them, were not known in the western world until much later (the first known reference in western research literature is dated to 1996 [23]). Figure 2-11 shows three different images and samples from their scale-spaces. An image at scale zero (*t* = 0) is defined to be the original, unsmoothed, image.

**Figure 2-11: Scale-space images; (I) t = 0, (II) t = 4, (III) t = 8, (IV) t = 16 and (V) t = 32. **

It is easy to see how features of finer scales are suppressed at higher scales. Take for instance the images of the baboon in the top row of Figure 2-11. In the leftmost (original) image the fine structure of the hair in the baboon's fur is visible, but it quickly disappears when traversing up the scale-space ladder. At coarser scales somewhat larger features, like the eyes and nostrils, disappear, and at scale *t* = 32 only the outlines of the different parts of the baboon's face are visible. The images in the middle row illustrate a cosine signal with varying period. It can be interpreted as vertical lines of different scales. It is easily noticed how the lines of smaller scales disappear early in the image scale-space, and how thicker lines are successively smoothed at coarser scales. At a large enough scale an image will always converge towards a single grey value.
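Such a stack is straightforward to build by convolving the image with sampled Gaussians of increasing *t*. A pure-NumPy sketch exploiting the separability of the Gaussian (function names are illustrative; the border handling here is simple zero padding):

```python
import numpy as np

def gauss1d(t, radius=None):
    """Sampled, normalised 1D Gaussian at scale t (t = sigma^2)."""
    if radius is None:
        radius = int(np.ceil(4 * np.sqrt(t)))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * t))
    return g / g.sum()

def smooth(img, t):
    """Convolve with the 2D Gaussian at scale t via its two separable
    1D factors (rows first, then columns)."""
    g = gauss1d(t)
    rows = np.apply_along_axis(np.convolve, 1, img.astype(float), g, mode='same')
    return np.apply_along_axis(np.convolve, 0, rows, g, mode='same')

def scale_space(img, scales=(4, 8, 16, 32)):
    """The stack of Figure 2-11: the original image followed by
    increasingly smoothed versions."""
    return [img.astype(float)] + [smooth(img, t) for t in scales]
```

Separability reduces the cost of a (2r+1)×(2r+1) convolution to two passes of length 2r+1, which matters as *t* (and hence the kernel width) grows.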

Figure 2-12 shows a detail of the fingerprint in Figure 2-11, and how it evolves at coarser scales. The upper left image is the original fingerprint feature, followed (left-to-right, top-to-bottom) by scale-space smoothed versions at t = 1, 4 and 16. The top right image (calculated at t = 1) shows how deviations within the ridges and furrows (i.e. very small features) are evened out. In the image calculated at scale t = 4 it is noticeable how the ridge/furrow pattern itself is weakened, and in the lower right image (t = 16) the feature is almost completely flattened, with barely any structure left.

**Figure 2-12: Detail of fingerprint; (left-to-right, top-to-bottom) original and scale-space smoothed at t = {1, 4, 16}.**

As previously mentioned, linear scale-space has been derived in many different ways, of which one more is essential to mention. In the 1984 article "The structure of images", Koenderink was the first to show that the *diffusion equation* is the generating equation of a linear scale-space [21]. Koenderink used the concept of *causality* as a starting point, an axiom stating that no new level surfaces may be created in the scale-space representation when the scale parameter is increased [20]. This axiom has also been formulated in several different ways, one of which is that local extrema are not enhanced at coarser scales (i.e. intensity values of maxima decrease and those of minima increase).

The diffusion equation is defined as

$$\partial_s L = \nabla^2 L = \Delta L$$

where *s* is the scale parameter. The diffusion equation states that the derivative with respect to scale equals the divergence of the gradient of *L*, the luminance function (or image) in our case [21]. This is the same as the sum of the second partial derivatives, which is the Laplacian (Δ*L*). Bart ter Haar Romeny supplies an interpretation of the diffusion equation in the case of scale-space smoothing an image [35]: "The luminance can be considered a flow that is pushed away from a certain location by a force equal to the gradient. The divergence of this gradient gives how much the total entity (luminance in our case) diminishes with time". The relationship between the scale parameter *t* in the Gaussian function and *s* in the diffusion equation is *t* = 2*s* [21]. In the two-dimensional case the diffusion equation becomes

$$\partial_s L = \frac{\partial L}{\partial s} = \frac{\partial^2 L}{\partial x^2} + \frac{\partial^2 L}{\partial y^2} = L_{xx} + L_{yy}$$

With the requirements of causality, isotropy, homogeneity and linearity, the solution to the diffusion equation is the Gaussian kernel [21]. This solution is referred to as the Green's function of the diffusion equation. The initial condition of the diffusion equation is defined as

$$L(\,\cdot\,; 0) = L_0$$

which means that the scale-space image *L* at scale 0 is the original image *L₀*.

The diffusion equation is well known within physics and is often referred to as the *heat equation* within the field of thermodynamics, since it describes the heat distribution (*L*) over time (*s*) in a homogeneous medium with uniform conductivity [37].

Solving the linear diffusion equation and convolving with a Gaussian give the same results, thus there are two options when implementing a linear scale-space: to approximate the diffusion equation or the convolution process [36]. Throughout this report the linear scale-space is implemented by approximating the convolution with a Gaussian kernel. Nevertheless, the diffusion equation will prove to be a better alternative for implementing nonlinear scale-spaces, which will be described in the following sections.
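As an illustrative sketch of the convolution approach (NumPy/SciPy assumed; the function name and test image are not from the thesis, and the convention that *t* is the variance of the Gaussian, so σ = √t, follows the *t* = 2*s* relation above):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def linear_scale_space(image, t):
    """Smooth `image` to scale t by convolving with a Gaussian kernel.

    t is taken to be the variance of the Gaussian, so the standard
    deviation of the kernel is sqrt(t).
    """
    return gaussian_filter(image.astype(float), sigma=np.sqrt(t))

# Example: a step edge smoothed at increasing scales; the variance of
# the intensity values decreases monotonically with t.
img = np.zeros((64, 64))
img[:, 32:] = 1.0
L1, L4, L16 = (linear_scale_space(img, t) for t in (1, 4, 16))
```

Larger *t* removes progressively finer detail, matching the samples shown in Figure 2-12.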

An essential additional result of linear scale-space is that the spatial derivatives (of arbitrary order) of the Gaussian are also solutions to the diffusion equation [21]. Shown in Figure 2-13 is an image and a 3D mesh of the 1st derivative in the x-direction of a 2D Gaussian kernel.

**Figure 2-13: 1st derivative of a 2D Gaussian in x-direction; (I) 2D image, (II) 3D mesh. **

Together with the Gaussian kernel, the Gaussian derivatives form a complete family of differential operators [21]. Using the scale-space family of differential operators it is possible to create a scale-space of any measurement. Figure 2-11 showed samples from the scale-space of image intensity values. Figure 2-14 shows a scale-space of the gradient magnitude. In this case the scale parameter defines the size of the Gaussian derivatives. The gradient magnitude at scale *t* is calculated by:

‖∇t L‖ = √( (∂Lt/∂x)² + (∂Lt/∂y)² ) = √( Lt,x² + Lt,y² )

**Figure 2-14: Samples from a gradient magnitude scale-space; (I) original image, (II) t = 0.01, (III) t = 1, (IV) t = 4 and (V) t = 16. **

Obviously the edges of finer scale features are apparent in the images at low scales, while in the images at scale t = 16 only the edges of larger objects are visible. An interesting feature in the fingerprint image at scale t = 4 is that the gradient magnitude of the ridges close to the core point of the fingerprint is reasonably strong, while the gradient magnitude of ridges elsewhere in the fingerprint is barely noticeable. This is because the ridges in the centre of the fingerprint are slightly wider and further apart, thus they are still large enough to be detected as edges at scale t = 4.
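The Gaussian-derivative computation of the gradient magnitude described above can be sketched as follows (SciPy assumed; names are illustrative, not from the thesis):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gradient_magnitude(image, t):
    """Gradient magnitude at scale t, computed with Gaussian derivative
    kernels of standard deviation sqrt(t) (t being the variance)."""
    sigma = np.sqrt(t)
    # order=(0, 1): derivative along columns (x); order=(1, 0): rows (y).
    Lx = gaussian_filter(image.astype(float), sigma, order=(0, 1))
    Ly = gaussian_filter(image.astype(float), sigma, order=(1, 0))
    return np.sqrt(Lx ** 2 + Ly ** 2)

# A vertical step edge: the response peaks at the edge and weakens as
# the scale increases, as in Figure 2-14.
img = np.zeros((32, 32))
img[:, 16:] = 1.0
fine, coarse = gradient_magnitude(img, 1), gradient_magnitude(img, 16)
```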

This section has described the concept of linear scale-space and demonstrated it to be a framework for multi-scale image analysis, based on a solid mathematical foundation, which provides a one-parameter family of kernels to derive images and image measurements at arbitrary scales within the image scale range. This solid mathematical foundation strongly motivates the use of scale-space methods in image processing and analysis.

**2.2.3 Nonlinear Isotropic Scale-Spaces **

Chapter 2.2.2 showed that the isotropic Gaussian kernel is the unique kernel that forms a linear scale-space and the obvious choice when no prior knowledge is available about an image and its structure. There are however some important disadvantages to using operators from the linear scale-space family. Firstly, filtering an image with the Gaussian smooths noise and other unwanted features as well as important features (like edges), which makes identification harder [36]. Secondly, edges are dislocated when smoothing an image at coarser scales [36], which makes it harder to relate edges detected at coarse scales to edges at finer scales. By relaxing the axioms defined for the linear scale-space it is possible to design the smoothing process to better preserve (or even enhance) important features like edges.


In the case of *nonlinear isotropic scale-spaces*, as described in this section, two of the required axioms for the linear scale-space are excluded, namely homogeneity and linearity.

Excluding the axiom of homogeneity (making the process *inhomogeneous*) introduces the possibility to smooth with different scales at different positions in the image. For example coarser scales of smoothing may be used in image areas of similar intensity values, while finer smoothing scales can be used at edges and at features where the gradient is strong (i.e. where the intensity values locally differ rapidly). Since the smoothing process changes the image it may be wise to reanalyse the structure of the image at different scales to create a more accurate image evolution towards coarser scales. This introduction of feedback to the system makes the process *nonlinear*, and requires an iterative implementation to allow for reanalysis of the image structure during the diffusion process.

The idea of a nonlinear isotropic scale-space was first introduced [36] in Perona & Malik's 1987 article "Scale space and edge detection using anisotropic diffusion" [24]. They proposed an implementation using the diffusion equation:

∂s L = ∇ · ( c ∇L )

, where *c* is a scalar function dependent on spatial position and recalculated at each iteration.

Written in its spatial components, the partial differential equation is [26]:

∂s L = ∂x( c ∂x L ) + ∂y( c ∂y L )

As the title of the Perona & Malik article states, their scale-space method is referred to as anisotropic. Following the terminology of Weickert [36], however, the Perona & Malik method should be considered isotropic, since the diffusion process is controlled by a scalar, resulting in equal diffusion in each spatial direction (i.e. isotropy).

The main consideration when implementing the diffusion equation as described above is how to choose the scalar function *c*, which is often referred to as the conductivity function. Perona & Malik define three criteria [24, 25]: causality, immediate localisation and piecewise smoothing. The causality criterion has previously been explained in the section on linear scale-space (see 2.2.2), and Perona & Malik define it as "no 'spurious detail' should be generated passing from finer to coarser scales" [24, 25]. The second criterion, immediate localisation, suggests that region boundaries at a certain scale should be sharp and localised at positions meaningful for region boundaries at that particular scale, i.e. edges should not be dislocated at coarse scales. Piecewise smoothing means that the smoothing process should be stronger intra-regionally than inter-regionally. In other words, intensity values within a region should be blurred together before blurring with intensity values across region boundaries.

An intuitive representation of region boundaries is strong edges separating regions of similar intensity values. Considering this definition it is reasonable to use an edge detection operator to define the diffusivity of an image. The most commonly used measure, also the one used by Perona & Malik [24, 25], is the gradient magnitude (‖∇L‖). Hence a function depending on the gradient magnitude, c(‖∇L‖), would be a consistent selection for the conductivity function. Before defining the conductivity function we shall take a closer look at an implementation of the nonlinear isotropic (or scalar driven) diffusion to obtain a better understanding of its effect on the image evolution.


A discrete approximation of the scalar driven diffusion is described in [26], and is defined as:

Ls+ds(x, y) = Ls(x, y) + (ds/2) · [ (cs(x+1, y) + cs(x, y)) · (Ls(x+1, y) − Ls(x, y))
− (cs(x, y) + cs(x−1, y)) · (Ls(x, y) − Ls(x−1, y))
+ (cs(x, y+1) + cs(x, y)) · (Ls(x, y+1) − Ls(x, y))
− (cs(x, y) + cs(x, y−1)) · (Ls(x, y) − Ls(x, y−1)) ]

where *s* is the scale, *x* and *y* are the spatial position, *L s* is the image at scale *s* and *c* is the conductivity function. The scale-step, *ds*, should be set to less than 0.25 to ensure a stable diffusion process. Setting the conductivity function to the constant value of one reduces the approximation to:

Ls+ds(x, y) = Ls(x, y) + ds · ( Ls(x, y+1) + Ls(x, y−1) + Ls(x+1, y) + Ls(x−1, y) − 4 · Ls(x, y) )

Within the parenthesis on the right hand side of the equation is the commonly used discrete version of the Laplacean:

⎛ 0   1   0 ⎞
⎜ 1  −4   1 ⎟
⎝ 0   1   0 ⎠

Recollecting the definition of the linear scale-space diffusion equation (see 2.2.2), where the right hand side of the equation is the Laplacean, it is evident that replacing the conductivity function with a constant value of 1 results in a discrete approximation of the linear scale-space diffusion. If we instead set the conductivity function to a constant value of zero, the result achieved will be that the image at the coarser scale is the same as the one at finer scale (i.e. the image is not diffused at all).
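One explicit update step of the scalar driven diffusion can be sketched in NumPy (the edge-replicating boundary handling is an assumption, not specified in the thesis; the conductivity at each neighbour interface is averaged over the two pixels sharing it):

```python
import numpy as np

def scalar_diffusion_step(L, c, ds=0.2):
    """One explicit step of scalar driven diffusion.

    L  -- image at the current scale (2D array)
    c  -- conductivity at each pixel (2D array, same shape as L)
    ds -- scale-step; must be < 0.25 for stability
    """
    # Replicate borders so differences across the boundary vanish.
    Lp = np.pad(L, 1, mode='edge')
    cp = np.pad(c, 1, mode='edge')
    flux = np.zeros_like(L, dtype=float)
    # Flux through each of the four neighbour interfaces, with the
    # conductivity averaged over the two pixels at the interface.
    for shift in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        Ln = np.roll(Lp, shift, axis=(0, 1))[1:-1, 1:-1]
        cn = np.roll(cp, shift, axis=(0, 1))[1:-1, 1:-1]
        flux += 0.5 * (cn + c) * (Ln - L)
    return L + ds * flux

# With c == 1 everywhere the step reduces to linear diffusion
# (the 4-neighbour Laplacean stencil).
```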

Using these two conclusions the conductivity function should be selected to give a value equal to one (i.e. maximum diffusion) at low values of the gradient magnitude, and values of zero (i.e. no diffusion) for strong gradients. This will preserve edges while blurring regions of similar intensity values. Perona & Malik [24, 25] proposed two different conductivity functions with the properties described;

c PM1 = exp( −( ‖∇L‖ / λ )² )

and

c PM2 = 1 / ( 1 + ( ‖∇L‖ / λ )² ).


The function curves are shown in the following figures.


**Figure 2-15: Perona & Malik conductivity function, *c*PM1, of different λ (0.005, 0.01, 0.025 and 0.1). **


**Figure 2-16: Perona & Malik conductivity function, *c*PM2, of different λ (0.005, 0.01, 0.025 and 0.1). **

Other conductivity functions have been proposed; apart from the two Perona & Malik functions described above, one more has been considered in this thesis, taken from Weickert [27]:

c W = ⎧ 1, (‖∇L‖ = 0)
      ⎩ 1 − exp( −3.315 / ( ‖∇L‖ / λ )⁴ ), (‖∇L‖ > 0)

Function curves for *c*W with varying λ are shown in Figure 2-17.


**Figure 2-17: Weickert conductivity function, *c*W, of different λ (0.005, 0.01, 0.025 and 0.1). **
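The three conductivity functions can be written compactly; a sketch (NumPy assumed, function names illustrative; all three take the gradient magnitude and the contrast parameter λ):

```python
import numpy as np

def c_pm1(grad_mag, lam):
    """Perona & Malik exponential diffusivity."""
    return np.exp(-(grad_mag / lam) ** 2)

def c_pm2(grad_mag, lam):
    """Perona & Malik rational diffusivity."""
    return 1.0 / (1.0 + (grad_mag / lam) ** 2)

def c_w(grad_mag, lam):
    """Weickert diffusivity: flat (value 1) for vanishing gradients,
    sharp cut-off around lam."""
    g = np.asarray(grad_mag, dtype=float)
    out = np.ones_like(g)
    nz = g > 0
    out[nz] = 1.0 - np.exp(-3.315 / (g[nz] / lam) ** 4)
    return out

# All three equal 1 at zero gradient (maximum diffusion) and approach
# 0 for gradients far above lam (no diffusion, edges preserved).
```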


Figure 2-18 displays examples of images taken from scale-spaces using the different diffusivities described above. The images have been calculated at scale *s* = 0.5, with λ = 0.01, during 100 iterations and *ds* = 0.2. The gradient magnitude image of the first iteration is shown in Figure 2-19 (top row, column *s* = 0.5).

**Figure 2-18: (top left) original image, resolution 400x300, (top right) Weickert diffusivity *c*W, (bottom) Perona & Malik diffusivity *c*PM1 (left) and *c*PM2 (right). **

The images in Figure 2-18 show that the choice of conductivity function strongly affects the result of the diffusion process. This is also the case with the selection of the parameter λ, which is evident considering the images in Figure 2-19. In all three conductivity functions previously described, λ has the role of a contrast parameter, separating high contrast and low contrast regions. Image areas with gradient magnitude larger than lambda, i.e. ‖∇L‖ > λ, are considered to be edges, and areas with ‖∇L‖ < λ are regarded as belonging to the interior of a region [36, 27].

The remaining parameter that strongly affects the outcome of the diffusion process is related to the calculation of the image gradient. A common way to do this is to calculate the image derivatives through convolution with the Gaussian first derivatives, and use those results to calculate the gradient magnitude. This was utilised to calculate the images in Figure 2-14. This means that the size (scale parameter) of the Gaussian kernels used to calculate the gradient also affects the results of the diffusion. The scale at which the gradient is calculated is sometimes referred to as the observation scale.

Figure 2-19 illustrates the results from 12 different scale-spaces of the top left image in Figure 2-18. The scale-spaces have been calculated using the Weickert conductivity function with different λ-values (0.005, 0.01 and 0.025) and different scale parameters, *s* (0.05, 0.5, 1 and 4), for the gradient calculation. Included in the figure are images of the gradient magnitude at different scales, calculated at the first iteration (top row). The intensity values of the gradient magnitude images have been rescaled to the range 0 (black) to 0.5 (white) to obtain higher contrast. Since the gradient magnitude never exceeded 0.47, no information is lost by rescaling the intensity values. The images in Figure 2-19 are intended to show the effect the selection of observation scale and λ-value (for the Weickert conductivity function) has on the diffused images.



**Figure 2-19: Nonlinear isotropic scale-spaces (scalar driven diffusion) of top left image in Figure 2-18, for λ = 0.005, 0.010 and 0.025 (rows) and observation scales s = 0.05, 0.5, 1 and 4 (columns; first column: gradient magnitude ‖∇L‖). Images evolved through 100 iterations using a scale-step, ds, of 0.2. **


Figure 2-20 illustrates how a detail of a fingerprint is affected by nonlinear isotropic scale-space smoothing. The fingerprint feature is the same as was previously shown in Figure 2-12. The top left image is the original signal, and the rest of the images have been calculated at 5, 25 and 100 iterations respectively.


**Figure 2-20: Detail of fingerprint; (left-to-right, top-to-bottom) original and nonlinear isotropic scale-space smoothed at 5, 25 and 100 iterations. **

In the case of linear scale-space it was established that the implementation could be done either by convolving an image with the Gaussian kernel or by approximating the linear isotropic diffusion process; both methods achieve the same result. Implementing a nonlinear isotropic scale-space using Gaussian convolution would require the conductivity function to control the size of the Gaussian kernel at each spatial position in the image, and in that way steer the smoothing process. This is possible, but it should be noted that it will not render the same results as approximating the nonlinear diffusion process, since in the latter case the diffusivity (i.e. gradient magnitude) is recalculated for each iteration. This feedback is missing in a Gaussian convolution implementation. Throughout the thesis, nonlinear isotropic scale-spaces have been implemented using the discrete approximation of the diffusion process.

**2.2.4 Nonlinear Anisotropic Scale-Spaces **

An additional type of scale-space to consider is the nonlinear anisotropic scale-space. Apart from the axioms relaxed in the case of nonlinear isotropic scale-space, the axiom of isotropy is excluded as well. This makes the process both inhomogeneous and *anisotropic*: not only is it possible to decide the scale (i.e. amount of smoothing) at each image position (inhomogeneity), but it is also possible to define a preferred direction of the smoothing (anisotropy). In the case of nonlinear isotropic scale-spaces only the amount of smoothing is decided. In the previous section it was described how the scale could be steered towards zero near edges, meaning that no smoothing is performed there. This means that noise at edges is not suppressed. A preferred method would be to control the smoothing process to smooth more along edges than perpendicular to them. The framework of a nonlinear anisotropic scale-space makes this possible by defining the preferred smoothing direction to be along edges.

The implementation of a nonlinear anisotropic scale-space is preferably achieved by approximating the diffusion process, as was the case for nonlinear isotropic scale-space methods. To introduce anisotropy in the diffusion equation the diffusion process must be controlled by a tensor function instead of a scalar function. The equation of tensor dependent diffusion is as follows:

∂s L = ∇ · ( D ∇L )

, where *D* is a tensor function dependent on spatial position and recalculated at each iteration.

*D* is at each position a positive semidefinite symmetric tensor, which in the case of 2D images has the size 2×2 and is of the form [26]:

D = ⎛ a  b ⎞
    ⎝ b  c ⎠

Written in its spatial components, the partial differential equation is [26]:

∂s L = ∂x( a ∂x L + b ∂y L ) + ∂y( b ∂x L + c ∂y L )

As previously mentioned it is preferable for a diffusion process to smooth more along edges than across them. For this the diffusion tensor must be adapted to the local image structure. A local coordinate frame is defined with one axis, *v*, in line with the isophote (i.e. line of constant intensity) and the other axis, *w*, orthogonal to the first and thus aligned with the gradient. This is referred to as gauge coordinates [21]. In the case of fingerprint images the *v*-axis could be considered parallel to ridges and furrows, while the *w*-axis is perpendicular to them.

When the local coordinate frame has been defined it is only a matter of defining the conductivity coefficients in each direction, *c v* and *c w*. With these definitions the diffusion tensor can be defined as:

D = Rᵀ ⎛ c_w   0  ⎞ R
       ⎝  0   c_v ⎠

, where *R* is the rotation matrix with column vectors normalised to length 1. If α is the angle between the gauge axes and the image axes (i.e. *x* and *y*), then *R* can be calculated by:

R = ⎛ cos(α)  −sin(α) ⎞
    ⎝ sin(α)   cos(α) ⎠

When *c v* and *c w* are equal the diffusion process is isotropic. Interpreting the diffusion process from the diffusion tensor means that the image will be smoothed by the strength of the eigenvalues, in the direction of the corresponding eigenvectors [28]. For the diffusion tensor defined above, calculating the eigenvalues will result in *c v* and *c w*, and calculating the eigenvectors will give the column vectors in *R* [26, 19].
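A minimal sketch of assembling D = Rᵀ diag(c_w, c_v) R for a single position (NumPy assumed; the function name and argument order are illustrative, not from the thesis):

```python
import numpy as np

def diffusion_tensor(alpha, c_w, c_v):
    """Assemble D = R^T diag(c_w, c_v) R for gauge angle alpha.

    c_w -- diffusivity across isophotes (along the gradient)
    c_v -- diffusivity along isophotes
    """
    R = np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha),  np.cos(alpha)]])
    return R.T @ np.diag([c_w, c_v]) @ R

# With c_w == c_v the tensor is a multiple of the identity, i.e. the
# diffusion is isotropic regardless of the angle.
```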

A straightforward way to calculate the diffusion tensor, and an intuitive development considering the scalar driven diffusion case, is to calculate the rotation matrix from the gradient direction and to control the amount of diffusion by the gradient magnitude. Such a method, referred to as edge enhancing diffusion, is mentioned in [29, 28] and described in more detail in [26].

Another implementation of tensor driven diffusion is the so-called coherence-enhancing diffusion proposed by Weickert in [28]. This method is well suited for images with strong coherent structures (as is the case for fingerprint images), since in addition to smoothing noise at edges it also enhances coherent structures. The coherence is a measure of the strength of the local orientation. If the orientations of a structure within a local area are parallel the coherence becomes large, and for structures with orientations equally distributed over all directions (i.e. isotropic structures) the coherence tends to zero. Due to the flow-like structures (i.e. ridges and furrows) in fingerprint images the coherence, measured at ridge scale, is large. One of the advantages of coherence-enhancing diffusion is that it uses orientation, instead of direction, to calculate the diffusion tensor, as is the case when using the gradient directly. Directions with 180 degrees difference share the same orientation; for example, in a fingerprint the direction of a ridge has the opposite sign to the direction of the adjacent furrow, but they share the same orientation. The operator used by Weickert [28], as well as Almansa and Lindeberg [19], to analyse coherent flow-like structure and calculate the local orientation is the *structure tensor* (also referred to as the *second-moment matrix* or the *interest operator*):

J₀( ∇t L ) = ∇t L ∇t Lᵀ = ⎛ Lt,x ⎞ ( Lt,x  Lt,y ) = ⎛ Lt,x²      Lt,x Lt,y ⎞
                           ⎝ Lt,y ⎠                  ⎝ Lt,x Lt,y  Lt,y²     ⎠

where ∇t L is the image gradient at scale *t*, referred to as the observation scale. The structure tensor is positive semidefinite, and its eigenvectors are parallel, respectively orthogonal, to the gradient direction [28]. The coherent structures in a fingerprint image are often larger than scale *t*; that is, ridges and furrows tend to stay parallel over distances longer than their width. Since we are considering orientations, as opposed to directions, it is possible to scale-space filter the structure tensor to a scale which better represents the size of the coherent image structures. In other words, the structure tensor is smoothed by a Gaussian g ρ:

J ρ( ∇t L ) = g ρ ∗ ( ∇t L ∇t Lᵀ ) = ⎛ j11  j12 ⎞
                                      ⎝ j12  j22 ⎠

The scale of the Gaussian, ρ, is referred to as the integration scale and it should reflect the characteristic size of the typical image structures [28]. The Gaussian smoothing of the structure tensor integrates the orientation locally, meaning that local orientations become more parallel, which is the same as enhancing the coherence of the structures.
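Computing the smoothed structure tensor can be sketched as follows (SciPy assumed; the treatment of ρ = 0, which simply skips the integration smoothing, is an implementation detail, not from the thesis):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_tensor(image, t, rho):
    """Smoothed structure tensor J_rho of the gradient at scale t.

    t   -- observation scale (variance of the derivative Gaussians)
    rho -- integration scale (variance of the smoothing Gaussian)
    Returns the components j11, j12, j22 as 2D arrays.
    """
    img = image.astype(float)
    Lx = gaussian_filter(img, np.sqrt(t), order=(0, 1))  # d/dx (columns)
    Ly = gaussian_filter(img, np.sqrt(t), order=(1, 0))  # d/dy (rows)
    if rho > 0:
        s = np.sqrt(rho)
        j11 = gaussian_filter(Lx * Lx, s)
        j12 = gaussian_filter(Lx * Ly, s)
        j22 = gaussian_filter(Ly * Ly, s)
    else:
        j11, j12, j22 = Lx * Lx, Lx * Ly, Ly * Ly
    return j11, j12, j22
```

For a vertical step edge the gradient energy lies entirely in j11, with the off-diagonal term vanishing.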


Figure 2-21 shows a fingerprint image and the structure tensor orientation calculated at different observation and integration scales.

**Figure 2-21: (I) original image, (II) structure tensor orientation, t = 0.5, ρ = 1, (III) structure tensor orientation, t = 0.5, ρ = 16 and (IV) structure tensor orientation, t = 16, ρ = 0. **

It is important that the image gradient is calculated at a fairly detailed scale, to get an accurate estimation of the structure orientation, before the smoothing is carried out. Image IV in Figure 2-21 shows a case where this has not been considered. Here the observation scale, *t*, has been set to 16 and no smoothing has been performed afterwards (i.e. the integration scale ρ is 0). It is apparent that a derivative operator not accurately adjusted to the image structure results in poor orientation estimation. In the image referred to, the scale of the derivative operators has been chosen far too large, resulting in no detection of the finer ridge structure within the fingerprint; instead only the edges separating the fingerprint from the background give a significant response. Images II and III in Figure 2-21 show how the coherence of the structure tensor is enhanced for larger values of the integration scale ρ.

Weickert [28] defines a coherence measure, κ, as

κ = ( μ1 − μ2 )²

, where μ*1* and μ*2* are the eigenvalues of the structure tensor, calculated by

μ1,2 = ½ · ( j11 + j22 ± √( (j11 − j22)² + 4 · j12² ) ).

The coherence measure becomes large for coherent structures and tends to zero for isotropic structures.
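The eigenvalues and the coherence measure follow directly from the formulas above; a sketch (NumPy assumed, names illustrative):

```python
import numpy as np

def tensor_eigenvalues(j11, j12, j22):
    """Eigenvalues mu1 >= mu2 of the 2x2 structure tensor."""
    root = np.sqrt((j11 - j22) ** 2 + 4.0 * j12 ** 2)
    mu1 = 0.5 * (j11 + j22 + root)
    mu2 = 0.5 * (j11 + j22 - root)
    return mu1, mu2

def coherence(j11, j12, j22):
    """Weickert's coherence measure kappa = (mu1 - mu2)^2."""
    mu1, mu2 = tensor_eigenvalues(j11, j12, j22)
    return (mu1 - mu2) ** 2

# An isotropic tensor (equal diagonal, zero off-diagonal) has zero
# coherence; a strongly oriented one has large coherence.
```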

The structure tensor's aptness for analysing coherent flow-like structure and calculating local orientation makes it suitable for defining the orientation of the diffusion tensor, which is why the diffusion tensor is constructed to have the same eigenvectors as the structure tensor. The remaining question is how to define the conductivity coefficients, that is, the eigenvalues of the diffusion tensor. The thesis continues to follow the approach described by Weickert [28], who proposed the following calculation of the eigenvalues:

λ1 = α

λ2 = ⎧ α, (κ = 0)
     ⎩ α + (1 − α) · exp( −C/κ ), (κ > 0)

for *C* > 0, where κ is the coherence measure described above. *C* works as a threshold parameter and should be adapted to the expected value range of κ. For κ >> *C* we get λ*2* ≈ 1,


and κ << *C* will result in λ*2* ≈ α. The parameter α defines the lowest level of smoothing, as well as the largest possible difference between diffusion along isophotes and perpendicular to them (i.e. 1 − α). For κ close or equal to zero, which indicates isotropic structure, both eigenvalues take the value α, resulting in an isotropic diffusion process.

With the rotation matrix, *R*, defined by the eigenvectors of the structure tensor and the conductivity coefficients (or eigenvalues) defined as outlined above, the diffusion tensor *D* is calculated by [26]:

D = ⎡ a  b ⎤
    ⎣ b  c ⎦

a = ½ · ( λ1 + λ2 + (λ1 − λ2)(j11 − j22) / √( (j11 − j22)² + 4 j12² ) )

b = (λ1 − λ2) j12 / √( (j11 − j22)² + 4 j12² )

c = ½ · ( λ1 + λ2 − (λ1 − λ2)(j11 − j22) / √( (j11 − j22)² + 4 j12² ) )
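A sketch of this construction in NumPy (the guard against division by zero in the isotropic case, where the direction is indeterminate, is an assumption; the function name and default parameter values are illustrative):

```python
import numpy as np

def ced_tensor(j11, j12, j22, alpha=0.01, C=1.0):
    """Components a, b, c of the coherence-enhancing diffusion tensor,
    built from the (array-valued) structure tensor components."""
    root = np.sqrt((j11 - j22) ** 2 + 4.0 * j12 ** 2)
    kappa = root ** 2                      # coherence (mu1 - mu2)^2
    lam1 = np.full_like(np.asarray(j11, dtype=float), alpha)
    lam2 = np.where(kappa > 0,
                    alpha + (1.0 - alpha) * np.exp(-C / np.maximum(kappa, 1e-30)),
                    alpha)
    # The eigenvector basis enters through (j11 - j22)/root and
    # j12/root; guard root for the isotropic case (root == 0).
    safe = np.maximum(root, 1e-30)
    a = 0.5 * (lam1 + lam2 + (lam1 - lam2) * (j11 - j22) / safe)
    b = (lam1 - lam2) * j12 / safe
    c = 0.5 * (lam1 + lam2 - (lam1 - lam2) * (j11 - j22) / safe)
    return a, b, c
```

Note that a + c = λ1 + λ2 by construction, so the trace of D equals the sum of the prescribed eigenvalues.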

A discrete approximation of the tensor dependent diffusion is described in [26]. Using half-point averaging for the pure terms and central differences for the mixed terms, the update can be written as:

Ls+ds(x, y) = Ls(x, y) + ds · [ A + B + B′ + C ]

with

A = ½ [ (a(x+1, y) + a(x, y)) · (L(x+1, y) − L(x, y)) − (a(x, y) + a(x−1, y)) · (L(x, y) − L(x−1, y)) ]

B = ¼ [ b(x+1, y) · (L(x+1, y+1) − L(x+1, y−1)) − b(x−1, y) · (L(x−1, y+1) − L(x−1, y−1)) ]

B′ = ¼ [ b(x, y+1) · (L(x+1, y+1) − L(x−1, y+1)) − b(x, y−1) · (L(x+1, y−1) − L(x−1, y−1)) ]

C = ½ [ (c(x, y+1) + c(x, y)) · (L(x, y+1) − L(x, y)) − (c(x, y) + c(x, y−1)) · (L(x, y) − L(x, y−1)) ]

approximating ∂x(a ∂x L), ∂x(b ∂y L), ∂y(b ∂x L) and ∂y(c ∂y L) respectively,

where *s* is the scale, *ds* is the scale-step, *x* and *y* are the spatial position, *L s* is the image at scale *s*, and *a*, *b* and *c* are the components of the diffusion tensor. This discrete scheme has been implemented and used to calculate the images in Figure 2-22. The top row shows the structure tensor orientation calculated at the first iteration for each integration scale. The tensor diffused images illustrate how the integration scale, ρ, and the diffusivity constant, *C*, affect the diffusion process.
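A compact sketch of one tensor driven diffusion step using nested central differences (a simplification of the scheme above, not the thesis implementation; boundary handling by edge replication is an assumption):

```python
import numpy as np

def tensor_diffusion_step(L, a, b, c, ds=0.2):
    """One explicit step of tensor driven diffusion
    dL/ds = dx(a dx L + b dy L) + dy(b dx L + c dy L),
    with x along columns and y along rows.
    """
    def dx(F):  # central difference along columns, edge-replicated
        G = np.pad(F, ((0, 0), (1, 1)), mode='edge')
        return 0.5 * (G[:, 2:] - G[:, :-2])

    def dy(F):  # central difference along rows, edge-replicated
        G = np.pad(F, ((1, 1), (0, 0)), mode='edge')
        return 0.5 * (G[2:, :] - G[:-2, :])

    Lx, Ly = dx(L), dy(L)
    div = dx(a * Lx + b * Ly) + dy(b * Lx + c * Ly)
    return L + ds * div

# For a == c == 1 and b == 0 this reduces to (wide-stencil) linear
# diffusion; a uniform image is a fixed point of the update.
```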



**Figure 2-22: Nonlinear anisotropic scale-spaces (tensor driven diffusion) of leftmost image in Figure 2-21, for integration scales ρ = 1, 4, 16 and 64. Images have been calculated with the observation scale, t, set to 0.0625 and evolved through 50 iterations using a scale-step, ds, of 0.2. **

It has been explained that the integration scale, ρ, enhances the coherence of the structure tensor. Since the structure tensor is used to define the eigenvectors of the diffusion tensor (i.e. the preferred smoothing directions), the integration scale also enhances the coherence of the diffusion process and thus of the diffused image. This effect is apparent when comparing the images calculated using integration scales 1 and 64 in Figure 2-22 above. Too large an integration scale may destroy features with initially low coherence, such as ridge bifurcations, ridge endings and, in the worst case, singularity points. The diffusivity constant, *C*, affects the diffusion in relation to the coherence, κ. The smaller the value of *C*, the lower the coherence needed for strong anisotropic diffusion, which is illustrated in the images in Figure 2-22.

Figure 2-23 illustrates how a detail of a fingerprint is affected by nonlinear anisotropic scale-space smoothing. The fingerprint feature is the same as was previously shown in Figure 2-12 and Figure 2-20. The top left image is the original signal, and the rest of the images have been calculated at 5, 25 and 100 iterations respectively.


**Figure 2-23: Detail of fingerprint; (left-to-right, top-to-bottom) original and nonlinear anisotropic scale-space smoothed at 5, 25 and 100 iterations. **

This section has described the theory of nonlinear anisotropic scale-spaces implemented as tensor driven diffusion, introduced the different parameters of the diffusion process and explained how they affect the outcome. The scale-space methods explained in previous sections could also be implemented by convolution with Gaussian kernels, and such a method exists for nonlinear anisotropic scale-spaces as well: the anisotropic smoothing is achieved by creating the Gaussian kernel from the diffusion tensor, making the kernel itself anisotropic. This is often referred to as affine Gaussian scale-space. Using Gaussian convolution to implement anisotropic smoothing excludes the recalculation of the diffusion tensor during the smoothing, i.e. the feedback of the process is lost, hence the two implementation methods will not render the same result. For a more detailed description of affine Gaussian scale-space the reader is referred to [19] and Chapter 1.2.6 in [36].

