Earth Science Research; Vol. 1, No. 2; 2012

ISSN 1927-0542 E-ISSN 1927-0550

Published by Canadian Center of Science and Education

Unmixing and Target Recognition in Airborne Hyper-Spectral Images

Amir Averbuch¹, Michael Zheludev¹ & Valery Zheludev¹

¹ School of Computer Science, Tel Aviv University, Tel Aviv, Israel

Correspondence: Amir Averbuch, School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel. Tel: 972-54-569-4455. E-mail: [email protected]

Received: May 25, 2012 Accepted: June 16, 2012 Online Published: July 6, 2012 doi:10.5539/esr.v1n2p200 URL: http://dx.doi.org/10.5539/esr.v1n2p200

Abstract

We present two new linear algorithms that perform unmixing in hyper-spectral images and then recognize targets whose spectral signatures are given. The first algorithm is based on the ordered topology of spectral signatures. The second algorithm is based on a linear decomposition of each pixel's neighborhood. The sought-after target can occupy less than a pixel (sub-pixel) or more than a pixel (above pixel). These algorithms combine ideas from algebra and probability theory as well as statistical data mining. Experimental results demonstrate their robustness. This paper is a complementary extension to Averbuch & Zheludev (2012).

Keywords: hyper-spectral processing, target recognition, sub- and above pixel, unmixing, dimensionality reduction, diffusion maps

1. Introduction

1.1 Data Representation and Extraction of Spectral Information

We assume that a hyper-spectral signature of a sought-after material is given. In many applications, according to Winter (1999), a fundamental processing task is to automatically identify pixels whose spectra coincide with the given spectral shape (signature). This problem raises the following issues: how is the measured spectrum of a ground material related to a given “pure” spectrum, and how should the two be compared to determine whether they are the same?

Spatial and spectral sampling produce a 3D data structure referred to as a data cube. A data cube can be visualized as a stack of images where each plane in the stack represents a single spectral channel (wavelength). The observed spectral radiance data, or the derived surface reflectance data, can be viewed as a scattering of points in a $K$-dimensional Euclidean space $\mathbb{R}^K$ where $K$ is the number of spectral bands (wavelengths). Each spectral band is assigned to one axis. All the axes are mutually orthogonal. Therefore, the spectrum of each pixel can be viewed as a vector $x = (x_1, \ldots, x_K)$ where its Cartesian coordinates $x_i$ are either radiance or reflectance values at each spectral band. Since $x_i \ge 0$, $i = 1, \ldots, K$, the spectral vectors lie inside a positive cone in $\mathbb{R}^K$. Changes in the illumination level can change the length of the spectral vector but not its orientation, which is related to the shape of the spectrum. When targets are too small to be resolved spatially or when they are partially obscured or of an unknown shape, as shown in Winter (1999), then the detection has to rely on the available spectral information.
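The remark that illumination changes the length of a spectral vector but not its orientation can be illustrated numerically. The sketch below uses the angle between two vectors as an orientation measure; this measure, the function name `spectral_angle` and the reflectance values are illustrative assumptions and are not taken from the paper.

```python
import math

def spectral_angle(x, y):
    """Angle in radians between two spectral vectors (illustrative helper)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] so rounding errors cannot leave the domain of acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_angle)

spectrum = [0.12, 0.30, 0.45, 0.40]      # hypothetical reflectance values
brighter = [2.5 * v for v in spectrum]   # same shape under stronger illumination
other = [0.40, 0.35, 0.10, 0.05]         # a different spectral shape

print(spectral_angle(spectrum, brighter))  # close to zero: same orientation
print(spectral_angle(spectrum, other))     # clearly positive: different shape
```

Scaling a spectrum by a positive factor leaves the angle essentially unchanged, while a different spectral shape yields a clearly larger angle.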

Unfortunately, a perfect fixed spectrum for any given material does not exist.

In agreement with Winter (1999), spectra of the same material are probably never identical even in laboratory experiments. This is due to variations in the material surface. The variability amount is even more profound in remote sensing applications because of the variations in atmospheric conditions, sensor noise, material composition, location, surrounding materials and other contributing factors. As a result, the measured spectra, which correspond to pixels with the same surface type, exhibit an inherent spectral variability that prevents the characterization of homogeneous surface materials by unique spectral signatures.

Another significant complication arises from the interplay between the spatial resolution of the sensor and the spatial variability present in the observed ground scene. According to Winter (1999), a sensor integrates the radiance from all the materials within the ground surface that are “seen” by the sensor as a single image pixel.

Therefore, depending on the spatial resolution of the sensor and the distribution of surface materials within each ground resolution cell, the result is a hyper-spectral data cube comprised of “pure” and “mixed” pixels, where a pure pixel contains a single surface material and a mixed pixel contains multiple (superposition of) materials.


www.ccsenet.org/esr Earth Science Research Vol. 1, No. 2; 2012

A linear mixing model is the most widely used spectral mixing model. It assumes that the observed reflectance spectrum of a given pixel is generated by a linear combination of a small number of unique constituents known as endmembers. This model is defined with constraints in the following way (Harsanyi & Chang, 1994):

$$x = \sum_{k=1}^{M} a_k s_k + w, \qquad \sum_{k=1}^{M} a_k = 1 \ \text{(additivity constraint)}, \qquad a_k \ge 0 \ \text{(positivity constraint)} \qquad (1)$$

where $s_1, \ldots, s_M$ are the $M$ endmember spectra that are assumed to be linearly independent, $a_1, \ldots, a_M$ are the corresponding abundances (cover material fractions) and $w$ is an additive-noise vector.

1.2 Outline of the Algorithms to Identify Target with Known Spectra

The new methods in this paper identify targets with known spectra. Target identification in hyper-spectral images has the following consecutive steps:

1) Finding suspicious points: these are points whose spectra differ, in any norm, from the spectra of the points in their neighborhood. This is also called anomaly detection;

2) Extracting from the suspicious points the spectra of the independent components (unmixing), where one of them is the target whose spectrum fits the given (sought-after) spectrum.

We assume that spectra of different materials are statistically dependent and that the difference between them arises from the behavior of the first and second derivatives in some sections of the spectrum. If they are statistically independent, then the related methods such as Maximum Likelihood (ML) and the geometrical ones (MVT, PPI and N-FINDR) work well.

The experiments in this paper were performed on three real hyper-spectral datasets, measured as reflectance and titled “desert”, “city” and “field”, which were acquired by the SPECIM camera (SPECIM camera, 2006) mounted on an airplane. Their properties, with a display of one waveband per dataset, are given in Figures 1-3.

Figure 1. The dataset “desert” is a hyper-spectral image of a desert place taken from an airplane flying 10,000 feet above sea level. The resolution is 1.3 meter/pixel, 286 × 2640 pixels per waveband with 168 wavebands


Figure 2. The dataset “city” is a hyper-spectral image of a city taken from an airplane flying 10,000 feet above sea level. The resolution is 1.5 meter/pixel, 294 × 501 pixels per waveband with 28 wavebands

Figure 3. The dataset “field” is a hyper-spectral image of a field taken from an airplane flying 9,500 feet above sea level. The resolution is 1.2 meter/pixel, 286 × 300 pixels per waveband with 50 wavebands

The paper has the following structure: Section 2 describes the related work. The performance of the two algorithms, which are described in this paper, is compared with that of the orthogonal subspace projection (OSP) algorithm. Section 3 presents an algorithm that identifies the target's spectrum where the target occupies at least a whole pixel. This method assumes that the target's spectrum is distorted by atmospheric conditions and is noisy. Section 4 presents an unmixing method that is based on neighborhood analysis of each pixel. This method can also be used for detecting a subpixel target. This algorithm contains two parts. In the first part, suspicious points are discovered. The algorithm is based on the properties of neighborhood morphology and on the properties of the Diffusion Maps (DM) algorithm (Coifman & Lafon, 2006). The second part unmixes the suspicious point. It is based on the application of DM to the linear span of the neighboring background spectra. The appendix describes the Diffusion Maps algorithm for dimensionality reduction.

2. Related Work

An up-to-date overview of hyper-spectral unmixing is given in Bioucas-Dias & Plaza (2010; 2011). The challenges related to target detection, which is the main focus of this paper, are described in the survey papers Manolakis, Marden, & Shaw (2001) and Manolakis & Shaw (2002). They provide a tutorial review of state-of-the-art target detection algorithms for hyper-spectral imaging (HSI) applications. The main obstacle to effective detection algorithms is the inherent variability of the target and background spectra. Adaptive algorithms are effective in solving some of these problems. The solution provided in this paper meets some of the challenges mentioned in Manolakis & Shaw (2002).


In the rest of this section, we divide the many existing algorithms into several groups. We wish to show some trends but do not attempt to cover the avalanche of related work on unmixing and target detection.

Linear approach: Under the linear mixing model, where the number of endmembers and their spectral signatures are known, hyper-spectral unmixing is a linear problem, which can be addressed, for example, by the ML setup (Settle, 1996) and by the constrained least squares approach (Chang, 2003). These methods do not supply sufficiently accurate estimates and do not reflect the physical behavior. The distinction between the spectra of different materials is generally determined by the behavior of the first and second derivatives and not by a trend.
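As an illustration of constrained least squares under the linear mixing model, the following minimal sketch treats the simplest case of two known endmembers, where the additivity and positivity constraints reduce the problem to a single bounded unknown. It is not the method of Chang (2003); the function name `unmix_two_endmembers` and the spectra are hypothetical.

```python
def unmix_two_endmembers(x, s1, s2):
    """Least-squares abundances (a1, a2) for x ~ a1*s1 + a2*s2,
    subject to a1 + a2 = 1 and a1, a2 >= 0 (two-endmember case only)."""
    # Substituting a2 = 1 - a1 gives x - s2 ~ a1 * (s1 - s2).
    diff = [u - v for u, v in zip(s1, s2)]
    resid = [u - v for u, v in zip(x, s2)]
    a1 = sum(r * d for r, d in zip(resid, diff)) / sum(d * d for d in diff)
    # The objective is a one-dimensional convex quadratic, so clipping the
    # unconstrained minimizer to [0, 1] gives the constrained minimizer.
    a1 = max(0.0, min(1.0, a1))
    return a1, 1.0 - a1

s1 = [0.1, 0.4, 0.8]                               # hypothetical endmember spectra
s2 = [0.6, 0.5, 0.2]
mixed = [0.3 * u + 0.7 * v for u, v in zip(s1, s2)]  # noise-free mixed pixel

print(unmix_two_endmembers(mixed, s1, s2))
```

With more than two endmembers the same constraints lead to a quadratic program, which this sketch does not attempt to solve.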

Independent component analysis (ICA) is an unsupervised source separation process that finds a linear decomposition of the observed data yielding statistically independent components (Common, 1994; Hyvarinen, Karhunen, & Oja, 2001). It has been applied successfully to blind source separation, to feature extraction and to unsupervised recognition such as in Bayliss, Gualtieri, & Cromp (1997), where the endmember signatures are treated as sources and the mixing matrix is composed by the abundance fractions. Numerous works including Nascimento & Bioucas-Dias (2005) show that ICA cannot be used to unmix hyper-spectral data.

Geometric approach: Assume a linear mixing scenario where each observed spectral vector is given by

$$r = M s + n,$$

where $r$ is an $L$-vector ($L$ is the number of bands), $M = [m_1, \ldots, m_p]$ is the mixing matrix ($m_i$ denotes the $i$th endmember signature and $p$ is the number of endmembers present in the sensed area), $s = \gamma a$ ($\gamma$ is a scale factor that models illumination variability due to surface topography), $a = [a_1, \ldots, a_p]^T$ is the abundance vector that contains the fractions of each endmember ($T$ denotes a transposed vector) and $n$ is the system's additive noise. Owing to physical constraints, abundance fractions are nonnegative and satisfy the so-called additivity constraint $\sum_{k=1}^{p} a_k = 1$. Each pixel can be viewed as a vector in an $L$-dimensional Euclidean space, where each channel is assigned to one axis. Since the set $\{a \in \mathbb{R}^p : \sum_{k=1}^{p} a_k = 1,\ a_k \ge 0\ \forall k\}$ is a simplex, the set $S_x = \{x \in \mathbb{R}^L : x = M a,\ \sum_{k=1}^{p} a_k = 1,\ a_k \ge 0\ \forall k\}$ is also a simplex whose vertices correspond to endmembers.

Several approaches (Ifarraguerri & Chang, 1999; Boardman, 1993; Craig, 1994) exploit this geometric feature of hyper-spectral mixtures. The minimum volume transform (MVT) algorithm (Craig, 1994) determines the simplex of a minimal volume that contains the data. The method presented in Bateson, Asner, & Wessman (2000) is also of MVT type, but by introducing the notion of bundles, it takes into account the endmember variability that is usually present in hyper-spectral mixtures.

The MVT type approaches are complex from a computational point of view. Usually, these algorithms first find the convex hull defined by the observed data and then fit a minimum volume simplex to it. Aiming at a lower computational complexity, some algorithms such as the pixel purity index (PPI) (Boardman, 1993) and the N-FINDR (Winter, 1999) still find the maximum volume simplex that contains the data cloud. They assume the presence of at least one pure pixel of each endmember in the data. This is a strong assumption that may not be true in general. In any case, these algorithms find the set of the purest pixels in the data.

Extending subspace approach: A fast unmixing algorithm, termed vertex component analysis (VCA), is described in Nascimento & Bioucas-Dias (2005). The algorithm is unsupervised and utilizes two facts: 1) The endmembers are the vertices of a simplex; 2) The affine transformation of a simplex is also a simplex. It works with projected and unprojected data. Like the PPI and N-FINDR algorithms, VCA also assumes the presence of pure pixels in the data. The algorithm iteratively projects data onto a direction orthogonal to the subspace spanned by the endmembers already detected. The new endmember's signature corresponds to the extreme projection. The algorithm iterates until all the endmembers are exhausted. VCA performs much better than PPI and better than or comparable to N-FINDR. Yet, its computational complexity is between one and two orders of magnitude lower than that of N-FINDR.

If the image is of size approximately 300 × 2000 pixels, then this method, which builds a linear span in each step, is too computationally expensive. In addition, it relies on “pure” spectra, which are not always available.
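The iterative projection idea can be illustrated by a simplified sketch. It is not the VCA algorithm of Nascimento & Bioucas-Dias (2005): instead of projecting onto a chosen direction, it uses the closely related successive-projection scheme, in which the pixel with the largest residual norm is selected and its direction is then projected out of all pixels. The function name `extract_pure_pixels` and the toy data are hypothetical, and noise-free data that contain pure pixels are assumed.

```python
def extract_pure_pixels(pixels, count):
    """Return indices of `count` candidate pure pixels by successive projection."""
    residuals = [list(p) for p in pixels]
    chosen = []
    for _ in range(count):
        norms = [sum(v * v for v in r) for r in residuals]
        idx = max(range(len(residuals)), key=norms.__getitem__)
        chosen.append(idx)
        u, u_norm2 = residuals[idx], norms[idx]
        # Remove the component along u from every residual, so that the next
        # search happens in the orthocomplement of the directions found so far.
        residuals = [
            [rv - (sum(a * b for a, b in zip(r, u)) / u_norm2) * uv
             for rv, uv in zip(r, u)]
            for r in residuals
        ]
    return chosen

e1, e2, e3 = [2.0, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 1.0]  # toy endmembers

def mix(a, b, c):
    return [a * x + b * y + c * z for x, y, z in zip(e1, e2, e3)]

pixels = [mix(0.5, 0.5, 0.0), e3, mix(0.2, 0.3, 0.5), e1, mix(0.1, 0.1, 0.8), e2]
print(sorted(extract_pure_pixels(pixels, 3)))  # indices of the three pure pixels
```

Each iteration recomputes the projection of every pixel, which hints at why such schemes become costly on large images.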

Statistical methods: In the statistical framework, spectral unmixing is formulated as a statistical inference problem by adopting a Bayesian methodology where the inference engine is the posterior density of the random objects to be estimated, as described for example in Dobigeon, Moussaoui, Coulon, Tourneret, & Hero (2009), Moussaoui, Carteretb, Briea, & Mohammad-Djafaric (2006) and Arngren, Schmidt, & Larsen (2009).


2.1 Orthogonal Subspace Projection (OSP)

The method of orthogonal subspace projection (OSP) for unmixing and target detection is described in Ahmad & Ul Haq (2011), Ahmad, Ul Haq, & Mushtaq (2011) and Ren & Chang (2003). We compare our method with the method in Ahmad & Ul Haq (2011), which is currently considered to be very effective. According to the notation in Ahmad & Ul Haq (2011), we are given the dataset $X = SA + W$, where $S$ is the set of pure signatures, $A$ is the matrix of the corresponding abundance fractions and $W$ is a white noise matrix. According to the OSP method in Ahmad & Ul Haq (2011), the mixing matrix $A$ is computed from $U$, $\Sigma$ and the pseudo-inverse of $U$, where $U$ and $\Sigma$ are a singular matrix and an eigenvalues-matrix, respectively, of the projection matrix onto the subspace $L$ of the pure signatures. The creation of the subspace $L$ is described in Ren & Chang (2003, p. 1236).

We present the results from target detection by the application of the OSP method with a given target signature $s$ and compare them to our method. The targets in the scene are detected via the application of the OSP method on multipixels, which contain the dominant coefficient from the matrix $A$, corresponding to the target signature $s$.

2.2 Linear Classification for Threshold Optimization

According to Cristianini & Shawe-Taylor (2000), a binary classification is frequently performed by using a real-valued function $f: X \subseteq \mathbb{R}^n \to \mathbb{R}$ in the following way: the input $x = (x_1, \ldots, x_n)^T$ is assigned to a positive class if $f(x) \ge 0$ and, otherwise, to a negative class. We consider the case where $f(x)$ is a linear function of $x$ with the parameters $w$ and $b$ such that

$$f(x) = \langle w \cdot x \rangle + b = \sum_{i=1}^{n} w_i x_i + b \qquad (2)$$

where $(w, b) \in \mathbb{R}^n \times \mathbb{R}$ are the parameters that control the function. The decision rule is given by $\mathrm{sgn}(f(x))$. $w$ is assumed to be the weight vector and $b$ is the threshold.

Definition 2.1. (Cristianini & Shawe-Taylor, 2000)

A training set is a collection of training examples (data)

$$S = ((x_1, y_1), \ldots, (x_l, y_l)) \subseteq (X \times Y)^l \qquad (3)$$

where $l$ is the number of examples, $X \subseteq \mathbb{R}^n$, $Y = \{-1, 1\}$ is the output domain.

Rosenblatt's Perceptron algorithm (Cristianini & Shawe-Taylor, 2000, p. 12; Burges, 1998, p. 8) creates a hyperplane $\langle w \cdot x \rangle + b = 0$ for the training set $S$. It creates the best linear separation between positive and negative examples via minimization of a measurement function of the “margin” distribution $\gamma_i = y_i (\langle w \cdot x_i \rangle + b)$. $\gamma_i > 0$ implies the correct classification of $(x_i, y_i)$.
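A minimal sketch of a perceptron-style training loop is given below for illustration. It uses a common simplified update (weights and threshold are corrected whenever the margin of an example is not positive) rather than the exact variant in Cristianini & Shawe-Taylor (2000); the function name, the learning rate and the toy data are assumptions.

```python
def train_perceptron(samples, labels, learning_rate=1.0, max_epochs=100):
    """Return (w, b) of a separating hyperplane if one is found within max_epochs."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin <= 0:  # misclassified (or on the hyperplane): correct w and b
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # every example now has a positive margin
            break
    return w, b

samples = [(2.0, 2.0), (3.0, 1.0), (2.5, 3.0), (-1.0, -1.0), (-2.0, 0.0), (-1.5, -2.0)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_perceptron(samples, labels)
margins = [y * (sum(wi * xi for wi, xi in zip(w, x)) + b) for x, y in zip(samples, labels)]
print(w, b, min(margins))
```

On linearly inseparable data the loop simply stops after `max_epochs` without reaching zero mistakes.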

The perceptron algorithm is guaranteed to converge only if the training data are linearly separable. A procedure that does not suffer from this limitation is the Linear Discriminant Analysis (LDA) via Fisher's discriminant functional (Cristianini & Shawe-Taylor, 2000). The aim is to find the hyperplane $(w, b)$ on which the projection of the data is maximally separated. The cost function (the Fisher's function) to be optimized is:

$$F = \frac{(m_1 - m_{-1})^2}{\sigma_1^2 + \sigma_{-1}^2} \qquad (4)$$

where $m_i$ and $\sigma_i$ are the mean and the standard deviation, respectively, of the function output values $P_i = \{\langle w \cdot x_j \rangle + b : y_j = i\}$ for the two classes $i = 1, -1$.

Definition 2.2. (Cristianini & Shawe-Taylor, 2000)

The dataset $S$ from Equation 3 is linearly separable if the hyperplane $\langle w \cdot x \rangle + b = 0$ correctly classifies the training data. It means that $\gamma_i = y_i (\langle w \cdot x_i \rangle + b) > 0$, $i = 1, \ldots, l$. In this case, $b$ is the separation threshold. If $\gamma_i \le 0$ for some $i$, then the dataset is linearly inseparable.
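The Fisher function of Equation 4 and the margin condition of Definition 2.2 can be evaluated for a given hyperplane with a few lines of code. The sketch below is illustrative only: the function names and the one-dimensional toy data are hypothetical, and the standard deviation is taken in its population form (division by the class size), which is an assumption.

```python
def decision_values(w, b, samples):
    """Outputs <w, x> + b for every sample."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in samples]

def fisher_score(w, b, samples, labels):
    """Fisher function: squared distance of class means over summed variances."""
    groups = {1: [], -1: []}
    for value, label in zip(decision_values(w, b, samples), labels):
        groups[label].append(value)

    def mean(values):
        return sum(values) / len(values)

    def variance(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    # Note: the score is undefined if both classes have zero variance.
    return (mean(groups[1]) - mean(groups[-1])) ** 2 / (
        variance(groups[1]) + variance(groups[-1]))

def separates(w, b, samples, labels):
    """True if every example has a positive margin y * (<w, x> + b)."""
    return all(y * f > 0 for f, y in zip(decision_values(w, b, samples), labels))

samples = [[1.0], [3.0], [7.0], [9.0]]
labels = [-1, -1, 1, 1]
print(fisher_score([1.0], -5.0, samples, labels))  # 18.0
print(separates([1.0], -5.0, samples, labels))     # True
print(separates([1.0], -2.0, samples, labels))     # False
```

The score does not depend on $b$, since shifting all outputs by a constant changes neither the difference of the means nor the variances; the threshold only affects the margin test.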

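Definition 2.3.

A vector $x \in \mathbb{R}^n$ is isolated from the set $P = \{p_1, \ldots, p_k\} \subset \mathbb{R}^n$ if the training set $S$ of $x$ and $p_1, \ldots, p_k$, in which $x$ and the members of $P$ carry opposite labels, is linearly separable according to definition 2.2. In this case, the absolute value $|b|$ is the separation threshold.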


Suppose that we have a set $S = \{x_1, \ldots, x_n\}$ of $n$ samples. First, we want to partition the data into exactly two disjoint subsets $S_1$ and $S_{-1}$. Each subset represents a cluster. The solution is based on the K-means algorithm (Duda, Hart, & Stork, 2001). K-means maximizes the function $J(e)$ where $e$ is a partition. The value of $J(e)$ depends on how the samples are grouped into clusters and on the number of clusters (see Duda, Hart, & Stork, 2001)

$$J(e) = \mathrm{tr}\left(S_W^{-1} S_B\right) \qquad (5)$$

where $S_W = \sum_{i=1}^{l} \sum_{x \in S_i} (x - m_i)(x - m_i)^T$ is a “within-cluster scatter matrix” (Duda, Hart, & Stork, 2001), $l$ is the number of classes, $S_i$ are the classes and $m_i$ is the center of each class. $S_B = \sum_{i=1}^{l} n_i (m_i - m)(m_i - m)^T$ is called the “between-cluster scatter matrix” (Duda, Hart, & Stork, 2001), where $n_i$ is the cardinality of a class and $m$ is the center of the whole dataset.
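For scalar data the within-cluster and between-cluster scatter matrices reduce to two numbers, which makes the partition into two clusters easy to follow. The sketch below is a toy illustration with hypothetical function names and data; it assumes that the input contains at least two distinct values and does not evaluate the criterion of Equation 5 itself.

```python
def two_means_1d(values, iterations=50):
    """Partition scalar values into two clusters with a basic 2-means loop."""
    c_low, c_high = min(values), max(values)  # assumes at least two distinct values
    low, high = [], []
    for _ in range(iterations):
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (c_low, c_high):  # centers stopped moving
            break
        c_low, c_high = new_low, new_high
    return low, high

def scatter(clusters):
    """Scalar within-cluster and between-cluster scatter of a partition."""
    everything = [v for cluster in clusters for v in cluster]
    m = sum(everything) / len(everything)
    s_w = s_b = 0.0
    for cluster in clusters:
        m_i = sum(cluster) / len(cluster)
        s_w += sum((v - m_i) ** 2 for v in cluster)
        s_b += len(cluster) * (m_i - m) ** 2
    return s_w, s_b

values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
low, high = two_means_1d(values)
s_w, s_b = scatter([low, high])
print(sorted(low), sorted(high), s_w, s_b)
```

The two quantities add up to the total scatter of the data around its overall center, so a partition that lowers the within-cluster scatter raises the between-cluster scatter by the same amount.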

Definition 2.4.

Let $(w, b)$ be the best separation for the set $S = \{x_1, \ldots, x_n\}$ via K-means and Fisher's discriminant analysis (Cristianini & Shawe-Taylor, 2000; Burges, 1998). $(w, b)$ is called the Fisher's separation and $b$ the Fisher's threshold for this data.

When is a dataset separable? One criterion compares the class means $m_1$ and $m_{-1}$ with the diameters of the classes, where the notation in Equation 4 is used and diam is defined as $\mathrm{diam}(P) = \max \{\|x - y\|_{L_2} : x, y \in P\}$.

Another criterion is:

Definition 2.5. (Duda, Hart, & Stork, 2001)

A dataset is separable if, from Equation 5, $J(e_1) < J(e_2)$, where $e_1$ is the partition in which the number of classes is 1 and $e_2$ is the best partition into two classes. If $J(e_1) \ge J(e_2)$, then the dataset is inseparable and Fisher's separation is incorrect.

3. Method I: Weak Dependency Recognition (WDR) of Targets That Occupy One or More Pixels

We assume that a target occupies one or more pixels. The process, which determines whether a given target's spectrum and the spectrum of the current pixel are dependent, is described next.

Definition 3.1.

Two discrete functions $Y_1$ and $Y_2$ are weakly dependent if there exists a permutation of the coordinates that provides monotonic order for the values of $Y_1$ and $Y_2$.

Let $T$ be a given target's spectrum and let $P$ be the pixel's spectrum. We assume that the spectra of $T$ and $P$ are discrete vectors. In general, we assume that $T$ and $P$ are normalized and centralized. The following hypotheses are assumed:

$H_0$: $T$ and $P$ are weakly dependent.

$H_1$: $T$ and $P$ are not weakly dependent.

3.1 Hypotheses Check

We find an orthogonal transformation $\pi$ that permutes the coordinates of $T$ into a decreasing order. This permutation $\pi$ is applied to $P$ and $T$. We get that $P_1 = \pi(P)$, $T_1 = \pi(T)$, where $T_1$ is monotonic. If $H_0$ holds, which means that $T$ and $P$ are weakly dependent, then the values of $P_1$ are either monotonic decreasing or increasing and the first and second derivatives of $P_1$ are close to zero - see Figure 4 (left). Otherwise, $H_1$ holds and $P_1$ has a subset of coordinates whose first and second derivatives have an oscillatory behavior - see Figure 4 (right).


Figure 4. The $x$- and the $y$-axes are the wavebands and their reflectance values, respectively. The spectra are represented after the application of the permutation to the coordinates, which permutes $T$ into a monotonic decreasing order. Left: Weak dependency between $T$ and $P$. Right: No weak dependency between $T$ and $P$

If the permutation of the coordinates of $P$ makes their values either decrease or increase monotonically, then the first and second derivatives of $P$ have a minimal norm. This is another criterion for deciding whether there is weak dependency.
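The check described above can be sketched in a few lines: sort the coordinates of the target into decreasing order, apply the same permutation to the pixel's spectrum and measure the largest absolute second difference of the result. The function name `weak_dependency_score`, the use of second differences as a discrete second derivative and the toy vectors are illustrative assumptions; normalization and centering of the spectra are omitted.

```python
def weak_dependency_score(target, pixel):
    """Max absolute second difference of `pixel` after the permutation
    that sorts `target` into decreasing order."""
    order = sorted(range(len(target)), key=lambda i: target[i], reverse=True)
    permuted = [pixel[i] for i in order]
    second_diff = [permuted[i - 1] - 2 * permuted[i] + permuted[i + 1]
                   for i in range(1, len(permuted) - 1)]
    return max(abs(v) for v in second_diff)

target = [0.9, 0.1, 0.5, 0.3, 0.7]           # hypothetical target spectrum
dependent = [2 * t + 1 for t in target]      # same ordering as the target
unrelated = [0.1, 0.2, 0.9, 0.8, 0.0]        # ordering unrelated to the target

print(weak_dependency_score(target, dependent))  # close to zero
print(weak_dependency_score(target, unrelated))  # close to 1.0
```

In this toy case the dependent spectrum becomes a straight line after the permutation, so its second differences vanish, whereas the unrelated spectrum oscillates and yields a large value.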

Let $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$. The norm is defined as $\|x\|_\infty = \max_i |x_i|$.

Definition 3.2.

Let $\pi$ be an orthogonal transformation that permutes the coordinates of $T$ into a decreasing order. Denote the second derivative of a vector $X$ by $X''$. Define the mapping $\Phi: \mathbb{R}^n \to \mathbb{R}$ such that $\Phi(X) = \|(\pi(X))''\|_\infty$.

Let $\{X_1, \ldots, X_\omega\}$ be a dataset of spectra from all the pixels in the scene. Denote $Y_i = \Phi(X_i)$. The dataset $\{Y_1, \ldots, Y_\omega\}$ can be classified as:

1) The set $\{Y_1, \ldots, Y_\omega\}$ is separable according to definition 2.5.

2) The set $\{Y_1, \ldots, Y_\omega\}$ is inseparable according to definition 2.5.

In the first case, $(w, b)$ is the best separation for the set $\{Y_1, \ldots, Y_\omega\}$ according to definition 2.4 and $b$ is the Fisher's threshold for this separation. Then, the set $S = \{X_i : Y_i < b\}$ is the set of targets. In the other case, there are no targets in the scene.

The flow of the WDR algorithm is given in Figure 5.


Figure 5. The flow of the WDR algorithm

3.2 Experimental Results

Figures 6-8 display the results after the application of the algorithm in section 3.1 to the “desert” image (Figure 1).

The yellow lines mark the neighborhood of the detected targets.

Figure 6. Left: One wavelength part from the original “desert” image (Figure 1). Right: The white points mark the detected targets. The intensity of each pixel in the right side corresponds to the value of the mapping from Definition 3.2 at $X$, where $X$ is the spectrum in the current pixel


Figure 7. Left: One wavelength part from the original “desert” image (Figure 1). Right: The white points mark the detected targets. The intensity of each pixel in the right side corresponds to the value of the mapping from Definition 3.2 at $X$, where $X$ is the spectrum in the current pixel

Figure 8. Left: One wavelength part from the original “desert” image (Figure 1). Right: The white points mark the detected targets. The intensity of each pixel in the right side corresponds to the value of the mapping from Definition 3.2 at $X$, where $X$ is the spectrum in the current pixel

The desert image contains documented targets. The suspicious points detected in Figures 6-8 match the known targets exactly.

The point $P_1$ in Figure 8 is the pattern of the known target's material. Its spectrum is displayed in Figure 4 as a plot of the “target”. Other spectra plots, which were detected by the WDR algorithm in the scenes of Figures 6-8, are classified as “spectra of suspicious points”.

[Figure: spectra of the target and of the suspicious points, plotted against the $x$- and $y$-axes]

In this section, we compare the WDR algorithm with the OSP algorithm (Ahmad & Ul Haq, 2011).

[Figures: Left: One wavelength (multipixel) from the original “desert” image (Figure 1). Right: The points mark the targets detected by the OSP algorithm (Ahmad & Ul Haq, 2011). The red circle marks [...]]

[Figures: Left: One wavelength (multipixel) from the original “desert” image (Figure 1). Right: The targets detected by the WDR algorithm. The intensity of each pixel corresponds to a value computed from $X$, where $X$ is the spectrum in the current pixel]

Figure 12 shows the ROC curves obtained while comparing the two methods by varying the threshold. The red line corresponds to the [...] method. The green line corresponds to the [...] method.

4. Method II: Unmixing by Examining the Neighborhood of a Suspicious Point (UNSP)

In this section, we provide [...] projection [...]. They project the data into a linear subspace [...] a local model of the background [...] the morphological structure of the pixel's neighborhood [...]. This yields [...] (suspicious points) [...].

The $m$-neighborhood of a pixel $X$ is a square of $(2m+1) \times (2m+1)$ pixels with a center at the pixel $X$, where $m \ge 1$ is the radius of this neighborhood. [...] displayed in Figure 17.

A connected component means that there exists a [...] any two pixels [...] the next pixel is adjacent [...]. This connection [...] Figure 18.

Figure 18. The morphology of the neighborhood is represented by [...] components

Consider the spectra from [...] hyper-spectral image. Usually there is high [...] these spectra in real situations. For example, Figure 19 displays the spectra from three [...] of three different materials.

To reduce the correlation [...] pixel $X$ by $d$ and it is called the $d$-spectrum [...] derivative [...] less than half [...] $m$-neighborhood [...] (as a subpixel or as a whole pixel) [...]. $T$ is the given target's spectrum. [...] two steps: Detection of suspicious points [...] (unmixing) [...] of the target.

4.1 Filter: Detection of Suspicious Points via Neighborhood Morphology

[...]

$H_0$: $Y$ is a suspicious point.

$H_1$: $Y$ is not a suspicious point.

[...] $m \ge 1$ is the neighborhood's radius. The indices of [...] is located in row $i$ and column $j$ [...].


Figure 20. The indices of a pixel

Denote by $S$ the set of multipixels (multipixel means all the wavelengths that belong to this pixel) in the current $m$-neighborhood of $Y$. Consider the mapping $\rho: S \to [-1, 1]$ such that $\rho(p) = \mathrm{corr}(d_p, d_Y)$ is the correlation coefficient between the vectors $d_p$ and $d_Y$, where $d_p$ and $d_Y$ are the $d$-spectra of the pixels $p \in S$ and $Y$, respectively. Denote $\hat{S} = \rho(S)$.

The set $\hat{S}$ can be in one of two cases:

1) $\hat{S}$ is inseparable according to definition 2.5. This means that the pixels, which are correlated with the target, are inseparable from the other pixels;

2) $\hat{S}$ is separable according to definition 2.5. This means that the pixels, which are correlated with the target, are separated from the other pixels.

If we are in case 1, then $Y$ is not a suspicious point. If we are in case 2, assume that $\Gamma$ is the first cluster closest to 1. According to definition 2.4, $(w, b)$ provides the best separation. It separates the set $\Gamma$ from the other points, where $\Gamma$ consists of the pixels $p_{ij}$ whose correlation coefficients are greater than $b$.

If the set $\Gamma$ represents two or more connected components, then $Y$ is also not a suspicious point. If $Y \notin \Gamma$ then $Y$ is also not a suspicious point. Therefore, $H_1$ holds. In other words, if $Y$ is a suspicious point, then $\Gamma$ is a set of pixels that intersects with the target and this set of correlated points is concentrated around the central point $Y$.
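The morphological test described above can be sketched as follows: collect the pixels whose correlation coefficients exceed the threshold, then require that this set contains the central pixel and forms a single connected component. In the sketch the correlation values are supplied directly as a dictionary, the threshold is a plain input rather than a computed Fisher's threshold, and 8-neighbor adjacency is assumed; all names are hypothetical.

```python
from collections import deque

def count_components(cells):
    """Number of connected components in a set of (row, col) cells,
    where cells touching by an edge or a corner count as adjacent."""
    remaining = set(cells)
    components = 0
    while remaining:
        components += 1
        queue = deque([remaining.pop()])
        while queue:  # breadth-first search over adjacent correlated cells
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        queue.append(neighbor)
    return components

def is_suspicious(corr_map, center, threshold):
    """True if the correlated set contains the center and is one component."""
    correlated = {cell for cell, rho in corr_map.items() if rho > threshold}
    return center in correlated and count_components(correlated) == 1

def toy_map(high_cells):
    """5 x 5 neighborhood with low correlation except at the given cells."""
    corr = {(r, c): 0.1 for r in range(5) for c in range(5)}
    corr.update(high_cells)
    return corr

print(is_suspicious(toy_map({(2, 2): 0.9, (2, 3): 0.8}), (2, 2), 0.5))   # True
print(is_suspicious(toy_map({(2, 2): 0.9, (0, 0): 0.85}), (2, 2), 0.5))  # False
print(is_suspicious(toy_map({(0, 0): 0.9}), (2, 2), 0.5))                # False
```

The second and third calls correspond to the two rejection rules: a correlated set with two components, and a correlated set that does not contain the central pixel.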

Here and below, we assume that a correlated point is a pixel whose $d$-spectrum is correlated with that of the central pixel, with a correlation coefficient that is greater than Fisher's threshold $b$.

Let $N_1$ be a smaller neighborhood of $Y$ that is nested inside the $m$-neighborhood of $Y$. $N_1$ is called the internal square. Let $N_2$ be the $m$-neighborhood of $Y$ without $N_1$. $N_2$ is called the external square. They are visualized in Figure 21.

Figure 21. $N_1$ is the internal square and $N_2$ is the external square

Assume that $\Theta$ is the set of all pixels $p_{ij}$, which are bounded by the external square, whose correlation coefficients, which are associated with the current neighborhood, are less than the Fisher's threshold $b$. Each pixel in $\Theta$ is treated as a vector (multipixel) where its entries are spread all over the wavelengths. The $d$-spectrum of each such vector is denoted by $v_j$. The set of all these vectors, which is the set of all the $d$-spectra that belong to $\Theta$, is denoted by $V = \{v_1, \ldots, v_s\}$.

In order to derive the $d$-spectrum of some material in the central pixel, the background around the central pixel has to be removed. For that, we construct an orthogonal projection $\Pi$, which projects all the $d$-spectra onto the orthocomplement of the linear span where the background of the $d$-spectra is located. If the $d$-spectrum $d_Y$ of the central pixel does not belong to this linear span, then this projection extracts an orthogonal component of $d_Y$ which does not mix with the background of the $d$-spectrum. For example, if $d = d_1 + d_2$ is a $d$-spectrum where $d_1$ belongs to the linear span of the background and $d_2$ belongs to the orthocomplement of this span, then, after projection, we obtain $\Pi(d) = d_2$, which does not correlate with the background of the $d$-spectrum. Hence, the background influence is removed by this projection.
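The effect of projecting onto the orthocomplement of the background span can be demonstrated with a small sketch. It builds an orthonormal basis of the background vectors by Gram-Schmidt orthogonalization and subtracts the component of the central vector along each basis vector. This is a simplification: it uses the full span of the background vectors and not a thresholded eigen-decomposition. The function names and the toy vectors are hypothetical.

```python
def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        r = list(v)
        for q in basis:
            coeff = sum(a * b for a, b in zip(r, q))
            r = [a - coeff * b for a, b in zip(r, q)]
        norm = sum(a * a for a in r) ** 0.5
        if norm > tol:  # skip vectors that are already in the span
            basis.append([a / norm for a in r])
    return basis

def remove_background(vector, background):
    """Project `vector` onto the orthocomplement of span(background)."""
    residual = list(vector)
    for q in orthonormal_basis(background):
        coeff = sum(a * b for a, b in zip(residual, q))
        residual = [a - coeff * b for a, b in zip(residual, q)]
    return residual

background = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]  # toy background vectors
material = [0.0, 0.0, 3.0, 0.0]                            # component outside the span
# Central vector: a background mixture plus the material component.
central = [0.5 * a + 2.0 * b + m for a, b, m in zip(background[0], background[1], material)]

print(remove_background(central, background))  # approximately [0, 0, 3, 0]
```

A vector that lies entirely in the background span is mapped to (approximately) the zero vector, while the component outside the span is left untouched.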

Now, we formalize the above. Assume the matrix $E$ is associated with the vectors $v_1, \ldots, v_s$. Assume that $T_e$ is the Fisher's threshold, which separates between the big and small absolute values of the eigenvalues of the matrix $E$. In some cases, $T_e$ can separate between zero and nonzero eigenvalues.

The eigenvectors associated with the eigenvalues, which are smaller than $T_e$, generate the eigensubspace, which is the orthocomplement of the linear span of the principal directions of the set $V$. Denote this orthocomplement by $C$.

Throughout this paper, we assume that in our model the spectrum of any pixel $X$ consists of three components:

1) The spectrum of the material $M$ is different from its background;

2) The spectrum of the background was generated from a linear combination of spectra of pixels from the $X$-neighborhood;

3) Random noise is present.

Accordingly, the $d$-spectrum $P'$ of the pixel $Y$ is written as $P' = \gamma M' + L(v_1, \ldots, v_s) + N$, where $M'$ is the $d$-spectrum of the material $M$, $\gamma$ is the portion of the material $M$ in $Y$, $N$ is a random noise and $L(v_1, \ldots, v_s)$ is a linear combination of the vectors $v_1, \ldots, v_s$.

If the correlated points concentrate around $Y$, then these points consist of the same material as $Y$. Consider the projection operator $\Pi$. This operator projects vectors onto the orthocomplement $C$. The projection of the linear combination of the vectors $v_1, \ldots, v_s$ is approximated to be a zero vector. Thus, this orthogonal projection removes the influence of the background from the $d$-spectrum of $Y$.

Let $T'$ be the given $d$-spectrum of the target. If the correlation coefficient of $\Pi(P')$ and $\Pi(T')$ is greater than the correlation coefficient of $P'$ and $T'$, then $Y$ is a suspicious point, $M$ is the target, $T'$ coincides with $M'$ and $H_0$ holds.

4.2 Detection of Outliers within a Single Testing Cube

In section 4.1, we presented how to detect suspicious points. There is another way to do it. An alternative detection method uses dimensionality reduction by the application of the Diffusion Maps (DM) algorithm (Coifman & Lafon, 2006) and a nearest-neighbor scheme. The DM is a non-linear algorithm for dimensionality reduction.

Assume that we are given a data cube $D$ of size $X \times Y \times Z$, where $Z$ is the number of wavebands. We define a small testing cube $d$ of size $N_v \times N_h \times Z$, $N_v < X$, $N_h < Y$, which is included in the hyper-spectral data cube $D$.

4.2.1 Dimensionality Reduction by DM Application

Assume that a sliding testing cube d, pointed to by the arrows in Figure 22, moves across the data cube D shown in Figure 2 and covers a different fragment each time. Section 4.3 describes in detail how the testing cube d moves.


Figure 22. An urban scene of size 294 × 501 (from the “city” in Figure 2) with different locations of the sliding testing cube d. The arrows point to these locations

The sliding testing cube d contains $N = v \times h$ multipixels, each of which comprises Z wavebands. Typically, v and h are in the range 30-50, Z is in the range 30-100, and $Y \approx 290$. Thus, each of the N data points is a vector of length Z. We arrange these data points into a matrix M of size $N \times Z$.

The next step applies the DM algorithm (see the Appendix for its description) to the matrix M. It reduces the dimensionality of the data vectors by embedding them into the main eigenvectors of the covariance matrix of the data M. This projection reveals the geometrical structure of the data and facilitates a search for singular (abnormal) data points. The data matrix M of size $N \times Z$ is mapped onto the eigenvectors of the matrix P of size $N \times R$, $R < Z$. Typically, R is in the range 3-5, which is determined by the magnitudes of the corresponding eigenvalues. R is the number of essential eigenvalues of the covariance matrix and it is determined as explained in Coifman & Lafon (2006). Figure 23 displays the embedding of the data from four positions of the sliding testing cube onto three major eigenvectors of the covariance matrices.


Figure 23. Embedding of the data from different positions of the sliding testing cube on the image in Figure 2 onto three major eigenvectors of the diffusion matrix

We observe that the overwhelming majority of the embedded data points form a dense cloud, while a few outliers are present. An outlier can be a single point that lies far away from the rest of the points or, more frequently, a small group of points that are located close to each other but far away from the majority of the cloud. This reflects the situation when a possible target occupies an area of one to several pixels (or even a subpixel). These single or grouped outliers are detected as explained in Section 4.2.2.

4.2.2 Detection of Grouped Outliers

Assume we are looking for groups of outliers that consist of no more than K members. It is done by the following steps:

1) For each row $i = 1, \ldots, N$ of the DM matrix P (see Appendix), calculate its Euclidean distances $d_{i,j}$ to all other rows and sort them in ascending order: $s_{i,1} \le s_{i,2} \le \ldots \le s_{i,N-1}$.

2) Form the matrix $S = \{s_{i,k}\}$, $i = 1, \ldots, N$, $k = 1, \ldots, N-1$, of the sorted distances and the matrix $J = \{j_{i,k}\}$ of the corresponding indices. Thus, $s_{i,k} = d_{i,j_{i,k}}$.

3) For each data point $i = 1, \ldots, N$, determine its K nearest neighbors. For this, take the first K columns $\{j_{i,k}\}$, $i = 1, \ldots, N$, $k = 1, \ldots, K$, of the index matrix J. The corresponding distances are presented in the first K columns of the matrix S. Thus, we have the nearest neighbor index matrix $J_K$ and the distances matrix $S_K$, where both are of size $N \times K$.

First, the simplest case $K = 2$, which means that we are looking for groups of outliers consisting of no more than two points, is handled.

4) Assume that $\max_i s_{i,2}$ is achieved by $i = i_2$. It means that the distance to the second in order nearest neighbor of the $i_2$-th data point is the largest among the distances to their second nearest neighbors of all the data points. Restore the coordinates $x_2$ and $y_2$ of the data point $p_{i_2}$ (multipixel $m_{i_2}$) in the data cube D. Store the point $P_2 = (x_2, y_2)$.

5) Find $\max_i s_{i,1}$. Two alternatives are possible:

a) $P_2$ is an isolated outlier. It takes place when the maximum $\max_i s_{i,1}$ is achieved by $i = i_2$. It means that the distances from the point $P_2$ to its first two nearest neighbors are greater than the respective distances of all the other points.

b) However, it may happen that some point lies close to $P_2$ while all the others are far apart. It can be interpreted as a pairwise outlier. An indicator of this situation is the fact that the maximum $\max_i s_{i,1}$ is achieved by some $i \ne i_2$. In this case, we add the point $P_1$ that is closest to the point $P_2$ and regard $\{P_1, P_2\}$ as a pairwise outlier. The index of the point $P_1$ is $i_1 = j_{i_2,1}$.

6) While looking for grouped outliers that may contain up to $K > 2$ members, find $i_K$ such that $\max_i s_{i,K}$ is achieved by $i = i_K$. Restore the coordinates $x_K$ and $y_K$ of the data point $p_{i_K}$ (multipixel $m_{i_K}$) in the data cube D. Store the point $P_K = (x_K, y_K)$.

7) Find the maximal values $\max_i s_{i,k}$, $k = 1, \ldots, K-1$, in the first $K-1$ columns of the distance matrix S. The following alternatives are possible:

a) $P_K$ is an isolated outlier. It takes place when all the maxima $\max_i s_{i,k}$, $k = 1, \ldots, K-1$, are achieved by $i = i_K$. It means that the distances from the point $P_K$ to its first K nearest neighbors are greater than the respective distances for all the other points.

b) If the maxima in the columns $k = 2, \ldots, K-1$ are achieved by $i = i_K$, while $\max_i s_{i,1}$ is achieved by some other index, then we add the point $P_1$ that is the closest to the point $P_K$ and regard $\{P_1, P_K\}$ as a pairwise outlier. The index of the point $P_1$ is $i_1 = j_{i_K,1}$.

c) If the maxima in the columns $k = L+1, \ldots, K-1$ are achieved by $i = i_K$, while the maxima in the columns $k = 1, \ldots, L$ are achieved by other indices, then we have grouped outliers. These outliers consist of $P_K$ and of the L points closest to $P_K$: $P_1, \ldots, P_L$. The indices of the points $P_l$ are $i_l = j_{i_K,l}$, $l = 1, \ldots, L$, respectively.

We emphasize that, once the upper limit K is given, the number L is determined automatically depending on the data within the sliding testing cube d. Figure 24 illustrates the grouped detected outliers in the 3-dimensional space of eigenvectors of the data from four positions of the sliding testing cube.

Figure 24. Detection of grouped outliers in data from different positions of the sliding testing cube embedded in the diffusion space
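The grouped-outlier search of Section 4.2.2 can be sketched in NumPy as follows. This is one possible reading of the steps, not the authors' implementation. The assumptions are: indices are 0-based, ties in a column maximum are resolved by taking the first row, and the group size is set by the last of the first $K-1$ columns whose maximum is attained by a point other than the one with the largest distance to its K-th neighbor. The function names are hypothetical.

```python
import numpy as np

def sorted_neighbor_matrices(P):
    """Return S (sorted distances) and J (neighbor indices), both of shape (N, N-1)."""
    P = np.asarray(P, dtype=float)
    diff = P[:, None, :] - P[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    J = np.argsort(dist, axis=1, kind="stable")[:, :-1]
    S = np.take_along_axis(dist, J, axis=1)
    return S, J

def grouped_outlier(S, J, K):
    """Row indices of the detected outlier group; the group has at most K points."""
    if not 2 <= K <= S.shape[1]:
        raise ValueError("K must satisfy 2 <= K <= N - 1")
    i_K = int(np.argmax(S[:, K - 1]))            # largest distance to the K-th neighbor
    winners = np.argmax(S[:, :K - 1], axis=0)    # row attaining the maximum of each column
    lost = np.nonzero(winners != i_K)[0]         # columns whose maximum belongs to another row
    L = int(lost.max()) + 1 if lost.size else 0  # number of close companions of i_K
    return [i_K] + [int(j) for j in J[i_K, :L]]

# Usage: a dense cloud plus a pair of points far away from it.
cloud = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]]
S, J = sorted_neighbor_matrices(cloud + [[10, 10], [10.2, 10]])
pair = grouped_outlier(S, J, K=2)
```

With `L == 0` the function reports an isolated outlier, and with `L == 1` a pairwise outlier. The row indices can then be mapped back to image coordinates of the multipixels.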

4.3 Detection of Singular Points within the Whole Data Cube

In Section 4.2.2, we described how to find a group of data points (multipixels) within one sliding testing cube, whose geometry differs from the geometry of the majority of the data points. Let $\{P^1_1, \ldots, P^1_{L_1}\}$ be the list of such data points in the sliding testing cube d of size $v \times h \times Z$ located in the upper left corner of the data cube D, as illustrated by the arrow in Figure 22. The next testing cube is obtained by a right shift of d by $h/4$. Let $\{P^2_1, \ldots, P^2_{L_2}\}$ be the list of outliers in this cube. We append the second list to the first one. Because of the vast overlap between the two cubes, some points may be common to both lists. In the united list, these points gain the weight 2. The next right shift produces the next sliding testing cube, whose outliers list is appended to the combined list of the first two cubes. Again, the common points gain weights. We proceed with the right shifts till the right edge of the data cube D. Then, the sliding testing cube slides down by $v/4$ and starts $h/4$-shifts to the left and so on. As a result, we get a combined list of outliers, which is the union of the lists from all the R positions, where R is the number of jumps of the testing cube d within the data cube D. Figure 22 illustrates a route of the cube d on the data cube D.
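The route of the testing cube and the accumulation of weights can be sketched as follows. This is a minimal sketch under stated assumptions: the shifts are `h // 4` and `v // 4` in integer pixels, the route is a left-to-right then right-to-left sweep, positions where the cube would cross the image border are skipped, and the per-cube outlier detector is passed in as the hypothetical callable `detect`, which returns coordinates local to the cube.

```python
from collections import Counter

def cube_positions(n_rows, n_cols, v, h):
    """Upper left corners of the testing cube along a back-and-forth route."""
    step_v, step_h = max(v // 4, 1), max(h // 4, 1)
    tops = range(0, n_rows - v + 1, step_v)
    lefts = list(range(0, n_cols - h + 1, step_h))
    route = []
    for row, top in enumerate(tops):
        cols = lefts if row % 2 == 0 else lefts[::-1]   # reverse direction on odd sweeps
        route.extend((top, left) for left in cols)
    return route

def accumulate_weights(n_rows, n_cols, v, h, detect):
    """Count, for every image point, in how many testing cubes it was an outlier."""
    weights = Counter()
    for top, left in cube_positions(n_rows, n_cols, v, h):
        for r, c in detect(top, left):                  # coordinates local to the cube
            weights[(top + r, left + c)] += 1
    return weights

# Usage with a toy detector that always flags the image point (5, 5) when it is covered.
def toy_detect(top, left, v=8, h=8):
    inside = top <= 5 < top + v and left <= 5 < left + h
    return [(5 - top, 5 - left)] if inside else []

weights = accumulate_weights(16, 16, 8, 8, toy_detect)
```

A point that is flagged in many overlapping cubes collects a large weight, which is the singularity measure used in the rest of this section.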

It is important that each point $P_i$ in the combined list is supplied with the weight $w_i$, which can range from 1 to more than 40. The weight $w_i$ can serve as a singularity measure for the point $P_i$. A large weight $w_i$ reflects the fact that the point $P_i$ is singular for a big number of overlaps between sliding testing cubes. Thus, it can be regarded as a strong singular point in the data cube D and vice versa. Figure 25 illustrates the distribution of the weighted singular points around the data cube $D_U$ of size 500 × 294 × 64 from the urban scene displayed in Figure 22, whose source is Figure 2.

Figure 25. Distribution of the weighted singular points around the data cube $D_U$. Left: All the singular points. Right: Singular points whose weights exceed 12

4.3.1 Examples of Detected Singular Points

We applied the above algorithm to find singular points in different data cubes. The following figures display a few singular points detected in the data cube $D_U$.

Figure 26. A group of singular points centered around the point $P(399, 85)$. Left: Vicinity of the point P. Right: Multipixel spectra at the point $P(329, 85)$ and the surrounding points. The weight of the data point P is 19

Figure 27. A strong singular point $P(352, 90)$. Left: Vicinity of the point P. Right: Multipixel spectra of the point $P(352, 90)$ and the surrounding points. The weight of the data point P is 32

Figure 28. A strong singular point $P(117, 182)$. Left: Vicinity of the point P. Right: Multipixel spectra at the point $P(117, 182)$ and the surrounding points. The weight of the data point P is 32

By comparing Figures 27 and 28, we observe that the spectra of the singular multipixels located at the points $P(117, 182)$ and $P(352, 90)$ are similar to each other. Supposedly, they correspond to the same material. A different singular multipixel is displayed in Figure 29.

Figure 29. A singular point $P(242, 202)$. Left: Vicinity of the point P. Right: Multipixel spectra at the point $P(242, 202)$ and the surrounding points. The weight of the data point P is 32

4.4 Extraction of the Target's Spectrum from a Suspicious Point

Let Y be a suspicious point and let T be the given target's spectrum. What portion of the target is contained in Y?

We consider a simplified version of Equation 1 via the definition of a simple mixing model that describes the relation between a target and its background. Assume P is a pixel of mixed spectrum (a spectrum that contains background influence and the target) and T is the given target's spectrum. Consider three spectra: an average background spectrum $B = \sum_{k=1}^{M} a_k B_k$, a mixed pixel spectrum (spectrum of a suspicious point) P and the target's spectrum T. They are related by the following model

$$P = tT + (1 - t)B = tT + \sum_{k=1}^{M} c_k B_k \qquad (6)$$

which is a modified version of Eq. 1, where $a_1 = t$, $s_1 = T$ and $t \in (0, 1)$. Each $B_k$, $k = 1, \ldots, M$, was taken from the neighborhood of the pixel. Therefore, all of them are close to each other and have similar features.

We are given the target's spectrum T and the mixed pixel spectrum P. Our goal is to estimate t, denoted by $\hat{t}$, which will satisfy Equation 6 provided that B and T have some independent features. Once $\hat{t}$ is found, the estimate of the unknown background spectrum B, denoted by $\hat{B}$, is calculated by

$$\hat{B} = \frac{P - \hat{t}\,T}{1 - \hat{t}}.$$

Estimating the parameter t in Equation 6 is called linear unmixing.
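As a small numeric illustration of the mixing model, the sketch below recovers the background spectrum from a mixed pixel once an estimate of the portion is available. It assumes the model $P = tT + (1 - t)B$ with the estimate strictly below 1; the function name and the toy spectra are illustrative only.

```python
import numpy as np

def estimate_background(P, T, t_hat):
    """Solve P = t*T + (1 - t)*B for B, given an estimate t_hat of t."""
    if not 0.0 <= t_hat < 1.0:
        raise ValueError("t_hat must lie in [0, 1)")
    P = np.asarray(P, dtype=float)
    T = np.asarray(T, dtype=float)
    return (P - t_hat * T) / (1.0 - t_hat)

# Usage: mix a toy target with a toy background and invert the mixture.
T = np.array([0.9, 0.7, 0.2, 0.1])
B = np.array([0.3, 0.4, 0.5, 0.6])
P = 0.25 * T + 0.75 * B
B_hat = estimate_background(P, T, 0.25)
```

When the estimate approaches 1, the division amplifies any error in the estimate, so the recovered background becomes unreliable for pixels that are almost pure target.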

In Step 2 from Section 4.1, we calculated the following: V is the set of $d$-spectra of pixels from the $m$-neighborhood of Y, which is uncorrelated with [...], and the projection operator onto the orthocomplement of the linear span of V. Let $P_2$ and $T_2$ be the projections of $P'$ and $T'$, respectively. Then $P_2 = tT_2 + N$, where t is an unknown parameter and N is a random noise that is independent of $T_2$. The parameter t is estimated as the maximum of the independency between the two $d$-spectra $T_2$ and $P_2 - tT_2$.

The fact that two vectors $X_1$ and $X_2$ are independent is equivalent to $\mathrm{corr}(\varphi(X_1), \varphi(X_2)) = 0$ for any analytical function $\varphi$ (Hyvarinen, Karhunen, & Oja, 2001). An analytical function can be represented by a Taylor expansion of its argument's degrees. Then, the condition $\mathrm{corr}(\varphi(X_1), \varphi(X_2)) = 0$ is equivalent to $\mathrm{corr}(X_1^n, X_2^n) = 0$ for any positive integer n, where n denotes a power. In our algorithm, we limit ourselves to $n = 1, 2, 3, 4$. From the independency criterion between the two vectors $X_1$ and $X_2$ we can have

$$f(X_1, X_2) = |\mathrm{corr}(X_1, X_2)| + |\mathrm{corr}(X_1^2, X_2^2)| + |\mathrm{corr}(X_1^3, X_2^3)| + |\mathrm{corr}(X_1^4, X_2^4)| \qquad (7)$$

which equals zero in case $X_1$ and $X_2$ are independent. If [...], where P is the spectrum of the suspicious point and B is a mix of the background's spectrum from the neighborhood that is affected by noise.
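The estimation of t by maximizing independency can be sketched as a one-dimensional search. This is a sketch under assumptions, not the authors' procedure: the criterion is taken as the sum of absolute correlations of the element-wise powers 1 to 4, and the minimizer is found by a plain grid search over [0, 1]. The names `independence_score` and `estimate_portion` are hypothetical.

```python
import numpy as np

def independence_score(x1, x2, max_power=4):
    """Sum of |corr(x1**n, x2**n)| for n = 1..max_power; near zero for independent vectors."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    score = 0.0
    for n in range(1, max_power + 1):
        score += abs(np.corrcoef(x1 ** n, x2 ** n)[0, 1])
    return score

def estimate_portion(P2, T2, grid=None):
    """Grid value of t that makes T2 and the residual P2 - t*T2 least dependent."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    scores = [independence_score(T2, P2 - t * T2) for t in grid]
    return float(grid[int(np.nanargmin(scores))])

# Usage on synthetic data: the true portion is 0.4 and the noise is independent of T2.
rng = np.random.default_rng(0)
T2 = rng.normal(size=5000)
noise = 0.05 * rng.normal(size=5000)
P2 = 0.4 * T2 + noise
t_hat = estimate_portion(P2, T2)
```

A finer grid or a scalar optimizer could replace the grid search; the grid is used here only because it is easy to verify.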

The flow of the UNSP algorithm is given in Figure 30.


Figure 30. The flow of the UNSP algorithm

4.5 Experimental Results

In this section, we consider two scenes, “field” (Figure 3) and “city” (Figure 2), that contain subpixel targets.

As a first step, we find all the suspicious points via the application of the anomaly detection process (Section 4.2). The next step checks each anomaly by the “morphological-filter”, which was described in Section 4.1. If the pixel passes the “morphological-filter”, then the target is present in it.

Figures 31 and 32 present the outputs from the application of the “morphological-filter” algorithm to two different hyper-spectral scenarios.

Figure 31. Left: The source image (Figure 2). Right: The white points are the suspicious points in the neighborhood of diameter $m = 10$

Figure 32. [...] The white points are the suspicious points in the neighborhood of diameter $m = 10$

In Figures 33 and 34, the x- and y-[...] and their values, respectively. [...] Figures 31 and 32 are the [...] singular points detected as anomalies [...] (section 4.1). [...] the parameter t from Equation [...]. The estimation of t is [...] minimization of the f [...] 4.4. Now we present an unmixing [...]

Figure 33. The output from [...]. This suspicious point is the suspicious point in Figure 31

Figure 34. The output from [...]. This suspicious point is the suspicious point in Figure 32

[...] between UNSP and [...] algorithms

In this section, we compare [...] Figures 35-41. [...] (Ahmad & Ul Haq, 2011) algorithms.

Figure 35. Left: The [...] image. Right: The white points [...] detected targets

Figure 36. Left: The original [...] algorithm

Figure 37. The detected targets by the OSP [...]. The yellow circles [...] targets. The other points are [...] over the original “city” image

The OSP algorithm [...] more [...]

Figure 38. [...] The red line corresponds to [...] method [...]

Figure 39. Left: The original [...] points mark the detected targets

Figure 40. Left: The original [...] Ahmad & Ul Haq (2011) algorithm [...] detected targets [...] OSP

Figure 41. The detected targets by the OSP algorithm [...]. Yellow circles mark known places [...] produces more [...]

Figure 42. The “ROC-curve” for the scene in Figure 41. The red line corresponds to the OSP (Ahmad & Ul Haq, 2011) method. The green line corresponds to the WDR method

5. Conclusions

We presented two algorithms for linear unmixing. The WDR algorithm reliably detects targets that occupy at least one pixel but fails to detect sub-pixel targets. The UNSP algorithm reliably detects sub-pixel targets, but it is computationally expensive due to the need to search for the spectral decomposition in each pixel's neighborhood by sliding the “morphological-filter”. In the future, we plan to add to these algorithms a classification method with machine learning methodologies.

References

Ahmad, M., & Ul Haq, I. (2011). Linear Unmixing and Target Detection of Hyper-spectral Imagery. 2011 International Conference on Modeling, Simulation and Control IPCSIT, IACSIT Press, 10, 179-183.

Ahmad, M., Ul Haq, I., & Mushtaq, Q. (2011). AIK Method for Band Clustering Using Statistics of Correlation and Dispersion Matrix. 2011 International Conference on Information Communication and Management, IACSIT Press, 10, 114-118.

Arngren, M., Schmidt, M. N., & Larsen, J. (2009). Bayesian nonnegative matrix factorization with volume prior for unmixing of hyper-spectral images. IEEE Workshop on Machine Learning for Signal Processing (MLSP), Grenoble, 1-6. http://dx.doi.org/10.1109/MLSP.2009.5306262

Averbuch, A. Z., & Zheludev, M. V. (2012). Two Linear Unmixing Algorithms to Recognize Targets using Supervised Classification and Orthogonal Rotation in Airborne Hyper-spectral Images. Remote Sensing, 4, 532-560. http://dx.doi.org/10.3390/rs4020532

Bateson, C., Asner, G., & Wessman, C. (2000). Endmember bundles: A new approach to incorporating endmember variability into spectral mixture analysis. IEEE Trans. Geoscience Remote Sensing, 38, 1083-1094. http://dx.doi.org/10.1109/36.841987

Bayliss, J. D., Gualtieri, J. A., & Cromp, R. F. (1997). Analysing hyper-spectral data with independent component analysis. Proceeding of the SPIE conference, 3240, 133-143.

Bioucas-Dias, J., & Plaza, A. (2010). Hyper-spectral unmixing: Geometrical, statistical and sparse regression-based approaches. SPIE Remote Sensing Europe, Image and Signal Processing for Remote Sensing Conference, SPIE.

Bioucas-Dias, J., & Plaza, A. (2011). An overview on hyper-spectral unmixing: geometrical, statistical, and sparse regression based approaches. Proceeding IEEE Int. Conf. Geosci. and Remote Sensing (IGARSS), IEEE International, 1135-1138.

Boardman, J. (1993). Automating spectral unmixing of AVIRIS data using convex geometry concepts. Summaries of the Fourth Annual Airborne Geoscience Workshop, TIMS Workshop, Jet Propulsion Laboratory, Pasadena, CA, 2, 11-14.

Burges, J. C. (1998). A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2), 121-167. http://dx.doi.org/10.1023/A:1009715923555

Chang, C. I. (2003). Hyper-spectral Imaging: Techniques for spectral detection and classification. Kluwer Academic, New York.

Chang, C. I., Zhao, X., Althouse, M. L. G., & Pan, J. J. (1998). Least squares subspace projection approach to mixed pixel classification for hyper-spectral images. IEEE Trans. Geoscience Remote Sensing, 36(3), 898-912. http://dx.doi.org/10.1109/36.673681

Coifman, R. R., & Lafon, S. (2006). Diffusion Maps. Applied and Computational Harmonic Analysis, 21(1), 5-30. http://dx.doi.org/10.1016/j.acha.2006.04.006

Comon, P. (1994). Independent component analysis: A new concept. Signal Processing, 36, 287-314. http://dx.doi.org/10.1016/0165-1684(94)90029-9

Craig, M. D. (1994). Minimum-volume transforms for remotely sensed data. IEEE Trans. Geoscience Remote Sensing, 99-109.

Cristianini, N., & Shawe-Taylor, J. (2000). Support Vector Machines and other kernel-based learning methods. Cambridge University Press.

Dobigeon, N., Moussaoui, S., Coulon, M., Tourneret, J. Y., & Hero, A. O. (2009). Joint Bayesian endmember extraction and linear unmixing for hyper-spectral imagery. IEEE Trans. Signal Processing, 57(11), 4355-4368. http://dx.doi.org/10.1109/TSP.2009.2025797

Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern Classification. John Wiley & Sons Inc.

Harsanyi, J. C., & Chang, C. I. (1994). Hyper-spectral image classification and dimensionality reduction: an orthogonal subspace projection approach. IEEE Trans. Geoscience Remote Sensing, 32(4), 779-785. http://dx.doi.org/10.1109/36.298007

Hyvarinen, A., Karhunen, J., & Oja, E. (2001). Independent Component Analysis. John Wiley & Sons Inc.

Ifarraguerri, A., & Chang, C. I. (1999). Multispectral and hyper-spectral image analysis with convex cones. IEEE Trans. Geoscience Remote Sensing, 37, 756-770. http://dx.doi.org/10.1109/36.752192

Manolakis, D., & Shaw, G. (2002). Detection algorithms for hyper-spectral imaging applications. IEEE Signal Processing Magazine, 19(1), 29-43. http://dx.doi.org/10.1109/79.974724

Manolakis, D., Marden, D., & Shaw, G. (2001). Hyper-spectral image processing for automatic target detection applications. Lincoln Lab Journal, 14(1), 79-114.

Moussaoui, S., Carteret, C., Brie, D., & Mohammad-Djafari, A. (2006). Bayesian analysis of spectral mixture data using markov chain monte carlo methods. Chemometrics and Intelligent Laboratory Systems, 81(2), 137-148. http://dx.doi.org/10.1016/j.chemolab.2005.11.004

Nascimento, M. P., & Bioucas-Dias, M. (2005a). Does independent component analysis play a role in unmixing hyper-spectral data? IEEE Trans. Geoscience Remote Sensing, 43(1), 175-187. http://dx.doi.org/10.1109/TGRS.2004.839806

Nascimento, M. P., & Bioucas-Dias, M. (2005b). Vertex component analysis: A fast algorithm to unmix hyper-spectral data. IEEE Trans. Geoscience Remote Sensing, 43(4), 898-910. http://dx.doi.org/10.1109/TGRS.2005.844293

Ren, H., & Chang, C. I. (2003). Automatic Spectral Target Recognition in Hyper-spectral imagery. IEEE Trans. on Aerospace and Electronic Systems, 39(4), 1232-1249. http://dx.doi.org/10.1109/TAES.2003.1261124

Settle, J. J. (1996). On the relationship between spectral unmixing and subspace projection. IEEE Trans. Geoscience Remote Sensing, 34, 1045-1046. http://dx.doi.org/10.1109/36.508422

SPECIM camera. (2006). Retrieved from http://www.specim.fi/

Winter, M. E. (1999). N-findr: An algorithm for fast autonomous spectral end-member determination in hyper-spectral data. Proceeding SPIE Conf. Imaging Spectrometry V, SPIE, 266-275.

Appendix: Diffusion Maps

Diffusion Maps (DM) (Coifman & Lafon, 2006) analyzes a dataset M by exploring the geometry of the manifold $\mathcal{M}$ from which it is sampled. It is based on defining an isotropic kernel $K \in \mathbb{R}^{n \times n}$ whose elements are defined by $k(x, y) = e^{-\|x - y\|/\varepsilon}$, $x, y \in M$, where $\varepsilon$ is a meta-parameter of the algorithm. This kernel represents the affinities between data points in the manifold. The kernel can be viewed as a construction of a weighted graph over the dataset M. The data points in M are the vertices and the weights of the edges are defined by the kernel K.

The degree of each data point (i.e., vertex) $x \in M$ in this graph is $q(x) = \sum_{y \in M} k(x, y)$. Normalization of the kernel by this degree produces an $n \times n$ row stochastic transition matrix P whose elements are $p(x, y) = k(x, y)/q(x)$, $x, y \in M$, which defines a Markov process (i.e., a diffusion process) over the data points in M. A symmetric conjugate $\bar{P}$ of the transition operator P defines the diffusion affinities between data points by

$$\bar{p}(x, y) = \frac{k(x, y)}{\sqrt{q(x)\,q(y)}} = \sqrt{q(x)}\; p(x, y)\; \frac{1}{\sqrt{q(y)}}, \quad x, y \in M.$$

DM embeds the manifold into a Euclidean space whose dimensionality is usually significantly lower than the original dimensionality. This embedding is a result of the spectral analysis of the diffusion affinity kernel $\bar{P}$.

The eigenvalues $1 = \sigma_0 \ge \sigma_1 \ge \ldots$ of $\bar{P}$ and their corresponding eigenvectors $\phi_i$, $i = 0, 1, \ldots$, are used to construct the desired map, which embeds each data point $x \in M$ into the data point $\Phi(x) = (\sigma_i \phi_i(x))_{i=0}^{\delta}$ for a sufficiently small $\delta$, which is the dimension of the embedded space. $\delta$ depends on the decay of the spectrum of $\bar{P}$.
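The construction described in this appendix can be sketched in NumPy as below. This is a minimal dense sketch, suitable only for the small number of multipixels in one testing cube. It assumes the kernel $e^{-\|x - y\|/\varepsilon}$ (a squared norm is also common in the literature) and keeps the eigenpairs with indices 0 to `dim` of the symmetric conjugate; the function name and arguments are illustrative.

```python
import numpy as np

def diffusion_map(M, eps, dim):
    """Embed the rows of M using the leading eigenpairs of the symmetric diffusion kernel.

    Returns the dim + 1 largest eigenvalues (descending) and the embedded coordinates.
    """
    M = np.asarray(M, dtype=float)
    diff = M[:, None, :] - M[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # pairwise Euclidean distances
    K = np.exp(-dist / eps)                      # isotropic kernel
    q = K.sum(axis=1)                            # degree of every data point
    A = K / np.sqrt(np.outer(q, q))              # symmetric conjugate of the transition matrix
    eigvals, eigvecs = np.linalg.eigh(A)         # ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma = eigvals[:dim + 1]
    return sigma, sigma * eigvecs[:, :dim + 1]   # column i holds sigma_i * phi_i

# Usage on random data: 30 points with 5 features, embedded with dim = 3.
rng = np.random.default_rng(1)
points = rng.normal(size=(30, 5))
sigma, embedding = diffusion_map(points, eps=2.0, dim=3)
```

The rows of `embedding` play the role of the rows of the DM matrix that the outlier search of Section 4.2.2 operates on. The choice of `eps` and `dim` is left to the user in this sketch.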
