Lecture notes PDF file

Advanced NMR Processing
Ulrich Günther
EuroLabCourse "Advanced Computing in NMR Spectroscopy", Florence, Sept. 2001

Contents
1 Introduction
2 Wavelets
  2.1 The Haar System
  2.2 Mallat's Multiresolution Analysis
  2.3 Thresholding
  2.4 Smoother Wavelet Bases
  2.5 Applications in NMR Spectroscopy
    2.5.1 WAVEWAT
    2.5.2 Noise Suppression
    2.5.3 Noise suppression in three-dimensional spectra
3 SVD-based methods
  3.1 SVD-based noise and signal suppression
  3.2 Linear prediction
4 Using NMRLab
1 Introduction

NMR processing has developed over many years since the introduction of Fourier transform (FT) NMR spectroscopy by Richard Ernst. In FT-NMR spectroscopy the response to a perturbation from equilibrium is recorded during a certain amount of time. Because this response originates from the entire ensemble of spins it is an interferogram containing many different frequencies. All NMR processing rests on the fact that the interferogram can be described as a superposition of decaying complex exponentials, a free induction decay (FID). This signal is comparable to an acoustic signal, a high-frequency sound originating from nuclear spins. For this reason many of the more recent developments of acoustic signal processing are applicable to NMR signals.
Modern NMR spectroscopy benefits greatly from modern computational possibilities. The measured FID is digitized and stored as a digital signal, i.e. a series of complex data points.
NMR processing is used to convert this digital time-domain data into a frequency spectrum. This basic task is usually achieved using a discrete Fourier transform which provides a spectrum showing signal intensities versus frequency. Discrete Fourier transforms (DFT) were revolutionized in 1965 by the publication of a fast algorithm by Cooley and Tukey requiring $N \log N$ rather than $N^2$ operations.
Experimental NMR signals contain noise besides the signals of interest. Most of this noise originates from the receiver circuitry. However, some of the noise is also a consequence of subtle instabilities during long measurements. The Fourier transform has the disadvantage that it convolutes noise with the spectrum. Several approaches to reduce the noise in spectra are commonly used in modern NMR processing. Most commonly the FID is multiplied with an apodization function which damps noise towards the end of the FID. Apodization usually broadens the original signal.
Many other steps are a part of regular NMR processing. Phase correction is used to obtain pure absorption mode spectra. The solvent signal is often suppressed by removing on-resonance components from the FID. This can be achieved using different algorithms. The most common is the subtraction of a low-order polynomial which describes the slow variations in the FID. In protein NMR it is common to apply a convolution to the FID which also extracts low-frequency components.
Baseline correction is often applied to the final spectrum. This is particularly important for multi-dimensional NMR spectra which require a flat baseline for visualization and for peak picking. Baseline correction requires two distinct steps: differentiating between baseline and signal points and the calculation of a suitable signal to subtract from the spectrum.
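The basic processing chain described in this section (apodization of the FID followed by a discrete Fourier transform) can be sketched in a few lines. This is only an illustration in Python with NumPy, not NMRLab code; the function names `simulate_fid` and `process`, the exponential window and all parameter values (dwell time, frequencies, `t2`, line broadening `lb`) are assumptions chosen for the example.

```python
import numpy as np

def simulate_fid(n=1024, dwell=1e-3, freqs=(50.0, -120.0), t2=0.2,
                 noise=0.05, seed=0):
    """Hypothetical FID: decaying complex exponentials plus Gaussian noise."""
    t = np.arange(n) * dwell
    fid = sum(np.exp(2j * np.pi * f * t) * np.exp(-t / t2) for f in freqs)
    rng = np.random.default_rng(seed)
    fid = fid + noise * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return t, fid

def process(fid, t, lb=2.0):
    """Exponential apodization (lb in Hz) followed by an FFT."""
    window = np.exp(-np.pi * lb * t)          # damps the noisy tail of the FID
    return np.fft.fftshift(np.fft.fft(fid * window))

dwell = 1e-3
t, fid = simulate_fid(dwell=dwell)
spec = process(fid, t)
freq_axis = np.fft.fftshift(np.fft.fftfreq(len(fid), d=dwell))
peak_hz = freq_axis[np.argmax(np.abs(spec))]  # position of the tallest line
```

The window lowers the noise contributed by the late part of the FID at the price of an additional line broadening of `lb` Hz, which is the trade-off mentioned in the text.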
Several more advanced signal processing techniques have been applied to NMR spectroscopy. The most frequently used tool of this type is linear prediction, an algorithm introduced by Tufts and Kumaresan in 1982 [16]. Delsuc first introduced this tool for NMR processing [6]. Linear prediction is based on the fact that signals are periodic during the course of an FID while noise is not. If it is possible to determine coefficients which describe the intensity, decay and phase of the signal components in an FID, the noise contribution can be eliminated and the course of the FID can be predicted beyond the duration during which the FID was recorded. This technique uses singular value decomposition (SVD) to calculate coefficients. Although it is computationally very demanding it is commonly used in modern NMR processing.
Many other techniques have also been used to process NMR signals. Maximum entropy reconstruction is the best known of these and is available in many NMR processing packages. MaxEnt uses a configurational entropy as a regularization function which provides a measure for the approximation of the calculated spectra to the experimental data. Bayesian analysis is a statistical method to estimate the degree to which a hypothesis is confirmed by experimental data. Bretthorst and Sibisi showed how Bayesian analysis can be used to process NMR spectra. More recently continuous wavelet transforms (CWT) were used to analyze NMR signals.
This course focuses on the use of discrete wavelet transforms (DWT) to reduce the noise level in NMR spectra and for suppression of the on-resonance solvent signal. Wavelet algorithms are being rapidly introduced in many fields of signal processing. The most common application of wavelets is probably image and sound compression (e.g. in jpeg, mpeg and mp3 file formats). Smoothing and noise suppression employing wavelet transforms was originally suggested
by David Donoho [7, 8]. Other applications of wavelets include density estimation and nonparametric regression. In this course basic principles of wavelet transforms will be presented.
All algorithms used in this course were implemented in NMRLab, a package for NMR processing in MATLAB (The Mathworks). Most of the source code of NMRLab is available. Although most of the routines were vectorized to maximize computational efficiency, a non-vectorized version is usually included as a comment. For wavelet transforms routines from WAVELAB were used [1].
Christian Ludwig made significant contributions to the software used in this course. The underlying work was supported by Prof. H. Rüterjans and the Large Scale Facility at Frankfurt.

2 Wavelets

Wavelets are a topic of applied mathematics. The mathematical theory of ondelettes (wavelets) was developed by Yves Meyer almost 15 years ago. The name wavelet originates from the requirement for these functions to integrate to zero, "waving" above and below the x-axis. Wavelets chop up data into frequency components, and analyze each frequency component with a resolution matched to its scale.
General interest in wavelets has grown substantially in the past 10 years because wavelets solve basic problems in signal processing such as data approximation (smoothing), noise reduction, data compression, time-frequency analysis and image analysis. The availability of fast wavelet transform algorithms was crucial for their success in signal processing. The application of wavelets to smooth NMR signals has been inspired by David Donoho, one of the pioneers of the field, who used an NMR spectrum as an example to illustrate potential applications [7, 8]. In his book 'NMR data processing' J. Hoch describes emerging methods in NMR data processing and shows an example for smoothing by wavelets following the ideas of Donoho [14]. More recent publications describe the use of wavelets in NMR processing [12, 11].
It is the aim of this course to introduce basic principles of wavelet analysis and potential applications to NMR researchers. For more advanced reading we refer to many excellent text books [17, 5, 13] and introductory texts [18, 21, 15, 9, 19].
2.1 The Haar System
There are many types of wavelets. The simplest wavelet is the Haar wavelet, introduced in an appendix of the thesis of A. Haar in 1909. The Haar function is a simple step function

$$\psi(x) = \begin{cases} -1, & x \in [0, \tfrac{1}{2}[ \\ \phantom{-}1, & x \in [\tfrac{1}{2}, 1[ \end{cases}$$

shown in Figure 2.1.

Figure 2.1: The Haar wavelet

The Haar function $\psi$ is used to define a mother wavelet. From the mother wavelet a series of wavelets is derived by two dyadic operations: dilatations and translations. Dilatations compress the function on the x-axis. Translations slide it along the x-axis. For integer translation indices $k$ and dilatation indices $j$ wavelets are derived from the mother wavelet by

$$\psi_{j,k}(x) = 2^{j/2}\,\psi(2^{j}x - k). \qquad (2.1)$$

Haar wavelets $\psi_{j,k}$ have a support

$$\operatorname{supp}\,\psi_{j,k} = [\,k\,2^{-j},\ (k+1)\,2^{-j}\,[ \qquad (2.2)$$

i.e. they are zero outside this interval. For each Haar wavelet $\psi_{j,k}$ the integral

$$\int_{-\infty}^{\infty} \psi_{j,k}(x)\,dx = 0,$$

i.e. the area above the x-axis is equal to the area below the x-axis.
The set $\{\psi_{j,k};\ j,k \in \mathbb{Z}\}$ constitutes a complete orthonormal basis in $L_2$, the space of square integrable functions (footnote 1). This means that any square integrable function can be approximated arbitrarily well by a linear combination of these basis functions.

Footnote 1: $L_2(\mathbb{R})$ is the space of complex valued functions $f$ on $\mathbb{R}$ with a finite $L_2$-norm, $\|f\|_2^2 = \int_{-\infty}^{\infty} |f(x)|^2\,dx < \infty$.

Figure 2.2: The set $\{\psi_{j,k};\ j,k \in \mathbb{Z}\}$ derived from a Haar mother wavelet (examples shown: $j=0, k=0$; $j=3, k=2$; $j=4, k=13$).

In addition to the set $\{\psi_{j,k};\ j,k \in \mathbb{Z}\}$ we also need a scaling function (father wavelet) $\phi$,

$$\phi(x) = \begin{cases} 1, & x \in [0, 1[ \\ 0, & x \notin [0, 1[ \end{cases}$$

This is because the construction of wavelets starts with the father wavelet from which the mother wavelet is derived. The reverse way is not possible. In addition the fast transform uses the father wavelet. With the scaling function $\phi$ we can expand our original set to $\{\phi_{j_0,k},\ \psi_{j,k};\ j \ge j_0,\ k \in \mathbb{Z}\}$. The combined set is again an orthonormal basis in $L_2$.
How can a data series be approximated by functions of the new set? This will be illustrated by a simple thought experiment. Any $L_2$-function can be approximated by a simple step function. The approximation converges for infinitely small steps. Now we must show that the same approximation can be achieved by a combination of the constant function $\phi$ and Haar wavelets.
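The properties of the Haar system stated in this section (zero integral, unit norm, mutual orthogonality) can be checked numerically. The following is a small sketch in Python with NumPy; the function names are illustrative, the grid size is an arbitrary choice, and the sign convention of the mother wavelet (negative on the first half of the unit interval) is the one assumed for these notes.

```python
import numpy as np

def haar_mother(x):
    """Haar mother wavelet: -1 on [0, 1/2[, +1 on [1/2, 1[, 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    first_half = np.where((x >= 0.0) & (x < 0.5), -1.0, 0.0)
    second_half = np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0)
    return first_half + second_half

def haar_wavelet(j, k, x):
    """psi_{j,k}(x) = 2^(j/2) * psi(2^j x - k), cf. equation (2.1)."""
    return 2.0 ** (j / 2) * haar_mother(2.0 ** j * x - k)

n = 1024
x = (np.arange(n) + 0.5) / n      # midpoints of a dyadic grid on [0, 1[
dx = 1.0 / n

def inner(f, g):
    """Riemann sum for the L2 inner product on [0, 1[."""
    return np.sum(f * g) * dx

psi_00 = haar_wavelet(0, 0, x)
psi_32 = haar_wavelet(3, 2, x)
psi_413 = haar_wavelet(4, 13, x)

norm_32 = inner(psi_32, psi_32)          # expected to be 1
overlap_a = inner(psi_00, psi_32)        # expected to be 0
overlap_b = inner(psi_32, psi_413)       # expected to be 0
area_413 = np.sum(psi_413) * dx          # expected to be 0
```

Because all break points of these wavelets fall on the grid, the Riemann sums reproduce the integrals up to rounding errors.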
Any square integrable function (footnote 2) $f(x)$ can be described by

$$f(x) = c_{00}\,\phi(x) + \sum_{j=0}^{n-1}\ \sum_{k=0}^{2^{j}-1} c_{j,k}\,\psi_{j,k}(x), \qquad (2.3)$$

where $c_{j,k}$ are wavelet coefficients, $\psi_{j,k}$ are wavelets derived from a mother wavelet $\psi$ and $\phi$ is a scaling function (father wavelet); in the case of the Haar wavelet transform it is unity on the interval $[0, 1[$. Using equation 2.3 a function $f$ can be decomposed into a linear combination of wavelets $\psi_{j,k}$. The same is true for a data series which can be described by a function $f(x)$.

Footnote 2: For square integrable functions $\|f\| = \left(\int_{-\infty}^{\infty} f^2(x)\,dx\right)^{1/2} < \infty$ for $x \in \mathbb{R}$, i.e. $f \in L_2(\mathbb{R})$. $L_2(\mathbb{R})$ is a Banach space of square integrable functions.

Wavelet transforms share many properties with Fourier transforms. The algorithm to determine wavelet coefficients is even faster than the FFT. While the fast Fourier transform algorithm by Cooley and Tukey requires $N \log N$ operations, the dyadic wavelet transform gets along with only $N$ operations.
Figure 2.3 compares the Fourier transform, the windowed Fourier transform (short-time FT) and the wavelet transform. In a time series with a high resolution in the time domain each point contains information about all frequencies. Due to the convolution properties the opposite is true for the FT of the time series. In this case every point in the frequency domain contains information from all points in the time domain. The windowed (short-time) Fourier transform divides the time-frequency plane into rectangular boxes. The resolution in time is increased at the expense of the frequency resolution. The dyadic wavelet transform (DWT) overcomes this problem by scaling the basis functions relative to their support. The WT needs more time for the detection of low frequencies than for the detection of high frequencies. Using these properties of the WT it is possible to describe an experimental signal on different frequency levels, which leads to Mallat's multiresolution analysis (MRA).

Figure 2.3: Comparison of time-frequency properties for a time series, its Fourier transform, short-time Fourier transform and wavelet transform (each panel: frequency versus time).

2.2 Mallat's Multiresolution Analysis

Mallat's multiresolution analysis provides a general framework to construct wavelet bases. The basic idea is to start from a father wavelet, derive an orthonormal mother wavelet and wavelet subspaces suitable to approximate functions with increasing resolution.

1. We start with the scaling function (father wavelet)

$$\phi_{0,k}(x) = \phi(x - k) \quad\text{with}\quad \phi(x) = \begin{cases} 1, & x \in [0, 1[ \\ 0, & x \notin [0, 1[ \end{cases}$$

The set $\{\phi_{0,k}\}$ is an orthonormal basis for a reference space $V_0$. The functions in $V_0$ have the form

$$f(x) = \sum_k c_k\,\phi(x - k),$$

which are constant functions on the intervals $[k, k+1[$. Consequently we can write

$$V_0 = \Big\{ f(x) = \sum_k c_k\,\phi(x - k) \Big\}.$$
2. Starting from $V_0$ we define linear spaces

$$V_1 = \{h(x) = f(2x) : f \in V_0\}$$
$$\vdots$$
$$V_j = \{h(x) = f(2^{j}x) : f \in V_0\}$$

$V_1$ contains all functions constant on $[\tfrac{k}{2}, \tfrac{k+1}{2}[$. The set $\{\phi_{1,k}\}$ is an orthonormal basis in $V_1$ with $\phi_{1,k}(x) = \sqrt{2}\,\phi(2x - k)$. Analogously, the basis functions of $V_j$ are $\phi_{j,k}(x) = 2^{j/2}\,\phi(2^{j}x - k)$.
$\phi$ generates a sequence of spaces $\{V_j,\ j \in \mathbb{Z}\}$ which are nested:

$$V_0 \subset V_1 \subset \dots \subset V_j \subset \dots, \qquad V_j \subset V_{j+1},\ j \in \mathbb{Z}.$$

If in addition every square integrable function can be approximated by functions in $\bigcup_{j \ge 0} V_j$ then $\{V_j,\ j \in \mathbb{Z}\}$ is a MRA (footnote 3).

Footnote 3: $\bigcup_{j \ge 0} V_j$ is dense in $L_2(\mathbb{R})$.

3. Our set of constant functions is not a basis in $L_2$. To obtain a basis we must orthogonalize it. This is achieved if we find a $W_0$ for which

$$V_1 = V_0 \oplus W_0.$$

This means that $W_0$ is the orthogonal complement of $V_0$ in $V_1$. Figure 2.4 shows this relationship schematically. It requires that $\phi_{1,k}$ must be a linear combination of $\phi_{0,k}$ and $\psi_{0,k}$:

$$\phi_{10}(x) = \sqrt{2}\,\phi(2x) = \sqrt{2}\,\{\phi_{00}(x) - \psi_{00}(x)\}/2 = \tfrac{1}{\sqrt{2}}\,\{\phi_{00}(x) - \psi_{00}(x)\}$$

and similarly

$$\phi_{11}(x) = \sqrt{2}\,\phi(2x - 1) = \sqrt{2}\,\{\phi_{00}(x) + \psi_{00}(x)\}/2 = \tfrac{1}{\sqrt{2}}\,\{\phi_{00}(x) + \psi_{00}(x)\}.$$

The system $\{\psi_{j,k};\ k \in \mathbb{Z}\}$, where $\psi_{j,k}(x) = 2^{j/2}\,\psi(2^{j}x - k)$, is an orthonormal basis in $W_j$.

Figure 2.4: $V_1 = V_0 \oplus W_0$

4. The graphical representation of these functions is depicted in Figure 2.5. The same process can be repeated for higher values of $j$. This leads to consecutive summation of subspaces

$$V_{j+1} = V_j \oplus W_j = V_{j-1} \oplus W_{j-1} \oplus W_j = \dots = V_0 \oplus W_0 \oplus W_1 \oplus \dots \oplus W_j = V_0 \oplus \bigoplus_{l=0}^{j} W_l.$$
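The splitting of $V_{j+1}$ into $V_j$ and $W_j$ in the steps above corresponds, for sampled data, to one step of the fast Haar transform: pairs of neighbouring coefficients are replaced by a scaled sum (coefficient in $V_j$) and a scaled difference (coefficient in $W_j$). The sketch below, in Python with NumPy, is an illustration and not the WAVELAB or NMRLab routine. The function names are made up, the signal length must be a power of two, and the sign of the difference follows from assuming a mother wavelet that is negative on the first half of its support.

```python
import numpy as np

def haar_step(s):
    """Split coefficients in V_{j+1} into V_j (approx) and W_j (detail)."""
    even, odd = s[0::2], s[1::2]
    approx = (even + odd) / np.sqrt(2)
    detail = (odd - even) / np.sqrt(2)
    return approx, detail

def haar_step_inverse(approx, detail):
    """Recombine V_j and W_j coefficients into V_{j+1} coefficients."""
    s = np.empty(2 * len(approx), dtype=np.result_type(approx, detail))
    s[0::2] = (approx - detail) / np.sqrt(2)
    s[1::2] = (approx + detail) / np.sqrt(2)
    return s

def haar_dwt(signal):
    """Full decomposition: returns (coarsest approx, details coarse-to-fine)."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while len(approx) > 1:
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details[::-1]

def haar_idwt(approx, details):
    """Inverse of haar_dwt; details are expected coarse-to-fine."""
    for d in details:
        approx = haar_step_inverse(approx, d)
    return approx

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
approx, details = haar_dwt(signal)
restored = haar_idwt(approx, details)
```

Each step touches every coefficient once and halves the length of the remaining approximation, which is why the total cost grows only linearly with the number of points.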
Figure 2.5: Basis functions $\phi_{j+1,k}$ can be written as linear combinations of basis functions $\psi_{0,k}$ and $\phi_{0,k}$.

Finally this sum of nested spaces spans $L_2(\mathbb{R})$:

$$L_2(\mathbb{R}) = V_0 \oplus \bigoplus_{l=0}^{\infty} W_l.$$

Consequently every square integrable function can be represented as the series

$$f(x) = \sum_k \alpha_{0,k}\,\phi_{0,k}(x) + \sum_{j=0}^{\infty} \sum_k \beta_{j,k}\,\psi_{j,k}(x).$$

The father wavelet $\phi$ creates a MRA. $\alpha_{0,k}$ and $\beta_{j,k}$ are the coefficients of the father and mother wavelets, respectively. This representation of $f$ provides a location in time and frequency. The location in time is determined by $k$ and the location in frequency by $j$. The larger $j$, the higher the frequency related to $\psi_{j,k}$.
It should be noted that the summation for $f(x)$ stops at $2^{j} - 1$ for finite data sets:

$$f(x) = \alpha_{00}\,\phi(x) + \sum_{j=0}^{n-1}\ \sum_{k=0}^{2^{j}-1} c_{j,k}\,\psi_{j,k}(x). \qquad (2.4)$$

Figure 2.6 depicts an example of a MRA for a synthetic NMR spectrum. It demonstrates that the approximation of the signal improves at higher values of $j$ where higher frequencies are resolved.

Figure 2.6: MRA of a synthetic spectrum.

2.3 Thresholding

Wavelet thresholding is the basis of wavelet based noise reduction. For a function $f$ observed with Gaussian noise, $y_i = f(t_i) + \sigma\varepsilon_i$ ($i \in \mathbb{N}$), this means that the function $f$ is restored.
Two types of thresholding were proposed by Donoho and Johnstone [7]. Hard thresholding is a simple "keep or kill" selection. All wavelet coefficients whose magnitude does not exceed a threshold $\lambda$ are zeroed:

$$c^{\mathrm{hard}}_{j,k} = \begin{cases} 0, & |c_{j,k}| \le \lambda \\ c_{j,k}, & |c_{j,k}| > \lambda \end{cases}$$

Soft thresholding shrinks the coefficients towards zero:

$$c^{\mathrm{soft}}_{j,k} = \begin{cases} c_{j,k} - \lambda, & c_{j,k} > \lambda \\ 0, & |c_{j,k}| \le \lambda \\ c_{j,k} + \lambda, & c_{j,k} < -\lambda \end{cases}$$
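The two thresholding rules translate directly into array operations. The following sketch in Python with NumPy is given for illustration only; the function names and the example coefficients are arbitrary choices.

```python
import numpy as np

def hard_threshold(c, lam):
    """Keep coefficients with |c| > lam, set all others to zero."""
    c = np.asarray(c, dtype=float)
    return np.where(np.abs(c) > lam, c, 0.0)

def soft_threshold(c, lam):
    """Zero coefficients with |c| <= lam and shrink the rest towards zero."""
    c = np.asarray(c, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)

coeffs = np.array([-3.0, -0.5, 0.0, 0.8, 2.0])
lam = 1.0
hard = hard_threshold(coeffs, lam)   # large coefficients are kept unchanged
soft = soft_threshold(coeffs, lam)   # large coefficients are reduced by lam
```

Hard thresholding leaves the surviving coefficients untouched, whereas soft thresholding is continuous in the coefficient value because every surviving coefficient is moved towards zero by $\lambda$.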
The two shrinkage methods are displayed in Figure 2.7. The most important step is now a proper choice of the threshold $\lambda$. Donoho and Johnstone proposed a universal threshold $\lambda = \sigma\sqrt{2\log n}/\sqrt{n}$, where $n$ is the sample size and $\sigma$ the scale of the noise on a standard deviation scale (footnote 4).

Footnote 4: $\sigma = \operatorname{median}\big(|c_{J-1,k} - \operatorname{median}(c_{J-1,k})|\big)/0.6745$

Figure 2.7: Hard and soft thresholding.

The overall procedure of noise suppression consists of a wavelet transform (WT) which yields the wavelet coefficients $c_{j,k}$, thresholding of these coefficients, followed by an inverse wavelet transform (IWT) which restores the original spectrum.

2.4 Smoother Wavelet Bases

Although the Haar wavelet is convenient to describe the basics of wavelet transforms it is not suitable for most wavelet applications because too many coefficients are required to approximate a signal. The description of the design of smoother wavelet bases is far beyond the scope of this script. Wavelet bases must form an orthonormal basis for $L_2(\mathbb{R})$. Wavelets with good smoothing properties are designed to minimize the wavelet coefficients for smooth functions. The number of such constraints applied during the design of wavelets determines the number of vanishing moments. Wavelets of the Daubechies family shown in Figure 2.8 also have compact support, which is important for noise reduction. Daubechies wavelets have vanishing moments for mother but not for father wavelets and are fairly asymmetric. Coiflets have additional vanishing moments for father wavelets. Symmlets are as close to symmetry as possible.

Figure 2.8: Types of wavelets from left to right: Daubechies (4), Coiflet (5) and Symmlet (8).

Basic properties of wavelets with compact support:

Daubechies' wavelets:
  $\operatorname{supp}\,\phi \subseteq [0, 2N-1]$
  $\operatorname{supp}\,\psi \subseteq [-N+1, N]$
  $\int x^{l}\,\psi(x)\,dx = 0,\quad l = 0, \dots, N-1$
  For the D4 Daubechies wavelet: $\int \psi(x)\,dx = 0$, $\int x\,\psi(x)\,dx = 0$.
  Not symmetric.

Coiflets:
  $N = 2K$
  $\operatorname{supp}\,\phi \subseteq [-2K, 4K-1]$
  $\operatorname{supp}\,\psi \subseteq [-4K+1, 2K]$
  $\int x^{l}\,\phi(x)\,dx = 0,\quad l = 1, \dots, N-1$
  $\int x^{l}\,\psi(x)\,dx = 0,\quad l = 0, \dots, N-1$
  Coiflets are not symmetric.

Symmlets:
  $\operatorname{supp}\,\phi \subseteq [0, 2N-1]$
  $\operatorname{supp}\,\psi \subseteq [-N+1, N]$
  $\int x^{l}\,\psi(x)\,dx = 0,\quad l = 0, \dots, N-1$
  Symmlets are not symmetric.
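The two vanishing moments quoted for the D4 Daubechies wavelet have a discrete counterpart that is easy to verify: the first two moments of the associated high-pass (wavelet) filter vanish. The sketch below uses Python with NumPy. The four low-pass coefficients are the commonly quoted D4 values and are not taken from these notes; the alternating-flip construction of the high-pass filter is one common convention and is likewise an assumption here.

```python
import numpy as np

sqrt3 = np.sqrt(3.0)

# Commonly quoted D4 low-pass (scaling) filter, normalized to sum sqrt(2).
h = np.array([1 + sqrt3, 3 + sqrt3, 3 - sqrt3, 1 - sqrt3]) / (4 * np.sqrt(2))

# High-pass (wavelet) filter by alternating flip: g_k = (-1)^k h_{3-k}.
k = np.arange(4)
g = (-1.0) ** k * h[::-1]

lowpass_sum = np.sum(h)          # sqrt(2) for an orthonormal scaling filter
lowpass_energy = np.sum(h ** 2)  # 1 for an orthonormal filter
moment0 = np.sum(g)              # zeroth discrete moment of the wavelet filter
moment1 = np.sum(k * g)          # first discrete moment of the wavelet filter
moment2 = np.sum(k ** 2 * g)     # second moment, not expected to vanish
```

Constant and linear trends therefore produce zero detail coefficients away from the signal edges, which is the sense in which smoother wavelets "minimize the wavelet coefficients for smooth functions".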
2.5 Applications in NMR Spectroscopy
MRA and wavelet shrinkage have useful applications in NMR spectroscopy. MRA of the FID can be used to remove slow frequencies from the FID, i.e. to remove on-resonance components. Wavelet shrinkage has been used to denoise one- and multi-dimensional NMR spectra.

2.5.1 WAVEWAT

Figure 2.9 shows a multiresolution decomposition of an FID. The high- and low-frequency components are nicely separated.

Figure 2.9: Top: Multiresolution plot (dyad versus t) of an FID from a 15N-HSQC spectrum recorded at 500 MHz using a 1.2 mM protein sample. Bottom: Original FID and FID recovered from the MRA (gray), J = 7 reconstruction.

Edge effects seen in Figure 2.9 are minimized when mirror reflections of the FID are used for the MRA. A basic limitation is the fact that the number of dyadic levels is limited by $2^{J} = N$, because the number of dyadic levels determines the width of the filter. A sufficiently large number of data points for a reasonably narrow filter width can be obtained by repeated zero filling.
An example of a DWT water suppression is illustrated in Figure 2.10. The signal originated from a 15N-HSQC spectrum of an SH2 domain. The low intensity signal close to water was added synthetically to the experimental FID to demonstrate the effect of the filter. It can hardly be detected in Figure 2.10A underneath the strong water resonance, which is not in phase with the rest of the spectrum. In Figure 2.10B the WAVEWAT-filtered spectrum (zero filling once, ZF = 2; reconstruction with J = 7) is shown. Peak shapes and intensities are recovered perfectly. For comparison the effect of a convolution filter employing a 32-point Gaussian apodization function is shown (Figure 2.10C). This algorithm was originally proposed by Marion et al. [4]. Figure 2.11 demonstrates the principles of this filter. Here the signal close to water is barely recovered and significantly distorted. Neither the WAVEWAT nor the convolution filter distorts off-resonance peaks in the spectrum.

Figure 2.10: Spectrum obtained from the FID shown in Figure 2.9 after Fourier transformation. The small signal close to water was added synthetically to the experimental FID. A: Fourier transformed spectrum without prior water suppression. B: WAVEWAT was applied to the FID prior to Fourier transformation; the signal was zero-filled to 2048 points prior to MRA; the signal was recovered after rejecting 7 levels as shown in Figure 2.9; a symmlet (8) wavelet was used for the wavelet transform. C: water suppression was achieved by a convolution of the FID with a 32-point Gaussian window.

Figure 2.11: Convolution filter for noise suppression.

Noise levels were calculated for all columns of a two-dimensional HSQC spectrum after convolution water suppression and WAVEWAT water suppression. For convolution water suppression the noise level is reduced over an area of at least 80 points close to the water signal (Figure 2.12A). For WAVEWAT the area in which signals are distorted is much narrower and the edge of the filter is sharper. This is demonstrated in Figure 2.12B where a symmlet wavelet with 8 vanishing moments (Symmlet(8)) has been used after zero filling to 1024 points and J = 7. In Figure 2.12C the same wavelet is used after zero filling to 1024 points using J = 9. In Figure 2.12B the area in which signals are suppressed is narrower and the edges of the water signal become visible. In this case signals close to water will be recovered without distortion of signal intensities. Although approximately 20 points close to the water resonance were eliminated in Figure 2.12C, the spectral area which is affected by the filter is much smaller than that in the case of a convolution filter.

Figure 2.12: Noise levels calculated for the incremented dimension. A: after water suppression using time-domain convolution; B: after water suppression using WAVEWAT after zero-filling to 1024 points with J = 7; C: after water suppression using WAVEWAT after zero-filling to 1024 points with J = 9.
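The idea behind removing on-resonance components by MRA can be illustrated with a deliberately simple sketch: decompose the FID over a number of dyadic levels, discard the coarse approximation that carries the slow variations, and reconstruct from the detail levels only. The Python/NumPy code below is not the WAVEWAT implementation of NMRLab. It uses the Haar wavelet instead of a symmlet, applies no mirror reflection or zero filling, and the function name `remove_slow_component` and its `depth` parameter (number of decomposition steps) are assumptions made for the example.

```python
import numpy as np

def remove_slow_component(fid, depth):
    """Haar MRA filter: drop the coarse approximation after `depth` steps."""
    fid = np.asarray(fid, dtype=complex)
    if len(fid) % 2 ** depth:
        raise ValueError("length must be a multiple of 2**depth")
    approx = fid.copy()
    details = []
    for _ in range(depth):                     # analysis, fine to coarse
        even, odd = approx[0::2], approx[1::2]
        details.append((odd - even) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    approx = np.zeros_like(approx)             # discard slow variations
    for d in reversed(details):                # synthesis, coarse to fine
        s = np.empty(2 * len(approx), dtype=complex)
        s[0::2] = (approx - d) / np.sqrt(2)
        s[1::2] = (approx + d) / np.sqrt(2)
        approx = s
    return approx

n = np.arange(64)
fast = np.exp(2j * np.pi * 0.25 * n)   # off-resonance signal, period 4 points
fid = 5.0 + fast                        # strong on-resonance (constant) part
filtered = remove_slow_component(fid, depth=3)
```

With the Haar wavelet the discarded approximation is simply the mean over blocks of $2^{\text{depth}}$ points, so a larger depth removes a narrower band around zero frequency; this mirrors the statement that the number of dyadic levels determines the width of the filter.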
2.5.2 Noise Suppression

The basis of noise suppression by wavelet shrinkage was described in chapter 2.3. It is achieved by performing a wavelet transform and applying a threshold to the wavelet coefficients. This is illustrated in Figure 2.13 which shows a one-dimensional spectrum before and after noise suppression together with the corresponding wavelet coefficients.

Figure 2.13: Wavelet shrinkage using a Daubechies (4) wavelet and soft thresholding. A: original spectrum; B: denoised spectrum; C: wavelet coefficients; D: wavelet coefficients after soft thresholding (C and D: j versus k).

Figure 2.14 shows an extreme case of an NMR spectrum with low signal-to-noise.

Figure 2.14: Left: Original HSQC spectrum. Middle: HSQC spectrum after baseline correction. Right: HSQC spectrum after wavelet shrinkage using a Symmlet (8) and hard-thresholding. (Axes: D1 [ppm], D2 [ppm].)

2.5.3 Noise suppression in three-dimensional spectra

Figure 2.15 shows the effect of wavelet shrinkage on two-dimensional slices from a three-dimensional 15N-NOESY spectrum. Here a Daubechies (4) wavelet (with the smallest support among commonly used wavelets) gave the best results. Noise reduction had to be applied in both proton dimensions of the 15N-NOESY-HSQC spectrum. The two spectra were plotted at comparable levels and peak picking was performed at the same levels for both spectra. In the case of the original spectrum 203 peaks were obtained whereas 101 peaks were obtained after noise reduction.
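The shrinkage procedure referred to in sections 2.3 and 2.5.2 (wavelet transform, estimation of the noise scale from the finest coefficients, thresholding, inverse transform) can be put together in one short sketch. The Python/NumPy code below is an illustration under several assumptions: it uses the Haar wavelet rather than a Daubechies or Symmlet basis, a signal length that is a power of two, and an orthonormal transform applied to the raw samples. With that normalization each coefficient carries noise of standard deviation $\sigma$, so the sketch uses the threshold $\sigma\sqrt{2\log n}$ without the additional factor $1/\sqrt{n}$, which belongs to a convention in which the coefficients are scaled by $1/\sqrt{n}$.

```python
import numpy as np

def haar_dwt(signal):
    """Orthonormal Haar transform: (coarsest approx, details coarse-to-fine)."""
    approx = np.asarray(signal, dtype=float)
    details = []
    while len(approx) > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append((odd - even) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return approx, details[::-1]

def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    for d in details:
        s = np.empty(2 * len(approx))
        s[0::2] = (approx - d) / np.sqrt(2)
        s[1::2] = (approx + d) / np.sqrt(2)
        approx = s
    return approx

def denoise(y):
    """Soft-threshold all detail levels with a universal threshold."""
    approx, details = haar_dwt(y)
    finest = details[-1]
    # Robust noise estimate from the finest level (median absolute deviation).
    sigma = np.median(np.abs(finest - np.median(finest))) / 0.6745
    lam = sigma * np.sqrt(2.0 * np.log(len(y)))
    shrunk = [np.sign(d) * np.maximum(np.abs(d) - lam, 0.0) for d in details]
    return haar_idwt(approx, shrunk), sigma, lam

rng = np.random.default_rng(1)
n = 1024
clean = np.where(np.arange(n) < n // 2, 0.0, 4.0)   # synthetic step signal
noisy = clean + 0.5 * rng.standard_normal(n)
denoised, sigma_hat, lam = denoise(noisy)
mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
```

The step signal is chosen because it is sparse in the Haar basis; for real spectra a smoother wavelet, as used in the figures of this chapter, would be the natural choice.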
Figure 2.15: Two-dimensional slices from a three-dimensional 15N-NOESY spectrum of sulfide dehydrogenase (SUD) before and after noise suppression using a Daubechies (4) wavelet and soft-thresholding. (Axes: D1 [ppm], D2 [ppm].)

3 SVD-based methods

3.1 SVD-based noise and signal suppression

Singular value decomposition (SVD) has been used for many purposes in signal processing. SVD decomposes a matrix

$$H = U_{L \times L}\ S_{L \times M}\ V^{\dagger}_{M \times M}.$$

$U$ and $V$ are $(L \times L)$ and $(M \times M)$ unitary matrices (footnote 1), $S$ is $(L \times M)$ in size and contains the singular values. The singular values correspond to signal components in $H$. Large singular values belong to large signals in $H$, small singular values are often noise related. This feature is often used in signal processing because it allows one to distinguish between small and large signal components. Noise related singular values are often nicely separated from those related to signals. This principle was first used in signal processing by Cadzow [2]. In the following paragraphs a series of applications derived from this feature of the SVD will be presented.

Footnote 1: A matrix $U$ is unitary if $UU^{\dagger} = E$. For real numbered matrices this is equivalent to orthogonal matrices.

For all applications a matrix $H$ must be derived from the FID before anything can be started. For an FID

$$f = (s_0, s_1, \dots, s_{N-1})$$

this is achieved by forming a Hankel type matrix by using a fraction of the FID which is shifted by one point in each row of the matrix:

$$H = \begin{pmatrix} s_0 & s_1 & s_2 & \cdots & s_{M-1} \\ s_1 & s_2 & s_3 & \cdots & s_{M} \\ \vdots & \vdots & \vdots & & \vdots \\ s_{L-1} & s_{L} & s_{L+1} & \cdots & s_{N-1} \end{pmatrix}.$$

For an FID which does not contain any noise the rank of this matrix is equivalent to the number of signals in the FID. However, if the signal contains noise the rank will be full ($M$).
The SVD picks up periodicities in the FID. Periodic components in the FID will be represented by singular values. On the diagonal of $S$ the singular values are sorted by size. This makes it very easy to select signal components. An example is presented in Figure 3.1. The noisy test signal contains three distinct signals which are represented by three distinct singular values. In principle it is straightforward to use this procedure to suppress noise. If all singular values below the blue line are set to zero and the FID is restored from $H$, a noise-free signal is obtained. The difficulty is always to find the noise level.

Figure 3.1: Left: NMR spectrum with three signals. Right: singular values sorted by size. Blue line: noise level.

A differential plot of the singular values (Figure 3.2) shows that noise-related singular values have small positive differential numbers. These values are positive because the singular values were ordered by size and they are small because the difference between adjacent singular values is small. This differential plot can help to define the noise level without visual inspection. If the number of data points is large compared to the number of signals in the spectrum, the mean value $\Delta S$ of the differences between the last singular values can be used to calculate the noise level by extrapolating the line back to the first singular value: $S_{\mathrm{thresh}} = N\,\Delta S$. For the shown example the value of $\Delta S$ was 0.0160; the noise level (blue line) was calculated with $N = 2/3 \cdot 128 \simeq 86$, which gives $S_{\mathrm{thresh}} \simeq 86 \cdot 0.0160 \simeq 1.37$.

Figure 3.2: Differential plot of singular values.

It is important to note that the number of singular values is twice the number of signal components for real signals. If the complex points are reconstructed employing a Hilbert transform (footnote 2) this introduces erroneous signal components in the singular values shown in Figure 3.3. These errors are typically on the order of $3\,S_{\mathrm{thresh}}$.

Footnote 2: The Hilbert transform must be applied in the frequency domain. This requires a Fourier transform prior to the Hilbert transform and an inverse Fourier transform to obtain the complex FID.

Figure 3.3: Singular values calculated for the spectrum from Figure 3.1 after Hilbert transform and inverse Fourier transform.

The SVD applied to a Hankel matrix derived from an FID, combined with the described selection of a noise level, can be used to determine the number of signal components in an FID. This is illustrated in Figure 3.4 which shows 5 spectra with complex line shapes and varying signal intensity.

Figure 3.4: Test signals for signal component analysis (traces labeled 283K, 293K, 305K, 308K and 311K).

Figure 3.5 depicts the corresponding signal component analysis using automatic noise level detection. Singular values above the noise level are marked by an extra circle. The results are in good agreement with the number of signal components seen by visual inspection.
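The construction of the Hankel matrix, the link between its rank and the number of signals, and the restoration of an FID after zeroing small singular values can be sketched as follows. This is a minimal illustration in Python with NumPy, not the NMRLab routine. The function names, the choice of 64 rows, the test frequencies and the fixed rank of 3 (instead of an automatically detected noise level) are assumptions for the example; restoring the FID by averaging over anti-diagonals is one common way to return from the truncated matrix to a signal.

```python
import numpy as np

def hankel_matrix(fid, n_rows):
    """H[i, j] = s_{i+j}: each row is the FID shifted by one point."""
    fid = np.asarray(fid)
    n_cols = len(fid) - n_rows + 1
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    return fid[idx]

def hankel_to_signal(H):
    """Restore a signal by averaging the anti-diagonals of H."""
    n_rows, n_cols = H.shape
    out = np.zeros(n_rows + n_cols - 1, dtype=H.dtype)
    counts = np.zeros(n_rows + n_cols - 1)
    for i in range(n_rows):
        out[i:i + n_cols] += H[i]
        counts[i:i + n_cols] += 1
    return out / counts

def truncate_svd(H, rank):
    """Keep only the `rank` largest singular values of H."""
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vh[:rank]

n = 128
t = np.arange(n)
components = [(1.0, 0.05, 0.010), (0.7, 0.12, 0.020), (0.5, 0.31, 0.015)]
clean = sum(a * np.exp((2j * np.pi * f - d) * t) for a, f, d in components)

# Noise-free case: the number of non-negligible singular values is 3.
s_clean = np.linalg.svd(hankel_matrix(clean, 64), compute_uv=False)
n_signals = int(np.sum(s_clean > 1e-8 * s_clean[0]))

# Noisy case: truncate to rank 3 and restore the FID.
rng = np.random.default_rng(0)
noisy = clean + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
denoised = hankel_to_signal(truncate_svd(hankel_matrix(noisy, 64), rank=3))
err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
```

In practice the rank would be chosen from the singular value plot or from a threshold such as $S_{\mathrm{thresh}}$ rather than being known in advance.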
Figure 3.5: Singular values for the 5 spectra shown in Figure 3.4 (panels labeled 283K, 293K, 305K, 308K and 311K).

The same principle was used by Cadzow to suppress the noise level in signals [2]. The principle is shown in Figure 3.6a. By dropping the noise related singular values an almost noise-free spectrum is obtained.
The procedure can also be used in the other direction to suppress signals. This is demonstrated in Figure 3.6b where the largest signal component was zeroed. This eliminated one signal in the corresponding spectrum. For equally sized signals the choice of the peak which will be eliminated will be random. However, if the largest signal is to be eliminated the method is very stable. This algorithm has been used in NMR processing to eliminate large diagonal peaks in NOESY spectra [22]. It is almost routinely used to suppress the water signal in in vivo MRS [20, 23]. Unfortunately this algorithm cannot be used for high-resolution NMR spectra and for large multi-dimensional spectra because the SVD is an O((L M)^2 M) process.

Figure 3.6: Cadzow noise suppression and signal suppression (panels a to d; annotations "signals = 3" and "signals = -1"; x-axis: points).

3.2 Linear prediction

Linear prediction has in principle nothing to do with SVD. However, SVD is frequently used to determine LP coefficients. This version of linear prediction is called LP-SVD. The LP-SVD algorithm was originally described by Kumaresan and Tufts [16] and modified by Porat and Friedlander.
The forward linear prediction model assumes that a data point can be described by a linear combination of the $K$ preceding points,

$$x_n = \sum_{k=1}^{K} a_k\,x_{n-k}, \qquad (3.1)$$

the backward linear prediction model by a linear combination of the $K$ following points:

$$x_n = \sum_{k=1}^{K} b_k\,x_{n+k}.$$

Equation 3.1 can be written in matrix form as

$$x = Y a \qquad (3.2)$$
from which the coefficients a can be obtained by an inversion of Y. Inversion of Y can be achieved by SVD,
$$Y = U \Lambda V^{\dagger},$$
from which the inverse $Y'$ is obtained by
$$Y' = V \Lambda^{-1} U^{\dagger}.$$
Now a can be computed by simple matrix algebra, multiplying Equation 3.2 from the right by $Y'$:
$$a = x V \Lambda^{-1} U^{\dagger}.$$
In the presence of noise K must be larger than the number of frequency components; it is typically set to between N/4 and N/3.
Using these prediction coefficients $a_k$ the frequencies and damping factors are obtained by calculating the roots of the polynomial
$$z^K - a_1 z^{K-1} - \dots - a_K = 0.$$
The signal $x_n$ is described by
$$x_n = \sum_{k=1}^{K} A_k z_k^{\,n-1}. \qquad (3.3)$$
Now the $A_k$, which encode phase and amplitude of each signal component, can be determined by an additional SVD. Equation 3.3 can be rewritten in matrix form
$$x = AZ.$$
A can be determined by inverting Z, which can be achieved employing a second SVD.
These procedures are repeated for backward linear prediction. Forward and backward linear prediction yield two sets of polynomial roots. For forward linear prediction the complex signal roots lie inside the unit circle. For backward linear prediction they lie outside the unit circle. For forward linear prediction the roots inside the circle are related to signal, those outside to exponentially increasing signals or to noise. Several procedures were proposed to minimize linear prediction artefacts. Root reflection replaces roots $z_k$ outside the unit circle by $z_k / |z_k|^2$. This ensures that all signals are within the unit circle and noise-related signals are eliminated. If forward and backward linear prediction coefficients were calculated it is common to use the average of the reflected roots for forward and backward linear prediction.
It is important to note that the SVD steps used to calculate the linear prediction coefficients $a_k$ (and $b_k$) can be used to eliminate noise-related signals by truncating $\Lambda$ to include solely large singular values which correspond to signals rather than noise.
Linear prediction is used to predict additional data points to improve resolution. This can be helpful to avoid truncation artefacts. It can also be applied to enhance signal-to-noise.
Figure 3.7: LPSVD. Top: Simulated FID (256 points, *) and predicted FID (512 points, solid line). Middle: Spectrum calculated after zero filling. Bottom: Spectrum after lpsvd.
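The determination of forward prediction coefficients with a truncated SVD, and their use to extend an FID, can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions, not the lpsvd2 routine of NMRLab: the function names, the prediction order of 16 and the noise-free test signal are invented for the example, and the data are arranged as column vectors (Y a = x), which is the transposed form of Equation 3.2.

```python
import numpy as np


def lp_coefficients(x, order, n_signals):
    """Forward LP coefficients a_1..a_K from a truncated-SVD pseudo-inverse."""
    # Row n of Y holds the K points preceding x[n]: x[n-1], ..., x[n-K].
    y = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    target = x[order:]
    u, s, vh = np.linalg.svd(y, full_matrices=False)
    # Invert only the largest singular values; the rest are treated as noise.
    s_inv = np.zeros_like(s)
    s_inv[:n_signals] = 1.0 / s[:n_signals]
    return vh.conj().T @ (s_inv * (u.conj().T @ target))


def lp_extend(x, coeffs, n_new):
    """Append n_new points predicted from the preceding len(coeffs) points."""
    order = len(coeffs)
    out = np.concatenate((x, np.zeros(n_new, dtype=complex)))
    for n in range(len(x), len(out)):
        out[n] = np.dot(coeffs, out[n - order:n][::-1])
    return out


def two_line_fid(n):
    """Noise-free test signal with two damped exponentials (made-up values)."""
    t = np.arange(n)
    return np.exp((-0.02 + 0.9j) * t) + 0.6 * np.exp((-0.03 + 2.1j) * t)


fid = two_line_fid(64)
a = lp_coefficients(fid, order=16, n_signals=2)
extended = lp_extend(fid, a, n_new=64)
exact = two_line_fid(128)
max_error = np.max(np.abs(extended - exact))
print(len(extended), max_error)
```

With noisy data the choice of n_signals decides which singular values are kept, so it plays the role of the truncation of the singular value matrix described above.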
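The step from prediction coefficients to frequencies and damping factors, including root reflection, can be sketched in the same way. The helper names and the two example roots are assumptions for illustration; the frequency is returned in radians per data point and the damping as the negative logarithm of the root modulus, which is one common convention rather than necessarily the one used in NMRLab.

```python
import numpy as np


def lp_roots(coeffs):
    """Roots of z^K - a_1 z^(K-1) - ... - a_K for forward LP coefficients."""
    return np.roots(np.concatenate(([1.0], -np.asarray(coeffs))))


def reflect_roots(roots):
    """Move roots outside the unit circle inside: z -> z / |z|^2."""
    roots = np.asarray(roots, dtype=complex)
    outside = np.abs(roots) > 1.0
    reflected = roots.copy()
    reflected[outside] = roots[outside] / np.abs(roots[outside]) ** 2
    return reflected


def frequency_and_damping(roots):
    """Frequency from the phase of each root, damping from its modulus."""
    return np.angle(roots), -np.log(np.abs(roots))


# Made-up example: one decaying root and one exponentially growing root.
true_roots = np.array([0.8 * np.exp(0.5j), 1.25 * np.exp(-1.0j)])
coeffs = -np.poly(true_roots)[1:]       # a_1 ... a_K of the LP polynomial
found = lp_roots(coeffs)
reflected = reflect_roots(found)
freq, damp = frequency_and_damping(reflected)
print(np.sort(freq), damp)
```

Reflection leaves the phase of a root, and therefore its frequency, unchanged; only the modulus is inverted, which turns a growing exponential into a decaying one.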
4 Using NMRLab
The algorithms described in this course were implemented in NMRLab [10]. NMRLab uses MATLAB (The MathWorks). Some of the most basic tools and data structures in NMRLab are described here.
The NMRDAT structure
NMRDAT is a global structure which is set up by the nmrlab.m script, usually executed from STARTUP. Some fields can be edited, saved and restored using BROWSE, but NMRDAT can also be read manually on the MATLAB command line. Table 4.1 lists the field names in NMRDAT.
Table 4.1: NMRDAT structure
NMRDAT field name   Description
NAME                Dataset name. Used for saving substructure to disk.
SER                 Converted SER file.
MAT                 Processed data matrix.
ACQUS               Acquisition parameters (3D).
PROC                Processing parameters (3D).
DISP                Display parameters.
NAME is a string describing the name of the .mat file when data is saved to disk with the browse command. This is the only use of NAME. It has no influence on the name of the data after it is retrieved because a data set will always become an element in the NMRDAT structure.
SER is simply an array of FIDs in the order they were recorded. The fast (t2) dimension is in columns. MAT is the processed data matrix with dimension 1 in rows. This is not consistent, but it saves time when data is displayed because the contour command plots matrix rows as rows, the way we usually look at two-dimensional NMR data. MAT is set to '-1' when a matrix is saved to disk and deleted from RAM.
ACQUS contains the acquisition parameters which were read from the BRUKER acqus, acqu2s and acqu3s files with the readacqus command. It is itself a two- or three-dimensional array which contains the information for the three spectral dimensions.
PROC has a similar structure and holds the processing parameters (set in EDP). It is also a three-dimensional array of structures.
NMRPAR is initialized as a global structure by nmrlab.m and holds all basic parameters required to run NMRLAB. The fields of NMRPAR define availability of RAM, computer type (detected by nmrlab.m) and other parameters.
NMR processing functions in NMRLAB.
Function     Description
hft          Hilbert transform
fft          Fast Fourier transform
ift          Inverse fast Fourier transform
dft          Fourier transform of BRUKER digitally filtered data
rft          Real Fourier transform for TPPI-type data
smo          smooth = polynomial solvent filter [3]
sol          solvent filter by time-domain convolution [4]
wavewat      WAVEWAT water suppression
wdwf2        Window functions (gm, em, sine bell, cubic sine bell)
baseline2    Different algorithms for baseline correction
chi2_flatt   Calculate chi2 vector for FLATT
absc         call and setup for FLATT baseline flattening (calls flatten2)
lpsvd2       SVD-based linear prediction [24, 16]
lpx2         Linear prediction (LPC, Prony, or Steiglitz-McBride)
rev          reverse data
cshl         circular shift left
cshr         circular shift right
shl          shift left
shr          shift right
revm         reverse data in one dimension
revm2        reverse matrix in both dimensions
transm       transpose real or complex matrix (same as ctranspose in MATLAB)
phase        phase vector or matrix (called by uiphase)
strip        cut a strip out of a matrix
cadzow2      perform Cadzow algorithm on a two-dimensional matrix (uses CADZOW)
MATLAB built-in function. Requires MATLAB signal processing tool box.
Control functions in NMRLAB.
Function       Description
nmrlab         MATLAB script to set up parameters for NMRLAB
re             read raw data from disk
relist         read series of experiments
readser        read BRUKER ser files
readacqus      read parameters from BRUKER acqus files
snd            show data sets and sizes in NMRDAT
uiphase        interactive phase correction
uicont         interactive contour plotting
sartitr        analyze series of 2D NMR spectra (e.g. SAR by NMR series)
edp            edit processing parameters
edd            edit display parameters
browse         browse, edit, save, export and load NMRDAT contents
xfb            process two-dimensional data
xfall          process series of 2D data sets (e.g. SAR by NMR series)
tf             process three-dimensional data
absc           2D/3D post-processing baseline correction
phasend        2D/3D post-processing phase correction
denoise        2D/3D post-processing wavelet denoising
xyztranspose   transpose 3D structures
makespc        utility to create synthetic spectra
MATLAB 5.3 will automatically activate graphical tools and zoom options.
For MATLAB 5.0-5.2 use the zoom and plotedit commands instead.
Table 4.3:
Table 4.2: All processing commands can be executed from the command line.
However, usually processing routines are called from the processing
functions xfb and tf. Help for all commands is available by typing
help(’command’).
Wavelet Shrinkage Parameters in NMRLAB
Parameter   Meaning                                   Values
qmf_type    wavelet type (quadrature mirror filter)   Haar, Beylkin, Coiflet, Symmlet, Daubechies, Vaidyanathan
par         QMF parameter                             Coiflet: 1-5 (3). Daubechies: 4, 6, 8, 10, 12, 14, 16, 18, 20. Symmlet: 4-10 (8).
thr_type    type of shrinkage                         hard, soft, SURE, Hybrid, MinMax, MAD
L           low-frequency cutoff for shrinkage        must be << J, with N = 2^J (2-4); N = number of data points
normalize   normalize noise
WT_type                                               periodized orthogonal, fully translation invariant
thr         threshold value                           manual entry, universal $\sqrt{\log(n)}$; n = number of data points
2D and 3D versions of the normalization have been implemented in NMRLAB.
Other wavelet transforms (e.g. the Meyer wavelet transform) are available in WaveLab.
Parameters which yield good results for NMR spectroscopy have been italicized.
Table 4.4:
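The difference between the hard and soft entries of thr_type, and the meaning of a universal threshold, can be made concrete with a few lines of Python. This is a sketch under assumptions, not NMRLAB code: the function names are invented, the coefficients are taken to be real numbers, and the universal threshold is written in the Donoho-Johnstone form sigma * sqrt(2 ln n) [7] with the noise level sigma passed in explicitly. The prefactor and the noise estimate used inside NMRLAB may differ.

```python
import math


def universal_threshold(n, sigma=1.0):
    """Donoho-Johnstone universal threshold for n data points (assumed form)."""
    return sigma * math.sqrt(2.0 * math.log(n))


def hard_threshold(coeffs, thr):
    """Keep coefficients larger than thr in magnitude, zero the others."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]


def soft_threshold(coeffs, thr):
    """Zero small coefficients and shrink the remaining ones towards zero."""
    out = []
    for c in coeffs:
        shrunk = abs(c) - thr
        out.append(math.copysign(shrunk, c) if shrunk > 0 else 0.0)
    return out


# Made-up wavelet coefficients and a manually entered threshold.
coeffs = [4.0, -0.5, 1.5, -3.0, 0.2]
thr = 2.0
kept_hard = hard_threshold(coeffs, thr)
kept_soft = soft_threshold(coeffs, thr)
thr_universal = universal_threshold(1024)
print(kept_hard, kept_soft, thr_universal)
```

Hard thresholding preserves the amplitude of the surviving coefficients, while soft thresholding reduces them by the threshold value, which gives a smoother result at the cost of lower peak intensities.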
Type of LP   Algorithm                     Parameter vector
lpsvd        SVD-based linear prediction   [FID, start, stop, poles, signals]
prony        Prony's method                [FID, NB, NA]
stmbc        Steiglitz-McBride             [FID, NB, NA, N]
lpc          LPC                           [FID, N]
Lpsvd does not give a choice regarding forward vs. forward-backward linear prediction because forward linear prediction is prone to pick up faulty noise peaks and should therefore never be used without user interaction.
Bibliography
[1] J.B. Buckheit and D.L. Donoho. Wavelab and reproducible research. In A. Antoniadis and G. Oppenheim, editors, Wavelets and Statistics, pages 53–81. Springer, Berlin, 1995.
[2] J. Cadzow. Signal enhancement. A composite property mapping algorithm. IEEE Trans. Acoust. Speech Signal Process., ASSP-36:49–62, 1988.
[3] P.T. Callaghan, A.L. MacKay, K.P. Pauls, O. Soderman, and M. Bloom. J. Magn. Reson., 56:101–109, 1984.
[4] D. Marion, M. Ikura, and A. Bax. Improved solvent suppression in one- and two-dimensional NMR spectra by convolution of time-domain data. J. Magn. Reson., 84:425–430, 1989.
[5] I. Daubechies. Ten Lectures on Wavelets. SIAM, Philadelphia, 1992.
[6] Marc A. Delsuc, Feng Ni, and George C. Levy. Improvement of linear prediction of NMR spectra having very low signal-to-noise. J. Magn. Reson., 73:548–552, 1987.
[7] D.L. Donoho and I.M. Johnstone. Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81:425–455, 1994.
[8] D.L. Donoho and I.M. Johnstone. Adapting to unknown smoothness via wavelet shrinkage. J. of Amer. Stat. Assoc., 90:1200–1224, 1995.
[9] Amara Graps. An introduction to wavelets. IEEE Computational Science and Engineering, 2:1–18, 1995.
[10] UL Günther, C Ludwig, and H Rüterjans. NMRLAB - Advanced NMR data processing in Matlab. J. Magn. Reson., 145(2):101–108, 2000.
[11] UL Günther, C Ludwig, and H Rüterjans. Improved automatic structure calculation using wavelet denoised data. In preparation, 2001.
[12] UL Günther, C Ludwig, and H Rüterjans. WAVEWAT - Improved solvent
suppression in NMR spectra employing wavelet transforms. submitted to
J. Magn. Reson., 2001.
[13] W. Härdle, G. Kerkyacharian, D. Picard, and A. Tsybakov. Wavelets, Approximation, and Statistical Applications. Springer, 1998.
[14] Jeffrey C. Hoch and Alan S. Stern. NMR data processing. 1996.
[15] B Jawerth and W Sweldens. An overview of wavelet based multiresolution
analysis. SIAM Review, 36:377–412, 1994.
[16] R. Kumaresan and D.W. Tufts. Estimating the parameters of exponentially damped sinusoids and pole-zero modeling in noise. IEEE Trans. Acoust. Speech Signal Process., ASSP-30:833–840, 1982.
[17] S. Mallat. A wavelet tour of signal processing. Academic Press, 1998.
[18] Yves Nievergelt. Wavelets Made Easy. Birkhäuser, 1999.
[19] R. Todd Ogden. Essential wavelets for statistical applications and data analysis. Birkhäuser, 1997.
[20] WWF Pijnappel, A van den Boogaart, R de Beer, and D van Ormondt.
SVD-based quantification of magnetic resonance signals. J. Magn. Reson.,
97:122–134, 1992.
[21] Brani Vidakovic and Peter Müller. Wavelets for kids. Available at http://www.isye.gatech.edu/ brani/wavelet.html, 1994.
[22] G. Zhu, W.Y. Choy, G. Song, and B.C. Sanctuary. Suppression of diagonal
peaks with singular value decomposition. J. Magn. Reson., 132:176–178,
1998.
[23] G Zhu, D Smith, and Y Hua. Post-acquisition solvent suppression by
singular-value decomposition. J. Magn. Reson., 124(1):286–9, 1997.
[24] Guang Zhu and Ad Bax. Improved linear prediction of damped NMR signals using modified "forward-backward" linear prediction. J. Magn. Reson., 100:202–207, 1992.