Automatica 49 (2013) 1601–1613
Generalized adaptive comb filters/smoothers and their application to
the identification of quasi-periodically varying systems and signals✩
Maciej Niedźwiecki 1 , Michał Meller
Faculty of Electronics, Telecommunications and Computer Science, Department of Automatic Control, Gdańsk University of Technology,
Narutowicza 11/12, 80-233 Gdańsk, Poland
Article info
Article history:
Received 17 May 2012
Received in revised form
27 December 2012
Accepted 22 January 2013
Available online 26 March 2013
Keywords:
System identification
Time-varying processes
Abstract
The problem of both causal and noncausal identification of linear stochastic systems with quasi-harmonically
varying parameters is considered. The quasi-harmonic description allows one to model
nonsinusoidal quasi-periodic parameter changes. The proposed identification algorithms are called
generalized adaptive comb filters/smoothers because in the special signal case they reduce to
adaptive comb algorithms used to enhance or suppress nonstationary harmonic signals embedded in
noise. The paper presents a thorough statistical analysis of generalized adaptive comb algorithms and
demonstrates their statistical efficiency in the case where the fundamental frequency of parameter
changes varies slowly with time according to the integrated random-walk model.
© 2013 Elsevier Ltd. All rights reserved.
1. Introduction
1.1. Problem statement
We will consider the problem of identification of quasi-periodically varying complex-valued systems governed by

$$y(t) = \sum_{i=1}^{n} \theta_i(t)\,u(t-i+1) + v(t) = \varphi^{T}(t)\,\theta(t) + v(t) \tag{1}$$

where $t = 1, 2, \ldots$ denotes the normalized discrete time, $y(t)$
denotes the system output, $\varphi(t) = [u(t), \ldots, u(t-n+1)]^{T}$
denotes the regression vector, made up of the past input samples, $v(t)$
denotes measurement noise, and $\theta(t) = [\theta_1(t), \ldots, \theta_n(t)]^{T}$ is the
vector of time-varying system coefficients, modeled as weighted
sums of complex exponentials

$$\theta(t) = \sum_{k=1}^{K} \beta_k(t)\, e^{j\sum_{i=1}^{t}\omega_k(i)} \tag{2}$$

$$\beta_k(t) = [b_{k1}(t), \ldots, b_{kn}(t)]^{T}, \qquad b_{ki}(t) = a_{ki}(t)\,e^{j\nu_{ki}}, \qquad i = 1, \ldots, n.$$
✩ This work was supported by the National Science Center. The material in
this paper was not presented at any conference. This paper was recommended
for publication in revised form by Associate Editor Wolfgang Scherrer under the
direction of Editor Torsten Söderström.
E-mail addresses: maciekn@eti.pg.gda.pl (M. Niedźwiecki),
michal.meller@eti.pg.gda.pl (M. Meller).
1 Tel.: +48 58 3472519; fax: +48 58 3415821.
0005-1098/$ – see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.automatica.2013.02.037
The following three types of real-valued quantities are incorporated in (2): the instantaneous angular frequencies ωk (t ), the instantaneous amplitudes aki (t ), and the time-invariant phase shifts
νki . With a slight abuse of terminology, the complex-valued vectors
βk (t ) will be further referred to as ‘complex amplitudes’.
Under certain circumstances (in the presence of several strong
reflectors) the model (1)–(2) can be used to describe rapidly fading
mobile radio channels (Bakkoury, Roviras, Ghogho, & Castanie,
2000; Giannakis & Tepedelenlioǧlu, 1998; Tsatsanis & Giannakis,
1996). In this case y(t ) denotes the sampled baseband signal
received by the mobile unit, {u(t )} denotes the sequence of
transmitted symbols, and v(t ) denotes channel noise.
We will assume that the frequencies $\omega_k(t)$ are harmonically
related, namely

$$\omega_k(t) = m_k\,\omega_0(t), \qquad k = 1, \ldots, K \tag{3}$$

where $\omega_0(t)$ denotes the slowly varying fundamental frequency
and $m_k$ are integer numbers. Such multiple frequencies, called
harmonics, appear in the Fourier series expansions of periodic
signals. For example, if the parameter trajectory $\theta(t)$ is periodic with
period $L$, it admits the following Fourier representation:

$$\theta(t) = \sum_{k=0}^{L-1} \beta_k\, e^{jk\omega_0 t}, \qquad \omega_0 = 2\pi/L.$$
The notion of ‘time-varying harmonics’ can be regarded as
a natural extension of the Fourier analysis to quasi-periodically
varying systems, such as (1)–(2). The choice of the multipliers
mk , k = 1, . . . , K , depends on our prior knowledge of the system
time variation. When all harmonics are expected to be present, one
should set mk = k. In the presence of odd harmonics only, the
natural choice is mk = 2k − 1, etc.
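The role of the multipliers $m_k$ in (2)–(3) can be illustrated with a minimal sketch. It assumes constant complex amplitudes and a constant fundamental frequency; the function name `harmonic_trajectory` and the amplitude values are hypothetical and serve only as an example.

```python
import cmath
import math


def harmonic_trajectory(betas, multipliers, omega0):
    """Evaluate theta(t) = sum_k beta_k * exp(j * m_k * sum_{i<=t} omega0(i)).

    betas       -- K complex amplitude vectors (each of length n), held constant here
    multipliers -- K integers m_k, as in (3)
    omega0      -- samples omega0(1), ..., omega0(T) of the fundamental frequency
    """
    n = len(betas[0])
    phase = 0.0  # running sum of omega0(i), i = 1..t
    theta = []
    for w in omega0:
        phase += w
        rotors = [cmath.exp(1j * m * phase) for m in multipliers]
        theta.append([sum(b[i] * r for b, r in zip(betas, rotors)) for i in range(n)])
    return theta


K = 3
all_harmonics = [k for k in range(1, K + 1)]          # m_k = k
odd_harmonics = [2 * k - 1 for k in range(1, K + 1)]  # m_k = 2k - 1

period = 16
omega0 = [2 * math.pi / period] * (3 * period)  # constant fundamental frequency
betas = [[1.0 + 0.5j], [0.4 - 0.2j], [0.1j]]    # hypothetical amplitudes, n = 1
theta = harmonic_trajectory(betas, odd_harmonics, omega0)
```

With odd harmonics only, the generated trajectory is periodic with the chosen period and changes sign after half a period, which is the usual signature of odd-harmonic content.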
In the special case where $n = 1$ and $\varphi(t) \equiv 1$, Eqs. (1)–(3)
describe a complex-valued harmonic signal $s(t) = \theta(t)$ buried in
noise

$$y(t) = s(t) + v(t), \qquad s(t) = \sum_{k=1}^{K} b_k(t)\, e^{j\sum_{i=1}^{t}\omega_k(i)}. \tag{4}$$
The problem of either elimination or extraction of harmonic signals
buried in noise can be solved using adaptive comb filters (Nehorai
& Porat, 1986; Regalia, 1995). For this reason the system identification/tracking algorithm described below can be considered a
generalized comb filter.
1.2. Contribution
The problem of causal identification (tracking) of single-mode
quasi-periodically varying systems was studied in Niedźwiecki and
Kaczmarek (2004, 2005a,b).
In the recent conference paper (Niedźwiecki & Meller, 2011a),
the results presented earlier were extended to noncausal identification (smoothing). Additionally, a more sophisticated frequency
estimation scheme was proposed, incorporating frequency rate
tracking/smoothing and yielding better results in practice.
All papers published so far focus on the identification of single-mode
systems, i.e., systems with parameters that can be modeled
as complex sinusoids (cisoids) with slowly varying amplitudes and
a slowly varying instantaneous frequency.
This paper extends results presented in Niedźwiecki and
Meller (2011a) to nonstationary systems with quasi-harmonically
varying parameters, i.e., to systems with several frequency modes
governed by the same slowly varying fundamental frequency. In
practice such harmonic modes of variation often arise in oscillatory
systems with nonlinear elements and/or loads (Neimark, 2003).
In principle, quasi-harmonically varying systems can be
identified using the multiple-frequency versions of the algorithms
mentioned above. Such algorithms are made up of several single-frequency
sub-algorithms that work in parallel and are driven by
the common prediction error. Since the estimated frequencies are
in this case regarded as mutually unrelated quantities, the harmonic
structure of the system/signal time variation is not exploited
in any way. In this paper we present algorithms that take advantage
of such prior information, i.e., algorithms that perform a
coordinated frequency search. This allows one to improve estimation
results considerably.
2. Generalized adaptive notch filter — overview of known
results
Suppose that the identified nonstationary system has a single
frequency mode ($K = 1$), i.e., it is governed by

$$y(t) = \varphi^{T}(t)\theta(t) + v(t), \qquad \theta(t) = \beta(t)\, e^{j\sum_{i=1}^{t}\omega(i)} \tag{5}$$

where $\beta(t) = [b_1(t), \ldots, b_n(t)]^{T}$ and $\omega(t) \in (-\pi, \pi]$ are slowly
varying quantities. Furthermore, suppose that:
(A1) The measurement noise $\{v(t)\}$ is a zero-mean circular white
sequence with variance $\sigma_v^2$.
(A2) The sequence of regression vectors $\{\varphi(t)\}$, independent
of $\{v(t)\}$, is nondeterministic, wide-sense stationary and
ergodic with known correlation matrix$^2$ $\Phi = E[\varphi^{*}(t)\varphi^{T}(t)]$.
Denote by $\alpha(t) = \omega(t+1) - \omega(t)$ the rate of change of the instantaneous
frequency $\omega(t)$. Under assumptions (A1)–(A2), identification
of the system (5) can be carried out using the following
generalized adaptive notch filtering (GANF) algorithm proposed in
Niedźwiecki and Meller (2011a)

$$\begin{aligned}
\hat f(t) &= e^{j[\hat\omega(t-1)+\hat\alpha(t-1)]}\,\hat f(t-1)\\
\varepsilon(t) &= y(t) - \varphi^{T}(t)\hat f(t)\hat\beta(t-1)\\
\hat\beta(t) &= \hat\beta(t-1) + \mu\,\Phi^{-1}\varphi^{*}(t)\hat f^{*}(t)\varepsilon(t)\\
g(t) &= \frac{\mathrm{Im}\left[\varepsilon^{*}(t)\varphi^{T}(t)\hat f(t)\hat\beta(t-1)\right]}{\hat\beta^{H}(t-1)\,\Phi\,\hat\beta(t-1)}\\
\hat\alpha(t) &= \hat\alpha(t-1) - \gamma_\alpha\, g(t)\\
\hat\omega(t) &= \hat\omega(t-1) + \hat\alpha(t-1) - \gamma_\omega\, g(t)\\
\hat\theta(t) &= \hat f(t)\hat\beta(t)
\end{aligned} \tag{6}$$

where $\hat f(t)$ is an estimate of $f(t) = e^{j\sum_{i=1}^{t}\omega(i)}$, and $\mu > 0$, $\gamma_\omega > 0$,
$\gamma_\alpha > 0$, such that $\gamma_\alpha \ll \gamma_\omega \ll \mu$, denote small adaptation gains
determining the rate of amplitude adaptation, frequency adaptation
and frequency rate adaptation, respectively.
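The gradient search strategy, incorporated in (6) for the purpose
of tracking $\omega(t)$ and $\alpha(t)$, is based on minimization of the following
instantaneous measure of fit $J(t) = |\epsilon(t)|^2/2$, where $\epsilon(t) = y(t) - \varphi^{T}(t)f(t)\hat\beta(t-1)$. Note that

$$\begin{aligned}
\frac{\partial J(t)}{\partial\omega(t)} &= \mathrm{Re}\left[\epsilon(t)\,\frac{\partial\epsilon^{*}(t)}{\partial\omega(t)}\right]\\
&= -\mathrm{Re}\left[j\epsilon(t)\varphi^{H}(t)f^{*}(t)\hat\beta^{*}(t-1)\right]\\
&= \mathrm{Im}\left[\epsilon^{*}(t)\varphi^{T}(t)f(t)\hat\beta(t-1)\right].
\end{aligned} \tag{7}$$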
Therefore, the term g (t ) in (6) can be interpreted as a normalized
estimate of the gradient (7). Normalization makes the algorithm
scale-invariant. When it is not applied, the tracking properties
of the GANF algorithm depend not only on the user-defined
adaptation gains γω and γα , but also on the system-related
variables, which is inconvenient from the practical viewpoint.
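The recursion (6) can be sketched for the special signal case $n = 1$, $\varphi(t) \equiv 1$, in which $\Phi = 1$. This is a minimal illustration under those assumptions, not the authors' implementation; the function name `ganf_scalar`, the test signal, the initial conditions and the small guard on the normalizing term are all assumptions introduced here.

```python
import cmath


def ganf_scalar(y, mu, gamma_w, gamma_a, omega_init, beta_init=1.0 + 0.0j):
    """Recursion (6) specialized to n = 1, phi(t) = 1, Phi = 1."""
    f = 1.0 + 0.0j
    beta = complex(beta_init)
    omega, alpha = float(omega_init), 0.0
    omega_hat, theta_hat = [], []
    for y_t in y:
        f = cmath.exp(1j * (omega + alpha)) * f        # phase predictor f(t)
        eps = y_t - f * beta                           # prediction error
        power = max(abs(beta) ** 2, 1e-12)             # guard (assumption): avoid division by zero
        g = (eps.conjugate() * f * beta).imag / power  # normalized gradient, uses beta(t-1)
        beta = beta + mu * f.conjugate() * eps         # amplitude update
        # Both updates use the previous alpha, as in (6).
        omega, alpha = omega + alpha - gamma_w * g, alpha - gamma_a * g
        omega_hat.append(omega)
        theta_hat.append(f * beta)
    return omega_hat, theta_hat


mu = 0.05
true_omega = 0.3
signal = [2.0 * cmath.exp(1j * (true_omega * t + 0.5)) for t in range(1, 3001)]
omega_hat, theta_hat = ganf_scalar(signal, mu, mu**2 / 2, mu**3 / 8, omega_init=0.29)
```

In this noiseless example the frequency estimate should settle at the true frequency and the frequency rate estimate at zero; with noisy data the gains trade tracking speed against estimation variance.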
Tracking properties of the GANF algorithm (6) were analyzed
in Niedźwiecki and Meller (2011a) in the case where the vector of
‘amplitudes’ $\beta(t)$ is unknown but constant:
(A3$^{*}$) $\beta(t) \equiv \beta$, i.e., $\theta(t) = e^{j\omega(t)}\theta(t-1)$, $\forall t$.
Note that under (A3$^{*}$) the normalization term $\hat\beta^{H}(t-1)\,\Phi\,\hat\beta(t-1)$
can be regarded as an estimate of the power of the noiseless system
output

$$b^2 = \beta^{H}\Phi\beta = E\left[\beta^{H}\varphi^{*}(t)\varphi^{T}(t)\beta\right] = E\left[|\varphi^{T}(t)\theta(t)|^2\right].$$
Using the approximating linear filter (ALF) technique, i.e., the
stochastic linearization approach proposed in Tichavský and
Händel (1995), one can show that the frequency and frequency
rate estimation errors $\Delta\hat\omega(t) = \omega(t) - \hat\omega(t)$, $\Delta\hat\alpha(t) = \alpha(t) - \hat\alpha(t)$
can be approximately expressed in the form

$$\Delta\hat\omega(t) \cong G_1(q^{-1})e(t) + G_2(q^{-1})\delta(t) \tag{8}$$

$$\Delta\hat\alpha(t) \cong H_1(q^{-1})e(t) + H_2(q^{-1})\delta(t) \tag{9}$$

where $\{e(t)\}$, $e(t) = -\mathrm{Im}[\beta^{H}\varphi^{*}(t)f^{*}(t)v(t)/b^2]$, is a zero-mean
white noise with variance $\sigma_e^2 = \sigma_v^2/(2b^2)$, $\delta(t) = \alpha(t) - \alpha(t-1)$
denotes the one-step change of the frequency rate, and

$$\begin{aligned}
G_1(q^{-1}) &= (1 - q^{-1})[\gamma_\omega + (\gamma_\alpha - \gamma_\omega)q^{-1}]/D(q^{-1})\\
G_2(q^{-1}) &= q^{-1}[1 - \gamma_\omega - (1-\mu)q^{-1}]/D(q^{-1})\\
H_1(q^{-1}) &= \gamma_\alpha(1 - q^{-1})^2/D(q^{-1})\\
H_2(q^{-1}) &= [1 + (\mu + \gamma_\omega - 2)q^{-1} + (1-\mu)q^{-2}]/D(q^{-1})
\end{aligned}$$

where $D(q^{-1}) = 1 + d_1q^{-1} + d_2q^{-2} + d_3q^{-3}$, $d_1 = \mu + \gamma_\omega + \gamma_\alpha - 3$,
$d_2 = 3 - 2\mu - \gamma_\omega$, $d_3 = \mu - 1$. All filters are asymptotically
stable if the adaptation gains fulfill the following (sufficient) stability
conditions: $0 < \mu < 1$, $0 < \gamma_\omega < 1$, $0 < \gamma_\alpha < 1$ and
$\mu(\gamma_\omega + \gamma_\alpha) > \gamma_\alpha$.

Footnote 2: Hereinafter the symbol $*$ denotes complex conjugation, and the symbol $H$
denotes Hermitian (conjugate) transpose.
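The sufficient stability conditions on the adaptation gains are easy to check numerically. The sketch below evaluates them and, as an independent sanity check, simulates the impulse response of the filter $\gamma_\alpha q^{-1}/D(q^{-1})$, which is chosen here only for illustration: since $D(1) = 1 + d_1 + d_2 + d_3 = \gamma_\alpha$, its impulse response should decay and sum to approximately one. The function names and the gain values are assumptions.

```python
def denominator_coefficients(mu, gamma_w, gamma_a):
    """Return (d1, d2, d3) of D(q^-1) = 1 + d1 q^-1 + d2 q^-2 + d3 q^-3."""
    return (mu + gamma_w + gamma_a - 3.0, 3.0 - 2.0 * mu - gamma_w, mu - 1.0)


def satisfies_stability_conditions(mu, gamma_w, gamma_a):
    """Sufficient conditions: all gains in (0, 1) and mu*(gamma_w + gamma_a) > gamma_a."""
    in_range = all(0.0 < g < 1.0 for g in (mu, gamma_w, gamma_a))
    return in_range and mu * (gamma_w + gamma_a) > gamma_a


def impulse_response(numerator, d, length):
    """Impulse response of numerator(q^-1) / D(q^-1) via the difference equation."""
    h = []
    for t in range(length):
        value = numerator[t] if t < len(numerator) else 0.0
        for lag, d_i in enumerate(d, start=1):
            if t >= lag:
                value -= d_i * h[t - lag]
        h.append(value)
    return h


mu, gamma_w, gamma_a = 0.1, 0.005, 0.000125   # illustrative gains (assumption)
d = denominator_coefficients(mu, gamma_w, gamma_a)
h = impulse_response([0.0, gamma_a], d, 4000)  # gamma_a * q^-1 / D(q^-1)
stable = satisfies_stability_conditions(mu, gamma_w, gamma_a)
```

A decaying impulse response is only a numerical indication of stability for the chosen gains; it does not replace the analytical conditions.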
In spite of its simplicity, the gradient frequency tracking
mechanism adopted in (6) has very good statistical properties. As
shown in Niedźwiecki and Meller (2011a), when the instantaneous
frequency drifts according to the Gaussian integrated random-walk
(IRW) model, namely
(A4) $\{\delta(t)\}$, independent of $\{v(t)\}$ and $\{\varphi(t)\}$, is a zero-mean white
sequence with variance $\sigma_\delta^2$,
(A5) The sequences $\{v(t)\}$ and $\{\delta(t)\}$ are normally distributed,
the optimally tuned GANF algorithm (6) is statistically efficient,
i.e., it reaches the Cramér–Rao-type lower frequency and frequency
rate tracking bounds.
Note that α(t ) − α(t − 1) = δ(t ) implies (1 − q−1 )2 ω(t ) =
δ(t − 1). Since (1 − q−1 )2 ω(t ) = 0 entails ω(t ) = γ1 +γ2 t, where γ1
and γ2 denote arbitrary constants, the IRW model can be regarded
as a perturbed linear growth/decay model – for small perturbations
(σδ ≪ σv /b) the corresponding frequency changes will be further
referred to as quasi-linear.
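The integrated random-walk model can be simulated directly from its definition, $\alpha(t) = \omega(t+1) - \omega(t)$ and $\delta(t) = \alpha(t) - \alpha(t-1)$. The following is a minimal sketch; the function name `irw_frequency`, the seed handling and the numerical values are assumptions made for illustration.

```python
import random


def irw_frequency(length, omega_start, alpha_start, sigma_delta, seed=0):
    """Generate omega(t) and alpha(t) under the IRW model.

    omega(t+1) = omega(t) + alpha(t),  alpha(t+1) = alpha(t) + delta(t+1),
    where delta is zero-mean white Gaussian noise with standard deviation sigma_delta.
    """
    rng = random.Random(seed)
    omega, alpha = omega_start, alpha_start
    omegas, alphas = [], []
    for _ in range(length):
        omegas.append(omega)
        alphas.append(alpha)
        omega = omega + alpha                     # frequency integrates the rate
        alpha = alpha + rng.gauss(0.0, sigma_delta)  # rate performs a random walk
    return omegas, alphas


# Unperturbed case: exactly linear frequency growth.
linear_omega, _ = irw_frequency(100, 0.3, 1e-4, 0.0)
# Perturbed case: quasi-linear frequency changes.
omegas, alphas = irw_frequency(200, 0.3, 1e-4, 1e-6, seed=1)
```

For $\sigma_\delta = 0$ the trajectory is a straight line; small positive values produce the quasi-linear changes discussed above.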
3. Multiple-frequency GANF
Denote by $y_k(t) = \varphi^{T}(t)\theta_k(t) + v(t)$, where $\theta_k(t) = f_k(t)\beta_k(t)$
and $\hat f_k(t)$ is an estimate of $f_k(t) = e^{j\sum_{i=1}^{t}\omega_k(i)}$, the output of the
subsystem of (1) which is associated with the frequency $\omega_k$. If
the signals $y_1(t), \ldots, y_K(t)$ were measurable, one could design $K$
independent GANF algorithms of the form (6), each taking care of
a particular subsystem. Since it holds that $\theta(t) = \sum_{k=1}^{K}\theta_k(t)$, the
final parameter estimate could be easily obtained by combining the
partial estimates.
Even though the outputs $y_k(t)$ are not available, one can replace
them with the surrogate (estimated) outputs obtained from

$$\hat y_k(t) = y(t) - \varphi^{T}(t)\sum_{\substack{i=1\\ i\neq k}}^{K}\hat\theta_i(t|t-1)$$

where $\hat\theta_k(t|t-1) = \hat f_k(t)\hat\beta_k(t-1)$ denotes the one-step-ahead
prediction of $\theta_k(t)$. Note that after replacing $y(t)$ with $\hat y_k(t)$ in
(6), one obtains $\varepsilon_k(t) = \hat y_k(t) - \varphi^{T}(t)\hat\theta_k(t|t-1) = y(t) - \varphi^{T}(t)\sum_{k=1}^{K}\hat f_k(t)\hat\beta_k(t-1) = \varepsilon(t)$, $\forall k$,
which means that all sub-algorithms should be driven by the same ‘global’ prediction error.
Such an approach was used, with good results, to design multiple-frequency
algorithms in Niedźwiecki and Kaczmarek (2004). When
applied to (6) it yields

$$\begin{aligned}
\hat f_k(t) &= e^{j[\hat\omega_k(t-1)+\hat\alpha_k(t-1)]}\,\hat f_k(t-1)\\
\varepsilon(t) &= y(t) - \varphi^{T}(t)\sum_{k=1}^{K}\hat f_k(t)\hat\beta_k(t-1)\\
\hat\beta_k(t) &= \hat\beta_k(t-1) + \mu\,\Phi^{-1}\varphi^{*}(t)\hat f_k^{*}(t)\varepsilon(t)\\
g_k(t) &= \frac{\mathrm{Im}\left[\varepsilon^{*}(t)\varphi^{T}(t)\hat f_k(t)\hat\beta_k(t-1)\right]}{\hat\beta_k^{H}(t-1)\,\Phi\,\hat\beta_k(t-1)}\\
\hat\alpha_k(t) &= \hat\alpha_k(t-1) - \gamma_\alpha\, g_k(t)\\
\hat\omega_k(t) &= \hat\omega_k(t-1) + \hat\alpha_k(t-1) - \gamma_\omega\, g_k(t)\\
\hat\theta_k(t) &= \hat f_k(t)\hat\beta_k(t)\\
k &= 1, \ldots, K\\
\hat\theta(t) &= \sum_{k=1}^{K}\hat\theta_k(t).
\end{aligned} \tag{10}$$

In Niedźwiecki and Kaczmarek (2005b) it was shown that the
number of frequency modes, as well as all initial conditions needed
to smoothly start (start without initialization transients) the GANF
algorithm, can be inferred from nonparametric DFT-based analysis
of a short startup fragment of the input–output data. The tool that
can be used for this purpose was termed generalized (system)
periodogram, as in the signal case it reduces to the classical
periodogram.
When applied to identification of the system (1)–(3), the GANF
algorithm (10) has two serious drawbacks.
First, it does not take into consideration the harmonic
structure (3), i.e., the estimated frequencies are regarded as
mutually unrelated quantities, while the true harmonics vary in
a coordinated way. Hence, even though such an unconstrained
multiple-frequency generalized adaptive notch filter can be
used to identify the multi-harmonic system/signal, its tracking
characteristics will be generally inferior to those offered by
solutions that incorporate the harmonic constraints.
Second, the algorithm (10) is not robust to incorrect frequency
matching. While the strong frequency components, i.e., those
characterized by large values of the signal-to-noise ratio $\mathrm{SNR}_k(t) = \|\beta_k(t)\|_{\Phi}^{2}/\sigma_v^2$,
are usually tracked successfully, the weak ones may
be difficult to follow; even if the initial frequency assignment is
correct, the sub-algorithms tracking such weak components may,
after some time, lock onto the neighboring, stronger components,
corresponding to higher or lower frequencies. Moreover, when the
system/signal is nonstationary, the ‘strength’ of different harmonic
components may vary with time, which further complicates the
picture.
4. Generalized adaptive comb filter
In order to arrive at the algorithm which performs a coordinated
search of the instantaneous fundamental frequency $\omega_0(t)$, one
should minimize $J(t)$ for

$$\epsilon(t) = y(t) - \varphi^{T}(t)\sum_{k=1}^{K} f_k(t)\hat\beta_k(t-1)$$

where $f_k(t) = e^{j\sum_{i=1}^{t}\omega_k(i)} = e^{jm_k\sum_{i=1}^{t}\omega_0(i)}$. Note that

$$\begin{aligned}
\frac{\partial J(t)}{\partial\omega_0(t)} &= -\mathrm{Re}\left[j\epsilon(t)\varphi^{H}(t)\sum_{k=1}^{K} m_k f_k^{*}(t)\hat\beta_k^{*}(t-1)\right]\\
&= \sum_{k=1}^{K} m_k\,\mathrm{Im}\left[\epsilon^{*}(t)\varphi^{T}(t)f_k(t)\hat\beta_k(t-1)\right].
\end{aligned} \tag{11}$$

This leads to the following recursive estimation scheme which will
be further referred to as a generalized adaptive comb filter (GACF)

$$\begin{aligned}
\hat f_k(t) &= e^{jm_k[\hat\omega_0(t-1)+\hat\alpha_0(t-1)]}\,\hat f_k(t-1)\\
\varepsilon(t) &= y(t) - \varphi^{T}(t)\sum_{k=1}^{K}\hat f_k(t)\hat\beta_k(t-1)\\
\hat\beta_k(t) &= \hat\beta_k(t-1) + \mu\,\Phi^{-1}\varphi^{*}(t)\hat f_k^{*}(t)\varepsilon(t)\\
\hat\theta_k(t) &= \hat f_k(t)\hat\beta_k(t)\\
k &= 1, \ldots, K\\
g(t) &= \frac{\sum_{k=1}^{K} m_k\,\mathrm{Im}\left[\varepsilon^{*}(t)\varphi^{T}(t)\hat f_k(t)\hat\beta_k(t-1)\right]}{\sum_{k=1}^{K} m_k^2\,\hat\beta_k^{H}(t-1)\,\Phi\,\hat\beta_k(t-1)}\\
\hat\alpha_0(t) &= \hat\alpha_0(t-1) - \gamma_\alpha\, g(t)\\
\hat\omega_0(t) &= \hat\omega_0(t-1) + \hat\alpha_0(t-1) - \gamma_\omega\, g(t)\\
\hat\theta(t) &= \sum_{k=1}^{K}\hat\theta_k(t).
\end{aligned} \tag{12}$$
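The GACF recursion (12) can be sketched for the signal case ($n = 1$, $\varphi(t) \equiv 1$, $\Phi = 1$), where all harmonic components share a single fundamental frequency estimate. This is an illustrative sketch under those assumptions rather than the authors' code; the function name `gacf_scalar`, the two-harmonic test signal, the initial conditions and the guard on the normalizing term are hypothetical.

```python
import cmath


def gacf_scalar(y, multipliers, mu, gamma_w, gamma_a, omega0_init, beta_init):
    """Recursion (12) specialized to n = 1, phi(t) = 1, Phi = 1."""
    f = [1.0 + 0.0j for _ in multipliers]
    beta = [complex(b) for b in beta_init]
    omega0, alpha0 = float(omega0_init), 0.0
    omega0_hat, signal_hat = [], []
    for y_t in y:
        step = omega0 + alpha0  # predicted fundamental frequency
        f = [cmath.exp(1j * m * step) * f_k for m, f_k in zip(multipliers, f)]
        eps = y_t - sum(f_k * b_k for f_k, b_k in zip(f, beta))  # global prediction error
        num = sum(m * (eps.conjugate() * f_k * b_k).imag
                  for m, f_k, b_k in zip(multipliers, f, beta))
        den = sum(m * m * abs(b_k) ** 2 for m, b_k in zip(multipliers, beta))
        g = num / max(den, 1e-12)  # guard (assumption): avoid division by zero
        beta = [b_k + mu * f_k.conjugate() * eps for f_k, b_k in zip(f, beta)]
        # Both updates use the previous alpha0, as in (12).
        omega0, alpha0 = omega0 + alpha0 - gamma_w * g, alpha0 - gamma_a * g
        omega0_hat.append(omega0)
        signal_hat.append(sum(f_k * b_k for f_k, b_k in zip(f, beta)))
    return omega0_hat, signal_hat


mu = 0.05
true_omega0 = 0.2
b2 = 0.8 * cmath.exp(0.3j)  # hypothetical second-harmonic amplitude
signal = [2.0 * cmath.exp(1j * true_omega0 * t) + b2 * cmath.exp(2j * true_omega0 * t)
          for t in range(1, 4001)]
omega0_hat, signal_hat = gacf_scalar(signal, [1, 2], mu, mu**2 / 2, mu**3 / 8,
                                     omega0_init=0.199, beta_init=[1.5, 0.5])
```

Compared with running one GANF per component, the only structural change is that the gradient contributions of all harmonics are weighted by $m_k$ and pooled into a single normalized update of the fundamental frequency.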
One can show, using the ALF technique, that under (A1)–(A2) and
the following assumption
(A3) $\beta_k(t) \equiv \beta_k$, i.e., $\theta_k(t) = e^{j\omega_k(t)}\theta_k(t-1)$, $k = 1, \ldots, K$, $\forall t$,
which is a multi-frequency variant of (A3$^{*}$), the frequency
and frequency rate estimation errors $\Delta\hat\omega_0(t) = \omega_0(t) - \hat\omega_0(t)$,
$\Delta\hat\alpha_0(t) = \alpha_0(t) - \hat\alpha_0(t)$ can be approximately expressed
in the form (see Appendix A)

$$\Delta\hat\omega_0(t) \cong G_1(q^{-1})e_0(t) + G_2(q^{-1})\delta(t) \tag{13}$$

$$\Delta\hat\alpha_0(t) \cong H_1(q^{-1})e_0(t) + H_2(q^{-1})\delta(t) \tag{14}$$

where the transfer functions $G_1(q^{-1})$, $G_2(q^{-1})$, $H_1(q^{-1})$ and
$H_2(q^{-1})$ are identical with those appearing in (8)–(9), and

$$e_0(t) = \sum_{k=1}^{K} m_k e_k(t), \qquad e_k(t) = -\mathrm{Im}\left[\beta_k^{H}\varphi^{*}(t)f_k^{*}(t)v(t)/b_0^2\right], \qquad b_0^2 = \sum_{k=1}^{K} m_k^2 b_k^2$$

and $b_k^2 = \beta_k^{H}\Phi\beta_k$. Note that the normalizing term
$\sum_{k=1}^{K} m_k^2\,\hat\beta_k^{H}(t-1)\,\Phi\,\hat\beta_k(t-1)$, which appears in the expression
for $g(t)$ in (12), can be regarded as an estimate of $b_0^2$.
Furthermore, one can show that $\{e_k(t)\}$, $k = 1, \ldots, K$, are
zero-mean white noise sequences with cross-correlation functions
given by (see Appendix B)

$$E[e_k(t)e_l(s)] = \begin{cases}\rho_{kl}(t) & \text{for } t = s\\ 0 & \text{for } t \neq s\end{cases} \tag{15}$$

where

$$\rho_{kl}(t) = \frac{\sigma_v^2}{2b_0^4}\Big\{\mathrm{Re}[\beta_k^{H}\Phi\beta_l]\cos[\phi_k(t) - \phi_l(t)] - \mathrm{Im}[\beta_k^{H}\Phi\beta_l]\sin[\phi_k(t) - \phi_l(t)]\Big\} \tag{16}$$

and $\phi_k(t) = \sum_{i=1}^{t}\omega_k(i) = m_k\sum_{i=1}^{t}\omega_0(i)$.
Setting $l = k$ in (16), one obtains

$$\sigma_{e_k}^2 = E[|e_k(t)|^2] = \frac{\sigma_v^2 b_k^2}{2b_0^4} \tag{17}$$

where $b_k^2 = \beta_k^{H}\Phi\beta_k$ is the power of the $k$-th component of the
noiseless system output.
When frequency changes are sufficiently slow, so that the
functions $f_k(t)$, $k = 1, \ldots, K$, can be regarded as locally almost
periodic, the sequences $\{e_k(t)\}$ are mutually orthogonal in the
sense that

$$\langle\rho_{kl}(t)\rangle_T \cong \langle\rho_{kl}(t)\,|\,\omega_0(t) \equiv \omega_0\rangle_\infty = 0, \qquad \forall k \neq l$$

where $\langle x(t)\rangle_T = (1/T)\sum_{i=0}^{T-1}x(t-i)$ denotes the local average of
$x(t)$, and $T \gg T_0 = 2\pi/\omega_0$. Using this result, one obtains

$$\langle\sigma_{e_0}^2(t)\rangle_T = \langle E[|e_0(t)|^2]\rangle_T \cong \sum_{k=1}^{K} m_k^2\sigma_{e_k}^2 = \frac{\sigma_v^2}{2b_0^2}. \tag{18}$$
Suppose that the assumption (A4) holds true. Then, using standard
results from the linear filtering theory, one obtains

$$\langle E\{[\hat\omega_0(t) - \omega_0(t)]^2\}\rangle_T = \langle E\{[\Delta\hat\omega_0(t)]^2\}\rangle_T \cong g[G_1(z^{-1})]\,\langle\sigma_{e_0}^2(t)\rangle_T + g[G_2(z^{-1})]\,\sigma_\delta^2 \tag{19}$$

$$\langle E\{[\hat\alpha_0(t) - \alpha_0(t)]^2\}\rangle_T = \langle E\{[\Delta\hat\alpha_0(t)]^2\}\rangle_T \cong g[H_1(z^{-1})]\,\langle\sigma_{e_0}^2(t)\rangle_T + g[H_2(z^{-1})]\,\sigma_\delta^2 \tag{20}$$
where

$$g[X(z^{-1})] = \frac{1}{2\pi j}\oint X(z^{-1})X(z)\,\frac{dz}{z}$$

is an integral evaluated along the unit circle in the z-plane and $X(z)$
denotes any stable proper rational transfer function.
The first term on the right-hand side of (19) constitutes the
variance component of the mean-squared frequency estimation
error, and the second term its bias component. The same remark
applies to (20). According to (18), the variance components in (19)
and (20) are inversely proportional to the quantity which will be
further referred to as the effective signal-to-noise ratio (ESNR)

$$\mathrm{ESNR} = \frac{b_0^2}{\sigma_v^2} = \frac{\sum_{k=1}^{K} m_k^2 b_k^2}{\sigma_v^2}$$

and which differs from the signal-to-noise ratio defined as

$$\mathrm{SNR} = \frac{\langle E[|\varphi^{T}(t)\theta(t)|^2]\rangle_T}{\sigma_v^2} \cong \frac{\sum_{k=1}^{K} b_k^2}{\sigma_v^2} \le \mathrm{ESNR}.$$

We note that Eqs. (13)–(14) are identical with those derived earlier
for the single-frequency case: the only change needed to
move from (8)–(9) to (13)–(14) is replacement of the noiseless output
power $b^2$, which appears in the expression for $\sigma_e^2$, with the
effective output power $b_0^2$, which appears in the expression for
$\langle\sigma_{e_0}^2(t)\rangle_T$. Since in the multi-frequency case ($K > 1$) it holds that
$b_0^2 > \sum_{k=1}^{K} b_k^2$, the variance components of the mean-squared fundamental
frequency and frequency rate tracking errors are smaller
than the analogous errors observed, under the same SNR, in the
single-frequency case. This accuracy bonus (compared to the unconstrained
frequency estimation case) is available because prior information about the
harmonic structure of the identified quasi-periodic phenomenon is incorporated
in the estimation process.
The same qualitative effect can be observed in time-invariant frequency
estimation schemes, such as the ones described in James,
Anderson, and Williamson (1994) and Nehorai and Porat (1986).
5. Generalized adaptive comb smoother
The important consequence of the fact that the approximate
error equations (13)–(14) are identical with those derived in
Niedźwiecki and Meller (2011a) for systems with a single
frequency mode of parameter variation, is that the smoothing
technique proposed there is directly applicable to the multiple-frequency
case. Following Niedźwiecki and Meller (2011a),
suppose that a pre-recorded data block $\Omega(N) = \{y(i), \varphi(i), i = 1, \ldots, N\}$
of length $N$ is available, which is typical of off-line
applications, i.e., those based on parameter/signal reconstruction,
rather than tracking. The smoothed estimates of $\omega_0(t)$, $\alpha_0(t)$ and
$\theta(t)$, based on $\Omega(N)$, will be denoted by $\tilde\omega_0(t)$, $\tilde\alpha_0(t)$ and $\tilde\theta(t)$,
respectively.$^3$
Footnote 3: In the Kalman filtering/smoothing literature, the filtered and smoothed
estimates of the state vector $x(t)$ are usually denoted by $\hat x(t|t)$ and $\hat x(t|N)$,
respectively. In this paper a different notation is used to avoid false associations.
To obtain smoothed estimates, one can use a cascade of
postprocessing filters derived in Niedźwiecki and Meller (2011a,b).
The proposed fixed-interval generalized adaptive comb smoothing
(GACS) procedure, listed in Table 1, consists of six steps.
Step 1: The preliminary estimates $\hat\omega_0(t)$ and $\hat\alpha_0(t)$ are obtained
using the pilot algorithm based on (12).
Step 2: To obtain the smoothed frequency rate estimates $\tilde\alpha_0(t)$, the
trajectory $\{\hat\alpha_0(t), t = 1, \ldots, N\}$ is filtered, backward in time, using
the anticausal filter $S(q)$: $\tilde\alpha_0(t) = S(q)\hat\alpha_0(t)$, where
$S(q) = 1 - (1 - q)H_2(q) = \gamma_\alpha q/D(q)$.
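Step 2 amounts to running the recursion of the frequency rate smoother (Table 1) backward in time. The sketch below assumes a real-valued list `alpha_hat` holding $\hat\alpha_0(1), \ldots, \hat\alpha_0(N)$; the function name and the test input are hypothetical, and the initial conditions follow the bracketed line of Table 1 as read here.

```python
def smooth_rate_backward(alpha_hat, mu, gamma_w, gamma_a):
    """Anticausal filtering with S(q) = gamma_a * q / D(q), run for t = N-1, ..., 1.

    alpha_hat[i] holds the causal estimate at time t = i + 1.
    """
    d1 = mu + gamma_w + gamma_a - 3.0
    d2 = 3.0 - 2.0 * mu - gamma_w
    d3 = mu - 1.0
    n = len(alpha_hat)
    # Two extra slots hold the initial conditions at times N+1 and N+2.
    sm = [0.0] * (n + 2)
    sm[n - 1] = sm[n] = sm[n + 1] = alpha_hat[-1]
    for i in range(n - 2, -1, -1):
        sm[i] = -d1 * sm[i + 1] - d2 * sm[i + 2] - d3 * sm[i + 3] + gamma_a * alpha_hat[i + 1]
    return sm[:n]


mu, gamma_w, gamma_a = 0.1, 0.005, 0.000125      # illustrative gains (assumption)
constant = smooth_rate_backward([0.01] * 300, mu, gamma_w, gamma_a)
step_input = [0.0] * 150 + [1.0] * 150           # hypothetical step-like trajectory
step_output = smooth_rate_backward(step_input, mu, gamma_w, gamma_a)
```

Because $D(1) = \gamma_\alpha$, the filter has unit gain for constant inputs, and since it runs backward in time its output reacts to a change before the change occurs in the input.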
Step 3: To obtain the smoothed frequency estimates $\tilde\omega_0(t)$, the
trajectory $\{\hat\omega_0(t), t = 1, \ldots, N\}$ is processed using the noncausal
filter $S(q)T(q^{-1})$, where

$$T(q^{-1}) = \frac{a_1 q^{-1}}{1 + a_2 q^{-1}}, \qquad a_1 = \frac{\gamma_\alpha}{\gamma_\omega}, \qquad a_2 = \frac{\gamma_\alpha - \gamma_\omega}{\gamma_\omega}.$$

This can be achieved by means of backward-time processing of
the prefiltered trajectory: $\tilde\omega_0(t) = q^{-1}S(q)\bar\omega_0(t)$, where $\bar\omega_0(t) = qT(q^{-1})\hat\omega_0(t)$.
Step 4: The amplitude coefficients are re-estimated using the
frequency-guided version of the pilot algorithm, obtained by
replacing in (12) the causal frequency estimates $\hat\omega_0(t)$ with their
noncausal (smoothed) counterparts $\tilde\omega_0(t)$, evaluated at Step 3.
Step 5: To obtain the smoothed amplitude estimates $\tilde\beta_k(t)$, $k = 1, \ldots, K$,
the re-estimated amplitude trajectories $\{\bar\beta_k(t), t = 1, \ldots, N\}$
are filtered, backward in time, using the anticausal filter
$F(q)$: $\tilde\beta_k(t) = F(q)\bar\beta_k(t)$, $k = 1, \ldots, K$, where $F(q) = \mu/[1 - (1 - \mu)q]$.
Step 6: To obtain the smoothed partial parameter estimates $\tilde\theta_k(t)$,
the smoothed amplitude estimates $\tilde\beta_k(t)$ are combined with the
smoothed phase estimates $\tilde\phi_k(t) = m_k\sum_{i=1}^{t}\tilde\omega_0(i)$. The smoothed
parameter estimate $\tilde\theta(t)$ is evaluated as a sum of its harmonic
components.
Remark. We note that Step 2 above is optional: if all that is
needed is an estimate of $\theta(t)$, the frequency rate smoothing part
of the algorithm can be skipped.

Table 1
Generalized adaptive comb smoother.

Pilot filter:
$$\begin{aligned}
\hat f_k(t) &= e^{jm_k[\hat\omega_0(t-1)+\hat\alpha_0(t-1)]}\,\hat f_k(t-1)\\
\varepsilon(t) &= y(t) - \varphi^{T}(t)\sum_{k=1}^{K}\hat f_k(t)\hat\beta_k(t-1)\\
\hat\beta_k(t) &= \hat\beta_k(t-1) + \mu\,\Phi^{-1}\varphi^{*}(t)\hat f_k^{*}(t)\varepsilon(t)\\
g(t) &= \frac{\sum_{k=1}^{K} m_k\,\mathrm{Im}\left[\varepsilon^{*}(t)\varphi^{T}(t)\hat f_k(t)\hat\beta_k(t-1)\right]}{\sum_{k=1}^{K} m_k^2\,\hat\beta_k^{H}(t-1)\,\Phi\,\hat\beta_k(t-1)}\\
\hat\alpha_0(t) &= \hat\alpha_0(t-1) - \gamma_\alpha\, g(t)\\
\hat\omega_0(t) &= \hat\omega_0(t-1) + \hat\alpha_0(t-1) - \gamma_\omega\, g(t)\\
t &= 1, \ldots, N, \quad k = 1, \ldots, K
\end{aligned}$$

Frequency rate smoother [optional]:
$$\begin{aligned}
&[\tilde\alpha_0(N) = \tilde\alpha_0(N+1) = \tilde\alpha_0(N+2) = \hat\alpha_0(N)]\\
&\tilde\alpha_0(t) = -d_1\tilde\alpha_0(t+1) - d_2\tilde\alpha_0(t+2) - d_3\tilde\alpha_0(t+3) + \gamma_\alpha\hat\alpha_0(t+1)\\
&t = N-1, \ldots, 1
\end{aligned}$$

Frequency smoother:
$$\begin{aligned}
&[\bar\omega_0(0) = \hat\omega_0(1)]\\
&\bar\omega_0(t) = -a_2\bar\omega_0(t-1) + a_1\hat\omega_0(t), \qquad t = 1, \ldots, N\\
&[\tilde\omega_0(N+1) = \tilde\omega_0(N+2) = \tilde\omega_0(N+3) = \hat\omega_0(N)]\\
&\tilde\omega_0(t) = -d_1\tilde\omega_0(t+1) - d_2\tilde\omega_0(t+2) - d_3\tilde\omega_0(t+3) + \gamma_\alpha\bar\omega_0(t), \qquad t = N, \ldots, 1
\end{aligned}$$

Frequency-guided filter:
$$\begin{aligned}
\tilde f_k(t) &= e^{jm_k\tilde\omega_0(t)}\,\tilde f_k(t-1)\\
\bar\varepsilon(t) &= y(t) - \varphi^{T}(t)\sum_{k=1}^{K}\tilde f_k(t)\bar\beta_k(t-1)\\
\bar\beta_k(t) &= \bar\beta_k(t-1) + \mu\,\Phi^{-1}\varphi^{*}(t)\tilde f_k^{*}(t)\bar\varepsilon(t)\\
t &= 1, \ldots, N, \quad k = 1, \ldots, K
\end{aligned}$$

Amplitude smoother:
$$\begin{aligned}
&[\tilde\beta_k(N+1) = \bar\beta_k(N)]\\
&\tilde\beta_k(t) = (1-\mu)\tilde\beta_k(t+1) + \mu\bar\beta_k(t)\\
&t = N, \ldots, 1, \quad k = 1, \ldots, K
\end{aligned}$$

Output filter:
$$\tilde\theta_k(t) = \tilde f_k(t)\tilde\beta_k(t), \qquad \tilde\theta(t) = \sum_{k=1}^{K}\tilde\theta_k(t), \qquad t = 1, \ldots, N$$

Denote by $\Delta\tilde\omega_0(t) = \omega_0(t) - \tilde\omega_0(t)$ and $\Delta\tilde\alpha_0(t) = \alpha_0(t) - \tilde\alpha_0(t)$
the frequency and frequency rate smoothing errors, respectively.
Under assumptions (A1)–(A3), the approximate error equations
can be obtained in the form

$$\Delta\tilde\omega_0(t) \cong I_1(q^{-1})e_0(t) + I_2(q^{-1})\delta(t) \tag{21}$$

$$\Delta\tilde\alpha_0(t) \cong J_1(q^{-1})e_0(t) + J_2(q^{-1})\delta(t) \tag{22}$$

where

$$\begin{aligned}
I_1(q^{-1}) &= S(q)T(q^{-1})G_1(q^{-1}), & I_2(q^{-1}) &= q^{-1}\,\frac{1 - S(q)S(q^{-1})}{(1 - q^{-1})^2}\\
J_1(q^{-1}) &= S(q)H_1(q^{-1}), & J_2(q^{-1}) &= \frac{1 - S(q)S(q^{-1})}{1 - q^{-1}}.
\end{aligned}$$

Denote by $X^{+}(q^{-1}) = [X(q^{-1})X(q)]_{+}$ the stable factor of a
rational transfer function $X(q^{-1})X(q)$. Based on (21)–(22), the
mean-squared estimation errors can be evaluated in a way similar
to (19)–(20)

$$\langle E\{[\tilde\omega_0(t) - \omega_0(t)]^2\}\rangle_T = \langle E\{[\Delta\tilde\omega_0(t)]^2\}\rangle_T \cong g[I_1^{+}(z^{-1})]\,\langle\sigma_{e_0}^2(t)\rangle_T + g[I_2^{+}(z^{-1})]\,\sigma_\delta^2 \tag{23}$$

$$\langle E\{[\tilde\alpha_0(t) - \alpha_0(t)]^2\}\rangle_T = \langle E\{[\Delta\tilde\alpha_0(t)]^2\}\rangle_T \cong g[J_1^{+}(z^{-1})]\,\langle\sigma_{e_0}^2(t)\rangle_T + g[J_2^{+}(z^{-1})]\,\sigma_\delta^2. \tag{24}$$

6. Optimization and Cramér–Rao bounds
Consider a system (1)–(3) with quasi-linear frequency
changes. In order to achieve the best tracking/smoothing results,
the adaptation gains of the GACF/GACS algorithms should be
chosen so as to trade off the bias and variance components in
(19)–(20) and (21)–(22). Such optimal settings depend exclusively
on the balance between the bias and variance error components,
determined by the scalar coefficient

$$\kappa = \frac{E[\delta^2(t)]}{2\,\langle E[e_0^2(t)]\rangle_T} \cong \frac{b_0^2\sigma_\delta^2}{\sigma_v^2} = \mathrm{ESNR}\cdot\sigma_\delta^2$$

further referred to as the rate of nonstationarity of the analyzed
system [in signal analysis a similar concept was introduced earlier
in Tichavský and Händel (1995)].
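The rate of nonstationarity is simple to evaluate once the component powers and the noise level are known. The sketch below uses the relation $\kappa = \mathrm{ESNR}\cdot\sigma_\delta^2$ and its inverse; the function names and all numerical values are hypothetical and serve only as an illustration.

```python
import math


def effective_snr(component_powers, multipliers, sigma_v):
    """ESNR = sum_k m_k^2 b_k^2 / sigma_v^2."""
    b0_sq = sum(m * m * p for m, p in zip(multipliers, component_powers))
    return b0_sq / sigma_v ** 2


def nonstationarity_rate(esnr, sigma_delta):
    """kappa = ESNR * sigma_delta^2."""
    return esnr * sigma_delta ** 2


def sigma_delta_for(kappa, esnr):
    """Standard deviation of delta(t) that yields a requested kappa."""
    return math.sqrt(kappa / esnr)


component_powers = [4.0, 1.0, 0.25]   # hypothetical b_k^2 values
multipliers = [1, 2, 3]               # m_k = k
esnr = effective_snr(component_powers, multipliers, sigma_v=0.5)
sigma_delta = sigma_delta_for(1e-9, esnr)
kappa = nonstationarity_rate(esnr, sigma_delta)
```

Working with $\kappa$ rather than with $\sigma_\delta$ and the ESNR separately reflects the statement above that the optimal gains depend only on this single coefficient.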
Using residue calculus (Jury, 1964), one can easily derive
analytical expressions quantifying the mean-squared estimation
errors in terms of µ, γω and γα . Unfortunately, these expressions
(not listed here) are too complicated to enable minimization of
the MSE scores in an explicit, analytical form. For this reason the
optimal values of adaptation gains were searched numerically.
The optimal settings that minimize the frequency and frequency
rate, tracking and smoothing, errors were found to be in all four
cases identical — the corresponding values, obtained for several
nonstationarity rates κ , are listed in Table 2.
Our next step was to establish the Cramér–Rao-type lower
tracking bounds ($\mathrm{LTB}_{\omega_0}$, $\mathrm{LTB}_{\alpha_0}$) and lower smoothing bounds
($\mathrm{LSB}_{\omega_0}$, $\mathrm{LSB}_{\alpha_0}$), which set the upper limits for the frequency and
frequency rate estimation accuracy using any causal/noncausal
identification algorithm (see Table 2). Note the performance gains
that can be achieved when tracking is replaced with smoothing.
Even though such analysis has mainly a theoretical value, it allows
one to evaluate tracking/smoothing performance of the proposed
algorithms in absolute, rather than relative, terms. The LTBs and
LSBs for the system governed by (1)–(3) were obtained under
assumptions (A1)–(A5) and, to make the analysis easier, under
some technical constraints imposed on the initial conditions
(see Appendix C). Again, rather than providing the closed-form
analytical formulas, we show how LTBs and LSBs can be established
numerically for a given value of κ.

Table 2
Optimal GACF/GACS settings and the corresponding normalized lower tracking bounds.

κ          µopt     γω,opt    γα,opt      LTBω0/σδ²   LTBα0/σδ²   LSBω0/σδ²   LSBα0/σδ²
10^-10     0.0472   0.00113   0.0000138   2.05·10^5   8.21·10^1   1.18·10^4   1.38·10^1
5·10^-10   0.0613   0.00192   0.0000306   9.09·10^4   6.28·10^1   5.28·10^3   1.05·10^1
10^-9      0.0685   0.00241   0.0000432   6.39·10^4   5.58·10^1   3.73·10^3   9.39
5·10^-9    0.0886   0.00407   0.0000955   2.82·10^4   4.26·10^1   1.67·10^3   7.18
10^-8      0.0990   0.00509   0.000134    1.97·10^4   3.79·10^1   1.18·10^3   6.40
5·10^-8    0.127    0.00852   0.000295    8.66·10^3   2.89·10^1   5.28·10^2   4.89
10^-7      0.142    0.0106    0.000414    6.06·10^3   2.57·10^1   3.73·10^2   4.36
5·10^-7    0.181    0.0177    0.000905    2.63·10^3   1.95·10^1   1.67·10^2   3.33
10^-6      0.201    0.0219    0.00126     1.83·10^3   1.73·10^1   1.18·10^2   2.96
It was found that there is a perfect agreement between
the lower tracking and smoothing bounds and the MSE values
obtained by minimizing (19)–(20) and (21)–(22), respectively; in
some cases the computed values agreed up to the sixth decimal
place. This means that, at least theoretically, the optimally tuned
GACF/GACS algorithms should be statistically efficient frequency
and frequency rate trackers/smoothers. In the next section we will
verify this statement using computer simulations.
According to the results given above, in the constant amplitude
case the frequency estimates are unambiguous — in spite
of the fact that the corresponding amplitude estimates are
complex-valued quantities and, as such, could potentially create
nonidentifiability problems. Although the same was also observed
in the time-varying-amplitude case, some caution is needed in
interpreting the quantities $\hat\omega_0(t)$, $\tilde\omega_0(t)$ and $\hat\beta_k(t)$, $\tilde\beta_k(t)$ unless
identifiability is formally proved. Note, however, that this potential
nonidentifiability problem does not extend to estimation of θ(t ),
which is our main interest here.
7. Simulation and experimental results
7.1. ALF-based analysis
To check the validity of the analytical expressions (19)–(20),
based on the approximating linear filter equations (13)–(14), the
following two-tap FIR system (inspired by channel equalization
applications) was simulated
y(t ) = θ1 (t )u(t ) + θ2 (t )u(t − 1) + v(t )
(25)
where $u(t)$ denotes a white 4-QAM [quadrature amplitude
modulation; see e.g. Giannakis and Tepedelenlioǧlu (1998)] input
sequence ($u(t) = \pm 1 \pm j$, $\sigma_u^2 = 2$) and $v(t)$ denotes a complex-valued
Gaussian measurement noise.
Each of the $n = 2$ impulse response coefficients had $K = 4$ modes
of variation; the system parameters varied according to

$$\theta(t) = \begin{bmatrix}\theta_1(t)\\ \theta_2(t)\end{bmatrix} = \sum_{k=1}^{4}\beta_{k0}\, e^{j[\nu_k + m_k\sum_{\tau=1}^{t}\omega_0(\tau)]}$$
The approximations (19)–(20) were checked for 3 values of the
signal-to-noise ratio: SNR = 0 dB ($\sigma_v = 5.5678$), SNR = 10 dB
($\sigma_v = 1.7607$) and SNR = 20 dB ($\sigma_v = 0.5568$), for 2 values of
the nonstationarity rate: $\kappa = 10^{-10}$ and $\kappa = 10^{-9}$, and for 10
values of the adaptation gain $\mu$, ranging from 0.01 to 0.1. To reduce
the number of design degrees of freedom, the two other gains
adopted for the GACF/GACS algorithms were set to $\gamma_\omega = \mu^2/2$ and
$\gamma_\alpha = \mu^3/8$, in agreement with the general tendency observed,
under different levels of effective SNR, for the optimal settings
(this rule of thumb was found to work quite well in practice).
The mean-squared frequency and frequency rate estimation errors
were evaluated (for the optimally tuned GACF algorithm) by means
of joint time and ensemble averaging. First, for each realization
of the measurement noise sequence and each realization of the
frequency trajectory, the mean-squared errors were computed
from 1000 iterations of the GACF filter (after the algorithm had
reached its steady state). The obtained results were next averaged
over 50 realizations of $\{\delta(t), v(t)\}$ and $\nu_1, \ldots, \nu_4$.
Figs. 1 and 2 show comparison of theoretical curves and
the time-averaged values of the mean-squared frequency and
frequency rate estimation errors obtained via simulation. Note the
good agreement between theoretical evaluations and the actual
algorithm’s performance, which can be observed for SNR ≥
10 dB. Generally, the degree of fit improves with decreasing κ and
increasing SNR, which is consistent with the operating range of the
approximating linear filter technique. Similar results, not reported
here, were obtained for the GACS algorithm.
7.2. Statistical efficiency
Fig. 3 shows a comparison of the theoretical values of the lower
frequency tracking bound LTBω0 and the lower frequency rate
tracking bound LTBα0 with experimental results obtained for an
optimally tuned GACF algorithm designed for the system described
in Section 7.1. In agreement with the results of theoretical analysis
presented in Section 6, for small rates of system nonstationarity
the proposed GACF algorithm is statistically efficient, i.e., under
the conditions specified earlier, it cannot be outperformed by any
other tracking algorithm. The same conclusion can be drawn after
inspection of the plots shown in Fig. 4, illustrating the behavior of
the optimally tuned GACS algorithm.
k=1
β10 = [2 − j, 1 + j2]T ,
β30 = [1, 1] ,
T
β20 = [1 − j, 1]T
β40 = [0.5, j0.5]T
where mk
= k, the phase shifts ν1 , . . . , ν4 were drawn
independently from the uniform distribution on [0, 2π ), and the
fundamental frequency ω0 (t ) was governed by the integrated
random-walk model [obeying (A4) and (A5)], starting from
ω0 (0) = π /4. Note that in this case ϕ(t ) = [u(t ), u(t − 1)]T , 8 =
I2 σu2 , b21 = 20, b22 = 6, b23 = 4, b24 = 1 and b20 = 96 (since
4
b2k = 31, there was a noticeable discrepancy between SNR
and effective SNR).
k =1
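The constants quoted for the system of Section 7.1 can be cross-checked with a few lines of arithmetic. The sketch below assumes that $b_k^2 = \beta_{k0}^H\Phi\beta_{k0}$ with $\Phi = \sigma_u^2 I_2$, that $b_0^2 = \sum_k m_k^2 b_k^2$, and that the SNR is defined as $\sum_k b_k^2/\sigma_v^2$; the last definition is an assumption, chosen because it reproduces the quoted noise levels. All function and variable names are illustrative.

```python
import math

# Amplitude vectors beta_k0 from Section 7.1 (two-tap FIR system, K = 4).
BETAS = [
    (2 - 1j, 1 + 2j),
    (1 - 1j, 1 + 0j),
    (1 + 0j, 1 + 0j),
    (0.5 + 0j, 0.5j),
]
M = [1, 2, 3, 4]   # harmonic multipliers m_k = k
SIGMA_U2 = 2.0     # variance of the 4-QAM input, so Phi = SIGMA_U2 * I


def mode_power(beta):
    """b_k^2 = beta^H Phi beta for Phi = sigma_u^2 * I."""
    return SIGMA_U2 * sum(abs(c) ** 2 for c in beta)


b2 = [mode_power(b) for b in BETAS]            # b_1^2 .. b_4^2
b0_2 = sum(m * m * p for m, p in zip(M, b2))   # b_0^2 = sum m_k^2 b_k^2
signal_power = sum(b2)                         # sum of b_k^2


def noise_std(snr_db):
    """sigma_v for which signal_power / sigma_v^2 equals the requested SNR (assumed definition)."""
    return math.sqrt(signal_power / 10 ** (snr_db / 10))


print("b_k^2:", b2, " b_0^2:", b0_2, " sum b_k^2:", signal_power)
print("sigma_v at 0/10/20 dB:", [round(noise_std(s), 4) for s in (0, 10, 20)])
```

Under these assumptions the script recovers $b_1^2,\ldots,b_4^2 \approx 20, 6, 4, 1$, $b_0^2 \approx 96$, $\sum_k b_k^2 \approx 31$ and the three values of $\sigma_v$ listed in Section 7.1, up to floating-point rounding.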
7.3. Performance
The aim of this simulation experiment was to compare the performance of the proposed GACF/GACS algorithms with that yielded by the unconstrained multiple-frequency versions of the GANF/GANS algorithms.
The simulated two-tap FIR system (25) was governed by (see Figs. 5 and 6)
\[
\theta(t) = \begin{bmatrix}\theta_1(t)\\ \theta_2(t)\end{bmatrix}
= \sum_{k=1}^{4} C(t)\,\beta_{k0}\, e^{j m_k\sum_{\tau=1}^{t}\omega_0(\tau)}
\]
where $C(t) = \mathrm{diag}\{\sin(\pi t/1000), \cos(\pi t/1000)\}$ and $\beta_{10} = [-1, -j]^T$, $\beta_{20} = [-0.112, j0.112]^T$, $\beta_{30} = [-0.0402, -j0.0402]^T$, $\beta_{40} = [-0.0207, j0.0207]^T$.
The fundamental frequency was changing sinusoidally according to
\[
\omega_0(t) = \frac{\pi}{10}\left[1 + \frac{1}{3}\sin\frac{\pi t}{1000}\right].
\]
Only the odd harmonics were present: $m_k = 2k - 1$, $k = 1,\ldots,4$. As in the previous experiments, the system was excited with a 4-QAM sequence.
Fig. 1. Average variance of the frequency estimation error (upper figure) and the frequency rate estimation error (lower figure) for a nonstationary FIR system with 4 frequency modes governed by the integrated random-walk model. The theoretical results (solid lines) are compared with simulation results obtained for the rate of nonstationarity $\kappa = 10^{-10}$ and 10 different values of $\mu$ ($\gamma_\omega = \mu^2/2$, $\gamma_\alpha = \mu^3/8$). The corresponding signal-to-noise ratios were equal to: SNR = 0 dB (◦), SNR = 10 dB (×) and SNR = 20 dB (+).
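The test system of Section 7.3 is simple enough to regenerate directly. The following is a minimal sketch, not the authors' simulation code: it builds $\omega_0(t)$, the parameter trajectory $\theta(t)$ and the noisy output $y(t)$ of (25). The random seed, the choice of $u(0)$ and the equal split of the noise variance between real and imaginary parts are assumptions made only for illustration.

```python
import cmath
import math
import random

# Amplitude vectors beta_k0 and harmonic multipliers quoted in Section 7.3.
BETAS = [(-1, -1j), (-0.112, 0.112j), (-0.0402, -0.0402j), (-0.0207, 0.0207j)]
M = [1, 3, 5, 7]  # only odd harmonics: m_k = 2k - 1


def omega0(t):
    """Sinusoidally varying fundamental frequency."""
    return (math.pi / 10) * (1 + math.sin(math.pi * t / 1000) / 3)


def simulate(n_samples, sigma_v=0.04, seed=0):
    """Return (thetas, ys) for t = 1..n_samples (seed is an arbitrary choice)."""
    rng = random.Random(seed)

    def qam():
        # White 4-QAM symbol u(t) = +-1 +- j.
        return complex(rng.choice((-1, 1)), rng.choice((-1, 1)))

    phase = 0.0      # running sum of omega0(tau), tau = 1..t
    u_prev = qam()   # u(0), needed by the second tap at t = 1
    thetas, ys = [], []
    for t in range(1, n_samples + 1):
        phase += omega0(t)
        # Diagonal entries of the amplitude-modulating matrix C(t).
        c = (math.sin(math.pi * t / 1000), math.cos(math.pi * t / 1000))
        theta = tuple(
            c[i] * sum(b[i] * cmath.exp(1j * m * phase) for b, m in zip(BETAS, M))
            for i in range(2)
        )
        u = qam()
        # Circular complex Gaussian noise with total variance sigma_v^2 (assumed split).
        s = sigma_v / math.sqrt(2)
        v = complex(rng.gauss(0, s), rng.gauss(0, s))
        ys.append(theta[0] * u + theta[1] * u_prev + v)
        thetas.append(theta)
        u_prev = u
    return thetas, ys


thetas, ys = simulate(4000)
```

Because $C(t)$ switches the two taps on and off in quadrature, $\theta_2(t)$ vanishes at $t = 500$ and $\theta_1(t)$ vanishes at $t = 1000$, which gives a quick sanity check of the generated trajectory.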
Fig. 7 shows a comparison of the mean-squared parameter tracking/smoothing errors yielded by the proposed GACF/GACS algorithms. All MSE values were obtained by means of joint time averaging (the evaluation interval [1001, 3000] was placed inside a wider analysis interval [1, 4000]) and ensemble averaging (50 realizations of measurement noise were used). The standard deviation of the noise was equal to $\sigma_v = 0.04$. For each value of $\mu$, the values of the adaptation gains $\gamma_\omega$ and $\gamma_\alpha$ were chosen using the rule of thumb described in Section 7.1. Note that the GACS algorithm yields uniformly better results than its GACF counterpart. Both GACF and GACS algorithms perform better than the corresponding multi-frequency GANF and GANS algorithms, respectively.
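The joint time and ensemble averaging used for the MSE values can be sketched as follows. The helper below is hypothetical: it assumes that the error measure is the squared Euclidean norm of the parameter error vector, averaged over an inclusive evaluation interval and over all realizations; the text does not spell out this detail.

```python
def joint_mse(true_traj, estimate_runs, start, stop):
    """Average squared parameter error over t = start..stop (1-based, inclusive)
    and over all realizations in estimate_runs.

    true_traj[t - 1] and run[t - 1] hold the parameter vectors at time t.
    """
    total = 0.0
    count = 0
    for run in estimate_runs:
        for t in range(start, stop + 1):
            total += sum(abs(a - b) ** 2 for a, b in zip(true_traj[t - 1], run[t - 1]))
            count += 1
    return total / count


# Toy usage: 50 synthetic "runs" with a constant error in the first coefficient.
true_traj = [(0j, 0j)] * 4000
runs = [[(0.1 + 0.2j, 0j)] * 4000 for _ in range(50)]
mse = joint_mse(true_traj, runs, 1001, 3000)
```

Placing the evaluation interval strictly inside the analysis interval, as done in the experiment, keeps start-up transients (and, for smoothers, end effects) out of the average.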
7.4. Estimation of MRI noise
Magnetic resonance imaging (MRI) equipment is used to visualize internal structures of the human body without exposing subjects to harmful radiation. It is utilized in many medical institutions for diagnostic purposes and, quite recently, as an aid during some operations; in the latter case it works in the nearly real-time mode (Kurumi et al., 2007).
Fig. 2. Average variance of the frequency estimation error (upper figure) and the frequency rate estimation error (lower figure) for a nonstationary FIR system with 4 frequency modes governed by the integrated random-walk model. The theoretical results (solid lines) are compared with simulation results obtained for the rate of nonstationarity $\kappa = 10^{-9}$ and 10 different values of $\mu$ ($\gamma_\omega = \mu^2/2$, $\gamma_\alpha = \mu^3/8$). The corresponding signal-to-noise ratios were equal to: SNR = 0 dB (◦), SNR = 10 dB (×) and SNR = 20 dB (+).
MRI devices generate very loud harmonic noise (with intensity exceeding 100 dB) caused by vibration of the gradient coil, which is due to the Lorentz force. Exposure to this noise is very annoying both for the patients and for the medical staff. MRI noise can be reduced using active noise control (ANC) techniques; the more accurately one can track the underlying multi-harmonic signal ($K > 30$), the better the attenuation. Since this signal is nonstationary (both the amplitudes and the fundamental frequency change over time), its estimation is a challenging task. Fig. 8 shows the time plots and periodograms of the original MRI noise (recorded in the axial mode), as well as the time plots and periodograms of the prediction errors
\[
\varepsilon(t) = y(t) - \widehat{s}(t|t-1) = y(t) - \sum_{k=1}^{K}\widehat{f}_k(t)\,\widehat{b}_k(t-1)
\]
yielded by the proposed adaptive comb filter (ACF) and by the multiple-frequency version of the adaptive notch filter (ANF). All results were obtained for $\mu = 0.01$, $\gamma_\omega = \mu^2/2$ and $\gamma_\alpha = \mu^3/8$. The complex-valued version of the MRI signal was obtained using the discrete Hilbert transform.
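For readers unfamiliar with this preprocessing step, a complex-valued (analytic) version of a real record can be formed by zeroing the negative-frequency half of its DFT and doubling the positive-frequency half. The sketch below shows this standard frequency-domain construction with a naive $O(N^2)$ DFT so that it stays dependency-free; it is an illustration only, and the text does not state which implementation of the discrete Hilbert transform was actually used.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (the inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[i] * cmath.exp(sign * 2j * math.pi * k * i / n) for i in range(n))
        out.append(s / n if inverse else s)
    return out


def analytic_signal(x):
    """Complex signal whose real part is x and whose imaginary part is the
    discrete Hilbert transform of x."""
    n = len(x)
    spectrum = dft(x)
    gain = [0.0] * n               # negative frequencies are removed
    gain[0] = 1.0                  # DC is kept once
    if n % 2 == 0:
        gain[n // 2] = 1.0         # Nyquist bin is kept once
        for k in range(1, n // 2):
            gain[k] = 2.0          # positive frequencies are doubled
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    return dft([g * s for g, s in zip(gain, spectrum)], inverse=True)


# Usage: a real cosine with an integer number of periods maps to a complex exponential.
n = 64
x = [math.cos(2 * math.pi * 4 * i / n) for i in range(n)]
z = analytic_signal(x)
```

For long recordings an FFT-based routine would be the practical choice; the naive transform is used here only to keep the example self-contained.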
While the time plots obtained for ANF and ACF look similar, the corresponding periodograms differ significantly. It is clear that the ACF is much more effective in suppressing signal harmonics than ANF, even though both algorithms used the same starting values. The failure of the ANF algorithm can be explained by its poor frequency matching capabilities: after some initial period the algorithm locks on dominant harmonics, leaving the remaining ones unattenuated.
Fig. 3. Comparison of the theoretical values of the lower frequency (upper figure) and frequency rate (lower figure) tracking bounds (solid lines) with experimental results obtained for the system with quasi-linear frequency changes for 3 different SNR values: SNR = 0 dB (◦), SNR = 10 dB (×), SNR = 20 dB (+), and 9 different values of the rate of system nonstationarity $\kappa$.
Fig. 4. Comparison of the theoretical values of the lower frequency (upper figure) and frequency rate (lower figure) smoothing bounds (solid lines) with experimental results obtained for the system with quasi-linear frequency changes, for 3 different SNR values: SNR = 0 dB (◦), SNR = 10 dB (×), SNR = 20 dB (+), and 9 different values of the rate of system nonstationarity $\kappa$.
8. Conclusion
The problem of identification of linear stochastic systems
with quasi-harmonically varying parameters was considered. Both
causal and noncausal identification algorithms were derived, referred to as generalized adaptive comb filters (GACFs) and generalized adaptive comb smoothers (GACSs), respectively. In both cases
the frequency and frequency rate estimation properties of the proposed algorithms were analyzed using the method of approximating linear filter. It was shown, and later confirmed by means of
computer simulations, that when the fundamental frequency of
parameter changes varies slowly with time according to the integrated random-walk model, the optimally tuned GACF/GACS algorithms are (under Gaussian assumptions) statistically efficient
frequency and frequency rate trackers/smoothers, i.e., they reach
the Cramér–Rao-type lower tracking/smoothing bounds — expressions allowing one to evaluate these bounds were also derived in
the paper.
Fig. 5. Real parts (solid lines) and imaginary parts (broken lines) of system
parameters observed in a short time interval.
Appendix A. Derivation of (13) and (14)
Denote by $\Delta\widehat{\theta}_k(t) = \theta_k(t) - \widehat{\theta}_k(t)$ the parameter estimation error and let $\Delta\widehat{x}_k(t) = \mathrm{Im}[\theta_k^H(t)\Phi\Delta\widehat{\theta}_k(t)/b_0^2]$. According to Tichavský and Händel (1995), when carrying out ALF analysis, one should neglect all terms of order higher than one in $\Delta\widehat{\omega}_k(t)$, $\Delta\widehat{\alpha}_k(t)$, $\Delta\widehat{\theta}_k(t)$, $\delta(t)$ and $v(t)$, including all cross-terms.
Fig. 6. Evolution of the instantaneous fundamental frequency.
Fig. 7. Comparison of the mean-squared parameter tracking (◦) and parameter smoothing (*) errors yielded by the generalized adaptive notch algorithms (upper figure) and generalized adaptive comb algorithms (lower figure) for the two-tap FIR system with sinusoidally varying fundamental frequency of parameter variation and sinusoidally varying amplitudes of all harmonics.
Fig. 8. Time plots (upper figures) and periodograms (lower figures) of the MRI noise (two top plots), the prediction errors yielded by the ACF algorithm (two middle plots), and the prediction errors yielded by the multiple-frequency ANF algorithm (two bottom plots).
To derive the recursion for $\Delta\widehat{x}(t) = \sum_{k=1}^{K} m_k\Delta\widehat{x}_k(t)$, note that
\[
\widehat{\theta}_k(t) = \zeta_k(t) + \mu\Phi^{-1}\phi^*(t)\varepsilon(t)
\]
\[
\varepsilon(t) = \phi^T(t)\sum_{k=1}^{K}\theta_k(t) + v(t) - \phi^T(t)\sum_{k=1}^{K}\zeta_k(t)
\]
where $\zeta_k(t) = e^{j[\widehat{\omega}_k(t-1)+\widehat{\alpha}_k(t-1)]}\,\widehat{\theta}_k(t-1)$ and
\[
\widehat{\alpha}_k(t) = m_k\widehat{\alpha}_0(t), \qquad \widehat{\omega}_k(t) = m_k\widehat{\omega}_0(t). \tag{26}
\]
Therefore
\[
\widehat{\theta}_k(t) = [I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)]\,\zeta_k(t) + \mu\Phi^{-1}\phi^*(t)\phi^T(t)\theta_k(t) + \mu\Phi^{-1}\phi^*(t)v(t) + \xi_k(t)
\]
where
\[
\xi_k(t) = \mu\Phi^{-1}\phi^*(t)\phi^T(t)\sum_{i=1,\, i\neq k}^{K}[\theta_i(t) - \zeta_i(t)].
\]
Since in the case considered $\theta_k(t) = e^{j\omega_k(t)}\theta_k(t-1)$, the last relationship leads to
\[
\Delta\widehat{\theta}_k(t) = [I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)]\,\theta_k(t) - [I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)]\,\zeta_k(t) - \mu\Phi^{-1}\phi^*(t)v(t) - \xi_k(t). \tag{27}
\]
Note that $\zeta_k(t)$ can be rewritten in the form $\zeta_k(t) = e^{j\omega_k(t)}e^{-j\Delta\widehat{\omega}_k(t-1)}e^{-j\Delta\widehat{\alpha}_k(t-1)}[\theta_k(t-1) - \Delta\widehat{\theta}_k(t-1)]$. Using the approximations $e^{-j\Delta\widehat{\omega}_k(t-1)} \cong 1 - j\Delta\widehat{\omega}_k(t-1)$ and $e^{-j\Delta\widehat{\alpha}_k(t-1)} \cong 1 - j\Delta\widehat{\alpha}_k(t-1)$, which hold for small frequency and frequency rate errors, respectively, and applying the ALF rules, one arrives at
\[
\begin{aligned}
\zeta_k(t) &= e^{j\omega_k(t)}e^{-j\Delta\widehat{\omega}_k(t-1)}e^{-j\Delta\widehat{\alpha}_k(t-1)}[\theta_k(t-1) - \Delta\widehat{\theta}_k(t-1)]\\
&\cong e^{j\omega_k(t)}[1 - j\Delta\widehat{\omega}_k(t-1)][1 - j\Delta\widehat{\alpha}_k(t-1)][\theta_k(t-1) - \Delta\widehat{\theta}_k(t-1)]\\
&\cong e^{j\omega_k(t)}[1 - j\Delta\widehat{\omega}_k(t-1) - j\Delta\widehat{\alpha}_k(t-1)][\theta_k(t-1) - \Delta\widehat{\theta}_k(t-1)]\\
&\cong \theta_k(t) - e^{j\omega_k(t)}\Delta\widehat{\theta}_k(t-1) - j[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]\,\theta_k(t).
\end{aligned} \tag{28}
\]
Combining (27) with (28), one obtains
\[
\begin{aligned}
\Delta\widehat{\theta}_k(t) \cong{}& [I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)]\,e^{j\omega_k(t)}\Delta\widehat{\theta}_k(t-1)\\
&+ j[I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)][\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]\,\theta_k(t)\\
&- \mu\Phi^{-1}\phi^*(t)v(t) - \xi_k(t).
\end{aligned} \tag{29}
\]
Let $\Delta\widehat{\theta}_k(t) = \Delta\bar{\theta}_k(t)f_k(t)$. After multiplying both sides of (29) with $f_k^*(t)$, one arrives at
\[
\begin{aligned}
\Delta\bar{\theta}_k(t) \cong{}& [I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)]\,\Delta\bar{\theta}_k(t-1)\\
&+ j[I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)][\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]\,\beta_k\\
&- \mu\Phi^{-1}\phi^*(t)f_k^*(t)v(t) - \xi_k(t)f_k^*(t).
\end{aligned} \tag{30}
\]
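The first-order (ALF) linearization used to pass from the exact product form of $\zeta_k(t)$ to (28) can be checked numerically. The sketch below is an illustration with scalar stand-ins for the vector quantities (the function names and the numerical values are arbitrary assumptions): it compares the exact expression with the linearized one and confirms that the discrepancy is of second order in the errors.

```python
import cmath


def zeta_exact(theta_prev, d_theta, omega, d_omega, d_alpha):
    """Exact product form of zeta_k(t), scalar case."""
    rot = cmath.exp(1j * omega) * cmath.exp(-1j * d_omega) * cmath.exp(-1j * d_alpha)
    return rot * (theta_prev - d_theta)


def zeta_first_order(theta_prev, d_theta, omega, d_omega, d_alpha):
    """Right-hand side of (28): all terms of order higher than one are dropped."""
    theta_now = cmath.exp(1j * omega) * theta_prev   # theta_k(t)
    return theta_now - cmath.exp(1j * omega) * d_theta - 1j * (d_omega + d_alpha) * theta_now


def linearization_error(scale):
    # Hypothetical error magnitudes, scaled together to expose the error order.
    args = (2 - 1j, scale * 1e-3 * (1 + 1j), 0.7, scale * 1e-3, scale * 1e-4)
    return abs(zeta_exact(*args) - zeta_first_order(*args))


err_full = linearization_error(1.0)
err_half = linearization_error(0.5)
print(err_full, err_half, err_full / err_half)
```

Halving all error terms reduces the discrepancy by a factor close to four, as expected when only second-order terms (including cross-terms) have been neglected.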
For small values of the adaptation gains $\mu$, $\gamma_\omega$ and $\gamma_\alpha$, the quantities $\Delta\bar{\theta}_k(t)$, $\Delta\widehat{\omega}_k(t)$ and $\Delta\widehat{\alpha}_k(t)$ change slowly compared to $\phi(t)$ and $f_k(t)$, $k = 1,\ldots,K$, i.e., (30) can be regarded as a two-scale difference equation. When solving such an equation for slowly varying quantities, one is allowed to replace some functionals of the fast varying quantities with their time averages. This is usually referred to as a deterministic averaging technique.
Denote by $\langle x(t)\rangle_T = (1/T)\sum_{i=0}^{T-1}x(t-i)$ the local average of $x(t)$, and by $\langle x(t)\rangle_\infty = \lim_{T\to\infty}\langle x(t)\rangle_T$ the corresponding limiting value (provided it exists).
We will exploit the fact that for any process $\{\phi(t)\}$ obeying (A2), and for sufficiently large values of $T$, it holds that $\langle\phi^*(t)\phi^T(t)\rangle_T \cong \langle\phi^*(t)\phi^T(t)\rangle_\infty = \Phi$ and $\langle\phi^*(t)\phi^T(t)e^{j\omega t}\rangle_T \cong \langle\phi^*(t)\phi^T(t)e^{j\omega t}\rangle_\infty = 0$, $\forall\omega \neq 0$. According to the first relationship, the data-dependent matrix $I - \mu\Phi^{-1}\phi^*(t)\phi^T(t)$ in (30) can be replaced with $I - \mu\Phi^{-1}\langle\phi^*(t)\phi^T(t)\rangle_T \cong (1-\mu)I$. Similarly, according to the second relationship, since for sufficiently slow frequency variations $f_k(t)$, $\widehat{f}_k(t)$, $k = 1,\ldots,K$, are locally almost periodic functions of time, it holds that $\langle\xi_k(t)f_k^*(t)\rangle_T \cong 0$, allowing one to neglect the last term on the right hand side of (30). Hence, using the averaging technique, one arrives at the following approximate error equation
\[
\Delta\bar{\theta}_k(t) \cong (1-\mu)\Delta\bar{\theta}_k(t-1) + j(1-\mu)[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]\,\beta_k - \mu\Phi^{-1}\phi^*(t)f_k^*(t)v(t). \tag{31}
\]
Multiplying both sides of (31) with $\beta_k^H\Phi$, one obtains
\[
\beta_k^H\Phi\Delta\bar{\theta}_k(t) \cong \lambda\beta_k^H\Phi\Delta\bar{\theta}_k(t-1) + j\lambda[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]\,\beta_k^H\Phi\beta_k - \mu\beta_k^H\phi^*(t)f_k^*(t)v(t) \tag{32}
\]
where $\lambda = 1 - \mu$. Finally, dividing both sides of Eq. (32) by $b_0^2$, taking imaginary parts, and noting that $\beta_k^H\Phi\Delta\bar{\theta}_k(t) = \theta_k^H(t)\Phi\Delta\widehat{\theta}_k(t)$ and $\beta_k^Hf_k^*(t) = \theta_k^H(t)$, one arrives at
\[
\Delta\widehat{x}_k(t) \cong \lambda\Delta\widehat{x}_k(t-1) + \frac{\lambda b_k^2}{b_0^2}[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)] + \mu e_k(t)
\]
where $e_k(t) = -\mathrm{Im}[\beta_k^H\phi^*(t)f_k^*(t)v(t)/b_0^2]$. Note that $\Delta\widehat{x}(t) = \sum_{k=1}^{K}m_k\Delta\widehat{x}_k(t)$, $e_0(t) = \sum_{k=1}^{K}m_ke_k(t)$ and $b_0^2 = \sum_{k=1}^{K}m_k^2b_k^2$. Hence, after incorporating (26), one obtains
\[
\Delta\widehat{x}(t) \cong \lambda\Delta\widehat{x}(t-1) + \lambda[\Delta\widehat{\omega}_0(t-1) + \Delta\widehat{\alpha}_0(t-1)] + \mu e_0(t). \tag{33}
\]
To derive recursions for $\Delta\widehat{\omega}_0(t)$ and $\Delta\widehat{\alpha}_0(t)$, note that in the tracking mode it holds that
\[
g(t) = \frac{\sum_{k=1}^{K}m_k\,\mathrm{Im}[\varepsilon^*(t)\phi^T(t)\widehat{f}_k(t)\widehat{\beta}_k(t-1)]}{\sum_{k=1}^{K}m_k^2\,\widehat{\beta}_k^H(t-1)\Phi\widehat{\beta}_k(t-1)}
\cong \frac{1}{b_0^2}\sum_{k=1}^{K}m_k\,\mathrm{Im}[\varepsilon^*(t)\phi^T(t)\zeta_k(t)]
\]
and
\[
\varepsilon(t) = \phi^T(t)\theta_k(t) - \phi^T(t)\zeta_k(t) + \phi^T(t)\psi_k(t) + v(t), \qquad
\psi_k(t) = \sum_{i=1,\, i\neq k}^{K}[\theta_i(t) - \zeta_i(t)].
\]
Furthermore, $\mathrm{Im}[\varepsilon^*(t)\phi^T(t)\zeta_k(t)] = \mathrm{Im}[z_k(t) - |\phi^T(t)\zeta_k(t)|^2] = \mathrm{Im}[z_k(t)]$, where $z_k(t) = \theta_k^H(t)\phi^*(t)\phi^T(t)\zeta_k(t) + \psi_k^H(t)\phi^*(t)\phi^T(t)\zeta_k(t) + v^*(t)\phi^T(t)\theta_k(t)$.
Using (28) and applying the ALF rules, one obtains the following approximation
\[
\begin{aligned}
\theta_k^H(t)\phi^*(t)\phi^T(t)\zeta_k(t) \cong{}& \theta_k^H(t)\phi^*(t)\phi^T(t)\theta_k(t) - \theta_k^H(t-1)\phi^*(t)\phi^T(t)\Delta\widehat{\theta}_k(t-1)\\
&- j\theta_k^H(t)\phi^*(t)\phi^T(t)\theta_k(t)[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]\\
={}& \beta_k^H\phi^*(t)\phi^T(t)\beta_k - \beta_k^H\phi^*(t)\phi^T(t)\Delta\bar{\theta}_k(t-1)\\
&- j\beta_k^H\phi^*(t)\phi^T(t)\beta_k[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)]
\end{aligned}
\]
which, after averaging, leads to
\[
z_k(t) \cong \beta_k^H\Phi\beta_k - j\beta_k^H\Phi\beta_k[\Delta\widehat{\omega}_k(t-1) + \Delta\widehat{\alpha}_k(t-1)] - \beta_k^H\Phi\Delta\bar{\theta}_k(t-1) + v^*(t)\phi^T(t)\theta_k(t).
\]
Since $\beta_k^H\Phi\beta_k = b_k^2$, one arrives at
\[
g(t) = \mathrm{Im}[z(t)/b_0^2] \cong -\Delta\widehat{x}(t-1) - \Delta\widehat{\omega}_0(t-1) - \Delta\widehat{\alpha}_0(t-1) + e_0(t)
\]
where $z(t) = \sum_{k=1}^{K}m_kz_k(t)$. Note that
\[
\Delta\widehat{\alpha}_0(t) = \Delta\widehat{\alpha}_0(t-1) + \delta(t) + \gamma_\alpha g(t)
\]
\[
\Delta\widehat{\omega}_0(t) = \Delta\widehat{\omega}_0(t-1) + \Delta\widehat{\alpha}_0(t-1) + \gamma_\omega g(t).
\]
Combining the last three equations, one arrives at
\[
\Delta\widehat{\alpha}_0(t) \cong (1-\gamma_\alpha)\Delta\widehat{\alpha}_0(t-1) + \delta(t) + \gamma_\alpha e_0(t) - \gamma_\alpha\Delta\widehat{\omega}_0(t-1) - \gamma_\alpha\Delta\widehat{x}(t-1) \tag{34}
\]
\[
\Delta\widehat{\omega}_0(t) \cong (1-\gamma_\omega)\Delta\widehat{\omega}_0(t-1) + (1-\gamma_\omega)\Delta\widehat{\alpha}_0(t-1) + \gamma_\omega e_0(t) - \gamma_\omega\Delta\widehat{x}(t-1). \tag{35}
\]
Finally, solving the set of linear equations (33)–(35) for $\Delta\widehat{\omega}_0(t)$ and $\Delta\widehat{\alpha}_0(t)$, one obtains (13) and (14), respectively.
Appendix B. Derivation of (16)
The relationship $E[e_k(t)e_l(t)] = 0$, $\forall k \neq l$ stems from the fact that $\{v(t)\}$ is a sequence of zero-mean independent random variables, independent of $\{\phi(t)\}$. To arrive at the expression for $\rho_{kl}(t)$ we will introduce the following notation
\[
v(t) = v_R(t) + jv_I(t), \qquad \eta_k(t) = \beta_k^T\phi(t) = \eta_{kR}(t) + j\eta_{kI}(t)
\]
where the subscript R/I denotes the real/imaginary part of a complex variable.
Note that
\[
\begin{aligned}
e_k(t) &= -\mathrm{Im}[\eta_k^*(t)e^{-j\varphi_k(t)}v(t)/b_0^2]\\
&= \frac{1}{b_0^2}\Big\{[\eta_{kR}(t)v_R(t) + \eta_{kI}(t)v_I(t)]\sin\varphi_k(t) - [\eta_{kR}(t)v_I(t) - \eta_{kI}(t)v_R(t)]\cos\varphi_k(t)\Big\}.
\end{aligned}
\]
Since the sequence $\{v(t)\}$ is circular, it holds that $E[v_R^2(t)] = E[v_I^2(t)] = \sigma_v^2/2$ and $E[v_R(t)v_I(t)] = 0$. Using these relationships, and the fact that the process $\{v(t)\}$ is independent of $\{\phi(t)\}$, one arrives at
\[
\begin{aligned}
E_v[e_k(t)e_l(t)] = \frac{\sigma_v^2}{2b_0^4}\Big\{&[\eta_{kR}(t)\eta_{lR}(t) + \eta_{kI}(t)\eta_{lI}(t)]\cos[\varphi_k(t) - \varphi_l(t)]\\
&+ [\eta_{kR}(t)\eta_{lI}(t) + \eta_{kI}(t)\eta_{lR}(t)]\sin[\varphi_k(t) - \varphi_l(t)]\Big\}.
\end{aligned} \tag{36}
\]
After elementary but tedious calculations, one can show that
\[
\begin{aligned}
E_\phi[\eta_{kR}(t)\eta_{lR}(t) + \eta_{kI}(t)\eta_{lI}(t)] &= \mathrm{Re}[\beta_k^H\Phi\beta_l]\\
E_\phi[\eta_{kR}(t)\eta_{lI}(t) + \eta_{kI}(t)\eta_{lR}(t)] &= -\mathrm{Im}[\beta_k^H\Phi\beta_l].
\end{aligned} \tag{37}
\]
Finally, combining (36) with (37), one obtains
\[
\begin{aligned}
E[e_k(t)e_l(t)] &= E_\phi\{E_v[e_k(t)e_l(t)]\}\\
&= \frac{\sigma_v^2}{2b_0^4}\Big\{\mathrm{Re}[\beta_k^H\Phi\beta_l]\cos[\varphi_k(t) - \varphi_l(t)] - \mathrm{Im}[\beta_k^H\Phi\beta_l]\sin[\varphi_k(t) - \varphi_l(t)]\Big\} = \rho_{kl}(t).
\end{aligned}
\]
Appendix C. Computation of lower tracking/smoothing bounds
In this appendix, we will derive expressions for theoretical upper bounds that limit tracking/smoothing capabilities of any causal/noncausal frequency and frequency rate estimation algorithms applied to quasi-periodically varying systems with quasi-linear frequency changes. The corresponding lower tracking bounds (LTBs) and lower smoothing bounds (LSBs) belong to the class of posterior (or Bayesian) Cramér–Rao bounds, applicable to signals/systems with random parameters.$^4$
$^4$ When the estimated quantities are stochastic variables, rather than unknown deterministic constants, the classical Cramér–Rao inequality does not apply.
Denote by $u$, $y$ and $c_0$ the vectors of system inputs (regarded as a known deterministic sequence, e.g. a particular realization of a stochastic process), noisy outputs, and fixed initial conditions, respectively, and let $\widehat{x}(y, u, c_0)$ be an estimator of a real-valued random parameter vector $x$ based on $(y, u, c_0)$. Then, under weak regularity conditions, one can show that (van Trees, 1968)
\[
E[(\widehat{x}(y,u,c_0) - x)(\widehat{x}(y,u,c_0) - x)^T \,|\, u, c_0] \geq J^{-1}(u, c_0)
\]
where
\[
J(u, c_0) = -E\left[\frac{\partial^2\log p(y,x|u,c_0)}{\partial x\,\partial x^T}\right]
= E\left[\frac{\partial\log p(y,x|u,c_0)}{\partial x}\,\frac{\partial\log p(y,x|u,c_0)}{\partial x^T}\right]
\]
and $p(y,x|u,c_0) = p(y|x,u,c_0)\,p(x)$ is the joint probability density function of the pair $(y, x)$ given $u$ and $c_0$.
When the input signal is a stochastic process, initial conditions are random, and averaging is extended to all realizations of $u$ and $c_0$, one obtains the following result
\[
E[(\widehat{x}(y,u,c_0) - x)(\widehat{x}(y,u,c_0) - x)^T] \geq E[J^{-1}(u,c_0)] \geq \{E[J(u,c_0)]\}^{-1} = \bar{J}^{-1}
\]
where the second transition stems from Jensen’s inequality for matrices, see Olkin and Pratt (1958).
In the case considered, let $x_t = [\alpha_0(1),\ldots,\alpha_0(t)]^T$, $y_t = [y(1),\ldots,y(t)]^T$ and $u_t = [\phi^T(1),\ldots,\phi^T(t)]^T$.
To simplify further analysis, we will assume, in addition to (A1)–(A5), that the complex-valued ‘amplitudes’ can be written down in the form $\beta_k = \beta_{k0}e^{j\nu_k}$, $k = 1, 2,\ldots,K$, where $\beta_{k0}$ are fixed (deterministic) complex-valued vectors and $\nu_k$ are mutually independent random phase shifts, distributed uniformly over the interval $[0, 2\pi)$. Note that under the last assumption it holds that $E[\beta_k\beta_l^H] = O$, $\forall k \neq l$. Furthermore, we will assume that $\alpha_0(0)$ is uniformly distributed over $[\alpha_{\min}, \alpha_{\max}]$, and that $\omega_0(0)$ is a known deterministic quantity. Hence, the vector of initial conditions can be specified as $c_0 = [\beta_{10},\ldots,\beta_{K0}, \nu_1,\ldots,\nu_K, \alpha_0(0), \omega_0(0)]^T$, where the quantities $\beta_{10},\ldots,\beta_{K0}$ and $\omega_0(0)$ are deterministic, and the quantities $\nu_1,\ldots,\nu_K$ and $\alpha_0(0)$ are stochastic.
First of all, note that
\[
\log p(y_t, x_t|u_t, c_0) = \log p(y_t|x_t, u_t, c_0) + \log p(x_t).
\]
Since, under the assumptions listed above, the vectors $x_t$ and $c_0$ fully determine $\theta(\tau)$, $\tau = 1,\ldots,t$: $\theta(\tau) = \sum_{k=1}^{K}\beta_ke^{jm_k\varphi_0(\tau)}$, where
\[
\varphi_0(\tau) = \sum_{n=1}^{\tau}\omega_0(n) = \omega_0(0) + \sum_{n=1}^{\tau}\sum_{m=1}^{n-1}\alpha_0(m), \tag{38}
\]
one arrives at (for normally distributed $\{v(\tau)\}$)
\[
\begin{aligned}
\log p(y_t|x_t, u_t, c_0) &= \log p[y_t|\theta(1), \phi(1),\ldots,\theta(t), \phi(t)]\\
&= c_1 - \frac{1}{\sigma_v^2}\sum_{\tau=1}^{t}|v(\tau)|^2
= c_1 - \frac{1}{\sigma_v^2}\sum_{\tau=1}^{t}|y(\tau) - \phi^T(\tau)\theta(\tau)|^2
\end{aligned} \tag{39}
\]
where $c_1$ is a constant independent of $x_t$.
Differentiating (39) with respect to $\alpha_0(m)$, one obtains
\[
\frac{\partial\log p(y_t|x_t,u_t,c_0)}{\partial\alpha_0(m)}
= \frac{2}{\sigma_v^2}\sum_{\tau=1}^{t}\mathrm{Re}\left\{[y(\tau) - \phi^T(\tau)\theta(\tau)]^*\phi^T(\tau)\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}\right\}
\]
and
\[
\begin{aligned}
\frac{\partial^2\log p(y_t|x_t,u_t,c_0)}{\partial\alpha_0(m)\,\partial\alpha_0(n)}
&= \frac{2}{\sigma_v^2}\sum_{\tau=1}^{t}\mathrm{Re}\left\{-\frac{\partial\theta^H(\tau)}{\partial\alpha_0(n)}\phi^*(\tau)\phi^T(\tau)\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}
+ [y(\tau) - \phi^T(\tau)\theta(\tau)]^*\phi^T(\tau)\frac{\partial^2\theta(\tau)}{\partial\alpha_0(m)\,\partial\alpha_0(n)}\right\}\\
&= \frac{2}{\sigma_v^2}\sum_{\tau=1}^{t}\mathrm{Re}\left\{-\frac{\partial\theta^H(\tau)}{\partial\alpha_0(n)}\phi^*(\tau)\phi^T(\tau)\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}
+ v^*(\tau)\phi^T(\tau)\frac{\partial^2\theta(\tau)}{\partial\alpha_0(m)\,\partial\alpha_0(n)}\right\}
\end{aligned}
\]
where the last transition stems from $y(\tau) = \phi^T(\tau)\theta(\tau) + v(\tau)$.
This leads to
\[
\begin{aligned}
E\left[\frac{\partial^2\log p(y_t|x_t,u_t,c_0)}{\partial\alpha_0(m)\,\partial\alpha_0(n)}\right]
&= -\frac{2}{\sigma_v^2}\sum_{\tau=1}^{t}\mathrm{Re}\left\{E\left[\frac{\partial\theta^H(\tau)}{\partial\alpha_0(n)}\phi^*(\tau)\phi^T(\tau)\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}\right]\right\}\\
&= -\frac{2}{\sigma_v^2}\sum_{\tau=1}^{t}E\left[\frac{\partial\theta^H(\tau)}{\partial\alpha_0(n)}\,\Phi\,\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}\right].
\end{aligned}
\]
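Using (38), one arrives at
\[
\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}
= j\sum_{k=1}^{K}m_k\beta_ke^{jm_k\varphi_0(\tau)}\sum_{s=1}^{\tau}\sum_{l=1}^{s-1}\delta_{m,l}
= j\sum_{k=1}^{K}m_k\beta_ke^{jm_k\varphi_0(\tau)}\max(\tau - m, 0)
\]
where $\delta_{m,l} = \{0 \text{ if } m \neq l,\ 1 \text{ if } m = l\}$ denotes the Kronecker delta.
This allows one to reach
\[
\begin{aligned}
E\left[\frac{\partial\theta^H(\tau)}{\partial\alpha_0(n)}\,\Phi\,\frac{\partial\theta(\tau)}{\partial\alpha_0(m)}\right]
&= \sum_{k=1}^{K}m_k^2\,\beta_k^H\Phi\beta_k\max(\tau - m, 0)\max(\tau - n, 0)\\
&= \sum_{k=1}^{K}m_k^2b_k^2\max(\tau - m, 0)\max(\tau - n, 0),
\end{aligned}
\]
where the cross-terms $\beta_k^H\Phi\beta_l = e^{j(\nu_l - \nu_k)}\beta_{k0}^H\Phi\beta_{l0}$, $k \neq l$, average to zero due to independence of the phase shifts $\nu_k$ and $\nu_l$.
Therefore
\[
-E\left[\frac{\partial^2\log p(y_t|x_t,u_t,c_0)}{\partial x_t\,\partial x_t^T}\right] = \frac{2b_0^2}{\sigma_v^2}A_t \tag{40}
\]
where $[A_t]_{mn} = \sum_{\tau=1}^{t}\max(\tau - m, 0)\cdot\max(\tau - n, 0)$.
In an analogous way, one can derive the second component of the generalized Fisher matrix. First, note that
\[
\begin{aligned}
\log p(x_t) &= \log p[\alpha_0(1), \alpha_0(2) - \alpha_0(1),\ldots,\alpha_0(t) - \alpha_0(t-1)]\\
&= c_2 + c_3 - \frac{1}{2\sigma_\delta^2}\sum_{\tau=2}^{t}\delta^2(\tau)
= c_2 + c_3 - \frac{1}{2\sigma_\delta^2}\sum_{\tau=2}^{t}[\alpha_0(\tau) - \alpha_0(\tau-1)]^2
\end{aligned} \tag{41}
\]
where $c_2 = \log[1/(\alpha_{\max} - \alpha_{\min})]$ and $c_3$ are constants independent of $x_t$.
Differentiation of (41) results in
\[
\begin{aligned}
\frac{\partial\log p(x_t)}{\partial\alpha_0(m)}
&= -\frac{1}{2\sigma_\delta^2}\sum_{\tau=2}^{t}\frac{\partial}{\partial\alpha_0(m)}[\alpha_0(\tau) - \alpha_0(\tau-1)]^2\\
&= \frac{1}{\sigma_\delta^2}
\begin{cases}
\alpha_0(2) - \alpha_0(1) & \text{for } m = 1\\
\alpha_0(m+1) - 2\alpha_0(m) + \alpha_0(m-1) & \text{for } 1 < m < t\\
\alpha_0(t-1) - \alpha_0(t) & \text{for } m = t
\end{cases}\\
&= \frac{1}{\sigma_\delta^2}
\begin{cases}
\delta(2) & \text{for } m = 1\\
\delta(m+1) - \delta(m) & \text{for } 1 < m < t\\
-\delta(t) & \text{for } m = t
\end{cases}
\end{aligned}
\]
which leads to
\[
E\left[\frac{\partial\log p(x_t)}{\partial x_t}\,\frac{\partial\log p(x_t)}{\partial x_t^T}\right] = \frac{1}{\sigma_\delta^2}B_t \tag{42}
\]
where
\[
B_t = \begin{bmatrix}
1 & -1 & 0 & \cdots & 0\\
-1 & 2 & -1 & \cdots & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
0 & \cdots & -1 & 2 & -1\\
0 & \cdots & 0 & -1 & 1
\end{bmatrix}.
\]
Combining (40) with (42), one obtains
\[
\bar{J}_t = \frac{2b_0^2}{\sigma_v^2}A_t + \frac{1}{\sigma_\delta^2}B_t = \frac{1}{\sigma_\delta^2}[2\kappa A_t + B_t].
\]
The asymptotic (steady-state) bounds on accuracy of frequency and frequency rate estimates can be obtained from
\[
\mathrm{LTB}_{\omega_0} = \lim_{t\to\infty}\inf_{\widehat{\omega}_0(\cdot)}E\{[\omega_0(t) - \widehat{\omega}_0(t)]^2\} = \lim_{t\to\infty}b_t^T\bar{J}_t^{-1}b_t
\]
\[
\mathrm{LTB}_{\alpha_0} = \lim_{t\to\infty}\inf_{\widehat{\alpha}_0(\cdot)}E\{[\alpha_0(t) - \widehat{\alpha}_0(t)]^2\} = \lim_{t\to\infty}\big[\bar{J}_t^{-1}\big]_{tt}
\]
where $b_t^T = [1_{t-1}^T, 0]$, and $1_t$ denotes the vector of ones of length $t$. The analogous expressions for lower smoothing bounds read
\[
\mathrm{LSB}_{\omega_0} = \lim_{t\to\infty}\inf_{\widetilde{\omega}_0(\cdot)}E\{[\omega_0(t) - \widetilde{\omega}_0(t)]^2\} = \lim_{t\to\infty}c_t^T\bar{J}_{2t}^{-1}c_t
\]
\[
\mathrm{LSB}_{\alpha_0} = \lim_{t\to\infty}\inf_{\widetilde{\alpha}_0(\cdot)}E\{[\alpha_0(t) - \widetilde{\alpha}_0(t)]^2\} = \lim_{t\to\infty}\big[\bar{J}_{2t}^{-1}\big]_{tt}
\]
where $c_t^T = [1_{t-1}^T, 0_{t+1}^T]$ and $0_t$ denotes the vector of zeros of length $t$. The values of LTB and LSB, shown in Table 1, were computed numerically for $t$ ranging from 100 to 600 (the convergence is slower for smaller values of $\kappa$).
References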
Bakkoury, J., Roviras, D., Ghogho, M., & Castanie, F. (2000). Adaptive MLSE receiver
over rapidly fading channels. Signal Processing, 80, 1347–1360.
Giannakis, G. B., & Tepedelenlioǧlu, C. (1998). Basis expansion models and diversity
techniques for blind identification and equalization of time-varying channels.
Proceedings of the IEEE, 86, 1969–1986.
James, B., Anderson, B. D. O., & Williamson, R. C. (1994). Conditional mean and
maximum likelihood approaches to multiharmonic frequency estimation. IEEE
Transactions on Signal Processing, 42, 1366–1375.
Jury, M. (1964). Theory and application of the Z-transform method. New York: Wiley.
Kurumi, Y., Tani, T., Naka, S., Shiomi, H., Shimizu, T., Abe, K., et al. (2007). MR-guided
microwave ablation for malignancies. International Journal of Clinical Oncology,
12, 85–93.
Nehorai, A., & Porat, B. (1986). Adaptive comb filtering for harmonic signal
enhancement. IEEE Transactions on Acoustics, Speech and Signal Processing, 34,
1124–1138.
Neimark, J. I. (2003). Mathematical models in natural sciences and engineering.
Springer.
Niedźwiecki, M., & Kaczmarek, P. (2004). Generalized adaptive notch filters. In Proc.
2004 IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (pp. 657–660). Montreal,
Canada.
Niedźwiecki, M., & Kaczmarek, P. (2005a). Estimation and tracking of quasi-periodically varying systems. Automatica, 41, 1503–1516.
Niedźwiecki, M., & Kaczmarek, P. (2005b). Identification of quasi-periodically
varying systems using the combined nonparametric/parametric approach. IEEE
Transactions on Signal Processing, 53, 4588–4598.
Niedźwiecki, M., & Meller, M. (2011a). Identification of quasi-periodically varying
systems with quasi-linear frequency changes. In Proc. 18th IFAC World Congress
(pp. 9070–9078). Milano, Italy.
Niedźwiecki, M., & Meller, M. (2011b). New algorithms for adaptive notch
smoothing. IEEE Transactions on Signal Processing, 59, 2024–2037.
Olkin, I., & Pratt, J. (1958). A multivariate Tchebycheff inequality. Annals of
Mathematical Statistics, 29, 226–234.
Regalia, P. A. (1995). Adaptive IIR filtering in signal processing and control. New York:
Marcel Dekker.
Tichavský, P., & Händel, P. (1995). Two algorithms for adaptive retrieval of slowly
time-varying multiple cisoids in noise. IEEE Transactions on Signal Processing, 43,
1116–1127.
Tsatsanis, M. K., & Giannakis, G. B. (1996). Modeling and equalization of rapidly
fading channels. International Journal of Adaptive Control and Signal Processing,
10, 159–176.
van Trees, H. (1968). Detection, estimation and modulation theory. New York: Wiley.
Maciej Niedźwiecki was born in Poznań, Poland in 1953.
He received the M.Sc. and Ph.D. degrees from the Gdańsk
University of Technology, Gdańsk, Poland, and the Dr. Hab.
(D.Sc.) degree from the Technical University of Warsaw,
Warsaw, Poland, in 1977, 1981 and 1991, respectively.
He spent three years as a Research Fellow with the
Department of Systems Engineering, Australian National
University from 1986 to 1989. From 1990 to 1993 he
served as a Vice-Chairman of the Technical Committee
on Theory of the International Federation of Automatic
Control (IFAC). He is currently a Professor and Head of the
Department of Automatic Control, Faculty of Electronics, Telecommunications and
Computer Science, Gdańsk University of Technology. His main areas of research
interest include system identification, statistical signal processing and adaptive
systems. He is the author of the book Identification of Time-varying Processes (Wiley,
2000).
Dr. Niedźwiecki is currently Associate Editor for IEEE Transactions on Signal
Processing, a member of the IFAC committees on Modeling, Identification and Signal
Processing and on Large Scale Complex Systems, and a member of the Automatic
Control and Robotics Committee of the Polish Academy of Sciences (PAN).
Michał Meller received the M.Sc. and Ph.D. degrees in Automatic Control from the Gdańsk University of Technology,
Gdańsk, Poland, in 2007 and 2010, respectively. Since 2007
he has been working in the Department of Signal and Information Processing, Bumar Elektronika, Gdańsk Division. In
2010 he also joined the Department of Automatic Control
at the Gdańsk University of Technology, Faculty of Electronics, Telecommunications and Computer Science. His
professional interests include signal processing and adaptive systems.