A DSP A-Z
Digital Signal Processing
An “A” to “Z”
R.W. Stewart
Signal Processing Division
Dept. of Electronic and Electrical Eng.
University of Strathclyde
Glasgow G1 1XW, UK
Tel: +44 (0) 141 548 2396
Fax: +44 (0) 141 552 2487
E-mail: [email protected]

M.W. Hoffman
Department of Electrical Eng.
209N Walter Scott Eng. Center
PO Box 880511
Lincoln, NE 68588-0511, USA
Tel: +1 402 472 1979
Fax: +1 402 472 4732
E-mail: [email protected]
© BlueBox Multimedia, R.W. Stewart 1998
The DSPedia
An A-Z of Digital Signal Processing
This text aims to present relevant, accurate and readable definitions of common and not so
common terms, algorithms, techniques and information related to DSP technology and
applications. It is hoped that the information presented will complement the formal teachings of the
many excellent DSP textbooks available and bridge the gaps that often exist between advanced
DSP texts and introductory DSP.
Some of the entries are particularly detailed, usually in cases where the concept, application or term is particularly important in DSP; other entries are short, and perhaps even dismissive, where the term is not directly relevant to DSP or would not benefit from an extensive description.
There are 4 key sections to the text:
• DSP terms A-Z
• Common Numbers associated with DSP
• Acronyms
• References
Any comment on this text is welcome, and the authors can be emailed at [email protected] or [email protected].

Bob Stewart, Mike Hoffman
1998

Published by BlueBox Multimedia.
A
A-series Recommendations: Recommendations from the International Telecommunication
Union (ITU) telecommunications committee (ITU-T) outlining the work of the committee. See also
International Telecommunication Union, ITU-T Recommendations.
A-law Compander: A defined standard nonlinear (logarithmic in fact) quantiser characteristic useful for certain signals. Non-linear quantisers are used in situations where a signal has a large dynamic range, but where signal amplitudes are distributed logarithmically rather than linearly. This is the case for normal speech.
Speech signals have a very wide dynamic range: Harsh “oh” and “b” type sounds have a large
amplitude, whereas softer sounds such as “sh” have small amplitudes. If a uniform quantization
scheme were used then although the loud sounds would be represented adequately the quieter
sounds may fall below the threshold of the LSB and therefore be quantized to zero and the
information lost. Therefore non-linear quantizers are used such that the quantization level at low
input levels is much smaller than for higher level signals. To some extent this also exploits the
logarithmic nature of human hearing.
[Figure] A linear, and a non-linear (A-law in fact) input-output characteristic for two 4 bit ADCs, mapping voltage inputs between -2 and 2 volts to binary outputs between -16 and 15. Note that the linear ADC has uniform quantisation, whereas the non-linear ADC has more resolution for low level signals by having a smaller step size for low level inputs.
A-law quantizers are often implemented by using a nonlinear circuit followed by a uniform quantizer. Two schemes are widely in use, the µ-law in the USA:

    y = \frac{\ln(1 + \mu x)}{\ln(1 + \mu)}        (1)

and the A-law in Europe and Japan:

    y = \frac{1 + \ln(Ax)}{1 + \ln A}        (2)
where "ln" is the natural logarithm (base e), and the input signal x is in the range 0 to 1. The ITU have defined standards (G.711) for these quantisers where µ = 255 and A = 87.56. The input/output characteristics of Eqs. 1 and 2 for these two values are virtually identical.
Although a non-linear quantiser can be produced with analogue circuitry, it is more usual that a linear quantiser will be used, followed by a digital implementation of the compressor. For example, if a signal has been digitised by a 12 bit linear ADC, then digital µ-law compression can be performed to compress to 8 bits using a modified version of Eq. 1:

    y = 2^{7}\,\frac{\ln(1 + \mu x / 2^{11})}{\ln(1 + \mu)} = 128\,\frac{\ln(1 + \mu x / 2048)}{\ln(1 + \mu)}        (3)
where y is rounded to the nearest integer. After a signal has been compressed and transmitted, at
the receiver it can be expanded back to its linear form by using an expander with the inverse
characteristic to the compressor.
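As a minimal numerical sketch (Python with NumPy is assumed here, and the function names are illustrative rather than part of any standard), the 12 bit to 8 bit µ-law compression of Eq. 3 and its inverse expansion might be written as:

import numpy as np

MU = 255.0

def mu_law_compress(x12):
    # x12: signed 12 bit sample(s) in the range -2048..2047
    mag = np.abs(x12) / 2048.0                       # normalise magnitude to 0..1
    y = 128.0 * np.log(1.0 + MU * mag) / np.log(1.0 + MU)
    return np.round(np.sign(x12) * y).astype(int)    # signed 8 bit code

def mu_law_expand(y8):
    # inverse characteristic: map the 8 bit code back to a 12 bit linear value
    mag = (np.exp(np.abs(y8) * np.log(1.0 + MU) / 128.0) - 1.0) / MU
    return np.round(np.sign(y8) * mag * 2048.0).astype(int)

x = np.array([5, -60, 700, -2000])        # example 12 bit samples
print(mu_law_compress(x))                 # small inputs get proportionally more codes
print(mu_law_expand(mu_law_compress(x)))  # approximately recovers the 12 bit values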
[Figure] The ITU µ-law characteristic for compression from a 12 bit digital input (-2048 to 2047) to an 8 bit digital output (-128 to 127) with µ = 255, together with a block diagram of the 12 bit input, 8 bit output digital compressor. Note that if a value of µ = 0 was used then the characteristic is linear, and for µ → ∞ the characteristic tends to a sigmoid/step function.
Listening tests for µ -law encoded speech reveal that compressing a linear resolution 12 bit speech
signal (sampled at 8 kHz) to 8 bits, and then expanding back to a linearly quantised 12 bit signal
does not degrade the speech quality to any significant degree. This can be quantitatively shown by
considering the actual quantisation noise signals for the compressed and uncompressed speech
signals.
In practice Eq. 3 is not computed directly by DSP routines; instead a piecewise linear approximation (defined in G.711) to the µ-law or A-law characteristic is used. See also Companders, Compression, G-series Recommendations, µ-law.
Absolute Error: Consider the following example: if an analogue voltage of exactly v = 6.285 volts is represented to only one decimal place by rounding, then v′ = 6.3, and the absolute error, ∆v, is defined as the difference between the true value and the estimated value. Therefore,
v = v′ + ∆v
(4)
and
∆v = v – v′
(5)
For this case ∆v = -0.015 volts. Notice that absolute error does not refer to a positive valued error,
but only that no normalization of the error has occurred. See also Error Analysis, Quantization Error,
Relative Error.
Absolute Pitch: See entry for Perfect Pitch.
Absolute Value: The absolute value of a quantity, x, is usually denoted as |x|. If x ≥ 0, then |x| = x, and if x < 0 then |x| = −x. For example |12123| = 12123, and |−234.5| = 234.5. The absolute value function y = |x| is non-linear and is non-differentiable at x = 0.
[Figure] The absolute value function y = |x| plotted for x between -5 and 5.
Absorption Coefficient: When sound is absorbed by materials such as walls, foam etc., the
amount of sound energy absorbed can be predicted by the material’s absorption coefficient at a
particular frequency. The absorption coefficients for a few materials are shown below. A 1.0
indicates that all sound energy is absorbed, and a 0, that none is absorbed. Sound that is not
absorbed is reflected. The amplitude of reflected sound waves is given by 1 – A times the
amplitude of the impinging sound wave.
[Figure] Absorption coefficient versus frequency (0.1 kHz to 5 kHz) for polyurethane foam, glass-wool, thick carpet and brick, together with an illustration of incident, reflected and absorbed sound at a wall.
Accelerometer: A sensor that measures acceleration, often used for vibration sensing and attitude
control applications.
Accumulator: Part of a DSP processor which can add two binary numbers together. The
accumulator is part of the ALU (arithmetic logic unit). See also DSP Processor.
Accuracy: The accuracy of a DSP system refers to the error of a quantity compared to its true value.
See also Absolute Error, Relative Error, Quantization Noise.
Acoustic Echo Cancellation: For teleconferencing applications or hands free telephony, the loudspeaker and microphone set up in both locations causes a direct feedback path which can cause instability and therefore failure of the system. To compensate for this echo, acoustic echo cancellers can be introduced:
[Figure] Acoustic echo cancellation for a full duplex speakerphone link between two rooms: the line leaving room 1 carries A plus echoes of B′ plus echoes of A′ etc., and the line leaving room 2 carries B plus echoes of A′ plus echoes of B′ etc.; an adaptive filter at each end models the loudspeaker to microphone "feedback" path (H1(f) or H2(f)) and subtracts the estimated echo from the outgoing line. When speaker A in room 1 speaks into microphone 1, the speech will appear at loudspeaker 2 in room 2. However the speech from loudspeaker 2 will be picked up by microphone 2, and transmitted back into room 1 via loudspeaker 1, which in turn is picked up by microphone 1, and so on. Hence unless the loudspeakers and microphones in each room are acoustically isolated (which would require headphones), there is a direct feedback path which may cause stability problems and hence failure of the full duplex speakerphone. Setting up an adaptive filter at each end will attempt to cancel the echo at each outgoing line. Amplifiers, ADCs, DACs, communication channels etc. have been omitted to allow the problem to be clearly defined.
Teleconferencing is very dependent on adaptive signal processing strategies for acoustic echo
control. Typically teleconferencing will sample at 8 or 16 kHz and the length of the adaptive filters
could be thousands of weights (or coefficients), depending on the acoustic environments where
they are being used. See also Adaptive Signal Processing, Echo Cancellation, Least Mean Squares
Algorithm, Noise Cancellation, Recursive Least Squares.
Acoustics: The science of sound. See also Absorption, Audio, Echo, Reverberation.
Actuator: A device which takes electrical energy and converts it into some other form, e.g. loudspeakers, AC motors, light emitting diodes (LEDs).
Active Filter: An analog filter that includes amplification components such as op-amps is termed
an active filter; a filter that only has resistive, capacitive and inductive elements is termed a passive
filter. In DSP systems analog filters are widely used for anti-alias and reconstruction filters, where
good roll-off characteristics above fs/2 are required. A simple RC circuit forms a first order (single pole) passive filter with a roll-off of 20 dB/decade (or 6 dB/octave). By cascading RC circuits with an (active) buffer amplifier circuit, higher order filters (with more than one pole) can be easily designed. See also Anti-alias Filter, Filters (Butterworth, Chebyshev, Bessel etc.), Knee, Reconstruction Filter, RC Circuit, Roll-off.
Active Noise Control (ANC): By introducing anti-phase acoustic waveforms, zones of quiet can be created at specified areas in space through the destructive interference of the offending noise and an artificially induced anti-phase noise:
[Figure] The simple principle of active noise control: an ANC loudspeaker emits anti-phase noise which destructively interferes with a periodic noise source to create a quiet zone.
ANC works best for low frequencies up to around 600Hz. This can be intuitively argued by the fact
that the wavelength of low frequencies is very long and it is easier to match peaks and troughs to
create relatively large zones of quiet. Current applications for ANC can be found inside aircraft, in
automobiles, in noisy industrial environments, in ventilation ducts, and in medical MRI equipment.
Future applications include mobile telephones and maybe even noisy neighbors!
The general active noise control problem is:
[Figure] The general set up of an active noise controller as a feedback loop where the aim is to minimize the error signal power. The noise n(t) reaches the desired zone of quiet as d(t); a reference microphone provides x(t) to the adaptive noise controller, whose output y(t) drives the secondary loudspeaker and arrives at the error microphone as ye(t), so that the error microphone measures e(t) = d(t) + ye(t). The acoustic paths involved are labelled T(f), Q(f), Hr(f) and He(f).
To implement an ANC system in real time the filtered-X LMS or filtered-U LMS algorithms can be
used [68], [69]:
[Figure] The filtered-U LMS algorithm for active noise control. The controller is an IIR structure with a feedforward (filter zeroes) weight vector a driven by the reference microphone signal x(k) and a feedback (filter poles) weight vector b driven by the controller output y(k); copies of the reference and of y(k) are filtered through an estimate Ĥe(z) of the secondary (loudspeaker to error microphone) path He(f) to give f(k) and g(k), and the weights are updated as a(k+1) = a(k) + 2µe(k)f(k) and b(k+1) = b(k) + 2µe(k)g(k), where e(k) is the error microphone signal. Note that if there are no poles, this architecture simplifies to the filtered-X LMS.
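A rough, illustrative sketch of the filtered-X LMS (Python with NumPy assumed; the function and variable names, and the use of the secondary path estimate s_hat as the "true" path, are simulation conveniences rather than anything defined in the text) is:

import numpy as np

def fxlms(x, d, s_hat, num_taps=64, mu=1e-4):
    # x: reference microphone samples; d: primary noise at the error microphone
    # s_hat: FIR estimate of the secondary path, also used here as the physical path
    w = np.zeros(num_taps)             # controller weights (filter zeroes only)
    xbuf = np.zeros(num_taps)          # recent reference samples
    fbuf = np.zeros(num_taps)          # recent filtered-reference samples
    sxbuf = np.zeros(len(s_hat))       # buffer for filtering x(k) through s_hat
    ybuf = np.zeros(len(s_hat))        # buffer for filtering y(k) through the secondary path
    e = np.zeros(len(x))
    for k in range(len(x)):
        xbuf = np.roll(xbuf, 1); xbuf[0] = x[k]
        y = w @ xbuf                          # anti-noise sample y(k)
        ybuf = np.roll(ybuf, 1); ybuf[0] = y
        ye = s_hat @ ybuf                     # anti-noise after the secondary path
        e[k] = d[k] + ye                      # error microphone signal e(k) = d(k) + ye(k)
        sxbuf = np.roll(sxbuf, 1); sxbuf[0] = x[k]
        f = s_hat @ sxbuf                     # filtered reference sample f(k)
        fbuf = np.roll(fbuf, 1); fbuf[0] = f
        w = w - 2 * mu * e[k] * fbuf          # filtered-X LMS update (sign follows e = d + ye)
    return e

In a real controller the primary noise d(k) is of course not separately available; it is passed in here only so that the sketch can be run as a self contained simulation.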
The figure below shows the time and frequency domain signals for the ANC of an air conditioning duct. Note that the signals shown represent the sound pressure level at the error microphone. In general the zone of quiet does not extend much greater than λ/4 around the error microphone (where λ is the noise wavelength):
[Figure] ANC inside an air conditioning duct: time domain traces (amplitude against time, 0 to 50 ms) and power spectra (magnitude in dB against frequency, 0 to 1000 Hz) of the sound pressure level at the error microphone before and after switching on the noise canceller. The noise canceller clearly reduces the low frequency (periodic) noise components.
Sampling rates for ANC can be as low as 1kHz if the offending noise is very low in frequency (say
50-400Hz) but can be as high as 50 kHz for certain types of ANC headphones where very rapid
adaption is required, even though the maximum frequency being cancelled is not more than a few
kHz which would make the Nyquist rate considerably lower. See also Active Vibration Control,
Adaptive Line Enhancer, Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean
Squares Filtered-X Algorithm Convergence, Noise Cancellation.
Active Vibration Control (AVT): DSP techniques for AVT are similar to active noise cancellation
(ANC) algorithms and architectures. Actuators are employed to introduce anti-phase vibrations in
an attempt to reduce the vibrations of a mechanical system. See also Active Noise Cancellation.
AC-2: An Audio Compression algorithm developed by Dolby Labs and intended for applications
such as high quality digital audio broadcasting. AC-2 claims compression ratios of 6:1 with sound
quality almost indistinguishable from CD quality sound under almost all listening conditions. AC-2
is based on psychoacoustic modelling of human hearing. See also Compression, Precision
Adaptive Subband Coding (PASC).
Adaptation: Adaptation is the auditory effect whereby a constant and noisy signal is perceived to
become less loud or noticeable after prolonged exposure. An example would be the adaptation to
the engine noise in a (loud!) propeller aircraft. See also Audiology, Habituation, Psychoacoustics.
Adaptive Differential Pulse Code Modulation (ADPCM): ADPCM is a family of speech
compression and decompression algorithms which use adaptive quantizers and adaptive
predictors to compress data (usually speech) for transmission. The CCITT standard of ADPCM allows an analog voice conversation sampled at 8 kHz to be carried within a 32 kbits/second digital channel. Three or four bits are used to describe each sample, representing the difference between two adjacent samples. See also Differential Pulse Code Modulation (DPCM), Delta Modulation, Continuously Variable Slope Delta Modulation (CVSD), G.721.
Adaptive Beamformer: A spatial filter (beamformer) that has time-varying, data dependent (i.e.,
adaptive) weights. See also Beamforming.
Adaptive Equalisation: If the effects of a signal being passed through a particular system are to
be “removed” then this is equalisation. See Equalisation.
Adaptive Filter: The generic adaptive filter can be represented as:
[Figure] The generic adaptive filter: the input x(k) is processed by the adaptive filter, w(k), to give y(k) = Filter{x(k), w(k)}, which is subtracted from the desired signal d(k) to form the error e(k); the adaptive algorithm then updates the weights as w(k+1) = w(k) + e(k)·f{d(k), x(k)}. In the generic adaptive filter architecture the aim can intuitively be described as being to adapt the impulse response of the digital filter such that the input signal x(k) is filtered to produce y(k) which, when subtracted from the desired signal d(k), will minimize the power of the error signal e(k).
The adaptive filter output y ( k ) is produced by the filter weight vector, w ( k ) , convolved (in the
linear case) with x ( k ) . The adaptive filter weight vector is updated based on a function of the error
signal e ( k ) at each time step k to produce a new weight vector, w ( k + 1 ) to be used at the next
time step. This adaptive algorithm is used in order that the input signal of the filter, x ( k ) , is filtered
to produce an output, y ( k ) , which is similar to the desired signal, d ( k ) , such that the power of the
error signal, e ( k ) = d ( k ) – y ( k ) , is minimized. This minimization is essentially achieved by
exploiting the correlation that should exist between d ( k ) and y ( k ) .
The adaptive digital filter can be an FIR, IIR, Lattice or even a non-linear (Volterra) filter, depending
on the application. The most common by far is the FIR. The adaptive algorithm can be based on
gradient techniques such as the LMS, or on recursive least squares techniques such as the RLS.
In general different algorithms have different attributes in terms of minimum error achievable,
convergence time, and stability.
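As a minimal sketch (Python with NumPy is assumed; the function below is illustrative rather than a definitive implementation), an FIR adaptive filter with the LMS weight update w(k+1) = w(k) + 2µe(k)x(k) might be written as:

import numpy as np

def lms_adaptive_filter(x, d, num_taps=32, mu=0.01):
    # x: input signal, d: desired signal (same length), mu: step size
    w = np.zeros(num_taps)               # adaptive filter weight vector w(k)
    xbuf = np.zeros(num_taps)            # the most recent num_taps input samples
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for k in range(len(x)):
        xbuf = np.roll(xbuf, 1); xbuf[0] = x[k]
        y[k] = w @ xbuf                  # filter output y(k)
        e[k] = d[k] - y[k]               # error e(k) = d(k) - y(k)
        w = w + 2 * mu * e[k] * xbuf     # LMS weight update
    return y, e, w

# a small system identification test: model a 3 tap "unknown system"
h = np.array([0.5, -0.3, 0.2])
x = np.random.randn(5000)
d = np.convolve(x, h)[:len(x)]
_, e, w = lms_adaptive_filter(x, d, num_taps=8, mu=0.005)
print(w[:3])                             # should converge towards h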
There are at least four general architectures that can be set up for adaptive filters: (1) System
identification; (2) Inverse system identification; (3) Noise cancellation; (4) Prediction. Note that all
of these architectures have the same generic adaptive filter as shown below (the “Adaptive
Algorithm” block explicitly drawn above has been left out for illustrative convenience and clarity):
[Figure] Four adaptive signal processing architectures: system identification (the adaptive filter and the unknown system share the same input x(k), with the unknown system output as the desired signal d(k)); inverse system identification (the adaptive filter is driven by the output x(k) of the unknown system, with a delayed copy of the original input s(k) as d(k)); noise cancellation (d(k) = s(k) + n(k), and the adaptive filter input x(k) is a correlated noise reference n′(k)); and prediction (the adaptive filter input x(k) is a delayed copy of d(k) = s(k)). In each case the error is e(k) = d(k) − y(k).
Consider first the system identification; at an intuitive level, if the adaptive algorithm is indeed
successful at minimizing the error to zero, then by simple inspection the transfer function of the
“Unknown System” must be identical to the transfer function of the adaptive filter. Given that the
error of the adaptive filter is now zero, then the adaptive filter's weights are no longer updated and
will remain in a steady state. As long as the unknown system does not change its characteristics
we have now successfully identified (or modelled) the system. If the adaption was not perfect and
the error is “very small” rather than zero (which is more likely in real applications) then it is fair to
say that we have a good model rather than a perfect model.
Similarly for the inverse system identification if the error adapts to zero over a period of time, then
by observation the transfer function of the adaptive filter must be the exact inverse of the “Unknown
System”. (Note that the “Delay” is necessary to ensure that the problem is causal and therefore
solvable with real systems, i.e. given that the "Unknown System" may introduce a time delay in producing x(k), then if the "Delay" were not present in the path to the desired signal the system would be required to produce an anti-delay, or look ahead in time; clearly this is impossible.)
For the noise cancellation architecture, if the input signal is s ( k ) which is corrupted by additive
noise, n ( k ) , then the aim is to use a correlated noise reference signal, n′ ( k ) as an input to the
adaptive filter, such that when performing the adaption there is only information available to
implicitly model the noise signal, n ( k ) and therefore when this filter adapts to a steady state we
would expect that e ( k ) ≈ s ( k ) .
Finally, for the prediction filter, if the error is set to be adapted to zero, then the adaptive filter must
predict future elements of the input s ( k ) based only on past observations. This can be performed
if the signal s ( k ) is periodic and the filter is long enough to “remember” past values. One
application therefore of the prediction architecture could be to extract periodic signals from
stochastic noise signals. The prediction filter can be extended to a “smoothing filter” if data are
processed off-line -- this means that samples before and after the present sample are filtered to
obtain an estimate of the present sample. Smoothing cannot be done in real-time, however there
are important applications where real-time processing is not required (e.g., geophysical seismic
signal processing).
A particular application may have elements of more than one single architecture, for example in the
following, if the adaptive filter is successful in modelling “Unknown System 1”, and inverse
modelling “Unknown System 2”, then if s ( k ) is uncorrelated with r ( k ) then the error signal is likely
to be e ( k ) ≈ s ( k ) :
[Figure] An adaptive filtering architecture incorporating elements of system identification, inverse system identification and noise cancellation.
In the four general architectures shown above the unknown systems being investigated will
normally be analog in nature, and therefore suitable ADCs and DACs would be used at the various
analog input and output points as appropriate. For example if an adaptive filter was being used to
find a model of a small acoustic enclosure the overall hardware set up would be:
[Figure] The analog-digital interfacing for a system identification, or modelling, of an acoustic transfer path using a loudspeaker and microphone: the digital signal processor outputs x(k) through a DAC to the loudspeaker as x(t); the microphone signal d(t) is digitised by an ADC to give d(k), and within the processor the adaptive filter output y(k) is subtracted from d(k) to form the error e(k).
See also Adaptive Signal Processing, Acoustic Echo Cancellation, Active Noise Control, Adaptive
Line Enhancer, Echo Cancellation, Least Mean Squares (LMS) Algorithm, Least Squares, Noise
Cancellation, Recursive Least Squares, Wiener-Hopf Equations.
Adaptive Infinite Impulse Response (IIR) Filters: See Least Mean Squares IIR Algorithms.
Adaptive Line Enhancer (ALE): An adaptive signal processing structure that is designed to
enhance or extract periodic (or predictable) components:
[Figure] An adaptive line enhancer: the input d(k) = p(k) + n(k) is delayed by ∆ to give the adaptive filter input x(k) = p(k − ∆) + n(k − ∆), and the filter output y(k) is subtracted from d(k) to give e(k). The input signal consists of a periodic component, p(k), and a stochastic component, n(k). The delay, ∆, is long enough such that the stochastic component at the input to the adaptive filter, n(k − ∆), is decorrelated from n(k). For a periodic signal the delay does not decorrelate p(k) and p(k − ∆). When the adaptive filter adapts it will therefore only cancel the periodic signal.
The delay, ∆, should be long enough to decorrelate the broadband “noise-like” signal, resulting in
an adaptive filter which extracts the narrowband periodic signal at filter output y ( k ) (or removes
the periodic noise from a wideband signal at e ( k ) ). An ALE exploits the knowledge that the signal
of interest is periodic, whereas the additive noise is stochastic. If the decorrelation delay, ∆, is long
enough then the stochastic noise presented to the d ( k ) input is uncorrelated with the noise
presented to the x ( k ) input, however the periodic noise remains correlated:
[Figure] The correlation r(n) = E{p(k)p(k+n)} of a periodic (sine wave) signal persists over all lags n, whereas the correlation q(n) = E{n(k)n(k+n)} of a stochastic signal is concentrated around zero lag and has decayed by lags ±∆.
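A small simulation sketch (Python with NumPy assumed; the LMS update and all parameter values are illustrative) of an ALE extracting a sine wave from white noise might look like:

import numpy as np

fs, f0, N = 8000, 400, 20000
k = np.arange(N)
p = np.sin(2 * np.pi * f0 * k / fs)                # periodic component p(k)
n = 0.8 * np.random.randn(N)                       # stochastic component n(k)
d = p + n                                          # ALE input d(k) = p(k) + n(k)

delta, num_taps, mu = 50, 64, 0.001                # decorrelation delay, filter length, step size
x = np.concatenate((np.zeros(delta), d[:-delta]))  # x(k) = d(k - delta)

w = np.zeros(num_taps)
xbuf = np.zeros(num_taps)
y = np.zeros(N)
for i in range(N):
    xbuf = np.roll(xbuf, 1); xbuf[0] = x[i]
    y[i] = w @ xbuf                                # enhanced (periodic) output y(k)
    e = d[i] - y[i]                                # error; the stochastic noise remains here
    w = w + 2 * mu * e * xbuf                      # LMS update

print(np.mean((y[-2000:] - p[-2000:])**2))         # residual power; compare with the noise power of 0.64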
Typically an ALE may be used in communication channels or in radar and sonar applications where
a low level sinusoid is masked by white or colored noise. In a telecommunications system, an ALE
could be used to extract periodic DTMF signals from very high levels of stochastic noise.
Alternatively note that the ALE can be used to extract the periodic noise from the stochastic signal
by observing the signal e ( k ) . See also Adaptive Signal Processing, Least Mean Squares
Algorithm, Noise Cancellation.
Adaptive Noise Cancellation: See Adaptive Signal Processing, Noise Cancellation.
Adaptive Signal Processing: The discrete mathematics of adaptive filtering, originally based on
the least squares minimization theory of the celebrated 19th Century German mathematician
Gauss. Least squares is of course widely used in statistical analysis and virtually every branch of
science and engineering. For many DSP applications, however, least squares minimization is
applied to real time data and therefore presents the challenge of producing a real time
implementation to operate on data arriving at high data rates (from 1kHz to 100kHz), and with
loosely known statistics and properties. In addition, other cost functions besides least squares are
also used.
One of the first suggestions of adaptive DSP algorithms was in Widrow and Hoff’s classic paper on
the adaptive switching circuits and the least mean squares (LMS) algorithm at the IRE WESCON
Conference in 1960. This paper stimulated great interest by providing a practical and potentially real
time solution for least squares implementation. Widrow followed up this work with two definitive and
classic papers on adaptive signal processing in the 1970s [152], [153].
Adaptive signal processing has found many applications. A generic breakdown of these
applications can be made into the following categories of signal processing problems: signal
detection (is it there?), signal estimation (what is it?), parameter or state estimation, signal
compression, signal synthesis, signal classification, etc. The common attributes of adaptive signal
processing applications include time varying (adaptive) computations (processing) using sensed
input values (signals). See also Acoustic Echo Cancellation, Active Noise Control, Adaptive Filter,
Adaptive Line Enhancer, Echo Cancellation, Least Mean Squares (LMS) Algorithm, Least Squares,
Noise Cancellation, Recursive Least Squares, Wiener-Hopf Equations.
Adaptive Spectral Perceptual Entropy Coding (ASPEC): ASPEC is a means of providing psychoacoustic compression of high fidelity audio and was developed by AT&T Bell Labs, Thomson and the Fraunhofer society amongst others. In 1990 features of the ASPEC coding system were incorporated into the International Organization for Standardization (ISO) MPEG-1 standard in combination with MUSICAM. See also Masking Pattern Adapted Universal Subband Integrated
Coding and Multiplexing (MUSICAM), Precision Adaptive Subband Coding (PASC), Spectral
Masking, Psychoacoustics, Temporal Masking.
Adaptive Step Size: See Step Size Parameter.
Adaptive Transform Acoustic Coding (ATRAC): ATRAC coding is used for compression of high fidelity audio (usually starting with 16 bit data at 44.1 kHz) to reduce the storage requirements of recording media such as the mini-disc (MD) [155]. ATRAC achieves a compression ratio of
almost 5:1 with very little perceived difference to uncompressed PCM quality. ATRAC exploits
psychoacoustic (spectral) masking properties of the human ear and effectively compresses data by
varying the bit resolution used to code different parts of the audio spectrum. More information on
the mini-disc (and also ATRAC) can be found in [155].
ATRAC has three key coding stages. First is the subband filtering which splits the signal into three
subbands, (low:0 - 5.5 kHz; mid:5.5 - 11kHz; high:11- 22kHz) using a two stage quadrature mirror
filter (QMF) bank.
The second stage then performs a modified discrete cosine transform (MDCT) to produce a frequency representation of the signal. The actual length (number of samples) of the transform is controlled adaptively via an internal decision process, using time frame lengths of either 11.6 ms (long mode) for all frequency bands, or 1.45 ms (short mode) for the high frequency band and 2.9 ms (also called short mode) for the low and mid frequency bands. The choice of mode is usually long, however if a signal has rapidly varying instantaneous power (when, say, a cymbal is struck) short mode may be required in the low and mid frequency bands to adequately code the rapid attack portion of the waveform.
Finally the third stage is to consider the spectral characteristics of the three subbands and allocate
bit resolution such that spectral components below the threshold of hearing are not encoded, and
that the spectral masking attributes of the signal spectrum are exploited such that the number of
bits required to code certain frequency bands is greatly reduced. (See entry for Precision Adaptive
Subband Coding (PASC) for a description of quantization noise masking.) ATRAC splits the
frequencies from the MDCT into a total of 52 frequency bins which are of varying bandwidth based
on the width of the critical bands in the human auditory mechanism. ATRAC then compands and
requantizes using a block floating point representation. The wordlength is determined by the bit
allocation process based on psychoacoustic models. Each input 11.6 ms time frame of 512 × 16 bit
samples or 1024 bytes is compressed to 212 bytes (4.83:1 compression ratio).
[Figure] The three stages of adaptive transform acoustic coding (ATRAC): (1) quadrature mirror filter (QMF) subband coding, in which QMF-1 splits the 44.1 kHz, 16 bit digital audio input (1.4112 Mbits/s) into a high band (11.025 - 22.05 kHz) and a lower band that QMF-2 further splits into mid (5.5125 - 11.025 kHz) and low (0 - 5.5125 kHz) bands; (2) a modified discrete cosine transform (MDCT) of each band; (3) bit allocation and spectral masking/quantization decisions, giving a compressed output of 292 kbits/s. Data is input for coding in time frames of 512 samples (1024 bytes) and compressed into 212 bytes.
ATRAC decoding from compressed format back to 44.1kHz PCM format is achieved by first
performing an inverse MDCT on the three subbands (using long mode or short mode data lengths
as specified in the coded data). The three time domain signals produced are then reconstructed
back into a time domain signal using QMF synthesis filters for output to a DAC. See also Compact
Disc, Data Compression, Frequency Range of Hearing, MiniDisc (MD), Psychoacoustics, Precision
Adaptive Subband Coding (PASC), Spectral Masking, Subband Filtering, Temporal Masking,
Threshold of Hearing.
Additive White Gaussian Noise: The most commonly assumed noise channel in the analysis and
design of communications systems. Why is this so? Well, for one, this assumption allows analysis
of the resulting system to be tractable (i.e., we can do the analysis). In addition, this is a very good
model of electronic circuit noise. In communication systems the modulated signal is often so weak
that this circuit noise becomes a dominant effect. The model of a flat (i.e., white) spectrum is good in electronic circuits up to about 10^12 Hz. See also White Noise.
Address Bus: A collection of wires that are used for sending memory address information either
inter-chip (between chips) or intra-chip (within a chip). Typically DSP address buses are 16 or 32
bits wide. See also DSP Processor.
Address Registers: Memory locations inside a DSP processor that are used as temporary storage
space for addresses of data stored somewhere in memory. The address register width is always
greater than or equal to (and normally the same as) the width of the DSP processor address bus. Most DSP
processors have a number of address registers. See also DSP Processor.
AES/EBU: See Audio Engineering Society, European Broadcast Union.
Aliasing: An irrecoverable effect of sampling a signal too slowly. High frequency components of a
signal (over one-half the sampling frequency) cannot be accurately reconstructed in a digital
system. Intuitively, the problem of sampling too slowly (aliasing) can be understood by considering
that rapidly varying signal fluctuations that take place in between samples cannot be represented
at the output. The distortion created by sampling these high frequency signals too slowly is not
reversible and can only be avoided by proper aliasing protection as provided by an anti-alias filter or an oversampled analog to digital converter.
[Figure] Sampling a 100 Hz sine wave (period 1/f = 0.01 s) at only 80 Hz causes aliasing, and the output signal is interpreted as a 20 Hz sine wave, i.e. the 100 Hz component appears at the alias frequency 100 − 80 = 20 Hz.
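A small numerical check (Python with NumPy assumed; purely illustrative) that samples of a 100 Hz sine taken at 80 Hz are indistinguishable from samples of a 20 Hz sine:

import numpy as np

fs = 80.0                                   # sampling rate in Hz, well below the 200 Hz needed for a 100 Hz tone
k = np.arange(16)                           # sample indices
x_100 = np.sin(2 * np.pi * 100 * k / fs)    # samples of a 100 Hz sine
x_20 = np.sin(2 * np.pi * 20 * k / fs)      # samples of a 20 Hz sine
print(np.allclose(x_100, x_20))             # True: the 100 Hz tone aliases to 20 Hz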
See also Anti-alias Filter, Oversampling.
Algorithm: A mathematically based computational method which forms a set of well defined rules or
equations for performing a particular task. For example, the FFT algorithm can be coded into a DSP
processor assembly language and then used to calculate FFTs from stored (or real-time) digital
data.
All-pass Filter: An all-pass filter passes all input frequencies with the same gain, although the
phase of the signal will be modified. (A true all-pass filter has a gain of one.) All-pass filters are used
for applications such as group delay equalisation, notch filter design, Hilbert transform implementation, and musical instrument synthesis [43].
The simplest all pass filter is a simple delay! This “filter” passes all frequencies with the same gain,
has linear phase response and introduces a group delay of one sample at all frequencies:
[Figure] A simple all pass filter: a single sample delay. In the time domain y(k) = x(k − 1); in the z-domain Y(z) = z^{-1}X(z), so that H(z) = Y(z)/X(z) = z^{-1}. All frequencies are passed with the same gain.
A more general representation of some types of all pass filters is given by the z-domain transfer function for an infinite impulse response (IIR) N pole, N zero filter:

    H(z) = \frac{Y(z)}{X(z)} = \frac{a_N^* + a_{N-1}^* z^{-1} + \dots + a_1^* z^{-N+1} + a_0^* z^{-N}}{a_0 + a_1 z^{-1} + \dots + a_{N-1} z^{-N+1} + a_N z^{-N}} = \frac{z^{-N} A^*(z^{-1})}{A(z)}        (6)

where a^* is the complex conjugate of a. Usually the filter weights are real, therefore a = a^*, and we set a_0 = 1:

    H(z) = \frac{Y(z)}{X(z)} = \frac{a_N + a_{N-1} z^{-1} + \dots + a_1 z^{-N+1} + z^{-N}}{1 + a_1 z^{-1} + \dots + a_{N-1} z^{-N+1} + a_N z^{-N}} = \frac{z^{-N} A(z^{-1})}{A(z)}        (7)
We can easily show that |H(e^{jω})| = 1 (see below) for all frequencies. Note that the numerator polynomial z^{-N}A(z^{-1}) is simply the order reversed z-polynomial of the denominator A(z). For an input signal x(k) the discrete time output of an all-pass filter is:

    y(k) = a_N x(k) + a_{N-1} x(k-1) + \dots + a_1 x(k-N+1) + x(k-N) - a_1 y(k-1) - \dots - a_{N-1} y(k-N+1) - a_N y(k-N)        (8)
In order to be stable, the poles of the all-pass filter must lie within the unit circle. Therefore for the denominator polynomial, if the N roots of the polynomial A(z) are given by:

    A(z) = (1 - p_1 z^{-1})(1 - p_2 z^{-1}) \dots (1 - p_N z^{-1})        (9)

then |p_n| < 1 for n = 1 to N in order to ensure all poles are within the unit circle. The poles and zeroes of the all pass filter are therefore:

    H(z) = \frac{a_N (1 - p_1^{-1} z^{-1})(1 - p_2^{-1} z^{-1}) \dots (1 - p_N^{-1} z^{-1})}{(1 - p_1 z^{-1})(1 - p_2 z^{-1}) \dots (1 - p_N z^{-1})}        (10)

where the roots of the zeroes polynomial A(z^{-1}) are easily calculated to be the inverses of the poles (see the following example).
To illustrate the relationship between the roots of a z-domain polynomial and of its order reversed polynomial, consider a polynomial of order 3 with roots at z = p_1, z = p_2 and z = p_3:

    1 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} = (1 - p_1 z^{-1})(1 - p_2 z^{-1})(1 - p_3 z^{-1})
                                             = 1 - (p_1 + p_2 + p_3)z^{-1} + (p_1 p_2 + p_2 p_3 + p_1 p_3)z^{-2} - p_1 p_2 p_3 z^{-3}

Then replacing z with z^{-1} gives:

    1 + a_1 z + a_2 z^2 + a_3 z^3 = (1 - p_1 z)(1 - p_2 z)(1 - p_3 z)

and therefore multiplying both sides by z^{-3} gives:

    z^{-3}(1 + a_1 z + a_2 z^2 + a_3 z^3) = z^{-3}(1 - p_1 z)(1 - p_2 z)(1 - p_3 z)
    z^{-3} + a_1 z^{-2} + a_2 z^{-1} + a_3 = (z^{-1} - p_1)(z^{-1} - p_2)(z^{-1} - p_3)
                                           = -p_1 p_2 p_3 (1 - p_1^{-1} z^{-1})(1 - p_2^{-1} z^{-1})(1 - p_3^{-1} z^{-1})
                                           = a_3 (1 - p_1^{-1} z^{-1})(1 - p_2^{-1} z^{-1})(1 - p_3^{-1} z^{-1})

hence revealing the roots of the order reversed polynomial to be at z = 1/p_1, z = 1/p_2 and z = 1/p_3.
Of course, if all of the poles of Eq. 10 lie within the z-domain unit circle then all of the zeroes of the numerator of Eq. 10 will necessarily lie outside of the unit circle, i.e. when |p_n| < 1 for n = 1 to N then |p_n^{-1}| > 1 for n = 1 to N. Therefore an all pass filter is maximum phase.
The magnitude frequency response of the pole at z = p_i and the zero at z = p_i^{-1} is:

    |H_i(e^{j\omega})| = \left|\frac{1 - p_i^{-1} z^{-1}}{1 - p_i z^{-1}}\right|_{z = e^{j\omega}} = \frac{1}{|p_i|}        (11)
If we let p_i = x + jy then the frequency response is found by evaluating the transfer function at z = e^{jω}:

    H_i(e^{j\omega}) = \frac{1 - p_i^{-1} e^{-j\omega}}{1 - p_i e^{-j\omega}} = \frac{1}{p_i}\left(\frac{p_i - e^{-j\omega}}{1 - p_i e^{-j\omega}}\right) = \frac{1}{p_i}\,G(e^{j\omega})

where |G(e^{jω})| = 1. This can be shown by first considering that:

    G(e^{j\omega}) = \frac{x + jy - (\cos\omega - j\sin\omega)}{1 - (x + jy)(\cos\omega - j\sin\omega)} = \frac{(x - \cos\omega) + j(y - \sin\omega)}{1 - x\cos\omega - y\sin\omega + j(x\sin\omega - y\cos\omega)}

and therefore the (squared) magnitude frequency response of G(e^{jω}) is:

    |G(e^{j\omega})|^2 = \frac{(x - \cos\omega)^2 + (y - \sin\omega)^2}{(1 - (x\cos\omega + y\sin\omega))^2 + (x\sin\omega - y\cos\omega)^2}

    = \frac{(x^2 - 2x\cos\omega + \cos^2\omega) + (y^2 - 2y\sin\omega + \sin^2\omega)}{1 - 2x\cos\omega - 2y\sin\omega + (x\cos\omega + y\sin\omega)^2 + x^2\sin^2\omega + y^2\cos^2\omega - 2xy\sin\omega\cos\omega}

    = \frac{(\sin^2\omega + \cos^2\omega) + x^2 + y^2 - 2x\cos\omega - 2y\sin\omega}{1 + x^2(\sin^2\omega + \cos^2\omega) + y^2(\sin^2\omega + \cos^2\omega) - 2x\cos\omega - 2y\sin\omega}

    = \frac{1 + x^2 + y^2 - 2x\cos\omega - 2y\sin\omega}{1 + x^2 + y^2 - 2x\cos\omega - 2y\sin\omega} = 1

Hence:

    |H_i(e^{j\omega})| = \frac{1}{|p_i|} = \frac{1}{\sqrt{x^2 + y^2}}
Therefore the magnitude frequency response of the all pass filter in Eq. 10 is indeed "flat" and given by:

    |H(e^{j\omega})| = |a_N|\,|H_1(e^{j\omega})|\,|H_2(e^{j\omega})| \dots |H_N(e^{j\omega})| = \frac{|a_N|}{|p_1 p_2 \dots p_N|} = 1        (12)

From Eqs. 7 and 10 it is easy to show that |a_N| = |p_1 p_2 \dots p_N|.
Consider the poles and zeroes of a simple 2nd order all-pass filter transfer function (found by simply using the quadratic formula):

    H(z) = \frac{1 - 2z^{-1} + 3z^{-2}}{3 - 2z^{-1} + z^{-2}}
         = \frac{(1 - (1 + j\sqrt{2})z^{-1})(1 - (1 - j\sqrt{2})z^{-1})}{3(1 - (1/3 + j\sqrt{2}/3)z^{-1})(1 - (1/3 - j\sqrt{2}/3)z^{-1})}
         = p_1 p_2 \cdot \frac{(1 - p_1^{-1}z^{-1})(1 - p_2^{-1}z^{-1})}{(1 - p_1 z^{-1})(1 - p_2 z^{-1})}

and obviously p_1 = 1/3 + j\sqrt{2}/3 and p_2 = 1/3 - j\sqrt{2}/3, so that p_1^{-1} = 1 - j\sqrt{2} and p_2^{-1} = 1 + j\sqrt{2}, with p_1 p_2 = 1/3. A z-domain pole-zero plot shows the poles inside the unit circle and the zeroes at the reciprocal positions outside it. This example demonstrates that, given that the poles must be inside the unit circle for a stable filter, the zeroes will always be outside of the unit circle, i.e. maximum phase.
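A quick numerical check of this example (Python with NumPy/SciPy assumed; the snippet is illustrative only) confirms the pole and zero positions and the flat magnitude response:

import numpy as np
from scipy import signal

b = [1.0, -2.0, 3.0]                 # numerator coefficients: 1 - 2z^-1 + 3z^-2
a = [3.0, -2.0, 1.0]                 # denominator coefficients: 3 - 2z^-1 + z^-2
print(np.roots(b))                   # zeroes at 1 +/- j*sqrt(2), outside the unit circle
print(np.roots(a))                   # poles at 1/3 +/- j*sqrt(2)/3, inside the unit circle
w, h = signal.freqz(b, a, worN=512)
print(np.allclose(np.abs(h), 1.0))   # True: unity gain at all frequencies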
Any non-minimum phase system (i.e. zeroes outside the unit circle) can always be described as a
cascade of a minimum phase filter and a maximum phase all-pass filter. Consider the non-minimum
phase filter:
    H(z) = \frac{(1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1})(1 - \alpha_3 z^{-1})(1 - \alpha_4 z^{-1})}{(1 - \beta_1 z^{-1})(1 - \beta_2 z^{-1})(1 - \beta_3 z^{-1})}        (13)

where the poles β_1, β_2, and β_3 are inside the unit circle (to ensure a stable filter) and the zeroes α_1 and α_2 are inside the unit circle, but the zeroes α_3 and α_4 are outside of the unit circle. This filter can be written in the form of a minimum phase system cascaded with an all-pass filter by rewriting as:

    H(z) = \left[\frac{(1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1})(1 - \alpha_3 z^{-1})(1 - \alpha_4 z^{-1})}{(1 - \beta_1 z^{-1})(1 - \beta_2 z^{-1})(1 - \beta_3 z^{-1})}\right] \left[\frac{(1 - \alpha_3^{-1} z^{-1})(1 - \alpha_4^{-1} z^{-1})}{(1 - \alpha_3^{-1} z^{-1})(1 - \alpha_4^{-1} z^{-1})}\right]

         = \left[\frac{(1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1})(1 - \alpha_3^{-1} z^{-1})(1 - \alpha_4^{-1} z^{-1})}{(1 - \beta_1 z^{-1})(1 - \beta_2 z^{-1})(1 - \beta_3 z^{-1})}\right] \left[\frac{(1 - \alpha_3 z^{-1})(1 - \alpha_4 z^{-1})}{(1 - \alpha_3^{-1} z^{-1})(1 - \alpha_4^{-1} z^{-1})}\right]        (14)

where the first bracketed factor of Eq. 14 is the minimum phase filter and the second is the all-pass maximum phase filter.
Therefore the minimum phase filter has zeroes inside the unit circle at z = α_3^{-1} and z = α_4^{-1}, and it has the same shape of magnitude frequency response as the original filter, since the all-pass filter has a flat (constant) gain at all frequencies. See also All-pass Filter - Phase Compensation, Digital Filter, Infinite Impulse Response Filter, Notch Filter.
All-pass Filter, Phase Compensation: All pass filters are often used for phase compensation or group delay equalisation, where the aim is to cascade an all-pass filter with a particular filter in order to achieve a linear phase response in the passband and leave the magnitude frequency response unchanged. (Given that signal information in the stopband is unwanted, there is usually no need to phase compensate there!) Therefore if a particular filter has a non-linear phase response and therefore non-constant group delay, it may be possible to design a phase compensating all-pass filter:

[Figure] Cascading an all pass filter H_A(z) with a non-linear phase filter G(z) in order to linearise the phase response and therefore produce a constant group delay: the magnitude and phase responses of G(z) alone are shown alongside those of the cascade G(e^{jω})H_A(e^{jω}). The magnitude frequency response of the cascaded system is the same as that of the original system, while the phase of the cascade is linear.
See also Digital Filter, Infinite Impulse Response Filter, Notch Filter.
All-pass Filter, Fractional Sample Delay Implementation: If it is required to delay a digital signal
by a number of discrete sample delays this is easily accomplished using delay elements:
[Figure] Delaying a signal x(k) by 3 samples using simple delay elements to give y(k) = x(k − 3), where the sample period is T = 1/fs seconds.
Using DSP techniques to delay a signal by a time that is an integer number of sample delays
t s = 1 ⁄ f s is therefore relatively straightforward. However delaying by a time that is not an integer
number of sampling delays (i.e a fractional delay) is less straightforward.
Another method uses a simple first order all pass filter to "approximately" implement a fractional sampling delay. Consider the all-pass filter:

    H(z) = \frac{z^{-1} + a}{1 + az^{-1}}        (15)
To find the phase response, we first calculate:

    H(e^{j\omega}) = \frac{e^{-j\omega} + a}{1 + ae^{-j\omega}} = \frac{\cos\omega - j\sin\omega + a}{1 + a\cos\omega - ja\sin\omega} = \frac{(a + \cos\omega) - j\sin\omega}{1 + a\cos\omega - ja\sin\omega}        (16)
and therefore:

    \angle H(e^{j\omega}) = \tan^{-1}\left(\frac{-\sin\omega}{a + \cos\omega}\right) + \tan^{-1}\left(\frac{a\sin\omega}{1 + a\cos\omega}\right)        (17)
For small values of x the approximations tan^{-1}(x) ≈ x, cos x ≈ 1 and sin x ≈ x hold. Therefore in Eq. 17, for small values of ω we get:

    \angle H(e^{j\omega}) \approx \frac{-\omega}{1 + a} + \frac{a\omega}{1 + a} = -\frac{1 - a}{1 + a}\,\omega = -\delta\omega        (18)

where δ = (1 − a)/(1 + a). Therefore at "small" frequencies the phase response is linear, thus giving a constant group delay of δ. Hence if a signal with a low frequency value f_i, where

    \frac{2\pi f_i}{f_s} \ll 1        (19)

is required to be delayed by δ of a sample period (t_s = 1/f_s), then:

    \delta = \frac{1 - a}{1 + a} \;\Rightarrow\; a = \frac{1 - \delta}{1 + \delta}        (20)

Therefore for the sine wave input signal x(k) = sin(2πf_i k / f_s) the output signal is approximately y(k) ≈ sin(2πf_i(k − δ)/f_s).
Parameters associated with creating delays of 0.1, 0.4, and 0.9 of a sample are shown below:

[Figure] Phase response and group delay for a first order all pass filter implementing a fractional delay at low frequencies, for δ = 0.1 (a = 0.9/1.1), δ = 0.4 (a = 0.6/1.4) and δ = 0.9 (a = 0.1/1.9). For frequencies below 0.1fs the phase response is "almost" linear, and therefore the group delay is effectively a constant. Note of course that for a stable filter, |a| < 1. The gain at all frequencies is 1 (a feature of all pass filters of course).
One area where fractional delays are useful is in musical instrument synthesis where accurate
control of the feedback loop delay is desirable to allow accurate generation of musical notes with
rich harmonics using “simple” filters [43]. If a digital audio system is sampling at f s = 48000 Hz
then for frequencies up to around 4000 Hz very accurate control is available over the loop delay
thus allowing accurate generation of musical note frequencies. More detail on fractional delay
method and applications can be found in [97]. See All-pass Filter-Phase Compensation,
Equalisation, Finite Impulse Reponse Filter - Linear Phase. .
All-Pole Filter: An all-pole filter is another name for a digital infinite impulse response (IIR) filter
which features only a recursive (feedback) section, i.e. it has no feedforward (non-recursive) finite
impulse response (FIR) section. The signal flow graph and discrete time equations for an all-pole
filter are:
[Figure] Signal flow graph of an all-pole filter: the input x(k) and the weighted past outputs y(k − 1), y(k − 2), ..., y(k − M) (weights b_1 to b_M) combine to form the output y(k). Consistent with the transfer function of Eq. 21, the discrete time equation is

    y(k) = x(k) - \sum_{n=1}^{M} b_n y(k-n) = x(k) - b_1 y(k-1) - b_2 y(k-2) - \dots - b_{M-1} y(k-M+1) - b_M y(k-M)

An all pole filter has a feedback (recursive) section but no feedforward (non-recursive) section. As for all IIR filters care must be taken to ensure that the filter is stable and all poles are within the unit circle of the z-domain. (In our example we have used b's to specify the recursive weights, and (where appropriate) a's to specify the non-recursive weights. Some others use precisely the reverse notation!)
An Mth order all-pole filter has M weights (b_1 to b_M) and the z-domain transfer function can be represented by an Mth order z-polynomial:

    B(z) = \frac{Y(z)}{X(z)} = \frac{1}{1 + b_1 z^{-1} + \dots + b_{M-1} z^{-M+1} + b_M z^{-M}} = \frac{1}{1 + \sum_{n=1}^{M} b_n z^{-n}}        (21)
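A direct implementation sketch (Python with NumPy assumed; the function and example weights are illustrative) of an Mth order all-pole filter following Eq. 21:

import numpy as np

def all_pole_filter(x, b):
    # b = [b1, b2, ..., bM]: recursive weights of B(z) = 1/(1 + b1 z^-1 + ... + bM z^-M)
    M = len(b)
    y = np.zeros(len(x))
    for k in range(len(x)):
        acc = x[k]
        for n in range(1, M + 1):
            if k - n >= 0:
                acc -= b[n - 1] * y[k - n]     # y(k) = x(k) - sum_n b_n y(k-n)
        y[k] = acc
    return y

# example: a stable 2nd order all-pole filter (poles inside the unit circle at radius 0.9)
b = [-1.2, 0.81]
impulse = np.zeros(50); impulse[0] = 1.0
print(all_pole_filter(impulse, b)[:5])          # decaying oscillatory impulse response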
The all-pole filter weights are also referred to as the autoregressive parameters if the all-pole filter
is used to generate an AR process. See also All-Zero Filter, Autoregressive Model, Autoregressive-Moving Average Filter, Digital Filter, Finite Impulse Response Filter, Infinite Impulse Response
Filter.
All-Zero Filter: An all zero filter is another name for a finite impulse response (FIR) digital filter:
[Figure] The signal flow graph and the discrete time output equation for an all zero (FIR) digital filter with weights w_0 to w_{N-1}:

    y(k) = w_0 x(k) + w_1 x(k-1) + w_2 x(k-2) + w_3 x(k-3) + \dots + w_{N-1} x(k-N+1) = \sum_{n=0}^{N-1} w_n x(k-n) = \mathbf{w}^T \mathbf{x}_k

where w = [w_0, w_1, w_2, ...]^T and x_k = [x(k), x(k-1), x(k-2), ...]^T. An all zero filter is non-recursive and therefore contains no feedback components.
An (N-1)-th order all-zero filter has N weights (w_0 to w_{N-1}) and can be represented as an (N-1)-th order polynomial in the z-domain:

    W(z) = \frac{Y(z)}{X(z)} = w_0 + w_1 z^{-1} + \dots + w_{N-2} z^{-N+2} + w_{N-1} z^{-N+1} = \sum_{n=0}^{N-1} w_n z^{-n}        (22)

i.e. Y(z) = X(z)\,z^{-N+1}\left[w_0 z^{N-1} + w_1 z^{N-2} + \dots + w_{N-1}\right].
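A minimal sketch (Python with NumPy assumed; illustrative only) of an all-zero (FIR) filter computing y(k) = Σ w_n x(k − n):

import numpy as np

def all_zero_filter(x, w):
    # w = [w0, w1, ..., w(N-1)]: filter weights; x: input samples
    N = len(w)
    y = np.zeros(len(x))
    for k in range(len(x)):
        for n in range(N):
            if k - n >= 0:
                y[k] += w[n] * x[k - n]     # y(k) = sum_n w_n x(k-n)
    return y

# a 4 point moving average filter (all weights equal to 1/N)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
w = np.ones(4) / 4
print(all_zero_filter(x, w))
print(np.convolve(x, w)[:len(x)])           # the same result obtained by convolution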
An all-zero filter is often also referred to as a moving average filter, although the name “moving
average filter” is (usually) more specifically used to mean an all-zero filter where all of the filter
weights are 1/N (or 1). See also All-Pole Filter, Comb Filter, Digital Filter, Finite Impulse Response
Filter, Infinite Impulse Response Filter , Moving Average Filter.
Ambience Processing: The addition of echoes or reverberation to warm a particular sound or
mimic the effect of a certain type of hall, or other acoustic environment. Another more popular term
used by Hifi companies is Digital Soundfield Processing (DSfP).
Amplifier: A device used to amplify, or linearly increase, the value of an analog voltage signal.
Amplifiers are usually denoted by a triangle symbol. The amplification factor is stated as a ratio
V out ⁄ V in , or in dBs as 20 log 10 ( V out ⁄ V in ) . For any real time input/output DSP system some form
of amplifier interface is required at the input and the output. A good amplifier should have a very
high input impedance, and a very low output impedance. Some systems require an amplification
factor of 1 to protect or isolate a source; this type of amplifier is often called a buffer. See also
Operational Amplifier, Digital Amplifier, Buffer Amplifier, Pre-amplifier, and Attenuation.
[Figure] An amplifier: the analog input voltage Vin is linearly increased to give the output voltage Vout.
Amplitude: The value, or magnitude, of a signal at a specific time. Prior to analog to digital
conversion (ADC) the instantaneous amplitude will be given as a voltage value, and after the ADC,
the amplitude of a particular sample will be given as a binary number. Note that a few authors use
amplitude as the plus/minus magnitude of a signal.
[Figure] Signal amplitude before and after A/D conversion: at time t1 the analog amplitude is V = 3.7 volts and at t2 it is V = −3.1 volts; after A/D conversion the corresponding sample values are 30976 at n1 and −20567 at n2.
Amplitude Modulation: One of the three ways of modulating a sine wave signal to carry
information. The sine wave or carrier has its amplitude changed in accordance with the information
signal to be transmitted. See also Frequency Modulation, Phase Modulation.
Amplitude Response: See Fourier Series - Amplitude/Phase Representation, Fourier Series - Complex Exponential Representation.
Amplitude Shift Keying (ASK): A digital modulation technique in which the information bits are
encoded in the amplitude of a symbol. On-Off Keying (OOK) is a special case of ASK in which the
two possible symbols are zero (Off) and V volts (On). See also Frequency Shift Keying, Phase Shift
Keying, Pulse Amplitude Modulation, Quadrature Amplitude Modulation.
Analog: An analog means the “same as”. Therefore, as an example, an analog voltage for a sound
signal means that the voltage has the same characteristics of amplitude and phase variation as the
sound. Using the appropriate sensor, analog voltages can be created for light intensity (a
photovoltaic cell), vibrations (accelerometer), sound (microphone), fluid level (potentiometer and
floating ball) and so on.
Analog Computer: Before the availability of low cost, high performance DSP processors, analog
computers were used for analysis of signals and systems. The basic linear elements for analog
computers were the summing amplifier, the integrator, and the differentiator [44]. By the judicious
use of resistor and capacitor values, and the input of appropriate signals, analog computers could
be used for solving differential equations, exponential and sine wave generation and the
development of control system transfer functions.
[Figure] The basic linear elements of an analog computer, built around operational amplifiers with resistors R and capacitors C: an integrator, Vout = −(1/RC)∫Vin dt (integrating from 0 to t); a differentiator, Vout = −RC·dVin/dt; and a summer, Vout = −((Rf/R1)V1 + (Rf/R2)V2 + (Rf/R3)V3).
Analog Differentiator: See Analog Computer.
Analog Integrator: See Analog Computer.
Analog to Digital Converter (A/D or ADC): An analog to digital converter takes an analog input
voltage (a real number) and converts it (or “quantizes” it) to a binary number (i.e., to one of a finite
set of values). The number of conversions per second is governed by the sampling rate. The input
to an ADC is usually from a sample and hold circuit which holds an input voltage constant for one
sampling period while the ADC performs the actual analog to digital conversion. Most ADCs used
in DSP use 2’s complement arithmetic. For audio applications 16 bit ADCs are used, whereas for
telecommunications and speech coding, 8 bit ADCs are usually used. Modern ADCs can achieve
almost 20 bits of accuracy at sampling rates of up to 100kHz. See also Anti-alias Filter, Digital to
Analog Converter, Quantizer, Sample and Hold, Sigma Delta .
[Figure] Example of a 5 bit ADC sampling at fs and converting the output of a sample and hold circuit (voltages between −2 and 2 volts) to two's complement binary values between −16 (10000) and 15 (01111).
Anechoic: An acoustic condition in which (virtually) no reflected echoes exist. This would occur if
two people were having a conversation suspended very high in the air. In practice anechoic
chambers can be built where the walls are made of specially constructed cones which do not reflect
any sound, but absorb it all. Having a conversation in an anechoic chamber can be awkward as the
human brain is expecting some echo to occur.
ANSI: American National Standards Institute. A group affiliated with the International Standards
Organization (ISO) that prepares and establishes standards for a wide variety of science and
engineering applications including transmission codes such as ASCII and companding standards
like µ-law, among other things. See also Standards.
ANSI/IEEE Standard 754: See IEEE Standard 754.
Anti-alias Filter: A filter used at the input to an A/D converter to block any frequencies above f s ⁄ 2 ,
where f s is the sampling frequency of the A/D (analog to digital) converter. The anti-alias filter is
analog and usually composed of resistive and capacitive components to provide good attenuation
above fs/2. With the introduction of general oversampling techniques and more specifically sigma-delta techniques, the specification for analog anti-alias filters is traded off against using
oversampling and digital low pass filters. See also Aliasing, Analog to Digital Converter,
Oversampling, Sampling, Sample and Hold.
[Figure] The frequency domain representation of an anti-alias filter placed before the ADC and DSP processor, and the frequency spectra of an analog input signal before and after being filtered by the anti-alias filter, which attenuates all components above fs/2.
Aperture: The physical distance spanned by an array of sensors or an antenna dish. Aperture is a
fundamental quantity in DSP applications ranging from RADAR processing to SONAR array
processing to geophysical remote sensing.
[Figure] An array of sensors; the array aperture is the physical distance spanned by the sensors.
See also Beamforming, Shading Weights.
Aperture Taper: See Shading Weights.
Application Specific Integrated Circuit (ASIC): A custom designed integrated circuit targeted at
a specific application. For example, an ASIC could be designed that implements a 32 tap digital filter
with weights set up to provide high pass filtering for a digital audio system.
Architecture: The hardware set up of a particular DSP system. For example a system which uses four DSP processors may be referred to as a parallel processing DSP architecture. At the chip level, inside most DSP processors separate control, address and data buses are used, an arrangement often referred to generically as the Harvard architecture. See also DSP Board, DSP Processor.
Arpanet: The name for a US Defense Department’s Advanced Research Projects Agency network
(circa 1969) which was the first distributed communications network and has now “probably”
evolved into the Internet.
Array (1): The name given to a set of quantities stored in a tabular or list type form. For example a
3 × 5 matrix could be stored as a 3 × 5 array in memory.
Array (2): The general name given to a group of sensors/receivers (antennas, microphones, or hydrophones for example) arranged in a specific pattern in order to improve the reception of a signal impinging on the array sensors. The simplest form of array is the linear, or 1-D (one dimensional) array which consists of a set of (often equally spaced) sensors. This array can be used to discriminate angles of arrival in any plane containing the array, but is limited because of a cone of confusion. This cone is the cone of angles of arrival that all give rise to identical time differences at the array.
A linear equi-spaced array and its cone of confusion: all angles of arrival lying on the cone give rise to identical time differences at the array.
The 2-D array has a set of elements distributed in a plane and can be used to discriminate signals
in two dimensions of arrival angle. A similar, but less severe confusion results since signals from
opposite sides of the plane containing the array (top-bottom) give rise to the same time delays at
each of the elements. This may or may not be a problem depending on the geometry of the array
and the particular application of the array. 3-D arrays can also be used to eliminate this ambiguity.
See also Beamforming.
Array Multiplier: See Parallel Multiplier.
ASCII: American Standard Code for Information Interchange. A 7 bit binary code that defines 128
standard characters for use in computers and data transmission. See also EBCDIC.
Assembler: A program which takes mnemonic codes for operations on a DSP chip, and assembles
them into machine code which can actually be run on the processor. See also Cross Compiler,
Machine Code.
Assembly Language: This is a mnemonic code used to program a DSP processor at a relatively
low level. The Assembly language is then assembled into actual machine code (1’s and 0’s) that
can be downloaded to the DSP system for execution. The assembly language for DSP processors
from the various DSP chip manufacturers is different. See also Cross Compiler, Machine Code..
movep    y:input, x:(r0)                            ; input sample
clr      a          x:(r0)+, x0    y:(r4)+, y0
rep      #19
mac      x0, y0, a   x:(r0)+, x0    y:(r4)+, y0
macr     x0, y0, a   (r0)-
movep    a, y:output                                ; output filtered sample

A segment of Motorola DSP56000 assembly language to realize a 20 tap FIR filter.
Asymptotic: When a variable, x, converges to a solution m, with the error e = x − m reducing with increasing time, but never (in theory) reaching exactly m, then the convergence is asymptotic. For example the function:

$x_n = 2^{-n}$     (23)
will asymptotically approach zero as n increases, but will never reach exactly zero. (Of course, if finite precision arithmetic is used then the quantization error may allow this particular result to converge exactly.) The function xn can be plotted as:
The error en = xn plotted against iteration number n, decaying asymptotically from 1 towards zero.
See also Adaptive Signal Processing, Convergence, Critically Damped, Overdamped,
Underdamped.
Asynchronous: Meaning not synchronized. An asynchronous system does not work to the regular
beat of a clock, and is likely to use handshaking techniques to communicate with other systems.
See also Handshaking.
A simple protocol for handshaking between DSP System 1 and DSP System 2. DSP System 1 sends an RTS signal (request to send data) to DSP System 2, which replies with a CTS signal (clear to send data) if it is ready to receive data. After the handshake using RTS and CTS, the data can be transmitted on the Tx line.
Asynchronous Transfer Mode (ATM): A protocol for digital data transmission (e.g., voice or
video data) that breaks data from higher levels in a network into 53 byte cells comprising a 5 byte
header and 48 data bytes. The protocol allows for virtual circuit connections (i.e., like a telephone
circuit) and can be used to support a datagram network (i.e., like some electronic mail systems). In
spite of the word Asynchronous, ATM can be used over the ubiquitous synchronous optical network
(SONET).
Attack-Decay-Sustain-Release (ADSR): In general the four phases of the sound pressure level
envelope of a musical note comprise: (1) the attack, when the note is played; (2) the decay when
the note starts to reduce in volume from its peak; (3) the sustain where the note holds its volume
and the decay is slow and; (4) the release after the note is released and the volume rapidly decays
away. The ADSR profile of most musical instruments is different and varies widely for different
classes of instrument such as woodwind, brass, and strings.
Musical note volume plotted against time, showing the attack, decay, sustain, and release phases. The amplitude envelope of a musical instrument can usually be characterized by four different phases. The relative duration of each phase depends of course on the instrument being played.
Specification of the ADSR values is a key element in the synthesis of musical instruments. See also Music, Music Synthesis.
Attenuation: A signal is attenuated when its magnitude is reduced. Attenuation is often measured as a (modulus) ratio (Vout/Vin), or in dB as 20 log10|Vout/Vin|. Note that an attenuation factor of 10 is equivalent to a gain factor of 1/10; expressed in dB, an attenuation of 20dB is equivalent to a gain of −20dB, i.e.,

$\text{Attenuation Factor} = \frac{1}{\text{Gain Factor}}$  or  Attenuation (dB) = −Gain (dB)     (24)

Therefore an attenuation factor of 0.1 is actually a gain factor of 10! The simplest form of attenuator for analog circuits is a resistor bridge. (Of course, to avoid loading the source it is more advisable to use an op-amp based attenuator.) See also Amplifier.
An attenuator reduces the input voltage Vin to produce a smaller output voltage Vout.
Audio: Audio is the Latin word for “I hear” and is usually used in the context of electronic systems and devices that produce and affect what we hear.
Audio Evoked Potential: See Evoked Potentials.
Audio Engineering Society / European Broadcast Union (AES/EBU): AES/EBU is the acronym used to describe a popular digital audio standard, a bit serial communications protocol for transmitting two channels of digital audio data on a single transmission line. The standard requires the use of 32kHz, 44.1kHz or 48kHz sample rates. See also Standards.
Audio Engineering Society (AES): The Audio Engineering Society is a professional organization
whose area of interest is all aspects of audio. The international headquarters are at 60 East 42nd
Street, Room 2520, New York, NY 10165-2520, USA. The British section is at AES British Section, Audio Engineering Society Ltd, PO Box 645, Slough SL1 8BJ, UK.
Audiogram: An audiogram is a graph showing the deviation of a person’s hearing from the defined
“average threshold of hearing” or “Hearing Level”. The audiogram plots hearing level, dB (HL),
against logarithmic frequency for both ears. dB (HL) are used in preference to dB (SPL) - sound
pressure level - in order to allow a person’s hearing profile to be compared with a straight line
average unimpaired hearing threshold.
An audiogram: hearing level in dB (HL) is plotted against logarithmic frequency (125 Hz to 8000 Hz) for the right ear (o) and left ear (x). A “reasonably” healthy ear lies close to the 0 dB (HL) threshold of hearing line, whereas an impaired ear with high frequency hearing loss falls increasingly below it at high frequencies.
An audiogram is produced by an audiologist using a calibrated audiometer to find the lowest level
of aural stimuli just detectable by a patient’s left and right ear respectively. See also Audiometry,
Auditory Filters, Ear, Equal Loudness Contours, Frequency Range of Hearing, Hearing Impairment,
Hearing Level, Permanent Threshold Shift, Sound Pressure Level, Temporary Threshold Shift,
Threshold of Hearing.
Audiology: The scientific study of hearing. See also Audiometry, Auditory Filters, Beat
Frequencies, Binaural Beats, Binaural Unmasking, Dichotic, Diotic, Ear, Equal Loudness Contours,
Equivalent Sound Continuous Level, Frequency Range of Hearing, Habituation, Hearing Aids,
Hearing Impairment, Hearing Level, Loudness Recruitment, Psychoacoustics, Sensation Level,
Sound Pressure Level, Spectral Masking, Temporal Masking, Temporary Threshold Shift,
Threshold of Hearing.
Audiometer: An instrument used to measure the sensitivity of human hearing using various forms
of aural stimuli at calibrated sound pressure levels (SPL). An audiometer is usually a desktop
instrument with a selection of potentiometric sliders, dials and switch controls to specify the
frequency range, signal characteristics and intensity of various aural stimuli. Audiometers connect
to calibrated headphones (for air conduction tests) or a bone-phone (to stimulate the mastoid bone
behind the ear with vibrations if tests are being done to detect the presence of nerve deafness).
Occasionally free-field loudspeaker tests may be done using narrowband frequency modulated
tones or warble tones. (If pure tones were used nodes and anti-nodes would be set up in the test
room at various points).
The most basic form of audiometer is likely to only produce pure tones over a frequency range of
125Hz, 250Hz, 500Hz, 1000Hz, 2000Hz, 4000Hz, and 8000Hz. More complex audiometers will be
able to produce intermediate frequencies and also frequency modulated (FM) or warble tones,
bandlimited noise, and spectral masking noise. Because of the dynamic range of human hearing
and the severity of some impairments, an audiometer may need to be able to generate tones over a 130dB (SPL) range.
Computer based, DSP audiometers are likely to completely displace the traditional analogue
electronic instruments over the next few years. DSP audiometers may even be integrated into PC
notebook style “DSP Audiometric Workstations”, capable of all forms of audiometric testing, hearing
aid testing, and programming of the impending future generation of DSP hearing aids. See also
Audiogram, Audiometry, Auditory Filters, Frequency Range of Hearing, Hearing Impairment,
Hearing Level, Sound Pressure Level, Spectral Masking, Threshold of Hearing.
Audiometry: Audiometry is the measurement of the sensitivity of the human ear [30], [157]. For
audiometric testing, audiologists use electronic instruments called audiometers to generate various
forms of aural stimuli.
A first test of any patient’s hearing is usually done with pure tone audiometry, using tones with less
than 0.05% total harmonic distortion (THD) at test frequencies of 125Hz, 250Hz, 500Hz, 1000Hz,
2000Hz, 4000Hz and 8000Hz and dynamic ranges of almost 130dB (SPL) for the most sensitive
human hearing frequencies between 2-4kHz. Each ear is presented with a tone lasting (randomly)
between 1 and 3 seconds; the randomness avoids giving rhythmic clues to the patient. The
loudness of the tones is varied in steps of 5 and 10dB until a threshold can be determined. The
patient indicates whether a tone was heard by clicking a switch. As an example of a test procedure,
the British Society of Audiology Test B [157] determines the threshold at a particular frequency as
follows:
1. Reduce the tone level in 10dB steps until the patient no longer responds;
2. Three further tones are presented at this level. If none or only one of these is heard, that level is taken as
unheard;
3. If all tones in stage 2 were heard, the level is reduced by 5dB until the level is unheard, by repeating stage 2
procedure;
4. If stage 2 was not heard the level is raised by 5dB and as many tones are presented as are necessary to deduce
whether at least 2 out of 4 presentations were heard. If this level is heard it is taken as the threshold for that
frequency;
5. If stage 4 was not heard the level is raised by 5dB and stage 4 is repeated until a threshold is found;
The results of an audiometric test are usually plotted as an audiogram, a graph of dB Hearing Level
(HL) versus logarithmic frequency.
An audiometric procedure using (spectral) masking is particularly important where one ear is
suspected to be much more sensitive than the other. Most audiometers will provide a facility to
produce spectral masking noise. Masking noise is generally white and is played into the ear that is
not being tested in cases where the tone presented to the test ear is very loud. If masking was not
used the conduction of the tone through the skull is heard by the other ear giving a false impression
about the sensitivity of the ear under test.
More complex audiometers provide a wider range of frequencies, and also facilities for producing
narrowband frequency modulated tones, narrowband noise, white noise, and speech noise, thus
providing for a more comprehensive facility for investigation of hearing loss. Audiometry is specified
in IEC 645, ISO 6189: 1983, ISO 8253: 1989. See also Audiogram, Audiology, Ear, Frequency
Range of Hearing, Hearing Impairment, Sensation Level, Sound Pressure Level, Spectral Masking,
Temporal Masking, Threshold of Hearing.
Auditory Filters: It is conjectured that a suitable model of the front end of the auditory system is
composed of a series of overlapping bandpass filters [30]. When trying to detect a signal of interest
in broadband background noise the listener is thought to make use of a filter with a centre frequency
close to that of the signal of interest. The perception to the listener is that the background noise is
somewhat filtered out and only the components within the background noise that lie in the auditory
filter passband remain. The threshold of hearing of the signal of interest is thus determined by the
amount of noise passing through the filter.
This auditory filter can be demonstrated by presenting a tone in the presence of noise centered
around the tone and gradually increasing the noise bandwidth while maintaining a constant noise
power spectral density. The threshold of the tone increases at first, but starts to flatten off as the noise extends beyond the bandwidth of the auditory filter. The bandwidth at which the tone threshold stops increasing is known as the critical bandwidth (CB) or equivalent rectangular
bandwidth (ERB). These filters are often assumed to have constant percent critical bandwidths (i.e.,
constant fractional bandwidths). For normal hearing individuals this bandwidth may be about 18
percent -- so an auditory filter centered at 1000 Hz would have a critical bandwidth of about 180 Hz.
The entire hearing range can be covered by about 24 (non-overlapping) critical bandwidths. See
also Audiology, Audiometry, Ear, Fractional Bandwidth, Frequency Range of Hearing,
Psychoacoustics, Spectral Masking, Temporal Masking, Threshold of Hearing.
Aural: Relating to the process of hearing. The terms monaural and binaural are related to hearing
with one and two ears respectively. See also Audiology, Binaural, Ear, Monaural, Threshold of
Hearing.
Auralization: The acoustic simulation of virtual spaces. For example simulating the sound of a
stadium (an open sound with large echo and long reverberation times) in a small room using DSP.
Autocorrelation: When dealing with stochastic (random) signals, autocorrelation, r ( n ) , provides
a measure of the randomness of a signal, x ( k ) and is calculated as:
$r(n) = E\{ x(k)\,x(k+n) \} = \sum_{k} x(k)\,x(k+n)\,p\{ x(k), x(k+n) \}$     (25)
where p { x ( k ), x ( k + n ) } is the joint probability density function of the signal or random process,
x ( k ) at times k and k+n. For ergodic signals using 2M available samples the autocorrelation can
be estimated as a time average:
$r(k) \approx \frac{1}{M} \sum_{n=0}^{M-1} x(n)\,x(n+k)$  for large M     (26)
If the mean and autocorrelation of a signal are constant then the signal is said to be wide sense
stationary. In many least mean squares DSP algorithms the assumption of wide sense stationarity
is necessary for algorithm derivations and proofs of convergence.
If a signal is highly correlated from sample to sample, then for a particular sample at time, i, the next
sample at time i+1 will have a value that can be predicted with a small amount of error. If a signal
has almost no sample to sample correlation (almost white noise) then the sample value at time i+1
cannot be reliably predicted from values of the sequence occurring at or before time i. Calculating
the autocorrelation function, r ( n ) , therefore gives a measure of how well correlated (“or similar”) a
signal is with itself by comparing the difference between samples at time lags of n = 0,1,2,... and so
on.
Taking the discrete Fourier transform of the autocorrelation function yields the Power Spectral
Density (PSD) function which gives a measure of the frequency content of a stochastic signal. See
also Ergodic, Power Spectral Density.
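A rough numerical sketch of the time-average estimate of Eq. 26, and of the PSD obtained from its DFT, might be as follows; the biased estimator, the first order test signal and the lag/FFT lengths are choices of this sketch rather than requirements.

import numpy as np

def autocorr(x, max_lag):
    """Biased time-average estimate of r(k), in the spirit of Eq. 26."""
    M = len(x)
    return np.array([np.dot(x[:M - k], x[k:]) / M for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)
# A correlated ("Signal A" like) sequence: simple first order low pass filtering of white noise
correlated = np.zeros_like(white)
for k in range(1, len(white)):
    correlated[k] = 0.9 * correlated[k - 1] + white[k]

r = autocorr(correlated, max_lag=64)
# DFT of the symmetrically extended (circularly even) autocorrelation gives a PSD estimate
psd = np.abs(np.fft.rfft(np.concatenate([r, r[1:][::-1]])))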
Two signals, Signal A and Signal B, their autocorrelation functions r(n), and their power spectral densities. Signal A is more highly correlated than Signal B, and therefore from sample to sample Signal A varies less than Signal B. The autocorrelation function of Signal A is wider than for Signal B because as n increases, samples are correlated with previous values and the signal does not change its magnitude by a large amount. Signal B makes larger and less predictable changes and as the lag value n increases the correlation between the i-th sample and the (i+n)-th sample reduces rapidly. By inspection Signal B has the wider frequency content, which is confirmed on calculation of the Power Spectral Density function.
Autoregressive (AR) Model: An autoregressive model is a means of generating an autoregressive stochastic process. Autoregressive refers to the fact that the signal is the output of an all-pole infinite impulse response (IIR) filter that has been driven by a white noise input [17], [90].
An autoregressive process can be generated by the signal flow graph and discrete time equations
below:
Signal flow graph of an autoregressive model: the white noise input v(k) drives an all-pole feedback structure in which the delayed outputs u(k−1), u(k−2), …, u(k−M), weighted by b1, b2, …, bM, are fed back to form the output u(k):

$u(k) = v(k) - \sum_{n=1}^{M} b_n u(k-n) = v(k) - b_1 u(k-1) - b_2 u(k-2) - \dots - b_{M-1} u(k-M+1) - b_M u(k-M)$
An autoregressive model has a feedback (recursive) section but no feedforward (non-recursive) section. The input signal, v(k), is assumed to be white Gaussian noise. The output signal, u(k), is referred to as an autoregressive process. When setting the filter weight values, {bn}, care must be taken to ensure that the filter is stable and all filter poles are within the unit circle of the z-domain. In addition, since the autoregressive model is generated with a feedback system, it is necessary to let the AR system reach steady state before using the output samples.
An M th order autoregressive model is generated from an all-pole digital filter that has M weights (b1
to bM). These weights are also referred to as the autoregressive parameters. The z-domain transfer
function can be represented by an M th order z-polynomial:
$H(z) = \frac{U(z)}{V(z)} = \frac{1}{1 + b_1 z^{-1} + \dots + b_{M-1} z^{-(M-1)} + b_M z^{-M}} = \frac{1}{1 + \sum_{n=1}^{M} b_n z^{-n}}$     (27)
If a stochastic signal is produced by using white noise as an input to an all-pole filter, then this is
referred to as autoregressive modelling. The name “autoregressive” comes from the prefix “auto-” (from the Greek, meaning self or one’s own) and “regression” (a going back to previous or past values), hence the combined meaning of a process whose output is generated from its own past outputs. Autoregressive models
are sometimes loosely referred to as all-pole models. In addition, sometimes the input to the all-pole
model is something other than white noise. For example, in modelling voiced speech a pulse train
with the desired pitch period drives the all-pole model.
Autoregressive models are widely used in speech processing and other DSP applications where a stochastic signal is to be modelled as the output of an all-pole filter driven by a stochastic (usually white noise) signal. See also All-Zero Filter, Autoregressive Modelling, Autoregressive-Moving Average Filter, Digital Filter, Infinite Impulse Response Filter.
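For illustration, a second order process of this kind might be generated by all-pole filtering of white noise as sketched below; the coefficient values are arbitrary, chosen only so that both poles lie inside the unit circle.

import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
v = rng.standard_normal(10000)           # white Gaussian noise input v(k)

# Second order AR model with H(z) = 1 / (1 + b1 z^-1 + b2 z^-2), as in Eq. 27
b1, b2 = -1.5, 0.7                       # both poles inside the unit circle, so the filter is stable
u = lfilter([1.0], [1.0, b1, b2], v)     # all-pole (feedback only) filtering of the white noise

u = u[500:]                              # discard the start-up transient so the process is in steady state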
Autoregressive Modelling (inverse): Given an M-th order autoregressive process the inverse
problem is to generate the AR model parameters which can be used to produce this process from
a white noise input:
White noise v(k) drives the autoregressive model {b1, b2, …, bM} to produce the modelled signal, or autoregressive process, u(k).
The output signal u ( k ) is referred to as an autoregressive process, and was generated by
a white noise input at v ( k ) . The autoregressive coefficients can be found using statistical
signal processing least squares techniques such as Yule-Walker or the LMS algorithm.
To do this, one common approach uses the AR process as the input to an M-th order (or greater) all-zero filter with weights {1, b1, b2, …, bM}. If the M adjustable weights are selected to minimize the output power, the output will be a white noise process. In addition, the feed-forward coefficients from the all-zero model will correspond to the parameters of the autoregressive input process. This use of an adaptive FIR predictor is referred to as autoregressive modelling [6], [10], [17]:
The white noise signal v(k) can be reproduced by using the modelled stochastic signal u(k) as an input to an all zero (FIR) filter with weights {1, b1, b2, …, bM}, the first weight being 1: the delayed samples u(k−1), …, u(k−M) are weighted by b1, …, bM and added to u(k). This is the generation of white noise from an autoregressive process using an all-zero filter.
To see that the AR parameters are recovered we can rewrite Eq. 27 (see Autoregressive Model) as:
$\frac{V(z)}{U(z)} = \frac{1}{H(z)} = 1 + b_1 z^{-1} + \dots + b_{M-1} z^{-(M-1)} + b_M z^{-M}$     (28)
If a given stochastic signal, u ( k ) was in fact generated by an autoregressive process then we can
use mean square minimization techniques to find the autoregressive parameters (i.e., the all-pole
filter weights) that would produce that signal from a white noise input. First note that the output of
the all zero filter is given by:
$v(k) = u(k) + \sum_{m=1}^{M} b_m u(k-m) = u(k) + \mathbf{b}^T \mathbf{u}(k-1)$     (29)
where the vector $\mathbf{b} = [\, b_1 \ \dots \ b_{M-1} \ b_M \,]^T$ and the vector $\mathbf{u}(k-1) = [\, u(k-1) \ \dots \ u(k-M+1) \ u(k-M) \,]^T$.
If we attempt to minimize the signal v(k) at the output of the filter, then this is implicitly done by predicting the predictable components present in the stationary stochastic signal u(k) (assuming the filter is of sufficient order), which means that the output v(k) will consist of the completely unpredictable part of the signal which is, in fact, white noise (see Wold Decomposition and [17]).
To use MMSE techniques, first note that the squared output signal is:
$v^2(k) = [\, u(k) + \mathbf{b}^T \mathbf{u}(k-1) \,]^2 = u^2(k) + [\mathbf{b}^T \mathbf{u}(k-1)]^2 + 2u(k)\mathbf{b}^T \mathbf{u}(k-1) = u^2(k) + \mathbf{b}^T \mathbf{u}(k-1)\mathbf{u}^T(k-1)\mathbf{b} + 2\mathbf{b}^T u(k)\mathbf{u}(k-1)$     (30)

Taking expected (or mean) values using the expectation operator E{ . } we can write the mean squared value, E{v²(k)}, as:

$E\{ v^2(k) \} = E\{ u^2(k) \} + \mathbf{b}^T E\{ \mathbf{u}(k-1)\mathbf{u}^T(k-1) \}\mathbf{b} + 2\mathbf{b}^T E\{ u(k)\mathbf{u}(k-1) \}$     (31)
Writing in terms of the M × M correlation matrix,
$\mathbf{R} = E\{ \mathbf{u}(k-1)\mathbf{u}^T(k-1) \} = \begin{bmatrix} r_0 & r_1 & \cdots & r_{M-1} \\ r_1 & r_0 & \cdots & r_{M-2} \\ \vdots & \vdots & \ddots & \vdots \\ r_{M-1} & r_{M-2} & \cdots & r_0 \end{bmatrix}$     (32)

and the M × 1 correlation vector,

$\mathbf{r} = E\{ u(k)\mathbf{u}(k-1) \} = [\, r_1 \ r_2 \ \cdots \ r_M \,]^T$     (33)

where $r_n = E\{ u(k)u(k-n) \} = E\{ u(k-n)u(k) \}$, gives,
$E\{ v^2(k) \} = E\{ u^2(k) \} + \mathbf{b}^T \mathbf{R} \mathbf{b} + 2\mathbf{b}^T \mathbf{r}$     (34)
Given that this equation is quadratic in b there is only one minimum value (see the entry for Wiener-Hopf Equations for more details on quadratic surfaces). The minimum mean squared error (MMSE) solution occurs when the predictable component in the signal u(k) is completely predicted, leaving only the unpredictable white noise as the output. The autoregressive coefficients, bAR, can be found by setting the (partial derivative) gradient vector, ∇, to zero:

$\nabla = \frac{\partial}{\partial \mathbf{b}} E\{ v^2(k) \} = 2\mathbf{R}\mathbf{b}_{AR} + 2\mathbf{r} = 0$     (35)

$\Rightarrow \; \mathbf{b}_{AR} = -\mathbf{R}^{-1}\mathbf{r}$     (36)
Therefore, given a signal that was generated by an autoregressive process, Eq. 36 (known as the Yule-Walker equations) can be used to find the parameters of the autoregressive process that would generate the signal u(k) given a white noise input signal, v(k).
To calculate the Yule-Walker solution in practice requires that the R matrix and r vector are estimated from the stochastic signal u(k), and that the R matrix is then inverted prior to premultiplying the vector r. Assuming that the signal u(k) is ergodic, then in the real world we can calculate the elements of R and r from:
$r_n \approx \frac{1}{N} \sum_{k=0}^{N-1} u(k)\,u(k-n)$     (37)
where N is a large number of samples that adequately represent the signal. Clearly, solving the
Yule-Walker equations requires a very large number of computations, and is usually not done
directly in real time systems (see the entry Wiener-Hopf for more details). Instead the Levinson-Durbin algorithm is used, which is an efficient technique for solving equations of the form of Eq. 36. In many
systems the LMS (least mean squares) algorithm [53] is used in a predictor architecture:
The autoregressive process u(k) is applied, via a one sample delay, to an adaptive filter with weight vector w, and the LMS algorithm adapts the weights using the prediction error v(k) = u(k) − y(k):

$y(k) = \text{Filter}\{ \mathbf{u}(k-1), \mathbf{w}(k) \}, \qquad \mathbf{w}(k+1) = \mathbf{w}(k) + 2\mu v(k)\mathbf{u}(k-1)$

The signal that was generated by an autoregressive process is input to the delay and thereafter to the adaptive filter. The adaptive filter attempts to minimize the signal v(k) and will therefore set the coefficients to values such that the predictable component of the signal is predicted by the autoregressive filter weights.
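A minimal sketch of such an LMS one-step predictor is given below; a second order predictor, a small fixed step size and a synthetic AR(2) test signal are assumptions of the sketch.

import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(2)
u = lfilter([1.0], [1.0, -1.5, 0.7], rng.standard_normal(20000))[500:]   # an AR(2) test signal

M, mu = 2, 0.001
w = np.zeros(M)                       # adaptive predictor weights
for k in range(M, len(u)):
    u_past = u[k - M:k][::-1]         # [u(k-1), u(k-2), ...] -- the delayed input vector
    y = w @ u_past                    # prediction of u(k) from past samples
    v = u[k] - y                      # prediction error, ideally whitened to v(k)
    w = w + 2 * mu * v * u_past       # LMS update, as in the figure above

print(w)                              # should converge close to [1.5, -0.7] for this test signal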
Autoregressive modelling is widely used in speech processing, where speech is assumed to be generated by an autoregressive process; by extracting the autoregressive filter weights (parameters) these can be used for later generation of unvoiced speech components (speech synthesis) or for speech vocoding [11]. For model based speech coding the linear prediction problem of Eq. 36 is solved using the Levinson-Durbin algorithm. For speech coding techniques based on waveform coding, the predictor is more likely to be of the simple LMS form.
Other stochastic linear filter models include the moving average (MA) model and the autoregressive moving average (ARMA) model. However the autoregressive model is by far the most popular, mainly because finding its weights requires only the solution of a set of linear equations, and because it is a generally good model for many applications. The MA or ARMA models, on the other hand, require the solution of a (more difficult to solve) set of non-linear equations.
See also Adaptive Filtering, Autoregressive Model, Autoregressive Moving Average Filter,
Autoregressive Parametric Spectrum Estimation, Least Mean Squares Algorithm, Moving Average
Model.
Autoregressive Moving Average (ARMA) Model: An autoregressive moving average model
uses a combination of an autoregressive model and moving average model. If white noise is input
to an ARMA model, the output is the desired process signal u ( k ) . Unfortunately solving the
equations for an ARMA model requires the solution of a set of non-linear equations. See also
Autoregressive Model, Moving Average FIR Filter.
Autoregressive Parametric Spectral Analysis: Using an autoregressive model we can perform parametric power spectral analysis. From the coefficients of the all-pole filter, we can generate the power spectrum of the autoregressive process output, u(k) (see the figure in Autoregressive Model) by exploiting the fact that the white noise input has a flat spectrum and a total power of σ² [17], [90]. Noting that the filter frequency response is:

$H(f) = \frac{1}{1 + b_1 e^{-j\omega} + \dots + b_{M-1} e^{-j(M-1)\omega} + b_M e^{-jM\omega}} = \frac{1}{1 + \sum_{n=1}^{M} b_n e^{-j\omega n}}$     (38)
then the power spectrum of the autoregressive filter output is:
$|Y(f)|^2 = \sigma^2 |H(f)|^2$     (39)
(assuming frequency is normalized so fs=1). See also Autoregressive Model, Autoregressive
Modelling.
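Given a set of AR coefficients (known or estimated), Eq. 38 and Eq. 39 might be evaluated numerically as in the following sketch; the coefficient values, noise power and use of freqz are illustrative choices.

import numpy as np
from scipy.signal import freqz

b_ar = [-1.5, 0.7]                 # AR parameters b1, b2 (assumed known or previously estimated)
sigma2 = 1.0                       # power of the white noise input

# H(f) = 1 / (1 + sum b_n e^{-jwn}); freqz evaluates the numerator/denominator polynomials
w, H = freqz([1.0], [1.0] + b_ar, worN=512)
psd = sigma2 * np.abs(H) ** 2      # Eq. 39, with frequency normalized so that fs = 1
freq = w / (2 * np.pi)             # normalized frequency in cycles/sample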
Autoregressive (AR) Power Spectrum: See Autoregressive Model.
Autoregressive (AR) Process: See Autoregressive Model.
Averaging: See Waveform Averaging, Exponential Averaging, Moving Average, Weighted
Moving Average.
AZTEC Algorithm: Amplitude Zone Time Epoch Coding (AZTEC) is an algorithm used for data
compression of ECGs. The algorithm very simply decomposes a signal into plateaus and slopes
which are then coded in a data array. Compression ratios of a factor of 10 can be achieved,
however the algorithm can cause PRD (Percent Root-mean-square Difference) error levels of
almost 30% [48].
B
Back Substitution: See Matrix Algorithms - Back Substitution.
Band Matrix: See Matrix Structured - Band.
Bandpass Filter: A filter (analog or digital) that preserves portions of an input signal between two
frequencies. See also Bandstop Filter, Digital Filter, Low Pass Filter, High Pass Filter.
A bandpass filter with frequency response G(f): input frequency components between the lower cut-off frequency and the upper cut-off frequency (the filter bandwidth) appear at the output, while components outside this band are attenuated.
Bandstop filter: A filter (analog or digital) that removes portions of an input signal between two
frequencies. See also Bandpass Filter, Low Pass Filter, High Pass Filter.
A bandstop filter with frequency response G(f): input frequency components between the lower cut-off frequency and the upper cut-off frequency (the stopband) are removed, while components outside this band appear at the output.
Bandwagon: The general English definition is a party, cause or group that people may jump on,
or become involved with when it looks likely to succeed. The term was used by the famous
information theorist Claude Shannon in 1956 [130] to describe the explosion of interest in his then
recently published (1948) information theory paper. In referring to that particular bandwagon
Shannon commented that:
“Research rather than exposition is the keynote, and our critical thresholds should be raised. Authors should
submit only their best efforts, and these only after careful criticism by themselves and their colleagues. A few
first rate papers are preferable to a large number that are poorly conceived or half finished. The latter are no
credit to their writers and a waste of time to their readers.”
Bartlett Window: See Window.
Baseband: Typically, a signal prior to any form of digital or analog modulation. A baseband signal
extends from 0Hz contiguously over an increasing frequency range. For example if a radio station
produces a baseband audio signal (typically music, 0 - 12kHz) in either a digital or analog form, the
baseband signal is then modulated onto a carrier (such as 102.5MHz for an FM radio station) for
transmission and subsequent reception by radio receivers. At the radio receiver the signal will be
demodulated back to its original frequency band. Baseband can also refer to a naturally bandpass
signal that has been mixed down to DC.
Basis: See Vector Properties and Definitions - Basis.
Basis Function: A periodic signal, x(t), with period T can be expressed as a series of periodic basis functions, {φn}, such that:

$x(t) = \sum_{n=-\infty}^{\infty} c_n \phi_n(t)$     (40)
A basis is said to be orthogonal if:

$\langle \phi_i(t), \phi_j(t) \rangle = \int_a^b \phi_i(\tau)\,\phi_j^*(\tau)\,d\tau = 0 \quad \text{for } i \neq j$     (41)
where “*” denotes complex conjugate. It is useful to find an orthogonal basis since, if other functions are to be used to approximate a given signal, it is desirable to have as little similarity as possible between the various functions to avoid providing redundant information. The complex exponentials used in the Fourier series are an orthogonal set of functions, and if φk(t) = e^{jkω0t}, where ω0 = 2π/T, then this is the complex or exponential Fourier series. See also Fourier Transform, Matrix Operations.
Baud: A measure of data transmission rate, meaning symbols per second. Baud is often mis-used to
mean bits per second. A baud is actually equal to the number of discrete events or transitions per
second. There is potential confusion over the proper use of the word baud since at high data
transmission speeds where data compression techniques are used (V42bis) the number of
character bits per second transmitted does not necessarily equal the transmitted data rate in
symbols per second.
Baugh-Wooley Multiplier: A type of parallel multiplier which operates on 2’s complement data
and is widely used in DSP [106]. See also Parallel Multiplier.
Bayes Theorem: See Probability.
Beamforming: A technique to enhance the sensitivity of a device towards a given direction (the
look direction) by exploiting the spatial separation of an array of sensors (microphones or
hydrophones for example). The array could be a linear 1-D array, 2-D array or even 3-D. The
primary motivation behind beamforming is often a desire to copy a signal of interest while
suppressing spatially disparate interfering signals. Delay-and-sum beamformers simply combine
the outputs of a number of sensors (after signals are delayed to allow constructive interference in
the look direction).
More advanced adaptive beamforming techniques go further by attempting to null out any signals arriving at the array that are not in the desired look direction. The key mechanisms responsible
for the spatial sensitivity of a beamformer are constructive and destructive interference. Bearing
estimation is related to beamforming, but not necessarily the same. A bearing estimator enhances
Direction of Arrival (DOA) information for signals of interest, while a beamformer produces an
enhanced copy of a signal of interest. See also Adaptive Beamformer, Bearing Estimation, Broadside, Endfire, Constructive Interference, Delay-and-Sum Beamformer, Destructive Interference, Localization, Spatial Filtering.

Beamformer shown with resultant beampattern (polar plot of spatial sensitivity): the desired signal impinges on the mainlobe of the beampattern, an interfering signal falls in a null region, and the sensor outputs are combined by the DSP beamforming implementation to form the beamformer output.
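A minimal narrowband delay-and-sum (phase shift) beamformer for a linear equi-spaced array might be sketched as follows; the element count, spacing, frequency and look direction are illustrative assumptions.

import numpy as np

c, f = 343.0, 1000.0                  # speed of sound (m/s) and frequency (Hz)
N, d = 8, 0.17                        # 8 microphones, approximately half-wavelength spacing
look_deg = 30.0                       # desired look direction, measured from broadside

n = np.arange(N)
delays = n * d * np.sin(np.radians(look_deg)) / c     # per-element delays that align the look direction
steer = np.exp(-2j * np.pi * f * delays)              # narrowband phase shifts replacing true delays

def array_gain(theta_deg):
    """Beamformer response to a plane wave arriving from theta_deg (relative to broadside)."""
    arrival = np.exp(2j * np.pi * f * n * d * np.sin(np.radians(theta_deg)) / c)
    return abs(np.sum(steer * arrival)) / N           # phase shift (delay) and sum, normalized

print(array_gain(30.0))   # ~1.0 in the look direction
print(array_gain(-60.0))  # much smaller for a spatially disparate arrival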
Beampattern: A plot of spatial sensitivity of a beamformer (or antenna) as a function of direction.
The main lobe and sidelobes are often easily distinguished. Any nulls (direction with virtually no
sensitivity) are also clearly distinguished. Beampatterns can be plotted for a single frequency
(useful for a narrowband application) or as a broadband measure where the sensitivity in each
direction is integrated over the frequency span of interest. Broadband patterns seldom contain the
deep nulls that are present in narrowband patterns. See also Beamformer, Localization.
Typical beampattern: a polar plot of array gain (in dB) as a function of angle, showing the mainlobe, the sidelobes, and the 0 dB contour.
Bearing Estimation: A classic signal processing problem where it is required to find the angular
direction of a number of incoming source signals. In bearing estimation, source signal copy is not
a concern. See also Beamforming, Localization.
Beat Frequencies: When two audible tones of similar frequencies are played together they will
effectively go in and out of phase with each other and alternately constructively and destructively
interfere. Depending on the frequencies and the magnitude of the difference between the tones
they may be aurally perceived as beat frequencies rather than two distinct tones. If the frequency
difference is no greater than about 10Hz then the ear will follow the amplitude fluctuations and
therefore perceive a low beat frequency. Beat frequencies are heard most clearly for tones between
around 300Hz and 600Hz. As the frequency of the tones increases above 1000-1500Hz the tones
will be heard distinctly rather than as beats. This phenomenon is consistent with the fact that the neural firings of the auditory system lose synchrony with the incoming sine wave at these frequencies.
Simple trigonometry shows that:

$\cos A + \cos B = 2 \cos\left(\frac{A-B}{2}\right) \cos\left(\frac{A+B}{2}\right)$     (42)

Therefore if a 100Hz tone and a 110Hz tone are played simultaneously the composite tone can be written as:

$\cos(2\pi 100t) + \cos(2\pi 110t) = 2 \cos\left(\frac{2\pi 10t}{2}\right) \cos\left(\frac{2\pi 210t}{2}\right) = 2 \cos(2\pi 5t) \cos(2\pi 105t)$     (43)
which can be represented as a plot of the 100Hz tone, the 110Hz tone, and the composite 100Hz + 110Hz tone against time (0 to 0.2 seconds): the composite waveform shows a slowly fluctuating amplitude envelope.
The composite tone clearly shows the amplitude fluctuation at 10 times per second caused by the
5Hz modulation effect.
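Eq. 43 can be checked numerically, for example:

import numpy as np

t = np.linspace(0.0, 0.2, 2000)                       # 0 to 0.2 seconds
composite = np.cos(2 * np.pi * 100 * t) + np.cos(2 * np.pi * 110 * t)
product = 2 * np.cos(2 * np.pi * 5 * t) * np.cos(2 * np.pi * 105 * t)   # right hand side of Eq. 43

print(np.allclose(composite, product))                # True: the two forms are identical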
A phenomenon called binaural beats (as distinct from the above description of monaural beats)
occurs when a tone of one frequency is presented to one ear, and a slightly different tone frequency
is presented to the other ear [30]. The sound will appear to fluctuate at a rate corresponding to the
difference between the frequencies. See also Audiology, Binaural Beats, Binaural Unmasking,
Psychoacoustics.
Bell 103/ 113: The Bell 103/113 is a modem standard for communication at 300 bits/sec. The Bell
103/113 is a full duplex modem using FSK (frequency shift keying) modulation. The frequencies
used are:
                    Originate End (Hz)    Answer End (Hz)
Transmit:   Space          1070                 2025
            Mark           1270                 2225
Receive:    Space          2025                 1070
            Mark           2225                 1270
The transmit level is 0 to -12 dBm and the receive level is 0 to -50 dBm.
Although in the mid 1990s modem speeds of 14400 bits/sec are standard and (compressed) bit
data rates of 115200 bits/sec are achievable for remote computer communication, the 300 baud
modem is still one of the top selling modems! This is due to low rate modems being used for short
time connection applications where only a few bytes of data are exchanged, such as telephone
credit card verification, traffic light control, remote metering and security systems. See also Bell 202,
Standards, V-Series Recommendations.
Bell 202: The Bell 202 is a modem standard for communication at 1200 bits/sec. The Bell 202 is
a half duplex modem using FSK (frequency shift keying) modulation. The frequencies used are:
             Transmit (Hz)
Space             2200
Mark              1200
See also Bell 103/113, Bell 212, Standards, V-Series Recommendations.
Bell 212: The Bell 212 is a modem standard for communication at 1200 bits/sec. The Bell 212 is a full duplex modem using QPSK (quadrature phase shift keying) modulation. The carrier frequencies used are:

               Originate End (Hz)    Answer End (Hz)
Transmit:            1200                  2400
Receive:             2400                  1200
Each keying interval carries two bits:

   Message (2 bits)    Phase Angle
         00                90°
         01                 0°
         10               180°
         11               270°
See also Bell 103/113, Bell 202, Standards, V-Series Recommendations.
Bento: Bento is a multimedia data storage and interchange format whose development was sponsored primarily by Apple Inc., probably with the intention that it would become a de facto standard. The standard is available from ftp://ftp.apple.com/apple/standards/. See
also Standards.
BER vs. S/N Test: (Bit Error Rate vs. Signal to Noise Ratio). A test used to measure the ability of
a modem (or a digital communication system) to operate over noisy lines with a minimum of data transfer errors. Since even on the best of telephone lines there is always some level of noise, the
modem should work with the lowest S/N ratio possible.
Plot of BER (bit error rate, 10⁻² down to 10⁻⁶) against signal to noise ratio (4 to 16 dB) for a typical modem operating at 1200 bits/second.
Other modem performance characteristics include BER vs. Phase Jitter which demonstrates the
tolerance to phase jitter; BER vs. Receive Level which measures the sensitivity to the received
signal dynamic range (typically 36dB is the minimum desirable); BER vs. Carrier Offset which
indicates how the modem performance is affected by the shifts in the carrier frequency encountered in normal public telephone networks (ITU-T specifications allow up to a 7Hz offset).
Bessel Filter: See Filters.
Bidiagonal Matrix: See Matrix Structured - Bidiagonal.
Binary: Base 2, where only the digits 0 and 1 are used to represent numbers, e.g.
MSB                                            LSB
2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128    64    32    16     8     4     2     1

 0     1     0     1     1     0     1     0    = 90
 1     0     0     0     0     0     0     0    = 128
 1     1     0     0     0     0     0     0    = 192

The decimal equivalents of the unsigned 8 bit numbers 01011010, 10000000, and 11000000.
See also Binary Point, Two’s Complement.
Binary Phase Shift Keying (BPSK): A special case of PSK in which two signals with differing
phase exist in the signal set. See also Phase Shift Keying.
Binary Point: The binary point is the base 2 equivalent of the decimal point. Bits after the binary point have a fractional value. See also Fractional Binary, Integer Arithmetic, Two's Complement.

MSB                                                                       LSB
−2^0    2^-1    2^-2     2^-3      2^-4      2^-5       2^-6       2^-7
−1      0.5     0.25     0.125     0.0625    0.03125    0.015625   0.0078125

 0       1       0        1         1         0          1          0      = 0.703125
 1       0       0        0         0         0          0          0      = −1
 1       1       0        0         0         0          0          0      = −0.5

The decimal equivalents of 0.1011010, 1.0000000, and 1.1000000. Note that the 2's complement notation can still be used, with the most significant bit having a weighting of −1.
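A small sketch that evaluates such fractional two's complement bit patterns (the helper function below is hypothetical, purely for illustration):

def frac_twos_complement(bits):
    """Value of a fractional two's complement bit string such as '0.1011010'."""
    sign, frac = bits.split(".")
    value = -1.0 * int(sign)                      # the MSB has a weighting of -1
    for i, b in enumerate(frac, start=1):
        value += int(b) * 2.0 ** (-i)             # bits after the binary point: 0.5, 0.25, ...
    return value

for pattern in ("0.1011010", "1.0000000", "1.1000000"):
    print(pattern, "=", frac_twos_complement(pattern))   # 0.703125, -1.0, -0.5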
Binaural: Binaural processing refers to an audio system that processes signals for presentation to
two ears. See also Monaural, Monophonic, Stereophonic.
Binaural Beats: A phenomenon called binaural beats occurs when a tone of one frequency is
presented to one ear, and a slightly different tone frequency is presented to the other ear using
headphones. The sound will appear to fluctuate at a rate corresponding to the difference between
the frequencies. Binaural beats are a result of the interaction of the nervous system of the output of
the ear to the brain. Binaural beats would appear to indicate that the auditory nerve preserves
phase information about the acoustic stimulus [30]. See also Audiology, Beat Frequencies, Binaural
Unmasking, Psychoacoustics.
With a 300Hz tone presented to one ear and a 310Hz tone presented to the other, the listener will experience 10 binaural beats per second.
Binaural Unmasking: If a tone masked by white noise is played into one ear or both ears (diotic
stimulus) then the auditory mechanism will not perceive the tone without either increasing the tone
sound pressure level (SPL) or decreasing the white noise SPL. However if the tone + white noise
is played into one ear, and the white noise only into the other ear (dichotic stimulus) then the
auditory effect of binaural unmasking will actually make the tone more readily detectable.
Binaural unmasking will also occur when noise + tone is input to both ears, but the phase of one of the tones is shifted by 180° relative to the other.
With noise + tone presented to both ears the tone is NOT perceived: the tone in both ears is completely masked by the white noise. With noise only in one ear and noise + tone in the other, the tone is perceived: if noise only is played into the right ear the tone becomes readily detectable, hence the auditory mechanism is providing a form of noise cancellation.
As a crude DSP analogy, compare this effect to the adaptive noise canceller whereby if a
(correlated) noise reference is available, the noise in a speech + noise signal can be attenuated, thus providing an improved SNR at the canceller output. See also Adaptive Noise Cancellation,
Audiometry, Dichotic, Diotic.
Biomedical Signals: Over the last few years biomedical signals such as ECGs, EEGs, evoked potentials, and EMGs have been recorded using DSP acquisition hardware, sampling at a few hundred Hertz. There is now considerable work to develop DSP algorithms for the analysis, classification, and compression of sampled biomedical signals [48]. IEEE Transactions on Biomedical
Engineering is a good source for further information. See also ECG, EMG, Evoked Potentials.
Bipolar (1): A type of integrated circuit that uses NPN or PNP bipolar transistors in its construction
[45].
Bipolar (2): Bipolar refers to the type of signalling method used for digital data transmission, in
which either the marks or the spaces are indicated by successively alternating positive and negative
polarities. See also Non-return to Zero, Polar.
Bit: A single binary digit; a 0 (a space) or 1 (a mark).
Bit Error Rate (BER): The fraction of bits in error occurring in a received bit stream. BER is
calculated as the average number of bits in error, divided by the total number of bits in a given binary
digit data stream. See also BER vs. S/N Test.
Bit Reverse Addressing: Due to the nature of the FFT algorithm it is often required to access data
from memory in a non-arithmetic sequence (i.e. not 0,1,2, etc.) but in a sequence which is
generated by reversing the address bits. As this type of addressing is very common to a DSP
processor computing FFTs, this special addressing mode is available in some DSP processors to
make programming easier, and algorithm execution faster. See also Decimation-in-Time,
Decimation-in-Frequency, FFT.
Bit Serial Multiplier: See Parallel Multiplier.
Bitstream: Bitstream (Philips technology) DACs use sigma-delta technology to produce low cost
and precise digital to analog converters. See Sigma Delta.
Blackman Window: See Window.
Blackman-Harris Window: See Window.
Blue Book: Shorthand name for the ITU-T regulations published in 1988 in 20 volumes and 61
Fascicles with a blue cover! (The ITU were known as the CCITT in 1988.) See also International
Telecommunication Union, Red Book, Standards.
Board: See DSP Board.
Bounded: When the upper and lower values of specific parameters of a signal (or function) are
known, or can be calculated or inferred from prior knowledge, then that parameter is said to be
bounded.
Boxcar Filter: See Moving Average.
Brick Wall Filter: This is a filter having a frequency response that falls off to zero with infinite slope
at some specified frequency. Although such filters are desirable in various DSP applications a true
brick wall filter does not exist, and approximations with tolerable errors must be made.
In the ideal brick wall filter, all frequencies below f0 are passed by the filter, and all frequencies above f0 are completely removed.

Broadband: See Wideband.
Broadband Hiss: If a speech or music signal has a relatively low level of superimposed white
noise then this is referred to as broadband hiss. The term hiss is onomatopoeic -- the prolonged
sound of the “ss’s” gives a good simulation of the phenomenon. See also Dithering, White Noise.
Broadband Integrated Digital Services Network (BISDN): Generally, BISDN refers to the
information infrastructure provided by communications companies and institutions. The term
BISDN evolved from the Integrated Services Digital Network (ISDN) to be a superset of the
hardware and protocols provided by a previously adequate network infrastructure.
Broadside: A beamformer configuration in which the desired signal is located at right angles to the
line or plane containing an array of sensors. See also Beamforming, Endfire.
Broadside direction indicated for a linear array of sensors whose outputs are summed (Σ): the broadside direction is at 90° to the line of the array.
Buffer: Usually an area of memory used to store data temporarily. For example a large stream of
sampled data is buffered in memory as 1000 sample chunks prior to digital signal processing.
Buffers are also used in data communications to compensate for changes in the rate of data flow
(e.g., rate fluctuations due to data compression algorithms).
Buffer Amplifier: An amplifier with a high input impedance and low output impedance that has a
voltage gain of one. If, for example, a sensor outputs an analog voltage that is of the appropriate
magnitude to input to an ADC, but it cannot deliver or sink enough current, then a buffer amplifier
can be used prior to the ADC converter. The simplest form of buffer amplifier to build is a voltage
follower with gain 1, implemented using an op-amp.
A voltage follower amplifier used as a buffer: a very low power input voltage is reproduced at the output with the same magnitude, but with the capability to deliver a higher power (current).
Burst Errors: When a large number of bits are incorrect in a relatively short segment of data bits
then a burst error has occurred. In burst errors the average bit error rate is greatly exceeded by
multiple bit errors. When the number of bits in error is very high then non-interleaved error
correction schemes are unlikely to be successful and retransmission of the data may be required.
See also Channel Coding, Interleaving, Cross-Interleaved Reed-Solomon Coding.
Bus: The generic name given to a set of wires used to transmit digital information from one point
to another. A bus can be on-chip or off-chip. See also DSP Processor.
Busy Tone: Tones at 480 Hz and 620 Hz make up the busy tone for telephone systems.
Butterfly: The name given to the signal flow graph (SFG) element which can be used as a basic
computational element to construct an N point fast Fourier transform (FFT) computation. See also
FFT.
A radix-2 butterfly: two inputs are combined using the twiddle factor $W_N^k$ and a −1 branch to produce two outputs.
Butterworth Filter: See Filters.
Byte: 8 bits. 2 nibbles.
C
Cable (1): One or more conductors (such as copper wire) or other transmission media (such as
optical fiber) within a protective sheath (usually plastic) to allow the efficient propagation of signals.
Cable (2): A generic name for cable TV systems using coaxial cable and/or optical fibers to
transmit signals. Cable was first introduced into areas of the USA where geographical features
prevented normal terrestrial TV reception. Within a few years of its introduction it proved so popular,
flexible and reliable that cable became widely available all over the USA. Currently cable companies
are involved in developing digital broadcast systems, and interactive TV viewing features.
Cache: A useful means of keeping often used data or information handy, a cache is simply a buffer
of memory whose contents are updated according to an algorithm that is designed to minimize the
number of data accesses that require looking beyond the cache memory. Both hardware and
software implementations of the cache algorithms are common in DSP systems.
Call Progress Detection (CPD): A technique for monitoring the connection status during initiation
of a telephone call by detecting the presence of call progress signalling tones such as the dialing
tone, or the engaged (busy) signals as commonly found in the telephone network.
Carrier Board: A printed circuit board that can host a number of daughter modules providing
facilities such as a DSP processor, memory, and I/O channels. A carrier board without daughter
modules has no real functionality. See also DSP Board, DSP Processor.
Carry Look-Ahead Adder: See entry for Parallel Adder.
Cassette Tape: See Compact Cassette Tape.
Cauchy-Schwartz Inequality: See Vector Properties - Cauchy-Schwartz.
Causal: A signal produced by a real device or system is said to be causal. If a signal generating
device is turned on at time t = t0, then the resultant signal produced exists only after time t = t0:

$y(t) = \begin{cases} x(t) & \text{if } t \geq t_0 \\ 0 & \text{if } t < t_0 \end{cases}$     (44)

Signals that are not causal are said to be non-causal. Although in the real world all signals are necessarily causal, from a mathematical viewpoint non-causal signals can be useful for the analysis of signals and systems.
Central Processing Unit (CPU): The part of the processor that performs the actual processing operations of addition, multiplication, comparison, etc. The size of the arithmetic in the CPU usually defines the processor wordlength. For example the DSP56002 has a 24 bit CPU, meaning that it is
a 24 bit processor. Usually the CPU wordlength matches the data bus width. If a DSP processor is
floating point, then the CPU will also be capable of floating point arithmetic. See also DSP
Processor.
Channel: The generic name given to the transmission path of any signal, which usually changes
the signal characteristics, e.g. a telephone channel.
Also used to mean the input or output port of a DSP system. For example a DSP board with two ADCs and one DAC would be described as a twin channel input, single channel output system.
Channel Coding: This refers to the coding of information data that introduces structured
redundancy so that inevitable errors introduced by transmitting symbols over noisy channels will be
correctable (or at least detectable) at the receiver. The simplest channel codes are single bit parity
checks (a simple block code). Other, more involved block codes and convolutional codes exist. In
block coding a block of k data bits are encoded into n code bits to yield a rate k/n code. Block codes
tend to have large k and large n. In convolutional coding the coder maintains a memory of previous
data bits and outputs n code bits for each k input bits (using not only the input data bits but also
those data bits stored in the coder memory) to yield a rate k/n code. Convolutional codes tend to
have small values of k and n with coding strength determined by the amount of memory in the
coder. Block and/or convolutional coding techniques can be combined to produce very strong (often
cross-interleaved) codes. See also Source Coding, Interleaving, Cross-Interleaved Reed-Solomon
Code.
Characteristic Polynomial: In order to conveniently specify the code used for cyclic redundancy coding (CRC) or a pseudo random binary sequence, a characteristic polynomial is often referred to. For example, the divisor used in ITU-T V.41 error control, 10001000000100001, is easier to represent as:

$X^{16} + X^{12} + X^{5} + 1$     (45)

The index of each term in this polynomial indicates a 1 in the divisor (i.e. the divisor has 1's at positions 0, 5, 12 and 16). See also Pseudo-Random Binary Sequence.
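As a quick check, the divisor bit pattern can be regenerated from the polynomial exponents (a small illustrative sketch):

exponents = [16, 12, 5, 0]                 # terms of X^16 + X^12 + X^5 + 1 (Eq. 45)
bits = ["1" if i in exponents else "0" for i in range(max(exponents), -1, -1)]
print("".join(bits))                       # 10001000000100001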
Chebyshev Filter: See Filters.
Character: Letter, number, punctuation or other symbol. Characters are the basic unit of textual
information. In DSP enabled data communication most characters are represented by ASCII codes.
See also ASCII, EBCDIC.
Chip: Integrated Circuit.
Chip Interval: The clocking period of a pseudo random binary sequence generator. See also
Pseudo Random Binary Sequence Generator.
Cholesky Decomposition: See Matrix Decompositions - Cholesky.
Chorus: A music effect where a delayed, and perhaps low pass filtered version of a signal is added
to the original signal to create a chorus or echoic sound. See also Music, Music Synthesis.
Chromatic Scale: The complete set of 12 notes in one octave of the Western music scale is often
referred to as the chromatic scale. Each adjacent note in the chromatic scale differs by one
semitone, which corresponds to multiplying the lower frequency by the twelfth root of 2, i.e. $2^{1/12} = 1.0594631\ldots$ The chromatic scale is also known as the equitempered scale. See also Western Music Scale.
Circulant Matrix: See Matrix Structured - Circulant.
Circular Buffers: This is effectively a programming concept that allows fast and efficient implementation of shift registers in memory, so that convolutions, FIR filters, and correlations can be executed with a minimum of data movement as each new data sample arrives. Modulo registers and indirect pointers facilitate circular buffers.
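A minimal circular buffer and FIR filtering step in plain Python might look as follows (a sketch of the concept, not of any particular processor's modulo addressing hardware):

class CircularBuffer:
    """Fixed-length delay line updated in place, using a modulo write pointer."""
    def __init__(self, length):
        self.data = [0.0] * length
        self.ptr = 0

    def push(self, sample):
        self.data[self.ptr] = sample                 # overwrite the oldest sample
        self.ptr = (self.ptr + 1) % len(self.data)   # advance the pointer modulo the buffer length

    def tap(self, delay):
        """Return the sample delayed by 'delay' samples (delay = 1 is the newest)."""
        return self.data[(self.ptr - delay) % len(self.data)]

# FIR filtering with weights w: y(k) = sum_n w[n] * x(k - n)
def fir_step(buf, w, x_new):
    buf.push(x_new)
    return sum(w[n] * buf.tap(n + 1) for n in range(len(w)))

buf = CircularBuffer(4)
print([fir_step(buf, [0.25, 0.25, 0.25, 0.25], x) for x in (1.0, 1.0, 1.0, 1.0, 1.0)])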
Circular Reasoning: See Reasoning, Circular.
CISC: Complex Instruction Set Computer (see RISC definition)
Clipping: The nonlinear process whereby the value of an input voltage is limited to some
maximum and minimum value. An analog signal with a magnitude larger than the upper and lower
bounds ± V max of an ADC chip, will be clipped. Any voltage above V max will be clipped and the
information lost. Clipping effects frequently occur in amplifiers when the amplification of the input
signal results in a value greater than the power rail voltages.
A clipping circuit limits the input voltage Vin to the range ±Vmax: Vout = Vin for |Vin| < Vmax, and Vout = ±Vmax for |Vin| ≥ Vmax.
Clock: A device which produces a periodic square wave that can be used to synchronize a DSP
system. Current technology can produce extremely accurate clocks into the MHz range of
frequencies.
Clock Jitter: If the clock edges of a clock vary in time about their nominal position in a stochastic
manner, then this is clock jitter. In ADCs and DACs clock jitter will manifest itself as a raising of the
noise floor [78]. See also Quantization Noise.
CMOS (Complementary Metal Oxide Silicon): The (power efficient) integration technology used to fabricate most DSP processors.
Cochlea: The mechanics of the cochlea convert the vibrations from the bones of the middle ear
(i.e., the ossicles, often called the hammer, anvil and stirrup) into excitation of the acoustic nerve
endings. This excitation is perceived as sound by the brain. See also Ear.
Codebook Coding: A technique for data compression based on signal prediction. The
compressed estimate is derived by finding the model that most closely matches the signal based
on previous signals. Only the error between the selected model and the actual signal needs to be
transmitted. For many types of signal this provides excellent data compression since, provided the
codebook is sufficiently large, errors will be small. See also Compression.
Codec: A COder and DECoder. Often used to describe a matched pair of A/D and D/A converters
on a single CODEC chip usually with logarithmic quantizers (A-law for Europe and µ -law for the
USA.)
Coded Excited Linear Prediction Vocoders (CELP): The CELP vocoder is a speech encoding
scheme that can offer good quality speech at relatively low bit rates (4.8 kbits/sec) [133]. The
drawback is that this vocoder scheme has a very high computational requirement. CELP is
essentially a vector quantization scheme using a codebook at both analyzer and synthesizer. Using
CELP a 200Mbyte hard disk drive could store close to 100 hours of digitized speech. See also
Compression.
Coherent: Refers to a detection or demodulation technique that exploits and requires knowledge
of the phase of the carrier signal. Incoherent or Noncoherent refers to techniques that ignore or do
not require this phase information.
Color Subsampling: A technique widely used in video compression algorithms such as MPEG1.
Color subsampling exploits the fact that the eye is less sensitive to the color (or chrominance) part
of an image compared to the luminance part. Since the eye is not as sensitive to changes in color
in a small neighborhood of a given pixel, this information is subsampled by a factor of two in each
dimension. This subsampling results in one-fourth of the number of chrominance pixels (for each of
the two chrominance fields) as are used for the luminance field (or brightness). See also Moving
Picture Experts Group.
Column Vector: See Vector.
Comb Filter: A comb digital filter is so called because the magnitude frequency response is
periodic and resembles that of a comb. (It is worth noting that the term “comb filter” is not always
used consistently in the DSP community.) Comb filters are very simple to implement either as an
FIR filter type structure where all weights are either 1, or 0, or as single pole IIR filters. Consider a
simple FIR comb filter, built from N delay elements and an adder (“+”) or subtractor (“−”), implementing:

y(k) = x(k) ± x(k - N)
The simple comb filter can be viewed as an FIR filter where the first and last filter weights
are 1, and all other weights are zero. The comb filter can be implemented with only a shift
register and an adder; multipliers are not required. If the two samples are added then the
comb filter has a linear gain factor of 2 (i.e. 6 dB) at 0 Hz (DC), thus in some sense giving a
low pass characteristic at low frequencies. If they are subtracted the filter has a gain
of 0 at DC, giving in some sense a band stop filter characteristic at low frequencies.
The transfer function for the FIR comb filters can be found as:
Y(z) = X(z) ± z^(-N) X(z) = (1 ± z^(-N)) X(z)

⇒ H(z) = Y(z)/X(z) = 1 ± z^(-N)    (46)
The zeroes of the comb filter are the N roots of the z-domain polynomial 1 ± z^(-N). Therefore for the
case where the samples are subtracted:

1 - z^(-N) = 0
⇒ z_n = (1)^(1/N), where n = 0…N - 1
⇒ z_n = e^(j2πn/N), noting e^(j2πn) = 1    (47)

And for the case where the samples are added:

1 + z^(-N) = 0
⇒ z_n = (-1)^(1/N), where n = 0…N - 1
⇒ z_n = e^(j2π(n + 1/2)/N), noting e^(j2π(n + 1/2)) = -1    (48)
As an example, consider a comb filter H(z) = 1 + z^(-8) and a sampling rate of fs = 10000 Hz. The
impulse response, h(n), frequency response, H(f), and zeroes of the filter can be illustrated as:
[Figure] The impulse response, z-domain plot of the zeroes, and magnitude frequency response (plotted
from 0 to 5000 Hz on both log and linear magnitude scales) of the comb filter H(z) = 1 + z^(-8). Note that
the comb filter is like a set of frequency selective bandpass filters, with the first half-band filter having a
low pass characteristic. The number of bands from 0 Hz to fs/2 is N/2. The zeroes are spaced equally
around the unit circle and symmetrically about the x-axis, with no zero at z = 1. (There is a zero at
z = -1 if N is odd.)
For the comb filter H(z) = 1 - z^(-8) and a sampling rate of fs = 10000 Hz, the impulse response,
h(n), frequency response, H(f), and zeroes of the filter are:
[Figure] The impulse response, z-domain plot of the zeroes, and magnitude frequency response (plotted
from 0 to 5000 Hz on both log and linear magnitude scales) of the comb filter H(z) = 1 - z^(-8). The
zeroes are spaced equally around the unit circle and symmetrically about the x-axis. There is a zero at
z = 1. There is not a zero at z = -1 if N is odd.
FIR comb filters have linear phase and are unconditionally stable (as are all FIR filters). For more
information on unconditional stability and linear phase see the entry for Finite Impulse Response Filters.
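The behaviour described above can be checked with a short sketch (Python is used here purely for
illustration):

    # Sketch: magnitude frequency response of the FIR comb filter
    # y(k) = x(k) + x(k - N), i.e. H(z) = 1 + z^(-N), for N = 8, fs = 10000 Hz.
    import cmath
    N, fs = 8, 10000.0
    def gain(f):
        w = 2 * cmath.pi * f / fs               # digital frequency in rad/sample
        return abs(1 + cmath.exp(-1j * w * N))  # |H(e^jw)|
    print(gain(0.0))      # 2.0  (6 dB low pass lobe at DC)
    print(gain(625.0))    # ~0   (a zero: 625 Hz = fs/(2N))
    print(gain(1250.0))   # 2.0  (next peak at fs/N)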
Another type of comb filter magnitude frequency response can be produced from a single pole IIR
filter:
[Figure: a single pole IIR comb filter with N delay elements in a feedback loop through the weight b,
implementing y(k) = x(k) ± b·y(k - N).]
A single pole IIR comb filter. The closer the weight value b is to 1, then the sharper the teeth
of the comb filter in the frequency domain (see below). b is of course less than 1, or
instability results.
This type of comb filter is often used in music synthesis and for soundfield processing [43]. Unlike
the FIR comb filter note that this comb filter does require at least one multiplication operation.
Consider the difference equation of the above single pole IIR comb filter:
y(k) = x(k) ± b·y(k - N)

⇒ G(z) = Y(z)/X(z) = 1/(1 ± b z^(-N))    (49)
For a sampling rate of fs = 10000 Hz, N = 8 and b = 0.6, the impulse response g(n), the
frequency response, G(f), and poles of the filter are:
[Figure] The z-domain plot of the filter poles and magnitude frequency response (0 to 5000 Hz) of the
one pole comb filter G(z) = 1/(1 - 0.6 z^(-8)). The poles are inside the unit circle and lie on a circle of
radius 0.6^(1/8) = 0.938…. As the feedback weight value, b, is decreased (closer to 0), the poles move
away from the unit circle towards the origin, and the peaks of the magnitude frequency response
become less sharp and provide less gain.
If the feedback weight, b, is increased to be very close to 1, the “teeth” of the filter become sharper
and the gain increases:
[Figure] The z-domain plot of the filter poles and magnitude frequency response (0 to 5000 Hz) of a one
pole comb filter G(z) = 1/(1 - 0.9 z^(-8)). The poles are just inside the unit circle and lie on a circle of
radius 0.9^(1/8) = 0.987….
Of course if b is increased such that b ≥ 1 then the filter is unstable.
The IIR comb filter is mainly used in computer music [43] for simulation of musical instruments and
in soundfield processing [33] to simulate reverberation.
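A corresponding sketch for the single pole IIR comb filter used in the plots above (illustrative only):

    # Sketch: gain of the single pole IIR comb filter G(z) = 1/(1 - b z^(-N))
    # for N = 8 and b = 0.6, as in the plots above (fs = 10000 Hz).
    import cmath
    N, b, fs = 8, 0.6, 10000.0
    def gain(f):
        w = 2 * cmath.pi * f / fs
        return abs(1.0 / (1.0 - b * cmath.exp(-1j * w * N)))
    print(gain(0.0))      # 2.5   (peak: 1/(1 - 0.6))
    print(gain(625.0))    # 0.625 (trough: 1/(1 + 0.6))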
Finally it is worth noting again that the term “comb filter” is used by some to refer to the single pole
IIR comb filter described above, and the term “inverse comb filter” to the FIR comb filter, both
described above. Other authors refer to both as comb filters. The uniting feature of all comb filters,
however, is the periodic (comb like) magnitude frequency response. See also Digital Filter, Finite
Impulse Response Filter, Finite Impulse Response Filter - Linear Phase, Infinite Impulse Response
Filter, Moving Average Filter.
Comité Consultatif International Télégraphique et Téléphonique (CCITT): The English
translation of this French name is the International Consultative Committee on Telegraphy and
Telecommunication, and it is now known as the ITU-T committee. The ITU-T (formerly CCITT) is an
advisory committee to the International Telecommunications Union (ITU) whose recommendations
covering telephony and telegraphy have international influence among telecommunications
engineers and manufacturers. See also International Telecommunication Union, ITU-T.
Comité Consultatif International Radiocommunication (CCIR): The English translation of this
French name is the International Consultative Committee on Radiocommunication, and it is now
known as the ITU-R committee. The ITU-R (formerly CCIR) is an advisory committee to the
International Telecommunications Union (ITU) whose recommendations covering
radiocommunications have international influence among radio engineers and manufacturers. See
also International Telecommunication Union, ITU-R.

Comité Européen de Normalisation Electrotechnique (CENELEC): CENELEC is the
European Committee for Electrotechnical Standardization. They provide European standards over
a wide range of electrotechnology. CENELEC has drawn up an agreement with European
Telecommunications Standards Institute (ETSI) to study telecommunications, information
technology and broadcasting. See also European Telecommunications Standards Institute,
International Telecommunication Union, International Organisation for Standards, Standards.
Common Intermediate Format (CIF): The CIF image format has 288 lines by 360 pixels/line of
luminance information and 144 x 180 of chrominance information and is used in the ITU-T H.261
digital video recommendation. A reduced version of CIF called quarter CIF (QCIF) is also defined
in H.261. The choice between CIF and QCIF depends on channel bandwidth and desired quality.
See also H-series Recommendations, International Telecommunication Union, Quarter Common
Intermediate Format.
Compact Cassette Tape: Compact cassette tapes were first introduced in the 1960s for
convenient home recording and audio replay. By the end of the 1970s compact cassette was one
of the key formats for the reproduction of music. Currently available compact cassettes afford a
“good” response of about 65dB dynamic range from 100Hz to 12000Hz or better. Compact cassette
outlived vinyl records, and is still a very popular format for music particularly in automobile audio
systems. In the early 1990s DCC (Digital Compact Cassette) was introduced which had backwards
compatibility with compact cassette. See also Digital Compact Cassette.
Compact Disc (CD): The digital audio system that stores two channels (stereo) of 16-bit music
sampled at 44.1kHz. Current CDs allow almost 70 minutes of music to be stored on one disc
(without compression). This is equivalent to a total of

2 × 44100 × 70 × 60 × 16 = 5927040000 bits of information.    (50)
CDs use cross-interleaved Reed-Solomon coding for error protection. See also Digital Audio Tape
(DAT), Red Book, Cross-Interleaved Reed-Solomon Coding.
Compact Disc-Analogue Records Debate: Given that the bandwidth of hi-fidelity digital audio
systems is up to 22.05kHz for compact disc (CD) and 24kHz for DAT it would appear that the full
range of hearing is more than covered. However this is one of the key issues of the CD-analogue
records debate. The argument of some analog purists is that although humans cannot perceive
individual tones above 20kHz, when listening to musical instruments which produce harmonic
frequencies above the human range of hearing these high frequencies are perceived in some
“collective” fashion. This adds to the perception of live as opposed to recorded music; the debate
will probably continue into the next century. See also Compact Disc, Frequency Range of Hearing,
Threshold of Hearing.
Compact Disc ROM (CD-ROM): As well as music, CDs can be used to store general purpose
computer data, or even video. Thus the disk acts like a Read Only Memory (ROM).
Companders: Compressor and expander (compander) systems are used to improve the SNR of
channels. Such systems initially attenuate high level signal components and amplify low level
signals (compression). When the signal is received the lower level signals appear at the receiving
end at a level above the channel noise, and when expansion (the inverse of the compression
function) is applied an improved signal to noise ratio is maintained. In addition, the original signal is
preserved by the inverse relationship between the compression and expansion functions. In the
absence of quantization, companders provide two inverse 1-1 mappings that allow the original
signal to be recovered exactly. Quantization introduces an irreversible distortion, of course, that
does not allow exact recovery of the original signal. See also A-law and µ -law.
Comparator: A device which compares two inputs and gives an output indicating which input is
the larger.
Complex Base: In everyday life base 10 (decimal) is used for numerical manipulation, and inside
computers base 2 (binary) is used. When complex numbers are manipulated inside a DSP
processor, the real parts and complex parts are treated separately. Therefore to perform a complex
multiplication of:
(a + jb)(c + jd) = (ac - bd) + j(ad + bc)    (51)

where 16 bit numbers are used to represent a, b, c, and d, will require four separate real number
multiplications and two additions. Therefore an interesting alternative (although not used in
practice to the authors’ knowledge) is to use the complex base (1 + j), where only the digits 0, 1,
and j are used. Setting up a table of the powers of this base gives:
and j are used. Setting up a table of the powers of this base gives:
(1+j)4
(1+j)3
(1+j)2
(1+j)1
(1+j)0
-4
-2+2j
2j
1+j
1
0
0
1
1
0
1 + 3j
0
0
0
j
0
-1 - j
0
j
1
1
1
j
1
0
1
1
0
-3 + 3j
Complex
Decimal
Numbers in the complex base ( 1 + j ) can then be arithmetically manipulated (addition,
subtraction, multiplication) although this is not as straightforward as for binary!
Complex Conjugate: A complex number is conjugated by negating the complex part of the
number. The complex conjugate is often denoted by a "*". For example, if a = 5 + 7j , then
a∗ = 5 – 7j . (A complex number and its conjugate are often called a conjugate pair.) Note that
the product aa* is always a real number:

aa* = (5 + 7j)(5 - 7j) = 25 + 35j - 35j + 49 = 25 + 49 = 74    (52)
and can clearly be calculated by summing the squares of the real and complex parts. (Taking the
square root of the product aa* is often referred to as the magnitude of a complex number.) The
conjugate of a complex number expressed as a complex exponential is obtained by negating the
exponential power:

(e^(jω))* = e^(-jω)    (53)

This can be easily seen by noting that:

e^(jω) = cos ω + j sin ω, and    (54)

e^(-jω) = cos(-ω) + j sin(-ω) = cos ω - j sin ω    (55)

given that cosine is an even function, and sine is an odd function. Therefore:

e^(jω) e^(-jω) = e^0 = cos²ω + sin²ω = 1    (56)
A simple rule for taking a complex conjugate is: “replace any j by -j “. See also Complex Numbers.
Complex Conjugate Reciprocal: The complex conjugate reciprocal of a complex number is
found by taking the reciprocal of the complex conjugate of the number. For example, if z = a + bj ,
then the complex conjugate reciprocal is:
1/z* = 1/(a - bj) = (a + bj)/(a² + b²)    (57)
See also Complex Numbers, Pole-Zero Flipping.
Complex Exponential Functions: The exponential of a complex number times t, the time variable,
provides a fundamental and ubiquitous signal type for linear systems analysis: the damped
exponential. These signals describe many electrical and mechanical systems encountered in
everyday life, like the suspension system for an automobile. See also Damped Exponential.
Complex LMS: See LMS algorithm.
Complex Numbers: A complex number contains both a real part and a complex part. The
complex part is multiplied by the imaginary number j, where j is the square root of -1. (In other
branches of applied mathematics i is usually used to represent the imaginary number, however in
electrical engineering j is used because the letter i is used to denote electrical current.) For the
complex number:
a + jb
(58)
a is the real part, where a ∈ ℜ (ℜ is the set of real numbers) and jb is the imaginary part, where
b ∈ ℜ . Complex arithmetic can be performed and the result expressed as a real part and imaginary
part. For addition:

(a + jb) + (c + jd) = (a + c) + j(b + d)    (59)

and for multiplication:

(a + jb)(c + jd) = (ac - bd) + j(ad + bc)    (60)
Complex number notation is used to simplify Fourier analysis by allowing the expression of complex
sinusoids using the complex exponential e^(jω) = cos ω + j sin ω. Also in DSP complex numbers
provide a convenient way of representing a two dimensional space, for example in an adaptive
beamformer (two dimensional space), or an adaptive decision feedback analyser where the in-phase
component is carried by the real part and the quadrature phase component by the imaginary part.
See also Complex Conjugate, Complex Sinusoid.
Complex Plane: The complex plane allows the representation of complex numbers by plotting the
real part of a complex number on the x-axis, and the imaginary part of the number on the y-axis.
[Figure: the complex plane with real axis ℜ and imaginary axis ℑ, showing the example points
2 + 3j and -3.51 - 3.49j.]
If a complex number is written as a complex exponential, then the complex plane plot can be
interpreted as a phasor diagram, such that for the complex number a + jb :
a + jb = M e^(jθ),    (61)

where

M = √(a² + b²) and θ = tan⁻¹(b/a).    (62)
If θ is a time dependent function such that θ = ωt , then the phasor will rotate in a counter-clockwise
direction with angular frequency of ω radians per second (or ω ⁄ ( 2π ) rotations per second, i.e.,
cycles per second or Hertz). See also z-plane, Complex Exponential.
[Figure: phasor diagram in the complex plane; a phasor of magnitude M at angle θ to the real axis ℜ
has real part a and imaginary part b, and rotates counter-clockwise with angular frequency ω.]
Conjugate Reciprocal: See Complex Conjugate Reciprocal.
Complex Roots: When the roots of a polynomial are calculated, if there is no real solution then the
roots are said to be complex. As an example consider the following quadratic polynomial:

y = x² + x + 1    (63)

The roots of this polynomial are where y = 0. Geometrically this is where the graph of y cuts the
x-axis. However, plotting this polynomial it is clear that the graph does not cut the x-axis:
[Figure: plot of y = x² + x + 1 against x; the curve lies entirely above the x-axis and never crosses it.]
In this case the roots of the polynomial are not real. Using the quadratic formula we can calculate
the roots as:

x = (-1 ± √(1² - 4))/2 = (-1 ± √3 j)/2    (64)

and therefore:

x² + x + 1 = (x + 1/2 - (√3/2)j)(x + 1/2 + (√3/2)j)    (65)
This example indicates the fundamental utility of complex number systems. Note that the
coefficients of the polynomial are real numbers. It is obvious from the plot of the polynomial that no
real solution to y(x) = 0 exists. However, the solution does exist if we choose x from the larger set
of complex numbers. In applications involving linear systems, these complex solutions provide a
tremendous amount of information about the nature of the problem. Thus real world phenomena
can be understood and predicted simply and accurately in a way not possible without the intuition
provided by complex mathematics. See also Poles, Zeroes.
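The complex roots of the quadratic above can be confirmed numerically with a short sketch (illustrative
only):

    # Sketch: roots of y = x^2 + x + 1 via the quadratic formula using complex
    # arithmetic; the discriminant is negative so the roots form a conjugate pair.
    import cmath
    a, b, c = 1.0, 1.0, 1.0
    disc = cmath.sqrt(b * b - 4 * a * c)        # sqrt(-3) = 1.732...j
    roots = ((-b + disc) / (2 * a), (-b - disc) / (2 * a))
    print(roots)    # (-0.5 + 0.866j) and (-0.5 - 0.866j)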
Complex Sinusoid: See Damped Exponential.
Compression: Over the last few years compression has emerged as one of the largest areas of
real time DSP application for digital audio and video. The simple motivation is that the bandwidth
required to transmit digital audio and video signals is considerably higher than the analogue
transmission of the baseband analogue signal, and also that storage requirements for digital audio
and video are very high. Therefore data rates are reduced by essentially reducing the data required
to transmit or store a signal, while attempting to maintain the signal quality.
For example, the data rate of a stereo CD sampling at 44.1kHz, using 16 bit samples on stereo
channels is:
Data Rate = 44100 × 16 × 2 = 1411200 bits/sec    (66)
The often quoted CD transmission bandwidth (assuming binary signalling) is 1.5MHz. Compare this
bandwidth with the equivalent analog bandwidth of around 30kHz for two 15kHz analog audio
channels.
The storage requirements for 60 minutes of music in CD format are:
CD Storage Requirement = 44100 × 2 × 2 × 60 × 60 = 635 Mbytes/60 minutes    (67)
In general therefore CD quality PCM audio is difficult to transmit, and storage requirements are very
high. As discussed above, if the sampling rate is reduced or the data wordlength reduced, then of
course the data rate will be reduced, however the audio quality will also be affected. Therefore there
is a requirement for audio compression algorithms which will reduce the quantity of data, but will
not reduce the perceived quality of the audio.
For telecommunications where speech is coded at 8kHz using, for example, 8 bit words the data
rate is 64000 bits per second. The typical bandwidth of a telephone line is around 4000Hz, and
therefore powerful compression algorithms are clearly necessary. Similarly, teleconferencing
systems need to compress speech coded at the higher rate of 16 kHz, together with a video signal.
Ideally no information will be lost by a compression algorithm (i.e. lossless). However, the
compression achievable with lossless techniques is typically quite limited. Therefore most audio
compression techniques are lossy, such that the aim of the compression algorithm is to discard the
components of the signal that do not matter, such as periods of silence, or sounds that will not be
heard due to the psychoacoustic behaviour of the ear, whereby loud sounds mask quieter ones.
For hi-fidelity audio the psychoacoustic or perceptual coding technique is now widely used to
compress by factors between 2:1 and almost 12:1. Two recent music formats, the MiniDisc and
DCC (digital compact cassette), both use perceptual coding techniques and produce compression
ratios of 5:1 and 4:1 respectively with virtually no (perceptual) degradation in the quality of the music. Digital audio
compression will continue to be a particularly large area of research and development over the next
few years. Applications that will be enabled by real time DSP compression techniques include:
Telecommunications: Using toll-quality telephone lines to transmit compressed data and speech;
Digital Audio Broadcasting (DAB): DAB data rates must be as low as possible to minimise the required
bandwidth;
Teleconferencing/Video-phones: Teleconferencing or videophones via telephone circuits and cellular
telephone networks;
Local Video: Using image/video compression schemes medium quality video broadcast for organisations
such as the police, hospitals etc are feasible over telephones, ISDN lines, or AM radio channels;
Audio Storage: If a signal is compressed by a factor of M, then the amount of data that can be stored on
a particular medium increases by a factor of M.
The table below summarises a few of the well known audio compression techniques for both hi-fidelity
audio and telecommunications. Currently there exist many different “standard” compression
algorithms, and different algorithms have different performance attributes, some remaining
proprietary to certain companies.
Algorithm         Compression Ratio   Bit rate (kbits/sec)   Audio Bandwidth   Example Application
PASC              4:1                 384                    20kHz             DCC
Dolby AC-2        6:1                 256                    20kHz             Cinema Sound
MUSICAM           4:1 to 12:1         192 to 256             20kHz             Professional Audio
NICAM             2:1                 676                    16kHz             Stereo TV audio
ATRAC             5:1                 307                    20kHz             Mini-disc
ADPCM (G721)      8:5 to 4:1          16, 24, 32, 40         4kHz              Telecommunications
IS-54 VSELP       8:1                 8                      4kHz              Telecommunications
LD-CELP (G728)    4:1                 8                      4kHz              Telecommunications
Video compression schemes are also widely researched, developed and implemented. The best
known schemes are Moving Picture Experts Group (MPEG) which is in fact both audio and video,
and the ITU H-Series Recommendations (H.261 etc.). The Joint Photographic Experts Group
(JPEG) standards and Joint Bi-level Image Group (JBIG) consider the compression of still images.
See also Adaptive Differential Pulse Code Modulation, Adaptive Transform Acoustic Coding
(ATRAC), Entropy Coding, Huffman Coding, Arithmetic Coding, Differential Pulse Code
Modulation, Digital Compact Cassette, G-Series Recommendations, H-Series Recommendations,
Joint Photographic Experts Group, MiniDisc, Moving Picture Experts Group, Transform Coding,
Precision Adaptive Subband Coding, Run Length Encoding.
Condition Code Register (CCR): The register inside a DSP processor which contains
information on the result of the last instruction executed by the processor. Typically bits (or flags)
in the CCR will indicate whether the previous instruction gave a zero result or a positive result, whether
overflow occurred, and the value of the carry bit. The CCR bits are then used to make conditional
decisions (branching).
The CCR is sometimes called the Status Register (SR). See also DSP Processor.
Condition Number: See Matrix Properties - Condition Number.
Conditioning: See Signal Conditioning.
Conductive Hearing Loss: If there is a defect in the middle ear this can often reduce the
transmission of sound to the inner ear [30]. A simple conductive hearing loss can be caused by as
simple a problem as excessive wax in the ear. The audiogram of a person with a conductive hearing
loss will often indicate that the hearing loss is relatively uniform over the hearing frequency range.
In general a conductive hearing loss can be alleviated with an amplifying hearing aid. See also
Audiology, Audiometry, Ear, Hearing Aids, Hearing Impairment, Loudness Recruitment,
Sensorineural Hearing Loss, Threshold of Hearing.
Conjugate: See Complex Conjugate.
Conjugate Pair: See Complex Conjugate.
Conjugate Transpose: See Matrix Properties - Hermitian Transpose
Constructive Interference: The addition of two waveforms with nearly identical phase.
Constructive interference is exploited to produce resonance in physical and electrical systems.
Constructive interference is also responsible for energy peaks in diffraction patterns. See also
Destructive Interference, Beamforming, Diffraction.
[Figure: incident and reflected waves at a boundary, showing wave peaks and wave valleys. Where
peaks meet peaks (or valleys meet valleys) there is constructive interference, and where peaks meet
valleys there is destructive interference, i.e., cancellation.]
Continuous Phase Modulation (CPM): A type of modulation in which abrupt phase changes are
avoided to reduce the bandwidth of the modulated signal. CPM requires increased decoder
complexity. See also Minimum Shift Keying, Viterbi Algorithm.
Continuous Variable Slope Delta Modulator (CVSD): A speech compression technique that
was used before ADPCM became popular and standardized by the ITU [133]. Although CVSD
generally produces lower quality speech it is less sensitive to transmission errors than ADPCM. See
also Compression, Delta Modulation.
Control Bus: A collection of wires on a DSP processor used to transmit control information on chip
and off chip. An example of control information is stating whether memory is to be read from, or
written to. This would be indicated by the single R ⁄ W line. See also DSP Processor.
Convergence: Algorithms such as adaptive algorithms attempt to find a particular solution
to a problem by converging or iterating to the correct solution. Convergence implies that the correct
solution is found by continuously reducing the error between the current iterated value and the true
solution. When the error is zero (or, more practically, relatively small), the algorithm is said to have
converged. For example consider an algorithm which will update the value of a variable xn to
converge to the square root of a number, a. The iterative update is given by:
x_(n+1) = (1/2)(x_n + a/x_n)    (68)

where the initial guess, x0, is a/2. The error e_n = x_n - √a will reduce at each iteration, and
converge to zero. Because most algorithms converge asymptotically, convergence is often stated
to have occurred when a specified error quantity is less than a particular value.
[Figure] Finding the square root of a = 30 using the iterative algorithm, converging to the solution
√a = 5.477. The left plot shows the variable x_n and the right plot the error e_n against the iteration
number n. Note that after only 6 iterations the algorithm has converged to within 0.03 of the correct
answer.
Another example is a system identification application using an adaptive LMS FIR filter to model an
unknown system. Convergence is said to have occurred when the mean squared error between the
output of the actual system and the modelled one (given the same input) is less than a certain value
determined by the application designer. Algorithms that do not converge and perhaps diverge, are
usually labelled as unstable. See also Adaptive Signal Processing, Iterative Techniques.
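The square root iteration above is easily reproduced; the following sketch uses a = 30 so that the
algorithm converges to √a = 5.477 as in the figure (illustrative only):

    # Sketch: converging to sqrt(a) with x(n+1) = (x(n) + a/x(n)) / 2,
    # starting from the initial guess x0 = a/2.
    a = 30.0
    x = a / 2.0
    for n in range(6):
        x = 0.5 * (x + a / x)
        print(n + 1, x, abs(x - a ** 0.5))   # iteration, estimate, error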
Convolution: When a signal is input to a particular linear system the impulse response of the
system is convolved with the input signal to yield the output signal. For example, when a sampled
speech signal is operated on by a digital low pass filter, then the output is formed from the
convolution of the input signal and the impulse response of the low pass filter:
y(n) = h(n) ⊗ x(n) = Σ_k h(k) x(n - k)    (69)
[Figure: graphical illustration of the convolution of a signal x(n) with an impulse response h(n). For
n < 0 both the signal x(n) and the filter h(n) are zero, so the convolution output is 0. For each output
sample, the summation runs over the variable k: h(k) is plotted against the time reversed and shifted
sequence x(n - k) for n = -1, 0, 1, 2, …, 7, and the products are summed to give y(n).]
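A direct (unoptimised) sketch of equation (69) for finite length sequences:

    # Sketch: direct convolution y(n) = sum over k of h(k) x(n - k); the output
    # length is len(x) + len(h) - 1.
    def convolve(x, h):
        y = [0.0] * (len(x) + len(h) - 1)
        for n in range(len(y)):
            for k in range(len(h)):
                if 0 <= n - k < len(x):
                    y[n] += h[k] * x[n - k]
        return y

    print(convolve([1, 2, 3], [1, 1]))   # [1, 3, 5, 3]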
Cooley-Tukey: J.W. Cooley and J.W. Tukey published a noteworthy paper in 1965 highlighting
that the discrete Fourier transform (DFT) could be computed in fewer computations by using the
fast Fourier transform (FFT) [66]. Reference to the Cooley-Tukey algorithm usually means the FFT.
See also Fast Fourier Transform, Discrete Fourier Transform.
Co-processor: Inside a PC, a processor that is additional to the general purpose processor (such
as the Intel 80486) is described as a co-processor and will usually only perform demanding
computational tasks. For multi-media applications, DSP processors inside the PC to facilitate
speech processing, video and communications are co-processors.
CORDIC: An arithmetic technique that can be used to calculate sin, cos, tan and other trigonometric
values using only shifts and adds of binary operands [25].
Core: All DSP applications require very fast MAC operations to be performed, however the
algorithms to be implemented, and the necessary peripherals to input data, memory requirements,
timers and on-chip CODEC requirements are all slightly different. Therefore companies like
Motorola are releasing DSP chips which have a common core but have on-chip special purpose
modules and interfaces. For example Motorola’s DSP56156 has a 5616 core but with other
modules, such as on-chip CODEC and PLL to tailor the chip for telecommunications applications.
See also DSP Processor.
Correlation: If two signals are correlated then this means that they are in some sense similar.
Depending on how similar they are, signals may be described as being weakly correlated or
strongly correlated. If two signals, x(k) and y(k), are ergodic then the correlation function, rxy(n) can
be estimated as:
r̂_xy(n) = (1/(2M + 1)) Σ_{k = -M…M} x(k) y(n + k),   for large M    (70)
Taking the discrete Fourier transform (DFT) of the cross correlation function gives the cross spectral
density. See also Autocorrelation.
Correlation Matrix: Assuming that a signal x(k) is a wide sense stationary ergodic process, a
3 × 3 correlation matrix can be formed by taking the expectation, E{.}, of the elements of the matrix
formed by multiplying the signal vector, x(k) = [ x(k)  x(k - 1)  x(k - 2) ]^T, by its transpose to
produce the correlation matrix:

R = E[ x(k) x^T(k) ]

          x²(k)            x(k)x(k - 1)       x(k)x(k - 2)
  = E     x(k)x(k - 1)     x²(k - 1)          x(k - 1)x(k - 2)                  (71)
          x(k)x(k - 2)     x(k - 1)x(k - 2)   x²(k - 2)

          r0   r1   r2
  =       r1   r0   r1
          r2   r1   r0

where r_n = E[ x(k) x(k - n) ]. The correlation matrix, R, is Toeplitz symmetric, and for a more general
N point data vector the matrix will be N x N in dimension:
          r0        r1        r2        …    r(N-1)
          r1        r0        r1        …    r(N-2)
R =       r2        r1        r0        …    r(N-3)                             (72)
          :         :         :         …    :
          r(N-1)    r(N-2)    r(N-3)    …    r0
The Toeplitz structure (i.e., constant diagonal entries) results from the fact that the entries along any
given diagonal all correspond to the same time lag estimate of the correlation.
To calculate r n statistical averages should be used, or if the signal is ergodic then time averages
can be used. See also Adaptive Signal Processing, Cross Correlation Vector, Ergodic, Expected
Value, Matrix, Matrix Structured - Toeplitz, Wide Sense Stationarity, Wiener-Hopf Equations.
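Assuming an ergodic signal so that time averages can replace the expectations, a correlation matrix
estimate can be sketched as follows (the signal values here are arbitrary illustrative data):

    # Sketch: estimate r(n) = E[x(k) x(k-n)] by time averaging and build the
    # 3 x 3 Toeplitz symmetric correlation matrix R (assumes an ergodic signal).
    def corr_lag(x, n):
        pairs = [x[k] * x[k - n] for k in range(n, len(x))]
        return sum(pairs) / len(pairs)

    def corr_matrix(x, size=3):
        r = [corr_lag(x, n) for n in range(size)]
        return [[r[abs(i - j)] for j in range(size)] for i in range(size)]

    x = [0.3, -1.2, 0.7, 0.1, -0.9, 1.4, -0.2, 0.6]
    for row in corr_matrix(x):
        print(row)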
Correlation Vector: See Cross Correlation Vector.
CORTES Algorithm: Coordinate Reduction Time Encoding Scheme (CORTES) is an algorithm
for the data compression of ECG signals. CORTES is based on the AZTEC and TP algorithms,
using AZTEC to discard clinically insignificant data in the isoelectric region, and applying the TP
algorithm to clinically significant high frequency regions of the ECG data [48]. See also AZTEC,
Electrocardiogram, TP.
Critical Bands: It is conjectured that a suitable model of the human auditory system is composed
of a series of (constant fractional bandwidth) bandpass filters [30] which comprise critical bands.
When trying to detect a signal of interest in broadband background noise the listener is thought to
make use of a bandpass filter with a centre frequency close to that of the signal of interest. The
perception to the listener is that the background noise is somewhat filtered out and only the
components within the background noise that lie in the critical band remain. The threshold of
hearing of the signal of interest is thus determined by the amount of noise passing through the filter.
See also Auditory Filters, Audiology, Audiometry, Fractional Bandwidth, Threshold of Hearing.
Critical Distance: In a reverberant environment, the critical distance is defined as the separation
between source and receiver that results in the acoustic energy of the reflected waveforms being
equal to the acoustic energy in the direct path. A single number is often used to classify a given
environment, although the specific acoustics of a given room may produce different critical
distances for alternate source (or receiver) positions. Roughly, the critical distance characterizes
how much reverberation exists in a given room. See also Reverberation.
Cross Compiler: This is a piece of software which allows a user to program in a high level
language (such as ‘C’) and generate cross compiled code for the target DSP’s assembly language.
This code can in turn be assembled and the actual machine code program downloaded to the DSP
processor. Although cross-compilers can make program writing much easier, they do not always
produce efficient code (i.e. using minimal instructions) and hence it is often necessary to write in
assembly language (or hand code) either the entire program or critical sections of the program (via
in-line assembly commands in the higher level language program). Motorola produce a C cross
compiler for the DSP56000 series, and Texas Instruments produce one for the TMS320 series of
DSP processors.
Cross Correlation Vector: A 3 element cross correlation vector, p, for a signal d ( k ) and a signal
x ( k ) can be calculated from:
p = E{ d(k) x(k) } = E[ d(k)x(k)   d(k)x(k - 1)   d(k)x(k - 2) ]^T = [ p0   p1   p2 ]^T    (73)

Hence for an N element vector:

p = [ p0   p1   …   p(N-1) ]^T    (74)

where p_n = E{ d(k) x(k - n) }, and E{.} is the expected value function. To calculate p_n statistical
averages should be used, or if the signal is ergodic then time averages can be used. See also
Adaptive Signal Processing, Correlation Matrix, Ergodic, Matrix, Expected Value, Wide Sense
Stationarity, Wiener-Hopf Equations.
Cross Interleaved Reed Solomon Coding (CIRC): CIRC is an error correcting scheme which
was adopted for use in compact discs (CD) systems [33]. CIRC is an interleaved combination of
block (Reed-Solomon) and convolutional error correcting schemes. It is used to correct both burst
errors and random bit errors. On a CD player errors can be caused by manufacturing defects, dust,
scratches and so on. CIRC coding can be decoded to correct several thousand consecutive bit
errors. It is safe to say that without the signal processing that goes into CD error correction and error
concealment, the compact discs we see today would be substantially more expensive to produce
and, subsequently, the CD players would not be nearly the ubiquitous appliance we see today. See
also Compact Disc.
Cross-Talk: The interference of one channel upon another causing the signal from one channel to
be detectable (usually at a reduced level) on another channel.
Cut-off Frequency: The cut-off frequency of a filter is the point at which the gain of the filter has
dropped by 3dB. Although the term cut-off conjures up the image of a sharp attenuation, 3dB is
equivalent to 20log10(√2), i.e. the filtered signal output has half of the power of the input signal,
10log10(2). For example the cut-off frequency of a low pass filter is the frequency at which the filter
gain has dropped by 3dB when plotted on a log magnitude scale, and has reduced by a factor of √2
on a linear amplitude scale. A bandpass filter will have two cut-off frequencies. See also Attenuation,
Decibels.
[Figure] The cut-off frequency, or 3dB point, of a filter. The left hand side illustrates the cut-off
followed by the slow roll-off characteristic, plotted as gain in dB against frequency, with the
bandwidth marked. The right hand side shows the same filter plotted as gain factor (linear scale, not
decibel) against frequency. The cut-off occurs when the gain factor is at 1/√2.
Cyberspace: The name given to the virtual dimension that the world wide network (internet) of
connected computers gives rise to in the minds of people who spend a large amount of time “there”.
Without the DSP modems there would be no cyberspace! See also Internet.
Cyclic Redundancy Check (CRC): A cyclic redundancy check can be performed on digital data
transmission systems whereby it is required at the receiver end to check the integrity of the data
transmitted. This is most often used as an error detection scheme -- detected errors require
retransmission. If both ends know the algebraic method of encoding the original data the raw data
can be CRC coded at the transmission end, and then at the received end the cyclic (i.e., efficient)
redundancy can be checked. This redundancy check highlights the fact that bit transmission errors
have occurred. CRC techniques can be easily implemented using shift registers [40]. See also
Characteristic Polynomial, V-series Recommendations.
Cyclostationary: If the autocorrelation function (or second order statistics) of a signal fluctuates
periodically with time, then this signal is cyclostationary. See [75] for a tutorial article.
D
Damped Sinusoid: A common solution to linear system problems takes the form
e^((a + jb)t) = e^(at) e^(jbt) = e^(at) [ cos(bt) + j sin(bt) ]    (75)

where the complex exponent gives rise to two separate components, an exponential decay term,
e^(at), and a sinusoidal variation term [ cos(bt) + j sin(bt) ]. Common examples of systems that give
rise to damped sinusoidal solutions are the suspension system in an automobile or the voltage in a
passive electrical circuit that has energy storage elements (capacitors and inductors). Because
many physical phenomena can be accurately described by coupled differential equations (for which
damped sinusoids are common solutions), real world experiences of damped sinusoids are quite
common.
Data Acquisition: The general name given to the reading of data using an analog-to-digital
converter (ADC) and storing the sampled data on some form of computer memory (e.g., a hard disk
drive).
Data Bus: The data bus is a collection of wires on a DSP processor that is used to transmit actual
data values between chips, or within the chip itself. See also DSP Processor.
Data Compression: See Compression.
Data Registers: Memory locations inside a DSP processor that can be used for temporary storage
of data. The data registers are at least as long as the wordlength of the processor. Most DSP
processors have a number of data registers. See also DSP Processor.
Data Vector: The most recent N data values of a particular signal, x(k), can be conveniently
represented as a vector, xk , where k denotes the most recent element in the vector. For example,
if N = 5:
[Figure: a sampled signal x(k) plotted against time k = 1 to 13, with sample values lying between about
-40 and +40.]

If x_k = [ x_k   x_(k-1)   x_(k-2)   x_(k-3)   x_(k-4) ]^T then, for example,

x_7 = [ x_7   x_6   x_5   x_4   x_3 ]^T = [ -23   -20   -9   11   29 ]^T
More generally any type of data stored or manipulated as a vector can reasonably be referred to as
a data vector. See also Vector, Vector Properties, Weight Vector.
Data Windowing: See Window.
Daughter Module: Most DSP boards are designed to be hosted by an IBM PC. To provide input/
output facilities or additional DSP processors some DSP boards (then called motherboards) have
spaces for optional daughter modules to be inserted.
Decade: A decade refers to the interval between two frequencies where one frequency is ten times the
other. Therefore, as an example, from 10Hz to 100Hz is a decade, and from 100Hz to 1000Hz is a
decade and so on. See also Logarithmic Frequency, Octave, Roll-off.
Decibels (dB): The logarithmic unit of decibels is used to quantify power of any signal relative to
a reference signal. A power signal dB measure is calculated as 10log10(P1/P0). In DSP, since input
signals are voltages, and Power = (Voltage)² divided by Resistance, we conventionally convert a
voltage signal into its logarithmic value by calculating 20log10(V1/V0). Decibels are widely used to
represent the attenuation or amplification of signals:
Attenuation = 10 log(P/P0) = 20 log(V/V0)    (76)

where P0 is the reference power, and V0 is the reference voltage. dB’s are used because they
often provide a more convenient measure for working with signals (e.g., plotting power spectra)
than do linear measures.
Often the symbol dB is followed by a letter that indicates how the decibels were computed. For
example, dBm indicates a power measurement relative to a milliwatt, whereas dBW indicates
power relative to a watt. In acoustics applications, dB can be measured relative to various
perceptually relevant scales, such as A-weighting. In this case, noise levels are reported as dB(A)
to indicate the relative weighting (A) selected for the measurement. See Sound Pressure Level
Weighting Curves, Decibels SPL.
Decibels (dB) SPL: The decibel is universally used to measure acoustic power and sound
pressure levels (SPL). The decibel rating for a particular sound is calculated relative to a reference
power, W0:

10 log(W1/W0)    (77)
dB SPL is sound pressure measured relative to 20 µ-Pascals (2 × 10^(-5) Newtons/m²). Acoustic
power is proportional to pressure squared, so pressure based dB are computed via 20log10
pressure ratios. Intensity (or power) based dB computations use 10log10 intensity ratios. The sound
level 0dB SPL is a low sound level that was selected to be around the absolute threshold of average
human hearing for a pure 1000Hz sinusoid [30]. Normal speech has an SPL value of about 70dB
SPL. The acoustic energy 200 feet from a jet aircraft at take-off is about 125dB SPL, which is above the
threshold of feeling (meaning you can feel the noise as well as hear it). See also Sound Pressure
Level.
Decibels (dB) HL (3): Hearing Level (HL). See Hearing Level, Audiogram.
Decimation: Decimation is the process of reducing the sampling rate of a signal that has been
oversampled. When a signal is bandlimited to a bandwidth that is less than half of the sampling
frequency (fs/2), then the sampling rate can be reduced without loss of information. Oversampling
simply means that a signal has been sampled at a rate higher than dictated by the Nyquist criterion.
In DSP systems oversampling is usually done at integral multiples of the Nyquist rate, fn, and usually
by a power of two factor such as 4 x’s, 8 x’s or 64 x’s.
For a discrete signal oversampled by a factor R, the sampling frequency, fs, is:

fs ≡ fovs = R·fn    (78)
For an R x’s oversampled signal the only portion of interest is the baseband signal extending from
0 to fn/2 Hz. Therefore decimation is required. The oversampled signal is first digitally low pass
filtered to fn/2 using a digital filter with a sharp cut-off. The resulting signal is therefore now
bandlimited to fn/2 and can be downsampled by retaining only every R-th sample. Decimation for
a system oversampling by a factor of R = 4 can be illustrated as:

[Figure: an analog input passes through an analog anti-alias filter and a 4 x’s oversampling ADC
(fovs = 4fn = 4/tn), then a digital low pass filter cutting off at fn/2, and finally a downsampler that keeps
every 4th sample, before passing to the DSP processor; the time waveforms and spectra are shown
at each stage.]

Decimation of a 4 x’s oversampled signal, fovs = 4fn, by low pass digital filtering then
downsampling by 4, which retains every 4th sample. The decimation process is essentially
a technique whereby anti-alias filtering is being done partly in the analog domain and partly
in the digital domain. Note that the decimated Nyquist rate or baseband signal will be
delayed by the group delay, td, of the digital low pass filter (which we assume to be linear
phase).
For the oversampling example above where R = 4, any frequencies that exist between fn/2 Hz
and fovs/2 = 2fn after the analog anti-alias filter can be removed with a digital low pass filter prior
to downsampling by a factor of 4. Hence the complexity of the analogue low pass anti-alias filter
has been reduced by effectively adding a digital low pass stage of anti-alias filtering.

So why not just oversample, but not decimate? To illustrate the requirement for decimation where
possible, linear digital FIR filtering using an oversampled signal will require RN filter weights
(corresponding to T secs) whereas the number of weights in the equivalent Nyquist rate filter will
only be N (also corresponding to T secs). Hence the oversampled DSP processing would require
R²Nfn multiply/adds per second, compared to the Nyquist rate DSP processing which requires
Nfn multiply/adds per second, a factor of R² more. This is clearly not very desirable
Therefore this is why an oversampled signal is usually decimated to the Nyquist rate, first by digital
low pass filtering, then by downsampling (retaining only every R-th sample).
The word decimation originally comes from a procedure within the Roman armies, where for acts
of cowardice the legionnaires were lined up and every 10th man was executed. Hence the prefix
“dec” meaning ten.
See also Anti-alias Filter, Downsampling, Oversampling, Upsampling, Interpolation, Sigma Delta.
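A bare-bones sketch of decimation by R = 4 is given below; a crude length-4 moving average stands in
for the sharp digital low pass filter described above, so this is illustrative rather than a practical design:

    # Sketch: decimate a 4x oversampled signal. A very crude length-4 moving
    # average stands in for a proper sharp FIR low pass filter, then every
    # R-th sample is retained.
    def decimate(x, r=4):
        filtered = [sum(x[max(0, n - r + 1): n + 1]) / r for n in range(len(x))]
        return filtered[::r]                      # keep every R-th sample

    x = list(range(16))                           # stand-in oversampled signal
    print(decimate(x, 4))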
Decimation-in-Frequency (DIF): The DFT can be reformulated to give the FFT either as a DIT or
a DIF algorithm. Since the input data and output data values of the FFT appear in bit-reversed
order, decimation-in-frequency computation of the FFT provides the output frequency samples in
bit-reversed order. See also Bit Reverse Addressing, Discrete Fourier Transform, Fast Fourier
Transform, Cooley-Tukey.
Decimation-in-Time (DIT): The DFT can be reformulated to give the FFT either as a DIF or a DIT
algorithm. Since the input data and output data values of the FFT appear in bit-reversed order,
decimation-in-time computation of the FFT provides the output frequency samples in proper order
when the input time samples are arranged in bit-reversed order. See also Bit Reverse Addressing,
Discrete Fourier Transform, Fast Fourier Transform, Cooley-Tukey.
Delay and Sum Beamformer: A relatively simple beamformer in which the outputs from an array
of sensors are subjected to independent time delays and then summed together. The delays are
typically selected to provide a look direction from which the desired signal will constructively
interfere at the summer while signals from other directions are attenuated because they tend to
destructively interfere. The delays are dictated by the geometry of the array of sensors and the
speed of propagation of the wavefront. See also Adaptive Beamformer, Beamformer, Broadside,
Endfire.
[Figure: M sensors feed individual delays τ1 … τM whose outputs are combined in a summer to form
the beamformer output. The delay for sensor n is τn = dn/c, where dn is the additional path length to
that sensor, c is the propagation velocity, and θ is the look direction.]
In a delay-and-sum beamformer, the output from each of the sensors in an array is delayed an
appropriate amount (to time-align the desired signal) and then combined via a summation to generate
the beamformed output. No amplitude weighting of the sensors is performed.
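For integer sample delays the operation can be sketched as follows (a simplification; in practice the
delays follow from the array geometry and may be fractional):

    # Sketch: delay-and-sum beamforming with integer sample delays. sensors is
    # a list of equal-length sample sequences; delays[m] is the steering delay
    # (in samples) applied to sensor m before the outputs are summed.
    def delay_and_sum(sensors, delays):
        length = len(sensors[0])
        out = [0.0] * length
        for sig, d in zip(sensors, delays):
            for n in range(length):
                out[n] += sig[n - d] if n - d >= 0 else 0.0
        return out

    s = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # same pulse, staggered arrival
    print(delay_and_sum(s, [2, 1, 0]))               # pulses align: [0, 0, 0, 3]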
Delay LMS: See Least Mean Squares Algorithm Variants.
Delta Modulation: Delta modulation is a technique used to take a sampled signal, x(n), and
encode the magnitude change from the previous sample and transmit only the single bit difference
(∆) between adjacent samples [2]. If the signal has increased from the previous sample then a 1 is
encoded; if it has decreased then a -1 is encoded. The received signal is then demodulated by
taking successive delta samples and summing to reconstruct the original signal using an integrator.
Delta modulation can reduce the number of bits per second to be transmitted down a channel,
compared to PCM. However when using a delta modulator, the sampling rate and step size must
be carefully chosen or slope overload and/or granularity problems may occur. See also Adaptive
Differential Pulse Code Modulation, Continuously Variable Slope Delta Modulation, Differential
Pulse Code Modulation, Integrator, Slope Overload, Granularity Effects.
[Figure: a delta modulation system. The modulator forms the difference between x(n) and the output
of a local integrator, quantizes it with a 1-bit quantizer at the sampling rate fs to give ∆(n) = ±1, and
sends ∆(n) over the channel; the demodulator integrates the received ∆(n) and low pass filters the
result to reconstruct x(n). The accompanying waveforms show the staircase approximation xd(n)
tracking x(n) and the corresponding ±1 sequence ∆(n).]
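A minimal sketch of a delta modulator and demodulator (the step size and test signal are arbitrary
illustrative choices):

    # Sketch: delta modulation. The modulator transmits +1/-1 depending on
    # whether the input is above or below the running (integrated) estimate;
    # the demodulator integrates the received bits with the same step size.
    def delta_modulate(x, step=0.25):
        est, bits = 0.0, []
        for sample in x:
            bit = 1 if sample >= est else -1
            bits.append(bit)
            est += bit * step                  # local integrator tracks the input
        return bits

    def delta_demodulate(bits, step=0.25):
        est, out = 0.0, []
        for bit in bits:
            est += bit * step
            out.append(est)
        return out                             # would then be low pass filtered

    x = [0.0, 0.3, 0.6, 0.8, 0.9, 0.8, 0.5, 0.1]
    print(delta_demodulate(delta_modulate(x)))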
Delta-Sigma: Synonymous term with Sigma Delta. See Sigma-Delta.
Descrambler: See Scrambler/Descrambler.
Destructive Interference: The addition of two waveforms with nearly opposite phase. Destructive
interference is exploited to cancel unwanted noise, vibrations, and interference in physical and
electrical systems. Destructive interference is also responsible for energy nulls in diffraction
patterns. See also Diffraction, Constructive Interference, Beamforming.
Determinant: See Matrix Properties - Determinant.
Diagonal Matrix: See Matrix Structured - Diagonal.
Dial Tone: Tones at 350 Hz and 440 Hz make up the dialing tone for telephone systems. See also
Dual Tone Multifrequency, Busy Tone, Ringing Tone.
[Figure: spectrum of the dial tone, showing components at approximately 350 Hz and 440 Hz together
with 50 Hz mains hum.]
Dichotic: A situation where the aural stimulation reaching both ears is not the same. For example,
setting up a demonstration of binaural beats is a dichotic stimulus. The human ear essentially
provides dichotic hearing whereby it is possible for the auditory mechanism to process the differing
information arriving at both ears and subsequently localize the source. See also Audiometry,
Binaural Unmasking, Binaural Beats, Diotic, Lateralization.
Difference Limen (DL): The smallest noticeable difference between two audio stimuli, or the Just
Noticeable Difference (JND) between these stimuli. Determination of DL’s usually requires that
subjects be given a discrimination task. Typically, DL’s (or JND’s) are computed for two signals that
are identical in all respects save the parameter being tested for a DL. For example, if the DL is
desired for sound intensity discrimination, two stimuli differing only in intensity would be presented
to the subject under test. These stimuli could be tones at a given frequency that are presented for
a fixed period. It is interesting to note that the DL for sound intensity (measured in dB) is generally
found to be constant over a very wide range (this is known as Weber’s law).
To have meaning a DL must be specified along with the set up and conditions used to establish the
value. For example stating that the frequency DL for the human ear is 1Hz between the frequencies
of 1- 4 kHz requires that sound pressure levels, stimuli duration, and stimuli decomposition are
clearly stated as varying these parameters will cause variation in the measured frequency DL. See
also Audiology, Audiometry, Frequency Range of Hearing, Threshold of Hearing.
Differentiation: See Differentiator.
Differential Phase Shift Keying (DPSK): A type of modulation in which the information bits are
encoded in the change of the relative phase from one symbol to the next. DPSK is useful for
communicating over time varying channels. DPSK also removes the need for absolute phase
synchronization, since the phase information is encoded in a relative way. See also Phase Shift
Keying.
Differentiator: A (linear) device that will produce an output that is the derivative of the input. In
digital signal processing terms a differentiator is quite straightforward. The output of a differentiator,
y(t), will be the rate of change of the signal curve, x(t), at time t. For sampled digital signals the input
will be constant for one sampling period, and therefore to differentiate the signal the previous
sample value is subtracted from the current value and divided by the sampling period. If the
sampling period is normalized to one, then a signal is differentiated in the discrete domain by
subtracting consecutive input samples. A differentiator is implemented using a digital delay
element and a summing element to calculate:

y(n) = x(n) - x(n - 1)    (79)
In the z-domain the transfer function of a differentiator is:

Y(z) = X(z) - z^(-1) X(z)
⇒ Y(z)/X(z) = 1 - z^(-1)    (80)
When viewed in the frequency domain a differentiator has the characteristics of a high pass filter.
Thus differentiating a signal with additive noise tends to emphasize or enhance the high frequency
components of the additive noise. See also Analog Computer, Integrator, High Pass Filter.
[Figure: analog differentiation of a signal x(t) gives y(t) = dx(t)/dt; discrete differentiation of a sampled
signal x(n) gives y(n) = ∆x(n)/∆t, implemented with a one sample delay element and a subtractor. The
z-domain representation of the discrete differentiator is Y(z) = (1 - z^(-1)) X(z).]
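A sketch of the discrete differentiator of equation (79), with the sampling period normalized to one:

    # Sketch: discrete differentiator y(n) = x(n) - x(n-1), with x(-1) taken as 0.
    def differentiate(x):
        return [x[n] - (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

    print(differentiate([0.0, 1.0, 2.0, 3.0, 3.0, 3.0]))   # [0, 1, 1, 1, 0, 0]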
Differential Pulse Code Modulation (DPCM): DPCM is an extension of delta modulation that
makes use of redundancy in analog signals to quantize the difference between a discrete input
signal and a predicted value to one of P values [2]. (Note that a delta modulator has only one level, ±1.)
The integrator shown below performs a summation of all input values as the predictor.

[Figure: a DPCM system using an integrator as the predictor. The modulator quantizes the difference
∆(n) between the input x(n) and the predicted value x̂(n) with a P-level quantizer at the sampling rate fs
and sends ∆(n) over the channel; the demodulator integrates the received values and low pass filters
them to give the reconstructed signal x̃(n).]

More
x(n)
P-level
Quantizer
Σ
x̂ ( n )
∆(n)
Channel
Linear Predictor
x̃ ( n )
fs
Linear Predictor
Modulator
De-modulator
predictor at the modulator end uses the same quantized error values as inputs that are available to
the predictor at the demodulator end. If the unquantized error values were used at the modulator
end then there would be an accumulated error between demodulator output and the modulator
input with a strictly increasing variance. This does not happen in the above configuration. See also
Adaptive Differential Pulse Code Modulation (ADPCM), Delta Modulation, Continuously Variable
Slope Delta Modulation (CVSD), Slope Overload, Granularity.
Diffraction: Diffraction is the bending of waves around an object via wave propagation of incident
and reflected waves impinging on the object. See also Constructive Interference, Destructive
Interference, Head Shadow.
[Figure] Example of diffraction of incident waves through an opening in a boundary: the incident waves
pass through the opening and spread out as diffracted waves on the far side.
Digital: Represented as a discrete countable quantity. When an analog voltage is passed through
an ADC the output is a digitized and sampled version of the input. Note that digitization implies
quantization.
Digital Audio: Any aspect of audio reproduction or recording that uses a digital representation of
analogue acoustic signals is often referred to generically as digital audio [33], [34], [37]. Over the
last 10-20 years digital audio has evolved into three distinguishable groups of application
dependent quality:
1. Telephone Speech 300 - 3400Hz: Typically speech down a telephone line is carried over a channel with a
bandwidth extending from around 300Hz to 3400Hz. This bandwidth is adequate for good coherent and
intelligible conversation. Music is coherent but unattractive. Clearly intelligible speech can be obtained by
sampling at 8kHz with 8 bit PCM samples, corresponding to an uncompressed bit rate of 64kbits/s.
2. Wideband Speech: 50 - 7000Hz: For applications such as teleconferencing prolonged conversation requires a
speech quality that has more naturalness and presence. This is accomplished by retaining low and high
frequency components of speech compared to a telephone channel. Music with the same bandwidth will have
almost AM radio quality. Good quality speech can be obtained by sampling at 16kHz with 12 bit PCM samples,
corresponding to a bit rate of 192kbits/s.
3. High Fidelity Audio: 20 - 20000Hz: For high fidelity music reproduction audio the reproduced sound should be
of comparable quality to the original sound. Wideband audio is sampled at one of the standard frequencies of 32
kHz, 44.1 kHz, or 48 kHz using 16 bit PCM. A stereo compact disc (44.1kHz, 16 bits) has a data rate of 1.4112
Mbits/s.
Generally, when one refers to digital audio applications involving speech materials only (e.g.,
speech coding) the term speech is directly included in the descriptive term. Consequently, digital
audio has come to connote high fidelity audio, with speech applications more precisely defined.
The table below summarizes the key parameters for a few well known digital audio applications.
Note that to conserve bandwidth and storage requirements DSP enabled compression techniques
are applied in a few of these applications.
Technology                       Example Application      Sampling Rate (kHz)   Compression   Single Channel Bit Rate (kbits/s)
Digital Audio Tape (DAT)         Professional recording   48                    No            768
Compact Disc (CD)                Consumer audio           44.1                  No            705.6
Digital Compact Cassette (DCC)   Consumer audio           32, 44.1, 48          Yes           192
MiniDisc (MD)                    Consumer audio           44.1                  Yes           146
Dolby AC-2                       Cinema sound             48                    Yes           128
MUSICAM (ISO Layer II)           Consumer broadcasting    32, 44.1, 48          Yes           16 - 192
NICAM                            TV audio                 32                    Yes           338
PCM A/µ-law (G711)               Telephone                8                     Yes           64
ADPCM (G721)                     Telephone                8                     Yes           16, 24, 32, 40
LD-CELP (G728)                   Telephone                8                     Yes           16
RPE-LTP (GSM)                    Telephone                8                     Yes           13.3
Subband ADPCM (G722)             Teleconferencing         16                    Yes           64
Digital Audio Systems
Although the digital audio market is undoubtedly very mature, the power of DSP systems is
stimulating research and development in a number of areas:
1. Improved compression strategies based on perceptual and predictive coding; compression ratios of up to 20:1
for hifidelity audio may eventually be achievable.
2. The provision of surround sound using multichannel systems to allow cinema and “living room” audiences to
experience 3-D sound.
3. DSP effects processing: remastering, de-scratching recordings, sound effects, soundfield simulation etc.
4. Noise reduction systems such as adaptive noise controllers, echo cancellers, acoustic echo cancellers,
equalization systems.
5. Super-fidelity systems sampling at 96kHz to provide ultrasound [154] (above 20kHz and which is perhaps more
tactile than audible), and systems to faithfully reproduce infrasound [138] (below 20Hz and which is most
definitely tactile and in some cases rather dangerous!)
Real-time digital audio systems are one of three types: (1) input/output system (e.g. telephone/
teleconferencing system); (2) output only (e.g. CD player); or (3) input only (e.g. DAT professional
recording). The figure below shows the key elements of a single channel input/output digital audio
system. The input signal from a microphone is signal conditioned/amplified as appropriate to the
input/output characteristic of the analogue to digital converter (ADC) at a sampling rate of f s Hz.
Prior to being input to the ADC stage the analogue signal is low pass filtered to remove all
frequencies above f s ⁄ 2 by the analogue anti-alias filter. The output from ADC is then a stream of
binary numbers, which are then compressed, coded and modulated for transmission, broadcasting
or recording via/to a suitable medium (e.g. FM radio broadcast, telephone call or CD mastering).
When a digital audio signal is received or read it is a stream of binary numbers which are
demodulated and decoded/decompressed with DSP processing into a sampled data PCM format
for input to a digital to analogue converter (DAC) which outputs to an analogue low pass
reconstruction filter stage (also cutting off at f_s/2) prior to being amplified and output to a
loudspeaker (e.g. reception of digital audio FM radio or a telephone call, or playback of a CD).
[Figure: Key elements of a single channel input/output digital audio system. Acoustic input -> amplifier and signal conditioning -> anti-alias filter and ADC (sampling at fs) -> DSP processing (coding/compression/modulation) -> data transmission/broadcasting/recording and playback -> DSP processing (decoding/decompression/demodulation) -> DAC and reconstruction filter (at fs) -> amplifier -> acoustic output.]

The generic single input, single output channel digital audio signal processing system.
See also Compact Disc, Data Compression, Digital Audio Tape, Digital Compact Cassette,
MiniDisc, Speech Coding.
Digital Audio Broadcasting (DAB): The transmission of electromagnetic carriers modulated by
digital signals. DAB will permit the transmission of high fidelity audio and is more immune to noise
and distortion than conventional techniques. Repeater transmitters can receive a DAB signal, clean
the signal and retransmit a noise free version. Currently there is a large body of interest in
developing DAB consumer systems using a combination of satellite, terrestrial and cable
transmission. For terrestrial DAB, however, there is currently no large bandwidth specifically
allocated for DAB, and therefore FM radio station owners may be required to volunteer their bands
for digital audio broadcasting. See also Compression, Standards.
Digital Audio Tape (DAT): An audio format introduced in the late 1980s to compete with compact
disc. DAT samples at 48kHz and uses 16 bit data with stereo channels. Although DAT was a
commercial failure for the consumer market it has been adopted as a professional studio recording
medium. A very similar format of 8mm digital tape is also quite commonly used for data storage.
See also Digital Compact Cassette, MiniDisc.
Digital Communications: The process of transmitting and receiving messages (information) by
sending and decoding one of a finite number of symbols during a sequence of symbol periods. One
primary requirement of a digital communication system is that the information must be represented
in a digital (or discrete) format. See also Message, Symbol, Symbol Period.
Digital Compact Cassette (DCC): DCC was introduced by Philips in the early 1990s, combining
the physical format of the popular compact cassette with new digital audio signal processing and
magnetic head technology [83], [52], [150]. Because of physical constraints
DCC uses psychoacoustic data compression techniques to increase the amount of data that can
be stored on a tape. The DCC mechanism allows it to play both (analog) compact cassette tapes
and DCC tapes. The tape speed is 4.75cm/s for both types of tapes and a carefully designed thin
film head is used to achieve both digital playback and analog playback. The actual tape quality is
similar to that used for video tapes. DCC is a competing format to Sony’s MiniDisc which also uses
psychoacoustic data compression techniques.
If normal stereo 16 bit, 48kHz (1.536 Mbits/sec) PCM digital recording were done on a DCC tape,
only about 20 minutes of music could be stored due to the physical restrictions of the tape.
Therefore to allow more than an hour of music on a single tape data compression is required. DCC
uses precision adaptive subband coding (PASC) to compress the audio by a factor of 4:1 to a data
rate of 384 kbits/s (192 kbits/s per channel), thus allowing more than an hour of music to be stored.
PASC is based on psychoacoustic compression principles and is similar to ISO/MPEG layer 1
standard. The input to a PASC encoder can be PCM data of up to 20 bits resolution at sampling
rates of 48kHz, 44.1kHz or 32kHz. The quality of music from a PASC encoded DCC is arguably as
good as a CD, and in fact for some parameters such as dynamic range a prerecorded DCC tape
can have improved performance over a CD (see Precision Adaptive Subband Coding).
Eight-to-ten modulation and cross interleaved Reed-Solomon coding (CIRC) are used for the DCC
tape channel coding and error correction. In addition to the audio tracks DCC features an auxiliary
channel capable of storing 6.75kbits/sec and which can be used for storing timing, textual
information and copyright protection codes.
[Figure: DCC record/playback chain. The L and R inputs are digitized by an ADC (or taken directly from a digital I/O), split into 32 subbands by a subband filter bank, psychoacoustically coded (PASC), error coded/corrected, data modulated and written to tape via the read/write head; on playback the chain is reversed and ends in a DAC.]

The Digital Compact Cassette (DCC) compresses PCM encoded 48kHz, 44.1kHz or 32kHz
digital audio to a bit rate of 384 kbits/s. The PCM input data can have up to 20 bits precision.
In terms of DSP algorithms the DCC also uses an IIR digital filter for equalization of the thin film
magnetic head frequency response, and a 12 weight FIR filter to compensate for the high frequency
roll-off of the magnetic channel. See also Compact Disc, Digital Audio, Digital Audio Tape (DAT),
MiniDisc, Precision Adaptive Subband Coding (PASC), Psychoacoustics.
Digital European Cordless Telephone (DECT): DECT is a cordless telephone standard in which
one or more handsets communicate with a base station over a wireless radio connection at 1.9GHz;
the base station is normally connected to the public switched telephone network. The handsets can
communicate with each other or with the outside world.
Digital Filter: A DSP system that will filter a digital input (i.e., selectively discriminate signals in
different frequency bands) according to some pre-designed criteria is called a digital filter. In some
situations digital filters are used to modify phase only [10], [7], [21], [31], [29]. A digital filter’s
characteristics are usually viewed via their frequency response and for some applications their
phase response (discussed in Finite Impulse Response Filter, and Infinite Impulse Response
Filter). For the frequency response, the filter attenuation or gain characteristic can either be
specified on a linear gain scale, or more commonly a logarithmic gain scale:
[Figure: Linear magnitude response |H(f)| = |Y(f)/X(f)| and logarithmic magnitude response 20 log|Y(f)/X(f)| (dB) of a digital filter H(f), with the -3 dB point marked at the 1000 Hz cut-off.]
The above digital filter is a low pass filter cutting off at 1000Hz. Both the linear and
logarithmic magnitude responses of the transfer function, H(f) = Y(f)/X(f), are shown.
The cut-off frequency of a filter is usually denoted as the "3dB frequency", i.e. at f_3dB = 1000
Hz, the filter attenuates the power of a sinusoidal component signal at this frequency by
0.5, i.e.

10 \log \left. \frac{P_{out}}{P_{in}} \right|_{f_{3dB}} = 20 \log \left| \frac{Y(f_{3dB})}{X(f_{3dB})} \right| = 10 \log 0.5 = 20 \log 0.707\ldots = -3 \; \mathrm{dB}

The power of the output signal relative to the input signal at f_3dB is therefore 0.5, and the
signal amplitude is attenuated by 1/\sqrt{2} = 0.707\ldots. For a low pass filter, signals with a
frequency higher than f_3dB are attenuated by more than 3dB.
Digital filters are usually designed as either low pass, high pass, band-pass or band-stop:
[Figure: Typical gain responses of low pass, high pass, band-pass and band-stop digital filters, plotted as gain against frequency.]
A number of filter design packages will give the user the facility to design a filter for an arbitrary
frequency response by “sketching” graphically:
[Figure: A user defined frequency response, "sketched" graphically as gain against frequency.]
There are two types of linear digital filters, FIR (finite impulse response filter) and IIR (infinite
impulse response filter). An FIR filter is a digital filter that performs a moving, weighted average on
a discrete input signal, x ( n ) , to produce an output signal. (For a more intuitive discussion of FIR
filtering operation see entry for Finite Impulse Response Filter).
The arithmetic computation required by the digital filter is of course performed on a DSP processor
or equivalent:
[Figure: An analogue input x(t) is anti-alias filtered and converted by an ADC at sampling rate fs to the discrete signal x(k); the DSP processor implements the digital filter equations; the output y(k) is passed through a DAC and reconstruction filter to give the analogue output y(t).]
The digital filter equations are implemented on the DSP Processor which processes the
time sampled data signal to produce a time sampled output data signal.
The actual frequency and phase response of the filter is found by taking the discrete Fourier
transform (DFT) of the weight values w_0 to w_{N-1}.
An FIR digital filter is usually represented in a signal flow graph or with a summation (convolution)
equation:
[Figure: Signal flow graph of an N weight FIR filter: the delay line holds x(k), x(k-1), ..., x(k-N+1), each sample is multiplied by the corresponding weight w0, w1, ..., wN-1, and the products are summed to give y(k).]
y(k) = w_0 x(k) + w_1 x(k-1) + w_2 x(k-2) + w_3 x(k-3) + \ldots + w_{N-1} x(k-N+1)
     = \sum_{n=0}^{N-1} w_n x(k-n) = \mathbf{w}^T \mathbf{x}_k

where \mathbf{w} = [w_0 \; w_1 \; w_2 \; \ldots \; w_{N-1}]^T and \mathbf{x}_k = [x(k) \; x(k-1) \; x(k-2) \; \ldots \; x(k-N+1)]^T.
The signal flow graph and the output equation for an FIR digital filter. The filter output y(k)
can be expressed as a summation equation, a difference equation or using vector notation.
The signal flow graph can be drawn in a more modular fashion by splitting the N element summer
into a series of two element summers:
[Figure: The same N weight FIR filter drawn in modular form, with the single N element summer replaced by a chain of two element summing nodes between the weighted taps w0 ... wN-1.]
The signal flow graph for an FIR filter is often modularized in order that the large N element
summer is broken down into a series of N-1 two element summing nodes. The operation,
of course, of this filter is identical to the above.
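A minimal C sketch (not from the original text) of the FIR convolution sum above; this is a direct, sample-by-sample implementation and the delay line handling and names are illustrative:

  /* N weight FIR filter: y(k) = sum_{n=0}^{N-1} w[n]*x(k-n).
     xdel[] holds the delay line x(k), x(k-1), ..., x(k-N+1). */
  double fir_filter(const double *w, double *xdel, int N, double x_new)
  {
      double y = 0.0;
      int n;

      for (n = N - 1; n > 0; n--)        /* shift the delay line ... */
          xdel[n] = xdel[n - 1];
      xdel[0] = x_new;                   /* ... and insert the newest sample */

      for (n = 0; n < N; n++)            /* moving weighted average */
          y += w[n] * xdel[n];

      return y;
  }

A real time implementation on a DSP processor would typically use a circular buffer rather than physically shifting the delay line.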
An IIR digital filter utilizes feedback (or recursion) in order to achieve a longer impulse response and
therefore the possible advantage of a filter with a sharper cut off frequency (i.e., smaller transition
bandwidth - see below) but with fewer weights than an FIR digital filter with an analogous frequency
response. (For a more intuitive discussion on the operation of an IIR filter see entry for Infinite
Impulse Response Filter.) The attraction of few weights is that the filter is cheaper to implement (in
terms of power consumption, DSP cycles and/or cost of DSP hardware). The signal flow graph and
output equation for an IIR filter is:
[Figure: Signal flow graph of a 2 zero, 3 pole IIR digital filter; the feedforward weights a0, a1, a2 act on x(k), x(k-1), x(k-2) and the feedback weights b1, b2, b3 act on y(k-1), y(k-2), y(k-3).]

y(k) = \sum_{n=0}^{2} a_n x(k-n) + \sum_{n=1}^{3} b_n y(k-n)
     = a_0 x(k) + a_1 x(k-1) + a_2 x(k-2) + b_1 y(k-1) + b_2 y(k-2) + b_3 y(k-3)
     = \mathbf{a}^T \mathbf{x}_k + \mathbf{b}^T \mathbf{y}_{k-1}

where \mathbf{a} = [a_0 \; a_1 \; a_2]^T, \mathbf{x}_k = [x(k) \; x(k-1) \; x(k-2)]^T, \mathbf{b} = [b_1 \; b_2 \; b_3]^T and \mathbf{y}_{k-1} = [y(k-1) \; y(k-2) \; y(k-3)]^T.
A signal flow graph and equation for a 2 zero, 3 pole IIR digital filter. The filter output y(k)
can be expressed as a summation equation, a difference equation or using vector notation.
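A minimal C sketch (not from the original text) of the 2 zero, 3 pole IIR difference equation above; the structure and names are illustrative, and the feedback coefficients must be chosen so that the poles lie inside the unit circle (a stable filter):

  /* 2 zero, 3 pole IIR filter:
     y(k) = a0*x(k) + a1*x(k-1) + a2*x(k-2) + b1*y(k-1) + b2*y(k-2) + b3*y(k-3) */
  typedef struct {
      double a[3];     /* feedforward weights a0, a1, a2          */
      double b[4];     /* feedback weights; b[1]..b[3] are used   */
      double x[3];     /* x(k), x(k-1), x(k-2)                    */
      double y[4];     /* y[1]..y[3] hold y(k-1), y(k-2), y(k-3)  */
  } iir3;

  double iir_filter(iir3 *f, double x_new)
  {
      f->x[2] = f->x[1];  f->x[1] = f->x[0];  f->x[0] = x_new;

      double y = f->a[0]*f->x[0] + f->a[1]*f->x[1] + f->a[2]*f->x[2]
               + f->b[1]*f->y[1] + f->b[2]*f->y[2] + f->b[3]*f->y[3];

      f->y[3] = f->y[2];  f->y[2] = f->y[1];  f->y[1] = y;   /* update feedback history */
      return y;
  }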
Design algorithms to find suitable weights for digital FIR filters are incorporated into many DSP
software packages and typically allow the user to specify the parameters of:
• Sampling frequency;
• Passband;
• Transition band;
• Stopband;
• Passband ripple;
• Stopband attenuation;
• No. of weights in the filter.
These parameters allow variations from the ideal (brick wall) filter, with the trade-offs being made
by the design engineer. In general, the less stringent the bounds on the various parameters, then
the fewer weights the digital filter will require:
[Figure: Parameters for specifying low pass, high pass, band-pass and band-stop filters: passband, passband ripple, transition band, stopband and stopband attenuation, shown on a gain (dB) versus frequency axis from 0 to fs/2 relative to the "ideal" (brick wall) filter, with the -3 dB point at the band edge.]
After the filter weights are produced by DSP filter design software the impulse response of the
digital filter can be plotted, i.e. the filter weights shown against time:
Filter weights (truncated to 5 decimal places):
w0 = w30 = 0.00378...    w1 = w29 = 0.00977...    w2 = w28 = 0.01809...    w3 = w27 = 0.02544...
w4 = w26 = 0.027154...   w5 = w25 = 0.019008...   w6 = w24 = 0.00003...    w7 = w23 = -0.02538...
w8 = w22 = -0.04748...   w9 = w21 = -0.05394...   w10 = w20 = -0.03487...  w11 = w19 = 0.01214...
w12 = w18 = 0.07926...   w13 = w17 = 0.14972...   w14 = w16 = 0.20316...   w15 = 0.22319...

[Figure: Impulse response h(n) = w_n plotted against time, n; 31 samples with sample period T = 1/10000 secs.]

DESIGN 1: Low Pass FIR Filter Impulse Response
The impulse response h ( n ) = w n of the low pass filter specified in the above SystemView
dialog boxes: cut-off frequency 1000 Hz; passband gain 0dB; stopband attenuation 60dB;
transition band 500 Hz; passband ripple 5dB and sampling at fs = 10000 Hz. The filter is
linear phase and has 31 weights and therefore an impulse response of duration 31/10000
seconds. For this particular filter the weights are represented with floating point real
numbers. Note that the filter was designed with 0dB in the passband. As a quick check the
sum of all of the coefficients is approximately 1, meaning that if a 0 Hz (DC) signal was
input, the output is not amplified or attenuated, i.e. gain = 1 or 0 dB.
From the impulse response the DFT (or FFT) can be used to produce the filter magnitude frequency
response and the actual filter characteristics can be compared with the original desired
specification:
[Figure: Linear magnitude response |H(f)| and logarithmic magnitude response 20 log|H(f)| (dB) of the DESIGN 1 filter, plotted from 0 to 5000 Hz.]
The 1024 point FFT (zero padded) of the above DESIGN 1 low pass filter impulse
response. The passband ripple is easier to see in the linear plot, whereas the stopband
ripple is easier to see in the logarithmic plot.
To illustrate the operation of the above digital filter, a chirp signal starting at a frequency of 900 Hz,
and linearly increasing to 1500 Hz over 0.05 seconds (500 samples) can be input to the filter and
the output observed (individual samples are not shown):
[Figure: The chirp input (sweeping from 900 Hz to 1500 Hz) and the corresponding output of the 1000 Hz cut-off low pass digital filter, plotted against time.]
As the chirp frequency reaches about 1000 Hz, the digital filter attenuates the output signal
amplitude by a factor of around 0.7 (3dB), until at 1500 Hz the signal amplitude is attenuated by more
than 60 dB, i.e. by a factor of 0.001.
If a low pass filter with less passband ripple and a sharper cut off is required then another filter can
be designed, although more weights will be required and the implementation cost of the filter has
therefore increased. To illustrate this point, if the above low pass filter is redesigned, but this time
with a stopband attenuation of 80dB, a passband ripple of 0.1dB and a transition band of, again,
500 Hz, the impulse response of the filter produced by the DSP design software now requires 67
weights:
[Figure: Impulse response h(n) = w_n plotted against time, n; 67 samples with sample period T = 1/10000 secs.]

DESIGN 2: Low Pass FIR Filter Impulse Response
The impulse response h ( n ) = w n of a low pass filter with: cut-off frequency 1000 Hz;
passband gain 0dB; stopband attenuation 80dB; transition band 500 Hz; passband ripple
0.1dB and sampling at fs = 10000 Hz. The filter is linear phase and has 67 weights
(compare to the above Design 1 which had 31 weights) and therefore an impulse response
of duration 67/10000 seconds.
The frequency response of this Design 2 filter can be found by taking the FFT of the digital filter
impulse response:
[Figure: Linear magnitude response |H(f)| and logarithmic magnitude response 20 log|H(f)| (dB) of the DESIGN 2 filter, plotted from 0 to 5000 Hz.]
The 1024 point FFT (zero padded) of the DESIGN 2 impulse response low pass filter
impulse response. Note that, as specified, the filter roll-off is now steeper, the stopband is
almost 80 dB and the inband ripple is only fractions of a dB.
Therefore low pass, high pass, bandpass, and bandstop digital filters can all be realised by using
the formal digital filter design methods that are available in a number of DSP software packages.
(Or if you have a great deal of time on your hands you can design them yourself with a paper and
pencil and reference to one of the classic DSP textbooks!) There are of course many filter design
trade-offs. For example, as already illustrated above, to design a filter with a fast transition between
stopband and passband requires more filter weights than a low pass filter with a slow roll-off in the
transition band. However the more filter weights, the higher the computational load on the DSP
processor, and the larger the group delay through the filter is likely to be. Care must therefore be
taken to ensure that the computational load of the digital filter does not exceed the maximum
processing rate of the DSP processor (which can be loosely measured in multiply-accumulates,
MACs) being used to implement it. The minimum computational load of a DSP processor implementing
a digital filter in the time domain is at least:

Computational Load of Digital Filter = ( Sampling Rate × No. of Filter Weights ) MACs per second        (81)
and likely to be a factor greater than 1 higher due to the additional overhead of other assembly
language instructions to read data in/out, to implement loops etc. Therefore a 100 weight digital filter
sampling at 8000 Hz requires a computational load of 800,000 MACs/second (readily achievable in
the mid-1990’s), whereas for a two channel digital audio tape (DAT) system sampling at 48kHz and
using stereo digital filters with 1000 weights requires a DSP processor capable of performing almost
100 million MACs per second (verging on the “just about” achievable with late-1990s DSP
processor technology). See also Adaptive Filter, Comb Filter, Finite Impulse Response (FIR) Filter,
Infinite Impulse Response (IIR) Filter, Group Delay, Linear Phase.
Digital Filter Order: The order of a digital filter is specified from the degree of the z-domain
polynomial. For example, an N weight FIR filter:
y(k) = w_0 x(k) + w_1 x(k-1) + \ldots + w_{N-1} x(k-N+1)        (82)
can be written as an N-1th order z-polynomial:
Y(z) = X(z) [ w_0 + w_1 z^{-1} + \ldots + w_{N-1} z^{-N+1} ]
     = X(z) z^{-N+1} [ w_0 z^{N-1} + w_1 z^{N-2} + \ldots + w_{N-1} ]        (83)
For an IIR filter, the order of the feedforward and feedback sections of the filter can both be
specified. For example an IIR filter with a 0-th order feedforward section (i.e. N = 1 above meaning
w 0 = 1 and all other weights are 0), and an M-1th order feedback section is given by the difference
equation:
y(k) = x(k) + b_1 y(k-1) + b_2 y(k-2) + \ldots + b_{M-1} y(k-M+1)        (84)
and the M-1th order denominator polynomial is shown below as:
\frac{Y(z)}{X(z)} = \frac{1}{1 - b_1 z^{-1} - \ldots - b_{M-2} z^{-M+2} - b_{M-1} z^{-M+1}}
                  = \frac{z^{M-1}}{z^{M-1} - b_1 z^{M-2} - \ldots - b_{M-2} z - b_{M-1}}        (85)
It is worth noting that for an IIR filter the feedback coefficients are indexed starting at 1, i.e. b_1. If a b_0
coefficient were added in the above signal flow graph, then this would introduce a scaling of the
output, y(k). See also Digital Filter, Finite Impulse Response Filter, Infinite Impulse Response Filter.
Digital Soundfield Processing (DSfP): The name given to the artificial addition of echo and
reverberation to a digital audio signal. For example, a car audio system can add echo and
reverberation to the digital signal prior to it being played through the speakers, thus giving the
impression of the acoustics of a large theatre or a stadium.
Digital Television: The enabling technologies of digital television are presented in detail in [95],
[96].
Digital to Analog Converter (D/A or DAC): A digital to analog converter is a device which will
take a stream of digital numbers and convert to a continuous voltage signal. Every digital to analog
converter has an input-output characteristic that specifies the output voltage for a given binary
number input. The output of a DAC is a stepped (staircase-like) waveform, and will in fact contain image
frequency components above half the sampling frequency. Therefore a reconstruction filter should be used at the output of a
DAC to smooth out the steps. Most D/As used in DSP operate using 2’s complement arithmetic.
See also Reconstruction Filter, Analog to Digital Converter.
[Figure: Example of a 5 bit DAC converting a train of binary values to an analog waveform, together with the DAC input-output characteristic mapping each binary input value to an output voltage.]
Digital Video Interactive (DVI): Intel Inc. have produced a proprietary digital video compression
technology which is generally known as DVI. Files that are encoded as DVI usually have the suffix,
“.dvi” (as do LaTeXTM device independent files -- these are different). See also Standards.
Diotic: A situation where the aural stimulation reaching both ears is the same. For example, diotic
audiometric testing would play exactly the same sounds into both ears. See also Audiometry,
Dichotic, Monauralic.
Dirac Impulse or Dirac Delta Function: The continuous time analog to the unit impulse function.
See Unit Impulse Function.
Direct Broadcast Satellite (DBS): Satellite transmission of television and radio signals may be
received directly by a consumer using a (relatively small) parabolic antenna (dish) and a digital
tuner. This form of broadcasting is gaining popularity in Europe, Japan, the USA and Australia.
Direct Memory Access: Allowing access to read or write RAM without interrupting normal
operation of the processor. The TMS320C40 DSP Processor has 6 independent DMA channels
that are 8 bits wide and allow access to memory without interrupting the DSP computation
operation. See also DSP Processor.
Directivity: A measure of the spatial selectivity of an array of sensors, or a single microphone or
antenna. Loosely, directivity is the ratio of the gain in the look direction to the average gain in all
directions. The higher the directivity, the more concentrated the spatial selectivity of a device is in
the look direction compared to all other directions. Mathematically, directivity is defined for a
(power) gain function G(θ,φ,f) as:
D(f) = \frac{G(0, 0, f)}{\frac{1}{4\pi} \int_{FOV} G(\theta, \phi, f) \, d\Omega}        (86)
where the look direction (and the maximum of the gain function) is assumed to be θ=0 and φ=0 and
the field of view (FOV) is assumed to be Ω = 4π steradians (units of solid angle). Note that the
directivity defined above is a function of frequency, f, only. If directivity as a function of frequency,
D(f), is averaged (i.e., integrated) over frequency then a single directivity number can be obtained
for a wideband system. See also Superdirectivity, Sidelobe, Main Lobe, Endfire.
Discrete Cosine Transform (DCT): The DCT is given by the equation:
X(k) = \sum_{n=0}^{N-1} x(n) \cos\!\left(\frac{2\pi k n}{N}\right),   for k = 0 to N-1        (87)
The DCT is essentially the discrete Fourier transform (DFT) evaluated only for the real part of the
complex exponential:

X(k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N},   for k = 0 to N-1        (88)
The DCT is used in a number of speech and image coding algorithms. See also Discrete Fourier
Transform.
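For illustration, a minimal C sketch (not from the original text) of a direct, unoptimized evaluation of Eq. 87; note that this follows the definition given here, and that other DCT definitions with different basis scalings are also in common use:

  #include <math.h>

  #define TWO_PI 6.283185307179586

  /* DCT as defined in Eq. 87: X(k) = sum_{n=0}^{N-1} x(n) cos(2*pi*k*n/N) */
  void dct(const double *x, double *X, int N)
  {
      for (int k = 0; k < N; k++) {
          X[k] = 0.0;
          for (int n = 0; n < N; n++)
              X[k] += x[n] * cos(TWO_PI * k * n / N);
      }
  }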
Discrete Fourier Transform: The Fourier transform [57], [58], [93] for continuous signals can be
defined as:
x(t) = \int_{-\infty}^{\infty} X(f) e^{j 2\pi f t} \, df        (Synthesis)

X(f) = \int_{-\infty}^{\infty} x(t) e^{-j 2\pi f t} \, dt        (Analysis)        (89)

Fourier Transform Pair
[Figure: Sampling an analogue signal x(t): N samples x(0), x(1), ..., x(N-1) taken every Ts seconds, spanning a total duration of NTs seconds.]
Sampling an analogue signal, x ( t ) , to produce a discrete time signal, x ( nTs ) written as
x ( n ) . The sampling period is T s and the sampling frequency is therefore f s = 1 ⁄ T s . The
total time duration of the N samples is NT s seconds. Just as there exists a continuous time
Fourier transform, we can also derive a discrete Fourier transform (DFT) in order to assess
what sinusoidal frequency components comprise this signal.
In the case where a signal is sampled at intervals of Ts seconds and is therefore discrete, the
Fourier transform analysis equation will become:
X(f) = \int_{-\infty}^{\infty} x(nT_s) e^{-j 2\pi f n T_s} \, d(nT_s)        (90)
and hence we can write:
X(f) = \sum_{n=-\infty}^{\infty} x(nT_s) e^{-j 2\pi f n T_s} = \sum_{n=-\infty}^{\infty} x(nT_s) e^{-j 2\pi f n / f_s}        (91)
To further simplify we can write the discrete time signal simply in terms of its sample number:
X(f) = \sum_{n=-\infty}^{\infty} x(nT_s) e^{-j 2\pi f n T_s} = \sum_{n=-\infty}^{\infty} x(n) e^{-j 2\pi f n / f_s}        (92)
Of course if our signal is causal then the first sample is at n = 0 , and the last sample is at
n = N – 1 , giving a total of N samples:
X(f) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi f n / f_s}        (93)
By using a finite number of data points this also forces the implicit assumption that our signal is now
periodic, with a period of N samples, or NT_s seconds (see above figure). Therefore, noting that Eq.
93 is actually calculated for a continuous frequency variable, f, in actual fact we need only
evaluate this equation at specific frequencies, namely zero frequency (DC) and harmonics of
the "fundamental" frequency, f_0 = 1/NT_s = f_s/N, i.e. the N discrete frequencies 0, f_0, 2f_0,
up to (N-1)f_0, just below f_s:
X\!\left(\frac{k f_s}{N}\right) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k f_s n / (N f_s)},   for k = 0 to N-1        (94)
Simplifying to use only the time index, n, and the frequency index, k, gives the discrete Fourier
transform:
X(k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N},   for k = 0 to N-1        (95)
If we recall that the discrete signal x(n) was sampled at f_s, then the signal has image (or alias)
components above f_s/2; when evaluating Eq. 95 it is therefore only necessary to evaluate up to f_s/2,
and the DFT is further simplified to:

X(k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N},   for k = 0 to N/2        (96)
Discrete Fourier Transform
Clearly, because we have evaluated the DFT at only N frequencies, the frequency resolution
is limited to DFT "bins" of frequency width f_s/N Hz.

Note that the discrete Fourier transform requires only multiplications and additions, since each complex
exponential is computed in its rectangular (real and imaginary) form:
e^{-j 2\pi k n / N} = \cos\!\left(\frac{2\pi k n}{N}\right) - j \sin\!\left(\frac{2\pi k n}{N}\right)        (97)
If the signal x ( k ) is real valued, then the DFT computation requires approximately N 2 real
multiplications and adds (noting that a real value multiplied by a complex value requires two real
multiplies). If the signal x ( k ) is complex then a total of 2N 2 MACs are required (noting that the
multiplication of two complex values requires four real multiplications).
From the DFT we can calculate a magnitude and a phase response:
X(k) = |X(k)| \; \angle X(k)        (98)

From a given DFT sequence, we can of course calculate the inverse DFT from:
x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) e^{j 2\pi n k / N}        (99)
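A minimal C sketch (not from the original text) of the direct DFT of Eq. 95, returning the real and imaginary parts together with the magnitude of Eq. 98; this O(N^2) evaluation is purely for illustration, since in practice the FFT would be used, and all names are illustrative:

  #include <math.h>

  #define TWO_PI 6.283185307179586

  /* Direct DFT (Eq. 95): X(k) = sum_{n=0}^{N-1} x(n) e^{-j2*pi*k*n/N} */
  void dft(const double *x, double *Xre, double *Xim, double *Xmag, int N)
  {
      for (int k = 0; k < N; k++) {
          Xre[k] = 0.0;
          Xim[k] = 0.0;
          for (int n = 0; n < N; n++) {
              double arg = TWO_PI * k * n / N;
              Xre[k] += x[n] * cos(arg);       /* real part (Eq. 97)      */
              Xim[k] -= x[n] * sin(arg);       /* imaginary part (Eq. 97) */
          }
          Xmag[k] = sqrt(Xre[k]*Xre[k] + Xim[k]*Xim[k]);   /* |X(k)|, Eq. 98 */
      }
  }

Applying this function to the 128 sample sine waves in the examples which follow would reproduce the magnitude spectra described.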
As an example consider taking the DFT of 128 samples of an 8Hz sine wave sampled at 128 Hz:
[Figure: Time signal x(nTs) and DFT magnitude response |X(kf0)| for the 8 Hz sine wave; a single spectral line appears at 8 Hz.]
The time signal shows 128 samples of an 8 Hz sine wave sampled at 128Hz:
x(n) = \sin(16\pi n / 128). Note that there are exactly an integral number of periods (eight)
present over the 128 samples. Taking the DFT exactly identifies the signal as an 8 Hz
sinusoid. The DFT magnitude spectrum has an equivalent negative frequency portion
which is identical to that of the positive frequencies because the time signal is real valued.
If we take the DFT of the slightly more complex signal consisting of an 8Hz and a 24Hz sine wave
of half the amplitude of the 8Hz then:
[Figure: Time signal and DFT magnitude response for the sum of 8 Hz and 24 Hz sine waves; spectral lines appear at 8 Hz and 24 Hz.]
The time signal shows 128 samples of the sum of 8 Hz and 24 Hz sine waves sampled at 128Hz:
x(n) = \sin(16\pi n / 128) + 0.5 \sin(48\pi n / 128). Note that there are exactly an integral
number of periods present for both sinusoids over the 128 samples.
Now consider taking the DFT of 128 samples of an 8.5 Hz sine wave sampled at 128 Hz:
[Figure: Time signal and DFT magnitude response for the 8.5 Hz sine wave; the energy is spread over several frequency bins around 8 Hz.]
The time signal shows 128 samples of an 8.5 Hz sine wave sampled at 128Hz:
x(n) = \sin(17\pi n / 128). Note that because the 8.5Hz sine wave does not lie exactly on a
frequency bin, its energy appears spread over a number of frequency bins around 8Hz.
So why is the signal energy now spread over a number of frequency bins? We can interpret this by
recalling that the DFT implicitly assumes that the signal is periodic, and the N data points being
analysed are one full period of the signal. Hence the DFT assumes the signal has the form:
[Figure: The signal x(t) assumed by the DFT: the N analysed samples repeated periodically before and after the analysis window, and so on.]
If there are an integral number of sine wave periods in the N samples input to the DFT
computation, then the spectral peaks will fall exactly on one of the frequency bins as shown
earlier. Essentially the result produced for the DFT computation has assumed that the
signal was periodic, and the N samples form one period of the signal and thereafter the
period repeats. Hence the DFT assumes the complete signal is as illustrated above (the
discrete samples are not shown for clarity).
If there are not an integral number of periods in the signal (as for the 8.5Hz example), then:
[Figure: When the N samples do not contain an integral number of periods, the assumed periodic repetition of the N samples introduces a discontinuity at the boundary between repeats.]
If there are not an integral number of sine wave periods in the N samples input to the DFT
computation, then the spectral peaks will not fall exactly on one of the frequency bins. As
the DFT computation has assumed that the signal was periodic, the DFT interprets that the
signal undergoes a “discontinuity” jump at the end of the N samples. Hence the result of
the DFT interprets the time signal as if this discontinuity was part of it. Hence more than
one single sine wave is required to produce this waveform and thus a number of frequency
bins indicate sine wave components being present.
In order to address the problem of spectral leakage, the DFT is often used in conjunction with a
windowing function. See also Basis Function, Discrete Cosine Transform, Discrete Fourier
Transform - Redundant Computation, Fast Fourier Transform, Fourier, Fourier Analysis, Fourier
Series, Fourier Transform, Frequency Response.
Discrete Fourier Transform, Redundant Computation: If we rewrite the form of the DFT in Eq.
96 as:
X(k) = \sum_{n=0}^{N-1} x(n) W_N^{-kn},   for k = 0 to N/2        (100)

where W_N = e^{j 2\pi / N}.
Therefore to calculate the DFT of a (trivial) signal with 8 samples requires:
X(0) = x(0) + x(1) + x(2) + x(3) + x(4) + x(5) + x(6) + x(7)
X(1) = x(0) + x(1)W_8^{-1} + x(2)W_8^{-2} + x(3)W_8^{-3} + x(4)W_8^{-4} + x(5)W_8^{-5} + x(6)W_8^{-6} + x(7)W_8^{-7}
X(2) = x(0) + x(1)W_8^{-2} + x(2)W_8^{-4} + x(3)W_8^{-6} + x(4)W_8^{-8} + x(5)W_8^{-10} + x(6)W_8^{-12} + x(7)W_8^{-14}
X(3) = x(0) + x(1)W_8^{-3} + x(2)W_8^{-6} + x(3)W_8^{-9} + x(4)W_8^{-12} + x(5)W_8^{-15} + x(6)W_8^{-18} + x(7)W_8^{-21}        (101)
However note that there is redundant computation in Eq. 101. Consider the third term in the second
line of Eq. 101:
x(2) W_8^{-2} = x(2) e^{j 2\pi (-2/8)} = x(2) e^{-j\pi/2}        (102)
Now consider the computation of the third term in the fourth line of Eq. 101
x(2) W_8^{-6} = x(2) e^{j 2\pi (-6/8)} = x(2) e^{-j 3\pi/2} = x(2) e^{j\pi} e^{-j\pi/2} = -x(2) e^{-j\pi/2}        (103)
Therefore we can save one multiply operation by noting that x(2)W_8^{-6} = -x(2)W_8^{-2}. In fact,
because of the periodicity of W_N^{kn}, every term in the fourth line of Eq. 101 is available from the terms
in the second line of the equation. Hence a considerable saving in multiplicative computations can
be achieved. This is the basis of the fast (discrete) Fourier transform discussed under item Fast
Fourier Transform.
Discrete Fourier Transform, Spectral Aliasing: Note that the discrete Fourier transform of a
signal x ( n ) is periodic in the frequency domain. If we assume that the signal was real and was
sampled above the Nyquist rate f s , then there are no frequency components of interest above f s ⁄ 2 .
From the Fourier transform, if we calculate the frequency components up to frequency f s ⁄ 2 then
this is equivalent to evaluating the DFT for the first N ⁄ 2 – 1 discrete frequency samples:
X(k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N},   for k = 0 to N/2 - 1        (104)
Of course if we evaluate for the next N ⁄ 2 – 1 discrete frequencies (i.e. from f s ⁄ 2 to f s ) then:
X(k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N},   for k = N/2 to N - 1        (105)
In Eq. 105, if we substitute for the variable i = N - k (i.e. k = N - i) and calculate over the range
i = 1 to N/2 (equivalent to the range k = N/2 to N - 1) then:
X(i) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi i n / N},   for i = 1 to N/2        (106)
and we can write:
X(N - k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi (N - k) n / N}
         = \sum_{n=0}^{N-1} x(n) e^{j 2\pi k n / N} e^{-j 2\pi N n / N}
         = \sum_{n=0}^{N-1} x(n) e^{j 2\pi k n / N} e^{-j 2\pi n}
         = \sum_{n=0}^{N-1} x(n) e^{j 2\pi k n / N},   for k = N/2 to N - 1        (107)
since e^{-j 2\pi n} = 1 for all integer values of n. For a real valued signal x(n) this final summation is
simply the complex conjugate of X(k) in Eq. 95, i.e. X(N - k) = X^*(k), and therefore from Eq. 107 it is clear that:

|X(k)| = |X(N - k)|        (108)
Hence when we plot the DFT magnitude it is symmetrical about the N/2 frequency sample, i.e. about the
frequency value f_s/2 Hz, depending on whether we plot the x-axis as a frequency index or a true frequency
value.
We can further easily show that if we take a value of frequency index k above N - 1 (i.e. evaluate
the DFT above frequency f_s), then:
X(k + mN) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi (k + mN) n / N}
          = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N} e^{-j 2\pi m n}
          = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi k n / N}
          = X(k)        (109)
where m is a positive integer and we note that e j2πmn = 1 .
Therefore we can conclude that when evaluating the magnitude response of the DFT the
components of specific interest cover the (baseband) frequencies from 0 to f s ⁄ 2 , and the
magnitude spectra will be symmetrical about the f s ⁄ 2 line and periodic with period f s :
[Figure: A discrete time signal x(n) of N samples spanning NTs seconds, and its discrete Fourier transform |X(k)| of N discrete frequency points spaced 1/NTs Hz apart; the magnitude spectrum is symmetrical about fs/2 and repeats periodically at fs, 2fs, 3fs, ...]
Spectral aliasing. The main portion of interest of the magnitude response is the “baseband”
from 0 to f s ⁄ 2 Hz. The “baseband” spectra is symmetrical about the point f s ⁄ 2 and
thereafter periodic with period f s Hz.
See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Zero
Padding, Fourier Analysis, Fourier Series, Fourier Transform.
Discrete Time: After an analog signal has been sampled at regular intervals, each sample
corresponds to the signal magnitude at a particular discrete time. If the sampling period was τ secs,
then sampling a continuous time analog signal

x(t)        (110)

every τ seconds would produce the samples

x_n = x(n) = x(n\tau),   for n = 0, 1, 2, 3, \ldots        (111)
For notational convenience the τ is usually dropped, and only the discrete time index, n, is used.
Of course, any letter can be used to denote the discrete time index, although the most common are:
“n”, “k” and “i”.
[Figure: An analog signal x(t) before sampling and the corresponding digital signal x(n) after sampling at 1000 Hz.]
Sampling a signal x(t) at 1000Hz. The sampling interval is therefore:
\tau = \frac{1}{1000} \; seconds

The sampled signal is denoted as x(n), where the explicit reference to τ has been dropped for
notational convenience.
Distortion: If the output of a system differs from the input in a non-linear fashion then distortion
has occurred. For example, if a signal is clipped by a DSP system then the output is said to be
distorted. By the very nature of non-linear functions, a distorted signal will contain frequency
components that were not present in the input signal. Distortion is also sometimes used to describe
linear frequency shaping. See also Total Harmonic Distortion.
Distribution Function: See Random Variable.
Dithering (audio): Dithering is a technique whereby a very low level of noise is added to a signal
in order to improve the quality of the psychoacoustically perceived sound. Although the addition of
dithering noise to a signal clearly reduces the signal to noise ratio (SNR) because it actually adds
more noise to the original signal, the overall sound is likely to be improved by breaking up the
correlation between the various signal components and quantization error (which, without dithering,
results in the quantization noise being manifested as harmonic or tonal distortion).
One form of dithering adds a white noise dither signal, d ( t ) with a power of q 2 ⁄ 12 , where q is the
quantization level of the analog to digital converter (ADC), to the audio signal, x ( t ) prior to
conversion:
[Figure: A dither signal d(t) is added to the input signal x(t) before the analog to digital converter (ADC), producing the dithered sampled output signal y(k).]
Note that without dithering, the quantization noise power introduced by the ADC is q 2 ⁄ 12 , and
therefore after dithering, the noise power in the digital signal is q 2 ⁄ 6 , i.e. the noise has doubled or
increased by 3dB ( 10 log 2 ). However the dithered output signal will have decorrelated the
quantization error of the ADC and the input signal, thus reducing the harmonic distortion
components. This reduction improves the perceived sound quality.
The following example illustrates dithering. A 600Hz sine wave of amplitude 6.104 × 10 –5
( = 2 ⁄ 32767 ) volts was sampled at 48000Hz with a 16 bit ADC which had the following input/output
characteristic:
[Figure: 16 bit Analogue to Digital Converter Input/Output Characteristic: binary outputs from -32768 to +32767 for voltage inputs over the range -1 to +1 volts.]
After analog to digital conversion (with d ( t ) = 0 , i.e. no dithering) the digital output has an
amplitude of 2. On a full scale logarithmic plot, 2 corresponds to -84 dB ( = 20 log ( 2 ⁄ 32767 ) ) where
the full scale amplitude of 32767 ( = 2 15 – 1 ) is 0dB. Time and frequency representations of the
output of the ADC are shown below, along with a 16384 point FFT of the ADC output:
[Figure: Amplitude x(n) against time (ms), and magnitude |X(f)| (dB) against frequency (kHz), for the undithered ADC output.]
The frequency representation of the 600Hz sine wave clearly shows that the quantization
noise manifests itself as harmonic distortion. Therefore when this signal is reconverted to
analog and replayed, the harmonic distortion may be audible.
The magnitude frequency spectrum of the (undithered) signal clearly highlights the tonal distortion
components which result from the conversion of this low level signal. The main distortion
components are at 1800Hz, 3000Hz, 4200Hz, and so on, (i.e. at 3, 5, 7,..., times the signal’s
fundamental frequency of 600 Hz).
However if the signal was first dithered by adding an analog white noise dithering signal, d ( t ) of
power q 2 ⁄ 12 prior to ADC conversion then the time and frequency representations of the ADC
output are:
[Figure: Amplitude y(n) against time (ms), and magnitude |Y(f)| (dB) against frequency (kHz), for the dithered ADC output.]
The frequency representation of the dithered 600Hz sine wave clearly shows that the
correlation between signal and the quantization error has been removed. Therefore if the
signal is reconverted to analog and replayed then the quantization noise is now effectively
whitened and harmonic distortion of the signal is no longer perceived.
Note that the magnitude frequency spectrum of the dithered signal has a higher average noise floor,
but the tonal nature of the quantization noise has been removed. This dithered signal is more
perceptually tolerable to listen to, as the background white noise is less perceptually annoying than
the harmonic noise generated without dithering.
Note that a common misconception is that dithering can be used to improve the quality of pre-recorded 16 bit hi-fidelity audio signals. There are, however, no techniques by which a 16 bit CD
output can be dithered to remove or reduce harmonic distortion, other than adding levels of noise to
mask it! It may appear in the previous figure as if simply perturbing the quantized values would be
a relatively simple and effective dithering technique. There are a number of important differences
between dithering before and after the quantizer. First, after the quantizer the noise is simply
additive and the spectra of the dither and the harmonically distorted signal add (this is the masking
of the harmonic distortion referred to above -- requiring a relatively high power dither). The additive
dithering before quantization does not result in additive spectra because the quantization is
nonlinear. Another difference can be thought of this way: the dither signal is much more likely to
cause a change in the quantized level when the input analog signal is close to a quantization
boundary (i.e., it does not have to move the signal value very far). After quantization, we have no
way of knowing (in the general case) how close an input signal was to a quantization boundary -- so mimicking the dither effect is not, in general, possible. However if a master 20 bit (or higher)
resolution recording exists and it is to be remastered to 16 bits, then digital dithering is appropriate,
whereby the 20 bit signal can be dithered prior to requantizing to 16 bits. The benefits will be similar
to those described above for ADCs.
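As a sketch of the digital (remastering) case just described, the following C fragment (not from the original text) requantizes a higher resolution sample to 16 bits with TPDF dither; rand() is used only as a convenient (if low quality) uniform noise source, and the scaling and names are illustrative assumptions:

  #include <stdlib.h>
  #include <math.h>

  /* Uniform dither over -0.5 .. +0.5 (in units of one 16 bit step q). */
  static double uniform_dither(void)
  {
      return (double)rand() / (double)RAND_MAX - 0.5;
  }

  /* Requantize to 16 bits with TPDF dither. The input x_hi_res is assumed
     to be scaled so that one 16 bit quantization step corresponds to 1.0. */
  long requantize16(double x_hi_res)
  {
      double d = uniform_dither() + uniform_dither();   /* two uniform PDFs -> TPDF */
      long   y = (long)floor(x_hi_res + d + 0.5);       /* round to the nearest step */

      if (y >  32767) y =  32767;                       /* clip to the 16 bit range */
      if (y < -32768) y = -32768;
      return y;
  }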
Some simple mathematical analysis of the benefits of dithering for breaking up correlation between
the signal and the quantization noise can be done. The following figure shows the correlation
between a sine wave input signal and the quantization error for 1 to 8 bits of signal resolution:
[Figure: Left: correlation coefficient between a sine wave input and the quantization error, plotted against the number of bits of signal resolution (1 to 8), with no dither and with single bit dither. Right: SNR (dB) against the number of bits of signal resolution for the same two cases.]
For low resolution signals the correlation between the signal and quantization error is high. This
will be seen as tonal or harmonic distortion. However, if a simple dithering scheme is applied prior
to analog to digital conversion the correlation can be greatly reduced.
For less than 8 bits resolution the correlation between the signal and quantization noise increases
to 0.4 and the signal will sound very (harmonically) distorted. The solid line shows the correlation
and signal to noise ratio (SNR) of the signal before and after dither has been added. Clearly the
dither is successful at breaking up the correlation between signal and quantization noise and the
benefits are greatest for low resolutions. However the total quantization noise in the digital signal
after dithering is increased by 3dB for all bit resolutions.
A uniformly distributed probability density function (PDF) and maximum amplitude of a half bit
( ± q ⁄ 2 ) is often used for dithering. Adding a single half bit dither signal successfully decorrelates
the expected error, however the second moment of the error remains correlated. To decorrelate the
second order moment a second uniformly distributed signal can be added. Higher order moments
can be decorrelated by adding additional single bits (with uniform probability density functions),
however it is found in practice that two uniform random variables (combining to give a triangular
probability density function) are sufficient. The effect of adding two random variables with uniform
PDFs of p ( x ) is equivalent to adding a random binary sequence with a triangular PDF (TPDF):
[Figure: Two uniform PDFs p(x) over -q/2 to +q/2 and the resulting triangular PDF p(y) over -q to +q.]
When two uniformly distributed random variables d 1 and d 2 , are added together, the probability
density function (PDF) of the result, y is a random variable with a triangular PDF (TPDF)
obtained by a convolution of the PDFs of d 1 and d 2 .
The noise power added to the output signal by one uniform PDF is q 2 ⁄ 12 , and therefore with two
of these dithering signals q 2 ⁄ 6 noise power is added to the output signal. Noting that the
quantization noise power of the ADC is q 2 ⁄ 12 and therefore the total noise power of an audio
signal dithered with a TPDF is q 2 ⁄ 4 , i.e. total noise power in the output signal has increased by a
factor of 3 or by 4.8 dB ( 10 log 3 ) over the noise power from the ADC being used without dither.
Despite this increase in total noise, the noise power is now more uniformly distributed over
frequency (i.e., more white and sounding like a broadband hissing) and the harmonic distortion
components caused by correlation between quantization error and the input signal have been
effectively attenuated.
In order to mathematically illustrate why dither works, an extreme case of low bit resolution will be
addressed. For a single bit ADC (stochastic conversion) the quantizer is effectively reduced to a
comparator where:

x(n) = \mathrm{sign}(v(n)) = \begin{cases} +1, & v(n) \geq 0 \\ -1, & v(n) < 0 \end{cases}        (112)
For a constant (DC) input signal v(n) = V_0, the output is x(n) = 1 whenever V_0 > 0, regardless of the exact
magnitude. However, by adding a dither signal d(n) with a uniform probability density function over the
range -Q/2 to Q/2 before performing the conversion, such that:

x(n) = \begin{cases} +1, & v(n) + d(n) \geq 0 \\ -1, & v(n) + d(n) < 0 \end{cases}        (113)
and taking the mean (expected) value of x ( n ) gives:
E[x(n)] = E[\mathrm{sign}(v(n) + d(n))] = E[\mathrm{sign}(n'(n))]        (114)

where n'(n) = v(n) + d(n) is a random variable uniformly distributed between V_0 - Q/2 and V_0 + Q/2. We can therefore show that the expected or mean value of the dithered quantizer output is:
E[x(n)] = (-1) \int_{V_0 - Q/2}^{0} \frac{1}{Q} \, dn' + \int_{0}^{V_0 + Q/2} \frac{1}{Q} \, dn'
        = \frac{1}{Q}\left(V_0 - \frac{Q}{2}\right) + \frac{1}{Q}\left(V_0 + \frac{Q}{2}\right) = \frac{2}{Q} V_0        (115)
Therefore, in the mean, the quantizer's average dithered output is proportional to V_0. The same
intuitive argument can be seen for a time varying input v(n), as long as the sampling rate is sufficiently fast
compared to the changes in the signal.
Dither can be further addressed with oversampling techniques to perform noise shaped dithering.
See also Analog to Digital Conversion, Digital to Analog Conversion, Digital Audio, Noise Shaping,
Tonal Distortion.
Divergence: When an algorithm does not converge to a stable solution and instead progresses
ever further away from a solution it may be said to be diverging. See also the Convergence entry.
Divide and Conquer: The name given to the general problem solving strategy of first dividing the
overall problem into a series of smaller sub-problems, solving these subproblems, and finally using
the solutions to the subproblems to give the overall solution. Some people also use this as an
approach to competing against external groups or managing people within their own organization.
Division: Division is rarely required by real time DSP algorithms such as filtering, FFTs,
correlation, adaptive algorithms and so on. Therefore DSP processors generally do not provide hardware
for performing fast division in the same way that single cycle parallel multipliers are provided.
Division is instead usually performed using a serial algorithm producing one bit of the result at a time, or
using an iterative technique such as Newton-Raphson. Processors such as the DSP56002 can
perform a fixed point division in around 12 clock cycles. It is worth pointing out, however, that some
DSP algorithms, such as the QR algorithm for adaptive signal processing, have excellent convergence and
stability properties and do require division. Therefore it is possible that in the future some DSP
devices may incorporate fast divide and square root hardware to allow these techniques to be implemented
in real time. See also DSP Processor, Parallel Adder, Parallel Multiplier.
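To illustrate the iterative approach mentioned above, a minimal C sketch (not from the original text) of Newton-Raphson reciprocal iteration, which forms a/b using only multiplications and subtractions; the seed and iteration count are illustrative assumptions, and a fixed point implementation would follow the same recursion:

  /* Newton-Raphson division: a/b computed via the reciprocal of b.
     The iteration x <- x*(2 - b*x) converges quadratically to 1/b.
     Assumes b has been scaled into the range 0.5 <= b < 1.0. */
  double nr_divide(double a, double b)
  {
      double x = 2.9 - 2.0 * b;          /* crude initial estimate of 1/b */
      for (int i = 0; i < 4; i++)
          x = x * (2.0 - b * x);         /* each pass roughly doubles the number of correct digits */
      return a * x;
  }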
Dosemeter: See Noise Dosemeter.
Dot Product: See Vector Properties - Inner Product.
Downsampling: The sampling rate of a digital signal sampled at fs can be downsampled by a
factor of M to a sampling frequency fd = fs/M by retaining only every M-th sample. Downsampling
can lead to aliasing problems and should be performed in conjunction with a low pass filter that cuts-
off at fs/2M; this combination is usually referred to as a decimator. See also Aliasing, Upsampling,
Decimation, Interpolation, Fractional Sampling Rate Conversion.
[Figure: Downsampling by M = 4: the input x(k) at sampling rate fs = 1/ts and the output y(k) at fd = fs/M = 1/td, together with the spectra |X(f)| (baseband up to fs/2) and |Y(f)| (repeating with period fd).]
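A minimal C sketch (not from the original text) of downsampling by a factor M, simply retaining every M-th sample; in a complete decimator this would be preceded by the low pass filter described above. The names are illustrative:

  /* Downsample x[] (length Nx) by M into y[]; returns the number of output
     samples. The input is assumed to have already been low pass filtered
     to fs/(2M) to avoid aliasing. */
  int downsample(const double *x, int Nx, double *y, int M)
  {
      int Ny = 0;
      for (int n = 0; n < Nx; n += M)
          y[Ny++] = x[n];
      return Ny;
  }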
Dr. Bub: The electronic bulletin board operated by Motorola and providing public domain source
code, and Motorola DSP related information and announcements.
Driver: The power output from a DAC is usually insufficient to drive an actuator such as a
loudspeaker. Although the voltage may be at the correct level, the DAC cannot source enough
current to deliver the required power. Therefore a driver in the form of an amplifier is required. See
also Signal Conditioning.
[Figure: DSP processor output passed through a DAC and then a driver amplifier.]
DSP Board: A DSP board is a generic name for a printed circuit board (PCB) which has a DSP
processor, memory, A/D and D/A capabilities, and digital input ports (parallel and serial). For
development work most DSP boards are plug-in modules for computers such as the IBM-PC, and
Macintosh. The computer is used as a host to allow assembly language programs to be
conveniently developed and tested using assemblers and cross compilers. When an application
has been fully developed, a stand-alone DSP board can be realized. See also Daughter Module,
DSP Processor, Motherboard.
[Figure: A typical DSP board: the DSP processor connects over address and data buses to ROM, RAM, a digital to analog converter (voltage output), an analog to digital converter (voltage input), parallel and serial I/O, and an interface to the host computer.]
DSP Processor: A microprocessor that has been designed for implementing DSP algorithms. The
main features of these chips are fast interrupt response times, a single cycle parallel multiplier, and
a subset of the assembly language instructions found on a general purpose microprocessor (e.g.
Motorola 68030) to save on silicon area and optimize DSP type instructions. The main DSP
processors are the families of the DSP56/96 (Motorola), TMS320 (Texas Instruments), ADSP 2100
(Analog Devices), and DSP16/32 (AT&T). DSP Processors are either floating point or fixed point
devices. See also DSP Board.
[Figure: A generic DSP processor: data and address registers, parallel multiplier, arithmetic logic unit, instruction decoder, interrupt handler, timers, RAM, ROM and EPROM, connected by address, data and control buses.]
DSPLINK™: A bidirectional and parallel 16 bit data interface path used on Loughborough Sound Images Ltd. (UK) and Spectron (USA) DSP boards to allow high speed communication between separate DSP boards and peripheral boards. The use of DSPLINK means that separate boards in a PC do not need to communicate data via the PC bus.
Dual: A prefix to mean “two of”. For example the Burr Brown DAC2814 chip is described as a Dual
12 Bit Digital to Analog Converter (DAC) meaning that the chip has two separate (or independent)
DACs. In the case of DACs and ADCs, if the device is used for hi-fidelity audio dual devices are
often referred to as stereo. See also Quad.
Dual Slope: A type of A/D converter.
Dual Tone Multifrequency (DTMF): DTMF is the basis of operation of push button tone dialing
telephones. Each button on a touch tone telephone is a combination of two frequencies, one from each of two groups of four. 4 × 4 = 16 possible combinations of tone pairs can be encoded using the two groups of four tones. The two groups of four frequencies are: (low) 697Hz, 770Hz, 852Hz, 941Hz, and
(high) 1209Hz, 1336Hz, 1477Hz, and 1633Hz:
              1209 Hz   1336 Hz   1477 Hz   1633 Hz
    697 Hz       1          2         3         A
    770 Hz       4          5         6         B
    852 Hz       7          8         9         C
    941 Hz       *          0         #         D

Each button on the keypad is a combination of two DTMF frequencies. (Note: most telephones do not have keys A, B, C and D.)
The standards for DTMF signal generation and detection are given in the ITU (International
Telecommunication Union) standards Q.23 and Q.24. In current telephone systems, virtually every
telephone now uses DTMF signalling to allow transmission of a 16 character alphabet for
applications such as number dialing, data entry, voice mail access, password entry and so on. The
DTMF specifications commonly adopted are:
Signal Frequencies:
• Low Group 697, 770, 852, 941 Hz
• High Group: 1209, 1336, 1477, 1633 Hz
Frequency tolerance:
• Operation: ≤ 1.5%
Power levels per frequency:
• Operation: 0 to -25dBm
• Non-operation: -55dBm max
Power level difference between frequencies
• +4dB to -8dB
Signal Reception timing:
• Signal duration: operation: 40ms (min)
• Signal duration: non-operation: 23ms (max)
• Pause duration: 40ms (min);
• Signal interruption: 10ms (max);
• Signalling velocity: 93 ms/digit (min).
See also Dual Tone Multifrequency - Tone Detection, Dual Tone Multifrequency - Tone Generation,
Goertzel’s Algorithm.
Dual Tone Multifrequency (DTMF), Tone Generation: One method to generate a tone is to use
a sine wave look up table. For example some members of the Motorola DSP56000 series of
processors include a ROM encoded 256 element sine wave table which can be used for this
purpose. Noting that each DTMF signal is a sum of two tones, then it should be possible to use a
look up table at different sampling rates to produce a DTMF tone.
An easier method is to design a “marginally stable” IIR (infinite impulse response) filter whereby the
poles of the filter are on the unit circle and the filter impulse response is a sinusoid at the desired
frequency. This method of tone generation requires only a few lines of DSP code, and avoids the
requirement for “expensive” look-up tables. The structure of an IIR filter suitable for tone generation
is simply:
[Figure: A two pole “marginally stable” IIR filter, with feedback coefficient b1 applied to y(k-1) and -1 applied to y(k-2). For an input of an impulse the filter begins to oscillate, producing a sinusoidal output.]
The operation of this two pole filter can be analysed by considering the z-domain representation. The discrete time equation for this filter is:

y(k) = x(k) + \sum_{n=1}^{2} b_n y(k-n) = x(k) + b y(k-1) - y(k-2)    (116)
where we now write b_1 = b and b_2 = -1. Writing this in the z-domain gives:

Y(z) = X(z) + b z^{-1} Y(z) - z^{-2} Y(z)    (117)
The transfer function, H(z), is therefore:

H(z) = \frac{Y(z)}{X(z)} = \frac{1}{1 - b z^{-1} + z^{-2}} = \frac{1}{(1 - p_1 z^{-1})(1 - p_2 z^{-1})} = \frac{1}{1 - (p_1 + p_2) z^{-1} + p_1 p_2 z^{-2}}    (118)
where p_1 and p_2 are the poles of the filter, and b = p_1 + p_2 and p_1 p_2 = 1. The poles of the filter, p_{1,2} (where the notation p_{1,2} means p_1 and p_2) can be calculated from the quadratic formula as:

p_{1,2} = \frac{b \pm \sqrt{b^2 - 4}}{2} = \frac{b \pm j\sqrt{4 - b^2}}{2}    (119)
Given that b is a real value with |b| < 2, p_1 and p_2 are complex conjugates. Rewriting Eq. 119 in polar form gives:

p_{1,2} = e^{\pm j \tan^{-1}\left(\frac{\sqrt{4 - b^2}}{b}\right)}    (120)
Considering the denominator polynomial of Eq. 118, the magnitudes of the complex conjugate values p_1 and p_2 are necessarily both 1 (since p_1 p_2 = 1), and the poles therefore lie on the unit circle. The frequency placement of the poles is given by:

p_{1,2} = 1 \cdot e^{\pm j 2\pi f / f_s}    (121)
(where |e^{jω}| = 1 for any real ω) for a sampling frequency f_s. From Eqs. 120 and 121 it follows that:

\frac{2\pi f}{f_s} = \tan^{-1}\left(\frac{\sqrt{4 - b^2}}{b}\right)    (122)
For most telecommunication systems the sampling frequency is f_s = 8000 Hz. The values of b for the various desired DTMF frequencies of oscillation can therefore be calculated from Eq. 122 (equivalently, b = 2cos(2πf/f_s)) to be:
    frequency, f (Hz)        b
          697            1.707737809
          770            1.645281036
          852            1.568686984
          941            1.478204568
         1209            1.164104023
         1336            0.996370211
         1477            0.798618389
         1633            0.568532707
For example, in order to generate the DTMF signal for the digit 1, it is required to produce two tones, one at 697 Hz and one at 1209 Hz. This can be accomplished by using the IIR filter structure shown below:
[Figure: An IIR filter to produce the DTMF signal for the digit 1. The filter consists of two “marginally stable” two pole IIR filters, producing the 697 Hz tone (b = 1.707737...) and the 1209 Hz tone (b = 1.164104...), both excited by an impulse and added together. Note that the filters will have different magnitude responses and therefore the two tones are unlikely to have the same amplitude. The ITU standard allows for this amplitude difference.]
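A minimal C sketch of this tone generation method is given below; the function and variable names are illustrative, and no attempt is made to scale the two tones to equal amplitude:

#include <math.h>
#include <stdio.h>

/* One "marginally stable" two pole resonator: y(k) = b*y(k-1) - y(k-2) + x(k). */
static double resonator(double b, double *y1, double *y2, double x)
{
    double y = b * (*y1) - (*y2) + x;
    *y2 = *y1;              /* y(k-2) <- y(k-1) */
    *y1 = y;                /* y(k-1) <- y(k)   */
    return y;
}

int main(void)
{
    const double pi = 3.14159265358979;
    const double fs = 8000.0;
    double b_low  = 2.0 * cos(2.0 * pi * 697.0  / fs);   /* 1.7077..., from Eq. 122 */
    double b_high = 2.0 * cos(2.0 * pi * 1209.0 / fs);   /* 1.1641... */
    double l1 = 0.0, l2 = 0.0, h1 = 0.0, h2 = 0.0;

    for (int k = 0; k < 80; k++) {                 /* 10 ms of samples at 8 kHz */
        double x = (k == 0) ? 1.0 : 0.0;           /* impulse input */
        double s = resonator(b_low,  &l1, &l2, x)
                 + resonator(b_high, &h1, &h2, x); /* sum of 697 Hz and 1209 Hz tones */
        printf("%f\n", s);
    }
    return 0;
}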
See also Dual Tone Multifrequency (DTMF), Dual Tone Multifrequency (DTMF) - Tone Detection, Goertzel’s Algorithm.
Dual Tone Multifrequency (DTMF), Tone Detection: DTMF tones can be detected by
performing a discrete Fourier transform (DFT), and considering the level of power that is present in
a particular frequency bin. Because DTMF tones are often used in situations where speech may
also be present, it is important that any detection scheme used can distinguish between a tone and
a speech signal that happens to have strong tonal components at a DTMF frequency. Therefore for
a DTMF tone at f Hz, a detection scheme should check for the signal component at f Hz and also
check that there is no discernable component at 2f Hz; quasi-periodic speech components (such
as vowel sounds) are rich in (even) harmonics, whereas DTMF tones are not.
The number of samples used in calculating the DFT should be shorter than the number of samples
in half of a DTMF signalling interval, typically of 50ms duration equivalent to 400 samples at a
sampling frequency of f s = 8000 Hz , but be large enough to give a good frequency resolution. The
DTMF standards of the International Telecommunication Union (ITU) therefore suggest a value of
205 samples in standards Q.23 and Q.24. Using this 205 point DFT the DTMF fundamental and the
second harmonics of the 8 possible tones can be successfully discerned. Simple decision logic is
applied to the DFT output to specify which tone is present. The second harmonic is also detected
in order that the tones can be discriminated from speech utterances that happen to include a
frequency component at one of the 8 frequencies. Speech can have very strong harmonic content,
whereas the DTMF tone will not. To add robustness against noise, the same DTMF tone is required to be detected in a number of successive analysis frames before a valid DTMF signal is declared.
If a 205 point DFT is used, then the frequency resolution will be:

\text{Frequency Resolution} = \frac{8000}{205} = 39.02 \text{ Hz}    (123)
The DTMF tones therefore do not all lie exactly on the frequency bins. For example the tone at 770 Hz will be detected at the frequency bin centred at 780 Hz (20 × 39.02 Hz). In general the frequency bin, k, to look for a single tone can be calculated from:

k = \mathrm{int}\left(\frac{f_{tone} N}{f_s}\right)    (124)

where int(·) denotes rounding to the nearest integer, f_tone is a DTMF frequency, N = 205 and f_s = 8000 Hz. The bins for all of the DTMF tones for these parameters are therefore:
    frequency, f (Hz)     bin
          697              18
          770              20
          852              22
          941              24
         1209              31
         1336              34
         1477              38
         1633              42
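The following minimal C sketch shows the bin calculation of Eq. 124 and a direct evaluation of the magnitude of a single DFT bin, the quantity that Goertzel's algorithm computes more efficiently; the function names and the constants are illustrative:

#include <math.h>

#define N 205                 /* DFT length suggested by Q.23/Q.24 */

/* Bin index of Eq. 124, rounded to the nearest integer. */
int dtmf_bin(double f_tone, double fs)
{
    return (int)(f_tone * (double)N / fs + 0.5);
}

/* Magnitude of the single DFT bin k of the N sample block x[]. */
double bin_magnitude(const double *x, int k)
{
    const double pi = 3.14159265358979;
    double re = 0.0, im = 0.0;
    for (int n = 0; n < N; n++) {
        double ang = 2.0 * pi * (double)k * (double)n / (double)N;
        re += x[n] * cos(ang);     /* real part of X(k)      */
        im -= x[n] * sin(ang);     /* imaginary part of X(k) */
    }
    return sqrt(re * re + im * im);
}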
When the 2nd harmonic of a DTMF frequency is to be considered, then the bin at twice the
fundamental frequency bin value is detected (there should be no appreciable signal power there for
a DTMF frequency). Because only certain frequencies are of interest when calculating the DFT for DTMF detection, it is only necessary to calculate the frequency components at the frequency bins of interest. Therefore an efficient algorithm based on the DFT, called Goertzel’s algorithm, is usually used for DTMF tone detection. See also Dual Tone Multifrequency, Dual Tone Multifrequency - Tone Generation, Goertzel’s Algorithm.
Dynamic Link Library: A library of compiled software routines in a separate file on disk that can
be called by a Microsoft Windows program.
Dynamic RAM (DRAM): Random access memory that needs to be periodically refreshed
(electrically recharged) so that information that is stored electrically is not lost. See also Non-volatile
RAM, Static RAM.
Dynamic Range: Dynamic range specifies the numerical range, giving an indication of the largest and smallest values that can be correctly represented by a DSP system. For example if 16 bits are used in a system then the linear (amplitude) dynamic range is -2^{15} to 2^{15}-1 (-32768 to +32767). Usually dynamic range is given in decibels (dB) calculated from 20 log_{10}(Linear Range), e.g. for 16 bits, 20 log_{10} 2^{16} ≈ 96 dB.
E
e: The natural logarithm base, e = 2.7182818…. e can be derived by taking the following limit:

e \equiv \lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^{n}    (125)
See also Exponential Function.
Ear: The ear is basically the system of flesh, bone, nerves and brain allowing mammals to perceive and react to sound. It is probably fair to say that a very large percentage of DSP is dealing with the processing, coding and reproduction of audio signals for presentation to the human ear.
[Figure: A Simplified Diagram of the Human Ear, showing the pinna, auditory canal, eardrum, inner ear bones, semicircular canals, cochlea and the cochlear nerves to the brain.]
The human ear can be generally described as consisting of three parts, the outer, middle and inner
ear. The outer ear consists of the pinna and the ear canal. The shape of the external ear has evolved such that it has good sensitivity to frequencies in the range 2 - 4kHz. Its complex shape
provides a number of diffracted and reflected acoustic paths into the middle ear which will modify
the spectrum of the arriving sound. As a result a single ear can actually discriminate direction of
arrival of broadband sounds.
The ear canal leads to the ear drum (tympanic membrane) which can flex in response to sound.
Sound is then mechanically conducted towards the inner ear by an interconnection of bones (the ossicles), the malleus (hammer), the incus (anvil) and the stapes (stirrup), which act as an impedance matching
network (with the ear drum and the oval window of the cochlea) to improve the transmission of
acoustic energy to the inner ear. Muscular suppression of the ossicle movement provides for
additional compression of very loud sounds.
The inner ear consists mainly of the cochlea and the vestibular system which includes the
semicircular canals (these are primarily used for balance). The cochlea is a fluid filled snail-shell
shaped organ that is divided along its length by two membranes. Hair cells attached to the basilar
membrane detect the displacement of the membrane along the distance from the oval window to
the end of the cochlea. Different frequencies are mapped to different spots along the basilar
membrane. The further the distance from the oval window, the lower the frequency. The basilar
membrane and its associated components can be viewed as acting like a series of bandpass filters
sending information to the brain to interpret [30]. In addition, the output of these filters is
logarithmically compressed. The combination of the middle and inner ear mechanics allows signals
to be processed over the amazing dynamic range of 120dB. See also Audiology, Audiometer, Audiometry, Auditory Filters, Hearing Impairment, Threshold of Hearing.
EBCDIC: See also ASCII.
Echo: When a sound is reflected off a nearby wall or object, this reflection is called an echo.
Subsequent echoes (of echoes), as would be clearly heard in a large, empty room are referred to
collectively as reverberations. Echoes also occur on telecommunication systems where impedance
mismatches reflect a signal back to the transmitter. Echoes can sometimes be heard on long
distance telephone calls. See also Echo Cancellation, Reverberation.
Echo Cancellation: An echo canceller can be realised [53] with an adaptive signal processing
system identification architecture. For example if a telephone line is causing an echo then by
incorporating an adaptive echo canceller it should be possible to attenuate this echo:
[Figure: A simple adaptive echo canceller. Signal A passes through an ADC to an adaptive filter whose output, a simulated echo of A, is subtracted from the received signal B + echo of A (the echo being produced by an echo “generator” such as a hybrid telephone connection); the difference forms the output signal, and the DAC output of A is sent on to speaker B. The success of the cancellation will depend on the statistics and relative powers of the signals A and B.]
When speaker A (or data source A) sends information down the telephone line, mismatches in the
telephone hybrids can cause echoes to occur. Therefore speaker A will hear an echo of their own
voice which can be particularly annoying if the echo path from the near and far end hybrids is
particularly long. (Some echo to the earpiece is often desirable for telephone conversation, and the
local hybrid is deliberately mismatched. However for data transmission echo is very undesirable
and must be removed.) If the echo generating path can be suitably modelled with an adaptive filter,
then a negative simulated echo can be added to cancel out the signal A echo. At the other end of
the line, telephone user B can also have an echo canceller.
In general local echo cancellation (where the adaptive echo canceller is inside the consumer’s
telephone/data communication equipment) is only used for data transmission and not speech.
Minimum specifications for the ITU V-series of recommendations can be found in the CCITT Blue
Book. For V32 modems (9600 bits/sec with Trellis code modulation) an echo reduction ratio of 52dB
is required. This is a power reduction of around 160,000 in the echo. Hence the requirement for a
powerful DSP processor.
For long distance telephone calls where the round trip echo delay is more than 0.1 seconds and
suppressed by less than 40dB (this is typical via satellite or undersea cables) line echo on speech
can be a particularly annoying problem. Before adaptive echo cancellers were cost effective to implement, the echo problem was solved by setting up speech detectors and allowing speech
to be half duplex. This was inconvenient for speakers who were required to take turns speaking.
Adaptive echo cancellers at telephone exchanges have helped to solve this problem. The set up of
the telephone exchange echo cancellers is a little different from the above example and the echo
is cancelled on the outgoing signal line, rather than the incoming signal line. See also Acoustic Echo
Cancellation, Adaptive Filtering, Least Mean Squares Algorithm.
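A minimal sketch of the adaptive part of such an echo canceller, using the least mean squares (LMS) algorithm referenced above, is shown below; the filter length, step size and names are illustrative assumptions rather than any standard implementation:

#define NW 64                 /* adaptive filter length (illustrative) */

/* One sample of LMS echo cancellation. a_hist holds the last NW samples of
   the far end signal A (a_hist[0] is the newest); d is the received sample
   B + echo of A; w holds the adaptive weights; mu is the LMS step size.
   The return value is the echo-cancelled output, ideally just B. */
double cancel_echo(const double *a_hist, double d, double *w, double mu)
{
    double y = 0.0;
    for (int n = 0; n < NW; n++)
        y += w[n] * a_hist[n];            /* simulated echo of A */
    double e = d - y;                     /* received minus simulated echo */
    for (int n = 0; n < NW; n++)
        w[n] += 2.0 * mu * e * a_hist[n]; /* LMS weight update */
    return e;
}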
Eigenanalysis: See Matrix Decompositions - Eigenanalysis.
Eigenvalue: See Matrix Decompositions - Eigenanalysis.
Eigenvector: See Matrix Decompositions - Eigenanalysis.
Eight to Fourteen Modulation (EFM): EFM is used in compact disc (CD) players to convert 8 bit
symbols to a 14 bit word using a look-up table [33]. When the 14 bit words are used fewer 1-0 and
0-1 transitions are needed than would be the case with the 8 bit words. In addition, the presence of the transitions is guaranteed. This allows required synchronization information to be placed on the
disc for every possible data set. In addition, the forced presence of zeros allows the transitions
(ones) to occur less frequently than would otherwise be the case. This increases the playing time
since more bits can be put on a disk with a fixed minimum feature size (i.e., pit size). See also
Compact Disc.
Electrocardiogram (ECG): The general name given to the electrical potentials of the heart
sensed by electrodes placed externally on the body (i.e., surface leads) [48]. These potentials can
also be sensed by placing electrodes directly on the heart as is done with implantable devices
(sometimes referred to as pacemakers). The bandwidth used for a typical clinical ECG signal is
about 0.05-100Hz. The peak amplitude of a sensed ECG signal is about 1 mV and for use in a DSP
system the ECG will typically require to be amplified by a low noise amplifier with gain of about 1000
or more.
[Figure: Example ECG, amplitude (mV) against time (secs).]
Electroencephalogram (EEG): The EEG measures small microvolt potentials induced by the
brain that are picked up by electrodes placed on the head [48]. The frequency range of interest is
about 0.5-60Hz. A number of companies are now making multichannel DSP acquisition boards for
recording EEGs at sampling rates of a few hundred Hertz.
Electromagnetic Interference (EMI): Unwanted electromagnetic radiation resulting from energy
sources that interfere with or modulate desired electrical signals within a system.
Electromagnetic Compatibility (EMC): With the proliferation of electronic circuit boards in
virtually every walk of life particular care must be taken at the design stage to avoid the electronics
acting as a transmitter of high frequency electromagnetic waves. In general a strip of wire with a
high frequency current passing through can act as an antenna and transmit radio waves. The
harmonic content from the clock in a simple microprocessor system can easily give off radio signals that may interfere with nearby radio communications devices, or other electronic circuitry.
A number of EMC regulations have recently been introduced to guard against unwanted radio wave
emissions from electronic systems.
Electromagnetic Spectrum: Electromagnetic waves travel through space at approximately
3 × 10 8 m/s, i.e. the speed of light. In fact, light is a form of electromagnetic radiation for which we
have evolved sensors (eyes). The various broadcasting bands are classified as very low (VLF), low
(LF), medium (MF), high (HF), very high (VHF), ultra high (UHF), super high (SHF), and extremely
high frequencies (EHF). One of the most familiar bands in everyday life is VHF (very high) used by
FM radio stations.
    Band    Frequency range       Example use
    VLF     3 kHz - 30 kHz
    LF      30 kHz - 300 kHz
    MF      300 kHz - 3 MHz       AM radio
    HF      3 MHz - 30 MHz
    VHF     30 MHz - 300 MHz      FM radio
    UHF     300 MHz - 3 GHz
    SHF     3 GHz - 30 GHz        satellite
    EHF     30 GHz - 300 GHz

Above the EHF band lie the infrared and visible light regions.
Electromyogram (EMG): Signals sensed by electrodes placed inside muscles of the body. The
frequency range of interest is 10-200Hz.
Electroreception: Electroreception is a means by which fish, animals and birds use electric fields
for navigation or communication. There are two types of electric fish: “strongly electric”, such as the electric eel, which can use its electrical energy as a defense mechanism; and “weakly electric”, which applies to many common sea and freshwater fish who use electrical energy for navigation and perhaps even communication [151]. Weakly electric fish can have one of two differing patterns of electric discharge: (1) continuous wave, where a tone like signal is output at frequencies of between 50 and 1000 Hz, and (2) pulse wave, where trains of pulses lasting about a millisecond and spaced about 25 milliseconds apart are emitted. The signals are generated by a special tubular organ that extends almost from the fish head to tail. By sensing the variation in electrical conductivity caused by objects distorting the electric field, an electrical image of the surroundings can be conveyed to the fish via receptors on its body. The relatively weak electric field, however, means that fish are in general electrically short sighted and cannot sense objects any more than one or two fish lengths away. However this is enough to avoid rocks and other poor electrical conductors, which cast electrical shadows that the fish can pick up on. See also Mammals.
Elementary Signals: A set of elementary signals can be defined which have certain properties
and can be combined in a linear or non linear fashion with time shifts and periodic extensions to
create more complicated signals. Elementary signals are useful for the mathematical analysis and
description of signals and systems [47]. Although there is no universally agreed list of elementary
signals, a list of the most basic functions is likely to include:
1. Unit Step;
2. Unit Impulse;
3. Rectangular Pulse;
4. Triangular Pulse
5. Ramp Function;
6. Harmonic Oscillation (sine and cosine waves);
7. Exponential Functions;
8. Complex Exponentials;
9. Mother Wavelets and Scaling Functions;
Both analog and discrete versions of the above elementary signals can be defined. Elementary
signals are also referred to as signal primitives. See also Convolution, Elementary Signals, Fourier
Transform Properties, Impulse Response, Sampling Property, Unit Impulse Function, Unit Step
Function.
Elliptic Filter: See Filters.
Embedded Control: DSP processors and associated A/D and D/A channels can be used for
control of a mechanical system. For example a feedback control algorithm could be used to control the revolution speed of the blade in a sheet metal cutter. Typically the term embedded will
imply a real-time system.
Emulator: A hardware board or device which has (hopefully!) the same functionality as an actual
DSP chip, and can be used conveniently and effectively for developing and debugging applications
before actual implementation on the DSP chip.
Endfire: A beamformer configuration in which the desired signal is located along a line that
contains a linear array of sensors. See also Broadside, Superdirectivity.
[Figure: An endfire beamformer. A linear array of M sensors feeds delays τ1 ... τM into a summer or DSP processor to form the output; the endfire look direction lies along the array axis, and the delay for sensor n is τn = d(1,n)/c, where d(1,n) is the distance from sensor 1 to sensor n and c is the propagation velocity.]
Engaged Tone: See also Busy Tone.
Ensemble Averages: A term used interchangeably with statistical average. See Expected Value.
Entropy: See Information Theory
Entropy Coding: Any type of data compression technique which exploits the fact that some
symbols are likely to occur less often than others and assigns fewer bits for coding to the more
frequent. For example the letter “e” occurs more often in the English language than the letter “z”. Therefore the transmission code for “e” may only use 2 bits, whereas the transmission code for “z” might require 8 bits. The technique can be further enhanced by assigning codes to common groups of letters such as “ch”, or “sh”. See also Huffman Coding.
Equal Loudness Contours: Equal loudness gives a measure of the actual SPL of a sound compared to the perceived or judged loudness, i.e. a purely subjective measure. The equal loudness contours are therefore presented for equal phons (the subjective measure of loudness).
[Figure: Equal Loudness Contours, plotted as SPL (dB) against frequency (Hz, approximately 50 Hz to 20 kHz), for loudness levels from 10 to 120 phons; the lowest curve is the threshold of hearing.]
The curves are obtained by averaging over a large cross section of the population who do not have
hearing impairments [30]. These measurements were first performed by Fletcher and Munson in
1933 [73], and later by Robinson and Dadson in 1956 [126]. See also Audiometry, Auditory Filters,
Frequency Range of Hearing, Hearing, Loudness Recruitment, Sound Pressure Level, Sound
Pressure Level Weighting Curves, Spectral Masking, Temporal Masking, Temporary Threshold
Shift, Threshold of Hearing, Ultrasound.
Equal Tempered Scale: See Equitempered Scale.
Equalisation: If a signal is passed through a channel (e.g., it is filtered) and the effects of the
channel on the signal are removed by making an inverse channel filter using DSP, then this is
referred to as equalization. Equalization attempts to restore the frequency and phase characteristic
of the signal to the values prior to transmission and is widely used in telecommunications to
maximize the reliable transmission data rate, and reduce errors caused by the channel frequency
and phase response. Equalization implementations are now commonly found in FAX machines and
telephone MODEMS. Most equalization algorithms are adaptive signal processing least squares or
least mean squares based. See also Inverse System Identification.
[Figure: Equalization of a telephone channel between the USA and Scotland. The channel T(f) is followed by an A/D converter and an equalization digital filter E(f); the sketches show the channel frequency response |T(f)|², the equalizer frequency response |E(f)|², and the combined response |T(f)E(f)|² over the 0 - 4 kHz band.]
Equitempered Scale: Another name for the well known Western music scale of 12 musical notes in an octave where the ratio of the fundamental frequencies of adjacent notes is a constant of value 2^{1/12} = 1.0594631…. The frequency difference between adjacent notes on the equitempered scale is therefore about 6%. The difference between the logarithms of the fundamental frequencies of adjacent notes is therefore a constant of:

\log_{10}(2^{1/12}) = 0.0250858\ldots    (126)

Hence if a piece of digital music is replayed at a sampling rate that mismatches the original by around 6%, the key of the music will be shifted by a semitone (as well as everything sounding that little bit faster or slower!). See also Music, Music Synthesis, Western Music Scale.
Equivalent Sound Continuous Level (Leq): Sound pressure level in units of dB (SPL) gives a measure of the instantaneous level of sound. To produce a measure of averaged or integrated sound pressure level over a time interval T, the equivalent sound continuous level can be calculated [46]:

L_{eq,T} = 10 \log_{10}\left(\frac{\frac{1}{T}\int_{0}^{T} P^{2}(t)\,dt}{P_{ref}^{2}}\right)    (127)
where P_ref is the standard SPL reference pressure of 2 × 10^{-5} N/m² = 20 µPa, and P(t) is the time varying sound pressure. If a particular sound pressure level weighting curve was used, such as the A-weighting scale, then this may be indicated as L_{Aeq,T}.
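As a minimal sketch, Eq. 127 can be approximated in discrete time from a block of pressure samples as follows (the function and variable names are illustrative, and the samples are assumed to be calibrated in pascals):

#include <math.h>

/* Discrete time approximation of Eq. 127: Leq over a block of pressure
   samples p[0..num_samples-1] (in pascals), with Pref = 20 micropascals. */
double leq_dB(const double *p, int num_samples)
{
    const double p_ref = 2.0e-5;                  /* 2 x 10^-5 N/m^2 */
    double mean_square = 0.0;
    for (int n = 0; n < num_samples; n++)
        mean_square += p[n] * p[n] / num_samples; /* (1/T) integral of P^2(t) dt */
    return 10.0 * log10(mean_square / (p_ref * p_ref));
}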
Leq measurements can usually be calculated by good quality SPL meters which will average the
sound over a specified time typically from a few seconds to a few minutes. SPL meters which
provide this facility will correspond to IEC 804: 1985 (and BS 6698 in the UK). See also Hearing
Impairment, Sound Exposure Meters, Sound Pressure Level, Sound Pressure Level Weighting
Curves, Threshold of Hearing.
Ergodic: If a stationary random process (i.e., a signal) is ergodic, then its statistical averages (or ensemble averages) equal the time averages of a single realization of the process. For example, given a signal x(n) with a probability density function p{x(n)}, the mean or expected value is calculated from:
\text{Mean of } x(n) = E\{x(n)\} = \sum_{n} x(n)\,p\{x(n)\}    (128)

and the mean squared value is calculated as:

\text{Mean Squared Value of } x(n) = E\{[x(n)]^2\} = \sum_{n} [x(n)]^2\,p\{x(n)\}    (129)
For a stationary signal the probability density function or a number of realizations of the signal may
be difficult or inconvenient to obtain. Therefore if the signal is ergodic the time averages can be
used:
E\{x(n)\} \approx \frac{1}{M_2 - M_1}\sum_{n=M_1}^{M_2 - 1} x(n) \quad \text{for large } (M_2 - M_1)    (130)

and

E\{[x(n)]^2\} \approx \frac{1}{M_2 - M_1}\sum_{n=M_1}^{M_2 - 1} [x(n)]^2 \quad \text{for large } (M_2 - M_1)    (131)
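A minimal C sketch of these time averages (Eqs. 130 and 131), with illustrative function and variable names:

/* Time average estimates of the mean and mean squared value of x(n) over
   the window n = M1 ... M2-1, valid when the signal can be assumed ergodic. */
void time_averages(const double *x, int m1, int m2, double *mean, double *mean_sq)
{
    double sum = 0.0, sum_sq = 0.0;
    for (int n = m1; n < m2; n++) {
        sum    += x[n];
        sum_sq += x[n] * x[n];
    }
    *mean    = sum    / (double)(m2 - m1);   /* Eq. 130 */
    *mean_sq = sum_sq / (double)(m2 - m1);   /* Eq. 131 */
}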
See also Expected Value, Mean Value, Mean Squared Value, Variance, Wide Sense Stationarity.
Error Analysis: When the cumulative effect of arithmetic round-off errors in an algorithm is
calculated, this is referred to as an error analysis. Most error analysis is performed from
consideration of relative and absolute errors of quantities. For example, consider two real numbers
x and y, that are estimated as x’ and y’ with absolute errors ∆x and ∆y . Therefore:
x = x′ + ∆x
y = y′ + ∆y
(132)
If x and y are added:

w = x + y    (133)
then the error, ∆w , caused by adding the estimated quantities such that w′ = x′ + y′ is calculated
by noting that:
w = w′ + ∆w = x′ + ∆x + y′ + ∆y
(134)
and therefore:
∆w = ∆x + ∆y
(135)
Therefore the (worst case) error caused by the adding (or subtracting) two values is calculated as
the sum of the absolute errors.
When the product z = xy is formed then:
z = xy = ( x′ + ∆x ) ( y′ + ∆y )
= x′y′ + ∆xy′ + ∆yx′ + ∆x∆y
(136)
Using the estimated quantities to calculate z′ = x′y′ , the product error, ∆z , is given by:
∆z = z – z′ = ∆xy′ + ∆yx′ + ∆x∆y
(137)
If we assume that the quantities ∆x and ∆y are small with respect to x′ and y′ then the term ∆x∆y can be neglected and the error in the product is given by:
∆z ≅ ∆xy′ + ∆yx′
(138)
Dividing both sides of the equation by z, we can express the relative error in z as the sum of the
relative errors of x and y:
\frac{\Delta z}{z} \cong \frac{\Delta x}{x} + \frac{\Delta y}{y}    (139)
The above two results can be used to simplify the error analysis of the arithmetic of many signal
processing algorithms. See also Absolute Error, Quantization Noise, Relative Error.
Error Budget: See Total Error Budget.
Error Burst: See Burst Errors.
Error Performance Surface: See Wiener-Hopf Equations.
Euclidean Distance: Loosely, Euclidean distance is simply linear distance, i.e., distance “as the
crow flies”. More specifically, Euclidean distance is the square root of the sum of the squared
differences between two vectors. One example would be the distance between the endpoints of the
hypotenuse of a right triangle. This distance satisfies the Pythagorean Theorem, i.e., the square
root of the sum of the squares. See also Hamming Distance, Viterbi Algorithm.
Euler’s Formula: An important mathematical relationship in dealing with complex numbers and
harmonic relationships is given by Euler’s Formula:
e^{j\theta} = \cos\theta + j\sin\theta    (140)

If we think of e^{jθ} as being a 2-dimensional unit length vector (or phasor) that rotates around the origin as θ is varied, then the real part (cos θ) is given by the projection of that vector onto the x-axis, and the imaginary part (sin θ) is given by the projection of that vector onto the y-axis.
European Broadcast Union (EBU): The EBU define standards and recommendations for
broadcast of audio, video and data. The EBU has a special relationship with the European
Telecommunications Standards Institute (ETSI) through which joint standards are produced such
as NICAM 728 (ETS 300 163).
“a network, in general evolving from a telephony integrated digital network (IDN), that provides end to end
connectivity to support a wide range of services including voice and non-voice services, to which users have a
limited set of standard multi-purpose user network interfaces.”
The ITU-T I-series of recommendations fully defines the operation and existence of ISDN. See also
European Telecommunications Standards Institute, International Telecommunication Union,
International Organisation for Standards, Standards, I-series Recommendations, ITU-T
Recommendations.
European Telecommunications Standards Institute (ETSI): ETSI provides a forum at which all
European countries sit to decide upon telecommunications standards. The institute was set up in
1988 for three main reasons: (1) the global (ISO/IEC) standards often left too many questions open;
(2) they often do not prescribe enough detail to achieve interoperability; (3) Europe cannot always
wait for other countries to agree or follow the standards of the USA and Asia.
ETSI has 12 committees covering telecommunications, wired fixed networks, satellite
communications, radio communications for the fixed and mobile services, testing methodology, and
equipment engineering. ETSI were responsible for the recommendations of GSM (Group Specialé
Mobile, or Global System for Mobile Communications). See also Comité Européen de
Normalisation Electrotechnique, International Telecommunication Union, International
Organisation for Standards, Standards.
Evaluation Board: A printed circuit board produced in volume by a company, and intended for
evaluation and benchmarking purposes. An evaluation board is often a cut down version of a
production board available from the company. A DSP evaluation board is likely to have limited
memory available, use a slow clock DSP processor, and be restricted in its convenient
expandability. See also DSP Board.
Even Function: The graph of an even function is symmetric about the y-axis such that
y = f ( x ) = f ( – x ) . This simple 1-dimensional intuition is quickly extended to more complex
functions by noting that the basic requirement is still f ( x ) = f ( – x ) whether x or f(x) are vectors or
vector-valued functions or some combination. Example even functions include y = cos x and
y = x^2. In contrast an odd function has point symmetry about the origin such that f(-x) = -f(x). See also Odd Function.
[Figure: Graphs of the even functions y = x^2 and y = cos x, each symmetric about the y-axis.]
Evoked Potentials: When the brain is excited by audio or visual stimuli, small voltage potentials
can be measured on the head, emanating from the brain [48]. These Visually Evoked Potentials (VEP),
and Audio Evoked Potentials (AEP) can be sampled, and processed using a DSP system. Evoked
potentials can also be measured directly on the brain or the brainstem.
Excess Mean Square Error: See Least Mean Squares (LMS) Algorithm.
Exp: Common notation used for the exponential function. See Exponential Function.
Expected Value: The expected value, E { . } , of a random variable (or a function of a random
variable) is simply the average value of the random variable (or of the function of a random
variable). The statistical average or mean value of signal x ( n ) is computed from:
\text{Mean of } x(n) = E\{x(n)\} = \sum_{n} x(n)\,p\{x(n)\}    (141)

where E{x(n)} is “the expected value of x(n)”, and p{x(n)} is the probability density function of the random variable x(n). As another example of an expected value, the mean squared value of x(n) is calculated as:

\text{Mean Squared Value of } x(n) = E\{x^2(n)\} = \sum_{n} x^2(n)\,p\{x(n)\}    (142)
Expected value is a linear operation, i.e.,:
E { ax ( n ) + by ( n ) } = aE { x ( n ) } + bE { y ( n ) }
(143)
where a and b are constants and x ( n ) and y ( n ) are random signals generated by known
probability density functions, p y { y ( n ) } and p x { x ( n ) } .
For most signals encountered in real time DSP the probability density function is unlikely to be
known and therefore the expected value cannot be calculated as suggested above. However if the
signal is ergodic, then time averages can be used to approximate the statistical averages. See also
Ergodic, Mean Value, Mean Squared Value, Variance, Wide Sense Stationarity.
Exponential Averaging: An exponential averager with parameter α computes a running average \bar{x}(n) of a sequence {x(n)} as:

\bar{x}(n) = (1 - \alpha)\,\bar{x}(n-1) + \alpha\, x(n)    (144)

where α is contained in the interval [0,1]. An exponential average (a one pole lowpass filter) is simpler to compute than a moving rectangular window since older data points are simply forgotten by the exponentially decreasing powers of (1 - α). A convenient rule of thumb approximation for the “equivalent rectangular window” of an exponential averager is 1/α data samples. See also Waveform Averaging, Moving Average, Weighted Moving Average.
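A minimal C sketch of Eq. 144 as a streaming update (the struct and function names are illustrative):

/* One pole exponential averager state: the running average and the parameter alpha. */
typedef struct { double avg; double alpha; } exp_avg_t;

/* Update the average with a new sample x and return it (Eq. 144). A smaller
   alpha gives a longer "equivalent rectangular window" of about 1/alpha samples. */
double exp_avg_update(exp_avg_t *s, double x)
{
    s->avg = (1.0 - s->alpha) * s->avg + s->alpha * x;
    return s->avg;
}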
Exponential Function: The simple exponential function is:
y = e^x = \exp(x)    (145)

[Figure: Plot of y = e^x against x, for x from -1 to 3.]
where “e” is the base of the natural logarithm, e = 2.7182818 . A key property of the exponential
function is that the derivative of e^x is e^x, i.e.

\frac{d}{dx}e^x = e^x    (146)
Real causal exponential functions can be used to represent the natural decay of energy in a passive system, such as the voltage decay in an RC circuit. For example consider the discrete time exponential:

x(k) = A e^{-\lambda k t_s} u(k)    (147)

[Figure: The decaying discrete time exponential x(k) plotted against sample index k = 0, 1, 2, 3, 4, ..., starting from amplitude A.]
(147)
where u(k) is the unit step function, t s is the sampling period, and A and λ are constants. See also
Complex Exponential Functions, Damped Sinusoid, RC Circuit.
F
F-Series Recommendations: The F-series telecommunication recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T and formerly known as CCITT) provide standards for services other than telephone (operations, quality, service definitions and human factors). Some of the current recommendations (http://www.itu.ch) include:
F.1    Operational provisions for the international public telegram service.
F.2    Operational provisions for the collection of telegram charges.
F.4    Plain and secret language.
F.10   Character error rate objective for telegraph communication using 5-unit start-stop equipment.
F.11   Continued availability of traditional services.
F.14   General provisions for one-stop-shopping arrangements.
F.15   Evaluating the success of new services.
F.16   Global virtual network service.
F.17   Operational aspects of service telecommunications.
F.18   Guidelines on harmonization of international public bureau services.
F.20   The international gentex service.
F.21   Composition of answer-back codes for the international gentex service.
F.23   Grade of service for long-distance international gentex circuits.
F.24   Average grade of service from country to country in the gentex service.
F.30   Use of various sequences of combinations for special purposes.
F.31   Telegram retransmission system.
F.35   Provisions applying to the operation of an international public automatic message switching service for equipments utilizing the International Telegraph Alphabet No. 2.
F.40   International public telemessage service.
F.41   Interworking between the telemessage service and the international public telegram service.
F.59   General characteristics of the international telex service.
F.60   Operational provisions for the international telex service.
F.61   Operational provisions relating to the chargeable duration of a telex call.
F.63   Additional facilities in the international telex service.
F.64   Determination of the number of international telex circuits required to carry a given volume of traffic.
F.65   Time-to-answer by operators at international telex positions.
F.68   Establishment of the automatic intercontinental telex network.
F.69   The international telex service - Service and operational provisions of telex destination codes and telex network identification codes.
F.70   Evaluating the quality of the international telex service.
F.71   Interconnection of private teleprinter networks with the telex network.
F.72   The international telex service - General principles and operational aspects of a store and forward facility.
F.73   Operational principles for communication between terminals of the international telex service and data terminal equipment on packet switched public data networks.
F.74   Intermediate storage devices accessed from the international telex service using single stage selection answerback format.
F.80   Basic requirements for interworking relations between the international telex service and other services.
F.82   Operational provisions to permit interworking between the international telex service and the intex service.
F.86   Interworking between the international telex service and the videotex service.
F.87   Operational principles for the transfer of messages from terminals on the telex network to Group 3 facsimile terminals connected to the public switched telephone network.
F.89   Status enquiry function in the international telex service.
F.91   General statistics for the telegraph services.
F.93   Routing tables for offices connected to the gentex service.
F.95   Table of international telex relations and traffic.
F.96   List of destination indicators.
F.100  Scheduled radiocommunication services.
F.104  International leased circuit services - Customer circuit designations.
F.105  Operational provisions for phototelegrams.
F.106  Operational provisions for private phototelegraph calls.
F.107  Rules for phototelegraph calls established over circuits normally used for telephone traffic.
F.108  Operating rules for international phototelegraph calls to multiple destinations.
F.111  Principles of service for mobile systems.
F.112  Quality objectives for 50-baud start-stop telegraph transmission in the maritime mobile-satellite service.
F.113  Service provisions for aeronautical passenger communications supported by mobile-satellite systems.
F.115  Service objectives and principles for future public land mobile telecommunication systems.
F.120  Ship station identification for VHF/UHF and maritime mobile-satellite services.
F.122  Operational procedures for the maritime satellite data transmission service.
F.125  Numbering plan for access to the mobile-satellite services of INMARSAT from the international telex service.
F.127  Operational procedures for interworking between the international telex service and the service offered by INMARSAT-C system.
F.130  Maritime answer-back codes.
F.131  Radiotelex service codes.
F.140  Point-to-multipoint telecommunication service via satellite.
F.141  International two-way multipoint telecommunication service via satellite.
F.150  Service and operational provisions for the intex service.
F.160  General operational provisions for the international public facsimile services.
F.162  Service and operational requirements of store-and-forward facsimile service.
F.163  Operational requirements of the interconnection of facsimile store-and-forward units.
F.170  Operational provisions for the international public facsimile service between public bureaux (bureaufax).
F.171  Operational provisions relating to the use of store-and-forward switching nodes within the bureaufax service.
F.180  General operational provisions for the international public facsimile service between subscriber stations (telefax).
F.182  Operational provisions for the international public facsimile service between subscribers' stations with Group 3 facsimile machines (Telefax 3).
F.184  Operational provisions for the international public facsimile service between subscriber stations with Group 4 facsimile machines (Telefax 4).
F.190  Operational provisions for the international facsimile service between public bureaux and subscriber stations and vice versa (bureaufax-telefax and vice versa).
F.200  Teletex service.
F.201  Interworking between teletex service and telex service - General principles.
F.202  Interworking between the telex service and the teletex service - General procedures and operational requirements for the international interconnection of telex/teletex conversion facilities.
F.203  Network based storage for the teletex service.
F.220  Service requirements unique to the processable mode number eleven (PM11) used within teletex service.
F.230  Service requirements unique to the mixed mode (MM) used within the teletex service.
F.300  Videotex service.
F.350  Application of T Series recommendations.
F.351  General principles on the presentation of terminal identification to users of the telematic services.
F.353  Provision of telematic and data transmission services on integrated services digital network (ISDN).
F.400  Message handling services: Message Handling System and service overview (also designated X.400).
F.401  Message handling services: naming and addressing for public message handling services.
F.410  Message Handling Services: the public message transfer service.
F.415  Message handling services: Intercommunication with public physical delivery services.
F.420  Message handling services: the public interpersonal messaging service.
F.421  Message handling services: Intercommunication between the IPM service and the telex service.
F.422  Message handling services: Intercommunication between the IPM service and the teletex service.
F.423  Message Handling Services: intercommunication between the interpersonal messaging service and the telefax service.
F.435  Message handling: electronic data interchange messaging service.
F.440  Message handling services: the voice messaging service.
F.500  International public directory services.
F.551  Service for the telematic file transfer within Telefax 3, Telefax 4, Teletex services and message handling services.
F.581  Guidelines for programming communication interfaces (PCIs) definition: Service.
F.600  Service and operational principles for public data transmission services.
F.701  Teleconference service.
F.710  General principles for audiographic conference service.
F.711  Audiographic conference teleservice for ISDN.
F.720  Videotelephony services - general.
F.721  Videotelephony teleservice for ISDN.
F.730  Videoconference service - general.
F.732  Broadband Videoconference Services.
F.740  Audiovisual interactive services.
F.761  Service-oriented requirements for telewriting applications.
F.811  Broadband connection-oriented bearer service.
F.812  Broadband connectionless data bearer service.
F.813  Virtual path service for reserved and permanent communications.
F.850  Principles of Universal Personal Telecommunication (UPT).
F.851  Universal personal telecommunication (UPT) - Service description (service set 1).
F.901  Usability evaluation of telecommunication services.
F.902  Interactive services design guidelines.
F.910  Procedures for designing, evaluating and selecting symbols, pictograms and icons.
For additional detail consult the appropriate standard document or contact the ITU. See also
International Telecommunication Union, ITU-T Recommendations, Standards.
Far End Echo: Signal echo that is produced by components in far end telephone equipment. Far
end echo arrives after near end echo. See also Echo Cancellation, Near End Echo.
Fast Fourier Transform (FFT): The FFT [66], [93] is a method of computing the discrete Fourier
transform (DFT) that exploits the redundancy in the general DFT equation:
X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N} \quad \text{for } k = 0 \text{ to } N-1    (148)
Noting that the DFT computation of Eq. 148 requires approximately N^2 complex multiply accumulates (MACs), where N is a power of 2, the radix-2 FFT requires only N log_2 N MACs. The computational savings achieved by the FFT is therefore a factor of N/log_2 N. When N is large this
saving can be considerable. The following table compares the number of MACs required for
different values of N for the DFT and the FFT:
    N          DFT MACs        FFT MACs
    32         1024            160
    1024       1048576         10240
    32768      ~1 x 10^9       ~0.5 x 10^6
There are a number of different FFT algorithms sometimes grouped via the names Cooley-Tukey,
prime factor, decimation-in-time, decimation-in-frequency, radix-2 and so on. The bottom line for all
FFT algorithms is, however, that they remove redundancy from the direct DFT computational
algorithm of Eq. 148.
We can highlight the existence of the redundant computation in the DFT by inspecting Eq. 148.
First, for notational simplicity we can rewrite Eq. 148 as:
X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{-kn} \quad \text{for } k = 0 \text{ to } N-1    (149)
where W_N = e^{j2π/N} = cos(2π/N) + j sin(2π/N). Using the DFT algorithm to calculate the first four components of the DFT of a (trivial) signal with only 8 samples requires the following computations:
X(0) = x(0) + x(1) + x(2) + x(3) + x(4) + x(5) + x(6) + x(7)
X(1) = x(0) + x(1)W_8^{-1} + x(2)W_8^{-2} + x(3)W_8^{-3} + x(4)W_8^{-4} + x(5)W_8^{-5} + x(6)W_8^{-6} + x(7)W_8^{-7}
X(2) = x(0) + x(1)W_8^{-2} + x(2)W_8^{-4} + x(3)W_8^{-6} + x(4)W_8^{-8} + x(5)W_8^{-10} + x(6)W_8^{-12} + x(7)W_8^{-14}
X(3) = x(0) + x(1)W_8^{-3} + x(2)W_8^{-6} + x(3)W_8^{-9} + x(4)W_8^{-12} + x(5)W_8^{-15} + x(6)W_8^{-18} + x(7)W_8^{-21}    (150)
However note that there is redundant (or repeated) arithmetic computation in Eq. 150. For example,
consider the third term in the second line of Eq. 150:
x(2)W_8^{-2} = x(2)e^{j2\pi(-2/8)} = x(2)e^{-j\pi/2}    (151)
Now consider the computation of the third term in the fourth line of Eq. 150:
x(2)W_8^{-6} = x(2)e^{j2\pi(-6/8)} = x(2)e^{-j3\pi/2} = x(2)e^{-j\pi}e^{-j\pi/2} = -x(2)e^{-j\pi/2}    (152)
Therefore we can save one multiply operation by noting that the term x(2)W_8^{-6} = -x(2)W_8^{-2}. In fact, because of the periodicity of W_N^{kn}, every term in the fourth line of Eq. 150 is available from the computed terms in the second line of the equation. Hence a considerable saving in multiplicative computations can be achieved if the computational order of the DFT algorithm is carefully considered.
More generally we can show that the terms in the second line of Eq. 150 are:
x(n)W_8^{-n} = x(n)e^{-j2\pi n/8} = x(n)e^{-j\pi n/4}    (153)
and for terms in the fourth line of Eq. 150:
x(n)W_8^{-3n} = x(n)e^{-j6\pi n/8} = x(n)e^{-j3\pi n/4} = x(n)e^{-j(\pi/2 + \pi/4)n} = x(n)e^{-j\pi n/2}e^{-j\pi n/4} = (-j)^n x(n)e^{-j\pi n/4} = (-j)^n x(n)W_8^{-n}    (154)
This exploitation of the computational redundancy is the basis of the FFT, which allows the same result as the DFT to be computed, but with fewer MACs.

To more formally derive one version of the FFT (decimation-in-time radix-2), consider splitting the DFT equation into two “half signals” consisting of the odd numbered and even numbered samples, where the total number of samples is a power of 2 (N = 2^n):
X(k) = \sum_{n=0}^{N/2-1} x(2n)\, e^{-j2\pi k(2n)/N} + \sum_{n=0}^{N/2-1} x(2n+1)\, e^{-j2\pi k(2n+1)/N}
     = \sum_{n=0}^{N/2-1} x(2n)\, W_N^{-2nk} + \sum_{n=0}^{N/2-1} x(2n+1)\, W_N^{-(2n+1)k}
     = \sum_{n=0}^{N/2-1} x(2n)\, W_N^{-2nk} + W_N^{-k}\sum_{n=0}^{N/2-1} x(2n+1)\, W_N^{-2nk}    (155)
Notice in Eq. 155 that the N point DFT, which requires N^2 MACs in Eq. 148, is now accomplished by performing two N/2 point DFTs requiring a total of 2 × N^2/4 MACs, which is a computational saving of 50%. Therefore a next logical step is to compute each of the N/2 point DFTs as two N/4 point DFTs, saving 50% computation again, and so on. As the number of points we started with was a power of 2, N = 2^n, we can perform this decimation of the signal a total of n = log_2 N times, and each time reduce the total computation of each stage to that of a “butterfly” operation. The computational saving is therefore a factor of:

\frac{N}{\log_2 N}    (156)
In general equations for an FFT are awkward to write mathematically, and therefore the algorithm
is very often represented as a “butterfly” based signal flow graph (SFG), the butterfly being a simple
signal flow graph of the form:
[Figure: The butterfly signal flow graph. Inputs a and b enter at splitting nodes; the outputs c = a + W_N^k b and d = a - W_N^k b are formed at summing nodes via the multiplier W_N^k and a -1 branch. The multiplier W_N^k is a complex number, and the input data a and b may also be complex. One butterfly computation requires one complex multiply and two complex additions (assuming the data is complex).]
A more complete SFG for an 8 point decimation in time radix 2 FFT computation is:
[Figure: A radix-2 decimation-in-time (DIT) Cooley-Tukey FFT signal flow graph for N = 8, with W_N = e^{-j2π/N}. The inputs x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7) are arranged in bit-reversed order and the outputs X(0) to X(7) appear in natural order. Note that the butterfly computation is repeated throughout the SFG.]
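To make the decimation of Eq. 155 concrete, the following is a minimal recursive radix-2 decimation-in-time FFT sketch in C (using C99 complex arithmetic; function and variable names are illustrative, and a production implementation would instead work in place with bit reversed addressing and precomputed twiddle factors):

#include <complex.h>
#include <math.h>

/* Recursive radix-2 DIT FFT following Eq. 155: an N point DFT is formed from
   the N/2 point DFTs of the even and odd numbered samples, combined with the
   twiddle factor. N must be a power of 2; stride selects every stride-th input.
   Call as fft(x, X, N, 1) for an N point transform. */
void fft(const double complex *x, double complex *X, int N, int stride)
{
    const double pi = 3.14159265358979;
    if (N == 1) { X[0] = x[0]; return; }
    fft(x,          X,         N / 2, 2 * stride);       /* even numbered samples */
    fft(x + stride, X + N / 2, N / 2, 2 * stride);       /* odd numbered samples  */
    for (int k = 0; k < N / 2; k++) {
        double complex w = cexp(-2.0 * pi * k / N * I);  /* twiddle e^(-j2*pi*k/N) */
        double complex even = X[k], odd = X[k + N / 2];
        X[k]         = even + w * odd;                   /* butterfly */
        X[k + N / 2] = even - w * odd;
    }
}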
See also Bit Reverse Addressing, Cooley-Tukey, Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform - Decimation-in-Time (DIT), Fast Fourier Transform - Decimation-in-Frequency (DIF), Fast Fourier Transform - Zero Padding, Fourier, Fourier Analysis, Fourier Series, Fourier Transform, Frequency Response, Phase Response.
Fast Fourier Transform, Decimation-in-Frequency (DIF): The DFT can be reformulated to give the FFT either as a DIT or a DIF algorithm. In the decimation-in-frequency formulation the input time samples are taken in natural order and the output frequency samples appear in bit-reversed order. See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Decimation-in-Time.
137
Fast Fourier Transform, Decimation-in-Time (DIT): The DFT can be reformulated to give the FFT either as a DIF or a DIT algorithm. In the decimation-in-time formulation the output frequency samples appear in proper order when the input time samples are arranged in bit-reversed order. See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Decimation-in-Frequency.
Fast Fourier Transform, Zero Padding: When performing an FFT, the number of data points used in the algorithm is a power of 2 (for radix-2 FFT algorithms). What if a particular process only produces 100 samples and the FFT is required? There are three choices: (1) truncate the sequence to 64 samples; (2) pad the signal out to 128 samples by setting the last 28 values to be the same as the first 28 samples; (3) zero pad the data out to 128 samples by setting the last 28 values to zero.

Solution (1) will lose signal information and solution (2) will add information which is not necessarily part of the signal (i.e. discontinuities). However, solution (3) only increases the number of frequency samples computed by the FFT (interpolating the spectrum) and does not affect the integrity of the data.
Fast Given’s Rotations: See Matrix Decompositions - Square Root Free Given’s Rotations.
Filtered-U LMS: See Active Noise Cancellation.
Filtered-X LMS: See Least Mean Squares Filtered-X Algorithm.
Filters: A circuit designed to pass signals of certain frequencies and attenuate others. Filters can be analog or digital [45]. In general a filter with N poles (where N is usually the number of reactive circuit elements used, such as capacitors or inductors) will have a roll-off of 6N dB/octave or 20N dB/decade.
Although the above second order (two pole) active filter increases the final rate of roll-off, the
sharpness of the knee (at the 3dB frequency) of the filter is not improved and the further increase
in order will not produce a filter that approaches the ideal filter. Other designs, such as the
Butterworth, Chebychev and Bessel filter, produce filters that have a flatter passband characteristic
or a much sharper knee. In general, for a fixed order filter, the sharper the knee of the filter the more
variation in the gain of the passband.
A simple active filter is illustrated below.
[Figure: A simple 3rd order active filter, with input Vin, an op-amp buffer and output Vout.]
The cut-off frequency can be changed by modifying the resistor values. This filter has a roll-off of 18dB/octave, meaning that if used as an anti-alias filter cutting off at fs/2, where fs is the sampling frequency, the filter would only provide attenuation of 18 dB at fs and hence aliasing problems may occur. A popular (though not necessarily appropriate) rule of thumb is that anti-alias filters
[Figure: A first order (passive) RC filter and a second order (active) filter built from two buffered RC sections, each with f_{3dB} = 1/(2πRC). The magnitude responses are |V_{out}/V_{in}| = 1/\sqrt{1 + (f/f_{3dB})^2} for the first order circuit (roll-off 20 dB/decade) and |V_{out}/V_{in}| = 1/\sqrt{1 + 2(f/f_{3dB})^2 + (f/f_{3dB})^4} for the second order circuit (roll-off 40 dB/decade); both are plotted in dB (20log10 Vout/Vin) against log10 frequency alongside the ideal filter response.]
should provide at least the same attenuation at the sampling frequency as the dynamic range of the wordlength. For example, if using 16 bit arithmetic the dynamic range is 20 log_{10} 2^{16} ≈ 96dB, and the roll-off of the filter above the 3dB frequency should therefore be at least 96dB/octave. In designing anti-alias filters, the key requirement is limiting the significance of any aliased frequency components. Because it is the nature of lowpass filters to provide more attenuation at higher frequencies than at lower ones, the aliased components at fs/2 are usually the limiting factor. See also Active Filter, Anti-alias Filter, Bandpass Filter, Digital Filter, High Pass Filter, Low Pass Filter, Knee, Reconstruction Filter, RC Filter, Roll-off.
Bessel Filter: A filter that has a maximally flat phase response in its passband.
Butterworth Filter: This is a filter based on certain mathematical constraints and defining equations.
These filters have been used for a very long time in designing stable analog filters. In general the
Butterworth filter has a passband that is very flat, at the expense of a slow roll-off. The gain of the order n
(analog) Butterworth filter can be given as:

\frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{1 + (f/f_{3dB})^{2n}}}    (157)
Chebyshev Filter: A type of filter that has a certain amount of ripple in the passband, but has a very steep
roll-off. The gain of the order n (analog) Chebyshev filter can be given as below, where C_n is a special
polynomial and \varepsilon is a constant that determines the magnitude of the passband ripple. The spelling of
Chebyshev has many variants (such as Tschebyscheff).

\frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{1 + \varepsilon^2 C_n^2(f/f_{3dB})}}    (158)
Elliptic Filter: A type of filter that achieves the maximum possible roll-off for a particular filter order. The
phase response of an elliptic filter is extremely non-linear.
Finite Impulse Response (FIR) Filter: (See first Digital Filter). An FIR digital filter performs
a moving weighted average on an input stream of digital data to filter a signal according to some
predefined frequency criteria such as a low pass, high pass, band pass, or band-stop filter:
[Figure: Typical FIR filter gain versus frequency characteristics: Low Pass, High Pass, Band-Pass and Band-Stop. FIR filters are usually designed with software to be low pass, high pass, band pass or band-stop.]
As discussed under Digital Filter, an FIR filter is interfaced to the real world via analogue to digital
converters (ADC) and digital to analogue converters (DAC) and suitable anti-alias and
reconstruction filters. An FIR digital filter can be conveniently represented in a signal flow graph:
[Figure: The signal flow graph for an FIR digital filter: delayed inputs x(k), x(k-1), ..., x(k-N+1) are weighted by w0, w1, ..., wN-1 and summed. The last N input samples are weighted by the filter coefficients to produce the output y(k).]
The general output equation (convolution) for an FIR filter is:

y(k) = w_0 x(k) + w_1 x(k-1) + w_2 x(k-2) + w_3 x(k-3) + \cdots + w_{N-1} x(k-N+1) = \sum_{n=0}^{N-1} w_n x(k-n)    (159)
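A minimal sketch of Eq. 159 in Python (assuming samples before k = 0 are zero):

```python
import numpy as np

def fir_filter(x, w):
    """Direct form FIR filter: y(k) = sum_n w[n] * x[k-n] (Eq. 159)."""
    N = len(w)
    y = np.zeros(len(x))
    for k in range(len(x)):
        for n in range(N):
            if k - n >= 0:
                y[k] += w[n] * x[k - n]
    return y

# The same result can be obtained with the library convolution:
# y = np.convolve(x, w)[:len(x)]
```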
The term finite impulse response refers to the fact that the impulse response results in energy at
only a finite number of samples after which the output is zero. Therefore if the input sequence is a
unit impulse the FIR filter output will have a finite duration:
[Figure: A unit impulse δ(k) applied to a digital FIR filter produces the finite impulse response h(k), with samples spaced T = 1/fs secs apart. The discrete output of a finite impulse response (FIR) filter sampled at fs Hz has a finite duration in time, i.e. the output will decay to zero within a finite time.]
This can be illustrated by considering that the FIR filter is essentially a shift register which is clocked
once per sampling period. For example consider a simple 4 weight filter:
[Figure: A unit impulse propagating through a simple 4 weight FIR filter at times k = 0 to k = 5. When a unit impulse is applied to the filter, the 1 value passes through the filter "shift register", causing the weights w0, w1, w2, w3 to appear in turn at the output, i.e. the filter impulse response is output.]
As an example, a simple low pass FIR filter can be designed using the DSP design software
SystemView by Elanix, with a sampling rate of 10000 Hz, a cut off frequency of around 1000 Hz, a
stopband attenuation of about 40 dB, passband ripple of less than 1 dB and limited to 15 weights.
The resulting filter is:
[Figure: Low pass FIR filter impulse response h(n) = w_n of filter FIR1 with 15 weights, a sampling rate of 10000 Hz (T = 1/10000 secs) and cut off frequency designed at around 1000 Hz. The symmetric weights, truncated to 5 decimal places, are: w0 = w14 = -0.01813; w1 = w13 = -0.08489; w2 = w12 = -0.03210; w3 = w11 = -0.00156; w4 = w10 = 0.07258; w5 = w9 = 0.15493; w6 = w8 = 0.22140; w7 = 0.25669.]
Noting that a unit impulse contains “all frequencies”, then the magnitude frequency response and
phase response of the filter are found from the DFT (or FFT) of the filter weights:
[Figure: The linear magnitude response |H(f)| and logarithmic magnitude response 20log10|H(f)| of FIR1, obtained from a 1024 point FFT (zero padded) of the impulse response. As the sampling rate is 10000 Hz the frequency response is only plotted up to 5000 Hz. (Note that the y-axis is labelled Gain rather than Attenuation; this is because -10 dB gain is the same as 10 dB attenuation. Hence if attenuation were plotted the figures would be inverted.)]
[Figure: The unwrapped and wrapped phase responses of FIR1, generated from a 1024 point FFT of the impulse response. The filter is linear phase, and the wrapped and unwrapped phase responses are different ways of representing the same information. The "wrapped" phase response is often produced by DSP software packages and gives phase values between -π and π only, as the phase is calculated modulo 2π, i.e. a phase shift of θ is the same as a phase shift of θ + 2π and so on. Phase responses are also often plotted using degrees rather than radians.]
From the magnitude and phase response plots we can therefore calculate the attenuation and
phase shift of different input signal frequencies. For example, if a single frequency at 1500 Hz with
an amplitude of 150 is input to the above filter, then the amplitude of the output signal will be around
30, and it will be phase shifted by a little over -2π radians. However, if a single frequency of 500 Hz were input,
the output signal amplitude would be amplified by a factor of about 1.085 and phase shifted by about
-0.7π radians.
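As a short sketch of how such magnitude and phase values can be read off numerically, the FFT of the filter weights can be computed directly (here in Python, using the FIR1 weight values listed above):

```python
import numpy as np

# FIR1 weights as listed in the impulse response figure (symmetric, 15 taps).
w = np.array([-0.01813, -0.08489, -0.03210, -0.00156, 0.07258, 0.15493,
              0.22140, 0.25669, 0.22140, 0.15493, 0.07258, -0.00156,
              -0.03210, -0.08489, -0.01813])
fs = 10000.0

H = np.fft.fft(w, 1024)                  # zero padded 1024 point FFT
freqs = np.arange(1024) * fs / 1024      # frequency axis in Hz
mag_db = 20 * np.log10(np.abs(H))        # logarithmic magnitude response
phase = np.unwrap(np.angle(H))           # unwrapped phase response (radians)

k = int(round(500 / (fs / 1024)))        # bin closest to 500 Hz
print(np.abs(H[k]), phase[k])            # gain and phase shift at ~500 Hz
```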
As a more intuitive and illustrative example of filtering, consider inputting the signal x(k) below
to a suitably designed "low pass filter" to produce the output signal y(k):
[Figure: Example of an FIR filter performing low pass filtering, i.e. removing high frequencies by performing a weighted moving average with suitable low pass characteristic weights. The remaining low frequencies are phase shifted (i.e. time delayed) as a result of passing through the filter.]
So, how long is a typical FIR filter? This of course depends on the requirement of the problem being
addressed. For the generic filter characteristic shown below more weights are required if:
• A sharper transition bandwidth is required;
• More stopband attenuation is required;
• Very small passband ripple is required.
[Figure: Generic low pass filter magnitude response, showing the passband ripple, the -3 dB point, the transition band and the stopband attenuation relative to the ideal low pass filter, plotted against frequency up to fs/2. The more stringent the filter requirements of stopband attenuation, transition bandwidth and, to a lesser extent, passband ripple, the more weights that are required.]
Consider again the design of the above FIR filter (FIR1), which was a low pass filter cutting off at
about 1000 Hz. Using SystemView, the above criteria can be varied such that the number of filter
weights is increased and a more stringent filter designed. Consider the design of three low
pass filters cutting off at 1000 Hz, with stopband attenuation of 40 dB and transition bandwidths of 500
Hz, 200 Hz and 50 Hz:
[Figure: Magnitude responses (gain in dB, 0 to 5000 Hz) of low pass filters FIR1 (transition band 1000 - 1500 Hz, 29 weights), FIR2 (transition band 1000 - 1200 Hz, 69 weights) and FIR3 (transition band 1000 - 1100 Hz, 269 weights). Low pass filter design parameters: stopband attenuation = 40 dB; passband ripple = 1 dB; transition bandwidths of 500, 200 and 50 Hz. The sharper the transition band the more filter weights that are required.]
The impulse responses of FIR1, FIR2 and FIR3 are 15, 69 and 269
weights long, with group delays of 7, 34 and 134 samples respectively.
[Figure: The impulse responses of low pass filters FIR1, FIR2 and FIR3 (sample spacing 1/10000 secs), all with 40 dB stopband attenuation and 1 dB passband ripple, but transition bandwidths of 500, 200 and 50 Hz respectively. Clearly the more stringent the filter parameters, the longer the required impulse response.]
Similarly, if the stopband attenuation specification is increased, the number of filter weights required
will again increase. Consider a low pass filter with a cut off frequency again at 1000 Hz, a
transition bandwidth of 500 Hz and stopband attenuations of 40 dB, 60 dB and 80 dB:
[Figure: Magnitude responses (gain in dB, 0 to 5000 Hz) of low pass filters FIR1, FIR4 and FIR5, requiring 29, 41 and 55 weights respectively. Low pass filter design parameters: transition bandwidth = 500 Hz; passband ripple = 1 dB; stopband attenuations of 40 dB, 60 dB and 80 dB.]
[Figure: The impulse responses of low pass filters FIR1, FIR4 and FIR5 (sample spacing 1/10000 secs), all with 1 dB passband ripple and transition bandwidths of 500 Hz, but stopband attenuations of 40, 60 and 80 dB respectively. Clearly the more stringent the filter parameters, the longer the required impulse response.]
Similarly if the passband ripple parameter is reduced, then a longer impulse response will be
required. See also Adaptive Filter, Digital Filter, Low Pass Filter, High Pass Filter, Bandpass Filter,
Bandstop Filter, IIR Filter.
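The trade-off between filter specification and the number of weights can be explored numerically. The sketch below uses SciPy's Kaiser window design rule as a stand-in for the SystemView designs quoted above, so the exact weight counts will differ, but the trend (more attenuation or a narrower transition band needs more weights) is the same:

```python
import numpy as np
from scipy import signal

fs = 10000.0
cutoff_hz, trans_hz = 1000.0, 500.0
for atten_db in (40, 60, 80):
    # Kaiser-window estimate of the taps needed for this stopband attenuation
    # and transition width (width is normalised to the Nyquist frequency).
    numtaps, beta = signal.kaiserord(atten_db, trans_hz / (0.5 * fs))
    w = signal.firwin(numtaps, cutoff_hz / (0.5 * fs), window=('kaiser', beta))
    print(atten_db, 'dB stopband ->', numtaps, 'weights')
```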
Finite Impulse Response (FIR) Filter, Bit Errors: If we consider the possibility of a random
single bit error in the weights of an FIR filter, the effect on the filter magnitude and phase response
can be quite dramatic. Consider a simple 15 weight filter :
[Figure: "Correct" Filter: the impulse response h(n) (sample spacing T = 1/8000 secs) and logarithmic magnitude response 20log10|H(f)| (0 to 4000 Hz) of a fifteen weight low pass FIR filter cutting off at 800 Hz.]
The 3rd coefficient has the value -0.0725..., which in 16 bit fractional binary notation is
0.000100101001010₂. If a single bit error occurs in the 3rd bit of this binary coefficient then the value
becomes:

0.001100101001010₂ = -0.1957...
The impulse response clearly changes only "a little", whereas the effect on the frequency response
is rather more substantial, causing a loss of about 5 dB of stopband attenuation.
[Figure: Bit Error Filter: the impulse response h(n) (T = 1/8000 secs) and logarithmic magnitude response 20log10|H(f)| of the 15 weight low pass FIR filter cutting off at 800 Hz with the 3rd coefficient in error by a single bit. Note the change to the frequency response compared to the correct filter above.]
Also because the impulse response is no longer symmetric the phase response is no longer
linear:
[Figure: Phase responses of the original ("correct") filter and the bit error filter, plotted from 0 to 4000 Hz. The error in a single coefficient has caused the phase response to be no longer exactly linear.]
Of course the bit error may have occurred in the least significant bits, in which case the frequency domain
effect would be much less pronounced. However, because of the excellent reliability of DSP
processors, the occurrence of bit errors in filter coefficients is unlikely. See also Digital Filter, Finite
Impulse Response Filter.
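A hedged sketch of this experiment in Python: a 15 weight, 800 Hz low pass stand-in filter is designed with SciPy's firwin (not the filter used in the text, so exact numbers will differ), one bit of the 3rd coefficient is flipped in an assumed 16 bit fractional format, and the stopband gain is compared:

```python
import numpy as np
from scipy import signal

fs = 8000
w = signal.firwin(15, 800 / (fs / 2))          # 15 weight, 800 Hz low pass stand-in

def flip_bit(coeff, bit, wordlength=16):
    """Flip one bit of a coefficient held as a two's complement fraction.

    `bit` is counted from the binary point (bit 1 carries weight 0.5).
    This is a sketch only; the exact fractional format in the text may differ.
    """
    q = int(round(coeff * 2 ** (wordlength - 1)))   # quantise to an integer
    q ^= 1 << (wordlength - 1 - bit)                # flip the chosen bit
    return q / 2 ** (wordlength - 1)                # back to a fraction

w_err = w.copy()
w_err[2] = flip_bit(w[2], bit=3)                   # single bit error in 3rd weight

H = 20 * np.log10(np.abs(np.fft.fft(w, 1024)))
H_err = 20 * np.log10(np.abs(np.fft.fft(w_err, 1024)))
k = int(2000 / fs * 1024)                           # a stopband frequency (2000 Hz)
print(H[k], H_err[k])                               # stopband attenuation degrades
```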
Finite Impulse Response (FIR), Group Delay: See Finite Impulse Response Filter - Linear
Phase.
Finite Impulse Response Filter (FIR), Linear Phase: If the weights of an N weight real valued
FIR filter are symmetric or anti-symmetric, i.e.
w(n) = \pm w(N - 1 - n)    (160)
then the filter has linear phase. This means that all frequencies passing through the filter are
delayed by the same amount. The impulse response of a linear phase FIR filter can have either an
even or odd number of weights.
[Figure: Example linear phase FIR impulse responses w_k: a symmetric impulse response with 11 (odd number) weights; a symmetric impulse response with 8 (even number) weights; an anti-symmetric impulse response with 11 (odd number) weights; and an anti-symmetric impulse response with 8 (even number) weights. In each case the line (or location) of symmetry lies at the centre of the impulse response.]
The z-domain plane pole zero plot of a linear phase filter will always have conjugate pair zeroes,
i.e. the zeroes are symmetric about the real axis.
The desirable property of linear phase is particularly important in applications where the phase of
a signal carries important information. To illustrate the linear phase response, consider inputting a
cosine wave of frequency f, sampled at f_s samples per second (i.e. \cos(2\pi f k / f_s)) to a symmetric
impulse response FIR filter with an even number of weights N (i.e. w_n = w_{N-n} for
n = 0, 1, \ldots, N/2 - 1). For notational convenience let \omega = 2\pi f / f_s:

y(k) = \sum_{n=0}^{N-1} w_n \cos\omega(k-n)
     = \sum_{n=0}^{N/2-1} w_n \left[ \cos\omega(k-n) + \cos\omega(k-N+n) \right]
     = \sum_{n=0}^{N/2-1} 2 w_n \cos\omega(k-N/2)\,\cos\omega(n-N/2)    (161)
     = 2\cos\omega(k-N/2) \sum_{n=0}^{N/2-1} w_n \cos\omega(n-N/2)
     = M \cdot \cos\omega(k-N/2), \quad \text{where } M = \sum_{n=0}^{N/2-1} 2 w_n \cos\omega(n-N/2)
where the trigonometric identity,
cos A + cos B = 2 cos ( ( A + B ) ⁄ 2 ) cos ( ( A – B ) ⁄ 2 )
has been used. From this equation it can be seen that regardless of the input frequency, the input
cosine wave is delayed only by N/2 samples, often referred to as the group delay, and its
magnitude is scaled by the factor M. Hence the phase response of such an FIR filter is simply a linear
plot of the straight line defined by \omega N/2. Group delay is often defined as the (negative) derivative of the
phase response with respect to angular frequency. Hence, a filter that provides linear phase has a
group delay that is constant for all frequencies. An all-pass filter with constant group delay (i.e.,
linear phase) produces a pure delay for any input time waveform.
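A quick numerical check of this property (a sketch using SciPy, which is an assumed tool here; with the usual 0 to N-1 weight indexing the constant delay works out as (N-1)/2 samples):

```python
import numpy as np
from scipy import signal

w = np.array([1.0, 2.0, 4.0, 2.0, 1.0])       # symmetric impulse response
freqs, gd = signal.group_delay((w, [1.0]))     # group delay in samples
print(np.allclose(gd, (len(w) - 1) / 2))       # True: constant delay of 2 samples
```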
Linear phase FIR filters can be implemented with N ⁄ 2 multiplies and N accumulates compared
to the N MACs required by an FIR filter with a non-symmetric impulse response. This can be
illustrated by rewriting the output of a symmetric FIR filter with an even number of coefficients:
y(k) = \sum_{n=0}^{N-1} w_n x(k-n) = \sum_{n=0}^{N/2-1} w_n \left[ x(k-n) + x(k-N+n) \right]    (162)
Although the number of multiplies is halved, most DSP processors can perform a multiply-accumulate
in the same time as an addition, so there is not necessarily a computational advantage
in implementing a symmetric FIR filter on a DSP device. One drawback of linear phase
filters is of course that they always introduce a delay.
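A minimal sketch of the folded (symmetric) implementation in Python; note it pairs x(k-n) with x(k-N+1+n), which corresponds to the standard w_n = w_{N-1-n} indexing rather than the shorthand of Eq. 162:

```python
import numpy as np

def symmetric_fir(x, w_half):
    """FIR filter with a symmetric, even-length impulse response.

    `w_half` holds the first N/2 weights; the full impulse response is
    w_half followed by w_half reversed. Each output needs only N/2
    multiplies because paired input samples are added before weighting.
    """
    half = len(w_half)
    N = 2 * half
    xp = np.concatenate([np.zeros(N - 1), x])          # zero initial conditions
    y = np.zeros(len(x))
    for k in range(len(x)):
        for n in range(half):
            # pair x(k-n) with x(k-(N-1-n)), then weight the sum once
            y[k] += w_half[n] * (xp[k + N - 1 - n] + xp[k + n])
    return y
```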
Linear phase FIR filters are non-minimum phase, i.e. they will always have zeroes that are on or
outside of the unit circle. For the z-domain plane plot of the z-transform of a linear phase filter, for
all zeroes that are not on the unit circle, there will be a complex conjugate reciprocal of that zero.
For example:
[Figure: The impulse response h(n) of a simple 5 weight linear phase FIR filter and the corresponding z-domain plane plot. For the zeroes inside the unit circle at z = -0.286 ± 0.3526j there are conjugate reciprocal zeroes at z = 1/(-0.286 ∓ 0.3526j) ≈ -1.384 ± 1.727j.]
See also Digital Filter, Finite Impulse Response Filter.
Finite Impulse Response (FIR), Minimum Phase: If the zeroes of an FIR filter all lie within the
unit circle on the z-domain plane, then the filter is said to be minimum phase. One simple property
is that the inverse filter of a minimum phase FIR filter is a stable IIR filter, i.e. all of the poles lie within
the unit circle. See also Finite Impulse Response Filter.
Finite Impulse Response (FIR) Filter, Order Reversed: Consider the general finite impulse
response filter with transfer function denoted as H(z):

H(z) = a_0 + a_1 z^{-1} + \cdots + a_{N-1} z^{-(N-1)} + a_N z^{-N}    (163)

The order reversed FIR filter transfer function, H_r(z), is given by:

H_r(z) = a_N + a_{N-1} z^{-1} + \cdots + a_1 z^{-(N-1)} + a_0 z^{-N}    (164)
The respective FIR filter signal flow graphs (SFG) are simply:

[Figure: The signal flow graphs for an N+1 weight FIR filter (weights a_0, a_1, ..., a_N) and the order reversed FIR filter (weights a_N, a_{N-1}, ..., a_0). The order reversed FIR filter is the same order as the original FIR filter but with the filter weights in the opposite order.]
From the z-domain functions above it is easy to show that H_r(z) = z^{-N} H(z^{-1}). The order
reversed FIR filter has exactly the same magnitude frequency response as the original FIR filter:

\left| H_r(z) \right|_{z=e^{j\omega}} = \left| z^{-N} H(z^{-1}) \right|_{z=e^{j\omega}} = \left| e^{-j\omega N} \right| \left| H(e^{-j\omega}) \right| = \left| H(e^{-j\omega}) \right| = \left| H(e^{j\omega}) \right| = \left| H(z) \right|_{z=e^{j\omega}}    (165)

The phase responses of the two filters are however different. The difference in phase response
can be noted by considering that the zeroes of the order reversed FIR filter are the reciprocals of the
zeroes of the original FIR filter, i.e. if the zeroes of Eq. 163 are \alpha_1, \alpha_2, \ldots, \alpha_{N-1}, \alpha_N:

H(z) = (1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1}) \cdots (1 - \alpha_{N-1} z^{-1})(1 - \alpha_N z^{-1})    (166)

then the zeroes of the order reversed polynomial are \alpha_1^{-1}, \alpha_2^{-1}, \ldots, \alpha_{N-1}^{-1}, \alpha_N^{-1}, which can be seen
from:

H_r(z) = z^{-N} H(z^{-1})
       = z^{-N} (1 - \alpha_1 z)(1 - \alpha_2 z) \cdots (1 - \alpha_{N-1} z)(1 - \alpha_N z)
       = (z^{-1} - \alpha_1)(z^{-1} - \alpha_2) \cdots (z^{-1} - \alpha_{N-1})(z^{-1} - \alpha_N)    (167)
       = (-1)^N \alpha_1 \alpha_2 \cdots \alpha_{N-1} \alpha_N \, (1 - \alpha_1^{-1} z^{-1})(1 - \alpha_2^{-1} z^{-1}) \cdots (1 - \alpha_{N-1}^{-1} z^{-1})(1 - \alpha_N^{-1} z^{-1})
As examples consider the 8 weight FIR filter

H(z) = 10 + 5z^{-1} - 3z^{-2} - z^{-3} + 3z^{-4} + 2z^{-5} - z^{-6} + 0.5z^{-7}    (168)

and the corresponding order reversed FIR filter:

H_r(z) = 0.5 - z^{-1} + 2z^{-2} + 3z^{-3} - z^{-4} - 3z^{-5} + 5z^{-6} + 10z^{-7}    (169)
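The identical magnitude responses and reciprocal zeroes of these two example filters can be verified numerically, for instance with the short Python sketch below:

```python
import numpy as np

h = np.array([10, 5, -3, -1, 3, 2, -1, 0.5])   # H(z), Eq. 168
hr = h[::-1]                                    # order reversed H_r(z), Eq. 169

# Same magnitude frequency response...
H, Hr = np.fft.fft(h, 512), np.fft.fft(hr, 512)
print(np.allclose(np.abs(H), np.abs(Hr)))       # True

# ...but reciprocal zeroes (same set of values in some order).
print(np.sort_complex(1.0 / np.roots(h)))
print(np.sort_complex(np.roots(hr)))
```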
Assuming a sampling frequency of f_s = 1, the impulse responses of both filters are easily plotted:

[Figure: The impulse response h(k) of the simple FIR filter and the order reversed impulse response h_r(k).]
The corresponding magnitude and phase frequency responses of both filters are:
[Figure: Magnitude response 20log10|H(e^{jω})| (dB) and phase response of the FIR filter H(z) = 10 + 5z^{-1} - 3z^{-2} - z^{-3} + 3z^{-4} + 2z^{-5} - z^{-6} + 0.5z^{-7}, plotted against frequency from 0 to 0.5 Hz.]
[Figure: Magnitude response 20log10|H_r(e^{jω})| (dB) and (wrapped) phase response of the order reversed FIR filter H_r(z) = 0.5 - z^{-1} + 2z^{-2} + 3z^{-3} - z^{-4} - 3z^{-5} + 5z^{-6} + 10z^{-7}, plotted against frequency from 0 to 0.5 Hz.]
and the z-domain plots of both filter zeroes are:

[Figure: z-domain plot of the zeroes of the FIR filter H(z) and of the order reversed FIR filter H_r(z). For a zero \alpha = x + jy, |\alpha| = \sqrt{x^2 + y^2}, and therefore for the related order reversed filter zero at 1/\alpha:

\left|\frac{1}{\alpha}\right| = \left|\frac{1}{x + jy}\right| = \left|\frac{x - jy}{x^2 + y^2}\right| = \frac{\sqrt{x^2 + y^2}}{x^2 + y^2} = \frac{1}{\sqrt{x^2 + y^2}} = \frac{1}{|\alpha|}

For this particular example H(z) is clearly minimum phase (all zeroes inside the unit circle), and therefore H_r(z) is maximum phase (all zeroes outside of the unit circle).]
See also All-pass Filter, Digital Filter, Finite Impulse Response Filter.
Finite Impulse Response (FIR) Filter, Real Time Implementation: For each input sample, an
FIR filter requires N multiply accumulate (MAC) operations to be performed:

y(k) = \sum_{n=0}^{N-1} w_n x(k-n)    (170)

Therefore if a particular FIR filter is sampling data at fs Hz, then the number of arithmetic operations
per second is:

MACs/sec = N f_s    (171)
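A trivial sketch of Eq. 171 using values from the earlier FIR3 example:

```python
fs = 10000      # sampling rate in Hz (as in the FIR examples above)
N = 269         # number of filter weights (FIR3 above)
print(N * fs)   # 2,690,000 multiply accumulates per second
```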
Finite Impulse Response (FIR) Filter, Wordlength: For a real time implementation of a digital
filter, the wordlength used to represent the filter weights will of course have some bearing on the
achievable accuracy of the frequency response. Consider for example the design of a high pass
digital filter using 16 bit filter weights:
[Figure: Magnitude responses (gain in dB, 0 to 5000 Hz) of the filter implemented with 16 bit, 8 bit and 4 bit coefficients. Reducing the coefficient wordlength degrades the accuracy of the achievable frequency response, most noticeably in the stopband.]
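The effect can be reproduced numerically by rounding a set of designed weights to a chosen wordlength; the sketch below uses an example SciPy firwin design (an assumed stand-in for the filter in the figure):

```python
import numpy as np
from scipy import signal

w = signal.firwin(41, 1000 / 5000)                # example low pass design, fs = 10 kHz

def quantise(weights, bits):
    """Round weights to `bits`-bit two's complement fractions (a sketch)."""
    scale = 2.0 ** (bits - 1)
    return np.round(weights * scale) / scale

for bits in (16, 8, 4):
    Hq = np.abs(np.fft.fft(quantise(w, bits), 1024))
    worst = 20 * np.log10(Hq[160:512].max())       # worst stopband gain (dB)
    print(bits, 'bit coefficients: worst stopband gain', round(worst, 1), 'dB')
```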
Finite Impulse Response (FIR) Filter, Zeroes: An important way of representing an FIR digital
filter is with a z-domain plot of the filter zeroes. By writing the transfer function of an FIR filter in the
z-domain, the resulting polynomial in z can be factorised to find the roots, which are in fact the
“zeroes” of the digital filter. Consider a simple 5 weight FIR filter:

y(k) = -0.3x(k) + 0.5x(k-1) + x(k-2) + 0.5x(k-3) - 0.3x(k-4)    (172)
The signal flow graph of this filter can be represented as:
[Figure: The signal flow graph for the 5 weight FIR filter with weights -0.3, 0.5, 1, 0.5, -0.3.]
The z-domain transfer function of this polynomial is therefore:

H(z) = \frac{Y(z)}{X(z)} = -0.3 + 0.5z^{-1} + z^{-2} + 0.5z^{-3} - 0.3z^{-4}    (173)

If the z-polynomial of Eq. 173 is factorised (using DSP design software rather than with paper and
pencil!) then this gives for this example:

H(z) = -0.3\,(1 - 2.95z^{-1})(1 - (-0.811 + 0.584j)z^{-1})(1 - (-0.811 - 0.584j)z^{-1})(1 - 0.339z^{-1})    (174)

and the zeroes of the FIR filter (corresponding to the roots of the polynomial) are
z = 2.95, 0.339, -0.811 + 0.584j and -0.811 - 0.584j (note all quantities have been rounded to
3 decimal places). The corresponding SFG of the FIR filter written in the zero form of Eq. 174 is
therefore:
[Figure: The signal flow graph of four first order cascaded filters (with an overall gain of -0.3) corresponding to the same impulse response as the 5 weight filter shown above. The first order filter coefficients correspond to the zeroes of the 5 weight filter.]
The zeroes of the FIR filter can also be plotted on the z-domain plane:
[Figure: The zeroes of the FIR filter in Eq. 173 plotted on the z-domain plane. Note that some of the roots are complex. In the case of an FIR filter with real coefficients the zeroes are always symmetric about the x-axis (conjugate pairs) such that when the factorised polynomial is multiplied out there are no imaginary values.]
If all of the zeroes of the FIR filter are within the unit circle then the filter is said to be minimum
phase.
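In practice the factorisation is done numerically; for example, a short Python sketch for the 5 weight filter of Eq. 172:

```python
import numpy as np

w = np.array([-0.3, 0.5, 1.0, 0.5, -0.3])   # weights of the 5 weight filter (Eq. 172)
zeros = np.roots(w)                          # roots of the z-domain polynomial (Eq. 173)
print(np.round(zeros, 3))
# approximately 2.95, -0.811+0.584j, -0.811-0.584j and 0.339, in some order
print(np.all(np.abs(zeros) < 1.0))           # False: this filter is not minimum phase
```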
FIR Filter: See Finite Impulse Response Filter.
First Order Hold: Interpolation between discrete samples using a straight line. First order hold is
a crude form of interpolation. See also Interpolation, Step Reconstruction, Zero Order Hold.
Fixed point: Numbers are represented as integers. 16 bit fixed point can represent a range of
65536 (2^16) numbers (including zero). 24 bit fixed point, as used by some Motorola fixed point DSP
processors, can represent a range of 16777216 (2^24) numbers. See also Binary, Binary Point,
Floating Point, Two's Complement.
Fixed Point DSP: A DSP processor that can manipulate only fixed point numbers, such as the
Motorola DSP56002, the Texas Instruments TMS320C50, the AT&T DSP16, or the Analog Devices
ADSP2100. See also Floating Point DSP.
Flash Converter: A very fast (and relatively expensive) type of analog to digital converter.
Fletcher-Munson Curves: Fletcher and Munson’s 1933 paper [73] studied the definition of sound
intensity, the subjective loudness of human hearing, and associated measurements. Most notably
they produced a set of equal loudness contours which showed the variation in SPL of tones at
different frequencies that are perceived as having the same loudness. The work of Fletcher and
Munson was re-evaluated a few years later by Robinson and Dadson [126]. See also Equal
Loudness Contours, Frequency Range of Hearing, Loudness Recruitment, Sound Pressure Level,
Threshold of Hearing.
Floating Point: Numbers are represented in a floating point notation with a mantissa and an
exponent. 32 bit floating point numbers have a 24 bit mantissa and an 8 bit exponent. Motorola DSP
processors use the IEEE 754 floating point number format whereas Texas Instruments use their
own floating point number format. Both formats give a dynamic range of approximately 2^-128 to 2^128
with a resolution of 24 bits.
fs: Abbreviation for the sampling frequency (in Hz) of a DSP system.
Floating Point Arithmetic Standards: See IEEE Standard 754.
Fourier: Jean Baptiste Fourier (died 1830) made a major contribution to modern mathematics with
his work in using trigonometric functions to represent heat and diffusion equations. Fourier's work
is now collectively referred to as Fourier analysis. See also Discrete Fourier Transform, Fourier
Analysis, Fourier Series, Fourier Transform.
Fourier Analysis: The mathematical tools of the Fourier series, Fourier transform, discrete
Fourier transform, magnitude response, phase response and so on can be collectively referred to
as Fourier analysis tools. Fourier analysis is widely used in science, engineering and business
mathematics. In DSP, representing a signal in the frequency domain using Fourier techniques can
bring a number of advantages:
Physical Meaning: Many real world signals are produced as a sum of harmonic oscillations, e.g. vibrating
music strings; vibration induced from the reciprocating motion of an engine; vibration of the vocal tract and
other forms of simple harmonic motion. Hence reliable mathematical models can be produced.
Filtering: It is often useful to filter in a frequency selective manner, e.g. filter out low frequencies.
Signal Compression: If a signal is periodic over a long time, then rather than transmit the time signal, we
can transmit the frequency domain parameters (amplitude, frequencies and phase) and the signal can be
reconstructed at the other end of a communications line.
See also Discrete Fourier Transform, Fast Fourier Transform, Fourier Transform.
Fourier Series: There exists a mathematical theory, called the Fourier series, that allows any periodic
waveform in time to be decomposed into a sum of harmonically related sine and cosine waves. The
first requirement in realising the Fourier series is to calculate the fundamental period, T , which is
the shortest time over which the signal repeats, i.e. for a signal x ( t ) , then:
x ( t ) = x ( t + T ) = x ( t + 2T ) = … = x ( t + kT )
(175)
[Figure: The (fundamental) period of a signal x(t) identified as T = 1/f_0. The fundamental frequency, f_0, is calculated as f_0 = 1/T. Clearly x(t_0) = x(t_0 + T) = x(t_0 + 2T).]
For a periodic signal with fundamental period T seconds, the Fourier series represents this signal
as a sum of sine and cosine components that are harmonics of the fundamental frequency,
f 0 = 1 ⁄ T Hz. The Fourier series can be written in a number of different ways:
x(t) = \sum_{n=0}^{\infty} A_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} B_n \sin\left(\frac{2\pi n t}{T}\right)
     = A_0 + \sum_{n=1}^{\infty} \left[ A_n \cos\left(\frac{2\pi n t}{T}\right) + B_n \sin\left(\frac{2\pi n t}{T}\right) \right]
     = A_0 + \sum_{n=1}^{\infty} \left[ A_n \cos(2\pi n f_0 t) + B_n \sin(2\pi n f_0 t) \right]    (176)
     = A_0 + \sum_{n=1}^{\infty} \left[ A_n \cos(n\omega_0 t) + B_n \sin(n\omega_0 t) \right]
     = \sum_{n=0}^{\infty} \left[ A_n \cos(n\omega_0 t) + B_n \sin(n\omega_0 t) \right]
     = A_0 + A_1 \cos(\omega_0 t) + A_2 \cos(2\omega_0 t) + A_3 \cos(3\omega_0 t) + \cdots + B_1 \sin(\omega_0 t) + B_2 \sin(2\omega_0 t) + B_3 \sin(3\omega_0 t) + \cdots
where A n and B n are the amplitudes of the various cosine and sine waveforms, and angular
frequency is denoted by ω 0 = 2πf 0 radians/second.
Depending on the actual problem being solved we can choose to specify the fundamental
periodicity of the waveform in terms of the period ( T ), frequency ( f 0 ), or angular frequency (ω 0 ) as
shown in Eq. 176. Note that there is actually no requirement to specifically include a B 0 term since
sin 0 = 0 , although there is an A 0 term, since cos 0 = 1 , which represents any DC component
that may be present in the signal.
In more descriptive language the above Fourier series says that any periodic signal can be
reproduced by adding a (possibly infinite) series of harmonically related sinusoidal waveforms of
amplitudes A n or B n . Therefore if a periodic signal with a fundamental period of say 0.01 seconds
is identified, then the Fourier series will allow this waveform to be represented as a sum of various
cosine and sine waves at frequencies of 100 Hz (the fundamental frequency, f 0 ), 200Hz, 300Hz
(the harmonic frequencies 2f 0, 3f 0 ) and so on. The amplitudes of these cosine and sine waves are
given by A 0, A 1, B 1, A 2, B 2, A 3 ..... and so on.
So how are the values of A n and B n calculated? The answer can be derived by some basic
trigonometry. Taking the last line of Eq. 176, if we multiply both sides by cos ( pω 0 t ) , where p is an
arbitrary positive integer, then we get:
\cos(p\omega_0 t)\, x(t) = \cos(p\omega_0 t) \sum_{n=0}^{\infty} \left[ A_n \cos(n\omega_0 t) + B_n \sin(n\omega_0 t) \right]    (177)
[Figure: Fourier series for a periodic signal x(t): the DC term A_0 plus cosine components of amplitude A_1, A_2, A_3, ... and sine components of amplitude B_1, B_2, B_3, ... at periods T, T/2, T/3, ... sum to give x(t) = A_0 + \sum_{n=1}^{3} [A_n \cos(2\pi n t / T) + B_n \sin(2\pi n t / T)] in this illustration. If we analyse a periodic signal and realise the cosine and sine wave Fourier coefficients of appropriate amplitudes A_n and B_n, then summing these components will lead to exactly the original signal.]
If we now take the average of one fundamental period of both sides, this can be done by integrating
the functions over any one period, T :
\int_0^T \cos(p\omega_0 t)\, x(t)\, dt = \int_0^T \cos(p\omega_0 t) \left[ \sum_{n=0}^{\infty} A_n \cos(n\omega_0 t) + \sum_{n=0}^{\infty} B_n \sin(n\omega_0 t) \right] dt
     = \sum_{n=0}^{\infty} \int_0^T A_n \cos(p\omega_0 t)\cos(n\omega_0 t)\, dt + \sum_{n=0}^{\infty} \int_0^T B_n \cos(p\omega_0 t)\sin(n\omega_0 t)\, dt    (178)

Noting the zero value of the second term in the last line of Eq. 178, i.e.:
\int_0^T B_n \cos(p\omega_0 t)\sin(n\omega_0 t)\, dt = \frac{B_n}{2} \int_0^T \left[ \sin((p+n)\omega_0 t) - \sin((p-n)\omega_0 t) \right] dt    (179)
     = \frac{B_n}{2} \int_0^T \sin\frac{(p+n)2\pi t}{T}\, dt - \frac{B_n}{2} \int_0^T \sin\frac{(p-n)2\pi t}{T}\, dt
     = 0
using the trigonometric identity 2 cos A sin B = sin ( A + B ) – sin ( A – B ) and noting that the integral
over one period, T, of any harmonic of the term sin [ 2πt ⁄ T ] is zero:
[Figure: Plots of sin(2πt/T) = sin ω_0 t and sin(6πt/T) = sin 3ω_0 t over one period T. The integral over T of any sine/cosine waveform of frequency f_0 = 1/T, or harmonics thereof 2f_0, 3f_0, ..., is zero, regardless of the amplitude or phase of the signal.]
Eq. 179 is true for all values of the positive integers p and n .
For the first term in the last line of Eq. 178 the average is only zero if p ≠ n , i.e. :
\int_0^T A_n \cos(p\omega_0 t)\cos(n\omega_0 t)\, dt = \frac{A_n}{2} \int_0^T \left[ \cos((p+n)\omega_0 t) + \cos((p-n)\omega_0 t) \right] dt = 0, \quad p \ne n    (180)
this time using the trigonometric identity 2 cos A cos B = cos ( A + B ) + cos ( A – B ) .
If p = n then:
\int_0^T A_n \cos(n\omega_0 t)\cos(n\omega_0 t)\, dt = A_n \int_0^T \cos^2(n\omega_0 t)\, dt = \frac{A_n}{2} \int_0^T \left[ 1 + \cos(2n\omega_0 t) \right] dt = \frac{A_n}{2} \int_0^T 1\, dt = \frac{A_n t}{2}\Big|_0^T = \frac{A_n T}{2}    (181)
Therefore using Eqs. 179, 180, 181 in Eq. 178 we note that:
\int_0^T \cos(p\omega_0 t)\, x(t)\, dt = \frac{A_n T}{2}    (182)

and therefore:
A_n = \frac{2}{T} \int_0^T x(t) \cos(n\omega_0 t)\, dt    (183)
By premultiplying and time averaging Eq. 176 by sin(pω_0 t) and using a similar set of
simplifications to Eqs. 179, 180 and 181 we can similarly show that:
B_n = \frac{2}{T} \int_0^T x(t) \sin(n\omega_0 t)\, dt    (184)
Hence the three key equations for calculating the Fourier series of a periodic signal with
fundamental period T are:
x(t) = \sum_{n=0}^{\infty} A_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} B_n \sin\left(\frac{2\pi n t}{T}\right)

A_n = \frac{2}{T} \int_0^T x(t) \cos(n\omega_0 t)\, dt    (185)

B_n = \frac{2}{T} \int_0^T x(t) \sin(n\omega_0 t)\, dt

Fourier Series Equations
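The coefficient integrals of Eq. 185 can be approximated numerically for any sampled periodic waveform. The sketch below (Python) does this for a square wave that is 1 over the first half of a 2 second period; the exact coefficient values depend on the time origin chosen:

```python
import numpy as np

T = 2.0                                         # fundamental period in seconds
t = np.linspace(0.0, T, 10000, endpoint=False)  # samples over one period
dt = t[1] - t[0]
x = (t < T / 2).astype(float)                   # square wave, 1 for first half period
w0 = 2 * np.pi / T

for n in range(1, 6):
    An = (2 / T) * np.sum(x * np.cos(n * w0 * t)) * dt   # Eq. 185
    Bn = (2 / T) * np.sum(x * np.sin(n * w0 * t)) * dt
    print(n, round(An, 4), round(Bn, 4))
# For this waveform the odd harmonics give Bn ~= 2/(n*pi); even harmonics are ~0.
```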
See also Basis Function, Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier
Transform, Fourier, Fourier Analysis, Fourier Series - Amplitude/Phase Representation, Fourier
Series - Complex Exponential Representation, Fourier Transform, Frequency Response, Impulse
Response, Gibbs Phenomenon, Parseval’s Theorem.
Fourier Series, Amplitude/Phase Representation: It is often useful to abbreviate the notation of
the Fourier series such that the series is a sum of cosine (or sine) only terms with a phase shift. To
perform this notational simplification, first consider the simple trigonometric function:
A\cos\omega t + B\sin\omega t    (186)

where A and B are real numbers. If we introduce another variable, M, such that M = \sqrt{A^2 + B^2}, then:
A\cos\omega t + B\sin\omega t = \frac{\sqrt{A^2 + B^2}}{\sqrt{A^2 + B^2}} \left( A\cos\omega t + B\sin\omega t \right)
     = M \left( \frac{A}{\sqrt{A^2 + B^2}}\cos\omega t + \frac{B}{\sqrt{A^2 + B^2}}\sin\omega t \right)    (187)
     = M \left( \cos\theta\cos\omega t + \sin\theta\sin\omega t \right)
     = M \cos(\omega t - \theta)
     = \sqrt{A^2 + B^2}\, \cos\left(\omega t - \tan^{-1}(B/A)\right)
since θ is the angle made by a right angled triangle of hypotenuse M and sides of A and B, i.e.
tan^{-1}(B/A) = θ.
[Figure: A simple right angled triangle with arbitrary length sides A and B and hypotenuse M = \sqrt{A^2 + B^2}. The sine of the angle θ is the ratio of the opposite side over the hypotenuse, B/M; the cosine of θ is the ratio of the adjacent side over the hypotenuse, A/M; and the tangent of θ is the ratio of the opposite side over the adjacent side, B/A.]
This result shows that the sum of a sine and a cosine waveform of arbitrary amplitudes is a
sinusoidal signal of the same frequency but different amplitude and phase from the original sine and
cosine terms. Using this result of Eq. 187 to combine each sine and cosine term, we can rewrite the
Fourier series of Eq. 176 as:
x(t) = \sum_{n=0}^{\infty} A_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} B_n \sin\left(\frac{2\pi n t}{T}\right) = \sum_{n=0}^{\infty} M_n \cos(n\omega_0 t - \theta_n)    (188)

with M_n = \sqrt{A_n^2 + B_n^2} and \theta_n = \tan^{-1}(B_n / A_n),
where A n and B n are calculated as before using Eqs. 183 and 184.
[Figure: The amplitude/phase form of the Fourier series for a periodic signal x(t): the DC term A_0 plus cosine components M_n cos(2πnt/T - θ_n) at the fundamental period T and its harmonics T/2, T/3, ... sum to the original signal. Comparing this Fourier series with the A_n, B_n form shown earlier, note that the sine and cosine terms have been combined for each frequency to produce a single cosine waveform of amplitude M_n = \sqrt{A_n^2 + B_n^2} and phase \theta_n = \tan^{-1}(B_n / A_n).]
From this representation of the Fourier series, we can plot an amplitude line spectrum and a phase
spectrum:
[Figure: Amplitude and phase line spectra obtained from the Fourier series calculation, with components of the form M_n cos(2πnf_0 t - θ_n). The amplitude spectrum shows the amplitudes M_1, M_2, M_3, ... of each cosine component at 100, 200, 300, ... Hz, and the phase spectrum shows the phase shift (in degrees in this example, e.g. -30° at 100 Hz) of each cosine component. Note that the combination of the amplitude and phase spectrum completely defines the time signal.]
See also Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform - Zero
Padding, Fourier, Fourier Analysis, Fourier Series, Fourier Series - Complex Exponential
Representation, Fourier Transform, Impulse Response, Gibbs Phenomenon, Parseval’s Theorem.
Fourier Series, Complex Exponential Representation: It can be useful and instructive to
represent the Fourier series in terms of complex exponentials rather than sine and cosine
waveforms. (In the derivation presented below we will assume that the signal under analysis is real
valued, although the result extends easily to complex signals.) From Euler’s formula, note that:
e^{j\omega} = \cos\omega + j\sin\omega \quad\Rightarrow\quad \cos\omega = \frac{e^{j\omega} + e^{-j\omega}}{2} \quad\text{and}\quad \sin\omega = \frac{e^{j\omega} - e^{-j\omega}}{2j}    (189)
Substituting the complex exponential definitions for sine and cosine in Eq. 176 (defined in item
Fourier Series) and rearranging gives:
x(t) = A_0 + \sum_{n=1}^{\infty} \left[ A_n \cos(n\omega_0 t) + B_n \sin(n\omega_0 t) \right]
     = A_0 + \sum_{n=1}^{\infty} \left[ A_n \left( \frac{e^{jn\omega_0 t} + e^{-jn\omega_0 t}}{2} \right) + B_n \left( \frac{e^{jn\omega_0 t} - e^{-jn\omega_0 t}}{2j} \right) \right]    (190)
     = A_0 + \sum_{n=1}^{\infty} \left[ \left( \frac{A_n}{2} + \frac{B_n}{2j} \right) e^{jn\omega_0 t} + \left( \frac{A_n}{2} - \frac{B_n}{2j} \right) e^{-jn\omega_0 t} \right]
     = A_0 + \sum_{n=1}^{\infty} \left( \frac{A_n - jB_n}{2} \right) e^{jn\omega_0 t} + \sum_{n=1}^{\infty} \left( \frac{A_n + jB_n}{2} \right) e^{-jn\omega_0 t}
For the second summation term, if the sign of the complex sinusoid is negated and the summation
limits are reversed, then we can rewrite as:
x(t) = A_0 + \sum_{n=1}^{\infty} \left( \frac{A_n - jB_n}{2} \right) e^{jn\omega_0 t} + \sum_{n=-\infty}^{-1} \left( \frac{A_n + jB_n}{2} \right) e^{jn\omega_0 t}
     = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t}    (191)
Writing C_n in terms of the Fourier series coefficients of Eqs. 183 and 184 gives:

C_0 = A_0
C_n = (A_n - jB_n)/2 \quad\text{for } n > 0    (192)
C_n = (A_n + jB_n)/2 \quad\text{for } n < 0

From Eq. 192, note that for n ≥ 0:
C_n = \frac{A_n - jB_n}{2} = \frac{1}{T} \int_0^T x(t)\cos(n\omega_0 t)\, dt - j\,\frac{1}{T} \int_0^T x(t)\sin(n\omega_0 t)\, dt
    = \frac{1}{T} \int_0^T x(t) \left[ \cos(n\omega_0 t) - j\sin(n\omega_0 t) \right] dt    (193)
    = \frac{1}{T} \int_0^T x(t)\, e^{-jn\omega_0 t}\, dt
For n < 0 it is clear from Eq. 192 that C_n = C_{-n}^*, where "*" denotes complex conjugate. Therefore
we have now defined the Fourier series of a real valued signal using a complex analysis and a
synthesis equation:
x(t) = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t}    Synthesis

C_n = \frac{1}{T} \int_0^T x(t)\, e^{-jn\omega_0 t}\, dt    Analysis    (194)

Complex Fourier Series Equations
The complex Fourier series also introduces the concept of "negative frequencies", whereby we view
signals of the form e^{j2\pi f_0 t} as a positive complex sinusoid of frequency f_0 Hz, and signals of the form
e^{-j2\pi f_0 t} as a complex sinusoid of frequency -f_0 Hz.
Note that the complex Fourier series is more notationally compact, and probably simpler to work
with than the general Fourier series. (The “probably” depends on how clear you are in dealing with
complex exponentials!) Also if the signal being analysed is in fact complex the general Fourier
series of Eq. 176 (see Fourier Series) is insufficient but Eqs. 194 can be used. (For complex signals
the coefficient relationship in Eq. 192 will not in general hold.)
Assuming the waveform being analysed is real (usually the case), then it is easy to convert C_n
coefficients into A_n and B_n. Also note from Eq. 188 (see Fourier Series - Amplitude/Phase Representation) and Eq. 192 that:
M_n = \sqrt{A_n^2 + B_n^2} = 2|C_n|    (195)

noting that |C_n| = \sqrt{A_n^2 + B_n^2}\,/\,2. Clearly we can also note that for the complex number C_n:

\angle C_n = \tan^{-1}\frac{B}{A} = \theta_n, \quad\text{i.e. } C_n = |C_n|\, e^{j\theta_n}    (196)
Therefore, although a complex exponential does not as such exist as a real world (single wire
voltage) signal, we can easily convert from a complex exponential to a real world sinusoid simply
by taking the real or imaginary parts of the complex Fourier coefficients and using them in the Fourier series
equation (see Eq. 176, Fourier Series):
x(t) = \sum_{n=0}^{\infty} \left[ A_n \cos(n\omega_0 t) + B_n \sin(n\omega_0 t) \right]    (197)
There are of course certain time domain signals which can be considered as being complex, i.e.
having a separate real and imaginary components. This type of signal can be found in some digital
communication systems or may be created within a DSP system to allow certain types of
computation to be performed.
If a signal is decomposed into its complex Fourier series, the resulting values for the various
components can be plotted as a line spectrum. As we now have both complex and real values and
positive and negative frequencies, this will require two plots, one for the imaginary components and
one for the real components:
[Figure: The complex Fourier series line spectra, plotted as a real valued line spectrum (A_n) and an imaginary valued line spectrum (B_n) against both positive and negative frequencies. For the complex Fourier series of a real valued signal the real line spectrum is symmetrical about f = 0 and the imaginary spectrum has point symmetry about the origin.]
Rather than showing the real and imaginary line spectra, it is more usual to plot the magnitude
spectrum and phase spectrum:
[Figure: Calculating the magnitude spectrum |A_n + jB_n| = \sqrt{A_n^2 + B_n^2} and phase spectrum \tan^{-1}(B_n/A_n) from the complex Fourier series. For a real valued signal, the result will be identical, except for a magnitude scaling factor of 2, to that obtained from the amplitude/phase form of the Fourier series given earlier. As the negative frequency values of a real signal carry no additional information, they are not plotted.]
The “ease” of working with complex exponentials over sines and cosines can be illustrated by
asking the reader to simplify the following equation to a sum of sine waves:
\sin(\omega_1 t)\,\sin(\omega_2 t)    (198)
This requires the recollection (or re-derivation!) of trigonometric identities to yield:

\sin(\omega_1 t)\,\sin(\omega_2 t) = \frac{1}{2}\cos(\omega_1 - \omega_2)t - \frac{1}{2}\cos(\omega_1 + \omega_2)t    (199)
While not particularly arduous, it is somewhat easier to simplify the following expression to a sum
of complex exponentials:
e^{j\omega_1 t}\, e^{j\omega_2 t} = e^{j(\omega_1 + \omega_2)t}    (200)
Although a seemingly simple comment, this is the basis of using complex exponentials rather than
sines and cosines; they make the maths easier. Of course in situations where the signal being
analysed is complex, then the complex exponential Fourier series must be used.
See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Decimation-in-Time, Fourier, Fourier Analysis, Fourier Series, Fourier Series - Amplitude/Phase
Representation, Fourier Transform, Frequency Response, Impulse Response, Gibbs
Phenomenon, Parseval’s Theorem.
Fourier Transform: The Fourier series (rather than transform) allows a periodic signal to be
broken down into a sum of real valued sine and cosine waves (in the case of a real valued signal)
or more generally a sum of complex exponentials. However most signals are aperiodic, i.e. not
165
periodic. Therefore the Fourier transform was derived in order to analyse the frequency content of
an aperiodic signal.
Consider the complex Fourier series of a periodic signal:
x(t) = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t}, \qquad C_n = \frac{1}{T} \int_0^T x(t)\, e^{-jn\omega_0 t}\, dt    (201)
[Figure: A periodic signal x(t) with period T = 1/f_0. The fundamental frequency, f_0, is calculated simply as f_0 = 1/T. Clearly x(t_0) = x(t_0 + T) = x(t_0 + 2T).]
The period of the signal has been identified as T and the fundamental frequency is f 0 = 1 ⁄ T .
Therefore the Fourier series harmonics occur at frequencies f 0, 2f 0, 3f 0, … .
[Figure: A periodic square wave time signal (amplitude 1 V) and the magnitude response of its Fourier series, with C_0 = 0.5. The phase response is zero for all components. The fundamental period is T = 2 and therefore the fundamental frequency is f_0 = 1/2 = 0.5 Hz, so the harmonics are 0.5 Hz apart when the Fourier series is calculated.]
For the above square wave we can calculate the Fourier series using Eq. 201 as:
C_0 = \frac{1}{T} \int_0^T s(t)\, dt = \frac{1}{2} \int_0^1 1\, dt = \frac{1}{2}    (202)

C_n = \frac{1}{T} \int_0^T s(t)\, e^{-j\omega_0 n t}\, dt = \frac{1}{2} \int_0^1 e^{-j\pi n t}\, dt = \frac{e^{-j\pi n} - 1}{-2j\pi n}    (203)
    = \left( \frac{e^{j\pi n/2} - e^{-j\pi n/2}}{2j\pi n} \right) e^{-j\pi n/2} = \frac{\sin(\pi n/2)}{\pi n}\, e^{-j\pi n/2}

recalling that \sin x = (e^{jx} - e^{-jx})/2j.
Noting that e^{-j\pi n/2} = \cos(\pi n/2) - j\sin(\pi n/2) takes only the values \pm 1 and \mp j (depending on the value of n), and recalling
from Eqs. 192 and 193 that C_n = (A_n - jB_n)/2, the square wave can be
decomposed into a sum of harmonically related sinusoids with component magnitudes:

|C_0| = 1/2    (204)
|C_n| = 1/(n\pi) for odd n, and 0 for even n

The magnitude response of this Fourier series is plotted above.
Now consider the case where the signal is aperiodic, and is in fact just a single pulse:
[Figure: A single aperiodic pulse (amplitude 1 V, 1 second long). This signal is most definitely not periodic and therefore the Fourier series cannot be calculated.]
One way to obtain “some” information on the sinusoidal components comprising this aperiodic
signal would be to assume the existence of a periodic “relative” or “pseudo-period” of this signal:
[Figure: The single pulse repeated with a "pseudo-period" of T_p = 4 s, and the magnitude response of the resulting Fourier series. This periodic signal is clearly a relative of the single pulse aperiodic signal: by adding the pseudo-periods we essentially assume that the single pulse of interest is a periodic signal and therefore we can now use the Fourier series tools to analyse it. The fundamental period is T_p = 4 and therefore the harmonics of the Fourier series are placed f_0 = 0.25 Hz apart.]
If we assumed that the “periodicity” of the pulse was even longer, say 8 seconds, then the spacing
between the signal harmonics would further decrease:
[Figure: The single pulse repeated with a pseudo-period of T_p = 8 s, and the magnitude response of the resulting Fourier series. Increasing the fundamental pseudo-period to T_p = 8 places the harmonics of the Fourier series more closely, at f_0 = 1/8 = 0.125 Hz apart. The magnitude of all the harmonics decreases proportionally with the increase in the pseudo-period, as expected since the average power of the signal decreases as the pseudo-period increases.]
If we further assumed that the period of the signal was such that T → ∞ then f_0 → 0 and, given the
finite energy in the signal, the magnitude of each of the Fourier series sine waves will tend to zero
given that the harmonics are now so closely spaced! Hence if we multiply the magnitude response
by T and plot the Fourier series we have now realised a graphical interpretation of the Fourier
transform:
[Figure: The single pulse with the pseudo-period allowed to tend to infinity, T → ∞. The frequency spacing between the harmonics of the Fourier series tends to zero, i.e. f_0 → 0, and the magnitudes of the Fourier series components, scaled down in proportion to the "pseudo" period, tend to zero in the limit as T → ∞; hence the y-axis is plotted in units of 1/T.]
To realise the mathematical version of the Fourier transform first define a new function based on
the general Fourier series of Eq. 201 such that:
X(f) = \frac{C_n}{f_0} = C_n T    (205)
then:
x(t) = \sum_{n=-\infty}^{\infty} C_n e^{j2\pi n f_0 t}

X(f) = \int_{-T/2}^{T/2} x(t)\, e^{-j2\pi n f_0 t}\, dt = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\, dt    (206)
where nf_0 becomes the continuous variable f as f_0 → 0 and n → ∞. This equation is referred to as
the Fourier transform and can of course be written in terms of the angular frequency:
X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt    (207)
Knowing the Fourier transform of a signal, of course allows us to transform back to the original
aperiodic signal:
x(t) = \sum_{n=-\infty}^{\infty} C_n e^{j2\pi n f_0 t} = \sum_{n=-\infty}^{\infty} X(f)\, f_0\, e^{j2\pi n f_0 t} = \left( \sum_{n=-\infty}^{\infty} X(f)\, e^{j2\pi n f_0 t} \right) f_0
\quad\Rightarrow\quad x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\, df    (208)
This equation is referred to as the inverse Fourier transform and can also be written in terms of the
angular frequency:
x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\omega)\, e^{j\omega t}\, d\omega    (209)
Hence we have realised the Fourier transform analysis and synthesis pair of equations:
x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\, df    Synthesis

X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\, dt    Analysis    (210)

Fourier Transform Pair
Therefore the Fourier transform of a continuous time signal, x ( t ) , will be a continuous function in
frequency.
See also Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform, Fourier
Analysis, Fourier Series, Fourier Series - Complex Exponential Representation, Fourier Transform.
Forward Substitution: See Matrix Algorithms - Forward Substitution.
Fractals: Fractals can be used to define seemingly irregular 1-D signals or 2-D surfaces using,
amongst other things, properties of self similarity. Self similarity occurs when the same pattern
repeats itself at different scalings, and is often seen in nature. A good introduction and overview of
fractals can be found in [86].
Fractional Binary: See Binary Point.
Fractional Bandwidth: A definition of (relative) bandwidth for a signal obtained by dividing the
difference of the highest and lowest frequencies of the signal by its center frequency. The result is
a number between 0 and 2. When this number is multiplied by 100, the relative bandwidth can be
stated in terms of percentage. See also Bandwidth.
Fractional Delay Implementation: See All-pass Filter - Fractional Sample Delay Implementation.
Fractional Sampling Rate Conversion: Sometimes sampling rate conversions are needed
between sampling rates that are not integer multiples of each other and therefore simple integer
downsampling or upsampling cannot be performed. One method of changing sampling rate is to
convert a signal back to its analog form using a DAC, then resample the signal using an ADC
sampling at the required frequency. In general this is not acceptable solution as two levels of noise
are introduced by the DAC and ADC Interpolation by a factor of N, followed by decimation by a
factor of M results in a sampling rate change of N/M. The higher the values of N and M, the more
computation that is required. For example to convert from CD sampling rates of 44100Hz to DAT
sampling rate of 48000Hz requires upsampling by a factor of 160, and downsampling by a factor of
147. When performing fractional sampling rate conversion the low pass anti-alias filter associated
with decimation, and the low pass filter used in interpolation can be combined into one digital filter.
See also Upsampling, Downsampling, Decimation, Interpolation.
[Figure: Fractional sampling rate conversion by N/M: the input at fs Hz is upsampled by N, passed through a single combined low pass filter (cut-off fs/2 max(N,M)), and downsampled by M to give an output sampling rate of (N/M)fs.]
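A brief sketch of the CD to DAT example using SciPy (an assumed tool here), which combines the upsampling, low pass filtering and downsampling into one polyphase operation:

```python
import numpy as np
from scipy import signal

fs_in, fs_out = 44100, 48000          # CD and DAT rates from the example above
up, down = 160, 147                    # 48000/44100 = 160/147

t = np.arange(fs_in) / fs_in           # one second of a 1 kHz test tone
x = np.sin(2 * np.pi * 1000 * t)

y = signal.resample_poly(x, up, down)  # upsample by 160, filter, downsample by 147
print(len(x), '->', len(y))            # 44100 -> 48000 samples
```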
Frequency: Frequency is measured in Hertz (Hz) and gives a measure of the number of cycles
per second of a signal. For example if a sine wave has a frequency of 300Hz, this means that the
signal has 300 single wavelength cycles in one second. Square waves also can be assigned a
frequency that is defined as 1/T where T is the period of one cycle of the square wave. See also
Sine Wave.
Frequency Domain Adaptive Filtering: The LMS (and other adaptive algorithms) can be
configured to operate on time series data that has been transformed into the frequency domain [53],
[131].
Frequency, Logarithmic: See Logarithmic Frequency.
Frequency Modulation: One of the three ways of modulating a sine wave signal to carry
information. The sine wave or carrier has its frequency changed in accordance with the information
signal to be transmitted. See also Amplitude Modulation, Phase Modulation.
Frequency Range of Hearing: The frequency range of hearing typically goes from around 20Hz
to up to 20kHz in healthy young people. For adults the upper range of hearing is more likely to be
in the range 11-16kHz as age erodes the high frequency sensitivity. The threshold of hearing varies
over the frequency range, with the most sensitive portion being from around 1-5kHz, where speech
frequencies occur. Low frequencies, below 20Hz, are tactile and only audible at very high sound
pressure levels. Also listening to frequencies below 20Hz does not produce any further perception
of reducing pitch. Inaudible sound below the lowest perceptible frequency is termed infrasound, and
above the highest perceptible frequency, is known as ultrasound.
Discrimination between tones at similar frequencies (the JND - just noticeable difference, or DL - Difference Limen) depends on a number of factors such as the frequency, sound pressure level
(SPL), and sound duration. The ear can discriminate by about 1 Hz for frequencies in the range 1-2 kHz where the SPL is about 20 dB above the threshold of hearing, and the duration is at least 1/4
Difference Limen, Ear, Equal Loudness Contours, Hearing Aids, Hearing Impairment, Hearing
Level, Infrasound, Sensation Level, Sound Pressure Level, Spectral Masking, Temporal Masking,
Threshold of Hearing, Ultrasound.
Frequency Response: The frequency response of a system defines how the magnitude and phase
of signal components at different frequencies will be changed as the signal passes through, or is
convolved with, a linear system. For example the frequency response of a digital filter may attenuate
low frequency magnitudes, but amplify those at high frequencies. The frequency response of a
linear system is calculated by taking the discrete Fourier transform (DFT) of the impulse response
or by evaluating the z-transform of the linear system at z = e^{j\omega} = e^{j2\pi f}. See also Discrete Fourier
Transform, Fast Fourier Transform.
[Figure: The magnitude frequency response |H(k)| of a digital filter is calculated from the DFT of its impulse response h(n): H(k) = \sum_{n=0}^{N-1} h(n)\, e^{-j(2\pi n k / N)}, plotted against frequency index k.]
Frequency Shift Keying (FSK): A digital modulation technique in which the information bits are
encoded in the frequency of a symbol. Typically, the frequencies are chosen so that the symbols
are orthogonal over the symbol period. FSK demodulation can be either coherent (phase of carrier
signal known) or noncoherent (phase of carrier signal unknown). Given a symbol period of T
seconds, signals separated in frequency by 1/T Hz will be orthogonal and will have continuous
phase. Signals separated by 1/(2T) Hz will be orthogonal (if demodulated coherently) but will result
in phase discontinuities. See also Amplitude Shift Keying, Continuous Phase Modulation, Minimum
Shift Keying, Phase Shift Keying.
Frequency Transformation: The transformation of any time domain signal into the frequency
domain.
Frequency Weighting Curves: See Sound Pressure Level Weighting Curves.
Frobenius Norm: See Matrix Properties - Norm.
Formants: The vocal tract (comprising throat, mouth and lips) can act as an acoustic resonator
with more than one resonant frequency. These resonant frequencies are known as formants, and
they change in frequency as we move the tongue and lips in the process of joining speech sounds
together (articulation).
Four Wire Circuit: A circuit containing two pairs of wires (or their logical equivalent) for
simultaneous (full duplex) two way transmission. See also Two Wire Channel, Full Duplex, Half Duplex,
Simplex.
Fricatives: One of the elementary classes of speech sounds, the others being plosives, sibilant
fricatives, semi-vowels, and nasals. Fricatives are formed by the lower lip and teeth with air passing through,
as when "f" is used in the word "fin". See also Nasals, Plosives, Semi-vowels, and Sibilant Fricatives.
Full Adder: The full adder is the basic single bit arithmetic building block for design of multibit
binary adders, multipliers and arithmetic logic units. The full adder has three single bit inputs and
two single bit outputs:
a
b
cin
cout
sout
0
0
0
0
0
0
0
1
0
1
0
1
0
0
1
0
1
1
1
0
1
0
0
0
1
1
0
1
1
0
1
1
0
1
0
1
1
1
1
1
Truth Table
c out = abc + abc + abc + abc = ab + bc + ac
s out = abc + abc + abc + abc = ( a ⊕ b ) ⊕ c
[Figure: the full adder shown as Boolean algebra equations, as a logic circuit realising c_out and s_out, and as a circuit symbol (FA) with inputs a, b, cin and outputs sout, cout.]
Boolean Algebra: (a+b) represents (a OR b); (ab) represents (a AND b); a ⊕ b represents (a Exclusive-OR b); a' represents (NOT a). The full adder (FA) simply adds three bits (0 or 1) together to produce a sum bit, s_out, and a carry bit, c_out.
See also Arithmetic Logic Unit, Parallel Adder, Parallel Multiplier, DSP Processor.
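As a quick check (a sketch, not from the original text), the Boolean expressions above can be evaluated for all eight input combinations to reproduce the truth table:

    # Enumerate all input combinations of a full adder and verify the
    # expressions c_out = ab + b*cin + a*cin and s_out = (a XOR b) XOR cin.
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                c_out = (a & b) | (b & cin) | (a & cin)
                s_out = (a ^ b) ^ cin
                # Each printed line reproduces one row of the truth table
                print(a, b, cin, "->", c_out, s_out)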
Full Duplex: Pertaining to the capability to send and receive simultaneously. See also Half
Duplex, Simplex.
Fundamental Frequency: The lowest (and usually dominant) frequency component of a signal, with which various harmonics (integer multiples of the fundamental frequency) are associated. In music, for example, the fundamental frequency identifies the note being played, and the various harmonics (and occasionally sub-harmonics) give the note the rich characteristic quality of the instrument being played. See also Fourier Series, Harmonics, Music, Sub-Harmonic, Western Music Scale.
Fundamental Period: See Fourier Series.
Fuzzy Logic: A mathematical set theory which allows systems to be described in natural language rules. Binary logic, for example, uses only two levels: 0 and 1. Fuzzy logic would still have the levels 0 and 1, but it would also be capable of describing all logic levels in between, perhaps ranging through: almost definitely low, probably low, maybe high or low, probably high, to almost definitely high. Control systems defined by fuzzy logic are currently being implemented in conjunction with
DSP algorithms. Essentially fuzzy logic is a technique for representing information and combining
objective knowledge (such as mathematical models and precise definitions) with subjective
knowledge (a linguistic description of a problem). One advantage often cited about fuzzy systems
is that they can produce results almost as good as an “optimum” system, but they are much simpler
to implement. A good introduction, with tutorial papers, can be found in [63].
G
G-Series Recommendations: The G-series recommendations from the International Telecommunication Union (ITU) telecommunications committee (denoted ITU-T, and formerly known as CCITT) propose a number of standards for transmission systems and media, digital systems and networks. From a DSP perspective, G.164, G.165, G.166 and G.167 define aspects of echo and acoustic echo cancellation, and some of the G.7xx recommendations define various coding and compression schemes which underpin digital audio telecommunication. The ITU-T G-series recommendations (http://www.itu.ch) can be summarised as:
G.100
G.101
G.102
G.103
G.105
G.111
G.113
G.114
G.117
G.120
G.121
G.122
G.123
G.125
G.126
G.132
G.133
G.134
G.135
G.141
G.142
G.143
G.151
G.152
G.153
G.162
G.164
G.165
G.166
G.167
G.172
G.173
G.174
G.180
G.181
G.191
G.211
G.212
G.213
G.214
Definitions used in Recommendations on general characteristics of international telephone
connections and circuits.
The transmission plan.
Transmission performance objectives and Recommendations.
Hypothetical reference connections.
Hypothetical reference connection for crosstalk studies.
Loudness ratings (LRs) in an international connection.
Transmission impairments.
One-way transmission time.
Transmission aspects of unbalance about earth (definitions and methods).
Transmission characteristics of national networks.
Loudness ratings (LRs) of national systems.
Influence of national systems on stability and talker echo in international connections.
Circuit noise in national networks.
Characteristics of national circuits on carrier systems.
Listener echo in telephone networks.
Attenuation distortion.
Group-delay distortion.
Linear crosstalk.
Error on the reconstituted frequency.
Attenuation distortion.
Transmission characteristics of exchanges.
Circuit noise and the use of Companders.
General performance objectives applicable to all modern international circuits and national extension
circuits.
Characteristics appropriate to long-distance circuits of a length not exceeding 2500 km.
Characteristics appropriate to international circuits more than 2500 km in length.
Characteristics of Companders for telephony.
Echo suppressors.
Echo cancellers.
Characteristics of syllabic Companders for telephony on high capacity long distance systems.
Acoustic echo controllers.
Transmission plan aspects of international conference calls.
Transmission planning aspects of the speech service in digital public land mobile networks.
Transmission performance objectives for terrestrial digital wireless systems using portable terminals to
access the PSTN.
Characteristics of N + M type direct transmission restoration systems for use on digital and analogue
sections, links or equipment.
Characteristics of 1 + 1 type restoration systems for use on digital transmission links.
Software tools for speech and audio coding standardization.
Make-up of a carrier link.
Hypothetical reference circuits for analogue systems.
Interconnection of systems in a main repeater station.
Line stability of cable systems.
G.215
G.221
G.222
G.223
G.224
G.225
G.226
G.227
G.228
G.229
G.230
G.231
G.232
G.233
G.241
G.242
G.243
G.322
G.325
G.332
G.333
G.334
G.341
G.343
G.344
G.345
G.346
G.352
G.411
G.421
G.422
G.423
G.431
G.441
G.442
G.451
G.473
G.601
G.602
G.611
G.612
G.613
G.614
G.621
G.622
G.623
G.631
G.650
G.651
Hypothetical reference circuit of 5000 km for analogue systems.
Overall recommendations relating to carrier-transmission systems.
Noise objectives for design of carrier-transmission systems of 2500 km.
Assumptions for the calculation of noise on hypothetical reference circuits for telephony.
Maximum permissible value for the absolute power level (power referred to one milliwatt) of a signalling
pulse.
Recommendations relating to the accuracy of carrier frequencies.
Noise on a real link.
Conventional telephone signal.
Measurement of circuit noise in cable systems using a uniform-spectrum random noise loading.
Unwanted modulation and phase jitter.
Measuring methods for noise produced by modulating equipment and through-connection filters.
Arrangement of carrier equipment.
12-channel terminal equipments.
Recommendations concerning translating equipments.
Pilots on groups, supergroups, etc.
Through-connection of groups, supergroups, etc.
Protection of pilots and additional measuring frequencies at points where there is a through-connection.
General characteristics recommended for systems on symmetric pair cables.
General characteristics recommended for systems providing 12 telephone carrier circuits on a
symmetric cable pair [(12+12) systems].
12 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs.
60 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs.
18 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs.
1.3 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
4 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
6 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
12 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
18 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
Interconnection of coaxial carrier systems of different designs.
Use of radio-relay systems for international telephone circuits.
Methods of interconnection.
Interconnection at audio-frequencies.
Interconnection at the baseband frequencies of frequency-division multiplex radio-relay systems.
Hypothetical reference circuits for frequency-division multiplex radio-relay systems.
Permissible circuit noise on frequency-division multiplex radio-relay systems.
Radio-relay system design objectives for noise at the far end of a hypothetical reference circuit with
reference to telegraphy transmission.
Use of radio links in international telephone circuits.
Interconnection of a maritime mobile satellite system with the international automatic switched
telephone service; transmission aspects.
Terminology for cables.
Reliability and availability of analogue cable transmission systems and associated equipments (10)
Characteristics of symmetric cable pairs for analogue transmission.
Characteristics of symmetric cable pairs designed for the transmission of systems with bit rates of the
order of 6 to 34 Mbit/s.
Characteristics of symmetric cable pairs usable wholly for the transmission of digital systems with a bit
rate of up to 2 Mbit/s.
Characteristics of symmetric pair star-quad cables designed earlier for analogue transmission systems
and being used now for digital system transmission at bit rates of 6 to 34 Mbit/s.
Characteristics of 0.7/2.9 mm coaxial cable pairs.
Characteristics of 1.2/4.4 mm coaxial cable pairs.
Characteristics of 2.6/9.5 mm coaxial cable pairs.
Types of submarine cable to be used for systems with line frequencies of less than about 45 MHz.
Definition and test methods for the relevant parameters of single-mode fibres.
Characteristics of a 50/125 µm multimode graded index optical fibre cable.
G.652
G.653
G.654
G.661
G.662
G.701
G.702
G.703
G.704
G.705
G.706
G.707
G.708
G.709
G.711
G.712
G.720
G.722
G.724
G.725
G.726
G.727
G.728
G.731
G.732
G.733
G.734
G.735
G.736
G.737
G.738
G.739
G.741
G.742
G.743
G.744
G.745
G.746
G.747
G.751
G.752
G.753
G.754
Characteristics of a single-mode optical fibre cable.
Characteristics of a dispersion-shifted single-mode optical fibre cable.
Characteristics of a 1550 nm wavelength loss-minimized single-mode optical fibre cable.
Definition and test methods for relevant generic parameters of optical fibre amplifiers.
Generic characteristics of optical fibre amplifier devices and sub-systems.
Vocabulary of digital transmission and multiplexing, and pulse code modulation (PCM) terms.
Digital hierarchy bit rates.
Physical/electrical characteristics of hierarchical digital interfaces.
Synchronous frame structures used at primary and secondary hierarchical levels.
Characteristics required to terminate digital links on a digital exchange.
Frame alignment and cyclic redundancy check (CRC) procedures relating to basic frame structures
defined in Recommendation G.704.
Synchronous digital hierarchy bit rates.
Network node interface for the synchronous digital hierarchy.
Synchronous multiplexing structure.
Pulse code modulation (PCM) of voice frequencies.
Transmission performance characteristics of pulse code modulation.
Characterization of low-rate digital voice coder performance with non-voice signals.
7 kHz audio-coding within 64 kbit/s; Annex A: Testing signal-to-total distortion ratio for 7 kHz audio codecs at 64 kbit/s.
Characteristics of a 48-channel low bit rate encoding primary multiplex operating at 1544 kbit/s.
System aspects for the use of the 7 kHz audio codec within 64 kbit/s.
40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM). Annex A: Extensions of
Recommendation G.726 for use with uniform-quantized input and output.
5-, 4-, 3- and 2-bits sample embedded adaptive differential pulse code modulation (ADPCM).
Coding of speech at 16 kbit/s using low-delay code excited linear prediction. Annex G to Coding of
speech at 16 kbit/s using low-delay code excited linear prediction: 16 kbit/s fixed point specification.
Primary PCM multiplex equipment for voice frequencies.
Characteristics of primary PCM multiplex equipment operating at 2048 kbit/s.
Characteristics of primary PCM multiplex equipment operating at 1544 kbit/s.
Characteristics of synchronous digital multiplex equipment operating at 1544 kbit/s.
Characteristics of primary PCM multiplex equipment operating at 2048 kbit/s and offering synchronous
digital access at 384 kbit/s and/or 64 kbit/s.
Characteristics of a synchronous digital multiplex equipment operating at 2048 kbit/s.
Characteristics of an external access equipment operating at 2048 kbit/s offering synchronous digital
access at 384 kbit/s and/or 64 kbit/s.
Characteristics of primary PCM multiplex equipment operating at 2048 kbit/s and offering synchronous
digital access at 320 kbit/s and/or 64 kbit/s.
Characteristics of an external access equipment operating at 2048 kbit/s offering synchronous digital
access at 320 kbit/s and/or 64 kbit/s.
General considerations on second order multiplex equipments.
Second order digital multiplex equipment operating at 8448 kbit/s and using positive justification.
Second order digital multiplex equipment operating at 6312 kbit/s and using positive justification.
Second order PCM multiplex equipment operating at 8448 kbit/s.
Second order digital multiplex equipment operating at 8448 kbit/s and using positive/zero/negative
justification.
Characteristics of second order PCM multiplex equipment operating at 6312 kbit/s.
Second order digital multiplex equipment operating at 6312 kbit/s and multiplexing three tributaries at
2048 kbit/s.
Digital multiplex equipments operating at the third order bit rate of 34368 kbit/s and the fourth order bit
rate of 139264 kbit/s and using positive justification.
Characteristics of digital multiplex equipments based on a second order bit rate of 6312 kbit/s and
using positive justification.
Third order digital multiplex equipment operating at 34368 kbit/s and using positive/zero/negative
justification.
Fourth order digital multiplex equipment operating at 139264 kbit/s and using positive/zero/negative
justification.
G.755
G.761
G.762
G.763
G.764
G.765
G.766
G.772
G.773
G.774
G.775
G.780
G.781
G.782
G.783
G.784
G.791
G.792
G.793
G.794
G.795
G.796
G.797
G.801
G.802
G.803
G.804
G.821
G.822
G.823
G.824
G.825
G.826
G.831
G.832
G.901
G.911
G.921
G.931
G.950
G.951
G.952
G.953
G.954
G.955
G.957
G.958
G.960
G.961
G.962
Digital multiplex equipment operating at 139264 kbit/s and multiplexing three tributaries at 44736 kbit/s.
General characteristics of a 60-channel transcoder equipment.
General characteristics of a 48-channel transcoder equipment.
Summary of Recommendation G.763.
Voice packetization - packetized voice protocols.
Packet circuit multiplication equipment.
Facsimile demodulation/remodulation for DCME.
Protected monitoring points provided on digital transmission systems.
Protocol suites for Q-interfaces for management of transmission systems.
Synchronous Digital Hierarchy (SDH) management information model for the network element view.
G.774.01: Synchronous digital hierarchy (SDH) performance monitoring for the network element view.
G.774.02: Synchronous digital hierarchy (SDH) configuration of the payload structure for the network
element view. G.774.03: Synchronous digital hierarchy (SDH) management of multiplex-section
protection for the network element view.
Loss of signal (LOS) and alarm indication signal (AIS) defect detection and clearance criteria.
Vocabulary of terms for synchronous digital hierarchy (SDH) networks and equipment.
Structure of Recommendations on equipment for the synchronous digital hierarchy (SDH).
Types and general characteristics of synchronous digital hierarchy (SDH) equipment.
Characteristics of synchronous digital hierarchy (SDH) equipment functional blocks.
Synchronous digital hierarchy (SDH) management.
General considerations on transmultiplexing equipments.
Characteristics common to all transmultiplexing equipments.
Characteristics of 60-channel transmultiplexing equipments.
Characteristics of 24-channel transmultiplexing equipments.
Characteristics of codecs for FDM assemblies.
Characteristics of a 64 kbit/s cross-connect equipment with 2048 kbit/s access ports.
Characteristics of a flexible multiplexer in a plesiochronous digital hierarchy environment.
Digital transmission models.
Interworking between networks based on different digital hierarchies and speech encoding laws.
Architectures of transport networks based on the synchronous digital hierarchy (SDH).
ATM cell mapping into plesiochronous digital hierarchy (PDH).
Error performance of an international digital connection forming part of an integrated services digital
network.
Controlled slip rate objectives on an international digital connection.
The control of jitter and wander within digital networks which are based on the 2048 kbit/s hierarchy.
The control of jitter and wander within digital networks which are based on the 1544 kbit/s hierarchy.
The control of jitter and wander within digital networks which are based on the Synchronous Digital
Hierarchy (SDH).
Error performance parameters and objectives for international, constant bit rate digital paths at or
above the primary rate.
Management capabilities of transport networks based on the Synchronous Digital Hierarchy (SDH).
Transport of SDH elements on PDH networks: Frame and multiplexing structures.
General considerations on digital sections and digital line systems.
Parameters and calculation methodologies for reliability and availability of fibre optic systems.
Digital sections based on the 2048 kbit/s hierarchy.
Digital line sections at 3152 kbit/s.
General considerations on digital line systems.
Digital line systems based on the 1544 kbit/s hierarchy on symmetric pair cables.
Digital line systems based on the 2048 kbit/s hierarchy on symmetric pair cables.
Digital line systems based on the 1544 kbit/s hierarchy on coaxial pair cables.
Digital line systems based on the 2048 kbit/s hierarchy on coaxial pair cables.
Digital line systems based on the 1544 kbit/s and the 2048 kbit/s hierarchy on optical fibre cables.
Optical interfaces for equipments and systems relating to the synchronous digital hierarchy.
Digital line systems based on the synchronous digital hierarchy for use on optical fibre cables.
Access digital section for ISDN basic rate access.
Digital transmission system on metallic local lines for ISDN basic rate access.
Access digital section for ISDN primary rate at 2048 kbit/s.
G.963	Access digital section for ISDN primary rate at 1544 kbit/s.
G.964	V-Interfaces at the digital local exchange (LE) - V5.1 interface (based on 2048 kbit/s) for the support of access networks (AN).
G.965	V-Interfaces at the digital local exchange (LE) - V5.2 interface (based on 2048 kbit/s) for the support of access networks (AN).
G.971	General features of optical fibre submarine cable systems.
G.972	Definition of terms relevant to optical fibre submarine cable systems.
G.974	Characteristics of regenerative optical fibre submarine cable systems.
G.981	PDH optical line systems for the local network.
For additional detail consult the appropriate standard document or contact the ITU. See also
International Telecommunication Union, ITU-T Recommendations, Standards.
Gabor Spectrogram: An algorithm to transform signals from the time domain to the joint time-frequency domain (similar to the short time FFT spectrogram). The Gabor spectrogram is most useful for analyzing signals whose frequency content is time varying, but which does not show up well using conventional spectrogram methods. For example, in a particular jet engine the casing vibrates at 50Hz when running at full speed. If the frequency actually fluctuates by about ±1Hz around 50Hz, then when using the conventional FFT the fluctuations may not have enough energy to be detected or may be smeared due to windowing effects. The Gabor spectrogram on the other hand should be able to highlight the fluctuations.
Gain: An increase in the voltage or power level of a signal, usually accomplished by an amplifier. Gain is expressed as a factor, or in dB. See also Amplifier.
Gauss Transform: See Matrix Decompositions - Gauss Transform.
Gaussian Distribution: See Random Variable.
Gaussian Elimination: See Matrix Decompositions - Gaussian Elimination.
Gibbs Phenomenon: The Fourier series for a periodic signal with discontinuities (or near-discontinuities) will tend to an infinite series. If the signal is approximated using a finite series of harmonics, then the reconstructed signal will tend to oscillate near or on the discontinuities. For example, the Fourier series of a signal, x(t), is given by:
x(t) = Σ_{n=0}^{∞} A_n cos(2πnt / T) + Σ_{n=1}^{∞} B_n sin(2πnt / T)        (211)
For a signal such as a square wave, the series will be infinite. If however we try to produce the signal
using just the first few Fourier series coefficients up to M:
x(t) = Σ_{n=0}^{M} A_n cos(2πnt / T) + Σ_{n=1}^{M} B_n sin(2πnt / T)        (212)
then “ringing” will be seen near the discontinuities, since to adequately represent these parts of the waveform we require the high frequency components which have been truncated. This ringing is referred to as Gibbs’ phenomenon.
[Figure: a time domain plot (amplitude against time in seconds) of a truncated Fourier series square wave. The Fourier series for a square wave is an infinite series of sine waves at frequencies of f0, 3f0, 5f0, ... and relative amplitudes of 1, 1/3, 1/5, ... If this series is truncated at the 15th harmonic, then the resulting “square wave” rings at the discontinuities.]
See also Discrete Fourier Transform, Fourier Series, Fourier Series - Amplitude/Phase
Representation, Fourier Series - Complex Exponential Representation, Fourier Transform.
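The ringing can be reproduced with a short sketch (not from the original text) that sums the first few odd harmonics of a square wave; the fundamental frequency and truncation point are arbitrary example values, and NumPy is assumed:

    import numpy as np

    f0 = 50.0                       # fundamental frequency in Hz (example value)
    T = 1.0 / f0
    t = np.linspace(0, 2 * T, 2000)

    # Truncated Fourier series of a +/-1 square wave: odd harmonics, 1/n amplitudes
    M = 15                          # highest harmonic retained
    x = np.zeros_like(t)
    for n in range(1, M + 1, 2):
        x += (4 / np.pi) * (1.0 / n) * np.sin(2 * np.pi * n * f0 * t)

    # The overshoot near each discontinuity is roughly 9% of the jump height
    print("peak value:", x.max())   # noticeably greater than 1.0 (about 1.18)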
Given’s Rotations: See Matrix Decompositions - Given’s Rotations.
Global Information Infrastructure (GII): The Global Information Infrastructure will be jointly
defined by the International Organization for Standards (ISO), International Electrotechnical
Committee (IEC) and the International Telecommunication Union (ITU). The ISO, IEC and ITU have
all defined various standards that have direct relevance to interchange of graphics, audio, video and
data information via computer and telephone networks and all therefore have a relevant role to play
in the definition of the GII.
Global Minimum: The global minimum of a function is the smallest value taken on by that function. For example, for the function f(x) below, the global minimum is at x = xg. The minima at x1, x2 and x3 are termed local minima:
[Figure: a function f(x) with local minima at x1, x2 and x3 and the global minimum at x = xg.]
The existence of local minima can cause problems when using a gradient descent based adaptive
algorithm. In these cases, the algorithm can get stuck in a local minimum. This is not a problem
when the cost function is quadratic in the parameter of interest (e.g., the filter coefficients), since
quadratic functions (such as a parabola) have a unique minimum (or maximum) or, worst case, a
set of continuous minima that all give the same cost. See also Hyperparaboloid, Local Minima,
Adaptive IIR Filters, Simulated Annealing.
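As an illustrative sketch (not from the original text, and using an arbitrary example cost function rather than an adaptive filter cost surface), a simple gradient descent converges to different minima depending on its starting point:

    # Gradient descent on f(x) = x^4 - 3x^2 + x, which has one local and one
    # global minimum; the starting point determines which minimum is found.
    def df(x):
        return 4 * x**3 - 6 * x + 1       # derivative of the example cost function

    def descend(x, step=0.01, iters=5000):
        for _ in range(iters):
            x -= step * df(x)
        return x

    print(descend(+2.0))   # converges to the local minimum near x = +1.1
    print(descend(-2.0))   # converges to the (lower) global minimum near x = -1.3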
Glue Logic: To connect different chips on printed circuit boards (PCBs) it is often necessary to use buffers, inverters, latches, logic gates etc. These components are often referred to as glue logic. Many DSP chip designers pride themselves on having eliminated glue logic for chip interfacing, especially between D/A and A/D type chips.
Golden Ears: A term often used to describe a person with excellent hearing, both in terms of
frequency range and threshold of hearing. Golden ear individuals can be in demand from recording
studios, audio equipment manufacturers, loudspeaker manufacturers and so on. Although a
necessary qualification for golden ears is excellent hearing, these individuals most probably learn
their trade from many years of audio industry experience. It would be expected that a golden ears
individual could “easily” distinguish Compact Disc (CD) from analog records. The big irony is that
golden eared individuals cannot distinguish recordings of REO Speedwagon from those of Styx.
See also Audiometry, Compact Disc, Frequency Range of Hearing, Threshold of Hearing.
Goertzel’s Algorithm: Goertzel’s algorithm is used to calculate whether a frequency component is present at a particular frequency bin of a discrete Fourier transform (DFT). Consider the DFT equation calculating the discrete frequency domain representation, X(m), of N samples of a discrete time signal x(n):

X(m) = Σ_{n=0}^{N-1} x(n) e^{-j(2πnm / N)},  for m = 0 to N - 1        (213)
This computation requires N^2 complex multiply accumulates (CMACs), and the frequency representation will have a resolution of f_s / N Hz. If we only require the frequency component at the p-th frequency bin, just N CMACs are required. Of course the fast Fourier transform (FFT) is usually used instead of the DFT, and this requires N log2(N) CMACs. Therefore if a Fourier transform is being performed simply to find whether a tonal component is present at one frequency only, it makes more sense to use the single bin DFT. Note that by the nature of the calculation data flow, the FFT cannot calculate a frequency component at one frequency only - it’s all bins or none. Goertzel’s algorithm provides a formal algorithmic procedure for calculating a single bin DFT.
Goertzel’s algorithm to calculate the p-th frequency bin of an N point DFT is given by:

s_p(k) = x(k) + 2cos(2πp / N) s_p(k-1) - s_p(k-2)
y_p(k) = s_p(k) - W_N^p s_p(k-1)        (214)

where W_N^p = e^{j2πp / N} and the initial conditions s_p(-2) = s_p(-1) = 0 apply.
Eq. 214 calculates the p-th frequency bin of the DFT after the algorithm has processed N data points, i.e. X(p) = y_p(N). Goertzel’s algorithm can be represented as a second order IIR filter:
[Figure: an IIR filter representation of Goertzel’s algorithm, with recursive (feedback) coefficients 2cos(2πp/N) and -1, and a non-recursive coefficient of -W_N^p = -e^{j2πp/N}. Note that the non-recursive part of the filter has a complex weight, whereas the recursive part has only real weights. The recursive part of this filter is in fact a simple narrowband filter. For an efficient implementation it is best to compute s_p(k) for N samples, and thereafter evaluate y_p(N) once.]
For tone detection (i.e. tone present or not present), only the signal power of the p-th frequency bin is of interest, i.e. |X(p)|^2. Therefore from Eq. 214:

|X(p)|^2 = X(p)X*(p) = y_p(N) y_p*(N)
         = s_p(N)^2 - 2cos(2πp / N) s_p(N) s_p(N-1) + s_p(N-1)^2        (215)
Goertzel’s algorithm is widely used for dual tone multifrequency (DTMF) tone detection because of its simplicity and because it requires less computation than the full DFT or FFT. For DTMF tones, there are 8 separate frequencies which must be detected. Therefore a total of 8 frequency bins are required. The International Telecommunication Union (ITU) suggests in standards Q.23 and Q.24 that a 205 point DFT is performed for DTMF detection. To do a full DFT would require 205 × 205 = 42025 complex multiplies and adds (CMACs). To use a zero padded 256 point FFT would require 256 log2(256) = 2048 CMACs. Given that we are only interested in 8 frequency bins (and not 205 or 256), the computation required by Goertzel’s algorithm is 8 × 205 = 1640 CMACs. Compared to the FFT, Goertzel’s algorithm is simple and requires little memory or assembly language code to program. For DTMF tone detection the frequency bins corresponding to the second harmonic of each tone are also calculated. Hence the total computation of Goertzel’s algorithm in this case is 3280 CMACs, which is more than for the FFT. However the simplicity of Goertzel’s algorithm means it is still the technique of choice.
In order to detect the tones at the DTMF frequencies, using a 205 point DFT with f_s = 8000 Hz, the frequency bins to evaluate via Goertzel’s algorithm are:
frequency, f / Hz    bin
697                  18
770                  20
852                  22
941                  24
1209                 31
1336                 34
1477                 38
1633                 42
Note that if the sampling frequency is not 8000 Hz, or a different number of data points are used,
then the bin numbers will be different from above. See also Discrete Fourier Transform, Dual Tone
Multifrequency, Fast Fourier Transform.
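The following sketch (not from the original text) implements the Goertzel recursion of Eq. 214 for a single bin, using the common convention of iterating over the N input samples and forming the bin power from the two most recent recursion states; the test tone follows the DTMF example above:

    import math

    def goertzel_bin_power(x, p):
        """Power of the p-th DFT bin of the sequence x using Goertzel's recursion."""
        N = len(x)
        coeff = 2.0 * math.cos(2.0 * math.pi * p / N)
        s_prev, s_prev2 = 0.0, 0.0
        for sample in x:
            s = sample + coeff * s_prev - s_prev2
            s_prev2, s_prev = s_prev, s
        # Bin power from the two most recent recursion states (cf. Eq. 215)
        return s_prev**2 + s_prev2**2 - coeff * s_prev * s_prev2

    # 205 samples of a 770 Hz DTMF tone at fs = 8000 Hz should peak in bin 20
    fs, N = 8000.0, 205
    x = [math.sin(2 * math.pi * 770 * n / fs) for n in range(N)]

    print(goertzel_bin_power(x, 20))   # large value: tone present in bin 20
    print(goertzel_bin_power(x, 31))   # small value: no energy at 1209 Hz (bin 31)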
Gram-Schmidt: See Matrix Decompositions - Gram-Schmidt.
Granular Synthesis: A technique for musical instrument sound synthesis [13], [14], [32]. See also
Music, Western Music Scale.
Granularity Effects: If the step size is too large in a delta modulator, then the delta modulated
signal will give rise to a large error and completely fail to encode signals with a magnitude less than
the step size. See also Delta Modulation, Slope Overload.
[Figure: a signal x(n) plotted against time, illustrating granularity effects in a delta modulator.]
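A brief sketch (not from the original entry, with an arbitrary example step size and input) shows the granular error produced by a one-bit delta modulator whose step size is much larger than the input signal:

    import math

    step = 0.5                      # deliberately large step size (example value)
    x = [0.1 * math.sin(2 * math.pi * n / 64) for n in range(64)]   # small input

    estimate, error = 0.0, []
    for sample in x:
        bit = 1.0 if sample >= estimate else -1.0   # one-bit quantiser decision
        estimate += bit * step                       # integrator in the decoder
        error.append(sample - estimate)

    # The reconstruction hunts by +/- step around the small input, so the error
    # is comparable to the step size and much larger than the signal itself
    print(max(abs(e) for e in error))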
Graphic Interchange Format (GIF): The GIF format has become a de facto industry standard for
the interchange of raster graphic data. GIF was first developed by Compuserve Inc, USA. GIF
essentially defines a protocol for on-line transmission and interchange of raster graphic data such
that it is completely independent of the hardware used to create or display the image. GIF has a
limited, non-exclusive, royalty-free license and has widespread use on the Internet and in many
DSP enabled multimedia systems. See also Global Information Infrastructure, Joint Photographic
Experts Group, Standards.
Graphical Compiler: A system that allows you to draw your algorithm and application architecture on a computer screen using a library of icons (FIR filters, FFTs etc.), which is then compiled into source code, usually ‘C’, which can in turn be cross compiled to an appropriate assembly language for implementation on a DSP processor. See also Cross Compiler.
Graphical Equalizer: A device used in music systems to control the frequency content of the output. A graphic equalizer is effectively a set of bandpass filters with independent gain settings that can be implemented in the analog or digital domains.
Group Delay: See Finite Impulse Response Filter.
Group Delay Equalisation: A technique to equalise the phase response of a system to be linear (i.e. constant group delay) by cascading the output of the system with an all-pass filter designed to have suitable phase shifting characteristics. The magnitude frequency response of the system cascaded with the all-pass filter is the same as that of the system on its own.

[Figure: group delay equalisation by cascading an all-pass filter H_A(z) with a non-linear phase filter G(z) in order to linearise the phase response and therefore produce a constant group delay. The magnitude frequency response of the cascaded system, |G(e^{jω})H_A(e^{jω})|, is the same as that of the original system, |G(e^{jω})|.]

The design of group delay equalisers is not a trivial procedure. See also All-pass Filter, Equalisation, Finite Impulse Response Filter - Linear Phase.
Group Speciale Mobile (GSM): The European mobile communication system that implements
13.5kbps speech coding (with half-rate 6.5kbps channels optional) and uses Gaussian Minimum
Shift Keying (GMSK) modulation [85]. Data transmission is also available at rates slightly below the
speech rates. See also Minimum Shift Keying.
H
H261: See H-Series Recommendations - H261.
H320: See H-Series Recommendations - H320.
H-Series Recommendations: The H-series recommendations from the International Telecommunication Union (ITU) telecommunications committee (denoted ITU-T, and formerly known as CCITT) propose a number of standards for the line transmission of non-telephone signals. Some of the current ITU-T H-series recommendations (http://www.itu.ch) can be summarised as:
H.100	Visual telephone systems.
H.110	Hypothetical reference connections for videoconferencing using primary digital group transmission.
H.120	Codecs for videoconferencing using primary digital group transmission.
H.130	Frame structures for use in the international interconnection of digital codecs for videoconferencing or visual telephony.
H.140	A multipoint international videoconference system.
H.200	Framework for Recommendations for audiovisual services.
H.221	Frame structure for a 64 to 1920 kbit/s channel in audiovisual teleservices.
H.224	A real time control protocol for simplex application using the H.221 LSD/HSD/MLP channels.
H.230	Frame-synchronous control and indication signals for audiovisual systems.
H.231	Multipoint control units for audiovisual systems using digital channels up to 2 Mbit/s.
H.233	Confidentiality system for audiovisual services.
H.234	Encryption key management and authentication system for audiovisual services.
H.242	System for establishing communication between audiovisual terminals using digital channels up to 2 Mbit/s.
H.243	Procedures for establishing communication between three or more audiovisual terminals using digital channels up to 2 Mbit/s.
H.261	Video codec for audiovisual services at p x 64 kbit/s.
H.281	A far end camera control protocol for videoconferences using H.224.
H.320	Narrow-band visual telephone systems and terminal equipment.
H.331	Broadcasting type audiovisual multipoint systems and terminal equipment.
From the point of view of DSP and multimedia systems and algorithms, the above title descriptions of H.242, H.261 and H.320 can be expanded upon as per http://www.itu.ch:
• H.242: The H242 recommendation defines audiovisual communication using digital channels up to 2 Mbit/s. This
recommendation should be read in conjunction with ITU-T recommendations G.725, H.221 and H.230. H242 is
suitable for applications that can use narrow (3 kHz) and wideband (7 kHz) speech together with video such as
video-telephony, audio and videoconferencing and so on. H242 can produce speech, and optionally video and/
or data at several rates, in a number of different modes. Some applications will require only a single channel,
whereas others may require two or more channels to provide the higher bandwidth.
• H.261: The H.261 recommendation describes video coding and decoding methods for the moving picture
component of audiovisual services at the rate of p x 64 kbit/s, where p is an integer in the range 1 to 30, i.e.
64kbits/s to 1.92Mbits/s. H261 is suitable for transmission of video over ISDN lines, for applications such as
videophones and videoconferencing. The videophone application can tolerate a low image quality and can be
achieved for p = 1 or 2 . For videoconferencing applications where the transmission image is likely to include a
few people and last for a long period, higher picture quality is required and p > 6 is required. H.261 defines two
picture formats: CIF (Common Intermediate Format) has 288 lines by 360 pixels/line of luminance information
and 144 x 180 of chrominance information; and QCIF (Quarter Common Intermediate Format) which is 144 lines
by 180 pixels/line of luminance and 72 x 90 of chrominance. The choice of CIF or QCIF depends on available
channel capacity and desired quality.
The H.261 encoding algorithm is similar in structure to that of MPEG; however, they are not compatible. It is also worth noting that H.261 requires considerably less CPU power for encoding than MPEG. The algorithm also makes good use of the available bandwidth by trading picture quality against motion. Therefore a fast moving image will have a lower quality than a static image. H.261 used in this way is thus a constant-bit-rate encoding rather than a constant-quality, variable-bit-rate encoding.
• H.320: H.320 specifies narrow-band visual telephone services for use in channels where the data rate cannot exceed 1920 kbit/s.
For additional detail consult the appropriate standard document or contact the ITU. See also
International Telecommunication Union, ITU-T Recommendations, Standards.
Haas Effect: In a reverberant environment the sound energy received by the direct path can be
much lower than the energy received by indirect reflective paths. However the human ear is still
able to localize the sound source correctly by localizing on the first components of the signal to arrive.
Later echoes arriving at the ear increase the perceived loudness of the sound as they will have the
same general spectrum. This psychoacoustic effect is commonly known as the precedence effect,
the law of the first wavefront, or sometimes the Haas effect [30]. The Haas effect applies mainly to
short duration sounds or those of a discontinuous or varying form. See also Ear, Lateralization,
Source Localization, Threshold of Hearing.
Habituation: Habituation is the effect of the auditory mechanism not perceiving a repetitive noise
(which is above the threshold of hearing) such as the ticking of a nearby clock or passing of nearby
traffic until attention is directed towards the sound. See also Adaptation, Psychoacoustics,
Threshold of Hearing.
Hamming Distance: Often used in channel coding applications, Hamming distance refers to the
number of bit locations in which two binary codewords differ. For example the binary words
10100011 and 10001011 differ in two positions (the third and the fifth from the left) so the Hamming
distance between these words is 2. See also Euclidean Distance, Channel Coding, Viterbi
Algorithm.
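A short sketch (not from the original entry) computes the Hamming distance by exclusive-ORing the two codewords and counting the set bits, reproducing the example above:

    def hamming_distance(a: int, b: int) -> int:
        """Number of bit positions in which the two words differ."""
        return bin(a ^ b).count("1")

    # The example from the entry: 10100011 and 10001011 differ in two positions
    print(hamming_distance(0b10100011, 0b10001011))   # prints 2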
Hamming Window: See Windows.
Half Duplex: Pertaining to the capability to send and receive data on the same line, but not
simultaneously. See also Full Duplex, Simplex.
Hand Coding: When writing programs for DSP processors, ‘C’ cross compilers are often available. Although algorithm development with cross compilers is faster than when using assembly language, the machine code produced is usually less efficient and compact than would be achieved by writing in assembler. Cleaning up this less efficient assembly code is sometimes referred to as hand-coding. Coding directly in machine code is also referred to as hand-coding. See also Assembly Language, Cross-Compiler, Machine Code.
Handshaking: A communication technique whereby one system acknowledges receipt of data
from another system by sending a handshaking signal.
Harmonic: Given a signal with fundamental frequency of M Hz, harmonics of this signal are at
integer multiples of M, i.e. at 2M, 3M, 4M, and so on. See also Fundamental Frequency, Music,
Sub-harmonic, Total Harmonic Distortion.
[Figure: the frequency domain representation (magnitude against frequency) of a tone at its fundamental frequency M Hz with associated harmonics at 2M, 3M and 4M Hz.]
harris Window: See Windows.
Hartley Transform: The Hartley transform is “similar” in computational structure (although
different in properties) to the Fourier transform. One key difference is that the Hartley transform
uses real numbers rather than complex numbers. A good overview of the mathematics and
application of the Hartley transform can be found in [121].
Harvard Architecture: A type of microprocessor (and microcomputer) architecture where the
memory used to store the program, and the memory used to store the data are separate therefore
allowing both program and data to be accessed simultaneously. Some DSPs are described as
being a modified Harvard architecture where both program and data memories are separate, but
with cross-over links. See also DSP Processor.
Head Shadow: Due to the shape of the human head, incident sounds can be diffracted before
reaching the ears. Hence the actual waveform arriving at the ears is different from what would have been received by an ear without the head present. Head shadow is an important consideration in the design of virtual sound systems and in the design of some types of advanced DSP hearing aids.
See also Diffraction.
Hearing: The mechanism and process by which mammals perceive changes in acoustic pressure
waves, or sound. See also Audiology, Audiometry, Ear, Psychoacoustics, Threshold of Hearing.
Hearing Aids: A hearing aid can be described as any device which aids the wearer by improving
the audibility of speech and other sounds. The simplest form of hearing aid is an acoustic
amplification device (such as an ear trumpet), and the most complex is probably a cochlear implant
system (surgically inserted) which electrically stimulates nerves using acoustic derived signals
received from a body worn radio transmitter and microphone.
More commonly, hearing aids are recognizable as analogue electronic amplification devices
consisting of a microphone and amplifier connected to an acoustic transducer usually just inside the
ear. However a hearing aid which simply makes sounds louder is not all that is necessary to allow
hearing impaired individuals to hear better. In everyday life we are exposed to a very wide range of
sounds coming from all directions with varying intensities, and various degrees of reverberation.
Clearly hearing aids are required to be very versatile instruments, that are carefully designed
around known parameters and functions of the ear, and providing compensation techniques that
are suitable for the particular type of hearing loss, in particular acoustic environments.
DSPedia
188
Simple analogue electronic hearing aids can typically provide functions of volume and tone control.
More advanced devices may incorporate multi-band control (i.e., simple frequency shaping) and
automatic gain control amplifiers to adjust the amplification when loud noises are present. Hearing
aids offering multi-band compression with a plethora of digitally adjustable parameters such as
attack and release times, etc., have become more popular. Acoustic feedback reduction techniques
have also been employed to allow more amplification to be provided before the microphone/
transducer loop goes unstable due to feedback (this instability is often detected as an unsatisfied
hearing aid wearer with a screeching howl in their ear). Acoustic noise reduction aids that exploit
the processing power of advanced DSP processing have also been designed.
Digital audio signal processing based hearing aids may have advantages over traditional analogue
audio hearing aids. They provide a greater accuracy and flexibility in the choice of electroacoustic
parameters and can be easily interfaced to a computer based audiometer. More importantly they
can use powerful adaptive signal processing techniques for enhancing speech intelligibility and
reducing the effects of background noise and reverberation. Currently however, power and physical
size constraints are limiting the availability of DSP hearing aids. See also Audiology, Audiometry,
Beamforming, Ear, Head Shadow, Hearing Impairment, Threshold of Hearing.
Hearing Impairment: A reduction in the ability to perceive sound, as compared to the average
capability of a cross section of unimpaired young persons. Hearing impairment can be caused by exposure to high sound pressure levels (SPL), can be drug-induced or virus-induced, or can arise simply as a result of having lived a long time. A hearing loss can be simply quantified by an audiogram and qualified with
more exact audiological language such as sensorineural loss or conductive loss, etc., [4], [30]. See
also Audiology, Audiometry, Conductive Hearing Loss, Ear, Hearing, Loudness Recruitment,
Sensorineural Hearing Loss, Sound Pressure Level, Threshold of Hearing.
Hearing Level (HL): When the hearing of a person is to be tested, the simplest method is to play
pure tones through headphones (using a calibrated audiometer) over a range of frequencies, and
determine the minimum sound pressure level (SPL) at which the person can hear the tone. The
results could then be plotted as minimum perceived SPL versus frequency. To ascertain if the
person has a hearing impairment the plot can be compared with the average minimum level of SPL
for a cross section of healthy young people with no known hearing impairments. However if the
minimum level of SPL (the threshold of hearing) is plotted as SPL versus frequency, the curve
obtained is not a straight line and comparison can be awkward. Therefore for Hearing Level (dB)
plots (or audiograms), the deviation from the average threshold of hearing of young people is
plotted with hearing loss indicated by a positive measurement that is plotted lower on the
audiogram. The threshold of hearing is therefore the 0dB line on the Hearing Level (dB) scale. The
equivalent dB (HL) and dB (SPL) for some key audiometric frequencies in the UK are [157]:
Frequency (Hz):   250    500    1000   2000   4000   8000
dB (HL):          0      0      0      0      0      0
dB (SPL):         26     15.6   8.2    5.2    7      20
See also Audiogram, Audiometry, Equal Loudness Contours, Frequency Range of Hearing,
Hearing Impairment, Loudness Recruitment, Sensation Level, Sound Pressure Level, Threshold of
Hearing.
Hearing Loss: See Hearing Impairment.
Hermitian: See Matrix Properties - Hermitian Transpose.
Hermitian Transpose: See Matrix Properties - Hermitian Transpose.
Hertz (Hz): The unit of frequency measurement named after Heinrich Hertz. 1 Hz is 1 cycle per
second.
Hexadecimal, Hex: Base 16. Conversion from binary to hex is very straightforward and therefore hex digits have become the standard way of representing binary quantities to programmers. A 16 bit binary number can be easily represented by 4 hex digits by grouping four bits together starting from the binary point and converting each group to the corresponding hex digit. The hex digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. Hexadecimal entries in DSP assembly language programs are prefixed by either $ or 0x to differentiate them from decimal entries. An example (with base indicated as subscript):

0010 1010 0011 1111_2 = 2A3F_16 = (2 × 16^3) + (10 × 16^2) + (3 × 16^1) + 15 = 10815_10
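A small sketch (not from the original entry) showing the same conversion programmatically:

    value = 0b0010101000111111          # 16 bit binary quantity
    print(hex(value))                   # 0x2a3f
    print(int("2A3F", 16))              # 10815, the decimal equivalent

    # Expanding the hex digits positionally, as in the worked example above
    print(2 * 16**3 + 10 * 16**2 + 3 * 16**1 + 15)   # also 10815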
High Pass Filter: A filter which passes only the portions of a signal that have frequencies above
a specified cut-off frequency. Frequencies below the cut-off frequency are highly attenuated. See
also Digital Filter, Low Pass Filter, Bandpass Filter, Filters.
[Figure: a high pass filter G(f) with input and output signals, and its magnitude response |G(f)| passing the bandwidth above the cut-off frequency.]
Higher Order Statistics: Most stochastic DSP techniques such as the power spectrum, least
mean squares algorithm and so on, are based on first and second order statistical measures such
as mean, variance and autocorrelation. The higher order moments, such as the 3rd order moment
(note that the first order moment is the mean, the second order central moment is the variance) are
usually not considered. However there is information to be gathered from a consideration of these
higher order statistics. One example is detecting the baud rate of PSK signals. Recently there has
been considerable interest in higher order statistics within the DSP community. For information
refer to the tutorial article [117]. See also Mean, Variance.
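As a brief sketch (not from the original entry, using arbitrary example distributions), the third order central moment distinguishes a symmetric signal from a skewed one, information that the mean and variance alone do not reveal; NumPy is assumed:

    import numpy as np

    rng = np.random.default_rng(0)
    symmetric = rng.normal(size=100000)          # zero-mean Gaussian noise
    asymmetric = rng.exponential(size=100000)    # skewed distribution

    def third_central_moment(x):
        m = np.mean(x)
        return np.mean((x - m) ** 3)

    print(third_central_moment(symmetric))    # close to zero
    print(third_central_moment(asymmetric))   # clearly non-zero (about 2)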
Hilbert Transform: Simply described, a Hilbert transform introduces a phase shift of 90 degrees
at all frequencies for a given signal. A Hilbert transform can be implemented by an all-pass phase
shift network. Mathematically, the Hilbert transform of a signal x(t) can be computed by linear
filtering (i.e., convolution) with a special function:
x_h(t) ≡ x(t) ⊗ 1/(πt)        (216)
It may be more helpful to think about the Hilbert transform as a filtered version of a signal rather
than a “transform” of a signal. The Hilbert transform is useful in constructing single sideband signals
(thus conserving bandwidth in communications examples). The transform is also useful in signal
analysis by allowing real bandpass signals (such as a radio signal) to be analyzed and simulated
DSPedia
190
as an equivalent complex baseband (or lowpass) process. Virtually all system simulation packages
exploit this equivalent representation to allow for timely completion of system simulations. Not
obvious from the definition above is the fact that the Hilbert transform of the Hilbert transform of x(t)
is -x(t). This may be expected from the heuristic description of the Hilbert transform as a 90 degree
phase shift -- i.e., two 90 degree phase shifts are a 180 degree phase shift which means multiplying
by a minus one.
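As an illustrative sketch (not from the original entry, and assuming SciPy is available), the 90 degree phase shift property can be checked numerically: the imaginary part of SciPy's analytic signal is the Hilbert transform of the input, and for a cosine input it is (approximately) the corresponding sine:

    import numpy as np
    from scipy.signal import hilbert

    fs = 8000.0
    t = np.arange(0, 0.1, 1 / fs)
    x = np.cos(2 * np.pi * 440 * t)          # test tone (arbitrary example)

    analytic = hilbert(x)                    # x(t) + j * x_h(t)
    x_h = np.imag(analytic)

    # For a cosine input the Hilbert transform is (approximately) a sine,
    # i.e. the same tone shifted by 90 degrees
    print(np.max(np.abs(x_h - np.sin(2 * np.pi * 440 * t))))   # small value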
Host: Most DSP boards can be hosted by a general purpose computer, such as an IBM compatible
PC. The host allows a DSP designer to develop code using the PC, and then download the DSP
program to the DSP board. The DSP board therefore has a host interface. The host usually supplies
power (analog, 12V and digital, 5V) to the board. See also DSP Board.
Householder Transformation: See Matrix Decompositions - Householder Transformation.
Huffman Coding: This type of coding exploits the fact that the discrete amplitudes of a quantized signal may not occur with equal probability. Variable length codewords can therefore be assigned to particular data values according to their frequency of occurrence. Data values that occur frequently are assigned shorter code words; hence data compression is possible.
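A compact sketch (not from the original entry) builds a Huffman code for an arbitrary example symbol probability distribution using Python's heapq module; frequently occurring symbols receive the shorter codewords:

    import heapq

    # Example symbol probabilities (arbitrary): frequent symbols get short codes
    probs = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}

    # Each heap item: (probability, tie-break counter, {symbol: code-so-far})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)

    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)      # two least probable subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1

    codes = heap[0][2]
    print(codes)   # e.g. {'a': '0', 'b': '10', 'c': '111', 'd': '110'} or similar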
Hydrophone: An underwater transducer of acoustic energy for sonar applications.
Hyperchief: A Macintosh program developed by a DSP graduate student from 1986 - 1991, somewhere on the west coast of the USA, to simulate the wisdom of a Ph.D. supervisor. However, while accurately simulating the wisdom of a Ph.D. supervisor, Hyperchief precisely illustrated the pitfalls of easy access to powerful computers. Hyperchief is sometimes spelled as Hypercheif (pronounced Hi-per-chife).
Hyperparaboloid: Consider the equation:

e = x^T R x + 2 p^T x + s        (217)

where x is an n × 1 vector, R is a positive definite n × n matrix, p is an n × 1 vector, and s is a scalar.
The equation is quadratic in x. If n = 1, then e will form a simple parabola, and if n = 2, e can be
represented as a (solid) paraboloid:
[Figure: the error e plotted against x for n = 1 (a parabola) and against x = [x1 x2]^T for n = 2 (a paraboloid).]
The positive definiteness of R ensures that the parabola is up-facing. Note that in both cases e has exactly one minimum point (a global minimum) at the bottom of the parabolic shape. For systems with n ≥ 3, e cannot be shown diagrammatically as four or more dimensions are required! Hence we are asked to imagine the existence of a hyperparaboloid for n ≥ 3, which will also have exactly one minimum point for e. The existence of the hyperparaboloid is much referred to in
least squares, and least mean squares algorithm derivations. See also Global Minimum, Local
Minima.
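As a sketch (not from the original entry), for positive definite R the unique minimum of Eq. 217 lies at x = -R^{-1}p, which can be confirmed numerically for an arbitrary 2 × 2 example:

    import numpy as np

    R = np.array([[2.0, 0.5],
                  [0.5, 1.0]])       # positive definite (example values)
    p = np.array([1.0, -2.0])
    s = 3.0

    def e(x):
        return x @ R @ x + 2 * p @ x + s

    x_min = -np.linalg.solve(R, p)   # minimising argument x = -R^{-1} p
    print(x_min, e(x_min))

    # Any perturbation away from x_min increases e (a single global minimum)
    print(e(x_min + np.array([0.1, -0.2])) > e(x_min))   # True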
Hypersignal: An IBM PC based program for DSP written by Hyperception Inc. Hypersignal
provides facilities for real time data acquisition in conjunction with various DSP processors, and a
menu driven system to perform off-line processing of real-time FFTs, digital filtering, signal
acquisition, signal generation, power spectra and so on. DOS and Windows versions are available.
HyTime: HyTime (Hypermedia/Time-Based Structuring Language) is a standardised
infrastructure for the representation of integrated, open hypermedia documents produced by the
International Organization for Standards (ISO), Joint Technical Committee, Sub Committee (SC)
18, Working Group (WG) 8 (ISO JTC1/SC18/WG8). See also Bento, Multimedia and Hypermedia
Information Coding Experts Group, Standards.
I
i: “i” (along with “k” and “n”) is often used as a discrete time index in DSP notation. See Discrete Time.
I: Often used to denoted the identity matrix. See Matrix.
I-Series Recommendations: The I-series telecommunication recommendations from the International Telecommunication Union (ITU) telecommunications committee (denoted ITU-T, and formerly known as CCITT) provide standards for Integrated Services Digital Networks (ISDN). Some of the current recommendations (http://www.itu.ch) include:
I.112
I.113
I.114
I.120
I.121
I.122
I.140
I.141
I.150
I.200
I.210
I.211
I.220
I.221
I.230
I.231
I.231.9
I.231.10
I.232
I.232.3
I.233
I.233.1-2
I.241.7
I.250
I.251.1-9
I.252.2-5
I.253.1-2
I.254.2
I.255.1
I.255.3-5
I.256
I.257.1
I.258.2
I.310
I.311
I.312
I.320
I.321
I.324
Vocabulary of terms for ISDNs.
Vocabulary of terms for broadband aspects of ISDN.
Vocabulary of terms for universal personal telecommunication.
Integrated services digital networks (ISDNs).
Broadband aspects of ISDN.
Framework for frame mode bearer services.
Attribute technique for the characterization of telecommunication services supported by an ISDN and
network capabilities of an ISDN.
ISDN network charging capabilities attributes.
B-ISDN asynchronous transfer mode functional characteristics.
Guidance to the I.200-series of Recommendations.
Principles of telecommunication services supported by an ISDN and the means to describe them.
B-ISDN service aspects.
Common dynamic description of basic telecommunication services.
Common specific characteristics of services.
Definition of bearer service categories.
Circuit-mode bearer service categories.
Circuit mode 64 kbit/s 8 kHz structured multi-use bearer service category.
Circuit-mode multiple-rate unrestricted 8 kHz structured bearer service category.
Packet-mode bearer services categories.
User signalling bearer service category (USBS).
Frame mode bearer services.
ISDN frame relaying bearer service/ ISDN frame switching bearer service.
Telephony 7 kHz teleservice.
Definition of supplementary services.
Direct-dialling-in/ Multiple subscriber number/ Calling line identification presentation/ Calling line
identification restriction/ Connected Line Identification Presentation (COLP)/ Connected Line
Identification Restriction (COLR)/ Malicious call identification/ Sub-addressing supplementary service.
Call forwarding busy/ Call forwarding no reply/ Call forwarding unconditional/ Call deflection.
Call waiting (CW) supplementary service/ Call hold.
Three-party supplementary service.
Closed user group.
Multi-level precedence and preemption service (MLPP)/ Priority service/ Outgoing call barring.
Advice of charge
User-to-user signalling.
In-call modification (IM).
ISDN Network functional principles.
B-ISDN general network aspects.
(See also Q.1201.) Principles of intelligent network architecture.
ISDN protocol reference model.
B-ISDN protocol reference model and its application.
ISDN network architecture.
DSPedia
194
I.325
I.327
I.328
I.329
I.330
I.331
I.333
I.334
I.350
I.351
I.352
I.353
I.354
I.355
I.356
I.361
I.362
I.363
I.364
I.365.1
I.370
I.371
I.372
I.373
I.374
I.376
I.410
I.411
I.412
I.413
I.414
I.420
I.421
I.430
I.431
I.432
I.460
I.464
I.470
I.500
I.501
I.510
I.511
I.515
I.520
I.525
I.530
I.555
I.570
I.580
I.601
I.610
Reference configurations for ISDN connection types.
B-ISDN functional architecture.
Intelligent Network - Service plane architecture.
Intelligent Network - Global functional plane architecture.
ISDN numbering and addressing principles.
Numbering plan for the ISDN era.
Terminal selection in ISDN.
Principles relating ISDN numbers/subaddresses to the OSI reference model network layer addresses.
General aspects of quality of service and network performance in digital networks, including ISDNs.
Relationships among ISDN performance recommendations.
Network performance objectives for connection processing delays in an ISDN.
Reference events for defining ISDN performance parameters.
Network performance objectives for packet mode communication in an ISDN.
ISDN 64 kbit/s connection type availability performance.
B-ISDN ATM layer cell transfer performance.
B-ISDN ATM layer specification.
B-ISDN ATM Adaptation Layer (AAL) functional description.
B-ISDN ATM adaptation layer (AAL) specification.
Support of broadband connectionless data service on B-ISDN.
Frame relaying service specific convergence sublayer (FR-SSCS).
Congestion management for the ISDN frame relaying bearer service.
Traffic control and congestion control in B-ISDN.
Frame relaying bearer service network-to-network interface requirements.
Network capabilities to support Universal Personal Telecommunication (UPT).
Framework Recommendation on “Network capabilities to support multimedia services”.
ISDN network capabilities for the support of the teleaction service.
General aspects and principles relating to Recommendations on ISDN user-network interfaces.
ISDN user-network interfaces - references configurations.
ISDN user-network interfaces - Interface structures and access capabilities.
B-ISDN user-network interface.
Overview of Recommendations on layer 1 for ISDN and B-ISDN customer accesses.
Basic user-network interface.
Primary rate user-network interface.
Basic user-network interface - Layer 1 specification.
Primary rate user-network interface - Layer 1 specification.
B-ISDN user-network interface - Physical layer specification.
Multiplexing, rate adaption and support of existing interfaces.
Multiplexing, rate adaption and support of Existing interfaces for restricted 64 kbit/s transfer capability.
Relationship of terminal functions to ISDN.
General structure of the ISDN interworking Recommendations.
Service interworking.
Definitions and general principles for ISDN interworking.
ISDN-to-ISDN layer 1 internetwork interface.
Parameter exchange for ISDN interworking.
General arrangements for network interworking between ISDNs.
Interworking between ISDN and networks which operate at bit rates of less than 64 kbit/s.
Network interworking between an ISDN and a public switched telephone network (PSTN).
Frame relaying bearer service interworking.
Public/private ISDN interworking.
General arrangements for interworking between B-ISDN and 64 kbit/s based ISDN.
General maintenance principles of ISDN subscriber access and subscriber installation.
B-ISDN operation and maintenance principles and functions.
For additional detail consult the appropriate standard document or contact the ITU. See also
International Telecommunication Union, ITU-T Recommendations, Standards.
Magnitude
Ideal Filter: The ideal filter for a DSP application is one which will give absolute discrimination
between passband and stopband. The impulse response of an ideal filter is always non-causal, and
therefore impossible to build. See also Brick Wall Filter, Digital Filter .
A brick wall filter cutting off at 4000Hz is the ideal anti-alias filter for a DSP application with
fs = 8000Hz. All frequencies below 4000Hz are passed perfectly with no amplitude or phase
distortion, and all frequencies above 4000Hz are removed. In practice the ideal filter cannot
be achieved as it would be non-causal. In an FIR implementation, the more weights that are
used, the closer the frequency response will be to the ideal.
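As an illustrative sketch (not part of the original entry), a windowed sinc FIR filter approximates the brick wall response, and lengthening the filter brings the response closer to the ideal; the sampling rate, cutoff frequency and filter length below are arbitrary example values:

    import numpy as np

    fs = 8000.0    # sampling rate (Hz) -- illustrative value only
    fc = 1000.0    # desired cutoff frequency (Hz) -- illustrative value only
    N = 101        # number of FIR weights; more weights give a sharper, more "ideal" response

    n = np.arange(N) - (N - 1) / 2.0
    h = 2 * fc / fs * np.sinc(2 * fc / fs * n)   # truncated ideal (sinc) impulse response
    h *= np.hamming(N)                           # window to control passband/stopband ripple

    # Inspect how close the magnitude response is to the brick wall shape.
    H = np.fft.rfft(h, 4096)
    f = np.fft.rfftfreq(4096, d=1.0 / fs)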
Identity Matrix: See Matrix Structured - Identity.
IEEE 488 GPIB: Many DSP laboratory instruments such as data loggers and digital oscilloscopes
are equipped with a GPIB (General Purpose Interface Bus). Note that this bus is also referred to as
HPIB by Hewlett-Packard, developers of the original bus on which the standard is based. Different
devices can then communicate through cables of maximum length 20 metres using an 8-bit parallel
protocol with a maximum data transfer of 2Mbytes/sec.
IEEE Standard 754: The IEEE Standard for binary floating point arithmetic specifies basic and
extended floating-point number formats; add, subtract, multiply, divide, remainder, and square root.
It also provides magnitude compare operations, conversion from/to integer and floating-point
formats and conversions between different floating-point formats and decimal strings. Finally the
standard also specifies floating-point exceptions and their handling, including non-numbers caused
by divide by zero. The Motorola DSP96000 is an IEEE 754 compliant floating point processor.
Devices such as the Texas Instruments TMS320C30 use their own similar (but different!) floating point
format. The IEEE Standard 754 has also been adopted by ANSI and is therefore often referred to
as ANSI/IEEE Standard 754. See also Standards.
IEEE Standards: The IEEE publish standards in virtually every conceivable area of electronic and
electrical engineering. These standards are available from the IEEE and the titles, classifications
and a brief synopsis can be browsed at http://stdsbbs.ieee.org. See also Standards.
Ill-Conditioned: See Matrix Properties - Ill-Conditioned.
Image Interchange Facility (IIF): The IIF has been produced by the International Organization for
Standards (ISO) Joint Technical Committee (JTC) 1, sub-committee (SC) 24 (ISO/IEC JTC1/
SC24) which is responsible for standards on “Computer graphics and image processing”. The IIF
standard is ISO 12087-3 and is the definition of a data format for exchanging image data of an
arbitrary structure. The IIF format is designed to allow easy integration into international
telecommunication services. See also International Organisation for Standards, JBIG, JPEG,
Standards.
Imaginary Number: The imaginary number, denoted by j by electrical engineers (and by i in most
other branches of science and mathematics), is the square root of -1. Using imaginary numbers
allows the square root of any negative number to be expressed. For example, √(-25) = 5j. See
also Complex Numbers, Fourier Analysis, Euler’s Formula.
Impulse: An impulse is a signal with very large magnitude which lasts only for a very short time. A
mechanical impulse could be applied by striking an object with a hammer; a very large force for a
very short time. A voltage impulse would be a very large voltage signal which only lasts for a few
milli- or even microseconds.
A digital impulse has magnitude of 1 for one sample, then zero at all other times and is sometimes
called the unit impulse or unit pulse. The mathematical notation for an impulse is usually δ ( t ) for
an analog signal, and δ ( n ) for a digital impulse. For more details see Unit Impulse Function. See
also Convolution, Elementary Signals, Fourier Transform Properties, Impulse Response, Sampling
Property, Unit Impulse Function, Unit Step Function.
Impulse Response: When any system is excited by an impulse, the resulting output can be
described as the impulse response (or the response of the system to an impulse). For example,
striking a bell with a hammer gives rise to the familiar ringing sound of the bell which gradually
decays away. This ringing can be thought of as the bell’s impulse response, which is characterized
by a slowly decaying signal at a fundamental frequency plus harmonics. The bell’s physical
structure supports certain modes of vibrations and suppresses others. The impulsive input has
energy at all frequencies -- the frequencies associated with the supported modes of vibration are
sustained while all other frequencies are suppressed. These sustained vibrations give rise to the
bell’s ringing sound that we hear (after the extremely brief “chink” of the impulsive hammer blow).
We can also realize the digital impulse response of a system by applying a unit impulse and
observing the output samples that result. From the impulse response of any linear system we can
calculate the output signal for any given input signal simply by calculating the convolution of the
impulse response with the input signal. Taking the Fourier transform of the impulse response of a
system gives the frequency response. See also Convolution, Elementary Signals, Fourier
Transform Properties, Impulse, Sampling Property, Unit Impulse Function, Unit Step Function.
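A minimal sketch of this relationship (the impulse response used below is an invented decaying oscillation standing in for the bell): the impulse response is measured by applying a unit impulse, and the output for any other input is the convolution of that input with the impulse response.

    import numpy as np

    # A hypothetical LTI system: a decaying oscillation, loosely like a struck bell.
    def system(x):
        k = np.arange(200)
        h = np.exp(-k / 40.0) * np.cos(2 * np.pi * 0.05 * k)
        return np.convolve(x, h)

    # The impulse response is obtained by applying a unit impulse and observing the output...
    delta = np.zeros(200)
    delta[0] = 1.0
    h = system(delta)[:200]

    # ...and the output for any input is the convolution of that input with h.
    x = np.random.randn(500)
    y = np.convolve(x, h)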
Incoherent: See Coherent.
Infinite Impulse Response (IIR) Filter: A digital filter which employs feedback to allow sharper
frequency responses to be obtained for fewer filter coefficients. Unlike FIR filters, IIR filters can
exhibit instability and must therefore be very carefully designed [10], [42]. The term infinite refers to
the fact that the output from a unit pulse input will exhibit nonzero outputs for an arbitrarily long time.
If the digital filter is IIR, then two weight vectors can be defined: one for the feedforward weights
and one for the feedback weights:

y_k = \sum_{n=0}^{2} a_n x_{k-n} + \sum_{n=1}^{3} b_n y_{k-n}
    = a_0 x_k + a_1 x_{k-1} + a_2 x_{k-2} + b_1 y_{k-1} + b_2 y_{k-2} + b_3 y_{k-3}

\Rightarrow\; y_k = \mathbf{a}^T\mathbf{x}_k + \mathbf{b}^T\mathbf{y}_{k-1}
    = [a_0\;\; a_1\;\; a_2]\,[x_k\;\; x_{k-1}\;\; x_{k-2}]^T + [b_1\;\; b_2\;\; b_3]\,[y_{k-1}\;\; y_{k-2}\;\; y_{k-3}]^T
A signal flow graph and equation for a 3 zero, 4 pole infinite impulse response filter.
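A direct, sample-by-sample sketch of the difference equation above; the coefficient values are illustrative only and in practice the feedback weights must be chosen (and checked) for stability:

    import numpy as np

    # Feedforward (zero) and feedback (pole) weights -- illustrative values only.
    a = np.array([0.5, 0.3, 0.2])      # a0, a1, a2
    b = np.array([0.4, -0.2, 0.1])     # b1, b2, b3

    def iir_filter(x):
        """y(k) = sum_n a_n x(k-n) + sum_n b_n y(k-n), computed sample by sample."""
        y = np.zeros(len(x))
        for k in range(len(x)):
            for n in range(len(a)):
                if k - n >= 0:
                    y[k] += a[n] * x[k - n]        # feedforward (non-recursive) part
            for n in range(1, len(b) + 1):
                if k - n >= 0:
                    y[k] += b[n - 1] * y[k - n]    # feedback (recursive) part
        return y

    y = iir_filter(np.random.randn(100))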
See also Digital Filter, Finite Impulse Response Filter, Least Mean Squares IIR Algorithms.
Infinite Impulse Response (IIR) LMS: See Least Mean Squares IIR Algorithms.
Infinity (∞) Norm: See Matrix Properties - ∞ Norm.
Information Theory: The name given to the general study of the coding of information. In 1948
Claude E. Shannon presented a mathematical theory describing, among other things, the average
amount of information, or the entropy of an information source. For example, a given alphabet is
composed of N symbols (s1, s2, s3, s4,......., sN). Symbols from a source that generates random
elements from this alphabet are encoded and transmitted via a communication line. The symbols
are decoded at the other end. Shannon described a useful relationship between information and
the probability distribution of the source symbols: if the probability of receiving a particular symbol
is very high then it does not convey a great deal of information, and if low, then it does convey a
high degree of information. In addition, his measure was logarithmically based. According to
Shannon’s measure, the self information conveyed by a single symbol that occurs with probability
Pi is:
I(s_i) = \log_2\!\left(\frac{1}{P_i}\right)        (218)

The average amount of information, or first order entropy, of a source can then be expressed as:

H_r(s) = \sum_{i=1}^{N} P_i \log_2\!\left(\frac{1}{P_i}\right)        (219)
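A small sketch of equation (219); the probability values used are invented examples:

    import numpy as np

    def first_order_entropy(p):
        """H(s) = sum_i P_i log2(1/P_i), ignoring zero-probability symbols."""
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return np.sum(p * np.log2(1.0 / p))

    # Four symbol alphabet: equiprobable symbols carry the most information per symbol on
    # average; a skewed distribution carries less.
    print(first_order_entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits/symbol
    print(first_order_entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol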
Infrasonic: Of, or relating to infrasound. See Infrasound.
Infrasound: Acoustic signals (speed in air approximately 330 ms^-1) having frequencies below 20Hz, the low
frequency limit of human hearing, are known as infrasound. Although sounds as low as 3Hz have
been shown to be aurally detectable, there is no perceptible reduction in pitch and the sounds will
also be tactile. Infrasound is a topic close to the heart of a number of professional recording
engineers who believe that it is vitally important to the overall sound of music. In general CDs and
DATs can record down to around 5Hz.
Exposure to very high levels of infrasound can be extremely dangerous and certain frequencies can
cause organs and other body parts to resonate:
Area of Body          Approximate Resonance Range (Hz)
Motion sickness       0.3-0.6
Abdomen               3-5
Spine/pelvis          4-6
Testicle/Bladder      10
Head/Shoulders        20-30
Eyeball               60-90
Jaw/Skull             120-200
Infrasound has been considered as a weapon for the military and also as a means of crowd control,
whereby the bladder is irritated. See also Sound, Ultrasound.
Inner Product: See Vector Operations - Inner Product.
In-Phase: See Quadrature.
Instability: A system or algorithm goes unstable when feedback (either physical or mathematical)
causes the system output to oscillate uncontrollably. For example if a microphone is connected to
an amplifier then to a loudspeaker, and the microphone is brought close to the speaker then the
familiar feedback howl occurs; this is instability. Similarly in a DSP algorithm mathematical
feedback in equations being implemented (recursion) may cause instability. Therefore to ensure a
system is stable, feedback must be carefully controlled.
Institute of Electrical Engineers (IEE): The IEE is a UK based professional body representing
electronic and electrical engineers. The IEE publishes a number of signal processing related
publications each month, and also organize DSP related colloquia and conferences.
Institute of Electrical and Electronic Engineers, Inc. (IEEE): The IEEE is a USA based
professional body covering every aspect of electronic and electrical engineering. IEEE publishes a
very large number of journals each month which include a number of notable signal processing
journals such Transactions on Signal Processing, Transactions on Speech and Audio Processing,
Transactions on Biomedical Engineering, Transactions on Image Processing and so on.
Integration (1): The simplest mathematical interpretation of integration is taking the area under a
graph.
Integration (2): The generic term for the implementation of many transistors on a single substrate
of silicon. The technology refers to the actual process used to produce the transistors: CMOS is the
integration technology for MOSFET transistors; Bipolar is the integration technology for TTL. The
number of transistors on a single device is often indicated by one of the acronyms, SSI, MSI, LSI,
VLSI, or ULSI.
Acronym   Technology                      No. of Transistors   First Circuits   Example
SSI       Small Scale Integration         < 10                 1960s            NPN junction
MSI       Medium Scale Integration        < 1000               1970s            4 NAND gates
LSI       Large Scale Integration         < 10000              Early 1980s      8086 microprocessor
VLSI      Very Large Scale Integration    < 1000000            Mid 1980s        DSP56000
ULSI      Ultra Large Scale Integration   < 100000000          1990s            TMS320C80
Integrated Circuit (IC): The name given to a single silicon chip containing many transistors that
collectively realize some system level component such as an A/D converter or microprocessor.
Integrated Services Digital Network (ISDN): See I-Series Recommendations.
Integrator: A device which performs the function of computing the integral as an output for an
arbitrary input signal. In digital signal processing terms an integrator is quite straightforward.
Consider the simple mathematical definition of integration, which is the area under a graph. The
output of an integrator, y(t), will be the cumulative area under the input signal curve, x(t). For
sampled digital signals the input will be constant for one sampling period, and therefore to
approximately integrate the signal we can simply add the area of the sampling rectangles together.
If the sampling period is normalized to one, then a signal can be integrated in the discrete domain
by adding together the input samples. An integrator is implemented using a digital delay element,
and a summing element which calculates the function:
y(n) = x(n) + y(n – 1)
(220)
In the z-domain the transfer function of a discrete integrator is:
Y(z) = X(z) + z^{-1}Y(z) \;\Rightarrow\; \frac{Y(z)}{X(z)} = \frac{z}{z-1} = \frac{1}{1-z^{-1}}        (221)
When viewed in the frequency domain an integrator has the characteristics of a simple low pass
filter. See also Differentiator, Low Pass Filter.
[Figure: analog integration, y(t) = ∫ x(t) dt; discrete integration, y(n) = Σ x(n)∆t; the time domain
discrete integrator signal flow graph (a summing element fed back through a unit delay); and the
z-domain integrator representation, Y(z) = X(z)/(1 − z⁻¹).]
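A minimal sketch of the discrete integrator of equation (220), simply accumulating the input samples (scaled by the sampling period):

    def integrate(x, dt=1.0):
        """Discrete integrator: y(n) = y(n-1) + x(n)*dt."""
        y = []
        acc = 0.0
        for sample in x:
            acc += sample * dt
            y.append(acc)
        return y

    # A constant input of 1 integrates to a ramp: [1.0, 2.0, 3.0, 4.0, 5.0]
    print(integrate([1.0] * 5))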
Intensity: See Sound Intensity.
Interchannel Phase Deviation: The difference in timing between the left and right channel
sampling times of a stereo ADC or DAC.
Interleaving: In channel coding interleaving is used to enhance the performance of a coder over
a channel that is prone to error bursts. The basic idea behind interleaving is to spread a block of
coded bits over a large number of dispersed channel symbols to allow the correction of just a few
errors in each block in spite of the fact that many consecutive channel symbols are corrupted.
Interleaving is best illustrated by an example. The interleaving is accomplished by placing symbols
from each block into a separate column of an array and then transmitting the symbols sequentially
from the rows. For this block coding example, interleaving places symbols from separate blocks of
a single error correcting code next to each other. In this way, when a burst error of 3 consecutive
symbols occurs, all 3 symbols can be corrected because they come from separately coded blocks:

Coded input symbol stream:   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Interleaved symbol stream:   1 6 11 16 2 7 12 17 3 8 13 18 4 9 14 19 5 10 15 20

Note that in this example any three consecutive symbols in the interleaved stream come from three
separate single error correcting blocks. See also Channel Coding, Cross-Interleaved Reed-
Solomon Coding.
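A sketch of the block interleaver described above, reproducing the 20 symbol example (4 blocks of 5 symbols); the function name and block dimensions are illustrative:

    def interleave(symbols, rows, cols):
        """Write symbols into the columns of a rows x cols array, then read them out by rows."""
        assert len(symbols) == rows * cols
        # Column c holds one single-error-correcting block of 'rows' symbols.
        array = [[symbols[c * rows + r] for c in range(cols)] for r in range(rows)]
        return [s for row in array for s in row]

    # 20 symbols, 4 blocks (columns) of 5 symbols each, matching the example above.
    stream = list(range(1, 21))
    print(interleave(stream, rows=5, cols=4))
    # -> [1, 6, 11, 16, 2, 7, 12, 17, 3, 8, 13, 18, 4, 9, 14, 19, 5, 10, 15, 20]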
International Electrotechnical Commission (IEC): The IEC was founded in 1906 with the object
of promoting “international co-operation on all questions of standardization and related matters in
the fields of electrical and electronic engineering and thus to promote international understanding.”
The IEC is composed of a number of committees made up from members from the main industrial
countries of the world. The IEC publishes a wide variety of international standards and technical
reports.
The IEC works with other international organizations, particularly with the International Organisation for Standards
(ISO), and also with the European Committee for Electrotechnical Standardization (CENELEC).
Standards resulting from cooperations are often prefixed with the letters JTC - Joint Technical
Committee. Some of the JTC standards relevant to DSP are discussed under International
Organization for Standards.
More information on the IEC can be found at the WWW site http://133.82.181.177/ikeda/IEC/. See
also International Organization for Standards (ISO), International Telecommunication Union,
Standards.
International Mobile (Maritime) Satellite Organization (Inmarsat): Inmarsat provides mobile
satellite communications world-wide for the maritime community. This satellite communication
system supports services such as telephone, telex, facsimile, e-mail and data connections.
Inmarsat's compact land mobile telephones (an essential tool for workers in remote parts of the
world) can fit inside a briefcase and provide an excellent means of worldwide emergency
communications. The various communication modes of Inmarsat rely on powerful DSP systems
and the use of various coding standards.
International Organisation for Standards (ISO): ISO is not in fact an acronym for the
International Organisation for Standards; that would be IOS. “ISO” is a word derived from the
Greek word isos, meaning “equal” such as in words like isotropic or isosceles. However it is quite
commonplace for ISO to be assumed to be an acronym for International Standards Organisation,
which it is not! But, on average, only one out of two authors would care.
ISO is an autonomous organization established in 1947 to promote the development of
standardization worldwide. ISO standards essentially contain technical criteria and other detail to
ensure that the specification, design, manufacture and use of materials, products, processes and
services are fit for their purpose. One common example of standardization in everyday life is the
woodscrew, which should be produced to common ISO standards defining thread size, width, length,
etc. Another example is credit cards, which should all be produced according to ISO standard
widths, heights and lengths.
Standards on coding of audio and video are of particular relevance to DSP. ISO is made up of various
committees, sub-committees (SC) and working groups who oversee the definition of new
standards, and ensure that current standards maintain their relevance. Some of the work most
relevant to DSP is actually performed by joint technical committees (JTC) with other standards
organisations such as the International Electrotechnical Commission (IEC). The ISO/IEC JTC 1 is
on information technology and has the scope of standardization within established and emerging
areas of information technology. Some of the key subcommittees that have been set up include:
SC 1:   Vocabulary
SC 2:   Coded character sets
SC 6:   Telecommunications and information exchange between systems
SC 7:   Software engineering
SC 11:  Flexible magnetic media for digital data interchange
SC 14:  Data element principles
SC 15:  Volume and file structure
SC 17:  Identification cards and related devices
SC 18:  Document processing and related communication
SC 21:  Open systems interconnection, data management and open distributed processing
SC 22:  Programming languages, their environments and system software interfaces
SC 23:  Optical disk cartridges for information interchange
SC 24:  Computer graphics and image processing
SC 25:  Interconnection of information technology equipment
SC 26:  Microprocessor systems
SC 27:  IT Security techniques
SC 28:  Office equipment
SC 29:  Coding of audio, picture, multimedia and hypermedia information
SC 30:  Open electronic data interchange
Of most relevance to DSP is the work of SC 6, SC 24 and SC 29. SC29 is currently of particular interest
and is responsible for standards on “Coding of Audio, Picture, Multimedia and Hypermedia
Information”. SC29 is further subdivided into working groups (WG) which have already defined
various standards:
WG 1: Coding of Still Pictures
ISO/IEC 11 544: JBIG (Progressive Bi-level Compression)
ISO/IEC 10 918: JPEG (Continuous-tone Still Image)
Part 1: Requirement and Guidelines
Part 2: Compliance Testing
Part 3: Extensions
WG 11: Coding of Moving Pictures and Associated Audio
ISO/IEC 11 172: MPEG-1 (Moving Picture Coding up to 1.5 Mbit/s)
Part 1: Systems
Part 2: Video
Part 3: Audio
Part 4: Compliance Testing (CD)
Part 5: Technical Report on Software for ISO/IEC 11 172
ISO/IEC 13 818: MPEG-2 (Generic Moving Picture Coding)
Part 1: Systems (CD)
Part 2: Video (CD)
Part 3: Audio (CD)
Part 4: Compliance Testing
Part 5: Technical Report on Software for ISO/IEC 13 818
Part 6: Systems Extensions
Part 7: Audio Extensions
There is also work on MPEG-4 (Very-low Bitrate Audio-Visual Coding).
WG 12: Coding of Multimedia and Hypermedia Information
ISO/IEC 13 522: MHEG (Coding of Multimedia and Hypermedia Information)
Part 1: Base Notation (ASN.1) (CD)
Part 2: Alternate Notation (SGML) (WD)
Part 3: MHEG Extensions for Scripting: Language Support
More information on the ISO and ISO JTC standards can be found in the relevant ISO publications
which are summarized on http://www.iso.ch. See also International Electrotechnical Commission
(IEC), International Telecommunication Union (ITU), Standards.
International Standards Organization: See International Organisation for Standards.
International Telecommunication Union (ITU): The ITU is an agency of the United Nations who
operate a world-wide organization from which governments and private industry from various
countries coordinate the definition, implementation and operation of telecommunication networks
and services. The responsibilities of the ITU extend to regulation, standardization, coordination and
development of international telecommunications. They also have a general responsibility to ensure
the integration of the differing policies and systems in various countries. The headquarters of the
ITU is currently International Telecommunication Union, Place des Nations, CH-1211 Geneva 20,
Switzerland. They can be contacted on the world wide web at address http://www.itu.ch.
The recommendations and various standards of the ITU are divided into two key areas resulting
from the output of two advisory committees: (1) Telecommunication, denoted as ITU-T
recommendations (formerly known as CCITT); and (2) Radiocommunications, denoted as ITU-R recommendations (formerly known as CCIR). See also International Organisation for Standards,
ITU-R Recommendations, ITU-T Recommendations, Multimedia Standards, Standards.
Internet: The name given to the worldwide connection of computers, each having a unique
identifying internet number. The internet currently allows interchange of electronic mail, and general
computer files containing anything from text, images, and audio. Useful tools for navigating the
internet and exploring information available from other users on machines other than your own,
include ftp (file transfer protocol) Gopher, Netscape, Mosaic, and Lynx [169], etc.
Interpolation: Interpolation is the creation of intermediate discrete values between two samples of
a signal. For example, if 3 intermediate and equally spaced samples are created, then the sampling
rate has increased by a factor of 4. Interpolation is usually accomplished by first up-sampling to
insert zeroes between existing samples, and then filtering with a low pass digital filter.

[Figure] Interpolation of a 4 times oversampled signal by upsampling by 4 (zero insertion) and low pass
digital filtering: the baseband signal from the DSP processor passes through the upsampler, a digital
low pass filter, the oversampling DAC and finally a simple analog anti-alias (reconstruction) filter.
The interpolation process is essentially a technique whereby the reconstruction filtering is being
done partly in the analog domain and partly in the digital domain. Note that the digital oversampled
baseband signal will be delayed by the group delay, t_d, of the digital low pass filter (which is
usually linear phase).
Other types of curve fitting interpolators can also be produced, although these are less common.
Interpolators are widely found in digital audio systems such as CD players, where oversampling
filters (typically 4 ×’s) are used to increase the sampling rate in order to allow a simpler
reconstruction filter to be used at the output of the digital to analog converter (DAC). See also
Upsampling, Decimation, Downsampling, First Order Hold, Fractional Sampling Rate Conversion,
Zero Order Hold.
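A sketch of interpolation by zero insertion followed by low pass filtering; the filter length and test signal are arbitrary example choices:

    import numpy as np

    def interpolate(x, L, num_taps=81):
        """Interpolate by an integer factor L: zero insertion followed by a low pass FIR filter
        with cutoff at the original Nyquist frequency (an approximation to the ideal filter)."""
        up = np.zeros(len(x) * L)
        up[::L] = x                                  # insert L-1 zeros between existing samples
        n = np.arange(num_taps) - (num_taps - 1) / 2.0
        h = np.sinc(n / L) * np.hamming(num_taps)    # windowed sinc interpolation filter
        return np.convolve(up, h, mode='same')

    # 4 times interpolation of a slowly varying sinusoid (illustrative signal).
    y = interpolate(np.sin(2 * np.pi * 0.05 * np.arange(50)), L=4)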
Interrupt: Inside a DSP processor an interrupt will temporarily halt the processor and force it to
perform an interrupt routine. For example an interrupt may happen every 1/f_s seconds in order
that a DSP processor executes the interrupt service routine, whereby it reads the value from an
A/D converter at a rate of f_s samples every second.
Inverse, Matrix: See Matrix Operations - Inverse.
Inverse System Identification: Using adaptive filtering techniques, the approximate inverse of an
unknown filter, plant or data channel can be identified. In an adaptive signal processing inverse
system identification architecture, when the error, ε(k) has adapted to a minimum value (ideally
zero) then this means that in some sense y ( k ) ≈ s ( k ) , where s(k) is the input to the unknown
channel. Therefore the transfer function of the adaptive filter is now an approximate inverse of the
unknown system. Inverse system identification is widely used for equalizing data transmission
channels. See also Adaptive Filtering, Adaptive Line Enhancer, Equalisation, Least Mean Squares
Algorithm, System Identification.
[Figure: generic adaptive signal processing inverse system identification architecture. The input s(k)
drives the unknown system to give x(k), which is filtered by the adaptive filter to produce y(k); y(k)
is compared with a delayed version of s(k) to form the error ε(k), which drives the adaptive
algorithm.]
Inversion Lemma: See Matrix Properties - Inversion Lemma.
ITU-R Recommendations: The International Telecommunication Union (ITU) have produced a
very comprehensive set of regulatory, standardizing and coordination documents for
radiocommunication systems. The ITU-Radiocommunications (ITU-R) advisory committee are
responsible for the generation, upkeep and amendment of the ITU-R recommendations. These
recommendations are classified into various subgroups or series identified by the letters:
Series   Description
BO       Broadcasting satellite service (sound and television);
BR       Sound and television recording;
BS       Broadcasting service (sound);
BT       Broadcasting service (television);
F        Fixed service;
IS       Inter-service sharing and compatibility;
M        Mobile, radiodetermination, amateur and related satellite services;
PI       Propagation in ionized media;
PN       Propagation in non-ionized media;
RA       Radioastronomy;
S        Fixed satellite service;
SA       Space applications;
SF       Frequency sharing between the fixed satellite service and the fixed service;
SM       Spectrum management techniques;
SNG      Satellite news gathering;
TF       Time signals and frequency standards emissions;
V        Vocabulary and related subjects.
In addition to the ITU-R (radiocommunication) recommendations, there are also the ITU-T
(telecommunication) recommendations. See also International Organization for Standards,
International Telecommunication Union, ITU-T Recommendations, Standards.
ITU-T Recommendations: The International Telecommunication Union (ITU) have produced a
very comprehensive set of regulatory, standardizing and coordination documents for
telecommunication systems. The ITU-Telecommunications (ITU-T) advisory committee are
responsible for the generation, upkeep and amendment of the ITU-T recommendations. These
standards, definitions and recommendations are classified into various subgroups or series
identified by a letter:
A   Organization of the work of the ITU-T;
B   Means of expression (definitions, symbols, classification);
C   General telecommunication statistics;
D   General tariff principles;
E   Overall network operation (numbering, routing, network management, etc.);
F   Services other than telephone (ops, quality, service definitions and human factors);
G   Transmission systems and media, digital systems and networks;
H   Line transmission of non-telephone signals;
I   Integrated Services Digital Networks;
J   Transmission of sound programmes and television signals;
K   Protection against interference;
L   Construction, installation and protection of cable and other elements of outside plant;
M   Maintenance: international systems, telephone, telegraphy, fax & leased circuits;
N   Maintenance: international sound programme and television transmission circuits;
O   Specifications of measuring equipment;
P   Telephone transmission quality, telephone installations, local line networks;
Q   Switching and Signalling;
R   Telegraph transmission;
S   Telegraph services terminal equipment;
T   Terminal characteristics protocols for telematic services, document architecture;
U   Telegraph switching;
V   Data communication over the telephone network;
X   Data networks and open system communication;
Z   Programming languages.
These recommendations were formerly known as CCITT (the former name of the ITU) regulations,
and are available from the ITU (usually for a price) in published book form (20 volumes and 61
Fascicles), or electronic form (http://www.itu.ch). The book form is also sometimes referred to as
the “blue book”.
The work of the committee is clearly outlined in the A-series recommendations:
A.1    Presentation of contributions relative to the study of questions assigned to the ITU-T.
A.10   Terms and definitions.
A.12   Collaboration with the International Electrotechnical Commission (IEC) on the subject of definitions for telecommunications.
A.13   Collaboration with the IEC on graphical symbols and diagrams used in telecommunications.
A.14   Production, maintenance and publication of ITU-T terminology.
A.15   Elaboration and presentation of texts for Recommendations of the ITU Telecommunication Standardization Sector.
A.20   Collaboration with other international organizations over data transmission.
A.21   Collaboration with other international organizations on ITU-T defined telematic services.
A.22   Collaboration with other international organizations on information technology.
A.23   Collaboration with other international organizations on information technology, telematic services and data transmission.
A.30   Major degradation or disruption of service.
From a DSP algorithm and implementation perspective the G-series specifies a variety of
algorithms for audio digital signal coding and compression, the H-series specifies video
compression techniques and the V-series specifies modem data communications strategies
including echo cancellation, equalisation and data compression.
In addition to the ITU-T (telecommunication) recommendations, there are also the ITU-R
(radiocommunication) recommendations. See also G-Series Recommendations, H-Series
Recommendations, International Organization for Standards, International Telecommunication
Union, ITU-R Recommendations, MPEG, Standards, V-Series Recommendations.
i860: Intel’s powerful RISC processor which has been used in many DSP applications.
J
j: The electrical engineering representation of √(-1), the imaginary number that mathematicians
denote as “i”. However, electrical engineers use “i” to denote current.
JND: Just Noticeable Difference. See Difference Limen.
J-Series Recommendations: The J-series telecommunication recommendations from the
International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T and formerly known as CCITT) provide standards for transmission of sound programme and
television signals. Some of the current recommendations (http://www.itu.ch) include:
J.11   Hypothetical reference circuits for sound-programme transmissions.
J.12   Types of sound-programme circuits established over the international telephone network.
J.13   Definitions for international sound-programme circuits.
J.14   Relative levels and impedances on an international sound-programme connection.
J.15   Lining-up and monitoring an international sound-programme connection.
J.16   Measurement of weighted noise in sound-programme circuits.
J.17   Pre-emphasis used on sound-programme circuit.
J.18   Crosstalk in sound-programme circuits set up on carrier systems.
J.19   A conventional test signal simulating sound-programme signals for measuring interference in other channels.
J.21   Performance characteristics of 15 kHz-type sound-programme circuits - circuits for high quality monophonic and stereophonic transmissions.
J.23   Performance characteristics of 7 kHz type (narrow bandwidth) sound-programme circuits.
J.31   Characteristics of equipment and lines used for setting up 15 kHz type sound-programme circuits.
J.33   Characteristics of equipment and lines used for setting up 6.4 kHz type sound-programme circuits.
J.34   Characteristics of equipment used for setting up 7 kHz type sound-programme circuits.
J.41   Characteristics of equipment for the coding of analogue high quality sound-programme signals for transmission on 384 kbit/s channels.
J.42   Characteristics of equipment for the coding of analogue medium quality sound-programme signals for transmission on 384 kbit/s channels.
J.43   Characteristics of equipment for the coding of analogue high quality sound-programme signals for transmission on 320 kbit/s channels.
J.44   Characteristics of equipment for the coding of analogue medium quality sound-programme signals for transmission on 320 kbit/s channels.
J.51   General principles and user requirements for the digital transmission of high quality sound programmes.
J.52   Digital transmission of high-quality sound-programme signals using one, two, or three 64 kbit/s channels per mono signal (and up to six per stereo signal).
J.61   Transmission performance of television circuits designed for use in international connections.
J.62   Single value of the signal-to-noise ratio for all television systems.
J.63   Insertion of test signals in the field-blanking interval of monochrome and colour television signals.
J.64   Definitions of parameters for simplified automatic measurement of television insertion test signals.
J.65   Standard test signal for conventional loading of a television channel.
J.66   Transmission of one sound programme associated with analogue television signal by means of time division multiplex in the line synchronizing pulse.
J.67   Test signals and measurement techniques for transmission circuits carrying MAC/packet signals for HD-MAC signals.
J.73   Use of a 12-MHz system for the simultaneous transmission of telephony and television.
J.74   Methods for measuring the transmission characteristics of translating equipments.
J.75   Interconnection of systems for television transmission on coaxial pairs and on radio-relay links.
J.77   Characteristics of the television signals transmitted over 18 MHz and 60 MHz systems.
J.80   Transmission of component-coded digital television signals for contribution-quality applications at bit rates near 140 Mbit/s.
J.81   Transmission of component-coded television signals for contribution-quality applications at the third hierarchical level of ITU-T Recommendation G.702.
J.91   Technical methods for ensuring privacy in long-distance international television transmission.
For additional detail consult the appropriate standard document or contact the ITU. See also
International Telecommunication Union, ITU-T Recommendations, Standards.
Joint Bi-level Image Group (JBIG): JBIG is the name for a lossless compression algorithm for
binary (one bit/pixel) images which results from the International Organization for Standards (ISO)
sub-committee (SC) 29 which is responsible for standards on “Coding of Audio, Picture, Multimedia
and Hypermedia Information”. Working Group (WG) 1 of SC29 (ISO/IEC JTC1/SC29/WG1)
considered the problem of coding of still binary images and produced a joint standard with the
International Electrotechnical Commission (IEC): ISO/IEC 11544 - JBIG (Progressive Bi-level
Compression).
JBIG is intended to replace the current (and less effective) Group 3 and 4 fax algorithms which are
primarily used for document text transmission (i.e., Fax). JBIG achieves compression by modelling
the redundancy in the image as the correlations of the pixel currently being coded with a set of
nearby pixels using arithmetic coding techniques. See also JPEG, MPEG Standards, Standards.
Joint Photographic Experts Group (JPEG): JPEG is the general name for a lossy compression
algorithm for continuous tone still images. JPEG is the original name of the committee who drafted
the standard for the International Organization for Standards (ISO) sub-committee (SC) 29 which
is responsible for standards on “Coding of Audio, Picture, Multimedia and Hypermedia Information”.
Working Group (WG) 1 (ISO/IEC JTC1/SC29/WG1) considered the problem of coding of continuous
tone still images and produced the JPEG joint standard with the International Electrotechnical Commission
(IEC): ISO/IEC 10918 - JPEG (Continuous Tone Still Image).
JPEG is designed for compressing full 24 bit colour or gray-scale digital images of “natural” (real-world) scenes (as opposed to, for example, complex geometrical patterns). JPEG does not cater
for motion picture compression (see MPEG) or for black and white image compression (see JBIG),
where it does not cope well with edges formed at black-white boundaries. The primary compression
scheme in JPEG consists of a two dimensional discrete cosine transform (DCT) of image blocks, a
coefficient quantizer, a zig-zag scan of the quantized DCT coefficients (that has probably produced
long runs of zeros) that is subsequently run-length encoded by a Huffman code designed for a set
of training image zig-zag scan fields [39]. JPEG is a lossy algorithm; however, most of the
compression is achieved by exploiting known limitations of the human eye, for example that small
colour details are not perceived by the eye and brain as well as small details of light and dark.
The degree of information loss from JPEG compression can be varied by adjusting the values of
certain compression parameters. Therefore file size can be traded off against image quality, which
will of course depend on the actual application. Extremely small files (thumbnails) can be produced
using JPEG which are useful for icons or image indexing and archive purposes.
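A sketch of the first two stages of the scheme described above (block DCT followed by coefficient quantisation); the single quantisation step used here is a crude stand-in for the JPEG quantisation tables, and the block values are random example data:

    import numpy as np

    def dct2(block):
        """Naive orthonormal 8x8 2-D DCT-II, the transform at the heart of JPEG."""
        N = block.shape[0]
        n = np.arange(N)
        C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))  # C[k, n]
        C[0, :] *= 1 / np.sqrt(2)
        C *= np.sqrt(2.0 / N)
        return C @ block @ C.T

    block = np.random.randint(0, 256, (8, 8)).astype(float) - 128  # level-shifted image block
    coeffs = dct2(block)
    quantised = np.round(coeffs / 16.0)   # crude uniform quantiser (JPEG uses per-coefficient tables)
    # Zig-zag scanning and run-length/Huffman coding of 'quantised' would follow in a full codec.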
The ITU-T T-series standards T.80 - T.83 are similar to JPEG:
• T.80   Common components for image compression and communication; basic principles.
• T.81   Digital compression and encoding of continuous tone still images.
• T.82   Progressive compression techniques for bi-level images.
• T.83   Compliance testing.
Additional information is available from the independent JPEG group at [email protected]. JPEG software and file specifications are available from a number of FTP
sites, including ftp://ftp.uu.net:/graphics/jpeg. See also JBIG, MPEG, Standards, T-Series
Recommendations.
Joint Stereo Coding: When compressing high fidelity stereo audio, higher levels of compression can
be obtained by exploiting the commonalities between the audio on the left and right channels than
would be gained by compressing the left and right channels independently. MPEG-Audio has a joint
stereo coding facility. See Compression, Moving Picture Experts Group (MPEG) - Audio.
Just (Music) Scale: A few hundred years ago, prior to the existence of the equitempered or
Western music scale, a (major) musical key was formed from using certain carefully chosen
frequency ratios between adjacent notes, rather than the constant tone and semitone ratios of the
modern Western music scale. The C-major just scale would have had the following frequency
ratios:
C-major scale:    C     D     E     F     G     A     B     C
Frequency ratio:  1/1   9/8   5/4   4/3   3/2   5/3   15/8  2/1

The frequency ratio gives the ratio of the fundamental frequency of the current note to that of
the root note. The above ratios correspond to the Just Music Scale.
Any note can be used to realise a just major key or scale. However using the just scale it is difficult
to form other major or minor keys without a complete retuning of the instrument as all of the
fundamental frequencies in other keys are different. Instruments that are tuned and played using
the just scale will probably sound in some sense “medieval” as our modern appreciation of music
is now firmly based on the equitempered Western music scale. See also Digital Audio, Music, Music
Synthesis, Pythagorean Scale, Western Music Scale.
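A short sketch converting the ratios in the table above into note frequencies; the root frequency of 264 Hz is an arbitrary example value:

    # Frequencies of a just-intonation C major scale built on a hypothetical root of 264 Hz.
    ratios = {'C': 1/1, 'D': 9/8, 'E': 5/4, 'F': 4/3, 'G': 3/2, 'A': 5/3, 'B': 15/8, 'C (octave)': 2/1}
    root = 264.0
    for note, ratio in ratios.items():
        print(f"{note}: {root * ratio:.1f} Hz")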
Just Noticeable Difference: See Difference Limen.
K
k: ”k” (along with “i” and “n”) is often used as a discrete time index in DSP notation. It is also
often used as the frequency index in the DFT. See Discrete Time, Discrete Fourier Transform.
Karaoke DSP: For professionally recorded stereo music on CDs, DATs and so on, the vocal track,
v ( k ) , of a song is usually centered on the left and right channels, i.e. the same signal in the left
track L ( k ) and the right track R ( k ) which is perceived as coming from between the two
loudspeakers if the listener is sitting equidistant from both. The musical instruments are likely to be
laid out in some off-centre set up which means that they are unlikely to be identical signals on both
left and right channels, i.e.:
Left = L ( k ) = v ( k ) + M L ( k )
Right = R ( k ) = v ( k ) + M R ( k )
(222)
By digitally subtracting the left and right channels:
L ( k ) – R ( k ) = ML ( k ) – MR ( k )
(223)
the vocal track may be somewhat attenuated, enabling the song to be played with the vocals deemphasised by a few dBs, all ready for the bellowing tones of a Karaoke singer! See My Way by
Frank Sinatra.
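A toy sketch of equation (223): subtracting the right channel from the left cancels any component common to both channels, such as a centre-panned vocal (the signals below are invented examples):

    import numpy as np

    def karaoke(left, right):
        """L(k) - R(k) = M_L(k) - M_R(k): anything common to both channels cancels."""
        return np.asarray(left) - np.asarray(right)

    # The 'vocal' v is identical in both channels and so cancels exactly in this toy example.
    v = np.sin(2 * np.pi * 0.01 * np.arange(100))
    mono_out = karaoke(left=v + np.random.randn(100), right=v + np.random.randn(100))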
Knee: The knee is the part of a magnitude-frequency graph of a filter, where the transition from
passband to stopband is made. A soft knee is where the transition realises a filter with very low roll-off, and a harder knee approaches the ideal filter. See also Roll-off.
[Figure: filter magnitude responses, 10log10(Vout/Vin) in dB against log10(f), from 0 dB down to
-60 dB over the range 0.1f3dB to 1000f3dB. Soft knee: roll-off of 20dB/decade from a simple first
order RC circuit. Hard knee: roll-off of 80dB/decade using a 4th order active filter.]
Khoros: Khoros is a block diagram simulator for image and video processing which runs on a
variety of computer platforms such as Sun workstations.
Kronecker Impulse, or Kronecker Delta Function: See Unit Impulse Function.
Kronecker Product: See Matrix Operations - Kronecker Product.
L
LA (Linear Arithmetic) Synthesis: A technique for synthesis of the sound of musical instruments
[32]. See also Music, Music Synthesis.
LabView: A software package from National Instruments Inc. which allows powerful PC based
DSP instrumentation front-ends to be designed. LabView also convincingly presents the Virtual
Instrument concept. See also Virtual Instrument.
Laplace: A mathematical transform used for the analysis of analog systems.
Laplacian: A probability distribution that is often used to model the differences between adjacent
pixels in an image.
Lateralization: Lateralization refers to a psychoacoustics task in which a sound is determined to
be at some point within the head, either near one ear or the other along a line separating the two
ears. Very much like localization, lateralization differs in that the sound source is perceived within
the head rather than outside of the head. The common experience of listening to stereophonic
music via headphones (lateralization) versus listening to the same music via loud speakers in a
normal room (localization) emphasizes the difference between the two tasks. See also Localization.
Law of First Wavefront: In a reverberant environment the sound energy received by the direct
path can be very much lower than the energy received by indirect reflective paths. However the
human ear is still able to localize the sound location correctly by localizing the first components of
the signal to arrive. Later echoes arriving at the ear increase the perceived loudness of the sound
as they will have the same general spectrum. This psychoacoustic effect is sometimes known as
the law of the first wavefront or the Haas effect, and more commonly the precedence effect. The
precedence effect applies mainly to short duration sounds or those of a discontinuous or varying
form. See also Ear, Lateralization, Source Localization, Threshold of Hearing.
LDU: See Matrix Decompositions - LDU Decomposition.
Leaky LMS: See Least Mean Squares Algorithm Variants.
Least Mean Squares (LMS) Algorithm: The LMS is an adaptive signal processing algorithm that
is very widely used in adaptive signal processing applications such as system identification, inverse
system identification, noise cancellation and prediction. The LMS algorithm is very simple to
implement in real time and in the mean will adapt to a neighborhood of the Wiener-Hopf least mean
square solution. The LMS algorithm can be summarised as follows:
To derive the LMS algorithm, first consider plotting the mean squared error (MSE) performance
surface (i.e. E { e 2 ( k ) } as a function of the weight values) which gives an N+1-dimensional
hyperparaboloid which has one minimum. It is assumed that x(k) (the input data sequence) and
d(k) (a desired signal) are wide sense stationary signals (see Wiener-Hopf Equations).

In the generic adaptive filtering architecture an adaptive FIR filter, w(k), filters the input x(k) to
produce the output y(k), which is subtracted from the desired signal d(k) to give the error e(k):

y(k) = \sum_{n=0}^{N-1} w_n(k) x(k-n) = \mathbf{w}^T(k)\mathbf{x}(k)

where

\mathbf{x}(k) = [x(k), x(k-1), x(k-2), \ldots, x(k-N+2), x(k-N+1)]^T
\mathbf{w}(k) = [w_0(k), w_1(k), w_2(k), \ldots, w_{N-2}(k), w_{N-1}(k)]^T
e(k) = d(k) - y(k) = d(k) - \mathbf{w}^T(k)\mathbf{x}(k)
\mathbf{w}(k+1) = \mathbf{w}(k) + 2\mu e(k)\mathbf{x}(k)

The aim can intuitively be described as adapting the impulse response of the FIR digital filter such
that the input signal x(k) is filtered to produce y(k) which, when subtracted from the desired signal
d(k), will minimise the error signal e(k). If the filter weights are updated using the LMS weight
update then the adaptive FIR filter will adapt to the minimum mean squared error, assuming d(k)
and x(k) to be wide sense stationary signals.

For discussion and illustration purposes the three dimensional paraboloid for a two weight FIR filter
can be drawn:
[Figure] The mean square error (MSE) performance surface, E{e²(k)}, for a two weight FIR filter,
showing adaption paths for a large step size µ and a small step size µ. The Wiener-Hopf solution is
denoted as w_k(opt), which denotes where the minimum MSE (MMSE) occurs. The gradient based
LMS algorithm will (on average) adapt towards the MMSE by taking “jumps” in the direction of the
negative of the gradient of the surface (therefore “downhill”).
To find the minimum mean squared error (MMSE) we can use the Wiener Hopf equation, however
this is an expensive solution in computation terms. As an alternative we can use gradient based
techniques, whereby we can traverse down the inside of the parabola by using an iterative algorithm
which always updates the filter weights in the direction opposite of the steepest gradient. The
iterative algorithm is often termed gradient descent and has the form:
\mathbf{w}(k+1) = \mathbf{w}(k) + \mu(-\nabla_k)        (224)
where ∇ k is the gradient of the performance surface:
\nabla_k = \frac{\partial E\{e^2(k)\}}{\partial\mathbf{w}(k)} = 2\mathbf{R}\mathbf{w}(k) - 2\mathbf{p}        (225)
where R is the correlation matrix, p is cross correlation vector (see Correlation Matrix and Cross
Correlation Vector) and µ is the step size (used to control the speed of adaption and the achievable
minimum or misadjustment). In the above figure a small step size “jumps” in small steps towards
the minimum and is therefore slow to adapt; however the small jumps mean that it will arrive very
close to the MMSE and continue to jump back and forth close to the minimum. For a large step size
the jumps are larger and adaption to the MMSE is faster; however when the weight vector reaches
the bottom of the bowl it will jump back and forth around the MMSE with a larger magnitude than for
the small step size. The error caused by the traversing of the bottom of the bowl is usually called
the excess mean squared error (EMSE).
To calculate the MSE performance surface gradient directly is (like the Wiener Hopf equation) very
expensive as it requires that both R, the correlation matrix and p, the cross correlation vector are
known (see Wiener-Hopf Equations). In addition, if we knew R and p, we could directly compute the
optimum weight vector. But in general, we do not have access to R and p. Therefore a subtle
innovation, first defined for DSP by Widrow et al [152], was to replace the actual gradient with an
instantaneous (noisy) gradient estimate. One approach to generating this noisy gradient estimate
is to take the gradient of the actual squared error (versus the mean squared error), i.e.
\hat{\nabla}_k = \frac{\partial e^2(k)}{\partial\mathbf{w}(k)} = 2e(k)\frac{\partial e(k)}{\partial\mathbf{w}(k)} = -2e(k)\frac{\partial y(k)}{\partial\mathbf{w}(k)} = -2e(k)\mathbf{x}(k)        (226)

Therefore using this estimated gradient, \hat{\nabla}_k, in the gradient descent equation yields the LMS
algorithm:
\mathbf{w}(k+1) = \mathbf{w}(k) + 2\mu e(k)\mathbf{x}(k)        (227)
The LMS is very straightforward to implement and only requires N multiply-accumulates (MACs) to
perform the FIR filtering, and N MACs to implement the LMS equation. A typical signal flow graph
for the LMS is shown below:
[Figure] A simple signal flow graph for an adaptive FIR filter with weights w_0, w_1, w_2, ..., w_{N-2},
w_{N-1}, input x(k), output y(k), desired signal d(k) and error e(k), where the adaptive nature of the
filter weights (LMS weight update w(k+1) = w(k) + 2µe(k)x(k)) is explicitly illustrated.
The LMS is very widely used in many applications such as telecommunications, noise control,
control systems, biomedical DSP, and so on. Its properties have been very widely studied and a
good overview can be found in [77], [53].
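A compact sketch of the LMS update of equation (227), applied to a system identification problem of the kind discussed below; the filter length and step size are example values consistent with the practical bound in equation (228):

    import numpy as np

    def lms(x, d, N=20, mu=0.01):
        """Adaptive FIR filter trained with the LMS update w(k+1) = w(k) + 2*mu*e(k)*x(k)."""
        w = np.zeros(N)
        e = np.zeros(len(x))
        for k in range(N, len(x)):
            xk = x[k - N + 1:k + 1][::-1]      # x(k), x(k-1), ..., x(k-N+1)
            y = np.dot(w, xk)                  # filter output y(k)
            e[k] = d[k] - y                    # error signal e(k)
            w = w + 2 * mu * e[k] * xk         # LMS weight update
        return w, e

    # System identification: identify a 20 weight unknown FIR system from its input and output.
    unknown = np.random.randn(20)
    x = np.random.randn(5000)                              # unit power input, so mu < 1/20
    d = np.convolve(x, unknown, mode='full')[:len(x)]      # desired signal from the unknown system
    w, e = lms(x, d, N=20, mu=0.01)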
From a practical implementation point of view the algorithm designer must carefully choose the filter
length to suit the application. In addition, the step size must be chosen to ensure stability and a good
convergence rate. For the LMS upper and lower bounds for the adaptive step size can be calculated
as:
1
-≅
0 < µ < ---------------------------NE { x 2 ( k ) }
1
0 < µ < ------------------------------------------------------------N { Input Signal Power }
(228)
A more formal bound can be defined in terms of the eigenvalues of the input signal correlation
matrix [53]. However for practical purposes these values are not calculated and the above practical
bound is used (see Least Mean Squares Algorithm Convergence).
In general the speed of adaption is proportional to the step size, and the excess MSE or
steady state error is proportional to the step size. A simple example of a 20 weight FIR filter being
used to identify an unknown filter (i.e., system identification) was simulated to produce the error
plots below for two different step sizes of 0.001 and 0.01:
[Plots of e(k) and 20log|e(k)| (dB) against time index k for the two step sizes.] Adapting with a step
size of µ = 0.001 the error signal e(k) adapts slowly; however the steady state error of about -35dB
that is reached is about 10dB smaller than for the larger step size of µ = 0.01. Adapting with a step
size of µ = 0.01 the error signal e(k) adapts quickly; however the steady state error of about -25dB
that is reached is about 10dB larger than for the smaller step size of µ = 0.001.
Clearly a trade-off exists -- once again the responsibility of choosing this parameter is in the domain
of the algorithm designer. See also Acoustic Echo Cancellation, Active Noise Control, Adaptive Line
Enhancer, Adaptive Signal Processing, Adaptive Step Size, Correlation Matrix, Correlation Vector,
Echo Cancellation, Least Mean Squares Algorithm Convergence, Least Mean Squares Algorithm
Misadjustment/Algorithm/IIR Algorithms/Time Constant/ Variants, Least Mean Squares Filtered-X
Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares, Wiener-Hopf Equations,
Volterra Filter.
Least Mean Squares (LMS) Algorithm Convergence: It can be shown that the (noisy) gradient
estimate used in the LMS algorithm (see Least Mean Squares Algorithm) is an unbiased estimate
of the true gradient:
E\{\hat{\nabla}_k\} = E[-2e(k)\mathbf{x}(k)] = E[-2(d(k) - \mathbf{w}^T(k)\mathbf{x}(k))\mathbf{x}(k)] = 2\mathbf{R}\mathbf{w}(k) - 2\mathbf{p} = \nabla_k        (229)
where we have assumed that w ( k ) and x ( k ) are statistically independent.
It can be shown that in the mean the LMS will converge to the Wiener-Hopf solution if the step size,
µ, is limited by the inverse of the largest eigenvalue. Taking the expectation of both sides of the LMS
equation gives:
E\{\mathbf{w}(k+1)\} = E\{\mathbf{w}(k)\} + 2\mu E[e(k)\mathbf{x}(k)]
                     = E\{\mathbf{w}(k)\} + 2\mu\left(E[d(k)\mathbf{x}(k)] - E[(\mathbf{x}(k)\mathbf{x}^T(k))\mathbf{w}(k)]\right)        (230)

and again assuming that w(k) and x(k) are statistically independent:

E\{\mathbf{w}(k+1)\} = E\{\mathbf{w}(k)\} + 2\mu(\mathbf{p} - \mathbf{R}E\{\mathbf{w}(k)\})
                     = (\mathbf{I} - 2\mu\mathbf{R})E\{\mathbf{w}(k)\} + 2\mu\mathbf{R}\mathbf{w}_{opt}        (231)
where \mathbf{w}_{opt} = \mathbf{R}^{-1}\mathbf{p} and \mathbf{I} is the identity matrix. Now, defining \mathbf{v}(k) = \mathbf{w}(k) - \mathbf{w}_{opt}, we can
rewrite the above in the form:

E\{\mathbf{v}(k+1)\} = (\mathbf{I} - 2\mu\mathbf{R})E\{\mathbf{v}(k)\}        (232)
For convergence of the LMS to the Wiener-Hopf, we require that w ( k ) → w opt as k → ∞ , and
therefore v ( k ) → 0 as k → ∞ . If the eigenvalue decomposition of R is given by Q T ΛQ , where
Q T Q = I and Λ is a diagonal matrix then writing the vector v ( k ) in terms of the linear
transformation Q, such that E { v ( k ) } = Q T E { u ( k ) } and multiplying both sides of the above
equation, we realise the decoupled equations:
E { u ( k + 1 ) } = ( I – 2µΛ )E { u ( k ) }
(233)
E { u ( k + 1 ) } = ( I – 2µΛ ) k E { u ( 0 ) }
(234)
and therefore:
where ( I – 2µΛ ) is a diagonal matrix:
(\mathbf{I} - 2\mu\Lambda) = \mathrm{diag}(1-2\mu\lambda_0,\; 1-2\mu\lambda_1,\; 1-2\mu\lambda_2,\; \ldots,\; 1-2\mu\lambda_{N-1})        (235)
For convergence of this equation to the zero vector, we require that
(1 - 2\mu\lambda_n)^k \to 0 \text{ as } k \to \infty, \quad \text{for all } n = 0, 1, 2, \ldots, N-1        (236)
Therefore the step size, µ, must cater for the largest eigenvalue, λ max = max ( λ 0, λ 1, λ 2, …, λ N – 1 )
such that \left|1 - 2\mu\lambda_{max}\right| < 1, and therefore:

0 < \mu < \frac{1}{\lambda_{max}}        (237)
This bound is a necessary and sufficient condition for convergence of the algorithm in the mean
square sense. However, this bound is not convenient to calculate, and hence not particularly useful
for practical purposes. A more useful sufficient condition for bounding µ can be found using the
linear algebraic result that:
\mathrm{trace}[\mathbf{R}] = \sum_{n=0}^{N-1}\lambda_n        (238)
i.e. the sum of the diagonal elements of the correlation matrix R, is equal to the sum of the
eigenvalues, then the inequality:
\lambda_{max} \le \mathrm{trace}[\mathbf{R}]        (239)
will hold. However if the signal x ( k ) is wide sense stationary, then the diagonal elements of the
correlation matrix, R, are E { x 2 ( k ) } which is a measure of the signal power. Hence:
\mathrm{trace}[\mathbf{R}] = N E\{x^2(k)\} = N\langle\text{Signal Power}\rangle        (240)
and the well known LMS stability bound (sufficient condition) of:
0 < \mu < \frac{1}{N E\{x^2(k)\}}        (241)
is the practical result. See also Adaptive Signal Processing, Least Mean Squares Algorithm, Least
Mean Squares Algorithm Misadjustment, Least Mean Squares Algorithm Time Constant, Wiener-Hopf Equations.
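An illustrative numerical check of the two bounds (the filter length and colouring filter below are arbitrary example choices): since trace[R] ≥ λ_max, the practical bound of equation (241) is always at least as conservative as the exact bound of equation (237).

    import numpy as np

    N = 8
    x = np.random.randn(100000)
    x = np.convolve(x, [1.0, 0.8], mode='same')   # correlated (coloured) input signal

    # Estimate the N x N Toeplitz input correlation matrix R from the autocorrelation sequence.
    r = np.array([np.mean(x[lag:] * x[:len(x) - lag]) for lag in range(N)])
    R = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])

    lam_max = np.max(np.linalg.eigvalsh(R))
    print("exact bound      1/lambda_max =", 1.0 / lam_max)
    print("practical bound  1/(N*power)  =", 1.0 / np.trace(R))   # trace(R) = N*E{x^2(k)}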
Least Mean Squares (LMS) Algorithm Misadjustment: Misadjustment is a term used in
adaptive signal processing to indicate how close the achieved mean squared error (MSE) is to the
minimum mean square error. It is defined as the ratio of the excess MSE, to the minimum MSE, and
therefore gives a measure of how well the filter can adapt. For the LMS:
\text{Misadjustment} = \frac{\text{excess MSE}}{\text{MMSE}} \approx \mu\,\mathrm{trace}[\mathbf{R}] \approx \mu N\langle\text{Signal Power}\rangle        (242)
Therefore misadjustment from the MMSE solution is proportional to the LMS step size, the filter
length, and the signal input power of x ( k ) . See also Adaptive Signal Processing, Least Mean
Squares Algorithm, Least Mean Squares Algorithm Convergence, Least Mean Squares Algorithm
Time Constant, Wiener-Hopf Equations.
Least Mean Squares (LMS) Algorithm Time Constant: The speed of convergence to a steady
state error (expressed as an exponential time constant) can be precisely defined in terms of the
eigenvalues of the correlation matrix, R (see Least Mean Squares Algorithm Convergence). A
commonly used (if less accurate) measure is given by:
\tau_{mse} \approx \frac{N}{4\mu\,\mathrm{trace}[\mathbf{R}]} = \frac{1}{4\mu\langle\text{Signal Power}\rangle}        (243)
Therefore the adaption time constant is proportional to the inverse of the signal power and the inverse
of the step size. A large step size will adapt quickly but with a large MSE, whereas a small step size
will adapt slowly but achieve a small MSE. The design trade-off to select µ, is a requirement of the
algorithm designer, and will, of course, depend of the particular application. See also Adaptive
Signal Processing, Adaptive Step Size, Least Mean Squares Algorithm, Least Mean Squares
Algorithm Convergence, Least Mean Squares Algorithm Misadjustment, Wiener-Hopf Equations.
Least Mean Squares (LMS) Algorithm Variants: A number of variants of the LMS exist. These
variants can be split into three families: (1) algorithms derived to reduce the computation
requirements compared to the standard LMS; (2) algorithms derived to improve the convergence
properties over the standard LMS; (3) modifications of the LMS to allow a more efficient
implementation.
In order to reduce computational requirements, the sign-error, sign-data and sign-sign LMS
algorithms circumvent multiplies and replace them with shifting operations (which are essentially
power of two multiplies or divides). The relevance of the sign variants of the standard LMS however
is now somewhat dated due to the low cost availability of modern DSP processors where a multiply
can be performed in the same time as a bit shift (and faster than multiple bit shifts). The
convergence speed and achievable mean squared error for all of the sign variants of the LMS are less desirable than for the standard LMS algorithm.
To improve convergence speed and stability properties, and to ensure a small excess mean squared error, the normalized, the leaky and the variable step size LMS algorithms have been developed. A summary of some of the LMS variants is given below (a short code sketch of two of the variants follows this entry):
• Delay LMS: The delay LMS simply delays the error signal in order that a “systolic” timed application
specific circuit can be implemented:
w ( k + 1 ) = w ( k ) + 2µe ( k – n )x ( k – n )
(244)
Note that the delay-LMS is in fact a special case of the more general filtered-X LMS.
• Filtered-X LMS: See Least Mean Squares Filtered-X Algorithm.
• Filtered-U LMS: See Active Noise Control.
• Infinite Impulse Response (IIR) LMS: See Least Mean Squares - IIR Filter Algorithms.
• Leaky LMS: A leakage factor, c, can be introduced to improve the numerical behaviour of the standard
LMS:
w ( k + 1 ) = cw ( k ) + 2µe ( k )x ( k )
(245)
By continually leaking the weight vector, w(k), even if the algorithm has found the minimum mean squared error solution it will need to continue adapting to compensate for the error introduced by the
leakage factor. The advantage of the leakage is that the sensitivity to potentially destabilizing round off
errors is reduced. In addition, in applications where the input occasionally becomes very small, leaky LMS
drives the weights toward zero (this can be an advantage in noise cancelling applications). However the
disadvantage to leaky LMS is that the achievable mean squared error is not as good as for the standard
LMS. Typically c has a value between 0.9 (very leaky) and 1 (no leakage).
• Multichannel LMS: See [68].
• Newton LMS: This algorithm improves the convergence properties of the standard LMS. There is quite a high computational overhead to calculate the matrix-vector product (and, possibly, the estimate of the inverse correlation matrix R⁻¹) at each iteration:

w(k+1) = w(k) + 2µR⁻¹e(k)x(k)    (246)
• Normalised Step Size LMS: The normalised LMS calculates an approximation of the signal input power at each iteration and uses this value to ensure that the step size is appropriate for rapid convergence. The normalized step size, µ_n, is therefore time varying. The normalised LMS is very useful in situations where the input signal power fluctuates rapidly and the input signal is a slowly varying non-stationary process:

w(k+1) = w(k) + 2µ_n e(k)x(k),   µ_n = 1/(ε + ||x(k)||²)    (247)

where ε is a small constant to ensure that in conditions of a zero input signal, x(k), a divide by zero does not occur, and ||x(k)|| is the 2-norm of the vector x(k).
• Sign Data/Regressor LMS: The sign data (or regressor) LMS was first developed to reduce the number of multiplications required by the LMS. The step size, µ, is carefully chosen to be a power of two and only bit shifting multiplies are required:

w(k+1) = w(k) + 2µe(k)·sign[x(k)],   sign[x(k)] = { 1, x(k) > 0;  0, x(k) = 0;  −1, x(k) < 0 }    (248)
• Sign Error LMS: The sign error LMS was first developed to reduce the number of multiplications required by the LMS. The step size, µ, is carefully chosen to be a power of two and only bit shifting multiplies are required:

w(k+1) = w(k) + 2µ·sign[e(k)]·x(k),   sign[e(k)] = { 1, e(k) > 0;  0, e(k) = 0;  −1, e(k) < 0 }    (249)

• Sign-Sign LMS: The sign-sign LMS was first presented in 1966 to reduce the number of multiplications required by the LMS. The step size, µ, is carefully chosen to be a power of two and only bit shifting multiplies are required:

w(k+1) = w(k) + 2µ·sign[e(k)]·sign[x(k)],   sign[z(k)] = { 1, z(k) > 0;  0, z(k) = 0;  −1, z(k) < 0 }    (250)
• Variable Step Size LMS: The variable step size LMS was developed in order that when the LMS
algorithm first starts to adapt, the step size is large and convergence is fast. However as the error reduces
the step size is automatically decreased in magnitude in order that smaller steps can be taken to ensure
that a small excess mean squared error is achieved:
w(k+1) = w(k) + 2µ_v e(k)x(k),   µ_v ∝ E{e²(k)}    (251)

Alternatively variable step size algorithms can be set up with deterministic schedules for the modification of the step size. For example:

w(k+1) = w(k) + 2µ_v e(k)x(k),   µ_v = µ·2^(−int(λk))    (252)
such that as time, k, passes the step size, µ v , gets smaller in magnitude. µ is the step size calculated for
the standard LMS, λ is a positive constant, and int ( λk ) is the closest integer to λk .
Note that a hybrid of more than one of the above LMS algorithm variants could also be
implemented. See also Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean
Squares IIR Algorithms, Recursive Least Squares.
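As a rough illustration of two of the variants above, the sketch below (a minimal example, not from the original text; the weights, signals and step sizes are arbitrary) applies one weight update of the sign-sign LMS (Eq. 250) and of the normalised step size LMS (Eq. 247):

    import numpy as np

    M = 8                                  # number of weights (assumed)
    w = np.zeros(M)
    x_vec = np.random.randn(M)             # current input vector x(k)
    d = 0.5                                # desired signal d(k) (arbitrary)
    e = d - w @ x_vec                      # error e(k)

    # Sign-sign LMS, Eq. 250 (mu is usually a power of two so the multiply is a shift)
    mu = 2.0**-6
    w_ss = w + 2 * mu * np.sign(e) * np.sign(x_vec)

    # Normalised step size LMS, Eq. 247
    eps = 1e-6                             # guards against divide by zero
    mu_n = 1.0 / (eps + np.dot(x_vec, x_vec))
    w_nlms = w + 2 * mu_n * e * x_vec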
Least Mean Squares (LMS) Filtered-X Algorithm: In certain control applications the adaptive
architecture has a transfer function at the output of the adaptive filter:
[Figure: block diagram of an adaptive filter w(k) with input x(k) and output y(k), followed by a transfer function G(f) producing z(k), which is subtracted from d(k) to form the error e(k).]
This adaptive filtering architecture has a known transfer function at the output of the adaptive filter which filters y(k) before subtraction from d(k) to produce the error. Compare this to the generic adaptive filtering described previously (see Adaptive Filtering). Note that the DAC and ADC at the input and output respectively of the transfer function G(f) are not shown for diagrammatic clarity.
In deriving the standard LMS algorithm the gradient of the instantaneous squared error was
calculated. Note, however, in the above architecture the instantaneous error is given by:
e(k ) = d( k) – z( k)
= d ( k ) – { y ( k )* g ( k ) }
(253)
where g ( k ) is the perfectly sampled impulse response of the transfer function at the output of the
adaptive filter, and the term { y ( k )* g ( k ) } is the result of y ( k ) being convolved with g ( k ) . Therefore
calculating the derivative of the instantaneous error produces:

∇̂_k = ∂e²(k)/∂w(k) = 2e(k)f(k)    (254)

where

f(k) = [f(k), f(k−1), f(k−2), …, f(k−M+1)]    (255)

and

f(k) = {x(k) * g(k)}    (256)
Therefore this algorithm requires that the impulse response g(k) is known exactly in order to convolve with the input vector to create the f(k) vector. Clearly it is unlikely that g(k) will be known exactly, however an estimate, ĝ(k), can be found by a priori system identification. Therefore the filtered-X LMS algorithm is:

w(k+1) = w(k) + 2µe(k)f(k−n)

f(k) = Σ_{n=0}^{M−1} ĝ(n)x(k−n)    (257)
where M is the number of filter weights used in the FIR filter estimate of g ( k ) . Note that the number
of weights in this estimate will influence the performance of the algorithm; too few weights may not
adequately model the transfer function and could degrade performance. Therefore M must be
carefully chosen by the algorithm designer. The filtered-X LMS can be summarised as:
[Figure: the filtered-X LMS architecture. The adaptive filter w(k) produces y(k) from x(k); y(k) passes through the transfer function g(t) to give z(k), which is subtracted from d(k) to form e(k). The estimate ĝ(k) prefilters x(k) to produce f(k), and the weight update is w(k+1) = w(k) + 2µe(k)f(k−n).]
The filtered-X LMS prefilters the x(k) vector using an estimate, ĝ(k), of the impulse response of the transfer function g(t). The accuracy of this estimate will influence the performance of the algorithm.
See also Active Noise Control, Adaptive Signal Processing, Adaptive Step Size, Inverse System
Identification, Least Mean Squares (LMS) Algorithm.
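A minimal filtered-X LMS loop might look as follows (a sketch only; the secondary path ĝ, the step size and the signals are invented for illustration, and the true transfer function is simulated as an FIR filter equal to its estimate):

    import numpy as np

    np.random.seed(1)
    M, L = 16, 8                        # adaptive filter and g-hat lengths (assumed)
    g_hat = np.exp(-np.arange(L))       # assumed estimate of the transfer function impulse response
    g = g_hat.copy()                    # here the true g(k) equals the estimate
    mu = 0.01
    w = np.zeros(M)
    x = np.random.randn(5000)
    d = np.convolve(x, np.ones(M) / M)[:len(x)]   # arbitrary desired signal for the demo
    y = np.zeros(len(x))

    for k in range(M + L, len(x)):
        x_vec = x[k:k-M:-1]                    # input vector x(k)
        y[k] = w @ x_vec                       # adaptive filter output y(k)
        z = g @ y[k:k-L:-1]                    # y(k) filtered by the transfer function
        e = d[k] - z                           # error e(k)
        f_vec = np.array([g_hat @ x[k-n:k-n-L:-1] for n in range(M)])  # filtered input vector f(k)
        w = w + 2 * mu * e * f_vec             # filtered-X LMS weight update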
Least Mean Squares (LMS) IIR Algorithms: Recently adaptive filtering algorithms based on IIR
filters have been investigated for a number of applications. A good overview of adaptive IIR filters
can be found in [36], [132]. The very simplest form of adaptive IIR LMS, sometimes referred to as
Feintuch’s algorithm [71], can be represented as:
Output Error IIR LMS:

a(k+1) = a(k) + 2µe(k)x(k)
b(k+1) = b(k) + 2µe(k)y(k−1)

y(k) = Σ_{n=0}^{N−1} a_n(k)x(k−n) + Σ_{m=1}^{M−1} b_m(k)y(k−m) = a(k)x(k) + b(k)y(k−1)

[Figure: the output error IIR LMS structure, in which an FIR filter a(k) operates on x(k), an FIR filter b(k) operates on past outputs, and their sum y(k) is subtracted from d(k) to form e(k).]
The simplest form of output error adaptive IIR LMS where the filter poles and zeroes are updated by independent pole and zero weight updates.
In addition to the normal step size stability concerns of adaptive filters, the adaptive IIR LMS filter instability can also result if the poles of the filter migrate outside of the unit circle. Therefore extreme
care is necessary when choosing the adaptive step size for both recursive and non-recursive weight
updates. While this simple (some would say simple-minded) algorithm appears to be useless, it is
surprisingly robust in a wide variety of applications.
In order to address the problem of poles migrating outside of the unit circle, one suggestion has
been the equation error adaptive IIR LMS filter which is actually the updating of two independent
FIR filters:
Equation Error IIR LMS:

a(k+1) = a(k) + 2µe(k)x(k)
b(k+1) = b(k) + 2µe(k)d(k)

y(k) = Σ_{n=0}^{N−1} a_n(k)x(k−n) + Σ_{m=0}^{M−1} b_m(k)d(k−m) = a(k)x(k) + b(k)d(k)

[Figure: the equation error IIR LMS structure, in which one FIR filter a(k) operates on x(k) and a second FIR filter b(k) operates on the desired signal d(k); their sum y(k) is subtracted from d(k) to form e(k).]
The simplest form of equation error adaptive IIR LMS where the filter poles and zeroes are updated by independent weight updates on two FIR filters.
In conditions of high observation noise the equation error will give a biased (and very poor!)
solution. See also Active Noise Control, Adaptive Signal Processing, Least Mean Squares
Algorithm.
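A sketch of the output error (Feintuch) update above might be written as follows (all signals, filter orders and the step size are arbitrary assumptions for illustration):

    import numpy as np

    np.random.seed(2)
    N, M = 4, 3                      # feedforward (zeros) and feedback (poles) orders (assumed)
    a = np.zeros(N)                  # feedforward weights a(k)
    b = np.zeros(M)                  # feedback weights b(k)
    mu = 0.005
    x = np.random.randn(5000)
    d = np.convolve(x, [1.0, 0.5, 0.25])[:len(x)]   # desired signal: x through an assumed unknown system
    y = np.zeros(len(x))

    for k in range(max(N, M + 1), len(x)):
        x_vec = x[k:k-N:-1]                  # [x(k), ..., x(k-N+1)]
        y_vec = y[k-1:k-1-M:-1]              # [y(k-1), ..., y(k-M)]
        y[k] = a @ x_vec + b @ y_vec         # IIR filter output
        e = d[k] - y[k]                      # output error e(k)
        a = a + 2 * mu * e * x_vec           # zero (feedforward) update
        b = b + 2 * mu * e * y_vec           # pole (feedback) update
        # in practice the poles should be monitored to keep them inside the unit circle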
Least Significant Bit (LSB): The bit in a binary number with the least arithmetic numerical
significance. See also Most Significant Bit, Sign Bit.
[Figure: an 8 bit 2's complement word with the MSB (weight −128) at the left and the LSB (weight 1) at the right; the bit weightings are −128, 64, 32, 16, 8, 4, 2, 1. In 2's complement notation the MSB has a negative weighting. For example 01011011₂ = 64 + 16 + 8 + 2 + 1 = 91₁₀.]
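The 2's complement weighting shown above can be reproduced directly; in this short sketch (illustrative only) the bit pattern 01011011 is evaluated with the MSB carrying a weight of −128:

    bits = [0, 1, 0, 1, 1, 0, 1, 1]           # MSB first, LSB last
    weights = [-128, 64, 32, 16, 8, 4, 2, 1]  # 2's complement bit weightings
    value = sum(b * w for b, w in zip(bits, weights))
    print(value)                               # 91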
Least Squares: Given the overdetermined linear set of equations, Ax = b, where A is a known
m × n matrix of rank n (with m > n ), b is a known m element vector, and x is an unknown n element
vector, then the least squares solution is given by:
x_LS = (AᵀA)⁻¹Aᵀb    (258)
(Note that if the problem is underdetermined, m < n , then Eq.258 is not the solution, and in fact
there is no unique solution; a good (i.e., close) solution can often be found however using the
pseudoinverse obtained via singular value decomposition.)
The least squares solution can be derived as follows. Consider again the overdetermined linear set
of equations:
[ a11 a12 … a1n ]          [ b1 ]
[ a21 a22 … a2n ] [ x1 ]   [ b2 ]
[ a31 a32 … a3n ] [ x2 ]   [ b3 ]
[ a41 a42 … a4n ] [  : ] = [ b4 ]    (259)
[  :   :      :  ] [ xn ]   [  : ]
[ am1 am2 … amn ]          [ bm ]
        A           x        b
If A is a nonsingular square matrix, i.e. m = n, then the solution can be calculated as:

x = A⁻¹b    (260)
However if m ≠ n then A is a rectangular matrix and therefore not invertible, and the above equation cannot be solved to give an exact solution for x. If m < n then the system is often referred to as underdetermined and an infinite number of solutions exist for x (as long as the m equations are consistent). If m > n then the system of equations is overdetermined and we can look for a solution by striving to make Ax as close as possible to b, by minimizing Ax − b in some sense. The most mathematically tractable way to do this is by the method of least squares, performed by minimizing the 2-norm, denoted by e:

e = (||Ax − b||₂)² = (Ax − b)ᵀ(Ax − b)    (261)
Plotting e against the n-dimensional vector x gives a hyperparabolic surface in (n+1) dimensions. If n = 1, x has only one element and the surface is a simple parabola. For example consider the case where A is a 2 × 1 matrix, then from Eq. 261:
e = ( [a1; a2]x − [b1; b2] )ᵀ ( [a1; a2]x − [b1; b2] )
  = (a1² + a2²)x² − 2(a1b1 + a2b2)x + (b1² + b2²)
  = Px² − Qx + R    (262)

where P = a1² + a2², Q = 2(a1b1 + a2b2) and R = b1² + b2².
Clearly the minimum point on the surface lies at the bottom of the parabola:

de/dx = 2Px_LS − Q = 0   ⇒   x_LS = Q/(2P)

x_LS = (AᵀA)⁻¹Aᵀb = (a1b1 + a2b2)/(a1² + a2²) = Q/(2P)

[Figure: the parabolic error surface e plotted against x, with minimum value e_min at x = x_LS.]
If n = 2, x = [x1 x2]ᵀ and the error surface is a paraboloid. This surface has one minimum point at the bottom of the paraboloid where the gradient of the surface with respect to both the x1 and x2 axes is zero:

de/dx = [ de/dx1 ]   [ 0 ]
        [ de/dx2 ] = [ 0 ]

[Figure: the paraboloid error surface e plotted over the (x1, x2) plane, with minimum value e_min at (x1LS, x2LS).]
If the x vector has three or more elements (n ≥ 3) the surface will be in four or more dimensions and
cannot be shown diagrammatically.
To find the minimum value of e for the general case of an n-element x vector the “bottom” of the
hyperparaboloid can be found by finding the point where the gradient in every dimension is zero (cf.
the above 1 and 2-dimensioned examples). Therefore differentiating e with respect to the vector x:
de/dx = [ de/dx1  de/dx2  de/dx3  …  de/dxn ]ᵀ = 2Aᵀ(Ax − b)    (263)
and setting the gradient vector to the zero vector,

de/dx = 0    (264)

to find the minimum point, e_min, on the surface gives the least squares error solution for x_LS:

2Aᵀ(Ax_LS − b) = 0
AᵀAx_LS − Aᵀb = 0    (265)
x_LS = (AᵀA)⁻¹Aᵀb
If the rank of matrix A is less than n, then the inverse matrix (AᵀA)⁻¹ does not exist and the least squares solution cannot be found using Eq. 265; the pseudoinverse must then be calculated using singular value decomposition techniques. Note that if A is an invertible square matrix, then the least squares solution simplifies to:

x = (AᵀA)⁻¹Aᵀb = A⁻¹A⁻ᵀAᵀb = A⁻¹b    (266)
See also Matrix Decompositions - Singular Value Decompositions, Matrix Inversion, Minimum
Residual, Normal Equations, Least Mean Squares, Least Squares Residual, Square System of
Equations, Overdetermined System, Recursive Least Squares.
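The closed form of Eq. 265 can be checked numerically; the sketch below (with an invented overdetermined system) forms x_LS via the normal equations and compares it with a library least squares routine:

    import numpy as np

    np.random.seed(3)
    m, n = 8, 3                          # overdetermined: more equations than unknowns
    A = np.random.randn(m, n)            # known m x n matrix of rank n
    b = np.random.randn(m)               # known m element vector

    x_ls = np.linalg.inv(A.T @ A) @ A.T @ b       # Eq. 265 (normal equations)
    x_ref = np.linalg.lstsq(A, b, rcond=None)[0]  # library solution for comparison
    print(np.allclose(x_ls, x_ref))               # True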
Least Squares Residual: The least squares error solution to the overdetermined system of
equations, Ax = b , is given by:
x_LS = (AᵀA)⁻¹Aᵀb    (267)
where A is a known m × n matrix of rank n and with m > n, b is a known m element vector, and x
is an unknown n element vector. The least squares residual, given by:

r_LS = b − Ax_LS    (268)
is a measure of the error obtained when using the method of least squares. The smaller the value
of rLS, then the more accurately b can be generated from the columns of the matrix A. The
magnitude or size of the least squares residual is calculated by finding the 2-norm of r_LS:

ρ_LS = ||Ax_LS − b||₂    (269)
As an example, for a system with n = 2 the least squares residual can be shown on the least
squares error surface, e, as:
[Figure: the least squares error surface e = (||Ax − b||₂)² over the (x1, x2) plane; the minimum value of the surface, ρ²_LS, occurs at (x1LS, x2LS).]
Note that if m = n, and A is a non-singular matrix, then ρLS =0. See also Least Squares, Matrix, QR
Algorithm, Recursive Least Squares.
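Continuing the numerical style of the previous entry, the residual of Eq. 268 and its norm can be computed as follows (a sketch with an arbitrary A and b):

    import numpy as np

    np.random.seed(4)
    A = np.random.randn(8, 3)
    b = np.random.randn(8)

    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    r_ls = b - A @ x_ls                       # least squares residual, Eq. 268
    rho_ls = np.linalg.norm(r_ls)             # Eq. 269; its square is the minimum of e
    print(rho_ls)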
Leq: See Equivalent continuous level.
Linear Algebra: Linear algebra is an older branch of mathematics that uses matrix based
equations. The computer has spawned a rebirth of interest in linear algebra and changed what was
thought to be an arcane, obsolete and strictly academic area into a ubiquitous, fundamental tool in
virtually every applied, pure or social science field. Over the last few years the advent of fast DSP
processors has led to the solution of many DSP problems using numerical linear algebra [15]. See
also Matrix, Matrix Algorithms, Matrix Decompositions, Matrix Properties.
Linear Feedback Shift Register (LFSR): A simple shift register with feedback and combinational logic used for the generation of pseudo random binary noise. See Pseudo Random Binary Sequence.
Linear Phase Filter: See Finite Impulse Response Filter.
Linear Predictive Coding (LPC): Linear predictive coding is a compression algorithm for reducing the storage requirements of digitized speech. In LPC the vocal tract is modelled as an all-pole digital filter (IIR) and the calculated filter coefficients are used to code the speech down to levels of 2400 bits/sec from speech sampled at 8kHz with 8 bit resolution.
Linear System: A system is said to be linear if the weighted sum of the system output given two distinct inputs equals the system output given a single input equal to the weighted sum of the two distinct inputs:
[Figure: a linear system with input x(n) and output y(n) plotted against time.]
In general, for a linear system y ( n ) = f ( x ( n ) ) , if, whenever:
y1 ( n ) = f [ x1 ( n ) ]
y2 ( n ) = f [ x2 ( n ) ]
(270)
then:
a 1 y 1 ( n ) + a 2 y2 ( n ) = f [ a1 x1 ( n ) + a 2 x2 ( n ) ]
(271)
for all values of a 1 and a 2 . For example consider the linear system:
y ( n ) = 4.3x ( n ) + 6.01x ( n – 1 )
(272)
If x1(n) = sin100nt, then the output which will be denoted as y1(n), is given by:
y 1 ( n ) = 4.3 sin 100nt + 6.01 sin 100 ( n – 1 )t
(273)
For a different input x2(n) = sin250nt, then the output denoted as y2(n) is given by:
y 2 ( n ) = 4.3 sin 250nt + 6.01 sin 250 ( n – 1 )t
(274)
Therefore, given that the system is linear, if x3(n) = sin100nt + sin250nt, then:
y 3 ( n ) = 4.3 ( sin 100nt + sin 250nt ) + 6.01 ( sin 100 ( n – 1 )t + sin 250 ( n – 1 )t )
= y1 ( n ) + y2 ( n )
(275)
In general inputting a sine wave to a linear system will yield an output that is a sine wave at exactly the same frequency but with modified phase and magnitude. If any other frequencies are output (e.g., if the sine wave is distorted in any way) then the system is nonlinear. (Note that this is not true for other waveforms; inputting a square wave to a linear system is unlikely to produce a square wave at the output. If the square wave is viewed as its sine wave components (from Fourier analysis) then the output of the linear system should only contain sine waves at those frequencies, but the modification of their amplitudes and phases means that their superposition no longer gives a square wave.)
DSP systems such as digital filters (IIR and FIR) are linear filters. Any filter that has time varying
weights, however, is non-linear. See also Distortion, Non-Linear System, Poles, Transfer Function,
Frequency Response.
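The superposition test of Eqs. 270-271 can be demonstrated with the example system of Eq. 272; the sketch below (with arbitrary sine wave inputs and an assumed sample spacing) confirms that the response to x1 + x2 equals y1 + y2:

    import numpy as np

    def system(x):
        # y(n) = 4.3 x(n) + 6.01 x(n-1), the linear example of Eq. 272
        x_prev = np.concatenate(([0.0], x[:-1]))
        return 4.3 * x + 6.01 * x_prev

    n = np.arange(100)
    t = 1e-3                         # an assumed sample spacing
    x1 = np.sin(100 * n * t)
    x2 = np.sin(250 * n * t)

    y3 = system(x1 + x2)
    print(np.allclose(y3, system(x1) + system(x2)))   # True: superposition holds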
Linearity: Linearity is the property possessed by any system which is linear.
Linearly Dependent Vectors: See Vector Properties and Definitions - Linearly Dependent.
LLT: See Matrix Decompositions - Cholesky Decomposition.
Local Minima: The global minimum of a function is the smallest value taken on by that function. For example, for the function f(x) below, the global minimum is at x = x_g. The minima at x1, x2 and x3 are termed local minima:
[Figure: a function f(x) with local minima at x1, x2 and x3 and the global minimum at x = x_g.]
When attempting to use least squares, or least mean squares based algorithms to find the global minimum of a function, the zero gradient of the function is found. For a quadratic surface with only one minimum the method works very well. However if the surface is not quadratic, then the solution obtained is not necessarily the global minimum, as the gradient is also zero at the local minima (and the local maxima and inflection points). See also Adaptive IIR Filters, Hyperparaboloid, Global Minima, Least Squares, Simulated Annealing.
Localization: When used in the context of acoustics, localization is the ability to perceive the
direction from which sounds are coming. For animals the two ears provide excellent instruments of
localization. Localization problems are also found in radar and sonar systems where arrays of
sensors are used to sense the direction from which signals are radiating. Generally, a minimum of
two sensors are required to accurately localize a sound source. A current focus of research is in
producing arrays of microphones using DSP algorithms to improve sound quality for applications
such as hands-free telephony, hearing aids, and concert hall microphone pick-ups. Some
applications require that a desired source be located before it can be extracted or filtered from the
rest of the sound field. See also Audiology, Beamforming, Lateralization.
Logarithmic Amplitude: If the amplitude range of a signal or system is very large then it is often
convenient to plot the magnitude on a logarithmic scale rather than a linear scale. The most
common form of logarithmic magnitude uses the logarithmic decibel scale which represents a ratio
of two powers. See also Decibels (dB).
Logarithmic Frequency: When the frequency range of a signal or system is very large, it is often
convenient to plot the frequency axis on a logarithmic rather than a linear scale. The human ear, for example, has a sensitivity range from around 70Hz to 15000Hz and is often described as having a logarithmic frequency response. Logarithmically spaced frequencies are equally spaced distances
on the basilar membrane within the cochlea. The perception of frequency change is such that a
doubling of frequency from 200Hz to 400Hz is perceived as being the same change as a doubling
of frequency from 2000Hz to 4000Hz, i.e., both sounds have increased by an octave. In DSP
systems everything from digital filter frequency responses, to spectrograms may be represented
with a logarithmic frequency scale. See also Wavelet Transform.
The most common logarithmic scales are decade (log10f) and octave (log2f) although clearly any
logarithmic base can be used. If the y-axis is also plotted on a logarithmic scale (such as dB), then
the graph is log-log. See also Decibels, Roll-off .
[Figure: three plots of the magnitude response 10·log10(1/(1 + f²)) over the range 1 to 100Hz: against a linear frequency axis; against a log10 (decade) frequency axis, where the roll-off is 20dB/decade; and against a log2 (octave) frequency axis, where the roll-off is 6dB/octave.]
Graphs of the second order system 1/(1 + f²). The range of 1 to 100Hz has the same width on all three graphs. Clearly using a logarithmic scale allows much greater frequency ranges to be represented than with a linear scale. More resolution is available at the lower frequencies (0 to 1 Hz), although at higher frequencies there is less resolution.
Lossless Compression: If a compression algorithm is lossless, then the signal information (or
entropy) after the signal has been compressed and decompressed has not changed, i.e. all signal
information has been retained. Hence, the uncompressed signal is identical to the original signal.
Lossless compression for digital audio signals is not particularly successful and is likely to achieve at best a 2.5:1 compression ratio [61]. See also Compression, Lossy Compression.
Lossy Compression: If a compression algorithm is lossy, then the signal information (or entropy)
after the signal has been compressed and decompressed is reduced, i.e. some signal information
has been lost. However if the lossy algorithm is carefully designed then the elements of the signal
that are lost are not particularly important to the integrity of the signal. For example, the precision adaptive subband coding (PASC) algorithm compresses a hi-fidelity digital audio signal by a factor of 4, however the information that is “lost” would not have been perceived by the listener due to
masking effects of the human ear. Alternatively if very high levels of compression are being
attempted then the lossy effects of the algorithm may be quite noticeable. See also Compression,
Lossless Compression.
Loudness Recruitment: Defects in the auditory mechanism can lead to a hearing impairment
whereby the dynamic range from the threshold of audibility to the threshold of discomfort is greatly
reduced [30]. Loudness recruitment is the abnormally rapid growth in perceived loudness (versus
intensity) in individuals with reduced dynamic range of audibility. The range of hearing is nominally
120dB(SPL). However, in persons with hearing loss, the range may be as low as 40dB. These
individuals have a raised threshold of audibility, but after sounds exceed that threshold the
perceived loudness grows rapidly until they reach normal perceived loudness for sounds near the
threshold of discomfort. This growth in their perceived loudness is termed loudness recruitment.
One common misconception is that individuals with loudness recruitment are more sensitive to
changes in intensity (i.e., they have smaller intensity JNDs or DLs). When tested, however, their
JNDs for intensity are very near normal -- this indicates that they have fewer different perceptible
difference limens (DLs) over the normal range of loudness than normal hearing individuals. See
also Ear, Equal Loudness Contours, Hearing Aids, Threshold of Hearing.
Low Noise Components: All electronic components introduce certain levels of unwanted and
potentially interfering noise. Low noise components introduce lower levels of noise than standard
components, but the cost is usually higher.
Low Pass Filter: A filter which passes only the portions of a signal that have frequencies between DC (0 Hz) and a specified cut-off frequency. Frequencies above the cut-off frequency are highly attenuated. See also Bandpass Filter, Digital Filter, Filters, High Pass Filter.
[Figure: a low pass filter G(f) between input and output; the magnitude response passes frequencies from DC up to the cut-off frequency (the bandwidth) and attenuates frequencies above it.]
Lower Triangular Matrix: See Matrix Structured - Lower Triangular.
LU: See Matrix Decompositions - LU Decomposition.
M
m-sequences: Shorthand term for a maximum length sequence. See Maximum Length
Sequences, Pseudo-Random Binary Sequence.
Machine Code: The binary codes that are stored in memory and are fetched by the DSP
processor to be executed on the chip and perform some useful function, such as multiplication of
two numbers. Collectively machine code instructions form a meaningful program. Machine code is
usually generated (by the assembler program) from source code written in the assembly language.
This machine code can then be downloaded onto the DSP processor. Machine code has a one to
one correspondence with assembly language. See also Assembly Language, Cross Compiler.
Main Lobe: In an antenna or sensor array processing system, main lobe refers to the primary lobe
of sensitivity in the beampattern. For a filter or a data window, main lobe refers to the primary
passband lobe of sensitivity. The more narrow the main lobe, the more selective or sensitive a given
system is said to be. Main lobes are best illustrated by an example.
[Figure: a typical beampattern showing array gain as a function of angle, with a narrow mainlobe and smaller sidelobes; gain contours are marked at 0, −5, −10 and −15 dB.]
See also Beamformer, Beampattern, Sidelobes, Windows.
Magnitude Response: See Fourier Series - Complex Exponential Representation.
Mammals: While not using digital signal processing capabilities, many mammals do of course use
analog signals for communication and navigation. Most obviously mammals (including humans)
use acoustic signals for communication via, for example, speech (humans), barking (dogs), and so
on. Elephants communicate with very low frequencies (around 100Hz and well below -- even down
to a few Hz), and can therefore communicate over very long distances via acoustic waves travelling
in the ground. These ground-borne waves suffer less attenuation than airborne acoustic waves. It
was this low frequency rumble communication that caused many early elephant watchers to believe
that elephants had ESP (extra sensory perception) ability. Light signals (from the electromagnetic
family) are used by most animals for navigation and communication purposes. Another well known
use of signal processing is by the bat which uses sonar blips to avoid objects in its path during night
flying. The magnetic field sensing abilities of birds and bees are another well known though not fully understood use of signal processing for navigation. Some mammals (mainly antipodean), such as the platypus and the echidna, have electroreception abilities. See also Electroreception.
Marginally Stable: If a discrete system has poles on the unit circle then it can be described as
marginally stable. See Dual Tone Multifrequency.
Masking: Masking refers to the process whereby one particular sound is close to inaudible in the
presence of another louder signal. Masking is more precisely defined as spectral or temporal,
although in audio and speech coding the term is usually used in reference to spectral masking. For
spectral masking a loud signal raises the threshold of hearing of signals of a lower level but with
slightly higher or lower frequencies. This effectively leaves these other signals inaudible. For
temporal masking, sounds that occur a short time before or after a louder sound are not perceived.
Simultaneous masking is also used in audiometry in order to minimize the perceivable conductance
of test tones from the ear under test by injecting noise into the ear not being tested. See also
Audiometry, Spectral Masking, Temporal Masking, Threshold of Hearing.
Masking Pattern Adapted Universal Subband Integrated Coding and Multiplexing
(MUSICAM): MUSICAM was developed jointly by CCETT (France), IRT (Germany) and Philips
(the Netherlands), amongst others, originally for the application of digital audio broadcasting (DAB).
MUSICAM is based on subband psychoacoustic compression techniques and has been
incorporated into MPEG-1 in combination with the ASPEC compression system. See also Adaptive
Spectral Perceptual Entropy Coding (ASPEC), Precision Adaptive Subband Coding (PASC),
Psychoacoustics, Spectral Masking, Temporal Masking.
Matlab: A program produced by the MathWorks that allows high level simulation of matrix and DSP
systems, with excellent post-processing graphics facilities for data presentation. Libraries
containing virtually every DSP operation are widely available for Matlab.
Matrix: A matrix is a set of numbers stored in a 2 dimensional array usually to represent data in an
ordered structure. If ℜ denotes the set of real numbers, then the vector space of all m × n real
matrices is denoted by ℜ^{m×n}, and if A ∈ ℜ^{m×n} then

A = [ a11 … a1n ]
    [  :       :  ]      with a_ij ∈ ℜ, for 1 ≤ i ≤ m, 1 ≤ j ≤ n    (276)
    [ am1 … amn ]
where the symbol ∈ simply means “is an element of” -- so A is an m × n matrix. The ordering
of the data values is important to the information being conveyed by the matrix. The dimensions of
a matrix are specified as the number of rows by the number of columns (the rows running from left
to right, and the columns from top to bottom). Matrices are usually denoted in upper case boldface font or upper case font with an underscore, e.g. M or M. (Note that vectors are usually represented in lower case boldface font or lower case font with an underscore, e.g. v or v.)
As an example a particular 4 × 3 matrix, A, is:

A = [  4   9   2 ]
    [ 10   1  13 ]      a 4 (row) by 3 (column) matrix    (277)
    [  3   4   5 ]
    [  1   2   2 ]

Clearly each element in the matrix can be denoted by a subscript which refers to its row and column position:

A = [ a11 a12 a13 ]
    [ a21 a22 a23 ]    (278)
    [ a31 a32 a33 ]
    [ a41 a42 a43 ]

In the example, a12 = 9, and a32 = 4.
In DSP algorithms and analysis, matrices are extremely useful for compact and convenient
mathematical representation of data and algorithms. For example the Wiener Hopf solution, and the
Recursive Least Squares algorithm are expressed using matrix equations. See also Matrix
Algorithms, Matrix - Complex, Matrix Decompositions, Matrix Identities, Matrix Properties, Vector.
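The 4 × 3 example of Eq. 277 can be written down directly; the sketch below simply checks the row-by-column indexing convention (a12 = 9, a32 = 4):

    import numpy as np

    A = np.array([[ 4,  9,  2],
                  [10,  1, 13],
                  [ 3,  4,  5],
                  [ 1,  2,  2]])        # the 4 (row) by 3 (column) matrix of Eq. 277

    print(A.shape)      # (4, 3)
    print(A[0, 1])      # a12 = 9  (NumPy indices start at 0)
    print(A[2, 1])      # a32 = 4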
Matrix - Complex: Each element in an m × n complex matrix is a complex number. The complex
vector space is often denoted as C m × n where every element of that space is a complex number
c ij ∈ C . Scaling, addition, subtracting and multiplication of a complex matrix is performed in the
same way as for real matrices, except that the arithmetic is complex. For example:

Cd + a = [ 1+2j  2+j    ] [ 1−3j ]   [ 4−4j ]   [ 9−5j  ]   [ 4−4j ]   [ 13−9j ]
         [ 3     2−0.5j ] [ −2j  ] + [ 1+3j ] = [ 2−13j ] + [ 1+3j ] = [ 3−10j ]    (279)
Simple row column transposition (i.e. transpose operation) of complex matrices is not normally
performed, but instead the Hermitian transpose is done where the matrix is transposed in the
normal row-column style, but every element is complex conjugated. In DSP applications such as
beamforming and digital communications, complex representation of information is often used for
convenience. See also Matrix, Matrix Properties - Hermitian Transpose.
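The complex arithmetic of Eq. 279 can be reproduced numerically; the sketch below also shows the Hermitian transpose mentioned above:

    import numpy as np

    C = np.array([[1 + 2j, 2 + 1j],
                  [3 + 0j, 2 - 0.5j]])
    d = np.array([1 - 3j, -2j])
    a = np.array([4 - 4j, 1 + 3j])

    print(C @ d + a)          # [13.-9.j  3.-10.j], as in Eq. 279
    print(C.conj().T)         # Hermitian transpose: transpose plus complex conjugation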
Matrix Algorithms: There are a number of well known matrix algorithms used in DSP for solving
structured systems of equations. These algorithms are invariably used after a suitable
decomposition has been performed on a matrix in order to produce a structured matrix/system of
equations. See also Matrix, Matrix Decompositions, Matrix - Partitioning.
• Back Substitution: If an upper triangular system of linear equations:

Ux = b  ⇒  [ u11  …  u1,n−2    u1,n−1    u1n    ] [ x1   ]   [ b1   ]
           [  0   ⋱     :          :        :    ] [  :   ]   [  :   ]
           [  0   …  un−2,n−2  un−2,n−1  un−2,n ] [ xn−2 ] = [ bn−2 ]    (280)
           [  0   …     0      un−1,n−1  un−1,n ] [ xn−1 ]   [ bn−1 ]
           [  0   …     0         0       unn   ] [ xn   ]   [ bn   ]
has to be solved for the unknown n element vector x, where U is an n × n non-singular upper triangular
matrix, then the last element of the unknown vector, x n can be calculated from multiplication of the last
row of U with the vector x:
unn·xn = bn  ⇒  xn = bn/unn    (281)

the second last element can therefore be calculated from multiplication of the second last row of U with vector x, and substitution of xn from Eq. 281:
u_{n−1,n−1}·x_{n−1} + u_{n−1,n}·xn = b_{n−1}  ⇒  x_{n−1} = ( b_{n−1} − u_{n−1,n}(bn/unn) ) / u_{n−1,n−1}    (282)

In general it can be shown that all elements of x can be calculated recursively from:

x_i = ( b_i − Σ_{j=i+1}^{n} u_ij·x_j ) / u_ii    (283)
This method of solving an upper triangular system of linear equations is called back substitution. Note that if the diagonal elements of U are very small relative to the off-diagonal elements, then the arithmetic required for the computation may require a large dynamic range. See also Matrix Decompositions - Cholesky/Forward Substitution/Gaussian Elimination/QR.
• Forward Substitution: If a system of lower triangular linear equations:
Lx = b  ⇒  [ l11  0   0  …  0  ] [ x1 ]   [ b1 ]
           [ l21 l22  0  …  0  ] [ x2 ]   [ b2 ]
           [ l31 l32 l33 …  0  ] [ x3 ] = [ b3 ]    (284)
           [  :   :   :      :  ] [  : ]   [  : ]
           [ ln1 ln2 ln3 … lnn ] [ xn ]   [ bn ]
has to be solved for the unknown n element vector x, where L is an n × n non-singular lower triangular
matrix, then the first element of the unknown vector, x 1 can be calculated from multiplication of the first
row of L with the vector x:
l11·x1 = b1  ⇒  x1 = b1/l11    (285)

The second element can therefore be calculated from multiplication of the second row of L with vector x, and substitution of x1 from Eq. 285:

l21·x1 + l22·x2 = b2  ⇒  x2 = ( b2 − l21(b1/l11) ) / l22    (286)
In general it can be shown that all elements of x can be calculated sequentially from:

x_i = ( b_i − Σ_{j=1}^{i−1} l_ij·x_j ) / l_ii    (287)

This method of solving a lower triangular system of linear equations is called forward substitution. Note that if the diagonal elements of L are very small relative to the off-diagonal elements, then the arithmetic required for the computation may require a large dynamic range. See also Matrix Decompositions - Back Substitution/Cholesky/Gaussian Elimination/QR. (A code sketch of both substitution recursions follows this entry.)
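Both recursions (Eqs. 283 and 287) are easily coded; the sketch below (with a small invented triangular system) solves Ux = b by back substitution and a lower triangular system by forward substitution:

    import numpy as np

    def back_substitution(U, b):
        # solves Ux = b for upper triangular U, as in Eq. 283
        n = len(b)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x

    def forward_substitution(L, b):
        # solves Lx = b for lower triangular L, as in Eq. 287
        n = len(b)
        x = np.zeros(n)
        for i in range(n):
            x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
        return x

    U = np.array([[2.0, 1.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 5.0]])
    b = np.array([1.0, 2.0, 10.0])
    print(back_substitution(U, b))         # same result as np.linalg.solve(U, b)
    print(forward_substitution(U.T, b))    # U transposed is lower triangular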
Matrix Decompositions: There are a number of methods which allow a matrix to be decomposed
into structured matrices. The reason for performing a matrix decomposition is to either extract
certain parameters from the matrix, or to provide a computationally cost effective and, ideally,
numerically stable method of solving a set of linear equations. A number of decompositions often
performed in DSP can be identified.
• Back Substitution: See Matrix Algorithms - Backsubstitution.
• Cholesky: The Cholesky decomposition or factorization can be applied to an n × n non-singular symmetric positive definite matrix, A, such that:

A = LLᵀ = [ l11  0   0  …  0  ] [ l11 l21 l31 … ln1 ]
          [ l21 l22  0  …  0  ] [  0  l22 l32 … ln2 ]
          [ l31 l32 l33 …  0  ] [  0   0  l33 … ln3 ]    (288)
          [  :   :   :      :  ] [  :   :   :  …  :  ]
          [ ln1 ln2 ln3 … lnn ] [  0   0   0  … lnn ]
If a system of equations, Ax = b, is to be solved for the unknown n element vector x, where A is an n × n symmetric matrix, and b a known n element vector, the solution can be found by Cholesky factoring matrix A, and performing a forward substitution followed by a back substitution:

Ax = b  ⇒  LLᵀx = b  ⇒  { Ly = b     solve by forward substitution
                          { Lᵀx = y    solve by backward substitution    (289)
The elements of the Cholesky matrix, L, are well bounded and in general Cholesky factorization is a numerically well behaved algorithm with fixed point arithmetic.
The Cholesky factorization may also be written in the form of the LDLᵀ factorization, where L is now a unit lower triangular matrix, and D is a diagonal matrix. See also Matrix Decompositions - Back Substitution/Forward Substitution/Gaussian Elimination/LDU/LU/LDLT, Recursive Least Squares - Square Root Covariance.
• Complete Pivoting: See entry for Matrix Decompositions - Pivoting.
• Eigenanalysis: Eigenanalysis allows a square n × n matrix, A, to be broken down into components of an
eigenvector and an eigenvalue which satisfy the condition:
Ax = λx
(290)
where x is an n × 1 vector, referred to as a (right) eigenvector of A, and the scalar λ is an eigenvalue of
A. In order to calculate the eigenvalues Eq. 290 can be rearranged to give:
( A – λI )x = 0
(291)
and if x is to be a non-zero vector, then the solution to Eq. 291 requires that the matrix ( A – λI ) is singular
(i.e. linearly dependent columns) and therefore the determinant is zero, i.e.
det ( A – λI ) = 0
(292)
This equation is often referred to as the characteristic equation of the matrix A, and can be expressed as a polynomial of order n, which in general has n distinct roots. (If the characteristic equation does not have n distinct roots, then the matrix A is said to be degenerate.) Therefore we can note that there are n instances of Eq.
290:
Ax i = λ i x i for i = 1 to n
(293)
Writing the eigenvalues as a diagonal matrix, Λ = diag ( λ 1, λ 2, λ 3, …λ n ) , and each vector, x i as a
column of an n × n matrix X:
A ( x 1, x 2, x 3, …, x n ) = AX = XΛ
(294)
and therefore X is a similarity transform matrix:

X⁻¹AX = Λ    (295)

and matrices A and Λ are said to be similar. Note also that

trace(A) = trace(Λ) = λ1 + λ2 + … + λn,    (296)

which is easily seen from noting that:

trace(Λ) = trace(X⁻¹AX) = trace(AXX⁻¹) = trace(A)    (297)
For the general eigenvalue problem, techniques such as the QL algorithm (not to be confused with the
QR decomposition) are used to reduce the matrix A to various structured intermediate forms before
ultimately extracting eigenvalues and eigenvectors. Note that although the eigenvalues could be found
from solving the polynomial in Eq. 292 this is in general not a good method either numerically or
computationally.
For DSP systems a particularly relevant problem is the symmetric eigenvalue problem, whereby a (symmetric) correlation matrix is to be decomposed. For a symmetric n × n matrix R,

Rq_i = λ_i q_i   for i = 1 to n    (298)

it is relatively straightforward to show for the symmetric case that the eigenvectors, q_i, will be orthogonal to each other, and Eq. 295 can be written in the form:

QᵀRQ = Λ   or   R = QΛQᵀ    (299)

where QᵀQ = I.
Other useful properties of the symmetric eigenanalysis problem are that the condition number of R can be calculated as the eigenvalue spread:

κ(R) = λ_max / λ_min    (300)
See also Matrix Decompositions - Singular Value, QL, QR Algorithm.
• Schur Form: A canonical form of a matrix that displays the eigenvalues, but not the eigenvectors, of a matrix.
• Eigenvalue: See Matrix Decompositions - Eigenanalysis.
• Eigenvector: See Matrix Decompositions - Eigenanalysis.
• Fast Given’s Rotations: See Matrix Decompositions - Square Root Free Givens.
• Forward Substitution: See Matrix Algorithms - Forward Substitution.
• Gauss Transform: In general the Gauss transform, G_k, is an n × n matrix used to zero the n − k elements below the main diagonal in column k of a non-singular n × n matrix, A:
[Figure: premultiplying an n × n matrix A by a Gauss transform G, an identity matrix with additional entries g below the main diagonal in column k, zeroes the elements of A below the main diagonal in column k. Note that only elements in the rows being zeroed actually change.]
As an example of zeroing a matrix column below the main diagonal, the elements a31 and a21 can be “zeroed” by premultiplying a 3 × 3 matrix A with a 3 × 3 Gauss transform matrix, G1:

G1A = [  1  0 0 ] [ a11 a12 a13 ]   [ a11           a12           a13          ]   [ a11 a12 a13 ]
      [ g21 1 0 ] [ a21 a22 a23 ] = [ a11g21+a21  a12g21+a22  a13g21+a23 ] = [  0  b22 b23 ]    (301)
      [ g31 0 1 ] [ a31 a32 a33 ]   [ a11g31+a31  a12g31+a32  a13g31+a33 ]   [  0  b32 b33 ]

where g31 = −a31/a11 and g21 = −a21/a11.
In general the Gauss transform matrix which will zero all elements below the diagonal in the k-th column of an n × n matrix, A, can be specified as:

G_kA = [ 1  …  0          0  …  0 ] [ a11       …  a1k       …  a1n       ]   [ a11       …  a1k  …  a1n       ]
       [ :   ⋱  :           :      : ] [  :             :             :        ]   [  :             :        :        ]
       [ 0  …  1          0  …  0 ] [ ak1       …  akk       …  akn       ] = [ ak1       …  akk  …  akn       ]    (302)
       [ 0  …  g_{k+1,k}   1  …  0 ] [ a_{k+1,1}  …  a_{k+1,k}  …  a_{k+1,n} ]   [ b_{k+1,1}  …   0   …  b_{k+1,n} ]
       [ :       :           :   ⋱  : ] [  :             :             :        ]   [  :             :        :        ]
       [ 0  …  g_{nk}      0  …  1 ] [ a_{n1}    …  a_{nk}    …  a_{nn}    ]   [ b_{n1}    …   0   …  b_{nn}    ]

where g_ik = −a_ik / a_kk.
The inverse of a Gauss transform matrix, G_k⁻¹, is simply calculated by negating the “g” entries:

G_k⁻¹ = [ 1  …  0           0  …  0 ]
        [ :   ⋱  :            :      : ]
        [ 0  …  1           0  …  0 ]    (303)
        [ 0  …  −g_{k+1,k}   1  …  0 ]
        [ :       :            :   ⋱  : ]
        [ 0  …  −g_{nk}      0  …  1 ]
Gauss transforms are used in the main for performing LU matrix decomposition. Gauss transforms are not in general numerically well behaved, and if the pivot element (the divisor a_ii) is very small in magnitude, then very large values may occur in the resulting transformed matrix; hence “pivoting” strategies are often used whereby rows and/or columns of the matrix are interchanged, but the integrity of the problem being solved is maintained. See also Matrix Decompositions - Gaussian Elimination/LU/Pivoting, Matrix Structured - Lower Triangular/Upper Triangular.
• Gaussian Elimination: Gaussian elimination is a technique used to find the solution of a square set of
linear equations, Ax = b , for the unknown n element vector x, where A is an n × n non-singular matrix,
and b a known n element vector. Gaussian elimination converts a square non-singular matrix into an
equivalent, and easier to solve system of equations where A has been implicitly premultiplied by a matrix,
G, to produce an upper triangular matrix, U and a new vector y. (Note that the premultiplication is
described as “implicit” as it is not necessary to explicitly form the matrix G - the Gaussian elimination is
done in stages.).
[Figure: Gaussian elimination applied to the square system Ax = b; the n × n matrix A and the vector b are transformed into the equivalent upper triangular system Ux = y, where GA = U and Gb = y.]
Gaussian elimination can be formally described in terms of Gauss transforms which are used to “zero” the
elements below the main diagonal of a matrix to ultimately convert it to an upper triangular form using a
series of Gauss transforms for each column of the matrix.
The Gauss transform matrix, G_k, can be specified which will zero all elements below the diagonal in the k-th column of an n × n matrix, A. Therefore to solve the system of linear equations, Ax = b, the transforms G_1 to G_{n−1} can be used to premultiply matrix A (in the correct order) such that:

Ax = b  ⇒  G_{n−1}…G_2G_1Ax = G_{n−1}…G_2G_1b  ⇒  Ux = y    (304)
and the equivalent system of equations, Ux = y is solved by backsubstitution.
In general Gaussian elimination is not numerically well behaved, and will fail if A is singular. In particular
small pivot elements, a ii on the diagonal of matrix A may lead to very small and very large values
appearing in the L and U matrices respectively. Therefore pivoting techniques are often used whereby the
rows and/or columns of A are interchanged using (orthogonal) permutation matrices. In fact where
Gaussian elimination is to be used for solving a set of linear equations, it is recommended that pivoting is
always used. See also Matrix Decompositions - Gauss Transforms/LU/Pivoting, Matrix Structured - Lower Triangular/Upper Triangular.
• Givens Rotations: Given’s rotations (also known as plane rotations, and Jacobi rotations) represent an
orthogonal transformation for introducing zero elements into a matrix. The element a 21 of the following
(full rank) matrix can be zeroed by applying the appropriate Givens rotation as follows:
[  c  s ] [ a11 a12 … a1n ]   [ √(a11² + a21²)   (ca12 + sa22)  …  (ca1n + sa2n)  ]   [ b11 b12 … b1n ]
[ −s  c ] [ a21 a22 … a2n ] = [       0         (−sa12 + ca22) … (−sa1n + ca2n) ] = [  0  b22 … b2n ]    (305)

where

c = a11 / √(a11² + a21²)   and   s = a21 / √(a11² + a21²)    (306)
More generally if a zero is to be introduced in the i-th row and j-th column of an m × n matrix A by rotating with the element in the k-th row and j-th column, then an m × m Given’s rotation matrix, G, can be applied. G is an identity matrix apart from the four elements in rows and columns k and i:

            k        i
GA =   [ 1 …  0  …  0  … 0 ] [ a11 … a1j … a1n ]
       [ :       :       :      : ] [  :       :       :  ]
     k [ 0 …  c  …  s  … 0 ] [ ak1 … akj … akn ]    (307)
       [ :       :       :      : ] [  :       :       :  ]
     i [ 0 … −s  …  c  … 0 ] [ ai1 … aij … ain ]
       [ :       :       :      : ] [  :       :       :  ]
     m [ 0 …  0  …  0  … 1 ] [ am1 … amj … amn ]

where c = a_kj / √(a_kj² + a_ij²) and s = a_ij / √(a_kj² + a_ij²).
Given’s rotations are particularly useful for realizing the upper triangular R matrix in a QR decomposition
algorithm. Consider that a 5 × 3 full rank matrix is to be decomposed into its Q and R components (for
notational clarity all matrix variables row-column subscripts have been omitted):
[Figure: QR decomposition of a 5 × 3 full rank A-matrix into the upper triangular R-matrix by a sequence of Given’s rotations; the elements (5,1), (4,1), (3,1) and (2,1) are zeroed first, then (5,2), (4,2) and (3,2), and finally (5,3) and (4,3).]
All of the elements below the main diagonal in column 1 are first rotated with the a 11 element and after
four Given’s rotations all appropriate elements are zeroed. For column 2, all elements below the main
diagonal are rotated with the e 22 element, and after three Given’s rotations all appropriate elements are
zeroed. Finally for column 3, all elements below the main diagonal are rotated with i 33 and after two
Given’s rotations the upper triangular matrix R is realized. Note that the order of element rotation is
important in order that previously zeroed elements are retained as zeroes when subsequent columns are
rotated. Also note that when a matrix is rotated the only elements that change are the ones in the row with
the element being zeroed, and the row with which the element is being rotated. Finally if the Q matrix is
specifically required, then the Given’s rotation (sparse) matrices of the form in Eq. 307 can be retained
and multiplied together at a later stage.
The name Given’s rotations is after W. Givens, and the word rotation is used because the transform corresponds to an angle rotation of a vector [x, y]ᵀ in the x-y plane to the vector [x_r, y_r]ᵀ by an angle of θ; this also explains the name “plane rotation”:

[ x_r ]   [  cos θ  sin θ ] [ x ]
[ y_r ] = [ −sin θ  cos θ ] [ y ]

where cos θ = x/√(x² + y²), sin θ = y/√(x² + y²), and θ = tan⁻¹(y/x).
[Figure: the vector (x, y) rotated through the angle θ to (x_r, y_r) in the x-y plane.]
Because of the orthogonal nature of the Given’s rotations, the technique is numerically well behaved. From an intuitive consideration of Eq. 306 it can be seen that the magnitudes of c and s can never exceed one (i.e. |c| ≤ 1 and |s| ≤ 1) and therefore elements in the transformed matrix will have adequately bounded values.
Over the last few years Given’s rotations have been widely used for adaptive signal processing problems where fast numerically stable parallel algorithms have been required. See also Matrix Decompositions - QR, Recursive Least Squares - QR.
• Householder Transformation: The Householder transformation is an m × m matrix, H, used to zero the elements below the main diagonal in the k-th column of a full rank m × n matrix A:
[Figure: premultiplying A by a Householder matrix H, which is an identity in its first k − 1 rows and columns and has a dense (m − k + 1) × (m − k + 1) lower-right block, zeroes the elements below the main diagonal in column k of A. Note that only elements in the rows being zeroed actually change.]
Householder matrices are orthogonal, i.e. HH T = I , and also symmetric, i.e. H = H T . The Householder
transformation can be illustrated by noting that the k – 1 lower elements of a k × 1 vector, x, can be
zeroed by premultiplying with a suitable Householder matrix:
Hx = ( I − 2vvᵀ/(vᵀv) ) x = [ −||x||₂  0  0  …  0 ]ᵀ    (308)

where

v = [ x1 + ||x||₂   x2   x3   …   xk ]ᵀ    (309)

and the 2-norm ||x||₂ = √( x1² + x2² + x3² + … + xk² ).
Therefore the general Householder matrix, H_k, to zero the elements in column k below the main diagonal of a matrix A can be written in a partitioned matrix form:

H_kA = [ I  0    ] [ A_11 A_1k ]   [ A_11        A_1k       ]
       [ 0  H_kk ] [ A_k1 A_kk ] = [ H_kk A_k1   H_kk A_kk  ]

     = [ a11        a12        …  a1k   …  a1n        ]
       [ a21        a22        …  a2k   …  a2n        ]
       [  :          :              :         :        ]
       [ b_k1       b_k2       …  b_kk  …  b_kn       ]    (310)
       [ b_{k+1,1}  b_{k+1,2}  …   0    …  b_{k+1,n}  ]
       [  :          :              :         :        ]
       [ b_m1       b_m2       …   0    …  b_mn       ]

where

H_kk = I − 2v_k v_kᵀ/(v_kᵀ v_k)   with   v_k = [ a_kk + ||a_kk||₂,  a_{k+1,k},  …,  a_mk ]ᵀ

and the vector a_kk = [ a_kk, a_{k+1,k}, …, a_mk ]ᵀ, so that ||a_kk||₂ = √( a_kk² + a_{k+1,k}² + … + a_mk² ).
A sequence of Householder matrices is very useful for performing certain matrix transforms such as the
QR decomposition. Consider an example where a 5 × 3 full rank matrix is to be decomposed into its Q
and R components (for clarity all matrix variable row-column subscripts have been omitted):
A-Matrix         H1A            H2H1A          H3H2H1A (R-Matrix)
[ a a a ]       [ b b b ]      [ b b b ]      [ b b b ]
[ a a a ]       [ 0 b b ]      [ 0 c c ]      [ 0 c c ]
[ a a a ]   →   [ 0 b b ]  →   [ 0 0 c ]  →   [ 0 0 d ]    (311)
[ a a a ]       [ 0 b b ]      [ 0 0 c ]      [ 0 0 0 ]
[ a a a ]       [ 0 b b ]      [ 0 0 c ]      [ 0 0 0 ]
Compared to Given’s rotations, which zero a column vector element by element, Householder transformations require fewer arithmetic operations; however Given’s rotations have become more popular for modern DSP techniques as a result of their suitability for parallel array implementation [77], [88], unlike the Householder transformation which has no recursive implementation.
The zeroing of column elements in a matrix can also be performed by the Gauss transform, typically for implementation of algorithms such as LU decomposition. However unlike the Householder transform, Gauss transforms are not orthogonal. Therefore because the Householder transform does not produce matrices with very large or very small elements (which may happen with the Gauss transform) the numerical behavior is in general good [136]. See also Matrix Decompositions - Given’s Rotations/QR/SVD, Recursive Least Squares - QR.
• LDLT: See Matrix Decompositions - Cholesky.
• LDU: LDU decomposition is a special case of LU decomposition, whereby a non-singular n × n matrix A can be factored into a unit lower triangular matrix L, a diagonal matrix D, and a unit upper triangular matrix U. See also Matrix Decompositions - Cholesky/LU.
• LLT: See Matrix Decompositions - Cholesky.
• LU: The LU decomposition is used to convert a non-singular n × n matrix A into a lower and upper triangular matrix product:

A = LU = [ l11  0   0  …  0  ] [ u11 u12 u13 … u1n ]
         [ l21 l22  0  …  0  ] [  0  u22 u23 … u2n ]
         [ l31 l32 l33 …  0  ] [  0   0  u33 … u3n ]    (312)
         [  :   :   :      :  ] [  :   :   :     :  ]
         [ ln1 ln2 ln3 … lnn ] [  0   0   0  … unn ]
Gaussian elimination (or factorization), via a series of Gauss transforms, can be used to produce the LU decomposition. The k-th Gauss transform matrix, G_k, will zero all of the elements below the main diagonal in the k-th column of an n × n matrix, A. After applying the Gauss transforms G_1 to G_{n−1} an upper triangular matrix is produced:

G_{n−1}…G_2G_1A = U    (313)

To obtain the lower triangular matrix, the above equation can be rearranged to give:

A = G_1⁻¹G_2⁻¹…G_{n−1}⁻¹U  ⇒  A = LU    (314)

where L = G_1⁻¹G_2⁻¹…G_{n−1}⁻¹.
Note that the inverse Gauss transform matrices, G_i⁻¹, are trivial to compute from G_i, and they will also be lower triangular matrices (the product of two lower triangular matrices is always lower triangular).
If a system of equations, Ax = b, is to be solved for the unknown n element vector x, where A is an n × n non-singular matrix, and b a known n element vector, the solution can be found by LU factoring matrix A, and performing a forward substitution and a back substitution:

Ax = b  ⇒  LUx = b  ⇒  { Ly = b    solve by forward substitution
                         { Ux = y    solve by backward substitution    (315)

It is however less computation to perform Gaussian elimination, which forms the U matrix but does not explicitly form the L matrix. In general using LU decomposition (or Gaussian elimination) to solve a system of linear equations does not have good numerical behavior, and the existence of small elements on the diagonals of L and U, and large values elsewhere, may lead to the computation requiring a very large dynamic range. Therefore pivoting techniques are usually used on the Gaussian elimination computation in an attempt to circumvent the effects of small and large values.
See also Matrix Decompositions - Backsubstitution/Cholesky/Forward Substitution/Gaussian Elimination/
LDU/LDLT/Pivoting.
• Partial Pivoting: See entry for Matrix Decompositions - Pivoting.
• Pivoting: When performing certain forms of matrix decomposition such as LU, small elements on the main
diagonal are used as divisors when producing matrices such as Gauss transforms to zero certain
elements in the matrix. If these elements are very small then they can result in very large numbers
appearing in the matrices resulting from the decomposition.
For example consider the LU decomposition of the following 3 × 3 matrix:

A = [ 0.0001 1 1 ]   [   1    0 0 ] [ 0.0001    1      1    ]
    [   1    1 2 ] = [ 10000  1 0 ] [   0    −9999  −9998 ] = LU    (316)
    [   1    1 3 ]   [ 10000  1 1 ] [   0      0      1    ]
If fixed point arithmetic is used, then the dynamic range of numbers required for the L and U matrices is
twice that for the A matrix. Small pivot elements can be avoided by rearranging the A matrix elements
using orthogonal permutation matrices. Therefore for the above example:
PA = [ 0 0 1 ] [ 0.0001 1 1 ]   [   1    1 3 ]   [   1    0 0 ] [ 1    1       3     ]
     [ 1 0 0 ] [   1    1 2 ] = [ 0.0001 1 1 ] = [ 0.0001 1 0 ] [ 0  0.9999  0.9997 ] = L_p U_p    (317)
     [ 0 1 0 ] [   1    1 3 ]   [   1    1 2 ]   [   1    0 1 ] [ 0    0      −1     ]
and the LU factors now contain suitably small elements. In general when performing pivoting, prior to applying the Gauss transform on the k-th column, the column is scanned to find the largest magnitude element in order to set up the permutation matrix to appropriately swap the rows and attempt to ensure that small pivots are avoided. If a system of linear equations:

Ax = b    (318)

is to be solved using Gaussian elimination (or more exactly LU decomposition with one stage of pivoting), where A is a non-singular n × n matrix, b is a known n element vector, and x is an unknown n element vector, then:
PAx = Pb  ⇒  LUx = Pb  ⇒  { Ly = Pb   solve by forward substitution
                            { Ux = y    solve by backward substitution    (319)
If both the rows and the columns are scanned to circumvent small pivots, then this is often referred to as
complete pivoting. Column swapping is achieved by postmultiplication of matrix A, with a suitable
permutation matrix Q. Pivoting can be used on many other linear algebraic decompositions where small
pivoting/divisor elements need to be avoided. Note that because the pivot matrix P (and also Q) is
orthogonal, then for least squares type operations, the 2-norm of the pivoted matrix, PA is not affected.
See also Matrix Decomposition - Gaussian Elimination/LU, Vector Properties - Norm.
• Plane Rotations: See entry for Matrix Decompositions - Given’s Rotations.
• QR: The QR matrix decomposition is an extremely useful technique in least squares signal processing
systems where a full rank m × n matrix A ( m > n ) is decomposed into an upper triangular matrix, R and
an orthogonal matrix Q:
[Diagram: an m × n data matrix A (m > n) is factored as A = Q [R; 0], where Q is an m × m orthogonal matrix (Q^T Q = I) and the m × n right hand factor consists of an n × n upper triangular block R above an (m - n) × n block of zeros.]
If the least squares solution is required for the overdetermined linear set of equations:
Ax = b   (320)
where A is an m × n matrix, b is a known m element vector, and x is an unknown n element vector, then
the minimum norm solution is required, i.e. minimize ε, where ε = ||Ax - b||_2. This can be found by the
least squares solution:

x_{LS} = (A^T A)^{-1} A^T b   (321)
However noting that the 2-norm (or Euclidean norm) is invariant under orthogonal transforms, the QR
decomposition allows a different computation method to find the solution. Using a suitable sequence of
Given's rotations or Householder transformations, for a full rank m × n matrix A (where m > n), the QR
decomposition yields:

A = Q \begin{bmatrix} R \\ 0 \end{bmatrix}   (322)
or, written element by element,

\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \\ a_{n+1,1} & a_{n+1,2} & \cdots & a_{n+1,n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
=
\begin{bmatrix} q_{11} & \cdots & q_{1n} & q_{1,n+1} & \cdots & q_{1m} \\ q_{21} & \cdots & q_{2n} & q_{2,n+1} & \cdots & q_{2m} \\ \vdots & & \vdots & \vdots & & \vdots \\ q_{m1} & \cdots & q_{mn} & q_{m,n+1} & \cdots & q_{mm} \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ 0 & r_{22} & \cdots & r_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r_{nn} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}   (323)
where Q is an m × m orthogonal matrix (i.e. QQ^T = I), R is an n × n upper triangular matrix, and 0 is an
(m - n) × n zero matrix. Then:

ε = ||Ax - b||_2 = ||Q^T Ax - Q^T b||_2 = \left\| \begin{bmatrix} R \\ 0 \end{bmatrix} x - \begin{bmatrix} c \\ d \end{bmatrix} \right\|_2 = ||v||_2   (324)

where c is an n element vector, d is an (m - n) element vector, and the vector v is therefore computed as:
\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \\ v_{n+1} \\ \vdots \\ v_m \end{bmatrix}
=
\begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ 0 & r_{22} & \cdots & r_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r_{nn} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
-
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \\ d_1 \\ \vdots \\ d_{m-n} \end{bmatrix}
=
\begin{bmatrix} r_{11}x_1 + r_{12}x_2 + \cdots + r_{1n}x_n - c_1 \\ r_{22}x_2 + \cdots + r_{2n}x_n - c_2 \\ \vdots \\ r_{nn}x_n - c_n \\ -d_1 \\ \vdots \\ -d_{m-n} \end{bmatrix}   (325)
In order to minimize ||v||_2, note that:

||v||_2^2 = ||Rx - c||_2^2 + ||d||_2^2   (326)
Therefore solving the system of equations Rx - c = 0 will give the desired least squares minimization of
||v||_2 (note that the sub-vector norm ||d||_2 cannot be minimized), i.e.,

x_{LS} = R^{-1} c   (327)

which can be conveniently solved using backsubstitution rather than performing the explicit inverse. The
least squares residual is simply the value ||d||_2.
Because of the orthogonal nature of the algorithm, the QR is numerically well behaved and represents an
extremely powerful and versatile basis for least squares signal processing techniques. Also a brief
comparison of the solution obtained in Eq. 321 and that of Eqs. 322-327 will show that the QR approach
operates directly on the data matrix, whereas the pseudoinverse form in Eq. 321 requires the matrix A to be squared. Therefore a simplistic argument is that twice the dynamic range is required to accommodate the
matrix A. Therefore a simplistic argument is that twice the dynamic range is required to accommodate the
spread of numerical values in the pseudoinverse method, as compared to the QR based least squares
solution. (Note that both solutions are identical if infinite precision arithmetic is used.)
See also Least Squares, Matrix Decompositions - Back substitution/Given’s Rotation/Pseudoinverse,
Matrix Properties - Overdetermined, Recursive Least Squares - QR.
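A brief numerical sketch of this comparison (an assumed example, not from the original text; the random 6 × 3
matrix and right hand side are illustrative assumptions) using NumPy's QR routine:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 3))      # m > n, full rank with probability 1
    b = rng.standard_normal(6)

    # QR route: A = QR, then solve the triangular system Rx = Q^T b.
    Q, R = np.linalg.qr(A)               # "reduced" QR: Q is 6x3, R is 3x3
    x_qr = np.linalg.solve(R, Q.T @ b)

    # Normal equation route: x = (A^T A)^{-1} A^T b, which squares the condition number.
    x_ne = np.linalg.solve(A.T @ A, A.T @ b)

    print(np.allclose(x_qr, x_ne))       # True for this well-conditioned example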
• Similarity Transform: Two non-singular n × n matrices A and B are said to be similar if there exists a
similarity transform matrix X, such that:

B = X^{-1} A X   (328)
See also Matrix Decompositions - Eigenanalysis.
• Singular Value: The singular value decomposition (SVD) is one of the most important and useful
decompositions in linear algebraic theory. The SVD allows an m × n matrix A, with
r = rank(A) ≤ min(m, n) to be transformed in the following manner:

U^T A V = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}   (329)

and therefore:

A = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V^T   (330)
where U is an m × m orthogonal matrix, i.e. U T U = I , V is an n × n orthogonal matrix, i.e. V T V = I , and
Σ is a diagonal sub-matrix containing the singular values of A:
Σ = diag ( σ 1, σ 2, σ 3, …, σ r )
(331)
The Σ matrix is usually written such that σ 1 > σ 2 > … > σ r . The singular value decomposition can be
illustrated in a more diagrammatic form. If for matrix A, m > n , and r = rank ( A ) = n the Σ matrix has all
non-zero elements in the main diagonal:
[Diagram: for m > n and r = rank(A) = n, U^T A V is an m × n matrix whose top n × n block is the diagonal matrix Σ of singular values and whose bottom (m - n) × n block is zero.]
Note if r < n then A has linearly dependent columns and there will be only r non-zero elements:
[Diagram: for r < n only the first r diagonal entries of Σ are non-zero; the remaining rows and columns of U^T A V are zero.]
If for matrix A, m < n, and r = rank(A) = m, the Σ matrix has all non-zero elements in the main diagonal
(again, note that if r < m then A has linearly dependent rows and there will be only r non-zero elements):
[Diagram: for m < n, U^T A V is an m × n matrix whose left m × m block is the diagonal matrix Σ and whose remaining m × (n - m) block is zero.]
For signal processing algorithms, one of the main uses of the SVD is the definition of the pseudoinverse,
A + which can be used to provide the least squares solution to a system of linear equations of the form:
Ax = b
(332)
where A is an m × n matrix, b is a known m element vector, and x is an unknown n element vector. The
least squares, minimum norm solution is given by:
x = A^+ b   (333)

where

A^+ = V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U^T   (334)
If it is assumed that A has full rank, i.e. rank(A) = min(m, n), there are three possible cases for the
dimensions of matrix A:
- m = n (square matrix) then A + = A – 1 ;
- m > n (the overdetermined problem) then A + = ( A T A ) – 1 A T , and
- m < n (the underdetermined problem) then A + = A T ( AA T ) – 1 .
The transformation of the pseudo-inverse in Eq. 334 into the three forms shown above can be confirmed
with straightforward linear algebra. Note that if A is rank deficient then none of the above three cases apply
and a solution can only be found using the pseudoinverse of Eq. 334. In DSP systems the overdetermined
problem (such as found in adaptive DSP) is by far the most common and recognizable “least squares
solution”. However the pseudoinverse also provides a minimum norm solution for the underdetermined
problem when A is rank deficient (e.g., inverse modelling problems such as are found in biomedical
imaging and seismic data processing).
Note that if a non-singular square n × n matrix R is symmetric then the eigenvalue decomposition can
be written as:

R = Q^T \Lambda Q   (335)

where Λ = diag(λ_1, λ_2, λ_3, …, λ_n) and the singular values equal the magnitudes of the eigenvalues
(for a symmetric positive definite R the eigenvalues and singular values coincide). If in fact R = A^T A,
and A is a full rank m × n matrix, then the singular values of A are the square roots of the eigenvalues of
R. This can be seen by noting that:

A^T A = V \begin{bmatrix} \Sigma & 0 \end{bmatrix} U^T U \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} V^T = V \Sigma^2 V^T   (336)
where for illustration purposes m > n.
To calculate the singular value decomposition there are two useful techniques - the Jacobi algorithm and
the QR algorithm [15], [77]. See also Least Squares, Matrix Properties - Pseudoinverse, Vector Properties
- Minimum Norm.
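A short sketch of the pseudoinverse built from the SVD (an assumed example, not from the original text; the
4 × 2 test matrix is an illustrative assumption):

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [0.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])            # 4x2, full column rank (m > n)

    U, s, Vt = np.linalg.svd(A)           # A = U diag(s) V^T
    # Build A+ = V [diag(1/s) 0] U^T, inverting only the non-zero singular values.
    S_pinv = np.zeros((2, 4))
    S_pinv[:2, :2] = np.diag(1.0 / s)
    A_pinv = Vt.T @ S_pinv @ U.T

    print(np.allclose(A_pinv, np.linalg.pinv(A)))              # True
    print(np.allclose(A_pinv, np.linalg.inv(A.T @ A) @ A.T))   # equals (A^T A)^{-1} A^T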
• Spectral Decomposition: The eigenvalue-eigenvector decomposition of a matrix is often referred to as
the spectral decomposition. See also Matrix Decomposition - Eigenanalysis.
• Square Root Free Given’s Rotations: Square root free Given’s rotations (also known as fast Given’s)
are simply a rearranged version of the Given’s rotation, where the square root operation has been
circumvented, and an additional diagonal matrix introduced [15]. The reason for doing so is that most DSP
processors are not optimized for the square root operation, and hence their implementation can be slow.
It is worth pointing out that stable versions of the square root free Given’s require more divisions per
rotation than standard Given’s, and DSP processors usually perform square roots faster than divides!
Hence the alternative name of fast Given’s is not a wholly representative one. It is also worthwhile
noting that the square root free Given’s may have numerical problems of overflow and underflow, unlike
the standard Given’s rotations. Unless square rooting is impossible, there is probably no good reason to
use square root free Given’s rotations.
• Square Root Decomposition: See entry for Matrix Decompositions - Cholesky Decomposition.
• Triangularization: There are a number of matrix decompositions and algorithms which produce factors
of a matrix that have upper and lower triangular forms. Any such procedure can therefore be referred to
as a Triangularization. See Matrix Decompositions - Cholesky/LU/QR.
Matrix Identities: See Matrix Properties.
Matrix Inverse: See Matrix Properties - Inversion.
Matrix Inversion Lemma: See Matrix Properties - Inversion Lemma.
Matrix Addition: See Matrix Operations - Addition.
Matrix Multiplication: See Matrix Operations - Multiplication.
Matrix Postmultiplication: See Matrix Operations - Postmultiplication.
Matrix Premultiplication: See Matrix Operations - Premultiplication.
Matrix Operations: Matrices can be added, subtracted, multiplied, scaled, transposed, and
inverted. See also Matrix Operation Complexity.
• Addition (Subtraction): If two matrices are to be added (or subtracted) then they must be of exactly the
same dimensions. Each element in one matrix is added to (or subtracted from) the corresponding element in the other
matrix. For example:

\begin{bmatrix} 1 & 5 & 4 \\ 6 & 2 & 3 \\ 2 & 8 & 7 \end{bmatrix} + \begin{bmatrix} 3 & 5 & 1 \\ 0 & 2 & 1 \\ 3 & 2 & 0 \end{bmatrix}
= \begin{bmatrix} (1+3) & (5+5) & (4+1) \\ (6+0) & (2+2) & (3+1) \\ (2+3) & (8+2) & (7+0) \end{bmatrix}
= \begin{bmatrix} 4 & 10 & 5 \\ 6 & 4 & 4 \\ 5 & 10 & 7 \end{bmatrix}   (337)
Matrix addition is commutative, i.e. A + B = B + A.   (338)
• Hermitian Transpose: When the Hermitian transpose of a complex matrix is found, the n-th row of the
matrix is written as the n-th column and each (complex) element of the matrix is conjugated. The Hermitian
transpose of a matrix A is denoted as A^H. Note that the matrix product AA^H will always produce a Hermitian
(conjugate symmetric) matrix with real elements on the main diagonal.
A = \begin{bmatrix} (1+2j) & (-2+j) & (-1+4j) \\ (3+j) & (3+7j) & (1+5j) \end{bmatrix}
\;\Leftrightarrow\;
A^H = \begin{bmatrix} (1-2j) & (3-j) \\ (-2-j) & (3-7j) \\ (-1-4j) & (1-5j) \end{bmatrix}   (339)

\Rightarrow AA^H = \begin{bmatrix} 27 & (25+31j) \\ (25-31j) & 94 \end{bmatrix}

Note that if a matrix, B, has only real number elements, then B^H = B^T. See also Matrix Properties - Hermitian, Complex Matrix, Matrix.
• Inverse: If for two square matrices A and B:
AB = I
(340)
then B can be referred to as the inverse of A, or B = A-1. If A-1 exists, then A is non-singular. Note that
AA^{-1} = A^{-1}A = I   (341)

and

(AB)^{-1} = B^{-1}A^{-1}   (342)
For example:

A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 3 \\ 0 & 1 & 2 \end{bmatrix} \;\Rightarrow\; A^{-1} = \begin{bmatrix} -1 & 1 & -1 \\ -4 & 2 & -1 \\ 2 & -1 & 1 \end{bmatrix}   (343)

AA^{-1} = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 3 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} -1 & 1 & -1 \\ -4 & 2 & -1 \\ 2 & -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}   (344)

Inversion of matrices is useful for analytical procedures in DSP; however, its use in real time computation
is rare because of the very large computation requirements and the potential numerical instability of the
algorithm. In general the explicit inversion of matrices is circumvented by the use of linear algebraic
methods such as LU decomposition (with pivoting), QR decomposition and Cholesky decomposition (for
symmetric matrices), which have improved numerical properties [15].
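A minimal sketch (an assumed example, not from the original text) of avoiding the explicit inverse when
solving Ax = b, using the 3 × 3 matrix of Eq. 343:

    import numpy as np

    A = np.array([[1.0, 0.0, 1.0],
                  [2.0, 1.0, 3.0],
                  [0.0, 1.0, 2.0]])
    b = np.array([1.0, 2.0, 3.0])

    x_solve = np.linalg.solve(A, b)      # preferred: LU factorise and substitute
    x_inv = np.linalg.inv(A) @ b         # explicit inverse: more work, less robust
    print(np.allclose(x_solve, x_inv))   # True for this small well-conditioned A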
• Kronecker Product: This is a useful mathematical operator for generating vectors and matrices. It is
particularly useful in interpretive programming languages such as MatlabTM for implementing simple DSP
operations such as upsampling. In general, the Kronecker Product multiplies every element of one matrix
by a second matrix and arranges these matrices into the same shape as the first matrix.
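As a small sketch of the idea (an assumed example in Python/NumPy rather than the Matlab mentioned
above), the Kronecker product can upsample a short signal by a factor of three:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0])                  # original samples
    up = np.kron(x, np.array([1.0, 0.0, 0.0]))     # insert two zeros after each sample
    print(up)                                      # [1. 0. 0. 2. 0. 0. 3. 0. 0.]

    # The same operator builds block matrices, e.g. np.kron(np.eye(2), A) places a
    # copy of a matrix A on each diagonal block of a larger matrix.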
• Multiplication: The multiplication of two matrices AB is only possible when the number of columns in A
is the same as the number of rows in B. Each row of matrix A is multiplied by each column of B in a sum
of products (or vector inner product) form. If A is an m × n matrix and B is an n × p matrix the result will
be C, an m × p matrix. (Note that because of the dimensions the product BA cannot be formed unless
m = p. Matrix-matrix multiplication is not a commutative operation, i.e. in general AB ≠ BA.)
For example, if we form the matrix product C = AB, where A is a 3 × 4 , and B is a 4 × 2 matrix:
A = \begin{bmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \end{bmatrix}, \quad B = \begin{bmatrix} m & n \\ o & p \\ q & r \\ s & t \end{bmatrix}   (345)

then

\underbrace{\begin{bmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \end{bmatrix}}_{A \; (3 \times 4)}
\underbrace{\begin{bmatrix} m & n \\ o & p \\ q & r \\ s & t \end{bmatrix}}_{B \; (4 \times 2)}
=
\underbrace{\begin{bmatrix} (am + bo + cq + ds) & (an + bp + cr + dt) \\ (em + fo + gq + hs) & (en + fp + gr + ht) \\ (im + jo + kq + ls) & (in + jp + kr + lt) \end{bmatrix}}_{C \; (3 \times 2)}
In general for an m × n matrix, A, and an n × p matrix, B, the m × p elements of the product matrix C will
be:

c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}   (346)
• Matrix-Vector Multiplication: Multiplication of a vector by a matrix is a special case of matrix
multiplication, where one of the matrices to be multiplied is a vector, or n × 1 matrix. Multiplication of an
n × 1 vector by an m × n matrix yields an m × 1 vector.
y = Rx = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= \begin{bmatrix} (a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4) \\ (a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4) \\ (a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4) \end{bmatrix}
= \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}   (347)
• Premultiplication: See Postmultiplication.
• Postmultiplication: Noting that in general for two matrices, A and B, (of dimension n × m and m × n
respectively):
AB ≠ BA   (348)
and therefore when multiplying two matrices it is important to specify the order. If it is required to multiply
two matrices then the order can be verbosely described using the term postmultiplication or
premultiplication. To state that matrix C is formed by A being postmultiplied by B means:
C = AB   (349)
which is equivalent to stating that B is premultiplied by A.
• Scaling: A matrix, A, is scaled by multiplying every element by a scale factor, c.
cA = c \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} ca_{11} & ca_{12} & ca_{13} \\ ca_{21} & ca_{22} & ca_{23} \\ ca_{31} & ca_{32} & ca_{33} \end{bmatrix}   (350)
• Transpose: The transpose of a matrix is obtained by writing the n-th column (top to bottom) of the matrix
as the n-th row (left to right). The transpose of a matrix, A, is denoted as AT. For example, if:
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}
\;\Rightarrow\;
A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} & a_{41} \\ a_{12} & a_{22} & a_{32} & a_{42} \\ a_{13} & a_{23} & a_{33} & a_{43} \end{bmatrix}   (351)
Therefore if B = A T , then for every element of A and B, a ij = b ji . Note also the identity:
(AB)^T = B^T A^T   (352)

and

(A^T)^T = A   (353)
The product of A T A is frequently found in DSP particularly in least squares derived algorithms. See also
Hermitian Transpose.
• Subtraction: See Matrix Operations - Addition (Subtraction).
• Vector-Matrix Multiplication: See Matrix-Vector Multiplication.
Matrix Operation Complexity: The number of arithmetic operations to perform the fundamental
matrix operations of addition (subtraction), multiplication and inversion can be given in terms of the
number of multiplies, adds, divisions and square roots that are required.
Matrix Operation      Matrix Dimension       Additions   Multiplies   Divides/Sqrts
Addition A + B        (m × n) + (m × n)      mn          0            0
Multiplication AB     (m × n)(n × p)         mnp         mnp          0
Inversion A^-1        (n × n)                O(n^3)      O(n^3)       O(n^2)
In general if a matrix is sparse (e.g. upper triangular, diagonal etc.) then the number of arithmetic
operations will be reduced since operations with one or more zero arguments need not be
performed. For example multiplication of two diagonal matrices both of dimension n × n requires
only n multiplies and adds. Also inversion of a diagonal matrix only requires n divisions.
It is worth noting that the matrix inverse is rarely calculated explicitly and systems of linear
equations of the form Ax = b are usually solved via Gaussian Elimination, or QR decomposition
type algorithms [15].
Matrix, Partitioning: It is often convenient to group the elements of a matrix into smaller
submatrices either for notational convenience or to highlight a logical division between two
quantities represented in the same matrix. For example the 6 × 4 matrix A, can be partitioned into
four 3×2 submatrices:
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \\ a_{51} & a_{52} & a_{53} & a_{54} \\ a_{61} & a_{62} & a_{63} & a_{64} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}   (354)
A partitioned matrix is often referred to as a block matrix, i.e. a matrix in which the elements are
submatrices, rather than scalars. The use of block matrices is often exploited in the development
of DSP algorithms for notational convenience.
The specification of an algorithm using partitioned matrices (block matrices) is often referred to as
a block algorithm. Block algorithms (such as block matrix multiplication and addition etc.) should be
expressed such that the block dimensions and the submatrix dimensions are consistent with the
normal procedures of the matrix operation. QR decomposition and the matrix vector form of an IIR
filter can be conveniently represented as block matrix algorithms.
For example consider the multiplication of the 6 ×4 matrix partitioned into 3 ×2 blocks (or
submatrices) by a 4 ×4 matrix partitioned into 2 ×2 blocks or submatrices. The product C = AB can
be expressed in terms of the submatrices. Note that the dimensions of the submatrices Aim and Bmj
must be such that they can be matrix multiplied. In this example the result gives submatrices Cij of
dimension 3 ×2.
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}   (355)

C = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
= \begin{bmatrix} (A_{11}B_{11} + A_{12}B_{21}) & (A_{11}B_{12} + A_{12}B_{22}) \\ (A_{21}B_{11} + A_{22}B_{21}) & (A_{21}B_{12} + A_{22}B_{22}) \end{bmatrix}
= \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}   (356)
Matrix Properties: In this entry properties of a matrix include useful identities and general forms of
information that can be extracted from or stated about a matrix. See also Matrix Decompositions,
Matrix Operations.
• Condition Number: The condition number provides a measure of the ill-condition or poor numerical
behavior of a matrix. Consider the following set of equations where A is a known n × n non singular matrix,
and b is a known n × 1 vector:
Ax = b   (357)
The solution to this system of equations is well known to be:
x = A^{-1} b   (358)
Using a processor with “infinite” arithmetic precision an exact answer will be obtained. If however the
equation is to be solved using finite precision arithmetic, then this can be modeled as a small error added
to the elements of A and b where this error is such that:

\frac{||\delta A||}{||A||} \approx \varepsilon \quad \text{and} \quad \frac{||\delta b||}{||b||} \approx \varepsilon, \quad \text{where } \varepsilon \ll 1   (359)
Therefore the problem is now one of solving:
x + δx = ( A + δA ) – 1 ( b + δb )
(360)
where δA and δb represent the error (or perturbation) matrix and vector of A and b respectively. It can
be shown that the relative error of the norm (perturbation) of the vector x is given by:
\frac{||\delta x||}{||x||} \leq \varepsilon \kappa(A)   (361)
where for a square matrix A the condition number, κ ( A ) , is defined as:
\kappa(A) = ||A|| \, ||A^{-1}||   (362)
The norm of a matrix, A , gives information in some sense of the magnitude of the matrix. One measure
of matrix norm is its largest singular value. If the matrix A is decomposed using the singular value
decomposition (SVD):
A = U \Sigma V^T   (363)
where Σ = diag ( σ 1, σ 2, σ 3, …, σ n ) is a diagonal matrix denoting the singular values of A, and U and V
are orthogonal matrices. The condition number of matrix A, denoted as κ ( A ) is defined as the ratio of
the largest singular value to the smallest singular value (in accordance with Eq. 362):
\kappa(A) = \frac{\max(\sigma_i)}{\min(\sigma_i)} \quad \text{for } 1 \leq i \leq n   (364)
Therefore if a matrix has a very large condition number a simple interpretation is that when solving
equations of the form in Eq. 358 then even very small errors in the matrix A, as modelled in Eq. 360, may
lead to very large errors in the solution vector x; hence “numerical” care must be taken.
To state the relevance of κ ( A ) in another way, if the condition number is very large then this implies that
when calculating the inverse matrix:
A^{-1} = V \Sigma^{-1} U^T   (365)
the dynamic range of numbers in the inverse will be very large. This is easily seen by noting that
\Sigma^{-1} = diag(\sigma_1^{-1}, \sigma_2^{-1}, \sigma_3^{-1}, …, \sigma_n^{-1}). For example if:
A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad \text{then} \quad A^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix} \quad \text{and} \quad \kappa(A) = 2   (366)

the matrix A is well-conditioned and a numerical dynamic range of around 0.1 to 10
(40dB = 20 log(10 / 0.1)) is “suitable” for the arithmetic. However for a matrix B:
B = \begin{bmatrix} 1 & 0 \\ 0 & 0.0001 \end{bmatrix}, \quad \text{then} \quad B^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 10000 \end{bmatrix} \quad \text{and} \quad \kappa(B) = 10000   (367)
the condition number highlights the ill-conditioning of the matrix, and this time a numerical dynamic range
of around 0.0001 to 10000 (160dB) is required for reliable arithmetic. Therefore matrix A could be reliably
inverted by a 16 bit DSP processor (96dB dynamic range), whereas matrix B would require a 32 bit floating
point DSP processor (764dB dynamic range).
Note that the larger the condition number the “closer” the matrix is to singularity. A singular matrix will have
a condition number of ∞ .
For analysis of many DSP algorithms note that the condition number is often given as the ratio of the
largest eigenvalue to the smallest eigenvalue:
\kappa(A) = \frac{\text{Largest Eigenvalue}}{\text{Smallest Eigenvalue}} = \frac{\lambda_{max}}{\lambda_{min}}   (368)
This is because in most DSP problems solved using linear algebra techniques the matrix A is square and
very often symmetric positive definite, and the eigenvalue decomposition is in fact a special case of the
more general singular value decomposition, and the eigenvalues are the same as the singular values. See
also Adaptive Signal Processing, Matrix Decompositions - Eigenvalue/Singular Value, Matrix Properties - Norm/Eigenvalue Ratio, Vector Properties - Norm, Recursive Least Squares.
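A quick numerical sketch (an assumed example, not from the original text) of the condition numbers of the
matrices A and B from Eqs. 366 and 367:

    import numpy as np

    A = np.array([[1.0, 0.0], [0.0, 2.0]])
    B = np.array([[1.0, 0.0], [0.0, 0.0001]])

    print(np.linalg.cond(A))    # 2.0: well-conditioned
    print(np.linalg.cond(B))    # 10000.0: ill-conditioned

    # A relative error of order eps in the data of Ax = b can be amplified by up to
    # cond(.) in the computed solution (Eq. 361).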
• Conjugate Transpose: See Matrix Properties - Hermitian Transpose.
• Determinant: Noting that for a 1 × 1 matrix, α = [a], the determinant is given by det(α) = a, the
determinant of a square matrix, A, of dimension m × m can be defined recursively in terms of the
determinants of the related (m - 1) × (m - 1) matrices, A_{1i}, obtained by deleting the first row and the i-th
column of A:

det(A) = \sum_{i=1}^{m} (-1)^{i+1} a_{1i} \det(A_{1i})   (369)
where a 1i is the first element in the i-th column of the matrix. If det ( A ) = 0 then the matrix is singular.
Also for two square matrices A and B it can be shown that det ( AB ) = det ( A )det ( B ) , and
det(A^T) = det(A). A non-zero determinant indicates that the rows/columns of the matrix are linearly
independent, i.e. the matrix has full rank.
• Eigenvalue: For a square n × n matrix, A, if there exists a non-zero n × 1 vector x, and a scalar
λ such that:
Ax = λx
(370)
then λ is an eigenvalue and x is an eigenvector of matrix A. See also Matrix Decompositions Eigenanalysis.
• Eigenvalue Ratio: The ratio of the largest eigenvalue to the smallest eigenvalue, denoted κ ( A ) , for a
square symmetric positive definite matrix, A:
\kappa(A) = \frac{\text{Largest Eigenvalue}}{\text{Smallest Eigenvalue}} = \frac{\lambda_{max}}{\lambda_{min}}   (371)
is more precisely known as the condition number of a matrix. The eigenvalue ratio (also known as
eigenvalue spread) gives information about the general numerical behavior (good or otherwise!) of a data
matrix A when a problem, usually of the form Ax = b, is solved for the unknown vector x, i.e.
x = A – 1 b . See also Matrix Properties - Condition Number, Matrix Decompositions - Eigenvalue/Singular
Value, Adaptive Signal Processing Algorithms.
• Eigenvalue Spread: See entry for Matrix Properties - Condition Number/Eigenvalue Ratio.
• Frobenius Norm: See Matrix Properties - Norm. See also Vector Properties - Norm.
• Hermitian (Symmetric): A complex matrix is often described as Hermitian if A = A H . Synonymous
names are Hermitian symmetric, or complex-symmetric. Note that if the matrix A is real, then A = A T and
A would be described as symmetric. See Matrix Properties - Hermitian Transpose.
• Hermitian Transpose: For two complex matrices A ( m × n ) and B ( n × m ) then the Hermitian transpose
of the product can be written as:
(AB)^H = B^H A^H   (372)

Note that:

(A^H)^H = A   (373)

A “dagger” is often used as the Hermitian transpose symbol, i.e. A^H ≡ A†.
The matrix product R of an m × n matrix, A and its Hermitian transpose, A H will always produce a
conjugate symmetric m × m matrix, i.e. R = R H :
A = \begin{bmatrix} (1+2j) & (-2+j) & (-1+4j) \\ (3+j) & (3+7j) & (1+5j) \end{bmatrix}
\;\Leftrightarrow\;
A^H = \begin{bmatrix} (1-2j) & (3-j) \\ (-2-j) & (3-7j) \\ (-1-4j) & (1-5j) \end{bmatrix}

\Rightarrow R = AA^H = \begin{bmatrix} 27 & (25+31j) \\ (25-31j) & 94 \end{bmatrix} = R^H   (374)
(also, if A is full rank, then R will be positive definite, otherwise R will be positive semi-definite).
Note that if a matrix, B, has only real number elements, then the Hermitian transpose is equivalent to the
normal matrix transpose, i.e. B H = B T . See also Complex Matrix, Complex Numbers, Matrix Properties
- Hermitian Transpose.
• Ill-Conditioned: An m × n matrix, A is said to be ill-conditioned when the condition number, calculated
as the ratio of the maximum singular value to minimum singular value (or maximum eigenvalue to
minimum eigenvalue for n × n matrices) is very high. A matrix that is not ill-conditioned is well-conditioned.
For more detail see entry Matrix Properties - Condition Number. See also Matrix Decompositions - Eigenvalue/Singular Value.
• ∞-norm: See Matrix Properties - Norm.
• Inversion: For two square invertible matrices A and B then:
( AB ) – 1 = B – 1 A –1
(375)
See also Matrix Operations - Inversion.
• Inversion Lemma: If A and C are nonsingular square matrices and B and D are of compatible
dimension such that:
P = A + BCD
(376)
and P is non singular, then the matrix inversion lemma allows P – 1 to be expressed as:
P^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}   (377)

This identity can be confirmed by multiplying the right hand sides of Eq. 376 and Eq. 377 together:
(A + BCD)(A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1})
 = I + BCDA^{-1} - B(C^{-1} + DA^{-1}B)^{-1}DA^{-1} - BCDA^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}
 = I + BCDA^{-1} - B[I + CDA^{-1}B](C^{-1} + DA^{-1}B)^{-1}DA^{-1}
 = I + BCDA^{-1} - BC[C^{-1} + DA^{-1}B](C^{-1} + DA^{-1}B)^{-1}DA^{-1}
 = I + BCDA^{-1} - BCDA^{-1}
 = I      QED   (378)
For some digital signal processing algorithms (such as the recursive least squares (RLS) algorithm) it is
often the case that C is a 1 × 1 identity matrix, B is a vector and D the same vector transposed. Also for
notational reasons A is written as an inverse matrix. Therefore applying the matrix inversion lemma to:

P = R^{-1} + vv^T   (379)

gives

P^{-1} = R - Rv(1 + v^T R v)^{-1} v^T R   (380)
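A numerical check of the rank-one form in Eqs. 379-380 (an assumed sketch, not from the original text; the
random positive definite R and vector v are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(1)
    M = rng.standard_normal((4, 4))
    R = M @ M.T + 4 * np.eye(4)          # a symmetric positive definite R
    v = rng.standard_normal(4)

    P = np.linalg.inv(R) + np.outer(v, v)                           # P = R^{-1} + v v^T
    P_inv_lemma = R - (R @ np.outer(v, v) @ R) / (1.0 + v @ R @ v)

    print(np.allclose(np.linalg.inv(P), P_inv_lemma))               # True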
• Non-negative Definite: See entry for Matrix Properties - Positive Definite.
• Nonsingular: See Matrix Properties - Singular.
• Norm: A matrix norm gives a measure of the overall magnitude of a matrix. The most common
norms are the Frobenius norm and the set of p-norms.
The Frobenius norm of an m × n matrix A is usually denoted ||A||_F and calculated as:

||A||_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2}   (381)
The p-norms are generally defined in terms of vector p-norms and calculated as:

||A||_p = \max_{x \neq 0} \frac{||Ax||_p}{||x||_p}   (382)

This can also be expressed in the form:

||A||_p = \max ||Au||_p \quad \text{where} \quad ||u||_p = 1   (383)
On an intuitive level, the matrix 2-norm gives information on the amount by which a matrix will “amplify”
the length (vector 2-norm) of any unit vector. Typically p = 1, 2 or ∞. Note that the ∞-norm is easily
calculated as the maximum absolute row sum of the matrix. See also Matrix Properties - Condition Number,
Vector Properties - Norms.
• Null Space: The null space of A is defined as:
null(A) = \{ x \in \Re^n, \text{ where } Ax = 0 \}   (384)
Intuitively, the null space of A is the set of all vectors orthogonal to the rows of A. See also Matrix
Properties - Rank/Range, Vector Properties - Space/Subspace.
• 1-norm: See Matrix Properties - Norm.
• Overdetermined System: The linear set of equations, Ax = b, where A is a known m × n matrix with
linearly independent columns (i.e. rank ( A ) = n ), b is a known m element vector and x is an unknown n
element vector, is said to be overdetermined if m > n, meaning there are more equations than
unknowns. An overdetermined system of equations in general has no exact solution for x. However, by minimizing
the 2-norm of the error vector, e = Ax - b, i.e. minimizing ε = ||Ax - b||_2, the least squares solution is
found:

x_{LS} = (A^T A)^{-1} A^T b   (385)
For example, given the overdetermined system of equations (note there is no exact solution):
\begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix}   (386)
we can make a geometrical interpretation of the least squares solution by representing the various vectors
and projected vectors in three dimensional space:
[Figure: the vector b = (3, 4, 2) plotted in three-dimensional (x, y, z) space, together with the vector Ax (constrained to the x-z plane) and the error vector e = Ax - b.]
Now considering the subspace defined by the matrix:
A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}   (387)
the columns only span the x-z plane ( y = 0 ) of the above three-dimensional space. Therefore the vector
Ax_{LS} that minimizes the norm of the error vector, ε = ||Ax - b||_2, must lie on the x-z plane. Using the
least squares solution:
x_{LS} = (A^T A)^{-1} A^T b = \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}   (388)
From the above geometrical representation it should be clear that because the vector Ax is
constrained to lie in the x-z plane, if the 2-norm (Euclidean length) of the error vector e = Ax – b is to be
minimized this will occur when e is perpendicular (orthogonal) to the x-z plane, i.e. the same solution as
the least squares. For problems with more than three dimensions a geometric interpretation cannot be
offered explicitly, however intuition gained from simpler examples is useful. See also Least Squares,
Square System of Equations, Matrix Properties - Underdetermined System, Vector Properties - 2 norm.
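A short sketch of this example in code (assumed, not from the original text), solving Eq. 386 by the normal
equations and by NumPy's least squares routine:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [0.0, 0.0],
                  [0.0, 1.0]])
    b = np.array([3.0, 4.0, 2.0])

    x_ls = np.linalg.solve(A.T @ A, A.T @ b)                    # (A^T A)^{-1} A^T b = [3, 2]
    x_lstsq, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)

    print(x_ls, x_lstsq)                    # both give [3. 2.]
    print(np.linalg.norm(A @ x_ls - b))     # residual norm = 4 (the unmatched y component)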
• Positive Definite: An n × n square matrix, A, is positive definite if:
x^T A x > 0   (389)

for all non-zero n element vectors, x. If

x^T A x \geq 0   (390)
then A is said to be positive semi-definite or non-negative definite.
Note that if a matrix B has full column rank, then the matrix R calculated as R = B^T B is always positive
definite. R will also be symmetric. This can be simply seen by noting that:

x^T R x = x^T B^T B x = ||Bx||_2^2   (391)

where ||Bx||_2^2 is the square of the 2-norm of the vector Bx, which is always a positive quantity for non-zero
vectors x when B has full column rank. Conversely, a symmetric positive definite matrix can always be
decomposed into its square root or Cholesky form. See also Correlation Matrix, Matrix Decompositions
- Cholesky, Vector Properties - Norm.
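A small sketch (an assumed example, not from the original text) confirming that R = B^T B is symmetric
positive definite by attempting a Cholesky factorisation:

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.standard_normal((5, 3))        # full column rank with probability 1
    R = B.T @ B                            # symmetric positive definite

    L = np.linalg.cholesky(R)              # succeeds only for positive definite R
    print(np.allclose(L @ L.T, R))         # True: R = L L^T

    x = rng.standard_normal(3)
    print(x @ R @ x > 0)                   # the quadratic form is positive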
• Positive Semi-definite: See entry for Matrix Properties - Positive Definite.
• Pseudo-Inverse: If an m × n matrix A, where m > n, has rank(A) = n, then the system of equations Ax = b
cannot be solved by calculating x = A^{-1}b because A (being non-square) is not invertible. However the least squares
solution can be found such that:
x LS = ( A T A ) – 1 A T b
(392)
If A is not full rank (i.e., rank(A)<n) however, then the inverse of ( A T A ) will fail to exist. In this case, the
pseudo-inverse of A, A+, is used. The pseudo-inverse is defined from the singular value decomposition of
A as:
A^+ = V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U^T   (393)

where A has been decomposed (see Matrix Decompositions - Singular Value) into

A = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V^T   (394)
with Σ being a rank r (r<n) diagonal matrix with a well-defined inverse. If A happens to be full rank then the
pseudo-inverse can be directly related to A as: A + = ( A T A ) – 1 A T if m>n. While we have focussed on
the over-determined problem here, we should note that the pseudoinverse also provides a minimum norm
solution for the underdetermined problem where A is rank deficient.
See also Least Squares, Matrix Decompositions - Singular Value Decomposition, Overdetermined
System, Underdetermined System.
• Rank: The rank of a matrix is equal to the number of independent rows or columns of the matrix. For an
m × n matrix, A, where m ≥ n , then rank ( A ) = n if and only if the column vectors are linearly
independent; note that rank(A) = rank(A^T). Similarly, if m < n then rank(A) = m if and only if the
row vectors of A are linearly independent. If rank ( A ) < min ( m, n ) then the matrix may be described as
rank deficient. Note that for an m × m square matrix, if rank ( A ) < m then the matrix is singular.
While in an analytical, academic framework (i.e., infinite precision), the concept of rank is clearly defined,
it becomes somewhat more problematic to define rank when working with matrix based packages such as
MatlabTM. Because of round-off errors, it is possible to have a test for matrix rank indicate a full rank
matrix, when the matrix is actually very poorly conditioned. In some cases software packages warn of rank
deficiencies (especially on matrix inversions). However, in DSP applications the significance of low power
dimensions is often very application specific. Therefore, it is generally a good idea to pay attention to the
condition number of matrices with which you are working. As an example, if you are performing a least
squares filter design and the coefficient magnitudes become enormous (say on the order of 10^15) when
you were expecting much more reasonable numbers (say 10^-1, 10^1, etc.) this is a good indication of
possible rank deficiency (in this case, the rank deficiency is unlikely to be detected by software
monitoring).
See also Matrix Properties - Range/Singular/Condition Number.
• Rank Deficient: See entry for Matrix Properties - Rank.
• Range Space: For an m × n matrix A, the subspace spanned by the column partitioning of the matrix
A = [ a 1, a 2, a 3, …, a n ] is referred to as the range space of the matrix. Therefore:
range ( A ) = { y ∈ ℜ m, where y = Ax }, for any x ∈ ℜ n
(395)
See also Vector Properties - Space/Subspace.
• Singular: For a square matrix, A , if there exists no matrix, X such that AX = I (where I is the identity
matrix) then the inverse matrix, A –1 does NOT exist and the matrix is singular; otherwise the matrix is
nonsingular. For example the matrix:
A = \begin{bmatrix} 1 & 0 \\ 9 & 0 \end{bmatrix}   (396)

is singular as there exists no matrix X such that AX = I. For an n × n singular matrix, A, the rank will be
less than n. See also Matrix Decompositions - Singular Value Decomposition, Matrix Properties - Pseudo-Inverse.
• Singular Value: See Matrix Decompositions - Singular Value Decomposition.
• Sherman-Morrison-Woodbury Formula: See Matrix Properties - Inversion Lemma.
• Space: See Vector Properties - Space.
• Square Root Matrix: If a symmetric (positive definite) matrix, R, is decomposed into its Cholesky factors:

R = LL^T   (397)

where L is a lower triangular matrix, L is often also called a square root matrix of R. There are many other
definitions of matrix square root. For example, for the symmetric square matrix R:

R^{1/2} \equiv V \Lambda^{1/2} V^T   (398)

where the eigen-decomposition of R is used and the square root of the diagonal matrix of eigenvalues is
simply defined as the diagonal matrix of the square roots of the individual eigenvalues.
See also Matrix Decompositions - Cholesky/Eigenanalysis.
• Square System of Equations: The linear set of equations:
Ax = b   (399)
where A is a known non-singular n × n matrix (i.e., rank(A)=n), b is a known n element vector, and x is
an unknown n element vector, represents a square system of equations which has an exact solution for x
given by:
x = A^{-1} b   (400)
For example:
\begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \;\Rightarrow\; \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} -5 \\ 8 \end{bmatrix}   (401)
For large n it is usually not advisable to calculate A-1 directly due to potential numerical instabilities
particularly if A is ill-conditioned. Equations of the form in Eq. 399 are best solved using orthogonal
techniques such as the QR algorithm, or more general matrix decomposition techniques such as LU
decomposition (with pivoting), or Cholesky decomposition if A is symmetric. If matrix A has m > n then
the problem is overdetermined and if m < n then the problem is underdetermined. If the rank of A is less
than n, then the pseudo-inverse is required. See also Least Squares, Matrix Decompositions - Cholesky/LU/QR/SVD, Matrix Properties - Ill-Conditioned/Overdetermined System/Pseudo-Inverse/Underdetermined System.
• Subspace: See Vector Properties - Subspace.
• Trace: The trace of a square n × n matrix, A, is defined as the sum of the diagonal elements of that matrix:
trace(A) = trace \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} = \sum_{i=1}^{n} a_{ii}   (402)
It is relatively straightforward (using matrix decompositions) to show that for any m × n matrix A, and any
n × m matrix B, then:
trace ( AB ) = trace ( BA )
(403)
In DSP a particularly useful property of the trace is that trace ( A ) = λ1 + λ 2 + … + λn , where λ i is the ith eigenvalue of an n × n matrix A. See also Matrix Decompositions - Eigenanalysis.
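A quick numerical check of these trace properties (an assumed sketch, not from the original text, using
arbitrary random matrices):

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))

    print(np.isclose(np.trace(A), np.linalg.eigvals(A).sum().real))   # trace = sum of eigenvalues
    print(np.isclose(np.trace(A @ B), np.trace(B @ A)))               # trace(AB) = trace(BA)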
• Transpose: For two matrices A ( m × n ) and B ( n × m ) then the transpose of the product can be written
as:
(AB)^T = B^T A^T   (404)

Note that:

(A^T)^T = A   (405)
The product of any m × n matrix and its transpose gives an m × m square symmetric matrix:
A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -1 & 5 \end{bmatrix}
\;\Rightarrow\;
AA^T = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -1 & 5 \end{bmatrix} \begin{bmatrix} 1 & 4 \\ 2 & -1 \\ -3 & 5 \end{bmatrix} = \begin{bmatrix} 14 & -13 \\ -13 & 42 \end{bmatrix}   (406)
• 2-norm: See Matrix Properties - Norm.
• Underdetermined System: The linear set of equations Ax = b is said to be underdetermined, when A
is a known m × n matrix with m < n , b is a known m element vector and x is an unknown n element vector.
Essentially, there are fewer equations than unknowns and an infinite number of solutions for x exist. If A
has linearly independent rows (i.e. rank ( A ) = m ), then there are an infinite number of exact solutions. If
rank(A)<m, however, then the set of equations may be inconsistent, i.e., no exact solution exists. In this
latter case, an infinite number of least squares (inexact) solutions exists, with the pseudo-inverse giving
the minimum norm solution.
An underdetermined system of equations has an infinite number of solutions for x. Consider the following
underdetermined system of equations:
\begin{bmatrix} a_{11} & a_{12} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = b_1, \quad \text{i.e.} \quad a_{11}x_1 + a_{12}x_2 = b_1   (407)

Choosing any value for x_1, a value of x_2 satisfying the underdetermined system of equations can be
produced. Hence there is no unique solution and there are an infinite number of solutions. However some
solutions are “better” than others, and the minimum norm solution, where the smallest magnitude 2-norm
||x||_2 is calculated, can be found using least squares techniques.
The underdetermined problem can be usefully illustrated geometrically. Consider the following
underdetermined system of equations:
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}   (408)
The solution set to Eq. 408 is:
x_1 = 3, \quad x_2 = \text{any real number}, \quad x_3 = 2   (409)
Representing this solution in three dimensional space:

[Figure: the solution vectors plotted in three-dimensional (x, y, z) space; since the x_2 component is unconstrained the solutions form a line through the point (3, 0, 2) parallel to the y-axis, with the vector b and an error vector e also indicated.]
From a geometrical interpretation, regardless of the magnitude of x 2 , the matrix A will project the vector
x onto b.
The underdetermined least squares problem can however be uniquely solved using the minimum norm
solution. If, among the exact solutions (for which e = Ax - b = 0), the solution with the smallest 2-norm
||x||_2 is chosen, then from the above geometrical interpretation this occurs when x_2 = 0. This solution is unique and
best in the sense that the x vector has minimum norm. This solution can be calculated by using the least
squares solution for underdetermined systems:
x_{LS} = A^T (AA^T)^{-1} b = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \\ 2 \end{bmatrix}   (410)
See also Least Squares, Matrix Decompositions - Singular Value, Overdetermined systems, Square
System of Equations.
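A sketch of the minimum norm computation for Eq. 408 (an assumed example, not from the original text),
using both the explicit formula and the pseudoinverse:

    import numpy as np

    A = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    b = np.array([3.0, 2.0])

    x_min = A.T @ np.linalg.solve(A @ A.T, b)   # A^T (A A^T)^{-1} b = [3, 0, 2]
    x_pinv = np.linalg.pinv(A) @ b              # the pseudoinverse gives the same solution

    print(x_min, x_pinv, np.allclose(x_min, x_pinv))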
• Well-Conditioned: An m × n matrix, A is said to be well-conditioned when the condition number,
calculated as the ratio of the maximum singular value to minimum singular value (or maximum eigenvalue
to minimum eigenvalue for n × n matrices) is low relative to the precision of the system on which the matrix
is being manipulated. A matrix that is not well-conditioned is ill-conditioned. For more details see entry
Matrix Properties - Condition Number. See also Matrix Decompositions - Eigenvalue/Singular Value.
• Woodbury’s Identity: See Matrix Properties - Inversion Lemma.
Matrix Scaling: See Matrix Operations - Scaling.
Matrix, Structured: A matrix that has regularly grouped elements and a specific structure of zero
elements is called a structured matrix. When structured matrices are to be used in calculations, the
zeroes in the structure can often be exploited to reduce the total number of computations, and the
matrix storage requirements. A number of key structured matrices often found in linear algebra
based DSP algorithms and analysis can be identified. See also Matrix Decompositions, Matrix
Operations, Matrix Properties.
• Band: In a band matrix the upper right and lower left corners of the matrix are zero elements, and a band
of diagonal elements are non-zero. For example a 5 × 6 matrix with band width of 3 may have the form:
B = \begin{bmatrix} b_{11} & b_{12} & 0 & 0 & 0 & 0 \\ b_{21} & b_{22} & b_{23} & 0 & 0 & 0 \\ 0 & b_{32} & b_{33} & b_{34} & 0 & 0 \\ 0 & 0 & b_{43} & b_{44} & b_{45} & 0 \\ 0 & 0 & 0 & b_{54} & b_{55} & b_{56} \end{bmatrix}   (411)
• Bidiagonal: A matrix where only the main diagonal, and the first diagonal (above or below the main) are
non-zero. See also Bidiagonalization.
E = \begin{bmatrix} d_1 & g_1 & 0 & 0 \\ 0 & d_2 & g_2 & 0 \\ 0 & 0 & d_3 & g_3 \\ 0 & 0 & 0 & d_4 \end{bmatrix}   (412)
• Circulant: An n × n circulant matrix has only n distinct elements, where each row is formed by shifting
the previous row by one element to the right in a circular buffer fashion. One interesting property of
circulant matrices is that the eigenvalues can be determined by taking a DFT of the first row. The
eigenvectors are given by the standard basis vectors of the DFT. See also Matrix-Structured-Toeplitz.

C = \begin{bmatrix} r_0 & r_1 & r_2 & r_3 \\ r_3 & r_0 & r_1 & r_2 \\ r_2 & r_3 & r_0 & r_1 \\ r_1 & r_2 & r_3 & r_0 \end{bmatrix}   (413)
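A small check of this property (an assumed sketch, not from the original text; the first row values are
arbitrary): the DFT basis vectors diagonalize the circulant matrix, and the corresponding eigenvalues are
obtained from the DFT of the first row.

    import numpy as np

    r = np.array([4.0, 1.0, 2.0, 3.0])                  # first row r0, r1, r2, r3
    n = len(r)
    C = np.array([np.roll(r, k) for k in range(n)])     # each row shifted right by one

    k = np.arange(n)
    V = np.exp(2j * np.pi * np.outer(k, k) / n)         # columns: DFT basis vectors
    lam = V @ r                                         # lam[k] = sum_m r_m e^{j 2 pi m k / n}

    print(np.allclose(C @ V, V * lam))                  # True: C v_k = lam_k v_k
    print(np.allclose(lam, np.conj(np.fft.fft(r))))     # the DFT of the first row (conjugated, r real)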
• Diagonal: A diagonal matrix has all elements, except those on the main diagonal, equal to zero.
Multiplying an appropriately dimensioned matrix by a diagonal matrix is equivalent to multiplying the i-th
row of the matrix, by the i-th diagonal element. Diagonal matrices are usually square matrices, although
this is not necessarily the case.
D = \begin{bmatrix} d_{11} & 0 & 0 & 0 \\ 0 & d_{22} & 0 & 0 \\ 0 & 0 & d_{33} & 0 \\ 0 & 0 & 0 & d_{44} \end{bmatrix}   (414)

For shorthand, a diagonal matrix is often denoted as D = diag(d_1, d_2, d_3, d_4) where d_i = d_{ii}.
• Identity: The identity matrix has all elements zero, except for the main diagonal elements which are equal
to one. The identity matrix is almost universally denoted as I. For any matrix A, multiplied by the
appropriately dimensioned identity matrix, the result is A. Any matrix multiplied by its inverse, gives the
identity matrix. See also Diagonal Matrix, Matrix Inverse.
I = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}   (415)
• Lower Triangular: A matrix where all elements above the main diagonal are equal to zero. Lower
triangular matrices are useful in solving linear algebraic equations with algorithms such as LU (lower,
upper) decomposition. Useful properties are that the product of a lower triangular matrix, and a lower
triangular matrix is a lower triangular matrix, and the inverse of a lower triangular matrix is a lower
triangular matrix. See also Forward-substitution, Upper Triangular Matrix.
L = \begin{bmatrix} l_{11} & 0 & 0 & 0 \\ l_{21} & l_{22} & 0 & 0 \\ l_{31} & l_{32} & l_{33} & 0 \\ l_{41} & l_{42} & l_{43} & l_{44} \end{bmatrix}   (416)
• Orthogonal: A matrix is called orthogonal (or orthonormal) if its transpose, QT, forms the inverse matrix
Q –1 , i.e. Q T = Q – 1 and,
Q T Q = I = QQ T
(417)
It can also be said that the columns of the matrix Q form an orthonormal basis for the space ℜ m. While the
terms orthogonal and orthonormal are used interchangeably as applied to matrices, they have distinct
meanings when applied to sets of functions or vectors -- with orthonormal indicating unit norm for every
element in an orthogonal set. See also Matrix Decompositions - Eigenvalue/QR, Matrix Properties - Unitary Matrix.
• Orthonormal: See Orthogonal.
• Permutation: A matrix that is essentially the identity matrix with the row orders changed. Multiplying
another matrix, A, by a permutation matrix, P, will swap the row orders of A. In general multiplication of a
matrix by a permutation matrix does not change fundamental quantities such as the singular values or the
condition number. The permutation matrix is an orthogonal matrix.
P = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}   (418)
• Rectangular: A matrix that does not have the same number of rows and columns.
• Sparse: Any matrix with a large proportion of zero elements is often termed a sparse matrix. Matrices such
as lower triangular, diagonal etc can be described as structured sparse matrices. When performing matrix
algebra on sparse matrices, the number of MACs required is usually greatly reduced over an equivalent
operation using a fully populated matrix, given that many null operations are performed, e.g. multiplies
and additions that have one or two zero values.
• Square: A matrix with the same number of rows as columns. Covariance and correlation matrices are
necessarily square.
• Symmetric: A matrix is symmetric if A = AT. The line of symmetry is therefore through the main diagonal.
Many matrices used in DSP algorithms are symmetric, such as the correlation matrix.
S = \begin{bmatrix} s_{11} & s_{12} & s_{13} & s_{14} \\ s_{12} & s_{22} & s_{23} & s_{24} \\ s_{13} & s_{23} & s_{33} & s_{34} \\ s_{14} & s_{24} & s_{34} & s_{44} \end{bmatrix}   (419)
• Toeplitz: This matrix has constant elements along each diagonal. The correlation matrix of a stationary
stochastic N element data vector forms an N × N Toeplitz matrix. See also Matrix-Circulant, and
Correlation Matrix, Covariance Matrix.
T = \begin{bmatrix} r_0 & r_1 & r_2 & r_3 \\ r_{-1} & r_0 & r_1 & r_2 \\ r_{-2} & r_{-1} & r_0 & r_1 \\ r_{-3} & r_{-2} & r_{-1} & r_0 \end{bmatrix}   (420)
• Tridiagonal: A matrix where only the main, first upper and first lower diagonals are non-zero elements.
T = \begin{bmatrix} t_{11} & s_{12} & 0 & 0 \\ v_{21} & t_{22} & s_{23} & 0 \\ 0 & v_{32} & t_{33} & s_{34} \\ 0 & 0 & v_{43} & t_{44} \end{bmatrix}   (421)
• Unitary: A complex data matrix U is unitary if its Hermitian (conjugate) transpose, U^H, forms the inverse
matrix U^{-1}, i.e. U^H = U^{-1} and therefore,

U^H U = I = UU^H   (422)

The unitary property is the complex matrix equivalent of orthogonality. See also Eigenvalue
Decomposition, QR algorithm, Unitary Matrix.
• Upper Triangular: A matrix where all elements below the main diagonal are equal to zero. Upper
triangular matrices are useful in solving linear algebraic equations with algorithms such as LU (lower,
upper) decomposition. Useful properties are that the product of an upper triangular matrix and an upper
triangular matrix is an upper triangular matrix, and the inverse of an upper triangular matrix is an upper
triangular matrix. See also Back-substitution, Lower Triangular Matrix.
U = \begin{bmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ 0 & u_{22} & u_{23} & u_{24} \\ 0 & 0 & u_{33} & u_{34} \\ 0 & 0 & 0 & u_{44} \end{bmatrix}   (423)
Matrix-Vector Multiplication: See Matrix Operations - Matrix-Vector Multiplication.
Maximum Length Sequences: If a binary sequence is produced using a pseudo random binary
sequence generator, the sequence is said to be a maximum length sequence if for an N bit register,
the binary sequence is of length 2 N – 1 before it repeats itself. In a maximum length sequence the
number of 1’s is one more than the number of 0’s. Also known as m-sequences. See also Pseudo-Random Binary Sequence.
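As a hedged sketch (an assumed example, not from the original text), a 4 bit Fibonacci linear feedback shift
register with feedback taps at stages 4 and 3 (the primitive polynomial x^4 + x^3 + 1) generates a maximum
length sequence of length 2^4 - 1 = 15 containing eight 1's and seven 0's:

    def lfsr_msequence(taps=(4, 3), nstages=4, seed=(1, 1, 1, 1)):
        # Fibonacci LFSR: the feedback bit is the XOR of the tapped stages and is
        # shifted in at stage 1; the output is taken from the last stage.
        reg = list(seed)
        out = []
        for _ in range(2 ** nstages - 1):
            out.append(reg[-1])
            fb = reg[taps[0] - 1] ^ reg[taps[1] - 1]
            reg = [fb] + reg[:-1]
        return out

    seq = lfsr_msequence()
    print(seq)
    print(sum(seq), len(seq) - sum(seq))    # 8 ones, 7 zeros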
Mean Value: The statistical mean value of a signal, x ( k ) , is the average amplitude of the signal.
Statistical mean is calculated using the statistical expectation operator, E { . } :
E\{x(k)\} = \text{Statistical Mean Value of } x(k) = \sum_{k} x(k) \, p\{x(k)\}   (424)
where p { x ( k ) } is the probability density function of x ( k ) . In real time DSP the probability density
function of a signal is rarely known. Therefore to find the mean value of a signal the more intuitively
obvious calculation of a time average computed over a large and representative number of
samples, N, is used:
\text{Time Average} = \frac{1}{N} \sum_{k=0}^{N-1} x(k)   (425)
[Figure: a sampled signal x(k) plotted against time k over N samples, with its mean value marked.]
The time averaged mean value can be calculated by finding the average signal
amplitude over a large and representative number of samples. If the signal is ergodic
then the time averages equal the statistical averages.
If the signal is ergodic then the time averages and statistical averages are the same. See also
Ergodic, Expected Value, Mean Squared Value, Wide Sense Stationarity.
Mean Squared Value: The statistical mean squared value of a signal, x ( k ) , is the average
squared amplitude of the signal. Statistical mean squared value is often denoted using the statistical
expectation operator, E { . } , which is calculated as:
E\{x^2(k)\} = \text{Statistical Mean Squared Value of } x(k) = \sum_{k} x^2(k) \, p\{x(k)\}   (426)
where p { x ( k ) } is the probability density function of x ( k ) . In real time DSP the probability density
function of a signal is rarely known and therefore to find the mean squared value of a signal then
the more intuitively obvious calculation of a time average calculated over a large and representative
number of samples, N, is used:
\text{Average Squared Value} = \frac{1}{N} \sum_{k=0}^{N-1} x^2(k)   (427)
[Figure: a sampled signal x(k) and its square x^2(k) plotted against time k over N samples, with the mean squared value marked on the squared signal.]
The time averaged mean squared value can be calculated by finding the average
signal amplitude of the squared signal over a large and representative number of
samples. If the signal is ergodic then the time averages equal the statistical averages.
If the signal is ergodic then the time averages and statistical averages are the same. Note that mean
squared value is always a positive value for any non-zero signal. See also Ergodic, Expected Value,
Mean Value, Variance, Wide Sense Stationarity.
Memory: Integrated circuits used to store binary data. Most memory devices are CMOS
semiconductors. For a DSP system memory will either be ROM or RAM. See also Static RAM,
Dynamic RAM.
Message: The information to be communicated in a communication system. The message can be
continuous (analog) or discrete (digital). If an analog message is to be transmitted via a digital
communications system it must first be sampled and digitized. See also Analog to Digital Converter,
Digital Communications.
MFLOPS: This measure gives the speed rating of processor in terms of the number of millions of
floating point operations per second (MFLOPS) a processor can do. DSP processors can often
perform more FLOPS than their clock speeds. This counter-intuitive capacity results from the fact
that the floating point operations are pipelined -- with MFLOPS calculated as a time-averaged (best
case) performance. The MFLOPS rating can be misleading for practical programs running on a
DSP processor that rarely attain the MFLOPS speed when performing peripheral functions such as
data acquisition, data output, etc.
Middle A: See Western Music Scale.
Middle C: See Western Music Scale.
MiniDisc (MD): The MiniDisc was introduced to the audio market in 1992 as a digital audio
playback and record format with the aim of competing with both compact disc (CD) introduced in
1983, and the compact cassette introduced in the 1960s. Sony developed the MiniDisc partly to
break into the portable high fidelity audio market and therefore the format needed to be compact and
resistant to vibration and mechanical knocks [155]. Compared to the very successful CD format, the
MiniDisc offers the advantage of being much smaller by virtue of the smaller media required by
psychoacoustically compressed data. In addition, it features a record facility. The MiniDisc is a
competing format to Philips’ DCC, which also uses psychoacoustic data compression techniques.
The MiniDisc is 64mm in diameter and uses magneto-optical techniques for recording. The size of
the disc was kept small by using adaptive transform acoustic coding (ATRAC) to compress original
44.1kHz, 16 bit PCM music by a factor of 4.83. One MiniDisc can store 64 minutes of compressed
stereo audio requiring around 140 Mbytes. Space is also made available for timing and track
information. The MiniDisc encodes data using the same modulation and similar error checking as
the CD, namely eight to fourteen modulation (EFM) and a slightly modified cross interleaved Reed-Solomon coding (CIRC).
The risk of shock and vibration in everyday use is addressed by a 4Mbit buffer capable of storing
more than 14 seconds of compressed audio. Therefore if the optical pickup loses its tracking the
music can continue playing while the tracking is repositioned (requiring less than a second) and the
buffer is refilled. In fact the pickup can read 5 times faster than the ATRAC decoder and therefore
during normal operation the MiniDisc reads only intermittently.
[Block diagram: the stereo L/R inputs pass through an ADC (or the digital I/O), a three channel subband filter, a modified discrete cosine transform, bit allocation/spectral quantizing and error coding/data modulation, then through a 4 Mbit data buffer to the read/write head; on playback the chain is reversed through a DAC to the L/R outputs.]

The MiniDisc (MD) compresses stereo 16 bit PCM audio signals sampled at 44.1kHz by a factor of
almost 5:1. MiniDiscs are read/writable and have a built in data buffer to resist mechanical shock.
The MiniDisc can also be used for data storage and corresponds to a read-write disc of storage
capacity 140Mbyte. See also Adaptive Transform Acoustic Coding, Compact Disc (CD), Digital
Audio, Digital Audio Tape (DAT), Digital Compact Cassette (DCC), Psychoacoustics.
Minimum Audible Field: A measure of the lowest level of detectable sound by the human ear.
See entry Threshold of Hearing.
Minimum Norm Vector: See Vector Properties and Definitions - Minimum Norm.
Minimum Phase: All zeroes of the transfer function lie within the unit circle on the z-plane. See
also Z-transform.
Minimum Residual: See Least Squares Residual.
Minimum Shift Keying (MSK): A form of frequency shift keying in which memory is introduced
from symbol to symbol to ensure continuous phase. The separation in frequency between symbols
is 1/(2T) Hz (for a symbol period of T seconds) allowing the maximum number of orthogonal signals
in a fixed bandwidth. The fact that the MSK symbol stream is constrained to ensure continuous
phase and has signals closely spaced in frequency means that MSK modulation is the most
spectrally efficient form of FSK. MSK is sometimes referred to as Fast FSK since more data can be
transmitted over a fixed bandwidth with MSK than FSK. Gaussian MSK (GMSK, as used in the GSM
mobile radio system, for example) introduces a Gaussian pulse shaping on the MSK signals. This
271
pulse shaping allows a trade-off between spectral overlap and interpulse interference. See also
Frequency Shift Keying, Continuous Phase Modulation.
MIPS: This gives a measure of the speed of a DSP processor in terms of the number of millions of
instructions per second that it can perform.
Modem: A concatenation of MODulate and DEModulate. Modems are devices installed at both
ends of an analog communication line (such as a telephone line). At the transmitting end digital
signals are modulated onto the analog line, and at the receiving end the incoming signal is
demodulated back to digital format. Modems are widely used for inter-computer connection and on
FAX machines.
Modular Interface eXtension (MIX): MIX is a high performance bus to connect expansion
modules to a VME bus or a Multibus II baseboard. A few companies have adopted this standard.
Modulo-2 Adder: Another name for an exclusive OR gate. See also Full Adder, Pseudo-Random
Binary Sequence.

  a  b | z
  0  0 | 0
  0  1 | 1
  1  0 | 1
  1  1 | 0
  (Truth Table)

z = a'b + ab' = a ⊕ b   (Boolean Algebra)

[Logic circuit: a two-input exclusive OR gate with inputs a and b and output z.]
Monaural: This refers to a system that presents signals to only one ear (e.g. a hearing aid worn
on only one ear is monaural.) See also Binaural, Monophonic, Stereophonic.
Monaural Beats: When two tones with slightly different frequencies are played together, the ear
may perceive a composite tone beating at the rate of the frequency difference between the tones.
See also Beat Frequencies, Binaural Beats.
Monophonic: This refers to a system that has only one audio channel (although this single signal
may be presented on multiple speakers). See also Monaural, Stereophonic, Binaural.
Moore-Penrose Inverse: See Matrix Properties - Pseudo-Inverse.
Mosaic: A hypertext browser used on the internet for interchange and exchange of information in
the form of text, graphics, and audio. See also Internet, World Wide Web.
DSPedia
272
Most Significant Bit (MSB): The bit in a binary number with the largest arithmetic significance.
See also Least Significant Bit, Sign Bit.
        MSB                                   LSB
Weight: -128   64   32   16    8    4    2    1
Bit:       0    0    0    1    1    0    1    1     = 16 + 8 + 2 + 1 = 27 (decimal)
Motherboard: A DSP board that has its own functionality, and also spaces for smaller functional
boards (extra processors, I/O channels) to be inserted is called a motherboard. This is analogous
to the main board on a PC system that is home to the processor and other key system components.
Moving Average (MA) FIR Filter: The moving average (MA) filter “usually” refers to an FIR filter
of length N where all filter weights have the value of 1. (The term MA is however sometimes used
to mean any (non-recursive) FIR filter usually within the context of stochastic signal modelling [77]).
The moving average filter is a very simple form of low pass filter often found in applications where
computational requirements need to be kept to a minimum. A moving average filter produces an
output sample at time, k, by adding together the last N input samples (including the current one).
This can be represented on a simple signal flow graph and with discrete equations as:
[Signal flow graph: a delay line of N-1 elements holding x(k), x(k-1), ..., x(k-N+1), all summed to form y(k).]

y(k) = x(k) + x(k-1) + x(k-2) + x(k-3) + \ldots + x(k-N+1) = \sum_{n=0}^{N-1} x(k-n)
The signal flow graph and output equation for a moving average FIR filter. The moving
average filter requires no multiplications, only N additions.
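A minimal sketch of this filter in Python (NumPy assumed); the sampling rate, test signal and filter length used here are illustrative choices, not values taken from the text:

import numpy as np

def moving_average(x, N):
    # y(k) = x(k) + x(k-1) + ... + x(k-N+1); samples before k = 0 are taken as zero.
    y = np.zeros(len(x))
    for k in range(len(x)):
        y[k] = sum(x[k - n] for n in range(N) if k - n >= 0)
    return y

# Illustrative use: smooth a noisy sine wave with a 10 weight moving average.
fs = 10000.0                              # assumed sampling rate (Hz)
t = np.arange(0, 0.01, 1.0 / fs)
x = np.sin(2 * np.pi * 200 * t) + 0.2 * np.random.randn(len(t))
y = moving_average(x, N=10)
# For unity gain at 0 Hz divide the output by N; if N is a power of 2
# (e.g. 8, 16, 32) the division can instead be done with shift right operations.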
As an example, the magnitude frequency domain representations of a moving average filter with 10 weights are:
[Figure: linear magnitude frequency response |H(f)| and logarithmic magnitude response 20 log|H(f)| (dB) of the filter, plotted from 0 to 5000 Hz.]
The linear and logarithmic frequency responses of a 10 weight moving average FIR filter.
The peak of the first sidelobe of any moving average filter is always approximately 13dB
below the gain at 0 Hz.
In terms of the z-domain, we can write the transfer function of the moving average FIR filter as:
H(z) = \frac{Y(z)}{X(z)} = 1 + z^{-1} + z^{-2} + \ldots + z^{-(N-1)} = \sum_{i=0}^{N-1} z^{-i} = \frac{1 - z^{-N}}{1 - z^{-1}}    (428)
recalling that the sum of the geometric series {1, r, r^2, ..., r^m} is given by (1 - r^{m+1})/(1 - r). Written in this factorised form the transfer function appears to have N zeroes and a single pole at z = 1; the pole is of course cancelled by the zero at z = 1, since an FIR filter has no poles associated with it. We can find the zeroes of the polynomial in Eq. 428 by solving:
1 - z^{-N} = 0
  ⇒ z_n = \sqrt[N]{1},  n = 0 \ldots N-1
  ⇒ z_n = \sqrt[N]{e^{j2\pi n}},  noting that e^{j2\pi n} = 1
  ⇒ z_n = e^{j2\pi n / N}    (429)
which represents N zeroes equally spaced around the unit circle starting at z = 1 , but with the
z = 1 zero cancelled out by the pole at z = 1 . The pole-zero z-domain plot for the above 10 weight
moving average FIR filter is:
[Figure: z-domain pole-zero plot for the 10 weight moving average filter, showing 9 zeroes on the unit circle.]

H(z) = \frac{Y(z)}{X(z)} = 1 + z^{-1} + z^{-2} + \ldots + z^{-9} = \sum_{i=0}^{9} z^{-i} = \frac{1 - z^{-10}}{1 - z^{-1}}
The pole-zero plot for a moving average filter of length 10. As expected the filter has 9
zeroes equally spaced around the unit circle (save the one not present at z = 1 ). In some
representations a pole and a zero may be shown at z = 1 , however these cancel each
other out. The use of a pole is only to simplify the z-transform polynomial expression.
In general if a moving average filter has N weights then the width of the first (half) lobe of the
mainlobe is f s ⁄ 2N Hz, which is also the bandwidth of all of the sidelobes up to f s ⁄ 2 .
The moving average filter shown will amplify an input signal by a factor of N. If unity gain (0 dB) is required at 0 Hz then the output of the filter should be divided by N. However, one of the attractive features of a moving average filter is that it is simple to implement, and the inclusion of a division is not conducive to this aim. Therefore, should 0 dB be required at 0 Hz, if the filter length is made a power of 2 (i.e. 8, 16, 32 and so on) then the division can be done with a simple shift right operation on the filter output, whereby each shift right divides by 2.
The moving average FIR filter is linear phase and has a group delay equal to half of the filter length
(N/2). See also Comb Filter, Digital Filter, Exponential Averaging, Finite Impulse Response Filter,
Finite Impulse Response Filters-Linear Phase, Infinite Impulse Response Filter.
Moving Picture Experts Group (MPEG): The MPEG standard comes from the International
Organization for Standards (ISO) sub-committee (SC) 29 which is responsible for standards on
“Coding of Audio, Picture, Multimedia and Hypermedia Information”. Working Group (WG) 11 (ISO
JTC1/SC29/WG11) considered the problem of coding of multimedia and hypermedia information
and produced the MPEG joint standards with the International Electrotechnical Commission (IEC):
• ISO/IEC 11 172: MPEG-1 (Moving Picture Coding up to 1.5 Mbit/s)
Part 1: Systems
Part 2: Video
Part 3: Audio
Part 4: Compliance Testing (CD)
Part 5: Technical Report on Software for ISO/IEC 11 172
• ISO/IEC 13 818: MPEG-2 (Generic Moving Picture Coding)
Part 1: Systems (CD)
Part 2: Video (CD)
Part 3: Audio (CD)
Part 4: Compliance Testing
Part 5: Technical Report on Software for ISO/IEC 13 818
Part 6: Systems Extensions
Part 7: Audio Extensions
Some current work of (ISO JTC1/SC29/WG11) is focussed on the definition of the MPEG-4
standard for Very-low Bitrate Audio-Visual Coding.
MPEG-1 essentially defines a bit stream representation for the synchronized digital video and audio
compressed to fit in a bandwidth of about 1.5Mbits/s, which corresponds to the bit rate output of a
CD-ROM or DAT. The video stream requires about 1.15 Mbits/s, with the remaining bandwidth used
by the audio and system data streams. MPEG is also widely used on the Internet as a means for
transferring audio/video clips. MPEG-1 has subsequently enabled the development of various
multimedia systems and CD-DV (compact disc digital video).
The MPEG standard is aimed at using intra-frame (as in JPEG) and inter-frame compression
techniques to reduce the digital storage requirement of moving pictures, or video [72]. MPEG-1
video reduces the color subsampling ratio of a picture to one quarter of the original source values
in order that actual compression algorithms are less processor intensive. MPEG-1 video then uses
a combination of the discrete cosine transform (DCT) and motion estimation to exploit the spatial
and temporal redundancy present in video sequences and (depending on the resolution of the
original sequence) can yield compression ratios of approximately 25:1 to give almost VHS quality
video. The motion estimation algorithm efficiently searches blocks of pixels, and therefore can track
the movement of objects between frames or as the camera pans around. The DCT exploits the
physiology of the human eye by taking blocks of pixels and converting them from the spatial domain
to the frequency domain with subsequent quantization. As with JPEG, a zig-zag scan of the DCT
coefficients yields long runs of zero for the higher frequency components. This improves the
efficiency of the run length encoding (also similar to JPEG).
In general very high levels of computing power are required for MPEG encoding (of the order of hundreds of MIPS to encode 25 frames/s). However, decoding is not quite as demanding and there are a number of single chip decoder solutions available.
MPEG-2 is designed to offer higher quality playback than MPEG-1 at bit rates of between 4 and 10 Mbits/s, which is above the playback rate currently achievable using CD disc technology. MPEG-4 is aimed at very low bit rate coding for applications such as video-conferencing or video-telephony. See also Compression, Discrete Cosine Transform, H-Series Recommendations - H.261, International Organisation for Standards (ISO), Moving Picture Experts Group - Audio, Psychoacoustic Subband Coding, International Telecommunication Union, ITU-T Recommendations, Standards.
Moving Picture Experts Group (MPEG) - Audio: The International Organization for Standards
(ISO) MPEG audio standards were based around the developed compression techniques of
MUSICAM (Masking Pattern Adapted Universal Subband Integrated Coding and Multiplexing) and
ASPEC (Adaptive Spectral Perceptual Entropy Coding). MPEG audio compression uses subband
coding techniques with dynamic bit allocation based on psychoacoustic models of the human ear.
By exploiting both spectral and temporal masking effects, compression ratios of up to 12:1 for CD
quality audio (without too much degradation to the average listener) can be realized.
The so called MPEG-1, ISO 11172-3 standard, describes compression coding schemes for high-fidelity audio signals sampled at 48 kHz, 44.1 kHz or 32 kHz with 16 bits resolution in one of four modes: (1) single channel; (2) dual (independent or bilingual) channels; (3) stereo channels; and (4) joint stereo. The standard only defines the format of the encoded data and therefore if improved
psychoacoustic models can be found then they can be incorporated into the compression scheme.
Note that the psychoacoustic modelling is only required in the coder, and in the decoder the only
requirement is to “unpack” the signals. Therefore the cost of an MPEG decoder is lower than an
MPEG encoder.
The standard defines layers 1, 2 and 3 which correspond to different compression rates which
require different levels of coding complexity, and of course have different levels of perceived quality.
The various parameters (based on an input signal sampled at 48 kHz with 16 bits samples - a data
rate of 768 kbits/s) of the three layers of the model are:
    MPEG Audio      Theoretical        Target bit rate   Compression   No. of subbands in     "Similar"
    ISO 11172-3     coding/decoding    per channel       ratio         psychoacoustic         compression
    Standard        delay (ms)         (kbits/s)                       model                  schemes
    Layer 1         19                 192               4:1           32                     PASC
    Layer 2         35                 128               6:1           32                     MUSICAM
    Layer 3         59                 64                12:1          576                    ASPEC
Layer 1 is the least complex to implement and is suitable for applications where good quality is
required and audio transmission bandwidths of at least 192 kbits/s are available. PASC (precision
adaptive subband coding) as used on the digital compact cassette (DCC) developed by Philips is
very similar to layer 1. Layer 2 is identical to MUSICAM. Layer 3 which achieves the highest rate of
data compression is only required when bandwidth is seriously limited; at 64 kbits/s the quality is
generally good, however a keen listener will notice artifacts.
In the MPEG-2, ISO 13818-3 standard, key advancements have been made over MPEG-1 ISO
11172 with respect to inclusion of dynamic range controls, surround sound, and the use of lower
sampling rates. Surround sound, or multichannel sound is likely to be required for HDTV (high
definition television) and other forms of digital audio broadcasting. Draft standards for multichannel
sound formats have already been published by the International Telecommunication Union Radiocommunication Committee (ITU-R) and European Broadcast Union (EBU). MPEG-2 is
designed to transmit 5 channels, 3 front channels and 2 surround channels in so called 3/2 surround
format. Using a form of joint stereo coding the bit rate for layer 2 of MPEG-2 will be about 2.5 times that of 2 channel MPEG-1 layer 2, i.e. between 256 and 384 kbits/s.
MPEG-2 was also aimed at extending psychoacoustic compression techniques to lower sampling frequencies (24 kHz, 22.05 kHz and 16 kHz), which give good fidelity for speech-only material. It is likely that this type of coding could replace techniques such as the ITU-T G.722 coding (see G-Series Recommendations).
MPEG-4 will code audio at very low bit rates and is currently under consideration. See also Psychoacoustics, Precision Adaptive Subband Coding (PASC), Spectral Masking, Temporal Masking.
MPEG: See Moving Picture Experts Group.
Multichannel LMS: See Least Mean Squares Algorithm Variants.
Multimedia: The integration of speech, audio, video and data communications on a computer. For
all of these aspects DSP co-processing may be necessary to implement the required computational
algorithms. Multimedia PCs have integrated FAX, videophone, audio and TV - all made possible by
DSP.
Multimedia and Hypermedia Information Coding Experts Group (MHEG): MHEG is a standard for hypermedia document representation. MHEG is useful for the implementation aspects of interactive hypermedia applications such as on-line textbooks, encyclopedias, and learning software of the kind already found on CD-ROM [94].
The MHEG standard comes from the International Organization for Standards (ISO) sub-committee
(SC) 29 which is responsible for standards on “Coding of Audio, Picture, Multimedia and
Hypermedia Information”. Working Group (WG) 12 (ISO JTC1/SC29/WG12) considered the
problem of coding of multimedia and hypermedia information and produced the MHEG joint
standard with the International Electrotechnical Commission (IEC): ISO/IEC 13522 MHEG (Coding
of Multimedia and Hypermedia Information).
See also International Organisation for Standards, Multimedia, Standards.
Multimedia Standards: The emergence of multimedia systems in the 1990s brings the
communication and presentation of audio, video, graphics and hypermedia documents onto a
common platform. The successful integration of software and hardware from different
manufacturers etc requires that standards are adopted. For current multimedia systems a number
of ITU, ISO and ISO/IEC JTC standards are likely to be adopted. A non-exhaustive sample list of
standards that are suitable includes:
• ITU-T Recommendations:
  F.701    Teleconference service.
  F.710    General principles for audiographic conference service.
  F.711    Audiographic conference teleservice for ISDN.
  F.720    Videotelephony services - general.
  F.721    Videotelephony teleservice for ISDN.
  F.730    Videoconference service - general.
  F.732    Broadband videoconference services.
  F.740    Audiovisual interactive services.
  G.711    Pulse code modulation (PCM) of voice frequencies.
  G.712    Transmission performance characteristics of pulse code modulation.
  G.720    Characterization of low-rate digital voice coder performance with non-voice signals.
  G.722    7 kHz audio-coding within 64 kbit/s; Annex A: Testing signal-to-total distortion ratio for 7 kHz audio-codecs at 64 kbit/s.
  G.724    Characteristics of a 48-channel low bit rate encoding primary multiplex operating at 1544 kbit/s.
  G.725    System aspects for the use of the 7 kHz audio codec within 64 kbit/s.
  G.726    40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM); Annex A: Extensions of Recommendation G.726 for use with uniform-quantized input and output.
  G.727    5-, 4-, 3- and 2-bits sample embedded adaptive differential pulse code modulation (ADPCM).
  G.728    Coding of speech at 16 kbit/s using low-delay code excited linear prediction; Annex G: 16 kbit/s fixed point specification.
  H.221    Frame structure for a 64 to 1920 kbit/s channel in audiovisual teleservices.
  H.242    System for establishing communication between audiovisual terminals using digital channels up to 2 Mbit/s.
  H.261    Video codec for audiovisual services at p x 64 kbit/s.
  H.320    Narrow-band visual telephone systems and terminal equipment.
  T.80     Common components for image compression and communication - basic principles.
  X.400    Message handling system and service overview (same as F.400).
• Proprietary Standards:
  Bento        Sponsored by Apple Inc for multimedia data storage.
  GIF          Compuserve Inc graphic interchange file format.
  QuickTime    Digital video replay on the Macintosh.
  RIFF         Microsoft and IBM multimedia file format.
  DVI          Intel's digital video.
  MIDI         Musical Instrument Digital Interface.
• International Organization for Standards:
  HyTime    Hypermedia time based structuring language.
  IIF       Image interchange format.
  JBIG      Lossless compression for black and white images.
  JPEG      Lossy compression for continuous tone, natural scene images.
  MHEG      Multimedia and hypermedia information coding.
  MPEG      Digital video compression techniques.
  ODA       Open document architecture.
See also International Telecommunication Union, International Organisation for Standards,
Standards.
Multiply Accumulate (MAC): The operation of multiplying two numbers and adding to another
value, i.e. ((a × b) + c). Many DSP processors can perform (on average) one MAC in one instruction
cycle. Therefore if a DSP processor has a clock speed of 20MHz, then it can perform a peak rate
of 20,000,000 multiply and accumulates per second. See also DSP Processor, Parallel Adder,
Parallel Multiplier.
[Figure: a multiply accumulate unit with inputs a, b and c producing a·b + c.]
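For illustration only, a plain Python sketch of how a chain of MAC operations forms an FIR filter inner product (the weights and samples are assumed values):

def mac(a, b, c):
    # One multiply accumulate: (a x b) + c.
    return a * b + c

def fir_output(weights, samples):
    # An FIR filter output is a chain of MACs; a DSP processor that performs one
    # MAC per instruction cycle computes one term of this sum per cycle.
    acc = 0.0
    for w, x in zip(weights, samples):
        acc = mac(w, x, acc)
    return acc

print(fir_output([0.25, 0.5, 0.25], [1.0, 2.0, 3.0]))   # 2.0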
Multiprocessing: Using more than one DSP processor to solve a particular problem. The
TMS320C40 has six I/O ports to communicate with other TMS320C40s with independent DMA. The
term multiprocessing is sometimes used interchangeably with parallel processing.
Multipulse Excited Linear Predictive Coding (MLPC): MLPC is an extension of LPC for speech
compression that goes some way to overcoming the false synthesized sound of LPC speech.
Multipurpose Internet Mail Extensions (MIME): MIME is a proposed standard from the Internet
Architecture Board and supports several predefined types of non-text (non-ASCII) message
contents, such as 8 bit 8kHz sampled µ-law encoded audio, GIF image files, and postscript as well
as other forms of user definable types. See also Standards.
Multirate: A DSP system which performs computations on signals at more than one sampling rate
usually to achieve a more efficient computational schedule. The important steps in a multirate
system are decimation (reducing the sampling rate), and interpolation (increasing the sampling
rate). Sub-band systems can be described as multirate. See also Decimation, Interpolation,
Upsampling, Downsampling, Fractional Sampling Rate Conversion.
µ-law: Speech signals, for example, have a very wide dynamic range: Harsh “oh” and “b” type
sounds have a large amplitude, whereas softer sounds such as “sh” have small amplitudes. If a
uniform quantization scheme were used then although the loud sounds would be represented
adequately the quieter sounds may fall below the threshold of the LSB and therefore be quantized
to zero and the information lost. Therefore companding quantizers are used such that the
quantization level at low input levels is much smaller than for higher level signals. Two schemes are
widely in use: the µ-law in the USA and the A-law in Europe. The expression for µ-law compression
is given by:
y(x) = \frac{\ln(1 + \mu x)}{\ln(1 + \mu)}    (430)
with y(x) being the compressed output for input x, and the function being negative symmetric around
x=0. A typical value of µ is 255. See also A-Law.
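A small sketch of Eq. 430 in Python (NumPy assumed), applied to normalised samples with |x| ≤ 1 and µ = 255; the sign handling reflects the negative symmetry noted above, and the expander shown is simply the algebraic inverse, included for illustration:

import numpy as np

def mu_law_compress(x, mu=255.0):
    # y(x) = ln(1 + mu*|x|) / ln(1 + mu), applied symmetrically about x = 0.
    return np.sign(x) * np.log(1.0 + mu * np.abs(x)) / np.log(1.0 + mu)

def mu_law_expand(y, mu=255.0):
    # Algebraic inverse of the compression characteristic.
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])
y = mu_law_compress(x)                    # small amplitudes are boosted towards full scale
print(np.allclose(mu_law_expand(y), x))   # True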
Music: Music is a collection of sounds arranged in an order that sounds cohesive and regular. Most importantly, the sound of music is pleasant to listen to. Music has two main elements: a quasi-periodic set of musical notes and a percussive set of regular timing beats. Each
musical note or discrete sound in music is characterized by a fundamental frequency and a rich set
of harmonics, whereas the percussion sounds are more random (although distinctive) in nature
[13], [14].
Many different ordered music scales (sets of constituent notes) exist. The most familiar is the 12
notes in an octave of the Western music scale on which most modern and classical music is played.
The fundamental frequency of each note on the Western music scale can be related to the
fundamental frequency of all other notes by a simple ratio. The same musical notes on different
musical instruments are characterized by the harmonic content and the volume envelope. The following figure shows the characteristic waveform for a sampled 0.03 second segment of a C4 note played on a trumpet, guitar, violin and piano:

[Figure: four panels of amplitude against time showing the sampled waveform segments for the trumpet, guitar, violin and piano.]
Digitally sampled time waveforms representing the variation in sound pressure level of
0.03 second segments of a C4 note (fundamental frequency of 261.6Hz on the
Western music scale) played on a trumpet, guitar, violin and piano. The samples were
taken from the full notes shown in the figures below.
Clearly, although all of the instruments have a similar fundamental frequency, the varying harmonic
content gives them completely different appearances in the time domain. The volume envelope of
a musical note also contributes to the characteristic sound, as shown in the following figure (from which the above 0.03 second segments were in fact taken):

[Figure: four panels of amplitude against time (0 to about 2.5 seconds) showing the full notes for the trumpet, guitar, violin and piano.]
Time waveforms showing the sound pressure level volume envelope of a C3 note (fundamental
frequency of 261.6Hz on the Western music scale) played on a trumpet, guitar, violin and
piano. The amplitude envelope of the different musical instruments can be clearly seen.
To see the harmonic content of each of the four musical instruments we can perform a 2048 point FFT on a representative portion of the waveform, resulting in the following frequency domain plots:

[Figure: four panels of magnitude (dB) against frequency (0 to 10 kHz) showing the spectra for the trumpet, guitar, violin and piano.]
Frequency spectra of a C4 note (fundamental frequency of 261.6Hz on the Western
music scale) for a trumpet, guitar, violin and piano. The spectra were generated from a
0.05 second segment of the note.
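As a rough sketch of how such spectra might be computed (NumPy assumed), the "note" below is a synthetic fundamental plus two harmonics rather than a recorded instrument, and the sampling rate is an assumed value:

import numpy as np

fs = 44100.0                              # assumed sampling rate (Hz)
n = 2048                                  # 2048 samples is roughly a 0.05 second segment
t = np.arange(n) / fs
f0 = 261.6                                # fundamental frequency of C4
# Synthetic note: fundamental plus two weaker harmonics.
x = np.sin(2*np.pi*f0*t) + 0.5*np.sin(2*np.pi*2*f0*t) + 0.25*np.sin(2*np.pi*3*f0*t)

X = np.fft.fft(x * np.hanning(n))                  # window, then 2048 point FFT
mag_dB = 20 * np.log10(np.abs(X[:n//2]) + 1e-12)   # magnitude spectrum in dB
freqs = np.arange(n//2) * fs / n                   # corresponding frequency axis (Hz)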
Musical instruments are carefully designed to give them flexible tuning capabilities and, where possible, good natural resonances. For example, violins can be designed such that significant frequencies (such as A4, of fundamental frequency 440 Hz) correspond to the resonance of the lower body of the instrument, which as a result will enhance the sound, and also the feeling and tactile feedback to the violinist [14]. Clearly the subtleties of the generation and analysis of music are very complex, although the appreciation of music is very simple!
There are many other music scales, such as the 22 note Hindu scale and various other Asian scales. This perhaps explains why, when someone who has never experienced Chinese music listens to it for the first time, it may be perceived as off-key and dissonant: it contains notes that are simply not present in the familiar Western music scale. Another example of an instrument that does not quite play to the Western music scale is the Scottish bagpipes. The high notes on the chanter are not in fact a full octave (frequency ratio of 2:1) above the analogous lower notes, so the bagpipes can sound a little flat on the high notes. However, if the bagpipes are the sound to which we have become accustomed, then anything else might not sound right!
Music synthesis is now largely achieved using digital synthesizers that use a variety of DSP
techniques to produce an output. See also Digital Audio, Percussion, Music Synthesis, Sound
Pressure Level, Western Music Scale.
Music Synthesis: Most modern synthesizers use digital techniques to produce simulated musical
instruments. Most synthesis requires setting up the fundamental frequency components with
appropriate relative harmonic content and a suitable volume profile. A good overview of this area
can be found in [14], [32]. See also Attack-Decay-Sustain-Release, Granular Synthesis, LA
Synthesis, Music.
N
n: "n" (along with "k" and "i") is often used as a discrete time index in DSP notation. See Discrete Time.
Narrowband: Signals are defined as narrowband if the fractional bandwidth of the signals is small,
say <10%. See also Fractional Bandwidth, Wideband.
Nasals: One of the elementary sounds of speech (the others being plosives, fricatives, sibilant fricatives and semi-vowels). Nasals are formed by lowering the soft palate while blocking the mouth, forcing the air stream to pass out via the nose, as in the letter "m". See also Fricatives, Plosives, Semi-vowels, and Sibilant Fricatives.
Natural Frequency: See Resonant Frequency.
Near End Echo: Signal echo that is produced by components in local telephone equipment. Near
end echo arrives before far end echo. See also Echo Cancellation, Far End Echo.
Neper: The neper is a logarithmic measure used to express the attenuation or amplification of
voltage or current where the natural logarithm (base e = 2.71828... ) is used rather than the more
normal base 10 logarithm:
Neper (Np) = \ln\left(\frac{V_{out}}{V_{in}}\right)    (431)
A decineper is calculated by multiplying the neper quantity by 10 (rather than 20 as would be used
for decibels):
Decineper (dNp) = 10 \ln\left(\frac{V_{out}}{V_{in}}\right)    (432)
To convert from nepers to decibels simply multiply by 20 log e = 8.686... . The neper should not be
confused with the Scottish word for turnips (or swedes) which is the neep. Traditionally neeps are
eaten on 25th January each year to celebrate the birthday of Robert Burns, the Scottish poet who
popularized Auld Lang Syne as well as many other of his own songs and poems. Neeps can of
course be eaten at other times of the year. There is no known means by which neeps can be
converted to decibels.
Neural Networks: Over the last few years the non-linear processing techniques known as neural
networks have been used to solve a wide variety of DSP related problems such as speech
recognition and image recognition [18], [112], [24]. The simplest forms of neural network can be directly related to the adaptive LMS filter; however, the multi-layer nature of even these simple networks leads to very high computational loads. The name derives from the similarity of the computational model to a simplified model of the nervous system in animals. The applications and implementation of neural networks in DSP are set to grow in the next few years.
Newton LMS: See Least Mean Squares Algorithm Variants.
Noise: An unwanted component of a signal which interferes with the signal of interest. Most
signals are contaminated by some form of noise, either present before sensing, or actually induced
by the process of sensing the signal (conversion to electrical form) or the sampling process
(quantization noise). Computations on a DSP processor can also induce various forms of arithmetic
noise (round-off noise). Most DSP algorithms assume that noise sources can be well modelled as
additive, i.e., the noise is added to the signal of interest. See also Round-Off Noise, Truncation,
White Noise, Additive White Gaussian Noise.
[Figure: a sine wave, a noise waveform, and their sum (sine wave + noise), each plotted against time.]

A sine wave corrupted by additive noise.
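A two line sketch of this additive noise model (NumPy assumed; the frequency, amplitude and noise level are arbitrary):

import numpy as np

fs = 8000.0                                   # assumed sampling rate (Hz)
t = np.arange(0, 0.02, 1.0 / fs)
s = np.sin(2 * np.pi * 440 * t)               # signal of interest
n = 0.3 * np.random.randn(len(t))             # additive white Gaussian noise
x = s + n                                     # observed signal: sine wave + noise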
Noise Cancellation: Using adaptive signal processing techniques, noise cancellation can be used
to remove noise from a signal of interest in situations where a correlated reference of the noise
signal is available:

[Figure: generic adaptive noise canceller. The primary input d(k) = s(k) + n(k); the reference n'(k) drives an adaptive filter whose output is subtracted from d(k) to form the error e(k).]

Generic adaptive signal processing noise canceller. Signal s(k) is uncorrelated with n(k) or n'(k); however n(k) and n'(k) are correlated.
Noise cancellation techniques are found in biomedical applications where, for example it is required
to remove mains hum periodic noise from an ECG waveform:
[Figure: the same adaptive canceller structure with the ECG plus mains hum as the primary input d(k) and a mains reference as n'(k).]

Adaptive noise cancellation of an ECG signal corrupted by mains hum.
[Figure: adaptive noise cancellation of speech. The primary microphone picks up s(k) + n(k); the reference microphone picks up the noise, which is adaptively filtered to n'(k) and subtracted, leaving e(k) ≈ s(k).]
Adaptive noise cancellation of a speech signal corrupted by noise. The reference microphone
picks up the noise only, whereas the primary microphone picks both noise and speech. Note
that if the reference microphone also picks up speech then the adaptive noise canceller will try
to also cancel the speech signal. (This is clearly not the desired effect!)
See also Active Noise Control, Adaptive Line Enhancer, Adaptive Filter, Echo Cancellation, Least
Mean Squares Algorithm, Recursive Least Squares.
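A minimal LMS based sketch of the generic canceller above (NumPy assumed); the filter length, step size and the way the reference noise is correlated with the primary noise are all illustrative assumptions:

import numpy as np

def lms_noise_canceller(d, ref, N=32, mu=0.005):
    # d(k) = s(k) + n(k); ref(k) is a noise reference correlated with n(k).
    # Returns e(k), which approximates s(k) once the adaptive filter has converged.
    w = np.zeros(N)
    e = np.zeros(len(d))
    for k in range(N, len(d)):
        x = ref[k - N + 1:k + 1][::-1]     # most recent N reference samples
        y = np.dot(w, x)                   # adaptive filter output, the estimate of n(k)
        e[k] = d[k] - y
        w += 2 * mu * e[k] * x             # LMS weight update
    return e

fs = 8000.0
t = np.arange(0, 1.0, 1.0 / fs)
s = np.sin(2 * np.pi * 300 * t)                         # signal of interest
ref = np.random.randn(len(t))                           # noise reference
n = np.convolve(ref, [0.6, 0.3, 0.1], mode="same")      # correlated noise at the primary input
cleaned = lms_noise_canceller(s + n, ref)               # cleaned ≈ s after convergence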
Noise Control: See Active Noise Control, Noise Cancellation.
Noise Dosemeter: For persons subjected to noise at the workplace, a noise dosemeter or sound
exposure meter can be worn which will average the “total” sound they are exposed to in a day. The
measurements can then be compared with national safety standards [46].
Noise Shaping: A technique used for audio signal processing and sigma delta analog to digital
converters where quantisation noise is high pass filtered out of the baseband. See also
Oversampling, Sigma Delta.
Noncausal: See Causal.
Noncoherent: See Coherent.
Nonlinear: Not linear. See also Linear System, Non-linear System.
Non-linear System: A non-linear system is one that does not satisfy the linearity criteria such that
if:
y_1(n) = f[x_1(n)]   and   y_2(n) = f[x_2(n)]    (433)

then:

a_1 y_1(n) + a_2 y_2(n) = f[a_1 x_1(n) + a_2 x_2(n)]    (434)
For example the system y(n) = 1.2x(n) + 3.4(x(n))^2 is nonlinear as it does not satisfy the above linearity criteria. Any system which introduces harmonic distortion or signal clipping is non-linear. Non-linear systems can be extremely difficult to analyse both mathematically and practically. Nonlinear components that are relatively small in magnitude are often ignored in the analysis and simulation of systems.
A simple way to test the linearity of a system is to input a single sine wave, vary its frequency over the bandwidth of interest, and observe the output signal. If the output contains any sine wave components other than at the frequency of the input sine wave then it is a nonlinear system. The most common form of nonlinearity is called harmonic distortion. See also Distortion, Linear System, Total Harmonic Distortion, Volterra Filter.
[Figure: a simple nonlinear system y(t) = x(t) + (1/2)[x(t)]^2. For the input x(t) = sin(2πf_0 t) the output is y(t) = sin(2πf_0 t) + 1/4 - (1/4)cos(2π(2f_0)t), so the output spectrum |Y(f)| contains a nonlinear component at 2f_0 in addition to the input frequency f_0.]
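A short sketch of the sine wave test applied to the example nonlinearity above (NumPy assumed; the frequencies and durations are arbitrary):

import numpy as np

fs = 8000.0
t = np.arange(0, 1.0, 1.0 / fs)
f0 = 200.0
x = np.sin(2 * np.pi * f0 * t)

y = x + 0.5 * x**2                         # y = x + 0.5*x^2 is nonlinear

Y = np.abs(np.fft.rfft(y)) / len(y)
for f in (0.0, f0, 2 * f0):
    k = int(round(f * len(y) / fs))        # the test tone falls exactly on a bin here
    print(f"{f:6.1f} Hz : {Y[k]:.3f}")     # components at 0 Hz and 2*f0 reveal the nonlinearity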
Non-negative Definite Matrix: See Matrix Properties - Positive Definite.
Non-Return to Zero (NRZ): When a stream of binary data is to be sent serially, such as
transmission of PCM, the data can be sent as (half binary) return to zero (RZ), or (full binary) non-return to zero (NRZ). With RZ data streams, after a 1 has been sent the output waveform returns back to 0, whereas with NRZ the output remains at 1 for the duration of the bit period. The waveform assumed in the figure below is polar.

[Figure: the same sequence of bits, 1011110, transmitted as RZ and NRZ over successive bit periods.]

See also Bipolar (2), Polar.
Non-Simultaneous Masking: See Temporal Masking.
Nonsingular Matrix: See Matrix Properties - Nonsingular.
Non-Volatile: Semiconductor memory that does not lose information when the power is removed
is called non-volatile. ROM is an example of non-volatile memory. Non-volatile RAM is also
available.
Norm: See Vector Properties and Definitions - Norm.
1-norm: See Matrix Properties - 1-norm.
2-norm: See Matrix Properties - 2-norm.
2-norm of a Vector: See Vector Properties and Definitions - 2-norm.
Normal Equations: In least squares error analysis the normal equation is given by:
A^T A x_{LS} = A^T b    (435)
given the overdetermined system of equations:
Ax = b
(436)
where A is a known m × n matrix of rank n and with m > n, b is a known m element vector, and x
is an unknown n element vector. See also Least Squares, Overdetermined System,
Underdetermined System.
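A small numerical sketch (NumPy assumed, with arbitrary data) of solving an overdetermined system through the normal equations; note that in practice an orthogonalisation based method such as QR has better numerical properties (see Numerical Properties):

import numpy as np

# Overdetermined system Ax = b with m = 4 equations and n = 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 1.1, 1.9, 3.1])

# Normal equations: (A^T A) x_LS = A^T b
x_ls = np.linalg.solve(A.T @ A, A.T @ b)

# The same least squares solution via NumPy's built-in solver, for comparison.
x_check, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_ls, x_check)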
Normalised Step Size LMS: See Least Mean Squares Algorithm Variants, Step Size Parameter.
Notch Filter: A notch filter, H ( z ) removes signal components at a very narrow band of
frequencies:
[Figure: gain 20 log|H(f)| in dB against frequency, showing a deep, narrow band of attenuation at one frequency.]
A notch filter removes a very narrow band of frequencies.
Notch filters can be designed using standard filter design techniques for band-stop filters. One form
of notch filter can be designed using an all-pass IIR digital filter of the form:
H_A(z) = \frac{r^2 - 2r\cos\theta\, z^{-1} + z^{-2}}{1 - 2r\cos\theta\, z^{-1} + r^2 z^{-2}}    (437)
in the configuration:
[Figure: the input x(k) is fed both directly and through H_A(z); the two paths are summed and scaled by 0.5 to form y(k), i.e.]

H(z) = \frac{Y(z)}{X(z)} = \frac{1}{2}(1 + H_A(z))

Notch filter designed using an all-pass filter H_A(z).
The parameters cos θ and r are used to set the notch frequency and bandwidth of the notch. The
notch frequency, f n can be calculated from:
\cos\left(\frac{2\pi f_n}{f_s}\right) = \frac{2r\cos\theta}{1 + r^2}    (438)
which is calculated from Eq. 437 by noting the frequency when the phase shift of the output of the
all pass filter is –π radians (see below). The above notch filter can be drawn more explicitly as the
signal flow graph (SFG):
[Signal flow graph: a second order recursive realisation of the all-pass section H_A(z) of Eq. 437, whose output is added to the input x(k) and scaled by 0.5 to form the notch filter output y(k).]

Signal flow graph for a notch filter based on an all-pass filter.
In order to appreciate the notch filtering attribute of this filter, note that the all pass filter H A ( z ) has
a phase response of the form:
[Figure: phase response of H_A(e^{jω}) falling from 0 to -2π radians over 0 to f_s/2 Hz, passing through -π at the notch frequency f_n.]

Typical form (i.e. negative sigmoidal) phase response of the all-pass filter H_A(z). The actual transition point through -π radians and the various graph slopes are determined by setting the parameters r and cos θ.
Therefore when the input signal is the frequency f n , then the phase of the output signal of the all
pass filter is exactly -π. When added to the input signal x ( k ) , the output y ( k ) is zero:
[Figure: an input sine wave at f_n Hz and the all-pass filter output, which is phase shifted by -π radians; their sum, scaled by 0.5, is zero.]

When the output of the all-pass filter has a phase shift of -π radians for an input sine wave of f_n Hz, the output y(k) of the notch filter is zero.
As examples, using Eq. 438 we can design two notch filters with a notch frequency of
f n = 1250 Hz , for a sampling rate of f s = 10000 Hz . The first design has r = 0.8 and the second
design has r = 0.99 , thus giving different notch bandwidths :
Setting r close to 1 is equivalent to putting the poles and zeroes of the all-pass filter very close to
the unit circle.
[Figure: magnitude response 20 log|H(f)| (0 to 5000 Hz) showing a notch at 1250 Hz, and the corresponding phase response.]

Notch filter at f_n = 1250 Hz, with r = 0.8 and cos θ = (1.64/1.6) cos(π/4).
[Figure: magnitude response 20 log|H(f)| (0 to 5000 Hz) showing a much narrower notch at 1250 Hz, and the corresponding phase response.]

Notch filter at f_n = 1250 Hz with r = 0.99 and cos θ = (1.8/1.81) cos(π/4). The notch bandwidth is smaller than that of the above design with r = 0.8 and cos θ = (1.64/1.6) cos(π/4). Note that the phase shift is very small at frequencies other than those near the notch frequency.
If a notch filter is to be used to remove a “single” frequency, then adaptive noise cancellation can
often be used as a suitable alternative if a suitable correlated noise source is available. See also
Adaptive Signal Processing, All-pass Filter, Digital Filter, Infinite Impulse Response Filter.
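A sketch of the above design in Python (NumPy and SciPy's lfilter assumed), using the example values f_s = 10000 Hz, f_n = 1250 Hz and r = 0.99; feeding a sine wave at the notch frequency through the filter should give an output near zero once the start-up transient has decayed:

import numpy as np
from scipy.signal import lfilter

fs = 10000.0
fn = 1250.0
r = 0.99                                        # pole radius: closer to 1 gives a narrower notch

# From Eq. 438: cos(2*pi*fn/fs) = 2*r*cos(theta) / (1 + r**2)
cos_theta = (1 + r**2) * np.cos(2 * np.pi * fn / fs) / (2 * r)

# All-pass section H_A(z) of Eq. 437.
b_ap = [r**2, -2 * r * cos_theta, 1.0]
a_ap = [1.0, -2 * r * cos_theta, r**2]

t = np.arange(0, 0.1, 1.0 / fs)
x = np.sin(2 * np.pi * fn * t)                  # sine at the notch frequency
y = 0.5 * (x + lfilter(b_ap, a_ap, x))          # H(z) = 0.5 * (1 + H_A(z))
print(np.max(np.abs(y[200:])))                  # small: the notch removes the f_n component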
Noy: The noy is a measurement of noisiness similar in its measurement to a phon. It is defined as
the sound pressure level (SPL) of a band of noise from 910Hz to 1090 Hz that subjectively sounds
as noisy as the sound under consideration [46]. See also Equal Loudness Contours, Frequency
Range of Hearing, Phons, Sound Pressure Level.
Null Space: See Vector Properties - Null Space.
Numerical Integrity: Instability in a DSP system can either be (1) a function of feedback causing
large unbounded outputs, or (2) when very large numbers are divided by very small numbers, or
vice versa. Instability of type (2) can cause a loss of numerical integrity when the result is smaller
than the smallest decimal number or larger than the largest decimal number that can be
represented in the DSP processor being used. In the case of a number that is too small, then the
result will likely be returned as zero. However if this number is to be used as a dividend the result
is a divide by zero error, which will cause the algorithm to stop or become unstable by generating
a maximum amplitude quotient.
As an example consider a particular microprocessor that has precision of 3 decimal places. The
following matrix algorithm is to be implemented:
C = [A^{-1} + B]^{-1}    (439)

where

A = \begin{bmatrix} 1000 & 0 \\ 0 & 1 \end{bmatrix},   B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}    (440)

Solving the problem using a processor with 3 decimal places of precision is straightforward and gives:

C = \left[ \begin{bmatrix} 1000 & 0 \\ 0 & 1 \end{bmatrix}^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right]^{-1} = \left[ \begin{bmatrix} 0.001 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right]^{-1} = \begin{bmatrix} 0.001 & 0 \\ 0 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} 1000 & 0 \\ 0 & 0.5 \end{bmatrix}    (441)
However if the same problem was solved using a processor with only two decimal places of precision, then:

C = \left[ \begin{bmatrix} 1000 & 0 \\ 0 & 1 \end{bmatrix}^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right]^{-1} = \left[ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right]^{-1} = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix}^{-1} = \text{non-invertible matrix}    (442)
and the algorithm breaks down. See also Ill-Conditioned.
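A toy sketch of the example above (NumPy assumed), emulating limited precision by rounding every intermediate result to a fixed number of decimal places:

import numpy as np

def solve_with_precision(decimals):
    # C = (A^-1 + B)^-1 with each intermediate rounded to `decimals` decimal places.
    A = np.array([[1000.0, 0.0], [0.0, 1.0]])
    B = np.array([[0.0, 0.0], [0.0, 1.0]])
    A_inv = np.round(np.linalg.inv(A), decimals)    # 0.001 survives 3 places but rounds to 0 at 2
    return np.linalg.inv(np.round(A_inv + B, decimals))

print(solve_with_precision(3))                       # [[1000, 0], [0, 0.5]]
try:
    print(solve_with_precision(2))
except np.linalg.LinAlgError:
    print("A^-1 + B is singular: the algorithm breaks down at 2 decimal places")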
Numerical Properties: The ability of a DSP algorithm to produce intermediate results that are
within the wordlength of the processor being used indicates that the particular algorithm has good
numerical properties. If, for example, a particular DSP algorithm running on a 32 bit floating point
DSP processor produces intermediate values that require more precision than 32 bits floating point,
then clearly the final result will be in error by some margin. Therefore it is always desirable to use algorithms with good numerical properties. In linear algebra, for solving a linear set of equations the
QR algorithm is recognised as having good numerical properties, whereas Gaussian Elimination
has very poor numerical properties. See also Round-Off Noise.
Numerical Stability: See Numerical Integrity.
Nyquist: The Nyquist frequency is the minimum frequency at which an analog signal must be
sampled in order that no information is lost (assuming the sampling process is perfect).
Mathematically, it can be shown that the Nyquist frequency must be greater than twice the highest
frequency component of the signal being sampled in order to preserve all information [10]. In
practical terms, real-world signals are never exactly bandlimited. However, the energy that gets
aliased is kept small in properly designed DSP systems. See also Aliasing.
O
Octave: An octave refers to the interval between two frequencies where one frequency is double the other. For example, from 125 Hz to 250 Hz is an octave, and from 250 Hz to 500 Hz is an octave, and so on. It may seem strange that octave is derived from the Greek prefix "oct" which means eight; however this relates to the Western music scale, whereby an octave is a set of eight musical notes (of increasing frequency), where the first note has half of the frequency of the last note. See also Decade, Logarithmic Frequency, Roll-off, Western Music Scale.
Odd Function: The graph of an odd function has point symmetry about the origin such that
y = f ( x ) = – f ( – x ) . For example both the functions y = sin x and y = x 3 are odd functions.
In contrast an even function is symmetric about the y-axis such that y = f(x) = f(-x). See also Even Function.

[Figure: graphs of the odd functions y = sin x and y = x^3, each with point symmetry about the origin.]
Off-Line Processing: If recorded data is available on a hard disk and it is only required to process
this data then store it back to disk then the computation is not time limited and this is referred to as
off-line processing. If on the other hand an output must be generated as fast as an input is received
from a real world sensor then this is real-time processing. See also Real Time Processing.
Offset Keyed Phase Shift Keying (OPSK or OKPSK): See Offset Keying.
Offset Keyed Quadrature Amplitude Modulation (OQAM or OKQAM): See Offset Keying.
Offset Keying: A modulation technique used with quadrature signals (i.e., those signals that can
be described in terms of in-phase and quadrature, or cosine and sine, components). In offset
keying, symbol transitions for the quadrature component are delayed one half a symbol period from
those for the in-phase component.
OnCE: Motorola on-chip emulator that allows easy debugging of the DSP56000 family of
processors.
On-chip Memory: Most DSP processors (DSP56/96 series, TMS320, DSP16/32, ADSP 2100
etc.) have a few thousand words of on-chip memory which can be used for storing short programs,
and (significantly) data. The advantage of on-chip memory is that it is faster to access than off-chip
memory. For DSP applications such as a FIR filter, where very high speed is essential, the on-chip
memory is very important. See also DSP Processor, Cache.
On-line Processing: See Real Time Processing.
Operational Amplifier (or Op-Amp): An integrated circuit differential amplifier that has a very
high open-loop gain (of the order 100000), a high input impedance (MΩ), and low output impedance
(100Ω) over a relatively small bandwidth. By introducing negative feedback around the amplifier,
gain ratios of 1-1000 over a wide bandwidth can be set up. Op-Amps are very widely used for many
forms of signal conditioning in DSP audio, medical, telecommunication applications.
[Figure: schematic icon for an op-amp.]
Oppenheim and Schafer: Alan Oppenheim and Ronald Schafer are the authors of the definitive
1975 text Digital Signal Processing published by Prentice Hall. Still a very relevant reference for
DSP students and professionals, although since then many other excellent texts have been
published.
Order of a Digital Filter: See Digital Filter Order.
Order Reversed Filter: See Finite Impulse Response.
Orthogonal Matrix: See Matrix Properties - Orthogonal.
Orthonormal Matrix: See entry for Matrix Properties - Orthogonal.
Orthogonal Vector: See Vector Properties and Definitions - Orthogonal.
Orthonormal Vector: See Vector Properties and Definitions - Orthonormal.
Otoacoustic Emissions: Sounds that are emitted spontaneously from the ear canal.
Measurements of these emissions are used to diagnose hearing loss and other pathologies within
the ear. The emissions are induced by stimulating the ear and then measured by recording the
response produced after the stimulus.
Outer Product: See Vector Properties and Definitions - Outer Product.
Overdetermined System of Equations: See Matrix Properties - Overdetermined System of
Equations.
Oversampling: If a signal is sampled at a much higher rate than the Nyquist rate, then it is
oversampled. Oversampling can bring two benefits: (1) a reduction in the complexity of the analog
anti-alias filter; and (2) an increase in the resolution achievable from an N-bit ADC or DAC.
As an example of oversampling for reducing the complexity of the analog anti-alias filter, consider
a particular digital audio system in which the sampling rate is 48kHz. The Nyquist criterion is
satisfied by attenuating all frequencies above 24kHz that may be output by certain musical
instruments (or interfering electronic equipment) by at least 96 dB (equivalent to a 16 bit dynamic
range). If it is decided that the low pass filter will cut off at 18 kHz, and if 96 dB attenuation is required at 24 kHz, then the filter requires a roll-off of 240 dB/octave as shown in the following figure:

[Figure: input spectra, anti-alias filter responses (attenuation in dB against logarithmic frequency, 12 to 192 kHz) and filtered output spectra for sampling at 48 kHz and for 4 times oversampling at 192 kHz; the 48 kHz case needs a 240 dB/octave roll-off from 18 kHz, the 192 kHz case only 48 dB/octave.]

For a particular audio application, sampling at 48 kHz requires that the anti-alias filter has a sharp cut-off at 18 kHz to attenuate by 96 dB at 24 kHz. For a system that oversamples by a factor of 4, i.e. at 192 kHz, the anti-alias analogue filter has a reduced roll-off specification as only aliasing frequencies above 96 kHz must be removed to avoid baseband aliasing. Thereafter a digital low pass filter can be designed to filter off the frequencies between 18 and 24 kHz prior to a 4 x's downsampling.
Clearly this is a 40th order filter and somewhat difficult to reliably design in analogue circuitry!
(Please note the figures used here are for example purposes only and do not necessarily reflect
actual digital systems.) However if we oversample the music signal by 4 x’s, i.e. at
4 × 48 kHz = 192 kHz , then an analog anti-alias filter with a roll-off of only 48 dB/octave starting
at 18 kHz and providing more than 96dB attenuation at half of the oversampled rate of 96 kHz is
required as also shown in the above figure. (In actual fact the roll-off could be even lower as it is
very unlikely there will be any significant frequency components above 30 kHz in the original
analogue music.)
If an oversampled digital audio signal is input to a DSP processor, clearly the processing rate must
now run at the oversampled rate. This requires R x’s the computation of its Nyquist rate counterpart
(i.e. the impulse response length of all digital filters is now increased by a factor of R), and at a
frequency R x’s higher. Hence the DSP processor may need to be R x’s faster to do the same useful
processing as the baseband sampled system. This is clearly not very desirable and a considerable
disadvantage compared to the Nyquist rate system. Therefore the oversampled signal is decimated
to the Nyquist rate, first by digital low pass filtering, then by downsampling. Therefore any
frequencies that thereafter exist between 18 and 96 kHz can be removed with a digital low pass
filter prior to downsampling by a factor of 4. Hence the complexity of the analogue low pass antialias filter has been reduced by effectively adding a digital low pass stage of anti-alias filtering.
For an R x’s oversampled signal the only portion of interest is the baseband signal extending from
0 to f n ⁄ 2 Hz, where f n is the Nyquist rate and f s = Rf n , and hence the decimation described above
is required. Therefore in order to reduce the processing rate to the baseband rate the oversampled
signal is first digitally low pass filtered to the f n ⁄ 2 using a digital filter with a sharp cut-off. The
DSPedia
298
resulting signal is therefore now bandlimited to f n ⁄ 2 and can be downsampled by retaining only
every R-th sample. This process of oversampling has therefore reduced the specification of the
analog anti-alias filter, by introducing what is effectively a digital anti-alias filter. The design tradeoff is the cost of the sharp cut-off digital low pass (decimation) filter versus the cost of the sharp cutoff analogue anti-alias filter.
As well as reducing the cost, oversampling can be used to increase the resolution of an ADC or
DAC. For example, if an ADC has a quantization level of q volts the in band quantization noise
power can be calculated as:
Q_N = \frac{2 q^2 f_B}{12 f_s}    (443)
[Figure: the quantisation noise power spectrum, flat from 0 to f_s/2 with total power q^2/12; the baseband signal of interest occupies 0 to f_B, so only Q_N = 2q^2 f_B / (12 f_s) of the noise power falls in band.]
Therefore in order to increase the baseband signal to quantisation noise ratio we can either
increase the number of bits in the ADC or increase the sampling rate f s a number of factors above
Nyquist. From the above figure it can be seen that oversampling a signal by a factor of 4 x’s the
Nyquist rate reduces the in-band quantization noise (assumed to be a flat spectrum between 0 Hz
and f s ⁄ 2 Hz) by 1/4. This noise power is equivalent to an ADC with step size q ⁄ 2 and hence
baseband signal resolution has been increased by 1 bit [8]. In theory, therefore, if a single bit ADC
were used and oversampled by a factor of 4^15 (≈ 10^9 × f_s) then a 16 bit resolution signal could be
realized! Clearly this sampling rate is not practically realisable. However at a more intuitively useful
level, if an 8 bit ADC converter was used to oversample a signal by a factor of 16x’s the Nyquist
rate, then when using a digital low pass filter to decimate the signal to the Nyquist rate,
approximately 10 bits of meaningful resolution could be retained at the digital filter output. See also
Decimation, Noise Shaping, Quantisation Error, Sigma Delta, Upsampling, Undersampling.
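A small numerical check (NumPy assumed) of the in-band noise expression of Eq. 443 and the "1 bit per factor of 4" rule of thumb; the full-scale range of 2 used to set q is an assumption:

import numpy as np

def inband_quantisation_noise(bits, R):
    # Eq. 443 with f_B = f_n/2 and f_s = R*f_n, i.e. Q_N = (q^2/12) / R.
    q = 2.0 / 2**bits               # assumed full-scale range of 2 (from -1 to +1)
    return q**2 / 12.0 / R

base = inband_quantisation_noise(bits=8, R=1)          # sampling at the Nyquist rate
oversampled = inband_quantisation_noise(bits=8, R=4)   # 4 times oversampling
print(10 * np.log10(base / oversampled))               # about 6 dB, i.e. roughly 1 extra bit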
P
P*64: Another name for the H.261 image compression/decompression standard.
Packet: A group of binary digits including data and call control signals that is switched by a
telecommunications network as a composite whole.
Parallel Adder: The parallel adder is composed of N full adders and is capable of adding two N bit
binary numbers to realise an N+1 bit result. A four bit parallel adder is:
[Figure: four full adder (FA) cells in a chain from LSB to MSB, adding bits a0..a3 and b0..b3 (with the carry rippling left) to produce the sum bits s0..s4.]

    General 4 bit addition:        Example:
         a3 a2 a1 a0                  1101     (A = 13)
       + b3 b2 b1 b0                + 1011     (B = 11)
      s4 s3 s2 s1 s0                 11000     (S = 24)
Four bit binary addition can be performed using a simple linear array of full adder logic
circuits. For an N bit full adder, N full adders are required.
Because the above carry ripples from the LSB to the MSB (right to left) it is often called a ripple
adder. The latency of the adder is calculated by finding the longest path through the adder. The
above example is for simple unsigned arithmetic, however the parallel adder can easily be
converted to perform in 2’s complement arithmetic [20].
In general inside a DSP processor, the parallel adder will be integrated with the parallel multiplier
and arithmetic logic unit, thereby allowing single cycle adds, and single cycle multiply-add
operations. See also Arithmetic Logic Unit, Full Adder, Parallel Multiplier, DSP Processor.
Parallel Multiplier: The key arithmetic element in all DSP processors is the parallel multiplier
which is essentially a digital logic circuit that allows single clock cycle multiplication of N bit binary
numbers, where N is the wordlength of the processor. Consider the multiplication of two unsigned
4 bits numbers:
    General 4 bit multiplication:            Example:
               a3 a2 a1 a0   (A)                     1101    (13)
             x b3 b2 b1 b0   (B)                   x 1011    (11)
               c3 c2 c1 c0                           1101
            d3 d2 d1 d0                             1101
         e3 e2 e1 e0                               0000
      f3 f2 f1 f0                                 1101
    p7 p6 p5 p4 p3 p2 p1 p0                     10001111    (143 = 11 x 13)
Binary multiplication can be performed using the same partial product formation as used
for decimal multiplication. This calculation can then be easily mapped onto an array of full
adders with single bit multiplication performed by a simple AND gate.
(In practice 2’s complement multiplication is required in DSP calculations to represent both positive
and negative numbers, however for the illustrative purpose here the unsigned parallel multiplier
should suffice; the 2’s complement multiplier requires only minor modification [20]). The above 4 bit
calculation can be mapped onto an array of binary adders/AND gates:
[Figure: a 4 x 4 array of multiplier cells. Each cell forms z = a·b with an AND gate and combines z with the incoming sum and carry in a full adder (s_out = s ⊕ z ⊕ c, with the carry passed on to the neighbouring cell); the a and b bits are broadcast along rows and columns and the array outputs the product bits p7..p0.]
Each cell of the parallel multiplier has a full binary adder and a logical AND gate. The
multiplier performs a binary multiplication by forming the partial products and summing
them together using the same mechanism as used in decimal. This multiplier is for
positive integer values. Some modification is required to produce a multiplier that operates
on 2’s complement arithmetic as required for DSP.
The above 4 bit multiplier produces an 8 bit product and requires 4^2 = 16 cells. Therefore a 16 bit multiplier requires 16^2 = 256 cells and produces a 32 bit product, and a 24 bit multiplier requires 24^2 = 576 cells and produces a 48 bit product, and so on. Given that about 12 logic gates may be
required for each cell in the multiplier, and each gate requires say 5 transistors, the total transistor
count and therefore silicon area required for the multiplier can be very high in terms of percentage
of the total DSP processor silicon area. Most general purpose processors do not have parallel
multipliers and will perform multiplication using the processor ALU and form one partial product per
clock cycle, to produce the product in N clock cycles (where N is the data wordlength).
For some ASIC DSP designs a parallel multiplier may be too expensive and therefore a bit serial
multiplier may be implemented. These devices require only N cells, however the latency is N clock
cycles [12]. See also Division, DSP Processor, Full Adder, Parallel Adder, Square Root.
Parallel Processing: When a number of DSP processors are connected together as part of the
same system, this is referred to as a parallel processing system, as the DSPs are operating in
parallel. Although defined as a research area on its own (for complex parallel systems), some
simple parallel processing approaches to decomposing DSP algorithms are usually rather obvious
where small numbers of DSPs are concerned.
Parseval's Theorem: The total energy in a signal can be calculated based on its time representation or its frequency representation. Given that the power calculated in both domains must be the same, this equality is called Parseval's theorem.
From the Fourier series, recall that a signal, x ( t ) , can be represented in terms of its complex Fourier
series:
x(t) = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t}    (Synthesis)
                                                         (444)
C_n = \frac{1}{T} \int_0^T x(t) e^{-jn\omega_0 t} \, dt    (Analysis)

Complex Fourier Series Equations
The power in the signal, x ( t ) , can be calculated by integrating over one time period, T :
P = \frac{1}{T} \int_0^T x^2(t) \, dt    (445)
However if we calculate the power based on the power of each of the complex exponential signals, then the total power is:

P = \sum_{n=-\infty}^{\infty} |C_n e^{jn\omega_0 t}|^2 = \sum_{n=-\infty}^{\infty} |C_n|^2 |e^{jn\omega_0 t}|^2 = \sum_{n=-\infty}^{\infty} |C_n|^2    (446)

given that the power in the complex exponential e^{jn\omega_0 t} = \cos n\omega_0 t + j \sin n\omega_0 t is 1. Hence for the complex Fourier series representation of a signal, we can state Parseval's theorem as:
\frac{1}{T} \int_0^T x^2(t) \, dt = \sum_{n=-\infty}^{\infty} |C_n|^2    (447)
If the periodic signal x(t) is real valued, we can also state Parseval's theorem in terms of the amplitude/phase Fourier series representation. Recalling that for a periodic signal:

x(t) = \sum_{n=0}^{\infty} M_n \cos(n\omega_0 t - \theta_n),   M_n = \sqrt{A_n^2 + B_n^2},   \theta_n = \tan^{-1}(B_n / A_n)    (448)

where A_n and B_n are the Fourier coefficients, then:

P = \sum_{n=0}^{\infty} (M_n \cos(n\omega_0 t - \theta_n))^2 = \sum_{n=0}^{\infty} \frac{M_n^2}{2}    (449)

and Parseval's theorem can be stated as:
\frac{1}{T} \int_0^T x^2(t) \, dt = \sum_{n=0}^{\infty} \frac{M_n^2}{2}    (450)
If a signal is aperiodic, Parseval's theorem can be stated in terms of the total energy in the signal being the same in the time domain and frequency domain:

E = \int_{-\infty}^{\infty} |x(t)|^2 \, dt = \int_{-\infty}^{\infty} |X(f)|^2 \, df    (451)
See also Discrete Fourier Transform, Fourier Series, Fourier Transform.
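A quick numerical sketch (NumPy assumed) of the discrete-time counterpart of Eq. 451, checking that the energy computed in the time domain matches the energy of the DFT (with the 1/N factor that NumPy's unnormalised FFT requires):

import numpy as np

N = 1024
x = np.random.randn(N)                        # arbitrary real test signal

time_energy = np.sum(x**2)
X = np.fft.fft(x)
freq_energy = np.sum(np.abs(X)**2) / N        # 1/N from the DFT normalisation

print(np.allclose(time_energy, freq_energy))  # True: the energy is the same in both domains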
Passband: The range of frequencies that pass through a filter with very little attenuation. See also
Filters.
PC-Bus: Plug in DSP cards (or boards) for IBM PC (AT) and compatibles conform to the PC-Bus
standard. Through the PC-Bus, a DSP processor will be provided with power, (12V and 5V), Ground
lines, and a 16 bit data bus for transfer between DSP board and PC. See also DSP Board.
Percentage Error: See Relative Error.
Perceptual Audio Coding: By exploiting well understood psychoacoustic aspects of human
hearing, data compression can be applied to audio thus reducing transmission bandwidth or
storage requirements [30], [52]. When the ear is perceiving sound, spectral masking or temporal
masking may occur - a simple example of spectral masking is having a conversation next to a busy
freeway where speech intelligibility will be reduced as certain portions of the speech are masked by
noisy passing vehicles. If a perceptual model can be set up which has similar masking attributes to
the human ear, then this model can be used to perform perceptual audio coding, whereby
redundant sounds (which will not be perceived) do not require to be coded or can be coded with
reduced precision. See also Adaptive Transform Acoustic Coding, Audiology, Auditory Filters,
Precision Adaptive Subband Coding (PASC), Psychoacoustics, Spectral Masking, Temporal
Masking, Threshold of Hearing.
Percussion: Any instrument which can be struck to produce a sound can be described as
percussive [14]. Percussion sounds are either pitched or unpitched. For example drums and
cymbals are usually unpitched instruments used to create and sustain the rhythm of music. Certain types of drums however, such as timpani, actually have an associated pitch. Xylophones and marimbas are pitched percussion instruments with a range of three or four octaves.
In the figures below the sound pressure level volume envelope, a short time segment and a frequency domain representation are shown for a cymbal strike and a snare drum beat.

[Figure: amplitude against time (0 to about 2.5 seconds) for the drum beat, d(k), and the cymbal strike, c(k).]
The variation in sound pressure level for a drum beat and cymbal strike. Both signals last
for about 1.5 seconds. From a simple visual inspection the cymbal seems to have more
sustain and is a “fuller” waveform.
[Figure: short time segments (amplitude against time, around 0.70 to 0.75 seconds) of the drum and cymbal waveforms.]
A short 0.15 second segment of the drum and cymbal signals clearly shows the cymbal to
contain a wider range of higher frequencies. Both signals are random in nature with little
discernible periodic content.
-10
-20
-30
-40
-50
-10
-20
-30
-40
-50
0
2
4
6
8
10
frequency/kHz
Cymbal
0
0
2
4
6
8
10
frequency/kHz
Taking an FFT over a short 0.05 segment of the drum and cymbal waveforms serves to
illustrate the stochastic nature of the two sounds.
From the above figures it can be seen that the drum beat and cymbal strike signals both appear to
be stochastic in nature although given that they produce sound based on a resonating impulse there
is clear quasi-periodic content. These signals also possess a degree of regularity in that successive
strikes sound “similar”. The drum exhibits a lower frequency content than the cymbal which is
consistent with the more “bassy” sound it has.
The sound pressure level created by drums and cymbals depends on the force with which they are
struck; both are capable of generating up to 100 dB at a distance of 1 metre. See also Music,
Western Music Scale.
Perfect Pitch: The ability to exactly specify the name of a musical note being played on the
Western music scale is called perfect pitch. Only a very few individuals have perfect pitch, and there is still some debate as to whether such skills can be learned. Many individuals and musicians have
good relative pitch, whereby given the name of one note in a sequence, they can correctly identify
others in the sequence. See also Music, Pitch, Relative Pitch, Western Music Scale.
Permanent Threshold Shift (PTS): When the threshold of hearing is raised due to exposure to an
excessive noise a permanent threshold shift is said to have occurred. See also Audiology,
Audiometry, Temporary Threshold Shift (TTS), Threshold of Hearing.
Permutation Matrix: See Matrix Structured - Permutation.
Period: The period, T, of a simple sine waveform is the time it takes for one complete wavelength to be produced. The inverse of the period gives the frequency, i.e. the number of wavelengths in one second:

$$f = \frac{1}{T} \qquad (452)$$
[Figure: a periodic waveform f(t) against time t, with the period T marked; the waveform repeats at T, 2T, 3T, ...]
Personal Computer Memory Card International Association (PCMCIA): The name given to
bus slots that became almost standard on notebook and subnotebook PCs around 1994. PCMCIA
cards were originally memory cards, but now modems, small disk drives, digital audio soundcards,
and DSP cards are available. The term PC Card is now being used in preference to the rather
unwieldy acronym PCMCIA [169].
Personal Digital Assistant (PDA): A consumer electronics category which classifies handheld
computers that can decode handwritten information (pattern recognition) and communicate with
other computers and FAX machines [169].
Phase: The relative starting point of a periodic signal, measured in angular units such as radians
or degrees. Also, the angle a complex number makes relative to the real axis. A sine wave
(occurring with respect to time) can be written as:
$$x(t) = A\sin(2\pi f t + \phi) \qquad (453)$$

where A is the signal amplitude, f is the frequency in hertz, φ is the phase and t is time.

[Figure: one period (1/f) of the sine wave plotted as voltage against time t, with amplitude A and initial value A sin(φ) at t = 0.]
Phase Compensation: A technique to modify the phase of a signal while leaving the magnitude response unchanged. Phase compensation is usually performed using an all-pass filter. If the phase of a system is compensated to produce an overall linear phase, then this is often referred to as group delay equalisation, as linear phase corresponds to a constant group delay. See All-pass Filter - Phase Compensation, Equalisation, Finite Impulse Response Filter - Linear Phase.
Phase Delay: A term usually synonymous with group delay. See Group Delay.
Phase Jitter: In telephony, the measurement (in degrees) of how far an analog signal deviates from the reference phase of the main data carrying signal. Phase jitter interferes with the interpretation of information by changing the timing or misplacing a demodulated signal in frequency. See also Clock Jitter.
Phase Modulation: One of the three ways of modulating a sine wave signal to carry information.
The sine wave or carrier has its phase changed in accordance with the information signal to be
transmitted. See also Amplitude Modulation, Frequency Modulation.
Phase Response: See also Fourier Series - Amplitude/Phase Representation, Fourier Series - Complex Exponential Representation.
Phase Shift Keying (PSK): A digital modulation technique in which the information data bits are
encoded in the phase of the carrier signal. The receiver recovers the data bits by detecting the
phase of the received signal over a symbol period and decoding this phase into the appropriate data
bit pattern. See also Amplitude Shift Keying, Differential Phase Shift, Frequency Shift Keying.
Phasing: A musical effect whereby the phase of a signal is modified, mixed (or added) with the original signal, and the composite signal is then played [32]. See also Music, Music Synthesis.
Phons: The phon (pronounced fone) is a (subjective) measure of loudness. The loudness of a sound in phons is the sound pressure level of a 1000Hz tone that a human listener judges to be equally loud to the sound being measured. Hence to measure a particular sound in phons would require a
listener to switch back and forth between a calibrated, variable 1000Hz tone and the sound to be
measured. See also Equal Loudness Contours, Equivalent Sound Continuous Level, Frequency
Range of Hearing, Sound Pressure Level.
Piezoelectric: Piezoelectric materials can convert mechanical stress into electrical output energy,
hence they are widely used as sensors. Piezoelectric crystals are also used in a feedback
configuration to make very precise clocks.
Pipelining Execution: DSP processors having RISC architectures often implement a pipelining
structure whereby instructions are executed by the processor in four stages: (1) Instruction Fetch,
(2) Instruction Decode, (3) Memory Read, (4) Execute. Each stage takes one cycle of the processor clock, meaning that each instruction takes a minimum of 4 clock cycles to complete. However, because the DSP processor has been designed to be pipelined, the four stages of successive instructions are overlapped and performed in parallel in each cycle. Hence this overlapping means that on average one instruction can be executed every clock cycle.
Pink Noise: Pink noise is similar to white noise, except that rather than having a flat power spectrum, its power falls off at 10dB/decade with increasing frequency. Pink noise is sometimes referred to as 1/f noise.
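As a rough illustration (an assumption, not from the text), approximate pink noise can be generated by shaping the spectrum of white noise so that the power falls off at 10dB/decade, i.e. scaling the amplitude spectrum by 1/√f:

```python
import numpy as np

# Shape white noise into approximate 1/f (pink) noise in the frequency domain.
rng = np.random.default_rng(1)
N = 8192
white = rng.standard_normal(N)

spectrum = np.fft.rfft(white)
freqs = np.fft.rfftfreq(N, d=1.0)        # normalised frequency bins
freqs[0] = freqs[1]                      # avoid division by zero at DC
pink = np.fft.irfft(spectrum / np.sqrt(freqs), n=N)
pink /= np.max(np.abs(pink))             # normalise amplitude to +/-1
```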
Pitch: There are a number of varying definitions of pitch; however, the generic meaning is the subjective quality of a sound which positions it somewhere in the musical scale [14]. As the number
of cycles per second of a musical note increases linearly our perceived sense of pitch increases
logarithmically. Although very similar to frequency which is measured exactly, pitch is determined
subjectively. For example if two pure tones of slightly different frequencies are presented to a
listener and they are allowed to adjust the intensity levels of one of them, then it is likely that they
will be able to find a level where both tones sound as if they have the same pitch. Pitch is therefore
to some extent dependent on intensity. At louder levels for low frequency tones the pitch decreases
with increase in intensity, but for high tones the pitch increases with increase in intensity. See also
Music, Perfect Pitch, Western Music Scale.
Pivotting: See Matrix Decompositions - Pivoting.
Plane Rotations: See Matrix Decompositions - Plane Rotations.
Plosives: One of the elementary classes of speech sounds; the classes are plosives, fricatives, sibilant fricatives, semi-vowels, and nasals. Plosives are formed by blocking the vocal tract so that no air flows and
suddenly removing the obstruction to produce a puff of air. Examples of plosive sounds are “p”, “b”,
“t”, “d”, “g”, and “k”. See also Fricatives, Nasals, Semi-vowels, and Sibilant Fricatives.
PN Sequence: See Pseudo-Random Noise Sequence.
Polar: Polar refers to the type of signalling method used for digital data transmission, in which the
marks (ones) are indicated by positive polarities and the spaces (zeros) are indicated by negative
polarities (or vice-versa). See also Bipolar (2), Non-return to Zero.
Poles: If the impulse response of a recursive system (with feedback) is transformed into the z-domain, the poles of the transfer function are found by factoring the denominator polynomial to find its roots. If the poles are outside the unit circle, then this is an indication that the system is unstable. The transfer function H(z) of a simple two pole IIR filter with the output y(n) = x(n) + 0.75y(n-1) - 0.125y(n-2) is stable:

[Figure: signal flow graph of the filter, with input x(k), output y(k), and feedback coefficients 0.75 and 0.125 applied to the delayed outputs.]
$$H(z) = \frac{1}{1 - 0.75z^{-1} + 0.125z^{-2}} = \frac{1}{(1 - 0.5z^{-1})(1 - 0.25z^{-1})} \qquad (454)$$
i.e. the poles are at z = 0.25 and z = 0.5. If the roots were outside the unit circle (having a magnitude greater than 1), then the system h(n) would be unstable.
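As an illustrative sketch (not part of the original text), the pole locations of this example filter can be checked numerically by finding the roots of the denominator polynomial:

```python
import numpy as np

# Denominator coefficients of H(z) in powers of z^-1: 1 - 0.75z^-1 + 0.125z^-2.
# np.roots of this coefficient vector gives the pole locations.
denominator = [1.0, -0.75, 0.125]
poles = np.roots(denominator)

print(poles)                               # [0.5, 0.25] (order may vary)
print(np.all(np.abs(poles) < 1.0))         # True: all poles inside the unit circle, stable
```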
Positive Definite Matrix: See Matrix Properties - Positive Definite.
Positive Semi-definite: See Matrix Properties - Positive Semi-definite.
Postmultiplication: See Matrix Operations - Postmultiplication.
Power Spectral Density (PSD): The power spectral density describes the frequency content of a
stationary stochastic or random signal. The PSD can be estimated by taking the average of the
magnitude squared DFT sample values (the periodogram). Many other DSP techniques have been
developed for estimating signal frequency content. This area of research is collectively called spectral
estimation. The PSD is calculated from the Fourier transform of the autocorrelation function:
$$\text{Power Spectral Density, } S(f) = \sum_{n=-\infty}^{\infty} r(n)\,e^{-j2\pi f n} \qquad (455)$$
where the autocorrelation function, r ( n ) , provides a measure of the predictability of a signal, x ( k ) :
$$r(n) = E\{x(k)x(k+n)\} = \sum_k x(k)\,x(k+n)\,p\{x(k), x(k+n)\} \qquad (456)$$
where p { x ( k ), x ( k + n ) } is the joint probability density function of x ( k ) and x ( k + n ) . For
signals assumed to be ergodic the autocorrelation can be estimated as a time average:
$$r(k) = \frac{1}{2M-1}\sum_{n=0}^{2M-1} x(n)\,x(n+k) \quad \text{for large } M \qquad (457)$$
If a particular autocorrelation function is estimated for n different time lags, then a PSD estimate can be computed as the DFT of these correlations. See also Autocorrelation, Discrete Fourier Transform.
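The following sketch (an illustration, not from the text) estimates the PSD of a noisy sinusoid with an averaged periodogram, i.e. by averaging magnitude squared DFTs of signal segments as described above; the segment length and scaling are arbitrary choices.

```python
import numpy as np

# Averaged periodogram PSD estimate of a 100 Hz tone buried in white noise.
rng = np.random.default_rng(2)
fs, n_total, seg_len = 1000.0, 8192, 512
t = np.arange(n_total) / fs
x = np.sin(2 * np.pi * 100.0 * t) + rng.standard_normal(n_total)

segments = x[: (n_total // seg_len) * seg_len].reshape(-1, seg_len)
psd = np.mean(np.abs(np.fft.rfft(segments, axis=1)) ** 2, axis=0) / (fs * seg_len)
freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
print(freqs[np.argmax(psd)])               # peak close to 100 Hz
```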
Power Rails: The voltage used to power a DSP board will usually consist of a number of voltage
sources, which are often referred to as power rails. For a DSP board, there are usually digital power
rails (0 volts and 5 volts) to power the digital circuitry, and analog power rails (-12 volts, 0 volts, and
+12 volts) to power the analog circuitry.
PQRST Wave: The name given to the characteristic shape of an electrocardiogram (heartbeat)
signal waveform. See also Electrocardiogram.
[Figure: a typical electrocardiogram waveform, amplitude (mV) against time (secs), over one beat of about 0.5 seconds, with the characteristic P, Q, R, S and T points labelled; the R peak is the largest feature.]
Precedence Effect: In a reverberant environment the sound energy received by the direct path can
be much lower than the energy received by indirect reflected paths. However the human ear is still able to localize the sound correctly by using the first components of the signal to arrive.
Later echoes arriving at the ear increase the perceived loudness of the sound as they will have the
same general spectrum. This psychoacoustic effect is known as the precedence effect, law of the
first wavefront, or sometimes the Haas effect. The precedence effect applies mainly to short
duration sounds or those of a discontinuous or varying form. See also Ear, Lateralization, Source
Localization, Threshold of Hearing.
Precision Adaptive Subband Coding (PASC): A data compression technique developed by
Philips and used in hi-fidelity digital audio systems such as the digital compact cassette (DCC). PASC
is closely related to the audio compression methods defined in ISO/MPEG layer 1. Listening tests
have revealed that the overall quality of PASC encoded music is “almost identical to that of compact
disc (CD)”. In fact it has been argued that in terms of dynamic range DCC has improved
performance given that it is compressing 20 bit PCM data compared to the encoding of 16 bit PCM
data by a CD [83].
Precision adaptive subband coding compresses audio by not coding elements of an audio signal that a listener will not hear. PASC is based mainly on two psychoacoustic principles. First, the ear only hears sounds above the absolute threshold of hearing, and therefore any sounds below this threshold do not need to be coded. Second, louder sounds spectrally mask quieter sounds of a "similar" frequency, such that the quiet sound is unheard in the simultaneous presence of the louder sound due to the psychoacoustic raising of the threshold of hearing. The following figure illustrates both principles:

[Figure: two panels of SPL (dB) against frequency (Hz), 20 Hz to 10000 Hz. Left panel, "A sound below the threshold of hearing": a 100Hz narrowband noise at 10dB (SPL) lies below the approximate absolute threshold of hearing and is not perceived. Right panel, "Simultaneous spectral masking of a 1000Hz tone": the masking tone raises the threshold of hearing so that a 600Hz tone at 20dB (SPL) is not perceived.]
In order to exploit psychoacoustic masking the first stage of a PASC system splits the Nyquist
bandwidth of a signal (of between 16 and 20 bit resolution) sampled at 48kHz into 32 equal
subbands each of bandwidth 750Hz. This is accomplished using a 512 weight prototype FIR low
pass filter, h ( n ) , of 3dB bandwidth 375Hz, and stopband attenuation 120dB. Note that to achieve
120dB attenuation 20 bit filter coefficients are required. By modulating the impulse response h ( n )
with modulating frequencies of 375Hz, 1125Hz, 1875Hz and so on in 750Hz intervals, a series of
32 bandpass filters with a 3dB bandwidth of 750Hz and centered around the modulating frequency
are produced. A polyphase subband filter bank is therefore set up as illustrated below.

[Figure: the 512 weight prototype FIR low pass filter (attenuation in dB against frequency) and the resulting bank of 32 subband filters, each of bandwidth 750 Hz and centred at 375 Hz, 1125 Hz, 1875 Hz, 2625 Hz, 3375 Hz, ..., 22875 Hz, 23625 Hz; the input x(k) at f_s = 48 kHz is passed in blocks of 384 samples (sample indices 0 to 383) through the polyphase subband filter bank, and each of the 32 subband outputs is decimated to 12 samples (n = 0 to 11) at f_s/32 = 1.5 kHz.] 32 subbands used for PASC. The filter bank is based on a 512 weight FIR filter prototype with stopband attenuation of 120dB, i.e. 20 bits resolution. Data is input in 8 ms blocks (384 samples) and each subband is decimated to 12 samples.
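The modulation of a prototype low pass filter into a bank of bandpass filters can be sketched as follows. This is only an illustration of the idea (the simple windowed-sinc prototype here does not achieve the 120dB stopband attenuation of the real PASC prototype, and the polyphase/decimation structure is omitted):

```python
import numpy as np

# Cosine-modulate a prototype low pass FIR filter h(n) to centre frequencies
# 375 Hz, 1125 Hz, 1875 Hz, ... giving 32 bandpass filters of ~750 Hz bandwidth.
fs, n_taps, n_bands = 48000.0, 512, 32
n = np.arange(n_taps)

fc = 375.0 / fs                                           # prototype cutoff (normalised)
h = 2 * fc * np.sinc(2 * fc * (n - (n_taps - 1) / 2)) * np.hamming(n_taps)

bank = [h * np.cos(2 * np.pi * ((2 * k + 1) * 375.0 / fs) * n) for k in range(n_bands)]
# bank[k] is a bandpass filter centred at (2k+1)*375 Hz, i.e. 375, 1125, 1875, ... Hz
```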
(Note that although aliasing occurs between adjacent subbands, the alias components are
cancelled when the subbands are merged to reconstruct the original audio data spectrum [49].) The
input data stream is subband filtered in blocks of 8 ms, which corresponds to 384 samples
( 48000 × 0.008 ). Therefore the output of each subband filter after decimation consists of 12
samples.
With the signal in subband coded form the second stage of the PASC system is to perform a
comparison of the full audio spectrum with a model of the human ear. The subband filtering allows
a simple (but coarse) spectral analysis of the signal to be produced by calculating the power of the
12 sample values in each subband. If the power in a subband is below the threshold of hearing,
then the subband is treated as being empty and does not need to be coded for the particular 8ms
block being analyzed. If the power in a particular subband is above the threshold of hearing then a
comparison is made with the known masking threshold to calculate the in-band masking level.
Following this the level of masking caused by this signal in other neighboring subbands is
established. The overall masking calculation is accomplished using a 32 × 32 matrix containing the
masking information and defined in the ISO/MPEG standard.
From the masking calculation results, a decision is made as to the number of bits that will be
allocated to represent the data in that subband such that the quantization noise introduced is below
the masking level (or raised threshold of hearing) and will therefore not be heard when the audio
signal is reconstructed. The bit rate of a PASC encoded time frame of 8ms is fixed at 96 bits/frame
(for each subband, on average). Therefore the bits must be allocated judiciously to the subbands.
The subbands with the highest power relative to the masking level are allocated first as it is likely
they will be important and dominant sounds in the overall audio spectrum and will require the best
resolution. If two subbands have the same ratio, the lower frequency subband is given priority over
the higher one. An example of quantization noise masking is given below:
[Figure: two panels of signal power (dB) against log frequency (Hz) for the 750-1500 Hz subband, each showing a 1000Hz signal and the associated masking level; the left panel shows 16 bit quantization noise and the right panel 8 bit quantization noise.] The 1000 Hz narrowband noise will spectrally mask any signals below the masking level (or raised threshold of hearing). Therefore, considering only this subband, when the signal is reproduced the higher level of quantization noise in the 8 bit signal will not be perceived. Hence the 8 bit signal has the same perceived quality as the 16 bit signal and data compression has been achieved without noticeable loss in quality. Note the masking effect of signals in nearby subbands may extend into the 750-1500Hz subband which could further increase the masking level and therefore allow even fewer bits to represent the signal.
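As a rough sketch of the bit allocation step described above (an illustration only, not the ISO/MPEG or PASC algorithm, and with hypothetical function and variable names), bits can be handed out greedily to the subband whose power is furthest above its masking level, with ties going to the lower frequency subband:

```python
# Greedy bit allocation over subbands using a noise-to-mask ratio (NMR);
# each extra mantissa bit lowers quantisation noise by ~6 dB.
def allocate_bits(power_db, mask_db, total_bits, max_bits=15):
    n = len(power_db)
    bits = [0] * n
    nmr = [p - m for p, m in zip(power_db, mask_db)]
    remaining = total_bits
    while remaining > 0:
        candidates = [i for i in range(n) if nmr[i] > 0 and bits[i] < max_bits]
        if not candidates:
            break
        i = max(candidates, key=lambda j: (nmr[j], -j))   # highest NMR; lower band wins ties
        bits[i] += 1
        nmr[i] -= 6.02
        remaining -= 1
    return bits

# Example: 4 of the 32 subbands shown, 20 bits to spend.
print(allocate_bits([60, 45, 30, 5], [20, 25, 28, 10], total_bits=20))
```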
Rather than fixed point sample values (as used in the above illustrative example) PASC uses a
simple block floating point number representation to represent sample values. The mantissa can
be between 2 and 15 bits and the exponent is a 6 bit value. The actual number of bits assigned to
the mantissa depends on the masking calculations. This leads to an overall dynamic range from
+6dB to -118dB (the extra 6dB headroom is required due to the subband filtering process) which is
more than the 96dB available from 16 bit linear coding.
On average a psychoacoustic subband coded music signal rarely assigns bits to subbands
covering the frequency range 15 kHz to 24 kHz (i.e. they are usually empty!), around 3 to 7 mantissa
bits will typically be required for subbands covering the frequency range 5kHz - 15 kHz, and for the
frequency range 100Hz - 5 kHz between 8 and 15 bits are typically required. The higher bit
allocation for lower frequencies is as expected as the masking effect is less pronounced at lower
frequencies (see Spectral Masking). This allocation of precision would perhaps suggest that the
initial subband structure should have a small bandwidth for low frequencies and a higher bandwidth
for larger frequencies. However, the small bandwidth required at low frequencies would require a very long impulse response filter, which would need to be compensated for by delaying the output signals from the higher frequency subbands (which have a larger bandwidth and hence shorter filters) if phase is to be preserved. To implement
this delay on chip requires such a large area that this solution is not economically attractive, albeit
good compression ratios would be possible.
After each 8 ms time frame has undergone the PASC coding and bit allocation, the data is then
stored in an encoded bit stream for recording to magnetic tape. Cross interleaved Reed-Solomon
code (CIRC) is used for error correction coding of PASC data when recorded onto DCC (digital
compact cassette).
PASC techniques can also be applied to input data sampled at 32kHz or 44.1kHz. Because the encoded data rate stays the same at 384 kbits/sec, the subband filter bandwidth for these sampling frequencies reduces to 500Hz and 689Hz respectively (i.e. 32000/64 and 44100/64).
See also Adaptive Transform Acoustic Coding (ATRAC), Auditory Filters, Compact Disc, Data
Compression, Digital Compact Cassette (DCC), Frequency Range of Hearing, Psychoacoustics,
Spectral Masking, Subband Filtering, Temporal Masking, Threshold of Hearing.
Premultiplication: See Matrix Operations - Premultiplication.
Probability: The use of probabilistic measures and statistical mathematics in digital signal
processing is very important. Specifically the concept of a random variable which is characterised
via a probability density function (PDF) is very important. With probability, random signals can be
characterised and information on their frequency content can be realised.
In its simplest form the probability of an event A happening, and denoted as p ( A ) can be
determined by performing a large number of trials, and counting the number of times that event A
occurs. Therefore:
$$P(A) = \lim_{n \to \infty} \frac{\text{no. of times A occurred, } n_A}{\text{total no. of trials, } n} \qquad (458)$$
determines the probability of event A occurring. A simple example is the shaking of a die to
determine the probability of a 6 occurring. If, for example 60 trials were done and a 6 occurred 8
times then P d ( 6 ) = 8 ⁄ 60 , where the subscript “d” specifies the process name. Of course the true
probability is P d ( 6 ) = 1 ⁄ 6 which would have been determined if an “infinite” number of trials were
done.
From the above simple definition, it can be noted that 0 ≤ P ( A ) ≤ 1 . Clearly if P ( A ) = 0 (the null
event) then the event (almost) never occurs, whereas if P ( A ) = 1 then it (almost) always occurs
(the sure event). If you find the parenthetical "almosts" annoying, amusing, confusing, etc.,
remember that probability means never having to say you’re certain (or was that statistics?).
The joint probability that the event AB will occur is denoted P ( AB ) . The following definitions are
also useful for probability:
• Bayes Theorem: The joint probability that an event AB occurs can be expressed as:

$$P(AB) = P(A)P(B|A) = P(B)P(A|B) \qquad (459)$$

If two events A and B are independent then P(A|B) = P(A) or P(B|A) = P(B).
• Conditional Probability: The probability that an event A occurs, given that an event B has already occurred, is denoted as P(A|B).
• Independence: Two separate events, A and B, are independent if the probability of A and B occurring together is obtained by multiplying the probability of A occurring by the probability of B occurring:

$$P(AB) = P(A)P(B) \qquad (460)$$
• Joint Probability: The probability of two events, A and B, both occurring is:

$$P(AB) = \lim_{n \to \infty} \frac{\text{no. of times AB occurred, } n_{AB}}{\text{total no. of trials, } n} \qquad (461)$$

where the notation P(AB) can be read "the probability of event A and event B". As an example consider an experiment where a coin is flipped, and a die is shaken at the same time. The probability that a head shows up, P_c(head), and the number 3, P_d(3), is:

$$P(\text{head \& }3) = P_d(3)\,P_c(\text{head}) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12} \qquad (462)$$

The shaking of the die and flipping of the coin are independent events, i.e. the outcome of the coin flip has no bearing on the outcome of the die shake (a short simulation of this example is sketched below).
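A short simulation of the coin-and-die example (an illustration, not from the text) estimates the joint probability by relative frequency, as in the limiting definitions above:

```python
import random

# Estimate P(head & 3) by counting over many independent trials.
random.seed(0)
trials = 100_000
count = sum(1 for _ in range(trials)
            if random.randint(1, 6) == 3 and random.random() < 0.5)

print(count / trials)          # close to 1/12 = 0.0833...
```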
See also Ergodic, Expected Value, Mean Value, Mean Squared Value, Probability, Random
Variable, Variance, Wide Sense Stationarity.
Probability Density Function: See Random Variable.
Proportional Integral Derivative (PID) Controller: Process control applications monitor a
variable such as temperature, level, flow and so on, and output a signal to adjust that variable to
equal some desired value. In a PID controller the difference between the desired and measured variable (the error) is found; the proportional term adjusts the output in proportion to the error, the integral term acts on the accumulated error so that a persistent error drives the output to change faster, and the derivative term adjusts the output in proportion to the rate of change of the error. PID controllers usually do not require the processing power of a DSP as the data
processing rates are well within that of microcontrollers.
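As a rough sketch of the computation (an assumption, not from the text; the gains and function name are hypothetical), one step of a discrete time PID update might look like:

```python
# One step of a discrete PID controller; Kp, Ki, Kd are the proportional,
# integral and derivative gains, dt the sample period.
def pid_step(setpoint, measurement, state, Kp=1.0, Ki=0.1, Kd=0.05, dt=0.01):
    error = setpoint - measurement
    state["integral"] += error * dt
    derivative = (error - state["prev_error"]) / dt
    state["prev_error"] = error
    return Kp * error + Ki * state["integral"] + Kd * derivative

state = {"integral": 0.0, "prev_error": 0.0}
print(pid_step(setpoint=50.0, measurement=47.5, state=state))
```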
Pseudo-Inverse: See Matrix Properties - Pseudo-Inverse.
Pseudo-Inverse Matrix: See Matrix Properties - Pseudo-Inverse.
Pseudo-Noise (PN): Analog pseudo-noise can be generated using a pseudo random binary sequence generator connected to a digital to analog converter (DAC):
[Figure: an N-bit pseudo random binary sequence shift register, clocked at f_c (chip interval t_c = 1/f_c), drives an N-bit DAC; the DAC output x(k) spans the levels -2^{N-1} to 2^{N-1}-1 and is passed through an analog reconstruction filter to give the analog noise waveform x(t) in volts.]
The period of the pseudo noise is Nt c seconds. There are of course other methods of producing
analog “noise”, however the term pseudo noise usually indicates that the sequence was generated
using pseudo random noise sequence generating schemes. See also Pseudo-Random Noise
Sequence, Pseudo-Random Binary Sequence.
Pseudo-Random Binary Sequence (PRBS): The PRBS is a binary sequence generated by the
use of an r-bit sequential linear feedback shift register arrangement. PRBS’s are sometimes called
pseudo noise (PN) sequences and pseudo random noise (PRN). PRBS’s are widely used in digital
communications, where for example both ends of a digital channel contain a circuit capable of
generating the same PRBS, and which can therefore allow the bit error rate of the channel to be
measured, or perhaps adaptive equalization to be performed.
[Figure: two synchronised PRBS generators, one at the transmitter (Glasgow, Scotland) and one at the receiver (Duluth, Minnesota, USA), joined by a communication line through modulation/transmission and receiver/demodulation stages; the received data and the locally generated PRBS (e.g. 0100101110110) are compared in an exclusive-OR gate, which outputs 1 when an error has occurred.] A PRBS sequence can be transmitted down a communications line (e.g. telephone, satellite etc.) and the data sequence received at the receiver checked against the known transmitted sequence, assuming the two PRBS generators are synchronised and producing the same sequence. If the output of an exclusive-OR gate is binary 1, then an error has occurred.
Other applications include using PRBS for spread spectrum communications [9], for scrambling
data, and using a PRBS for range finding via radar or sonar [116].
A PRBS is called pseudo random because the sequence in fact repeats after a large number of bits and is therefore periodic; however, the short term behaviour of the sequence appears random. The general construction of a PRBS-generating linear feedback shift register of length r bits is:
[Figure: a general linear feedback shift register of r single bit registers; the output of each stage is gated by a single bit multiplier C_1, C_2, ..., C_r, the gated outputs are combined in an exclusive-OR and fed back to the register input, and the PRBS output p(k) is taken from the final stage.]
where the register is clocked every T_c seconds (often denoted as the chip interval), and the binary data signal, p(k), is therefore output at a rate of f_c = 1/T_c. The longer the register, the longer the PRBS that can be generated. The values of the single bit multipliers C_k are either 0 or 1 and they can be represented in a convenient characteristic polynomial notation:
$$f(X) = 1 + \sum_{k=1}^{r} C_k X^k = C_r X^r + C_{r-1} X^{r-1} + \ldots + C_1 X + 1 \qquad (463)$$
By carefully choosing the polynomial it is possible to ensure that the shift register cycles through all
the possible states (or N-tuples), with the exception of the all zero state [40]. This will produce a
PRBS of 2^r - 1 bits (known as a maximal length sequence) before the cycle restarts. If the register ever enters the all zero state it will never leave it. As an example, a 31 bit maximal length sequence can be produced from the polynomial:
$$X^5 + X^2 + 1 \qquad (464)$$
which specifies the 5 bit PRBS shift register:
[Figure: the 5 bit PRBS shift register specified by this polynomial, with its output p(k).]
For a particular PRBS, a sequence of the same bits (either 1's or 0's) is referred to as a "run", and the number of bits in the run is the "length". For a maximal length sequence of N (= 2^r - 1) bits from an r bit register, it can be shown that the PRBS will contain one run of r 1's, and one run of r - 1 0's. The number of runs of the shorter lengths of 1's and 0's increases with powers of 2 as follows:
Run Length    Runs of 1's    Runs of 0's
r             1              0
r-1           0              1
r-2           1              1
r-3           2              2
:             :              :
3             2^(r-5)        2^(r-5)
2             2^(r-4)        2^(r-4)
1             2^(r-3)        2^(r-3)
For example an r = 4 bit shift register can be set up from the polynomial X^4 + X^3 + 1 to produce a 15 bit maximal length PRBS as follows:
[Figure: the 4 bit shift register specified by X^4 + X^3 + 1 and its output p(k); one 15 bit period of the PRBS, 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1, is shown against time.] Priming the shift register with 0001 will cause it to cycle through 1000, 0100, 0010, 1001, 1100, 0110, 1011, 0101, 1010, 1101, 1110, 1111, 0111, 0011, and back to 0001. If the contents of the shift register are considered as a binary number, then a PRBS generator cycles through all binary numbers from 1 to 2^r - 1 in a "random" order. Note that the PRBS has one run of four 1's, one run of three 0's, and so on, in accordance with the above table of run lengths.
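The 15 bit maximal length PRBS above can be generated with a few lines of code; the following sketch (an illustration, not from the text) implements a Fibonacci style linear feedback shift register, and the exact phase of the output sequence depends on the seed:

```python
# Fibonacci LFSR: the feedback bit is the XOR of the register stages named in `taps`.
def prbs(taps, r, seed=1, length=None):
    state = seed
    out = []
    for _ in range(length or (2 ** r - 1)):
        out.append((state >> (r - 1)) & 1)            # output the last stage
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1              # XOR of the tapped stages
        state = ((state << 1) | fb) & ((1 << r) - 1)  # shift left and feed back
    return out

# Taps (4, 3) correspond to X^4 + X^3 + 1; one 15 bit period is produced, a cyclic
# shift of the sequence in the figure above (the phase depends on the seed).
print(prbs(taps=(4, 3), r=4))
```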
Note that when a PRBS is generated over N clock cycles, the shift register contains at some point all binary numbers from 1 to 2^r - 1, i.e. every state except zero, a state which the generator can never leave. Feedback taps for some maximal length sequences using longer shift registers are shown in the table below:
the table below:
Shift Register
Length, r
Maximal Code
Length, N
Maximal Sequence
Generating Polynomials
5
31
X5+X3+1
8
255
X8+X6+X5+X4+1
10
1023
X10+X7+1
16
65535
X16+X15+X13+X4+1
20
1048575
X20+X17+1
24
16777215
X24+X23+X22+X17+1
Note that other polynomials can be used to generate other maximal length sequences of N bits. The
actual number of maximal length generating polynomials can be calculated using prime factor
analysis [116].
A useful property of a maximal length sequence is that the alternate bits of the sequence form the same sequence at half of the rate. Consider two periods of the above 15 bit PRBS generated from the polynomial X^4 + X^3 + 1, and create a new sequence by retaining only every second bit:
[Figure: two 15 bit periods of the PRBS, 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1, with the bit positions numbered 1 to 30 and the new sequence formed from every second bit shown underneath.] Taking only every alternate bit, the same PRBS is generated but at half of the frequency. For example above, taking bits 1, 3, 5 and so on produces the same PRBS at half of the frequency. In turn the PRBS sequence at one quarter of the frequency can be produced from the half rate PRBS, and so on, decimating by any factor R, where R is a power of 2.
If a signal q(k) is derived from the PRBS signal p(k) such that:

$$q(k) = \begin{cases} +1 \text{ volt}, & \text{if } p(k) = 1 \\ -1 \text{ volt}, & \text{if } p(k) = 0 \end{cases} \qquad (465)$$
then the autocorrelation of a maximal length PRBS, q(k), of N bits is:

$$R_q(m) = \frac{1}{L}\sum_{n=0}^{L-1} q(n)\,q(n+m) = \begin{cases} 1, & m = jN \\ -\dfrac{1}{N}, & m \neq jN \end{cases} \quad \text{where } j = 0, 1, 2, \ldots \qquad (466)$$

(for L large), which can be represented as:
[Figure: the autocorrelation R_q(m) plotted against lag m: equal to 1 at m = 0, ±N, ±2N, ..., and equal to -1/N at all other lags.]
It can therefore be shown that the autocorrelation of the continuous time waveform q ( t ) is also
periodic and is a triangular waveform:
[Figure: the autocorrelation R_q(τ) of the continuous time waveform plotted against lag τ: a periodic train of narrow triangular pulses of height 1 at τ = 0, ±NT_c, ±2NT_c, ..., with the level -1/N between pulses.]
The power spectrum, P(f), obtained from the Fourier transform of the autocorrelation, is therefore a line spectrum with a (sin x / x)^2 envelope:
[Figure: the line power spectrum log P(f) against frequency: spectral lines spaced Δf = 1/(NT_c) apart under a (sin x / x)^2 envelope with nulls at 1/T_c, 2/T_c, 3/T_c, 4/T_c, ...]
Similar types of feedback shift registers to the PRBS generator are also used for setting up cyclic
redundancy check codes. See also Characteristic Polynomial, Cyclic Redundancy Check.
Pseudo-Random Noise Sequence (PRNS): A sequence of numbers that has properties that make the sequence appear to be random, in spite of the fact that the numbers are generated in a deterministic way and are therefore periodic. Linear feedback shift registers are often used to generate these sequences. Maximal length (ML) binary sequences produce 2^N - 1 bit sequences (the longest sequence possible without repetition) from an N bit shift register. See also Pseudo-Random Binary Sequence.
Psychoacoustics: The study of how acoustic transmissions are perceived by a human listener.
Psychoacoustics relates physical quantities such as absolute frequency and sound intensity levels
to perceptual qualities, such as pitch, loudness and awareness. Although certain sounds may be
presented to the ear, the human hearing mechanism and brain may not perceive these sounds.
For example a simple psychoacoustic phenomenon is habituation, whereby a repetitive sound such as a clock ticking is not heard until attention is specifically drawn to it. Spectral masking is an example of a more complex psychoacoustic phenomenon whereby loud sounds over a
certain frequency band mask the presence of other quieter sounds with similar frequencies.
Spectral masking is now widely exploited to allow data compression of music such as in PASC,
Musicam and ATRAC. See also Adaptive Transform Acoustic Coding, Audiology, Auditory Filters,
Beat Frequencies, Binaural Beats, Binaural Unmasking, Equal Loudness Contours, Habituation,
Lateralization, Monaural Beats, Precedence Effect, Perceptual Audio Coding, Precision Adaptive
Subband Coding (PASC), Sound Pressure Level, Sound Pressure Level Weighting Curves,
Spectral Masking, Temporal Masking, Temporary Threshold Shift, Threshold of Hearing.
Psychoacoustic Model: A model of the human hearing mechanism that relates the perception of different sounds to the actual sounds being played. For example a psychoacoustic model for the phenomenon known as spectral masking has been realized and used to facilitate data compression techniques for the digital compact cassette (DCC), and for the mini-disc
(MD). See also Psychoacoustics, Precision Adaptive Subband Coding (PASC), Spectral Masking,
Temporal Masking, Threshold of Hearing.
Ptolemy: An object oriented framework for discrete event simulation and for the design, testing and simulation of DSP systems. Ptolemy is available from the University of California at Berkeley.
Pulse Amplitude Modulation (PAM): PAM is a term generally used to refer to communication via
a sequence of analog values such as would be needed to send the voltages corresponding to a
sampled but not quantized analog signal. When the set of values the samples can take on is finite,
the term Amplitude Shift Keying (ASK) is usually used to denote this digital modulation technique.
However, PAM is sometimes used interchangeably with ASK. See also Sampling, Amplitude Shift
Keying.
Pulse Code Modulation (PCM): If an analog waveform is sampled at a suitable frequency, then
each sample can be quantized to a value represented by a binary code (often 2’s complement). The
number of bits in the binary code defines the voltage quantization level, and the sampling rate
should be at least twice the maximum frequency component of the signal (the Nyquist rate). See
also Analog to Digital Converter, Digital to Analog Converter. See figure after Pulse Width
Modulation.
Pulse Position Modulation (PPM): If an analog waveform is sampled at a suitable frequency
then the value of each sample can be represented by a single pulse that has a variable position
within the sample period that is proportional to the sample analog value. Signals that are received
in PPM can be converted back to analog by comparing the samples with a sawtooth waveform.
When the pulse is detected, the level of the sawtooth at that time represents the analog value. The
earlier a pulse is detected, the lower the analog value. See figure after Pulse Width Modulation.
Pulse Train: A periodic train of single unit pulses. Pulse trains with a period equalling human voice
pitch are used as excitation in vocoding (voice coding) schemes such as linear predictive coding
(LPC). See Linear Predictive Coding, Square Wave.
Pulse Width Modulation (PWM): PWM is similar to Pulse Position Modulation except that the information is coded as the width of a pulse rather than its position in the symbol period. The pulse width is proportional to the analog value of that sample. The analog signal can be recovered by integrating the pulses.
[Figure: an analog waveform is sampled and quantized to 3 bit values (the samples shown take the levels 4, 1, 5 and 6, i.e. 100, 001, 101 and 110); the same samples are then shown encoded as pulse width modulation (pulse width proportional to the sample value), pulse position modulation (pulse position within the symbol period proportional to the sample value), and pulse code modulation (the binary codes transmitted serially), all plotted against time.]
Pythagorean (Music) Scale: Prior to the existence of the equitempered or Western music scale, a (major) musical key was formed using certain carefully chosen frequency ratios between adjacent notes, rather than the constant tone and semitone ratios of the modern Western music scale. The ancient C-major Pythagorean scale would have had the following frequency ratios:
C-major Scale      C     D     E      F     G     A      B        C
Frequency ratio    1/1   9/8   81/64  4/3   3/2   27/16  243/128  2/1

The frequency ratio gives the ratio of the fundamental frequency of the current note to that of the root note. The above ratios correspond to the Pythagorean music scale.
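As an illustration (not from the text, and assuming a root note of C4 = 261.63 Hz), the frequencies of a Pythagorean C-major scale can be computed directly from the ratios in the table:

```python
from fractions import Fraction

# Frequencies of the Pythagorean C-major scale built from the tabulated ratios.
root_hz = 261.63
ratios = {"C": Fraction(1, 1), "D": Fraction(9, 8), "E": Fraction(81, 64),
          "F": Fraction(4, 3), "G": Fraction(3, 2), "A": Fraction(27, 16),
          "B": Fraction(243, 128), "C'": Fraction(2, 1)}

for note, ratio in ratios.items():
    print(f"{note:2s} {float(root_hz * ratio):7.2f} Hz")
```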
Any note can be used to realise a Pythagorean major key or scale. However using the Pythagorean
scale it is difficult to form other major or minor keys without a complete retuning of the instrument.
Instruments that are tuned and played using the Pythagorean scale will probably sound in some
sense “ancient” as our modern appreciation of music is now firmly based on the equitempered
Western music scale. See also Digital Audio, Just Scale, Music, Music Synthesis, Western Music
Scale.
Q
Q Format: Representing binary numbers in the Q format ensures that all numbers have a
magnitude between -1 and 1. The MSB of a Q15 number is the sign bit with magnitude 1, and the
bits following have bit values of:
2^{-1} = 0.5, 2^{-2} = 0.25, 2^{-3} = 0.125, ..., 2^{-15} = 3.0517578 × 10^{-5}.
The only difference between normal two’s complement (binary point after the LSB) and Q format is
the position of the binary point.
The Q format is used in DSP to ensure that when two numbers are multiplied together their
magnitude will always be less than 1. Therefore fixed point DSP processors can perform arithmetic
without overflow.
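As an illustration (not from the text), conversion to and from the Q15 format, and the rescaling needed after a Q15 × Q15 multiplication, can be sketched as follows:

```python
# Convert between a float in [-1, 1) and its Q15 (16 bit two's complement) value.
def float_to_q15(x):
    return max(-32768, min(32767, int(round(x * 32768))))

def q15_to_float(q):
    return q / 32768.0

q = float_to_q15(0.125)
print(q, q15_to_float(q))            # 4096  0.125

# A Q15 x Q15 product is a Q30 result; shifting right by 15 returns it to Q15.
a, b = float_to_q15(0.5), float_to_q15(-0.25)
print(q15_to_float((a * b) >> 15))   # approximately -0.125
```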
QR: See Matrix Decompositions - QR.
QR Algorithm: A linear technique that implicitly forms an orthogonal matrix Q to transform a matrix
A into an upper triangular matrix R, i.e. A = QR. The QR algorithm is numerically stable and can be
used for solving linear sets of equations in a variety of DSP applications from speech recognition to
beamforming. The algorithm is, however, very computationally expensive and not used very often
for real time DSP. See Matrix Decompositions - QR.
Quad: A prefix to mean “four of”. For example the Burr Brown DAC4814 chip is described as a
Quad 12 Bit Digital to Analog Converter (DAC) meaning that the chip has four separate (or
independent) DACs. See also Dual.
Quadraphonic (or Quadrophonic): Using four independent channels for the reproduction of hi-fidelity music. Quadraphonic systems were first introduced in the 1970s as an enhancement to the stereophonic system; however, their success was limited. In the 1990s surround sound systems such as Dolby Prologic use four or more channels to encode the sound with a 3-dimensional effect. Quadraphonic systems are now rarely implemented and the term is rarely used. Note that a system which simply uses four
loudspeakers (two left channels and two right channels) is not quadraphonic. See also
Stereophonic, Surround Sound, Dolby Prologic.
Quadratic Equation: A polynomial equation is a quadratic equation if it has the form ax² + bx + c = 0, where x is a variable, and a, b, and c are constants. Note that the quantity x may be a vector, and a, b, and c are appropriately dimensioned vectors and matrices. For example in calculating the Wiener-Hopf solution the following equation must be solved:

$$\mathbf{x}^T\mathbf{R}\mathbf{x} + \mathbf{p}^T\mathbf{x} + c = 0 \qquad (467)$$

where x is an n × 1 vector, R is an n × n matrix, p is an n × 1 vector and c is a scalar constant.
Quadratic Formula: Given a quadratic polynomial ax² + bx + c = 0, the roots of this polynomial can be calculated from:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad (468)$$
such that:

$$\left(x + \frac{b}{2a} + \frac{\sqrt{b^2 - 4ac}}{2a}\right)\left(x + \frac{b}{2a} - \frac{\sqrt{b^2 - 4ac}}{2a}\right) = x^2 + \frac{b}{a}x + \frac{c}{a} \qquad (469)$$

Geometrically, the roots of a polynomial are where a graph of y = ax² + bx + c (which is parabolic in shape) cuts the x-axis.

[Figure: the parabola y = x² − x − 2 plotted for x from −4 to 4, cutting the x-axis at its roots x = −1 and x = 2.]
Note that if the graph does not cut the x-axis, then b² − 4ac is negative, the quantity √(b² − 4ac) is an imaginary number (the square root of a negative number), and the roots are then complex numbers. See also Complex Roots, Poles, Polynomial, Zeroes.
Quadratic Surface: See Hyperparaboloid.
Quadrature: This term is used in reference to the four quadrants defined in two dimensions.
Quadrature representations are particularly useful in communications because the cosine and sine
components of a single frequency can be thought of as the two axes in the complex plane. By
representing signals via in-phase (cosine) and quadrature (sine) components, all of the tools of
complex number analysis are available to simplify the analysis and design of digital signal sets.
Quadrature Amplitude Modulation (QAM): When both the amplitude and the phase of a
quadrature (two dimensional) signal set are varied to encode the information bits in a digital
communication system, the modulation technique is often referred to as QAM. Common examples
are rectangular signal sets defined on a two-dimensional Cartesian lattice, such as 16 QAM (4 bits
per symbol), 32 QAM (5 bits per symbol), and 64 QAM (6 bits per symbol). QAM modulation
techniques are used for many modem communication standards. See also V-Series
Recommendations, Amplitude Shift Keying, Phase Shift Keying.
Quadrature Mirror Filters (QMF): A type of digital filter which has special properties making it
suitable for sub-band coding filters.
Quadrature Phase Shift Keying (QPSK): QPSK is a common digital modulation (phase shift
keying) technique that uses four signals (symbols) that have equal amplitude and are successively
shifted by 90 degrees in phase. See also Phase Shift Keying, Quadrature.
Quantization: Converting from a continuous value into a series of discrete levels. For example, a
real value can be quantized to its nearest integer value (rounding) and the resulting error is referred
to as the quantization error. The quantization error therefore reflects the accuracy of an ADC.
Quantization introduces an irreversible distortion on an analogue signal.
[Figure: the staircase input-output characteristic of a quantizer, binary output against analog input, with step size (quantization level) q.]
Quantizers are found somewhere at the heart of every lossy compression algorithm. In JPEG, for
example, the quantizer appears when the DCT coefficients for an image block are quantized. See
also Analog to Digital Converter, A-law Compander, Sample and Hold.
Quantization Error: The difference between the true value of a signal and the discrete value from
the A/D at that particular sampling instant. If the quantization level is q volts, then the maximum
error at each sample is q/2 volts. If an analog value x is to be quantized it is convenient to represent the quantized value as the sum of the true analog value and a quantization error component, e, i.e. x̂ = x + e, where x̂ is the quantized value of x. See also Rounding Noise, Truncation Noise.
Quantization Noise: Assuming that an ADC rounds to the nearest digital level, the maximum quantisation error of any one sample is q/2 volts (see Quantization Error). If we assume that any error value between +q/2 and -q/2 is equally likely, then the probability density function for the error is flat.
[Figure: the probability density function p(e) of the quantization error: uniform with height 1/q between −q/2 and q/2.]
Therefore, treating the error as white noise, we can calculate the noise power of the error as:

$$n_{adc} = \frac{1}{q}\int_{-q/2}^{q/2} e^2\,de = \frac{q^2}{12} \qquad (470)$$
The quantisation noise will extend over the frequency range 0 to fs/2, i.e. the full baseband.
[Figure: signal power against frequency from 0 to fs/2, showing the signal spectrum Y(f) above the flat quantisation noise floor E(f).] Low level signals may be masked by the quantisation noise. Although it is assumed that the quantisation noise is uncorrelated with the signal, in practice for periodic signals this is not strictly true, and therefore the flat white spectrum is not strictly true.
For an N-bit signal, there are 2^N levels from the maximum to the minimum value of the quantiser:
[Figure: the input-output characteristic of an N bit quantiser spanning analog inputs from −1 to +1 and binary outputs from −2^{N−1} to 2^{N−1} − 1, with quantization step size q = 2/2^N.]
Therefore the quantisation noise power, expressed in dB, can be calculated as:

$$Q_N = 10\log\left(\frac{(2/2^N)^2}{12}\right) = 10\log 2^{-2N} + 10\log\frac{4}{12} \approx -6.02N - 4.77 \text{ dB} \qquad (471)$$
Another useful measurement is the signal to quantisation noise ratio (SQNR). For the above ADC
with voltage input levels between -1 and +1 volts, if the input signal is the maximum possible, i.e. a
sine wave of amplitude 1 volt, then the average input signal power is:
$$\text{Signal Power} = E[(\sin 2\pi f t)^2] = \frac{1}{2} \qquad (472)$$
Therefore the maximum SQNR is:
Signal Power
0.5
3
SQNR = 10 log ----------------------------------- = 10 log --------------------------- = 10 log 2 – 2 N + 10 log --Noise Power
2
( 2 ⁄ 2 N ) 2
 -------------------- 12 
(473)
= 6.02N + 1.76 dB
For a perfect 16 bit ADC the maximum signal to quantisation noise ratio can be calculated to be 98.08 dB. See also A-law Compander, Signal to Noise Ratio.
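A quick numerical check of Eq. 473 (an illustration, not from the text) is to drive a simulated N bit quantiser with a full scale sine wave and measure the ratio of signal power to quantisation noise power:

```python
import math

# Simulate a 16 bit rounding quantiser over a +/-1 volt range and measure the SQNR.
N, samples = 16, 100_000
q = 2.0 / (2 ** N)                                   # quantisation step size

signal_power, noise_power = 0.0, 0.0
for k in range(samples):
    x = math.sin(2 * math.pi * 1000.0 * k / 44100.0)
    e = round(x / q) * q - x                         # rounding quantisation error
    signal_power += x * x
    noise_power += e * e

print(10 * math.log10(signal_power / noise_power))   # close to 6.02*16 + 1.76 = 98.08 dB
```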
Quantisation Noise, Reduction by Oversampling: Oversampling can be used to increase the
resolution of an ADC or DAC. If an ADC has a step size of q volts (see Quantisation Error) and a
Nyquist sampling rate of f n , then the maximum error, e ( n ) , of a quantised sample is between
– q ⁄ 2 and q ⁄ 2 . Therefore if the true sample value is x ( n ) , then the quantised sample, y ( n ) , is:
y( n ) = x( n ) + e( n )
(474)
If we assume that the quantisation error is equally likely to take any value in this range (i.e. it is
white), then we can assume that the probability density function for the noise signal is uniform.
Therefore the average quantisation noise power in the range 0 to f n ⁄ 2 can be calculated as the
average squared value of e :
$$Q_N = \frac{1}{q}\int_{-q/2}^{q/2} e^2\,de = \frac{1}{q}\left[\frac{e^3}{3}\right]_{-q/2}^{q/2} = \frac{q^2}{12} \qquad (475)$$
The same answer could be obtained from the time average:
$$Q_N \approx \frac{1}{M}\sum_{m=0}^{M-1} e^2(m) = \frac{q^2}{12} \qquad (476)$$
In order to appreciate that the quantisation noise does not decrease, note that the same
approximate answer is obtained for a signal that is oversampled by R times:
$$Q_N \approx \frac{1}{MR}\sum_{r=0}^{MR-1} e^2(r) = \frac{q^2}{12} \qquad (477)$$
For an oversampled system sampling at f ovs and using the same converter, the total quantisation
noise power will of course be the same but because it is white (a flat spectrum) it is now spread over
the range 0 to f ovs ⁄ 2 . Evaluating Eqs. 475 or 476 for different sampling rates will give the same
answer. The actual noise power in the baseband, Q ovs , is now given as:
$$Q_{ovs} = \frac{q^2}{12}\cdot\frac{(f_n/2)}{(f_{ovs}/2)} \qquad (478)$$
(Note that for the more common periodic and aperiodic signals, the quantisation noise spectra is
not “white”; however for a “noisy” stochastic input signal the white quantisation noise assumption is
“reasonably” valid). From Eq. 478, in order to increase the baseband signal to quantisation noise
ratio we can either increase the number of bits in the ADC or increase the sampling rate above the
Nyquist rate. By increasing the sampling rate, the total quantisation noise power does not increase,
and as a result the in-band quantisation noise power will decrease.
As an example, oversampling a signal by a factor of 4 times the Nyquist rate reduces the in-band quantisation noise to 1/4 of its former value:
[Figure: signal power against frequency, showing the baseband signal of interest up to f_n/2 and the total quantisation noise Q_N spread over 0 to f_ovs/2; the portion of the noise falling in the baseband is Q_ovs = (1/4)Q_N.] When a signal is oversampled the total level of quantisation noise does not change. Therefore for every increase in sampling rate above Nyquist the baseband quantisation noise power will reduce.
This level of baseband noise power is equivalent to an ADC with step size q/2:
$$Q_{ovs} = \frac{q^2(f_n/2)}{12(4f_n/2)} = \frac{(q/2)^2}{12} = \frac{Q_N}{4} \qquad (479)$$
and hence the baseband signal resolution has been increased by 1 bit, since for each extra bit of resolution the signal to quantisation noise ratio improves by 20 log 2 = 6.02 dB. In theory, therefore, if a single bit ADC were used and oversampled by a factor of 4^15 (≈ 10^9 × f_n) then a 16 bit resolution signal could be realized! Clearly this sampling rate is not practically realisable. However, on a more pragmatic level, if a well trimmed, low noise floor 8 bit ADC was used to oversample a signal by a factor of 16 times the Nyquist rate, then when using a digital low pass filter to decimate the signal to the Nyquist rate, approximately 10 bits of resolution could be obtained. Single bit oversampling ADCs can however still be achieved using quantisation noise shaping strategies within sigma delta converters (see Sigma Delta).
To illustrate increasing the signal resolution by oversampling, the figure below shows the result of a simulation quantising a high resolution floating point white noise digital signal in the amplitude range -1 to +1 to 4 bits (i.e. 16 levels in the range -1 to +1) using a digital quantiser to simulate an ADC. The bandwidth of interest is 0-5000 Hz, and hence the Nyquist rate is f_n = 10000 Hz; oversampling at 16 times this rate gives f_ovs = 160000 Hz and should yield two "extra bits" of resolution. The quantisation noise for the Nyquist rate and oversampled rate quantisers (ADCs) then reveals the expected 12 dB advantage from the oversampling strategy.
[Figure: simulation block diagram and resulting spectra. A white noise signal band limited to 0-80000 Hz is quantised to 4 bits at the 10000 Hz Nyquist rate and, separately, at the 160000 Hz oversampled rate; in each path the quantisation noise is extracted by subtracting the unquantised signal and low pass filtering to the 0-5000 Hz baseband (with decimation by 16 in the oversampled path). The plotted magnitude spectra (dB against frequency, 0 to 5000 Hz) show the oversampled quantisation noise floor about 12 dB below the Nyquist rate quantisation noise.]
Quantising a real valued (floating point) signal of baseband 0-5000 Hz to 4 bits. Note that the oversampling procedure produces a level of in-band quantisation noise that is 12 dB below that of the Nyquist rate quantiser. The magnitude spectra were produced from a 1024 point FFT of the quantisation noise, and smoothed by a window of length 8. The input white noise signal was 16384 samples.
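The 12 dB figure can be reproduced with a much simpler sketch (an illustration, not the simulation described above): quantise a white signal at a high rate and compare the quantisation noise power falling in the lowest 1/16 of the band with the total:

```python
import numpy as np

# Compare in-band and total quantisation noise power for 16x oversampled 4 bit quantisation.
rng = np.random.default_rng(3)
q = 2.0 / 2 ** 4                                    # 4 bit step size over -1..+1
R = 16                                              # oversampling ratio

x = rng.uniform(-1, 1, 1 << 18)                     # high resolution test signal
e = np.round(x / q) * q - x                         # quantisation error at the high rate

E = np.fft.rfft(e)
inband = np.sum(np.abs(E[: len(E) // R]) ** 2)      # noise in the lowest 1/R of the band
total = np.sum(np.abs(E) ** 2)
print(10 * np.log10(total / inband))                # close to 10*log10(16) = 12 dB
```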
See also Decimation, Interpolation, Oversampling, Quantization, Sigma Delta Converter.
Quarter Common Intermediate Format (QCIF): The QCIF image format is 144 lines by 176 pixels/line of luminance and 72 lines by 88 pixels/line of chrominance information, and is used in the ITU-T H.261 digital video recommendation. A full size version of QCIF, called CIF (common intermediate format), is also defined in H.261. The choice between CIF or QCIF depends on available channel capacity and
desired quality. See also Common Intermediate Format, H-series Recommendations, International
Telecommunication Union.
Quicksilver: A versatile, if difficult to find, software package.
Quicktime: A proprietary algorithm for video compression using very low levels of processing to allow real time implementation in software on Macintosh computers [79]. Quicktime does not achieve the picture quality of techniques such as MPEG1. See also MPEG1.
R
Ramp Waveform (Continuous and Discrete Time): The continuous ramp waveform can be defined as:

$$\mathrm{ramp}((t - t_0)/\tau) = \begin{cases} \dfrac{t - t_0}{\tau}, & 0 \leq (t - t_0) < \tau \\ 0, & \text{otherwise} \end{cases} \qquad \text{(continuous time)} \qquad (480)$$
[Figure: the continuous ramp waveform r(t) = ramp((t − t₀)/τ), rising linearly from 0 at t = t₀ to 1 at t = t₀ + τ and zero elsewhere.]
The discrete time ramp waveform can be defined as:

$$\mathrm{ramp}((k - k_0)/\kappa) = \begin{cases} \dfrac{k - k_0}{\kappa}, & 0 \leq (k - k_0) < \kappa \\ 0, & \text{otherwise} \end{cases} \qquad \text{(discrete time)} \qquad (481)$$
[Figure: the discrete time ramp waveform g(k) = ramp((k − k₀)/κ), rising from 0 at k = k₀ to 1 at k = k₀ + κ and zero elsewhere.]
See also Elementary Signals, Rectangular Pulse, Sawtooth Waveform, Square Wave, Triangular
Pulse, Unit Impulse Function, Unit Step Function.
Random Access Memory (RAM): Digital memory which can be used to read or write binary data
to. RAM memory is usually volatile, meaning that it loses information when the power is switched
off. Non-volatile RAM is available. See also Non-Volatile, Static RAM, Dynamic RAM.
Random Variable: A random variable is a real valued function which is defined based on the
outcomes of a probabilistic system. For example a die can be used to create a signal based on the
random variable of the die outcome. The probabilistic event is the shaking of the die where each
independent event is denoted by k, and there are 6 equally likely outcomes. A particular random
variable x ( k ) can be defined by the following table:
Die Event    Random Variable x(.)    p(x(.))
1            -15                     1/6
2            -10                     1/6
3            -5                      1/6
4            +5                      1/6
5            +10                     1/6
6            +25                     1/6
Table 1
and the random signal x ( k ) turns out to be:
[Figure: an example realisation of the random signal x(k) plotted against k, taking values from the set {−15, −10, −5, +5, +10, +25}.]
The time average of the signal x(k), denoted as x̄, can be calculated as:

$$\bar{x} = \lim_{N\to\infty}\frac{1}{N}\sum_{k=0}^{N} x(k) = 1.6666\ldots \qquad (482)$$
The statistical mean, denoted as E[x(k)], where E[.] is the expectation operator, can be calculated as:

$$E[x(k)] = \sum_x p(x)\,x = \left(25\cdot\tfrac{1}{6}\right) + \left(10\cdot\tfrac{1}{6}\right) + \left(5\cdot\tfrac{1}{6}\right) - \left(5\cdot\tfrac{1}{6}\right) - \left(10\cdot\tfrac{1}{6}\right) - \left(15\cdot\tfrac{1}{6}\right) = 1.6666\ldots \qquad (483)$$

where the sum is taken over all values of x.
The time average mean squared value, denoted as $\overline{x^2}$, can be calculated as:

$$\overline{x^2} = \lim_{N\to\infty}\frac{1}{N}\sum_{k=0}^{N} x^2(k) = 183.333\ldots \qquad (484)$$
The statistical average squared value, denoted as E[x²(k)], can be calculated from:

$$E[x^2(k)] = \sum_x p(x)\,x^2 = \left(625\cdot\tfrac{1}{6}\right) + \left(100\cdot\tfrac{1}{6}\right) + \left(25\cdot\tfrac{1}{6}\right) + \left(25\cdot\tfrac{1}{6}\right) + \left(100\cdot\tfrac{1}{6}\right) + \left(225\cdot\tfrac{1}{6}\right) = 183.333\ldots \qquad (485)$$
If the random process generating x(k) is ergodic, then the statistical averages equal the time averages, i.e. x̄ = E[x(k)] and $\overline{x^2}$ = E[x²(k)].
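As an illustration (not from the text), a long simulated realisation of this die random variable shows the time averages approaching the statistical averages:

```python
import random

# Time averages of the die random variable defined in Table 1 above.
random.seed(4)
values = {1: -15, 2: -10, 3: -5, 4: +5, 5: +10, 6: +25}
x = [values[random.randint(1, 6)] for _ in range(200_000)]

print(sum(x) / len(x))                       # close to E[x] = 1.666...
print(sum(v * v for v in x) / len(x))        # close to E[x^2] = 183.333...
```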
For a particular random variable, x(k), a cumulative distribution function can be specified, where:

$$F(a) = P(x(k) \leq a) = \lim_{n\to\infty}\left(\frac{\text{no. of values of } x(k) \leq a}{\text{total no. of values, } n}\right) \qquad (486)$$

i.e. F(a) specifies the probability that the value x(k) is less than or equal to a. Therefore for the above random variable, x(k), the cumulative distribution function is:
[Figure: the cumulative distribution function F(a) for the die random variable: a staircase rising in steps of 1/6 at a = −15, −10, −5, +5, +10, +25, from 0 to 1.]
The probability density function (PDF) is defined as:
$$p(x) = \left.\frac{dF(a)}{da}\right|_{a=x} = \left.\frac{dP(x \leq a)}{da}\right|_{a=x} \qquad (487)$$
where the "(k)" has been dropped for notational convenience. The PDF for the random variable x(k) produced by the probabilistic events of a die shake is therefore:

[Figure: the PDF p(x) of the die random variable: dirac delta functions (arrows) of weight 1/6 located at x = −15, −10, −5, +5, +10, +25.]

where the arrows represent dirac-delta functions located at the discrete values of the random variable. Therefore the total area under the graph p(x) is 1.

The above distributions are discrete, in that the random variable can only take on specific values and therefore the distribution function increases in steps, and the PDF consists of dirac delta functions. There also exist continuous distributions where the random variable can take on any real
number within some range. For example, consider a continuously distributed random variable
which denotes the exact voltage measured from a 9 volt battery. By measuring the voltage of a large
number of batteries, a random variable y ( . ) denoting the battery voltages can be produced. For a
particular batch of a few thousand batteries the distribution function and PDF obtained are:
[Figure: left, the cumulative distribution function F(a) of the battery voltage, rising from 0 to 1 between about 1 and 11 volts; right, the corresponding probability density function p(y) with total area 1, with the area of 0.14 between 6 and 7 volts shaded.]
If, for example, it is required to calculate the probability of a battery having a voltage between 6 and
7 volts, then the area under the PDF between y values of 6 and 7 can be calculated, or the
appropriate values of the distribution function subtracted:
$$P(6 < y \leq 7) = \int_6^7 p(y)\,dy = F(7) - F(6) = 0.14 \qquad (488)$$
In DSP, signals with both discrete and continuous distributions are found. For example thermal noise is a continuously distributed signal, whereas the sequence of character symbols typically sent by a modem has a discrete distribution.
Some important discrete distributions in DSP are:
• Binomial;
• Poisson;
Some important continuous probability density functions in DSP are:
• Gaussian:

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad \text{Mean } E[x] = \mu, \qquad \text{Variance } E[(x-\mu)^2] = \sigma^2$$

[Figure: the bell shaped Gaussian PDF, centred on the mean µ, with peak value 1/(√(2π)σ) and spread set by σ.]
• Uniform:

$$p(x) = \begin{cases} 0, & x < \mu - A/2 \\ 1/A, & |x - \mu| \leq A/2 \\ 0, & x > \mu + A/2 \end{cases} \qquad \text{Mean } E[x] = \mu, \qquad \text{Variance } E[(x-\mu)^2] = \frac{A^2}{12}$$

[Figure: the uniform PDF of height 1/A over an interval of width A centred on µ.]
The n-th moment of a PDF taken about the point x = x₀ is:

$$E[(x - x_0)^n] = \int_{-\infty}^{\infty} (x - x_0)^n\,p(x)\,dx \qquad (489)$$
The second order moment around the mean, E [ ( x – E [ x ] ) 2 ] is called the variance or the second
central moment.
See also Ergodic, Expected Value, Mean Value, Mean Squared Value, Probability, Variance, Wide
Sense Stationarity.
Range of Matrix: See Matrix Properties - Range.
Rank of Matrix: See Matrix Properties - Rank.
Rate Converter: Usually referring to the change of the sampling rate of a signal. See Decimation,
Downsampling, Fractional Sampling Rate Converter, Interpolation, Upsampling.
RBDS: An FM data transmission standard that allows radio stations to send traffic bulletins,
weather reports, song titles or other information to a display on RBDS compatible radios. Radios
will therefore be able to scan for a particular type of music. For emergency broadcasting an RBDS
signal can automatically turn on a radio, turn up the radio volume and issue an emergency alert.
RC Circuit: The very simplest form of analog low pass or high pass filter used in DSP systems.
The 3dB point is at f 3dB = 1 ⁄ ( 2πRC ) . An RC circuit is only suitable as a (low pass) anti-alias filter
when the sampling frequency is considerably higher than the highest frequency present in the input
signal; this is usually only the case for oversampled DSP systems where the anti-alias process is
primarily performed digitally. The roll-off for a simple low pass RC circuit is 6dB/octave, or 20dB/
decade when plotted on a logarithmic frequency scale.
An RC circuit can also be used as a differentiator noting that the current through a capacitor is
limited by the rate of change of the voltage across the capacitor:
$$i = C\,\frac{dV}{dt} \qquad (490)$$
See also 3dB point, Decade, Differentiator, Logarithmic Frequency, Octave, Oversampling, Roll-off,
Sigma Delta.
[Figure: Low Pass RC Filter, series R feeding shunt C, with magnitude response $\frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{1+4\pi^2R^2f^2C^2}} = \frac{1}{\sqrt{1+(f/f_{3dB})^2}}$, plotted on a linear frequency axis from 0 to 5f3dB and as 20log10(Vout/Vin) in dB from 0.1f3dB to 1000f3dB on a logarithmic frequency axis.]
[Figure: High Pass RC Filter, series C feeding shunt R, with magnitude response $\frac{V_{out}}{V_{in}} = \frac{2\pi fRC}{\sqrt{1+4\pi^2R^2f^2C^2}} = \frac{f/f_{3dB}}{\sqrt{1+(f/f_{3dB})^2}}$, plotted on a linear frequency axis from 0 to 5f3dB and as 20log10(Vout/Vin) in dB from 0.001f3dB to 10f3dB on a logarithmic frequency axis.]
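A short numerical sketch of the low pass response quoted above; the component values are illustrative only and are not taken from the text:

```python
import math

R = 10e3          # ohms (example value)
C = 15.9e-9       # farads, giving f3dB close to 1 kHz

f3dB = 1.0 / (2.0 * math.pi * R * C)

def lowpass_gain(f):
    """Magnitude response |Vout/Vin| = 1/sqrt(1 + (f/f3dB)^2) of the RC low pass circuit."""
    return 1.0 / math.sqrt(1.0 + (f / f3dB) ** 2)

# At f3dB the gain is 1/sqrt(2), i.e. -3 dB; a decade above it has fallen by roughly 20 dB.
for f in (f3dB, 10 * f3dB):
    print(f, 20 * math.log10(lowpass_gain(f)))
```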
Read Only Memory (ROM): Digital memory to which data cannot be written. ROM also retains
information even when the power is switched off.
Reasoning, Circular: See Circular Reasoning.
Real Exponential: See Exponential, Complex Exponential.
Real Time Processing: Real time is the expression used to indicate that a signal must be
processed and output again without any noticeable delay. For example, consider speech being
sensed by a microphone before being sampled by a DSP system. Suppose it is required to filter out
the low frequencies of the speech before sending the data down a telephone line. The filtering must be done in real time, otherwise new samples of data will arrive before the DSP system has finished its calculations on the previous ones! Systems that do not operate in real time are often
referred to as off-line. See also Off-Line Processing.
Reciprocal Polynomial: Consider the polynomial:

$$H(z) = a_0 + a_1 z^{-1} + \ldots + a_{N-1} z^{-N+1} + a_N z^{-N} \qquad (491)$$

The reciprocal polynomial is given by:

$$H_r(z) = a_N^* + a_{N-1}^* z^{-1} + \ldots + a_1^* z^{-N+1} + a_0^* z^{-N} \qquad (492)$$

where $a_i^*$ is the complex conjugate of $a_i$. The polynomials are so called because the reciprocals of the zeroes of H(z) are the zeroes of H_r(z). If H(z) factorises to:

$$H(z) = (1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1})\ldots(1 - \alpha_{N-1} z^{-1})(1 - \alpha_N z^{-1}) \qquad (493)$$
then the zeroes of the order reversed polynomial are $\alpha_1^{-1}, \alpha_2^{-1}, \ldots, \alpha_{N-1}^{-1}, \alpha_N^{-1}$, which can be seen from:

$$\begin{aligned} H_r(z) &= z^{-N} H(z^{-1}) \\ &= z^{-N}(1 - \alpha_1 z)(1 - \alpha_2 z)\ldots(1 - \alpha_{N-1} z)(1 - \alpha_N z) \\ &= (z^{-1} - \alpha_1)(z^{-1} - \alpha_2)\ldots(z^{-1} - \alpha_{N-1})(z^{-1} - \alpha_N) \\ &= (-1)^N \alpha_1\alpha_2\ldots\alpha_{N-1}\alpha_N\,(1 - \alpha_1^{-1} z^{-1})(1 - \alpha_2^{-1} z^{-1})\ldots(1 - \alpha_{N-1}^{-1} z^{-1})(1 - \alpha_N^{-1} z^{-1}) \end{aligned} \qquad (494)$$
Reciprocal polynomials are of particular relevance to the design of all pass filters. See All-pass
Filter, Finite Impulse Response, Order Reversed Filter.
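The reciprocal-zero property can be checked numerically. The sketch below uses illustrative real-valued zeros, so the conjugation in Eq. 492 plays no role, and builds the coefficient vectors by polynomial multiplication:

```python
import numpy as np

# Illustrative zeros of H(z).
zeros = np.array([0.5, -0.8, 1.6])

# Coefficients a_0 ... a_N of H(z) = prod(1 - alpha_i z^-1), in powers of z^-1.
h = np.array([1.0])
for a in zeros:
    h = np.convolve(h, [1.0, -a])

# Reciprocal polynomial: conjugate and reverse the coefficient order.
hr = np.conj(h[::-1])

print(np.roots(h))    # zeros of H(z): 0.5, -0.8, 1.6 (in some order)
print(np.roots(hr))   # zeros of Hr(z): their reciprocals 2.0, -1.25, 0.625
```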
Reconstruction Filter: The analog filter at the output of a DAC to remove the high frequencies
present in the signal (in the form of the steps between the discrete levels of signal).
[Figure: the "steppy" output voltage y(k) from the Digital to Analog Converter passes through the analog reconstruction filter, which smooths out the high frequency steps. Magnitude spectra of the aliased signal after the DAC (images around fs/2, fs, 3fs/2); of the reconstruction filter (cutting off at fs/2); and of the reconstructed analog signal.]
Rectangular Matrix: See Matrix Structured - Rectangular.
Rectangular Pulse (Continuous and Discrete Time): The continuous time rectangular pulse
can be defined as:
$$\mathrm{rect}((t - t_0)/\tau) = \begin{cases} 1 & \text{if } |t - t_0| < \tau/2 \\ 0 & \text{otherwise} \end{cases} \qquad \text{continuous time} \qquad (495)$$

[Figure: The continuous rectangular pulse p(t) = rect((t - t0)/τ), of unit height between t0 - τ/2 and t0 + τ/2.]
The discrete time rectangular pulse can be defined as:

$$\mathrm{rect}((k - k_0)/\kappa) = \begin{cases} 1 & \text{if } |k - k_0| < \kappa/2 \\ 0 & \text{otherwise} \end{cases} \qquad \text{discrete time} \qquad (496)$$

[Figure: The discrete time rectangular pulse q(k) = rect((k - k0)/κ), of unit height between k0 - κ/2 and k0 + κ/2.]
A rectangular pulse can also be generated by the addition of unit step functions. The unit step function is defined as:

$$u(k - k_0) = \begin{cases} 0 & \text{if } k < k_0 \\ 1 & \text{if } k \ge k_0 \end{cases} \qquad \text{discrete time} \qquad (497)$$

[Figure: x(k) = rect((k - 9)/7) = u(k - 4) - u(k - 10), a pulse of unity samples plotted for k = 0 to 12.]
A rectangular pulse train, or square wave can be produced by distributing a rectangular pulse in a
non-overlapping fashion. See also Elementary Signals, Square Wave, Triangular Pulse, Unit Step
Function.
Rectangular Pulse Train: See Square Wave.
Recursive LMS: See Least Mean Squares IIR Algorithms.
Red Book: The specifications for the compact disc (CD) digital audio format were jointly specified
by Sony and Philips and are documented in what is known as the Red Book. The standards for CD
are also documented in the IEC (International Electrotechnical Commission) standard BNN15-83095, and IEC-958 and IEC-908 .
Reed Solomon Coding: See Cross Interleaved Reed Solomon Coding.
Recruitment: See Loudness Recruitment.
Recursive Least Squares (RLS): The RLS algorithm can also be used to update the weights of
an adaptive filter where the aim is to minimize the sum of the squared error signal. Consider the
adaptive FIR digital filter which is to be updated using an RLS algorithm such that as new data
arrives the RLS algorithm uses this new data (innovation) to improve the least squares solution:
[Figure: adaptive FIR filter w(k) filters the input signal x(k) to produce the output y(k); the error signal e(k) is formed by subtracting y(k) from the desired signal d(k), and the RLS adaptive algorithm updates the weights, w_{k+1} = w_k + e(k) f{d(k), x(k)}.]

For least squares adaptive signal processing the aim is to adapt the impulse response of the FIR digital filter such that the input signal x(k) is filtered to produce y(k) which, when subtracted from the desired signal d(k), minimises the sum of the squared error signal e(k) over time from the start of the signal at 0 (zero) to the current time k.
Note: While the above figure is reminiscent of the Least Mean Squares (LMS) adaptive filter, the
distinction between the two approaches is quite important: LMS minimizes the mean of the square
of the output error, while RLS minimizes the actual sum of the squared output errors.
In order to minimize the error signal, e ( k ) , consider minimizing the total sum of squared errors for
all input signals up to and including time, k. The total squared error, v ( k ) , is:
$$v(k) = \sum_{s=0}^{k} [e(s)]^2 = e^2(0) + e^2(1) + e^2(2) + \ldots + e^2(k) \qquad (498)$$

Using vector notation, the error signal can be expressed in a vector format and therefore:
$$\mathbf{e}_k = \begin{bmatrix} e(0) \\ e(1) \\ e(2) \\ \vdots \\ e(k-1) \\ e(k) \end{bmatrix} = \begin{bmatrix} d(0) \\ d(1) \\ d(2) \\ \vdots \\ d(k-1) \\ d(k) \end{bmatrix} - \begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \\ y(k-1) \\ y(k) \end{bmatrix} = \mathbf{d}_k - \mathbf{y}_k \qquad (499)$$
Noting that the output of the N weight adaptive FIR digital filter is given by:

$$y(k) = \sum_{n=0}^{N-1} w_n x(k-n) = \mathbf{w}^T \mathbf{x}_k = \mathbf{x}_k^T \mathbf{w} \qquad (500)$$

where,

$$\mathbf{w} = [w_0, w_1, w_2, \ldots, w_{N-1}]^T \qquad (501)$$

and

$$\mathbf{x}_k = [x(k), x(k-1), x(k-2), \ldots, x(k-N+1)]^T \qquad (502)$$
then Eq. 499 can be rearranged to give:

$$\mathbf{e}_k = \begin{bmatrix} e(0) \\ e(1) \\ e(2) \\ \vdots \\ e(k-1) \\ e(k) \end{bmatrix} = \mathbf{d}_k - \begin{bmatrix} \mathbf{x}_0^T \mathbf{w} \\ \mathbf{x}_1^T \mathbf{w} \\ \mathbf{x}_2^T \mathbf{w} \\ \vdots \\ \mathbf{x}_{k-1}^T \mathbf{w} \\ \mathbf{x}_k^T \mathbf{w} \end{bmatrix} = \mathbf{d}_k - \begin{bmatrix} \mathbf{x}_0^T \\ \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_{k-1}^T \\ \mathbf{x}_k^T \end{bmatrix}\mathbf{w} = \mathbf{d}_k - \begin{bmatrix} x(0) & 0 & 0 & \ldots & 0 \\ x(1) & x(0) & 0 & \ldots & 0 \\ x(2) & x(1) & x(0) & \ldots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ x(k-1) & x(k-2) & x(k-3) & \ldots & x(k-N) \\ x(k) & x(k-1) & x(k-2) & \ldots & x(k-N+1) \end{bmatrix}\begin{bmatrix} w_0 \\ w_1 \\ w_2 \\ \vdots \\ w_{N-1} \end{bmatrix} \qquad (503)$$

i.e.

$$\mathbf{e}_k = \mathbf{d}_k - X_k\mathbf{w}$$
where X k is a ( k + 1 ) × N data matrix made up from input signal samples. Note that the first N rows
of X k are sparse. Equation 498 can be rewritten such that:
$$v(k) = \mathbf{e}_k^T \mathbf{e}_k = \|\mathbf{e}_k\|_2^2 = [\mathbf{d}_k - X_k\mathbf{w}]^T[\mathbf{d}_k - X_k\mathbf{w}] = \mathbf{d}_k^T\mathbf{d}_k + \mathbf{w}^T X_k^T X_k \mathbf{w} - 2\mathbf{d}_k^T X_k \mathbf{w} \qquad (504)$$
where $\|\mathbf{e}_k\|_2$ is the 2-norm of the vector e_k. At first glance at the last line of Eq. 503 it may seem that a viable solution is to set e_k = 0 and then simply solve the equation w = X_k^{-1} d_k. However this is of course not possible in general, as X_k is not a square matrix and therefore not invertible.
In order to find a “good” solution such that the 2-norm of the error vector, e k , is minimized, note that
Eq. 504 is quadratic in the vector w , and the function v ( k ) is an up-facing hyperparaboloid when
plotted in N+1 dimensional space, and there exists exactly one minimum point at the bottom of the
hyperparaboloid where the gradient vector is zero, i.e.,
$$\frac{\partial v(k)}{\partial \mathbf{w}} = \mathbf{0} \qquad (505)$$

From Eq. 504:

$$\frac{\partial v(k)}{\partial \mathbf{w}} = 2X_k^T X_k \mathbf{w} - 2X_k^T \mathbf{d}_k = -2X_k^T[\mathbf{d}_k - X_k\mathbf{w}] \qquad (506)$$

and therefore:

$$-2X_k^T[\mathbf{d}_k - X_k\mathbf{w}_{LS}] = \mathbf{0} \quad\Rightarrow\quad X_k^T X_k \mathbf{w}_{LS} = X_k^T \mathbf{d}_k \qquad (507)$$
and the least squares solution, denoted as w LS and based on data received up to and including
time, k, is given as:
$$\mathbf{w}_{LS} = [X_k^T X_k]^{-1} X_k^T \mathbf{d}_k \qquad (508)$$
Note that because [X_k^T X_k] is a symmetric square matrix, [X_k^T X_k]^{-1} is also a symmetric square matrix. As with any linear algebraic manipulation a useful check is to confirm that the matrix dimensions are compatible, thus ensuring that w_LS is an N × 1 vector:
[Figure: dimensional check of Eq. 508. [X_k^T X_k]^{-1} is N × N, X_k^T is N × (k+1) and d_k is (k+1) × 1, so w_LS = [X_k^T X_k]^{-1} X_k^T d_k is N × 1.]
Note that in the special case where X_k is a square non-singular matrix, Eq. 508 simplifies to:

$$\mathbf{w}_{LS} = X_k^{-1} X_k^{-T} X_k^T \mathbf{d}_k = X_k^{-1}\mathbf{d}_k \qquad (509)$$
The computation to calculate Eq. 508 requires about $O(N^3)$ MACs (multiply/accumulates) and $O(N)$ divides for the matrix inversion, and $O((k+1) \times N^2)$ MACs for the matrix multiplications. Clearly therefore, the more data that is available, the more computation is required.
At time iteration k+1, the weight vector to use in the adaptive FIR filter that minimizes the 2-norm of
the error vector, e k can be denoted as w k + 1 , and the open loop least squares adaptive filter
solution can be represented as the block diagram:
[Figure: open loop least squares adaptive FIR filter. A tapped delay line with weights w0, w1, ..., wN-2, wN-1 produces y(k), the error is e(k) = d(k) − y(k), and the weights are set from the batch solution w_{k+1} = [X_k^T X_k]^{-1} X_k^T d_k.]
Note however that at time k + 1 when a new data sample arrives at both the input, x ( k + 1 ) , and
the desired input, d ( k + 1 ) then this new information should ideally be incorporated in the least
squares solution with a view to obtaining an improved solution. The new least squares filter weight
vector to use at time k + 2 (denoted as w k + 2 ) is clearly given by:
$$\mathbf{w}_{k+2} = [X_{k+1}^T X_{k+1}]^{-1} X_{k+1}^T \mathbf{d}_{k+1} \qquad (510)$$
This equation requires that another full matrix inversion is performed, [ X kT + 1 X k + 1 ] –1 , followed by
the appropriate matrix multiplications. This very high level of computation for every new data
sample provides the motivation for deriving the recursive least squares (RLS) algorithm. RLS has
a much lower level of computation by calculating w k + 1 using the result of previous estimate w k to
reduce computation.
Consider the situation where we have calculated w_k from:

$$\mathbf{w}_k = [X_{k-1}^T X_{k-1}]^{-1} X_{k-1}^T \mathbf{d}_{k-1} = P_{k-1} X_{k-1}^T \mathbf{d}_{k-1} \qquad (511)$$

where

$$P_{k-1} = [X_{k-1}^T X_{k-1}]^{-1} \qquad (512)$$

When the new data samples, x(k) and d(k), arrive we have to calculate:

$$\mathbf{w}_{k+1} = [X_k^T X_k]^{-1} X_k^T \mathbf{d}_k = P_k X_k^T \mathbf{d}_k \qquad (513)$$
However note that P k can be written in terms of the previous data matrix X k – 1 and the data vector
x k by partitioning the matrix X k :
$$P_k = [X_k^T X_k]^{-1} = \left(\begin{bmatrix} X_{k-1}^T & \mathbf{x}_k \end{bmatrix}\begin{bmatrix} X_{k-1} \\ \mathbf{x}_k^T \end{bmatrix}\right)^{-1} = \left[X_{k-1}^T X_{k-1} + \mathbf{x}_k\mathbf{x}_k^T\right]^{-1} = \left[P_{k-1}^{-1} + \mathbf{x}_k\mathbf{x}_k^T\right]^{-1} \qquad (514)$$
where, of course, x_k = [x(k), x(k-1), ..., x(k-N+1)]^T as before in Eq. 502. In order to write Eq. 514 in a more "suitable form" we use the matrix inversion lemma (see Matrix Properties - Inversion Lemma) which states that:

$$[A^{-1} + BCD]^{-1} = A - AB[C^{-1} + DAB]^{-1}DA \qquad (515)$$

where A is a non-singular matrix and B, C and D are appropriately dimensioned matrices. Applying the matrix inversion lemma of Eq. 515 to Eq. 514, with A = P_{k-1}, B = x_k, D = x_k^T and C the 1 × 1 identity matrix, i.e. the scalar 1, then:

$$P_k = P_{k-1} - P_{k-1}\mathbf{x}_k\left[1 + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k\right]^{-1}\mathbf{x}_k^T P_{k-1} \qquad (516)$$
This equation implies that if we know the matrix [ X kT – 1 X k – 1 ] –1 then the matrix [ X kT X k ] –1 can be
computed without explicitly performing a complete matrix inversion from first principles. This, of
course, saves in computation effort. Equations 513 and 516 are one form of the RLS algorithm. By
additional algebraic manipulation, the computation complexity of Eq. 516 can be simplified even
further.
By substituting Eq. 516 into Eq. 513, and partitioning the vector d_k and simplifying, gives:

$$\begin{aligned} \mathbf{w}_{k+1} &= \left[P_{k-1} - P_{k-1}\mathbf{x}_k[1 + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k]^{-1}\mathbf{x}_k^T P_{k-1}\right] X_k^T \mathbf{d}_k \\ &= \left[P_{k-1} - P_{k-1}\mathbf{x}_k[1 + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k]^{-1}\mathbf{x}_k^T P_{k-1}\right] \begin{bmatrix} X_{k-1}^T & \mathbf{x}_k \end{bmatrix}\begin{bmatrix} \mathbf{d}_{k-1} \\ d(k) \end{bmatrix} \\ &= \left[P_{k-1} - P_{k-1}\mathbf{x}_k[1 + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k]^{-1}\mathbf{x}_k^T P_{k-1}\right]\left[X_{k-1}^T \mathbf{d}_{k-1} + \mathbf{x}_k d(k)\right] \end{aligned} \qquad (517)$$
Using the substitution that w_k = P_{k-1} X_{k-1}^T d_{k-1}, and also dropping the time subscripts for notational convenience, i.e. P = P_{k-1}, x = x_k, the vector d = d_{k-1} and the scalar d = d(k), further simplification can be performed:
$$\begin{aligned} \mathbf{w}_{k+1} &= \left[P - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T P\right]\left[X^T\mathbf{d} + \mathbf{x}d\right] \\ &= PX^T\mathbf{d} + P\mathbf{x}d - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T PX^T\mathbf{d} - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T P\mathbf{x}d \\ &= \mathbf{w}_k - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T\mathbf{w}_k + P\mathbf{x}d - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T P\mathbf{x}d \\ &= \mathbf{w}_k - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T\mathbf{w}_k + P\mathbf{x}d\left[1 - [1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T P\mathbf{x}\right] \\ &= \mathbf{w}_k - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T\mathbf{w}_k + P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\left[[1 + \mathbf{x}^T P\mathbf{x}] - \mathbf{x}^T P\mathbf{x}\right]d \\ &= \mathbf{w}_k - P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\mathbf{x}^T\mathbf{w}_k + P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}d \\ &= \mathbf{w}_k + P\mathbf{x}[1 + \mathbf{x}^T P\mathbf{x}]^{-1}\left(d - \mathbf{x}^T\mathbf{w}_k\right) \end{aligned} \qquad (518)$$
and reintroducing the subscripts, and noting that y(k) = x_k^T w_k:

$$\mathbf{w}_{k+1} = \mathbf{w}_k + P_{k-1}\mathbf{x}_k\left[1 + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k\right]^{-1}\left(d(k) - y(k)\right) = \mathbf{w}_k + \mathbf{m}_k\left(d(k) - y(k)\right) = \mathbf{w}_k + \mathbf{m}_k e(k) \qquad (519)$$

where m_k = P_{k-1} x_k [1 + x_k^T P_{k-1} x_k]^{-1} and is called the gain vector.
The RLS adaptive filtering algorithm therefore requires that at each time step, the vector m_k and the matrix P_k are computed. The filter weights are then updated using the error output, e(k). Therefore the block diagram for the closed loop RLS adaptive FIR filter is:
[Figure: closed loop RLS adaptive FIR filter. A tapped delay line with weights w0, ..., wN-1 produces y(k), e(k) = d(k) − y(k), and at each iteration:]

$$\mathbf{w}_{k+1} = \mathbf{w}_k + \mathbf{m}_k e(k), \qquad \mathbf{m}_k = \frac{P_{k-1}\mathbf{x}_k}{1 + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k}, \qquad P_k = P_{k-1} - \mathbf{m}_k\mathbf{x}_k^T P_{k-1}$$
The above form of the RLS requires O ( N 2 ) MACs and one divide on each iteration. See also
Adaptive Filtering, Least Mean Squares Algorithm, Least Squares, Noise Cancellation, Recursive
Least Squares-Exponentially Weighted.
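The closed loop update above translates directly into code. The sketch below is illustrative only: the inverse correlation matrix P is initialised to a large multiple of the identity, a common practical choice that is not part of the derivation in this entry, and the test system is hypothetical:

```python
import numpy as np

def rls(x, d, N, delta=1e3):
    """Sketch of the RLS update above: identify an FIR response from input x
    and desired signal d.  P is initialised to delta*I (a common choice)."""
    w = np.zeros(N)
    P = delta * np.eye(N)
    xk = np.zeros(N)                          # regressor [x(k), x(k-1), ..., x(k-N+1)]
    for k in range(len(x)):
        xk = np.concatenate(([x[k]], xk[:-1]))
        y = xk @ w                            # filter output y(k)
        e = d[k] - y                          # error e(k)
        m = P @ xk / (1.0 + xk @ P @ xk)      # gain vector m_k
        w = w + m * e                         # w_{k+1} = w_k + m_k e(k)
        P = P - np.outer(m, xk @ P)           # P_k = P_{k-1} - m_k x_k^T P_{k-1}
    return w

# Example: recover a 4 tap system driven by white noise (illustrative values).
rng = np.random.default_rng(1)
x = rng.standard_normal(2000)
h = np.array([0.8, -0.4, 0.2, 0.1])
d = np.convolve(x, h)[:len(x)]
print(rls(x, d, 4))                           # approaches h
```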
Recursive Least Squares (RLS) - Exponentially Weighted: One problem with the least squares and recursive least squares (RLS) algorithms derived in the entry Recursive Least Squares is that the minimization of the 2-norm of the error vector e_k calculates the least squares vector at time k based on all previous data, i.e. data from long ago is given as much relevance as recently received data. Therefore if at some time in the past a block of "bad" data was received, or the input signal statistics changed, then the RLS algorithm will calculate the current least squares solution giving as much relevance to the old (and probably irrelevant) data as it does to very recent inputs. Therefore the RLS algorithm has infinite memory.
In order to overcome the infinite memory problem, the exponentially weighted least squares, and exponentially weighted recursive least squares (EW-RLS) algorithms can be derived. Consider again Eq. 498 where this time each error sample is weighted using a forgetting factor constant λ which is just less than 1:

$$v(k) = \sum_{s=0}^{k} \lambda^{k-s}[e(s)]^2 = \lambda^k e^2(0) + \lambda^{k-1} e^2(1) + \lambda^{k-2} e^2(2) + \ldots + e^2(k) \qquad (520)$$
For example, if a forgetting factor of 0.9 is chosen then data which is 100 time iterations old is premultiplied by $0.9^{100} = 2.6561 \times 10^{-5}$ and thus considerably de-emphasized compared to the current data. Therefore in dB terms, data that is more than 100 time iterations old is attenuated by $10\log(2.6561 \times 10^{-5}) \approx -46$ dB. Data that is more than 200 time iterations old is therefore attenuated by around 92 dB, and if the input data were 16 bit fixed point, corresponding to a dynamic range of 96dB, then the old data is on the verge of being completely forgotten about. The forgetting factor is typically a value of between 0.9 and 0.9999.
Noting the form of Eq. 504 we can rewrite Eq. 520 as:

$$v(k) = \mathbf{e}_k^T \Lambda_k \mathbf{e}_k \qquad (521)$$

where $\Lambda_k$ is a (k+1) × (k+1) diagonal matrix, $\Lambda_k = \mathrm{diag}[\lambda^k, \lambda^{k-1}, \lambda^{k-2}, \ldots, \lambda, 1]$. Therefore:

$$v(k) = [\mathbf{d}_k - X_k\mathbf{w}]^T \Lambda_k [\mathbf{d}_k - X_k\mathbf{w}] = \mathbf{d}_k^T\Lambda_k\mathbf{d}_k + \mathbf{w}^T X_k^T \Lambda_k X_k\mathbf{w} - 2\mathbf{d}_k^T\Lambda_k X_k\mathbf{w} \qquad (522)$$

Following the same procedure as for Eqs. 505 to 508, the exponentially weighted least squares solution is easily found to be:

$$\mathbf{w}_{LS} = [X_k^T \Lambda_k X_k]^{-1} X_k^T \Lambda_k \mathbf{d}_k \qquad (523)$$
In the same way as the RLS algorithm was realised, we can follow the same approach as Eqs. 511 to 519 and realise the exponentially weighted RLS algorithm:

$$\mathbf{w}_{k+1} = \mathbf{w}_k + \mathbf{m}_k e(k), \qquad \mathbf{m}_k = \frac{P_{k-1}\mathbf{x}_k}{\lambda + \mathbf{x}_k^T P_{k-1}\mathbf{x}_k}, \qquad P_k = \frac{P_{k-1} - \mathbf{m}_k\mathbf{x}_k^T P_{k-1}}{\lambda} \qquad (524)$$
Therefore the block diagram for the exponentially weighted RLS algorithm is:

[Figure: closed loop exponentially weighted RLS adaptive FIR filter, identical in structure to the RLS filter above but using the updates of Eq. 524.]
Compared to the Least Mean Squares (LMS) algorithm, the RLS can provide much faster convergence and a smaller error; however, the computation required is a factor of N more than for the LMS, where N is the adaptive filter length. The RLS is less numerically robust than the LMS.
For more detailed information refer to [77]. See also Adaptive Filtering, Least Mean Squares
Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares.
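Relative to the RLS sketch given under Recursive Least Squares, only the gain and P updates change. A minimal sketch of one exponentially weighted update (Eq. 524), together with the forgetting factor attenuation quoted above:

```python
import numpy as np

def ewrls_step(w, P, xk, dk, lam=0.99):
    """One exponentially weighted RLS update (Eq. 524); lam is the forgetting factor."""
    e = dk - xk @ w
    m = P @ xk / (lam + xk @ P @ xk)
    w = w + m * e
    P = (P - np.outer(m, xk @ P)) / lam
    return w, P

# The attenuation applied to data that is 100 iterations old when lam = 0.9:
lam = 0.9
print(10 * np.log10(lam ** 100))   # about -46 dB, as quoted above
```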
Reflection: Sound can be reflected when a sound wave reaches a propagation medium boundary, e.g. from air to brick (wall). Some of the sound may be reflected and the rest will either be absorbed (converted to heat) or transmitted through the medium. See also Absorption.
Register: A memory location inside a DSP processor, used for temporary storage of data. Access
to the data in a register is very fast as no off-chip memory movements are required.
Relative Error: The ratio of the absolute error (difference between true value and estimated value)
to the true value of a particular quantity is called the relative error. For example consider two real
numbers x and y, that will be represented to only one decimal place of precision:
x = 1.345 and
(525)
y = 1000.345
(526)
The rounded values, denoted as x’ and y’ will be given by
x′ = 1.3 and
(527)
y′ = 1000.3
(528)
The absolute errors, ∆x and ∆y , caused by the rounding are the same for both quantities, and
given by:
∆x = x – x′ = 1.345 – 1.3 = 0.045
(529)
∆y = y – y′ = 1000.345 – 1000.3 = 0.045
(530)
The relative error, however, is defined as the ratio of the absolute error to the correct value. Therefore the relative error of x' and y' can be calculated as:

$$\frac{\Delta x}{x} = \frac{0.045}{1.345} = 0.0334 \qquad (531)$$

$$\frac{\Delta y}{y} = \frac{0.045}{1000.345} = 4.5 \times 10^{-5} \qquad (532)$$
Relative error is often denoted as a percentage error. Therefore in the above example x’
represents a 3.34% error, whereas y’ is only a 0.0045% error. Relative errors are widely used in
error analysis calculations where the results of computations on estimated, rounded or truncated
quantities can be predicted by manipulating only the relative errors. See also Absolute Error, Error
Analysis.
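A two line sketch reproducing the worked example:

```python
def relative_error(true_value, approx):
    """Relative error = |true - approx| / |true|."""
    return abs(true_value - approx) / abs(true_value)

# Same absolute error, very different relative errors.
print(relative_error(1.345, 1.3))         # about 0.0334  (3.34 %)
print(relative_error(1000.345, 1000.3))   # about 4.5e-5  (0.0045 %)
```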
Relative Pitch: The ability to specify the names of musical notes on the Western music scale if the
name of one of the notes is first given is known as relative pitch. Relative pitch skills are relatively
common among singers and musicians. The ability to identify any musical note with no clues is
known as perfect or absolute pitch and is less common. See also Music, Perfect Pitch, Pitch,
Western Music Scale.
Resistor-Capacitor Circuit: See RC Circuit.
Resolution: The accuracy to which a particular quantity has been converted. If the resolution of a
particular A/D converter is 10mVolts then this means that every analog quantity is resolved to within
10mVolts of its true value after conversion.
Resonance: When an object is vibrating at its resonant frequency it is said to be in resonance. See
Resonant Frequency.
Resonant Frequency: All mechanical objects have a resonant or natural frequency at which they
will vibrate if excited by an impulse. For example, striking a bell or other metal object will cause a ringing sound (derived from the vibrations) at the bell's resonant or natural frequency. If a
component is excited by vibrations at its resonant frequency then it will start to vibrate in synchrony
and lead to vibrations of a very large magnitude. This is referred to as sympathetic vibration. For
example, if a tone at the same frequency as a bell’s resonant frequency is played nearby, the bell
will start to ring in unison at the same frequency. Music is derived from instruments’ vibrating strings
and membranes, and columns of air at resonant frequency.
Resource Interchange File Format (RIFF): RIFF is a proprietary format developed by IBM and
Microsoft. RIFF essentially defines a set of file formats which are suitable for multimedia file
handling (i.e. audio, video, and graphics):
• Playing back multimedia data;
• Recording multimedia data;
• Exchanging multimedia data between applications and across platforms.
A RIFF file is composed of a descriptive header identifying the type of data, the size of the data,
and the actual data. Currently well known forms of RIFF file are:
• WAVE: Waveform Audio Format (.WAV files)
• PAL: Palette File Format (.PAL files)
• RDIB: RIFF Device Independent Bitmap Format (.DIB files)
• RMID: RIFF MIDI Format (.MID files)
• RMMP: RIFF Multimedia Movie File Format
RIFF files are supported by Microsoft Windows on the PC. (Note that there is also a counterpart to
RIFF called RIFX that uses the Motorola integer byte ordering format rather than the Intel format.)
See also Standards.
Return to Zero: See Non-Return to Zero.
Reverberation: The multitude of a particular sound’s waves that add to the direct path sound wave
but slightly later in time due to the longer distance (reflected) transmission paths. Virtually all rooms
have some level of reverberation (compare a carpeted office to an indoor swimming pool to contrast
rooms with short reverberation time to those with long reverberation times.) More formally the
reverberation time in a room is defined as the time it takes a sound to fall to one millionth (reduce
by 60dB) of its initial sound intensity.
Ringing Tone: Tones at 440 Hz and 480 Hz make up the ringing tone for telephone systems. See
also DialTone, Dual Tone Multifrequency.
Ripple Adder: See Parallel Adder.
RISC: RISC (Reduced instruction set computer) refers to a microprocessor that has implemented
a smaller core of instructions than a Complex Instruction Set Computer (CISC) in order that the
silicon area can be filled with more application appropriate facilities. Some designers refer to DSP processors as RISCs, whereas others note that RISCs are subtly different and lack features such as internal DMA, multiple interrupt pins, single cycle MACs, wide accumulators and so on. RISCs
as internal DMA, multiple interrupt pins, single cycle MACs, wide accumulators and so on. RISCs
are designed to perform a wide range of general purpose instructions unlike DSPs, which are
optimized for MACs. Texas Instruments describe their TMS320C31 DSP chip as a hybrid DSP, with
features of both RISC and CISC. Best not to worry!
RS232: A simple serial communications protocol. A few DSP boards use RS232 lines to
communicate with the host computer. The ITU (formerly CCITT) adopted a related version of the
RS232 cable which is specified in recommendation V24.
Robinson-Dadson Curves: Robinson and Dadson’s 1956 paper [126] studied the definition of
sound intensity, the subjective loudness of human hearing, and associated audiometric
measurements. They repeated elements of earlier work by Fletcher and Munson in 1933 [73] and
produced a set of equal loudness contours which showed the variation in sound pressure level
(SPL) of tones at different frequencies that are perceived as having the same loudness. See also
Equal Loudness Contours, Frequency Range of Hearing, Loudness Recruitment, Sound Pressure
Level, Threshold of Hearing.
Roll-off: Common filter types such as low pass, band pass, or high pass filters have distinct regions: the passband, transition band(s) and stopband(s). The region of increasing attenuation above the 3dB point, between the passband and the stopband, is referred to as the transition band. The rate at which the filter response decreases from passband to stopband is called the roll-off of the filter. The higher the roll-off, the closer the filter is to the ideal filter, which would have an infinite roll-off from passband to stopband.
The roll-off of a simple analog (single pole) RC circuit is 6dB/octave at frequencies above the cut-off frequency, f3dB (or 3dB point). If two RC circuits are cascaded together to realise a second order (two pole) filter then the roll-off at frequencies above the cut-off frequency will be 12dB/octave or 40dB/decade (to attain better roll-off it is unlikely that passive RC circuits would be cascaded together; it is more likely that a higher order active filter would be used). In general, for an N-th order/pole cascaded RC filter (which will have at least N capacitors), the roll-off at frequencies well above f3dB will be:
$$\text{Roll-off} = 20\log_{10}\left(\frac{1}{\sqrt{1 + a_o(f/f_{3dB})^2 + \ldots + (f/f_{3dB})^{2N}}}\right) \approx -20N\log_{10}(f/f_{3dB}) \qquad (533)$$
For applications such as analog anti-alias filters, Bessel, Butterworth or Chebychev filters with sharp cut-off frequencies and a hard knee at f3dB are required, and the roll-off rate should be at least the same as the dynamic range of the digital wordlength. For example, using an ADC with 16 bits wordlength and dynamic range 20 log 2^16 ≈ 96dB, it would be advisable to use an anti-alias filter of at least 96dB/octave such that any frequency components above fs are completely removed. Note that even with this sharp cut-off some frequency components between fs/2 and fs will still alias down to the baseband if f3dB is chosen to equal fs/2. If less selective filters are available, it is generally necessary to set f3dB to less than fs/2 (or use oversampling techniques). See also Active Filter, Decade, Decibels, Filter (Bessel, Butterworth, Chebychev), Knee (of a filter), Logarithmic Frequency, Logarithmic Magnitude, Octave.
20log10 Vout/Vin (dB)
Ideal filter, infinite roll-off
0
-5
-10
-15
-20
-25
-30
-35
-40
-45
-50
-55
-60
Log10 frequency (decade)
Roll-off of simple RC
circuit: 20dB/decade
0.1
0.5
1
5
10
50
100
500
1000
20log10 Vout/Vin (dB)
log10(f/f3dB)
0
-3
-6
-9
-12
-15
-18
-21
-24
-27
-30
-33
-36
0.125
Log2 frequency (octave)
Roll-off of 6dB/octave
using a simple RC circuit:
Roll-off of 12dB/octave using
a second order active filter.
0.25
0.5
1
2
4
8
16
32
64
log2(f/f3dB)
The magnitude transfer function of the simple RC circuit is given by:

$$\frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{1 + (f/f_{3dB})^2}}, \qquad \text{where } f_{3dB} = \frac{1}{2\pi RC}$$
Round-Off Error: When two N bit numbers are multiplied together, the result is a number with 2N
bits. If a fixed point DSP processor with N bits resolution is used, the 2N bit number cannot be
accommodated for future computations which can operate on only N bit operands. Therefore, if we
assume that the original N bit numbers were both constrained to be less than 1 in magnitude by using a binary point, then the 2N bit result is also less than 1. Hence if we round the least significant
N bits up or down, then this is equivalent to losing precision. This loss of precision is referred to as
round-off error. Although the round-off error for a single computation is usually not significant, many
errors added together can be significant. Furthermore if the result of a computation yields the value
of 0 (zero) after rounding, and this result is to be used as a divisor, a divide by zero error will occur.
See also Truncation Error, Fractional Binary, Binary Point.
Binary:  0.1011001 × 0.1010001 = 0.011100001010010
Decimal: 0.6953125 × 0.6328125 = 0.44000244140625
Rounding the result back to 8 bits: 0.0111000 (binary) = 0.4375 (decimal)

After multiplication of two 8 bit numbers, the 16 bit result is rounded to 8 bits, introducing a binary round-off error of 0.000000001010010, which in decimal is 0.00250244140625.
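The worked example above can be reproduced with integer arithmetic; the sketch below treats each operand as a sign bit plus 7 fractional bits:

```python
# The two 8 bit fractional operands from the example above.
a = int(0.6953125 * 128)      # 89  -> 0.1011001 in binary
b = int(0.6328125 * 128)      # 81  -> 0.1010001 in binary

product = a * b               # 7209, a fraction over 2**14
exact = product / 2**14       # 0.44000244140625

rounded = round(product / 2**7) / 2**7    # keep only 7 fractional bits again
print(exact, rounded, exact - rounded)    # 0.44000244140625  0.4375  0.00250244140625
```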
Round-Off Noise: When round-off errors are modelled as a source of additive noise in a system,
the effect is referred to as round-off noise. This noise is usually discussed in terms of its mean
power. See also Round-off Error.
Row Vector: See Vector.
Run Length Encoding (RLE): If a data sequence contains a consecutive sequence of the same
data word, then this is referred to as a “run”, and the number of data words is referred to as the
“length” of the run. Run length encoding is a technique that allows data sequences prone to
repetitive values to be efficiently encoded and therefore compressed. For example, if a 256 × 256
image is stored in a file sequentially by each row, then a run of identical pixel values in a row can be encoded by two data words, one stating the repeated value, and one stating the length of the run.
Run length encoding is a lossless compression technique. See also Compression.
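A minimal run length encoder sketch along the lines described above:

```python
def run_length_encode(data):
    """Encode a sequence as (value, run length) pairs."""
    encoded = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        encoded.append((data[i], j - i))
        i = j
    return encoded

# A row of pixel values with long runs compresses to a handful of pairs.
print(run_length_encode([255, 255, 255, 255, 0, 0, 7, 7, 7]))
# [(255, 4), (0, 2), (7, 3)]
```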
S
Sample and Hold (S/H): An analog circuit used at the input to A/D converters to maintain a
constant input voltage while the digital equivalent is calculated by the A/D converter. The output
waveform is an analog voltage, that is “steppy” in appearance, with the duration of the steps (the
hold time) being determined by the chosen sampling frequency fs. The sample and hold function is
also referred to as a zero order hold. See also First Order Hold, Analog to Digital Converter.
[Figure: sample and hold circuitry converts a smooth analog input voltage into a "steppy" held voltage, with the hold time equal to the sampling period 1/fs.]
Sampling: The process of converting an analog signal into discrete samples at regular intervals.
To correctly sample a signal the sampling rate or sampling frequency, fs, should be at least twice
the maximum frequency component of the signal (the Nyquist criteria). Sampling results in analog
samples of a signal. Quantization converts these analog samples to a discrete set of values. See
also Analog to Digital Converter.
Sampling Rate: The number of samples per second from a particular analog signal, usually
expressed in Hz (Hertz).
Saturation Arithmetic: When the result of a computation would overflow, it is limited by the DSP processor to the maximum positive or negative number (otherwise the number would be too large for the processor wordlength). For a fixed point 16 bit DSP processor, therefore, the maximum value generated by any computation will be 32767, and the minimum value will be -32768.
Sawtooth Waveform: A sawtooth waveform is a periodic signal made up from individual ramp
waveforms. See also Ramp Waveform.
[Figure: a continuous time sawtooth waveform s(t) with period τ, and a discrete time sawtooth waveform s(k) with period κ.]
SAXPY: This term is used in vector algebra to indicate the calculation:
x = αx + y
(534)
SAXPY is a mnemonic for "scalar alpha x plus y", and has its origins as part of the Linpack software [15].
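A trivial sketch of the operation using numpy arrays:

```python
import numpy as np

def saxpy(alpha, x, y):
    """x = alpha*x + y, the SAXPY operation of Eq. 534."""
    return alpha * x + y

print(saxpy(2.0, np.array([1.0, 2.0, 3.0]), np.array([10.0, 10.0, 10.0])))
# [12. 14. 16.]
```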
Schur-Cohn Test: Given a z-domain polynomial of order N, the Schur-Cohn test can be used to
establish if the roots of the polynomial are within the unit circle [77]. The Schur-Cohn test can
therefore be used on IIR filters to check stability (i.e. all poles within the unit circle), or to test if a
filter is minimum phase (all zeroes and poles within the unit circle).
Schur Form: See Matrix Decompositions - Schur Form.
Scrambler/Descrambler: A scrambler is either an analog or digital device used to implement
secure communication channels by modifying a data stream or analog signal to appear random. A
descrambler reverses the effect of the scrambler to recover the original signal. Many different
techniques exist for scrambling signals and are of two main forms: frequency domain techniques,
and time domain techniques.
Second Order: Usually meaning two of a particular device cascaded together. Used in an inconsistent way. Second order is often used to refer to a segment of a linear system that can be
represented by a system polynomial of order 2.
Semitone: In music theory each adjacent note in the chromatic scale differs by one semitone, which corresponds to multiplying the lower frequency by the twelfth root of 2, i.e. $2^{1/12} = 1.0594631\ldots$. A difference of two semitones is a tone. See also Western Music Scale.
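A small sketch of the semitone ratio applied to a reference frequency (the 440 Hz reference is illustrative):

```python
# Each semitone multiplies the frequency by 2**(1/12); twelve semitones give an octave.
semitone = 2 ** (1 / 12)          # 1.0594631...
a4 = 440.0                        # example reference note
print(a4 * semitone)              # one semitone up, about 466.16 Hz
print(a4 * semitone ** 12)        # twelve semitones up, an octave: 880 Hz
```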
Semi-vowels: One of the elementary sounds of speech, namely plosives, fricatives, sibilant
fricative, semi-vowels, and nasals. Semi-vowels are relatively open sounds and formed via
constrictions made by the lips or tongue. See also Fricatives, Nasals, Plosives, and Sibilant
Fricatives.
Sensation Level (SL): A person’s sensation level for particular sound stimulus is calculated as a
power ratio relative to their own minimum detectable level of that specific sound:

$$\text{Sensation Level} = 10\log\left(\frac{\text{Sound Intensity}}{\text{Minimum Detectable Sound Level}}\right) \; \text{dB (SL)} \qquad (535)$$
Therefore if a sound is 40dB (SL) then it is 40dB above that person’s minimum detectable level of
the sound. Clearly the physical intensity of a sensation level will differ from person to person [30].
See also Audiology, Hearing Level, Sound Level Units, Sound Pressure Level, Threshold of
Hearing.
Sensorineural Hearing Loss: If the cochlea, auditory nerve or other elements of the inner ear are
not functioning correctly then the associated hearing loss is often known as sensorineural [30].
Typically the audiogram will reveal that the sensorineural hearing loss increases with increased
frequency. Although a frequency selective linear amplification hearing aid will assist in some cases
to reduce the impairment, in general the wearer will still have difficulty in perceiving speech signals
in noisy environments. Such is the complex nature of this form of hearing loss. See also Audiology,
Audiometry, Conductive Hearing Loss, Ear, Hearing Aids, Hearing Impairment, Loudness
Recruitment, Threshold of Hearing.
Sequential Linear Feedback Register: See Pseudo Random Binary Sequence.
Serial Copy Management System (SCMS): The Serial Copy Management System provides
protection from unauthorised digital copying of copyrighted material. The SCMS protocol ensures
that only one digital copy is possible from a protected recording [128], [158].
Shading Weights: Coefficients used to weight the contributions of different sensors in a
beamforming array (or the coefficients in an FIR filter). Shading weights control the characteristics
of the sidelobes and mainlobe for a beamformer (or, analogously, an FIR filter). The use and design
of shading weights is very similar to that for Data Windows and FIR filters. See also Beamforming,
Windows, FIR Filters.
Shannon, Claude Elwood: Claude Elwood Shannon can be justly described as the father of the
digital information age by virtue of his mathematical genius in defining the important principles of
what we now call information theory. Claude Shannon was born in Michigan on April 30th 1916. He
first attended University of Michigan in 1932 and graduated with a Bachelor of Science degree in
Electrical Engineering, and also in Mathematics. In 1936 he joined MIT as a research assistant, and
in 1938 published his first paper “A Symbolic Analysis of Relay and Switching Circuits”. In 1948 he
produced the celebrated paper “A Mathematical Theory of Communication” in the Bell System
Technical Journal [129]. It is widely accepted that Claude Shannon profoundly altered virtually all
aspects of communication theory and real world practice. Claude Shannon’s other interests have
included “beat the dealer” gambling machines, mirrored rooms, robot bicycle riders, and a long time
interest in the practical and mathematical aspects of juggling. Readers are referred to Shannon’s
biography and collected papers [41] for more insights on this most interesting individual.
Sherman-Morrison-Woodbury Formula: See Matrix Properties - Inversion Lemma.
Shielded Pair: Two insulated wires in a cable wrapped with metallic braid or foil to prevent
interference and provide reduced transmission noise.
Sibilant Fricatives: One of the elementary sounds of speech, namely plosives, fricatives, sibilant
fricative, semi-vowels, and nasals. Sibilant fricatives are the hissing sounds formed when air is
forced over the cutting edges of the front teeth with the lips slightly parted. See also Fricatives,
Nasals, Plosives, and Semi-vowels.
Sidelobes: In an antenna or sensor array processing system, sidelobes refer to the secondary
lobes of sensitivity in the beampattern. For a filter or a data window, sidelobes refer to the stopband
lobes of sensitivity. The lower the sidelobe level, the more selective or sensitive a given system is
said to be. The level of the first sidelobe (relative to the main lobe peak) is often an important
parameter for a data window, a digital filter, or an antenna system. Sidelobes are best illustrated by
an example.
[Figure: typical beampattern showing the array gain as a function of angle, with a mainlobe and several sidelobes, and contours marked at 0, -5, -10 and -15 dB.]
See also Main lobe, Beamformer, Beampattern, Windows.
Sigma Delta (Σ−∆): Σ−∆ converters use noise shaping techniques whereby the baseband
quantization noise from oversampling can be high pass filtered, and the oversampling factor
required to increase signal resolution can be reduced from the 4x’s per single bit normally required
when oversampling (see Oversampling). A simple first order Σ−∆ ADC converter only requires the
analog components of an integrator, a summer, a 1 bit quantiser (or a 1 bit ADC) and single bit DAC
in the feedback loop. A first order Σ−∆ DAC requires only the analog components of a 1 bit DAC:
[Figure: first order single bit Σ−∆ converter ADC and DAC. The ADC comprises a summer, an analog integrator, a 1-bit ADC clocked at fovs, and a 1-bit DAC in the feedback loop; the DAC comprises a digital integrator (z^{-1} feedback), a quantiser and a 1-bit DAC producing the analog output. The 1 bit ADC intercepts the y-axis at the input maximum and minimum, and the quantiser (in the DAC) intercepts at ±2^{N-1}.]
For the Σ−∆ ADC the integrator can be produced using a capacitive component, the summer using
a simple summation amplifier, and the quantiser using a comparator.
Unlike conventional data converters the non linear element (the quantiser) is within a feedback loop
in a mixed analogue/digital system and as a result Σ∆ devices are difficult to analyze. However as
a first step to understanding the principle of operation of a Σ∆ device consider the following
representation of the ADC which is similar to the one above but the integrator has now been moved
in front of the adder.
[Figure: modified first order sigma delta ADC, with the integrator moved in front of the adder; the 1-bit ADC output (±1) is fed back through a 1-bit DAC and an integrator to the subtracting input.]
Clearly the Σ∆ modulator tries to keep the mean value of the 1-bit high frequency signal equal to
the mean of the input signal. Thus for a frequency input of 0 Hz, the mean output is not affected by
quantisation noise. This simple result can be extended to inputs of "very low frequency" with respect to the sampling frequency, f_ovs, and we conclude that the output will be a "good" representation of the input.
Because of the non-linearities present, the simple first order Σ−∆ "loop" is actually very difficult to analyze. Therefore the following linearized digital model, which represents a "reasonably" mathematically tractable model, is used [8]. The analog integrator is modelled with a digital
integrator and the quantizer is modelled as an additive white noise source. The ADC is therefore
linearised and replaced by a signal independent white noise source, n ( k ) , of variance (power)
q 2 ⁄ 12 (where q is the step size of the single bit quantiser) and the analog integrator approximated
by a digital integrator such that:
$$y(k) = x(k) + y(k-1) = \sum_{n=0}^{k} x(n) \approx \int_0^t x(\tau)\,d\tau \qquad (536)$$
where t = kT and T is the sampling period. The following analysis models are therefore realised:
[Figure: the (identical) linearised digital models for a Σ∆ ADC and a Σ∆ DAC, with a digital integrator in the forward path, the quantiser replaced by an additive noise source n(k), and a z^{-1} delay in the feedback path. The linearised model allows for a more simple analysis of the behaviour of the circuits. Note that z^{-1} represents a sample delay element of period t_ovs = 1/f_ovs.]
This z-domain model can further be simplified to:
[Figure: simplified z-domain model, with the integrator represented as the simple pole 1/(1 − z^{-1}), the quantisation noise added as N(z), and a z^{-1} feedback from the output Y(z) to the input summation.]
The output of the above Σ∆ first order model is simply given by:

$$Y(z) = \left[X(z) - z^{-1}Y(z)\right]\frac{1}{1 - z^{-1}} + N(z) \;\Rightarrow\; Y(z) - z^{-1}Y(z) = \left[X(z) - z^{-1}Y(z)\right] + N(z)(1 - z^{-1}) \;\Rightarrow\; Y(z) = X(z) + N(z) - z^{-1}N(z) \qquad (537)$$
Written in the time domain the output is therefore:
y(k) = x(k) + n(k) – n(k – 1)
(538)
From Eq. 538 we can note that the input signal passes unaltered through the modulator, whereas
the added noise is high pass filtered (for low frequency values of n ( k ) , then n ( k ) – n ( k – 1 ) ≈ 0 ).
The total quantisation noise power of the 1 bit quantiser is therefore increased by using the Σ∆ loop
(actually doubled or increased by 3dB), but the low frequency quantisation noise power (i.e. at the
baseband) is reduced if the sampling frequency is high enough. Compared to the 1 extra bit of
resolution obtained for every increase in sampling frequency by 4 for an oversampling ADC (see
Quantization Noise-Reduction by Oversampling), the first order Σ∆ loop brings the advantage of
approximately 1.5 bits of extra resolution (in the baseband) for each doubling of the sampling
frequency [8].
To illustrate the operation of a first order Σ∆ converter, a linear chirp signal with frequency increasing from 100 to 4800 Hz over a 0.1 second interval was input to the above sigma delta loop sampling at 64 times the Nyquist rate, i.e. f_ovs = 64 f_n = 640000 Hz. A 0.45ms (292 sample) segment of the sigma delta output and chirp input signal is shown below:
[Figure: output of a first order sigma delta loop for a 0.45ms segment of the input chirp signal (when the signal frequency was around 3000 Hz) sampled at 640000 Hz; 292 single bit samples are shown overlaid on the input signal.]
The power spectrum obtained from an FFT on about a 0.1s segment of the chirp signal, i.e. 65536 samples (zero padded from 64000), is:
[Figure: frequency domain output of a first order, R = 64 times oversampled sigma delta converter. The Nyquist rate was f_s = 10000 Hz. The input signal was a linear chirp from 100 Hz to 4800Hz over a 0.1 second interval (64000 samples) and 65536 points (≈ 0.1 seconds) were used in the (zero padded) FFT. The dotted line shows the first order noise shaping characteristic predicted by Eq. 538. By digitally low pass filtering this single bit signal, around 9-10 bits of resolution are achievable in the baseband of 0 to 5000 Hz.]
Clearly the quantisation noise has been high pass filtered out of the baseband, thus giving additional resolution. The dotted line in the above figure shows the quantisation noise shaping spectrum predicted by Eq. 538. For this oversampling rate of R = 64 the signal to quantisation noise ratio in the baseband is about 55dB, giving between 9 and 10 bits of signal resolution (cf. 20 log 2^9 dB). If only an oversampling single bit converter were used (i.e. no Σ∆ loop), 64 times oversampling would only allow about 3-4 bits of resolution. To extract the higher resolution baseband signal a low pass filter is required to extract only the baseband signal.
To obtain more than 9-10 bits resolution without further increasing the sampling frequency, a higher
order sigma delta converter can be used. The circuit for a simplified second order sigma delta loop
can be represented as the z-domain model:
[Figure: second order sigma delta modulator represented as a z-domain model, with two cascaded summer/integrator stages, a quantiser and a single bit DAC in the feedback loop, clocked at fovs. The baseband noise is much lower than that of the first order sigma delta loop due to the more effective high pass quantisation noise filtering. Analytical and experimental studies of this system are considerably more complex than that of the first order loop.]
For each doubling of the sampling frequency the second order loop gives around an extra 2.5 bits
resolution. The z-domain output of the above converter is:
Y ( z ) = X ( z ) + ( 1 – z –1 ) 2 N ( z )
(539)
and it can be seen that this extra baseband resolution is a result of the second order high pass
filtering of the quantisation noise compared to the first order loop.
The result of inputting the same signal as previously, a linear chirp signal with frequency increasing
from 100 to 4800 Hz over a 0.1 second interval at 64 x’s the Nyquist rate, i.e.
f ovs = 64f n = 640000 Hz into a second order sigma delta modulator is:
[Figure: frequency domain output of a second order, R = 64 times oversampled sigma delta converter. The input signal was a linear chirp signal from 100 Hz to 4800Hz over a 0.1 second interval; 65536 data points were used in the FFT. The dotted line shows the second order noise shaping characteristic predicted by Eq. 539. By digitally low pass filtering this single bit signal, around 13-14 bits of resolution are achievable in the baseband of 0 to 5000 Hz.]
The signal to quantisation noise ratio in the baseband is now even higher, almost of the order of 80dB, therefore allowing between 13 and 14 bits of signal resolution to be obtained (cf. 20 log 2^13 dB). Note that the design of higher than second order Σ−∆ loops must be done very "carefully" in order to ensure stability, and a straightforward cascading to produce higher order loops is ill advised [8].
At the output of a Σ−∆ ADC, the single bit oversampled signal is decimated, i.e. digitally low pass filtered to half of the Nyquist frequency, and then downsampled:

[Figure: decimation of a 64 times oversampled sigma delta signal at f_ovs = 64 × f_n = 640 kHz to the Nyquist rate of f_n = 10 kHz: the 1 bit output of the Σ∆ ADC is digitally low pass filtered to the 0 to 5 kHz baseband (attenuating the shaped quantisation noise and aliased spectra) and then downsampled by 64 to a multibit Nyquist rate (10kHz) PCM signal. Note that the signal will be delayed by the group delay, t_d, of the digital low pass filter (which should be linear phase in the baseband). In practice the low pass filtering and downsampling is done in stages, see Sigma Delta - Decimation Filters. The number of bits of signal resolution in the final output stage is a function of the order of the Σ∆ converter, and the filtering properties of the low pass filter.]
In order to produce a suitably noise shaped single bit data stream for input to a Σ∆ DAC the reverse of the above process is performed:

[Figure: interpolation of a Nyquist rate signal sampled at f_n = 10 kHz to a sampling rate of 64 × f_n = 640 kHz by upsampling by 64 and digital low pass filtering, before the Σ∆ DAC and the analog reconstruction filter. Note that the interpolated Nyquist rate or baseband signal will be delayed by the group delay, t_d, of the digital low pass filter (which should be linear phase in the baseband). In practice the low pass filtering and upsampling is done in stages, see Sigma Delta - Interpolation. The number of bits of signal resolution in the final output stage is a function of the order of the Σ∆ converter, and the properties of the low pass filter.]
To use sigma delta converters in a DSP system computing at the Nyquist rate, the following
components are required:
[Figure: using sigma delta converters as part of a DSP system. Analogue input, analogue anti-alias filter, Σ−∆ ADC (1 bit at fovs = R fn), decimation (low pass filter and downsample), 16 bit PCM at fn, DSP, interpolation (upsample and low pass filter), Σ−∆ DAC (1 bit at fovs = R fn), reconstruction filter, analogue output. The analogue anti-alias and reconstruction filters are simple low order filters which match the order of the Σ∆ codec. The DSP processor is running at the Nyquist rate, fn, and interpolation and decimation stages are used to convert the oversampled 1 bit digital signal to a multibit Nyquist rate digital signal.]
See also Decimation, Differentiator, Integrator, Oversampling, Interpolation, Quantisation Noise - Reduction by Oversampling, Sigma Delta - Anti-Alias Filter, Sigma Delta - Decimation Filters, Sigma Delta - Reconstruction Filter.
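The first order loop described above can be simulated in a few lines. The sketch below is one common difference-equation form of the modulator (integrate the difference between the input and the fed back 1 bit output, then quantise with a sign operation); it is illustrative rather than a model of any particular converter:

```python
import numpy as np

def first_order_sigma_delta(x):
    """Sketch of a first order sigma delta modulator: the integrator accumulates
    the difference between the input and the fed back 1 bit output, and the
    quantiser is a simple sign operation.  Input assumed to lie in [-1, 1]."""
    v = 0.0                     # integrator state
    fb = 0.0                    # fed back quantised output
    y = np.empty(len(x))
    for k, xk in enumerate(x):
        v += xk - fb            # integrate the difference signal
        fb = 1.0 if v >= 0 else -1.0
        y[k] = fb               # single bit output stream
    return y

# A slow sine sampled far above its frequency: the local average of the
# 1 bit output tracks the input, and the quantisation noise sits at high frequency.
n = np.arange(4096)
x = 0.5 * np.sin(2 * np.pi * n / 512.0)
bits = first_order_sigma_delta(x)
smoothed = np.convolve(bits, np.ones(32) / 32, mode="same")
print(np.mean(np.abs(smoothed - x)))   # small residual error after low pass filtering
```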
Sigma Delta, Anti-Alias Filter: One of the advantages of using sigma delta converters is that the analogue anti-alias and reconstruction filters are very simple and therefore low cost. Consider a first order sigma delta loop oversampling at 64 times the Nyquist rate, with the quantiser modelled as a white noise source, n(k) (see Sigma Delta), and the input signal of full scale deflection (represented as 0dB) and occupying the entire Nyquist bandwidth:
[Figure: on the left, the linearised first order sigma delta model (integrator, additive noise source n(k) and z^{-1} feedback); on the right, the output spectrum from 0 dB down to -80 dB, showing the baseband at fovs/128, the shaped quantisation noise rising towards fovs/2, and the response of a first order RC anti-alias circuit.]
Using the simple first order sigma delta model (left hand side), the frequency spectra shows
that the quantisation noise is low in the region of the baseband, and the multibit signal
representation can be extracted from the 1 bit signal by digital low pass filtering and
downsampling by 64. To ensure aliasing does not occur, an analog anti-alias filter (first
order RC circuit) removing frequency components above fovs/2 is required.
In order that aliasing does not occur, the analog anti-alias filter must cut off all frequencies above fovs/2. Noting that the digital low pass decimation filter (see Sigma Delta) will filter all frequencies between fovs/2 and fovs/128, the analog anti-alias filter only requires to cut off above fovs/2. The anti-alias filter should be cutting off by at least the baseband resolution of the converter. Therefore, noting that the power roll-off of an RC circuit is 6dB/octave, if the 3dB frequency is placed at fovs/128, then at 64 times this frequency (6 octaves) 36dB of attenuation is produced at fovs/2. Noting that the quantisation noise power is already about 20dB below the 0dB level at fovs/2, a total of 56dB of attenuation is produced.
For a second order sigma delta converter, via a similar argument as above, a second order anti-alias filter is required (noting that the quantisation noise at fovs/2 is now increased due to the enhanced noise shaping). In general for an n-th order sigma delta converter an n-th order anti-alias filter should be used. The same is true for the reconstruction filter used with a sigma delta DAC. See also Oversampling, Sigma Delta.
Sigma Delta Converter: See Sigma Delta.
Sigma Delta, Decimation Filters: Decimation for a sigma delta converter requires that a low
pass filter with a cut off frequency of 1/R-th of the oversampling frequency is implemented, where
R is the oversampling ratio. This filter should also have linear phase in the passband. To implement
a low pass FIR filter with 90dB stopband rejection and a passband of, for example, 1/64 of the
sampling rate ( R = 64 ) would require thousands of filter weights. Clearly this is impractical to
implement. Therefore the low pass filtering and downsampling is often done in stages, using initial
stages of simple comb type filters where all filter coefficients are of value 1 leading to a simple FIR
that requires only additions and no multiplications. After this initial coarse filtering, a sharp cut-off
FIR filter (still of a hundred or more weights) can be used at the final stage:
[Figure: decimation of the output of a 3rd order sigma delta converter. The 1 bit output of the Σ∆ ADC at fovs = 64 fn is first low pass filtered by a digital comb filter and downsampled by 16 to a 12 bit signal at 4fn, then filtered by a sharp cut-off FIR low pass filter running at only 4 times the Nyquist rate and downsampled by 4 to a 16 bit signal at the Nyquist rate, fn. Interpolation for a Σ∆ DAC is the effective reverse of the above process.]
See also Comb Filter, Decimation, Sigma Delta, Sigma Delta - Anti-Alias Filter.
Sigma Delta (Σ−∆) Loop: A term sometimes used to indicate a first order sigma delta converter. The "loop" refers to the feedback from the converter output to an input summation stage. See Sigma Delta.
Sigma Delta, Reconstruction Filter: The order of the reconstruction filter for a sigma delta DAC
should match that of sigma delta order. For details see Sigma Delta - Anti Alias Filter.
Sign Data/Regressor LMS: See Least Mean Squares Algorithm Variants.
Sign Error LMS: See Least Mean Squares Algorithm Variants.
Sign-Sign LMS: See Least Mean Squares Algorithm Variants.
Signal Conditioning: The stage where a signal from a sensor is amplified (or attenuated) and
anti-alias filtered in order that its peak to peak voltage, V Pk to Pk , swing matches the voltage swing
of the A/D converter and so that the signal components are not aliased upon sampling and
conversion. Signals are also conditioned going the opposite way from D/A converter to signal
conditioning amplifier, to actuator.
Signal Flow Graph (SFG): A simple line diagram used to illustrate the operation of an algorithm; particularly the flow of data. Signal flow graphs consist of annotated directed lines and splitting and summing nodes. It is very often easier to represent an algorithm in signal flow graph form than it is to represent it algebraically. See, for example, the Fast Fourier Transform signal flow graph. Below a z-domain signal flow graph is illustrated for a 4 tap FIR filter.
[Figure: signal flow graph for a 4 tap FIR filter. The input x(n) passes through three z^{-1} delay elements, the four tap signals are weighted by w0, w1, w2 and w3, and combined at summing nodes to give y(n).]
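The signal flow graph above maps directly onto a few lines of code; the sketch below implements the 4 tap direct form with an explicit delay line (illustrative weights):

```python
def fir4(x, w):
    """Direct form 4 tap FIR filter matching the signal flow graph above:
    y(n) = w0*x(n) + w1*x(n-1) + w2*x(n-2) + w3*x(n-3)."""
    delay = [0.0, 0.0, 0.0]               # the three z^-1 elements
    y = []
    for xn in x:
        taps = [xn] + delay
        y.append(sum(wi * xi for wi, xi in zip(w, taps)))
        delay = taps[:-1]                 # shift the delay line
    return y

print(fir4([1, 0, 0, 0, 0], [0.5, 0.25, 0.125, 0.0625]))
# impulse response: [0.5, 0.25, 0.125, 0.0625, 0.0]
```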
Signal Primitives: See Elementary Signals.
Signal Space: Signal space is a convenient tool for representing signals (or symbols) used for
encoding information to be sent over a channel. The signal space approach to digital
communication systems exploits the fact that a finite number of signals can be represented as
points (or vectors) in a finite dimensional vector space. This vector space representation allows
convenient matrix-vector notation (linear algebra) to be used in the design and analysis of these
systems. See also Vector Space, Matrix.
Signal to Interference plus Noise Ratio (SINR): The ratio of the signal power to the interference
power plus the noise power. Used especially in systems that experience significant interference
components (e.g., intentional jamming) in addition to additive noise.
Signal to Noise Ratio (SNR, S/N): The ratio of the power of a signal to the power of
contaminating (and unwanted) noise. Clearly a very high SNR is desirable in most systems. SNR is
usually given in dB and calculated from:

SNR = 10 log10 ( Signal Power ⁄ Noise Power ) dB      (540)
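Equation (540) can be evaluated directly from sample data. A minimal sketch (Python/NumPy, assuming the clean signal and the noise are available separately, which is rarely the case in practice):

import numpy as np

def snr_db(signal, noise):
    # SNR = 10*log10(signal power / noise power), powers taken as mean squared values
    p_signal = np.mean(np.asarray(signal, dtype=float) ** 2)
    p_noise = np.mean(np.asarray(noise, dtype=float) ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

# A unit amplitude sine in noise of standard deviation 0.1 gives roughly 17 dB
t = np.arange(1000) / 1000.0
print(snr_db(np.sin(2 * np.pi * 50 * t), 0.1 * np.random.randn(1000)))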
Simplex: Pertaining to the ability to send data in one direction only. See also Full Duplex, Half
Duplex.
Similarity Transform: See Matrix Decompositions - Similarity Transform.
Simultaneous Masking: See Spectral Masking.
Sinc Function: The sinc function is widely used in signal processing and is usually denoted as:

sinc ( x ) = sin ( x ) ⁄ x      (541)
The sinc function has a main lobe of unit height at x = 0 and decaying sidelobes on either side, with zero crossings at nonzero integer multiples of π. [Figure: plot of sin(x)/x for −5π ≤ x ≤ 5π.]
The logarithmic magnitude sinc function, 20 log |sin(x)/x| (which is symmetric about the y-axis), can also be plotted. [Figure: magnitude in dB, from 0 dB down to −60 dB, over the range 0 ≤ x ≤ 8π.]
Note that the first sidelobe peak occurs at approximately −13 dB (and at about −7 dB if the
function 10 log |sin x ⁄ x| is plotted).
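When evaluating the sinc function numerically, the point x = 0 must be treated as the limit value 1. A minimal sketch (Python/NumPy; note that NumPy's built in np.sinc uses the normalised definition sin(πx)/(πx), so the unnormalised form of (541) is written out explicitly here):

import numpy as np

def sinc_unnormalised(x):
    # sinc(x) = sin(x)/x, with the removable singularity at x = 0 set to 1
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    nonzero = x != 0.0
    out[nonzero] = np.sin(x[nonzero]) / x[nonzero]
    return out

x = np.linspace(-5 * np.pi, 5 * np.pi, 1001)
y = sinc_unnormalised(x)
print(20 * np.log10(np.max(np.abs(y[x > np.pi]))))   # first sidelobe, about -13 dB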
Singular Value: See Matrix Decompositions - Singular Value.
Sine Wave: A sine wave (occurring with respect to time) can be written as:
x ( t ) = A sin ( 2πft + φ )
(542)
where A is the signal amplitude; f is the frequency in Hertz; φ is the phase and t is time.
[Figure: one period (1/f) of the sine wave plotted as voltage against time t, with peak amplitude A and initial value A sin(φ) at t = 0.]
Sine Wave Generation: See Dual Tone Multifrequency - Tone Generation.
Single Cycle Execution: Many DSP processors can perform a full precision multiplication (e.g.,
16 bit integer, or 32 bit floating point with 24 bit mantissa and 8 bit exponent) and accumulate (MAC) operation (a·b + c)
in a single cycle of the clock used to control the DSP processor. See DSP Processor, Parallel
Multiplier.
Single Pole: If the input-output transfer function of a circuit has only one pole (in the s-domain),
then it is often referred to as a single pole circuit. The magnitude frequency plot of a single pole circuit
will roll-off at 20dB/decade (6dB/octave). An RC circuit is a simple single pole circuit. See also
Active Filter, RC Circuit.
Singular Matrix: See Matrix Properties - Singular.
Slope Overload: If the step size is too small when delta modulating a digital signal, then slope
overload will occur resulting in a large error between the coded signal and the original signal. Slope
overload can be corrected by increasing the sampling frequency, or increasing the delta (∆) step
size, although the latter may lead to granularity effects. See also Delta Modulation, Granularity
Effects.

[Figure: a steeply rising signal x(n) which the delta modulator staircase cannot track, producing a slope overload error that grows with time.]
Snap-In Digital Filter: The name used to mean a digital filter that can easily be introduced
between the analog front end (A/Ds) and the user interface (the PC screen). A term introduced by
Hyperception Inc.
Solenoid: A device that converts electro-magnetic energy into physical displacement.
Sones: A sone is a subjective measure of loudness which relates the logarithmic response of the
human ear to SPL. One sone is the level of loudness experienced by listening to a sound of 40
phon. A measure of 2 sones will be twice as loud, and 0.5 sones will be half as loud and so on. See
also Phons, Sensation Level, Sound Pressure Level.
Sound: Sound is derived from vibrations which cause the propagating medium’s particles (usually
air) to alternately rarefy and compress. For DSP purposes sound can be sensed by a microphone
and the electrical output sent to an analog to digital converter (ADC) for input to a DSP processor.
Sound can be reproduced in a DSP system using a loudspeaker.
When the loudspeaker below produces a tone the compression and rarefaction of air particles
occurs in all directions of sound propagation. For illustrative purposes only the compression and
rarefaction in one direction is shown:
[Figure: a loudspeaker producing alternate compression and rarefaction of the air “particles” along the direction of sound propagation; the resulting sound pressure level varies periodically with time.]
Sound waves are longitudinal -- meaning that the wave fluctuations occur in the direction of
propagation of the wave. As a point of comparison, electromagnetic waves are transverse -- meaning
the variation occurs perpendicular to the direction of propagation. Hence subtle differences exist
between modelling acoustic wave propagation and electromagnetic wave propagation. For example,
there is no polarization phenomenon for acoustic waves. See also Audio,
Microphone, Loudspeaker, Sound Pressure Level, Speed of Sound.
Sound Exposure Meters: For persons subjected to noise at the workplace, a sound exposure
meter can be worn which will average the “total” sound they are exposed to in a day, and the
measurement can then be compared with national safety standards [46].
Sound Intensity: Sound intensity is a measure of the power of a sound over a given area. The ear
of a healthy young person can hear sounds between frequencies of around 1000 - 3000Hz at
intensities as low as 10^-12 W/m^2 (the threshold of hearing) and as high as 1 W/m^2 (just below the
threshold of pain). Because the human ear has a linear dynamic range of intensity of almost
1,000,000,000,000, absolute sound intensity is rarely quoted. Instead a logarithmic measure called
sound pressure level (SPL) is calculated by measuring the sound intensity relative to a reference
intensity of 10^-12 W/m^2:
SPL = 10 log10 ( I ⁄ I_ref ) dB      (543)
See also Audiology, Equal Loudness Contours, Infrasound, Sound Pressure Level, Sound
Pressure Level Weighting Curves, Threshold of Hearing, Ultrasound.
Sound Intensity Meter: A sound intensity meter will use two or more identical microphones in
order that simple beamforming techniques can be performed in an attempt to resolve the direction
(as well as magnitude) of a noise. This can be important in noisy environments where there are
several noise sources close together rather than a single noise source. Typically a sound intensity
meter will consist of two precision microphones with very similar performance which are mounted
a fixed distance apart. The sound intensity meter measures both the amplitude and relative phase
and then calculates the noise amplitude and direction of arrival. By dividing the frequency analysis
into bands, multiple sources at different frequencies and from different directions can be identified.
Sound intensity meters usually measure noise over one third octave frequency bands. Sound
intensity meters correspond to standard IEC 1043:1993. See also Sound Intensity, Sound Pressure
Level, Sound Pressure Level Weighting Curves [46].
Sound Level Units: There are a number of different units by which sound level can be expressed.
The human ear can hear sounds at pressures as low as 2 × 10^-5 N/m^2 (approximately the threshold of
hearing for a 1000Hz tone). Sound level can also be measured as sound intensities which specify
dissipated power over area, rather than as a pressure; 2 × 10^-5 N/m^2 is equivalent to 10^-12 W/m^2.
Because of the very large dynamic range of the human ear, most sound level units and related
measurements are given on a logarithmic dB scale. See also Audiometry, Equivalent Sound
Continuous Level, Hearing Level, Phons, Sones, Sound, Sound Exposure Meters, Sound Intensity,
Sound Intensity Meter, Sensation Level, Sound Pressure Level, Sound Pressure Level Weighting
Curves, Threshold of Hearing.
Sound Pressure Level (SPL): Sound Pressure Level (SPL) is specified in decibels (dB) and is
calculated as the logarithm of a ratio:
SPL = 10 log10 ( I ⁄ I_ref ) dB      (544)

where I is the sound intensity measured in Watts per square meter (W/m^2) and I_ref is the reference
intensity of 10^-12 W/m^2 which is the approximate lower threshold of hearing for a tone at 1000Hz.
Alternatively (and more intuitively given the name sound “pressure” level) SPL can be expressed
as a ratio of a measured sound pressure relative to a reference pressure, P_ref, of 2 × 10^-5 N/m^2 =
20 µPa:

SPL = 10 log10 ( I ⁄ I_ref ) = 10 log10 ( P^2 ⁄ P_ref^2 ) = 20 log10 ( P ⁄ P_ref ) dB      (545)

Intensity is proportional to the squared pressure, i.e. I ∝ P^2      (546)
A logarithmic measure is used for sound because the human ear has a linear dynamic range of
intensity of more than 10^12, and because of the logarithmic nature of hearing. Due to the nature of
hearing, a 6dB increase in sound pressure level is not necessarily perceived as twice as loud. (See
entry for Sones.)
Some approximate example SPLs are:
SPL (dB)    Intensity ratio I/I_ref    Pressure ratio P/P_ref    Example Sound
120         10^12                      10^6                      Gun-fire (pain threshold)
100         10^10                      10^5                      The Rolling Stones
80          10^8                       10^4                      Noisy lecture theatre
60          10^6                       10^3                      Normal conversation
40          10^4                       10^2                      Low murmur in the countryside
20          10^2                       10^1                      Quiet recording studio
0           1                          1                         Threshold of human hearing
-10         10^-1 = 0.1                10^-1/2 = 0.316           The noise of a nearby spider walking

Table 2: Approximate example sound pressure levels.
It is worth noting that standard atmospheric pressure is around 101300 N/m^2 and the pressure
exerted by a very small insect’s legs is around 10 N/m^2. Therefore the ear and other sound
measuring devices are measuring extremely small variations in pressure. See also Audiology,
Audiometry, Equivalent Sound Continuous Level, Hearing Level, Sones, Sound Intensity, Sound
Pressure Level, Sound Pressure Level Weighting Curves, Threshold of Hearing.
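A minimal sketch of converting a measured intensity or rms pressure into SPL using equations (544) and (545) (Python; the example values are illustrative):

import math

I_REF = 1e-12      # reference intensity, W/m^2
P_REF = 2e-5       # reference pressure, N/m^2 (20 micropascals)

def spl_from_intensity(i):
    # SPL = 10*log10(I / I_ref) dB, as in equation (544)
    return 10.0 * math.log10(i / I_REF)

def spl_from_pressure(p_rms):
    # SPL = 20*log10(P / P_ref) dB, as in equation (545)
    return 20.0 * math.log10(p_rms / P_REF)

print(spl_from_pressure(2e-5))   # 0 dB, the threshold of hearing
print(spl_from_pressure(0.02))   # 60 dB, roughly normal conversation
print(spl_from_intensity(1.0))   # 120 dB, around the threshold of pain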
Sound Pressure Level (SPL) Weighting Curves: Because the human ear does not perceive all
frequencies of the same SPL with the same loudness, a number of SPL weighting scales were
introduced. The most common is the A weighting curve (based on the average threshold of hearing)
which attempts to measure acoustic signals in the same way that the ear perceives them. Sound
pressure level measurements made using the A-weighting curve are indicated as dB(A) or dBA,
although the use of this weighting is so widespread in SPL meters measuring environmental noise,
that the A is often omitted. Sounds above 0dB(A) over the frequency range 20-16000Hz are “likely”
to be perceptible by humans with unimpaired hearing. As an example of using the weighting curve,
a 100Hz tone with SPL of 100dB(SPL) will register about 78dB(A) on the A-weighting scale and can
be “loosely” interpreted as being 88dB above the threshold of hearing at 100Hz from the figure
below.
Other less commonly used weighting curves are denoted as B, C and D. Standard weighting curves
can be found in IEC 651: 1979, BS 5969: 1981, and ANSI S1.4-1983.
See also Audiogram, Audiology, Hearing Level, Permanent Threshold Shift, Psychoacoustics,
Sound Pressure Level, Spectral Masking, Temporal Masking, Threshold of Hearing.
[Figure: Approximate sound pressure level weighting curves A, B, C and D, plotted as weighting (dB, from +20 down to −80) against frequency from 20 Hz to 20 kHz.]
Source Coding: This refers to the coding of data bits to reduce the bit rate required to represent
an information source (i.e., a bit stream). While channel coding introduces structured redundancy
to allow correction and detection of channel induced errors, source coding attempts to reduce the
natural redundancy present in any information source. The lower limit for source coding (without
loss of information) is set by the entropy of the source. See also Channel Coding, Huffman Coding,
Entropy, Entropy Coding.
Source Localization: See Localization.
Space: See Vector Properties - Space.
Space, Vector: See Vector Properties and Definitions - Space.
Span of Vectors: See Vector Properties and Definitions - Span.
Sparse Matrix: See Matrix Structured - Sparse.
Spatial Filtering: Digital filters can be used to separate signals with non-overlapping spectra in the
frequency domain. A DSP system can also be set up to separate signals arriving from different
spatial locations (or directions) with an array of sensors. This process is referred to as spatial
filtering. See Beamforming, Beampattern.
[Figure: a microphone array feeding a DSP system; the speaker of interest in the listener’s look direction is preserved while competing speakers are cancelled.]
The DSP system identifies the broadside (head-on) waveform and attempts to null out the
interfering signal from the oblique angles to produce a spatially filtered signal which is sent to
an amplifier and a small loudspeaker in the listener’s ear.
Spectral Analysis: Methods for finding the frequency content of signals, usually using the FFT
and variants.
Spectral Decomposition: See Matrix Decompositions - Spectral Decomposition
Spectral Leakage: When a segment of data is transformed into the frequency domain using the
FFT (or DFT), there will be discontinuities at the start and end of the data window unless the window
spans an integral number of periods of the waveform (this is rarely the case). The discontinuities
will manifest themselves in the frequency domain as sidelobes around the main peaks. Spectral
leakage can be reduced (at the expense of wider peaks) by smoothing windows such as the
Hanning, Hamming, Blackman-Harris, Harris, von Hann and so on. See also Discrete Fourier
Transform - Spectral Leakage, Windows, Sidelobes.
Spectral Masking: Spectral masking refers to the situation where a very loud audio signal in a
certain frequency band drowns out a quieter signal of similar frequencies. A very stark example of
spectral masking is where a conversation is rendered inaudible if standing next to a revving jet
engine! Spectral masking is very often referred to simply as masking.
Spectral masking also has more subtle and quantifiable effects whereby the presence of a signal
causes the threshold of hearing of signals with a similar frequency to increase [30], [52]. For
example if a narrowband of noise of approximately 100Hz bandwidth and centered at 500Hz is
played to a listener at various different sound pressure levels, the threshold of hearing around
500Hz is raised:
[Figure: four panels showing the approximate threshold of hearing (SPL in dB against frequency, 20 Hz to 10 kHz) when a 450-550 Hz band of noise is played at 20 dB, 40 dB, 60 dB and 80 dB SPL; in each panel the threshold of hearing around 500 Hz is raised. The louder the level of the narrowband noise, the more pronounced is the masking effect on nearby frequencies.]
The higher the SPL, the more the threshold of hearing of nearby frequencies will be raised, i.e. the
more pronounced the masking effect is. In the above example when the 500Hz narrowband noise
is at a level of 80dB then the 1000Hz tone at 20dB is inaudible to the human ear. In general the
effect of masking is more pronounced for frequencies above the masking level. For the above
example of narrowband noise, at 80dB SPL the masking effect at frequencies above 500Hz almost
stretches a full octave falling off at around 60dB/octave, whereas for frequencies below 500Hz the
masking effect falls off at around 120dB/octave.
The bandwidth of the masking level is higher for high frequencies. For example below 500Hz the
masking level bandwidth is less than 100Hz, whereas for 10-15kHz, the bandwidth of the masking
level is around 4kHz:
[Figure: two panels of the threshold of hearing (SPL in dB against frequency, 20 Hz to 10 kHz), one showing the raised threshold around a low frequency narrowband masker and one showing the raised threshold around a 5000 Hz masker with a masking level bandwidth of roughly 4000 Hz. The masking bandwidth is larger for higher frequencies: for the low frequency narrowband noise the masking bandwidth is less than 100 Hz, whereas for the narrowband noise at 5000 Hz the masking bandwidth is around 4000 Hz.]
The auditory effects of spectral masking are the basis for signal compression techniques such as
precision adaptive subband coding (PASC). See also Auditory Filters, Equal Loudness Contours,
Psychoacoustic subband coding (PASC), Temporal Masking, Threshold of Hearing.
Spectrogram: A 2-D plot with time on the x-axis, and frequency on the y-axis. The magnitude at
a particular frequency and a particular time on the spectrogram is indicated by a color (or grey
scale) contour map. Widely used in speech processing.
Speech Compression: Using DSP algorithms and techniques to reduce the bit rate of speech for
transmission or storage. Algorithms in wide use for communications related applications (usually
speech sampled at 8kHz and 8 bit samples) that have been standardized include, LPC10, CELP,
MRELP, CVSD, VSELP and so on.
Speech Immunity: Dual tone multifrequency receivers must be able to discriminate between tone
pairs, and speech or other stray signals that may be present on the telephone line. The capacity of
a circuit to discriminate between DTMF and other signals is often referred to as the speech
immunity. See also Dual Tone Multifrequency.
Speech Processing: The use of DSP for speech coding, synthesis, or speech recognition.
Speech synthesis research is more advanced, whereas speech recognition and natural language
understanding continue to be a very large area of research.
Speech Recognition: Using DSP to interpret human speech and convert it into text or to trigger
particular control functions (e.g. open, close and so on).
Speech Shaped Noise: If a random noise signal has similar spectral characteristics to a speech
signal this may be referred to as speech shaped noise. Speech noise is unlikely to be intelligible
and would be mainly used for DSP system testing and benchmarking. Speech shaped noise is also
used in audiometry.
Speech Synthesis: The process of using DSP for synthesizing human speech. A simple method
is to digitally record a dictionary of a few thousand commonly used words and cascade them
together to form a desired sentence. This rudimentary form of synthesis will have no intonation and
be rather difficult to listen to and understand for long messages. It will also require a large amount
of memory. True speech synthesizers can be set up with a set of formant filters, fricative formant
and nasal unit and associated control algorithms (for context analysis etc.).
Speed of Sound: The speed of sound in air is nominally taken as being 330m/s. In actual fact,
depending on the actual air pressure and temperature this speed will vary up and down. More
generally the speed of sound will depend on the solid, liquid or gas in which it is travelling. Some
typical values for the speed of sound are:
Substance        Approximate Speed of Sound (m/s)
Air at -10°C     325
Air at 0°C       330
Air at 10°C      337
Air at 20°C      343
Water            1500
Steel            5000-7000
Wood             3000-4000

Table 3: Approximate speed of sound in various substances.
See also Absorption, Sound, Sound Pressure Level.
SPOX: A signal processing operating system and the associated library of functions.
Spread Spectrum: Spread spectrum is a communication technique whereby the bandwidth of the
modulated signal to be transmitted is deliberately increased, and thereafter reduced again at the
receiver [9], [16].
Square Matrix: See Matrix Structured - Square.
Square Root: The square root is a rare operation in real time DSP as most compression, digital
filtering, and frequency transformation type algorithms require only multiply-accumulates with the
occasional divide. Square roots are, however, found in some image processing routines (rotation
etc) and in DSP algorithms such as QR decomposition. General purpose DSP processors do not
perform square roots in a single cycle, as they do for multiplication, and successive approximation
techniques are usually used. Consider the following iterative technique to calculate √a:

x_{n+1} = (1 ⁄ 2) ( x_n + a ⁄ x_n )      (547)
Using an initial guess, x0, as a/2 the algorithm converges asymptotically. The algorithm is often said
to have converged when a specified error quantity is less than a particular value.
Finding the square root of a = 15, using the iterative update x_{n+1} = (1 ⁄ 2)(x_n + a ⁄ x_n) (548): after only 6 iterations the algorithm has converged to within 0.03 of the correct solution.
[Figure: the variable x_n plotted against iteration number n, falling from the initial guess x_0 = a/2 = 7.5 towards √15 ≈ 3.873 within six iterations.]
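A minimal sketch of the iteration (547)/(548) in Python, using the same a = 15 and initial guess a/2 as above; the tolerance and iteration limit are arbitrary choices:

def iterative_sqrt(a, tol=1e-6, max_iter=50):
    # Successive approximation of sqrt(a) via x_{n+1} = 0.5*(x_n + a/x_n)
    x = a / 2.0                       # initial guess x0 = a/2
    for _ in range(max_iter):
        x_next = 0.5 * (x + a / x)
        if abs(x_next - x) < tol:     # declare convergence when the update is small
            return x_next
        x = x_next
    return x

print(iterative_sqrt(15.0))   # approximately 3.8730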
Square Root Decomposition: See Matrix Decompositions - Cholesky.
Square Root Free Given’s Rotations: See Matrix Decompositions - Square Root Free Given’s
Rotations.
Square Root Matrix: See Matrix Properties - Square Root Matrix.
Square System of Equations: See Matrix Properties - Square System of Equations.
Square Wave: A sequence of periodic rectangular pulses. See Rectangular Pulse.
Stability: If an algorithm in a DSP processor is stable then it is producing bounded and perhaps
useful output results from the applied inputs. If an algorithm or system is not stable then it is
exhibiting instability and outputs are likely to be oscillating. See Instability.
Stand-Alone DSP: Most DSP application programs are developed on DSP boards hosted by IBM
PCs. After development of, for example, a DSP music effects box, the system will be stand-alone
as it is no longer hosted by a PC.
Standards: Technology standards are agreed definitions, usually at the international level, which
allow the compatibility, reliable operation and interoperability of systems. With relevance to DSP
there are various standards on telecommunications, radiocommunications, and information
technology, most notably from the ISO, ITU and ETSI.
See also Bell 103/113, Bell 202, Bell 212, Bento, Blue Book, Comité Européen de Normalisation
Electrotechnique, Digital Video Interactive, European Broadcast Union, European Telecommunications
Standards Institute, F-Series Recommendations, G-Series Recommendations, Global Information
Infrastructure, Graphic Interchange Format, H-Series Recommendations, HyTime, I-Series
Recommendations, IEEE Standard 754, Image Interchange Facility, Integrated Digital Services Network,
International Electrotechnical Commission, International Mobile (Maritime) Satellite Organization,
International Organisation for Standards, International Telecommunication Union, ITU-R
Recommendations, ITU-T Recommendations, J-Series Recommendations, Joint Binary Image Group,
Joint Photographic Experts Group, Moving Picture Experts Group, Multimedia and Hypermedia
Information Coding Experts Group, Multipurpose Internet Mail Extensions, Multimedia Standards, Red
Book, Resource Interchange File Format, T-Series Recommendations, V-Series Recommendations,
X-Series Recommendations.
Static Random Access Memory (SRAM): Digital memory which can be read from or written to.
SRAM does not need to be refreshed as does DRAM. See also Dynamic RAM.
Statistical Averages: See Expected Value.
Stationarity: See Strict Sense Stationary, Wide Sense Stationarity.
Status Register (SR): See Condition Code Register.
Step Reconstruction: See Zero Order Hold.
Step Size Parameter: Most adaptive algorithms require small steps while changing filter weights,
parameters or signals being estimated. The size of this step is often a parameter of the algorithm
called the step size (or the adaptive step size). As an example, the step size in the LMS (Least Mean
Squares) algorithm is almost always denoted by µ. The larger µ, the larger the adaptive increments
taken by the processor with each update. Haykin 1991, suggests a normalized LMS step size
parameter, α, that is equal to µ normalized by the power of the input signal. This allows appropriate
comparison of adaptive LMS processors operating with different input signals. The step size
parameter can also vary with time -- this “variable step size” often allows adaptive algorithms to
achieve faster convergence times and lower overall misadjustment simultaneously. See also
Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean Squares Algorithm
Variants - Variable Step Size LMS.
Stereo: Within DSP systems stereo has come to mean a system with two input channels and/or
two output channels. See also Dual, Stereophonic.
Stereophonic: This refers to a system that has two independent audio channels. See also
Monaural, Monophonic, Binaural.
Stochastic Conversion: If an ADC with only single bit resolution producing two levels of -1 and
+1 is used, then this is often referred to as stochastic conversion. See also Analog to Digital
Conversion, Dithering.
Stochastic Process: A stochastic process is a random process. Random signals are good
examples of stochastic processes. A number of measurements are associated with stochastic
signals, such as mean, variance, autocorrelation and so on. Signals such as short speech
segments can be described as stochastic.
Stopband: The range of frequencies that are heavily attenuated by a filter. See also Passband.
Strict Sense Stationary: A random process is strict sense stationary if it has a time invariant
mean, variance, 3rd order moment and so on. For most stochastic signals, strict stationarity is
unlikely (or difficult to show) and not (usually) a necessary criteria for analysis, modelling, etc.
Usually wide sense stationarity will suffice. When texts or papers refer to a stationary process they
almost always are referring to stationary in the wide sense unless explicitly stating otherwise. For
DSP, particularly least mean squares type algorithms, the looser criterion of wide sense stationarity
is referred to. Strict sense stationarity implies wide sense stationarity, but the reverse is not
necessarily true. A wide sense stationary Gaussian process, however, is also strict sense
stationary. See also Wide Sense Stationarity.
Subband Filtering: A technique where a signal is split into subbands and DSP algorithms are
applied (usually independently) to each subband [49]. When a signal is split into subbands the
sampling rate can be reduced, and very often the PCM resolution can be reduced. See also
Precision Adaptive Subband Coding.
Subband Coding: A technique whereby a signal is filtered into frequency bands which are then
coded using fewer bits than for the original wideband signal. Good sub-band coding schemes exist
for signal compression that exploit psychoacoustic perception. See also Precision Adaptive
Subband Coding.
Sub-Harmonic: For a given fundamental frequency produced by, for example, a vibrating string,
the frequencies of the harmonics are integer multiples of the fundamental frequency, and the
frequencies of the sub-harmonics are obtained by dividing the fundamental frequency by integers.
See also Fundamental Frequency, Harmonic, Music.
[Figure: magnitude spectrum showing a sub-harmonic at f0/2, the fundamental frequency at f0, and harmonics at 2f0, 3f0 and 4f0. The frequency domain representation of a fundamental frequency signal with its associated harmonics and sub-harmonics.]
Subspace: See Vector Properties and Definitions - Subspace.
Subspace, Vector: See Vector Properties and Definitions - Subspace.
Subtractive Synthesis: Traditional analogue technique of synthesizing music starting with a
signal that contains all possible harmonics of a fundamental. Thereafter harmonic elements can be
filtered out (i.e. subtracted) in order to produce the desired sound [32]. See also Music, Western
Music Scale.
Successive Approximation: A type of A/D converter which converts from analog voltage to digital
values using an approximation technique based on a D/A converter.
Super Bit Mapping (SBM): SBM (a trademark of Sony) is a noise shaping FIR filter algorithm
developed by Sony for mastering of compact disks from 20 bit master sources. It is essentially a
noise shaping FIR filter of order 12 which produces a high pass noise shaping curve.
Surround Sound: A number of systems have been developed to create the impression that sound
is spread over a wide area with the listener standing in the centre. DSP techniques are widely used
to create artificial echo and reverberation to simulate the acoustics of stadiums and theatres. Dolby
Surround Sound is widely used on the soundtracks of many major film releases. To be truly effective
the sound should be coming from 360° with loudspeakers placed at the front and back of the
listener.
Sustain: See Attack-Decay-Sustain-Release.
Switch: A device with (typically) two states, e.g. off and on; high or low etc. Also a means of
connecting/disconnecting two systems.
Symbol: In a digital communications system the transmission and reception of information occurs
in discrete chunks. The symbol is the signal (one from a finite set) transmitted over the channel
during the symbol period. The receiver detects which of the finite set of symbols was sent during
each symbol period. The message is recovered by the decoding of the received symbol stream.
The packaging of the message into discrete symbols sent over regular intervals forms the
fundamental basis of any digital communication system. See also Digital Communications,
Message, Symbol Period.
Symbol Period: In a digital communication system, the symbol period defines the regular time
interval over which symbols are transmitted. During a symbol period exactly one of a finite number
of signals is transmitted over the communications channel. Accurate knowledge of when this
period begins and ends (synchronization) is required at the receiver in a communications system.
See also Symbol, Digital Communications.
Symmetric Matrix: See Matrix Structured - Symmetric.
Synchronous: Meaning a system in which all transitions are regulated by a synchronizing clock.
System Identification: Using adaptive filtering techniques, an unknown filter or plant can be
identified. In an adaptive system identification architecture, when the error, ε(k) has adapted to a
minimum value (ideally zero) then, in some sense, y ( k ) ≈ d ( k ) , and therefore the transfer function
of the adaptive filter is now similar to, or the same as, the unknown filter or system. An example
application of system identification would be to identify the transfer function of the acoustics of a
room. See also Adaptive Filtering, Inverse System Identification, LMS algorithm, Active Noise
Cancellation.
[Figure: Generic adaptive signal processing system identification architecture. The input x(k) drives both the unknown system, producing d(k), and the adaptive filter, producing y(k); the error ε(k) = d(k) − y(k) is fed to the adaptive algorithm which updates the adaptive filter.]
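A minimal sketch of the architecture in the figure, using an LMS adaptive FIR filter to identify an "unknown" FIR system (Python/NumPy; the unknown system coefficients, filter length and step size µ are arbitrary illustrative choices):

import numpy as np

def lms_identify(x, d, taps, mu):
    # LMS system identification: adapt w so that y(k) approximates d(k)
    w = np.zeros(taps)
    x_vec = np.zeros(taps)                 # tapped delay line of recent inputs
    for k in range(len(x)):
        x_vec[1:] = x_vec[:-1]
        x_vec[0] = x[k]
        y = np.dot(w, x_vec)               # adaptive filter output y(k)
        e = d[k] - y                       # error epsilon(k) = d(k) - y(k)
        w += 2 * mu * e * x_vec            # LMS weight update
    return w

unknown = np.array([0.5, -0.3, 0.1])       # stand-in for the unknown system
x = np.random.randn(5000)
d = np.convolve(x, unknown)[:len(x)]       # unknown system output d(k)
print(lms_identify(x, d, taps=3, mu=0.01)) # converges close to [0.5, -0.3, 0.1]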
Systolic arrays: A generic name for a DSP system that consists of a large number of very simple
processors interconnected to solve larger problems [25].
T
T-Series Recommendations: The T-series telecommunication recommendations from the
International Telecommunication Union (ITU) telecommunications committee (denoted ITU-T and
formerly known as CCITT) provide standards for terminal characteristics and protocols for
telematic services and document transmission architecture. Some of the current recommendations
(http://www.itu.ch) include:
T.0       Classification of facsimile apparatus for document transmission over the public networks.
T.1       Standardization of phototelegraph apparatus.
T.2       Standardization of Group 1 facsimile apparatus for document transmission.
T.3       Standardization of Group 2 facsimile apparatus for document transmission.
T.4       Standardization of Group 3 facsimile apparatus for document transmission (+ amendment).
T.6       Facsimile coding schemes and coding control functions for Group 4 facsimile apparatus.
T.10      Document facsimile transmissions over leased telephone-type circuits.
T.10 bis  Document facsimile transmissions in the general switched telephone network.
T.11      Phototelegraph transmissions on telephone-type circuit.
T.12      Range of phototelegraph transmissions on a telephone-type circuit.
T.15      Phototelegraph transmission over combined radio and metallic circuits.
T.22      Standardized test charts for document facsimile transmissions.
T.23      Standardized colour test chart for document facsimile transmissions.
T.30      Procedures for document facsimile transmission in the general switched telephone network (+ amendment).
T.35      Procedure for the allocation of CCITT defined codes for non-standard facilities.
T.42      Continuous colour representation method for facsimile.
T.50      Information technology - 7-bit coded character set for information interchange.
T.51      Latin based coded character sets for telematic services.
T.53      Character coded control functions for telematic services.
T.60      Terminal equipment for use in the teletext service.
T.62bis   Control procedures for teletext and G4 facsimile services based on X.215 and X.225.
T.64      Conformance testing procedures for the teletext.
T.65      Applicability of telematic protocols and terminal characteristics to computerized communication terminals (CCTs).
T.70      Network-independent basic transport service for the telematic services.
T.71      Link Access Protocol Balanced (LAPB) extended for half-duplex physical level facility.
T.80      Common components for image compression and communication - Basic principles.
T.81      Information technology; digital compression and coding of continuous-tone still images; requirements and guidelines.
T.82      Information technology - Coded representation of picture and audio information; progressive bi-level image compression (+ T.82 Correction 1).
T.83      Information technology - digital compression and coding of continuous-tone still images: compliance testing.
T.90      Characteristics and protocols for terminals for telematic services in ISDN (+ amendment).
T.100     International information exchange for interactive Videotex.
T.102     Syntax-based videotex end-to-end protocols for the circuit mode ISDN.
T.103     Syntax-based videotex end-to-end protocols for the packet mode ISDN.
T.104     Packet mode access for syntax-based videotex via PSTN.
T.105     Syntax-based videotex application layer protocol.
T.106     Framework of videotex terminal protocols.
T.122     Multipoint communication service for audiographics and audiovisual conferencing service definition.
T.123     Protocol stacks for audiographic and audiovisual teleconference applications.
T.125     Multipoint communication service protocol specification.
T.351     Imaging process of character information on facsimile apparatus.
T.390     Teletext requirements for interworking with the telex service.
T.400     Introduction to document architecture, transfer and manipulation.
T.41X/T.42X  Information technology - Open document architecture (ODA) and interchange format.
T.431     Document transfer and manipulation (DTAM) - Services and protocols - Introduction and general principles.
T.432     Document transfer and manipulation (DTAM) services and protocols - Service definition.
T.433     Document Transfer, Access and Manipulation (DTAM) - Services and protocols - Protocol specification.
T.434     Binary file transfer format for the telematic services.
T.441     Document transfer and manipulation (DTAM) - Operational structure.
T.50X     Document application profile for the interchange of various documents.
T.510     General overview of the T.510-series.
T.521     Communication application profile BT0 for document bulk transfer based on the session service.
T.522     Communication application profile BT1 for document bulk transfer.
T.523     Communication application profile DM-1 for videotex interworking.
T.541     Operational application profile for videotex interworking.
T.561     Terminal characteristics for mixed mode (MM) of operation.
T.562     Terminal characteristics for teletext processable mode (PM.1).
T.563     Terminal characteristics for Group 4 facsimile apparatus.
T.564     Gateway characteristics for videotex interworking.
T.571     Terminal characters for the telematic file transfer within teletext service.
T.611     Programming communication interface (PCI) APPLI/COM for facsimile Group 3, facsimile Group 4, teletext, telex, e-mail and file transfer services.
For additional detail consult the appropriate standard document or contact the ITU. See also ITU-T Recommendations, International Telecommunication Union, Standards.
Tactile Perception: Sounds below 20Hz (infrasonic or infrasound) cannot be heard by most
humans, however this low frequency infrasound can be felt tactilely. Some pipe organs can play
notes lower than 20Hz which can enhance the overall appreciation of the rest of the music in the
audible range.
Tap: The name given to a data line corresponding to a delayed version of the input signal. A tapped
delay line has several points (i.e., taps) where delayed input samples are multiplied by the individual
weights of a digital filter. The number of taps in a digital filter is equal to the number of weights or
coefficients. For example, a particular FIR may be described as having 32 taps or 32 coefficients.
The terms taps and weights (or coefficients) are used interchangeably -- this usage is imprecise,
but we usually “know what is meant.” See also FIR filter, IIR filter, Adaptive Filter.
Tape Speed: See Cassette Tape.
Tempco: See Temperature coefficient.
Temperature Coefficient: The temperature coefficient gives a measure of the voltage (or current)
drift of a component with respect to temperature change. For example, if a particular 20 bit ADC
(range of 2^20 = 1,048,576 levels) had a temperature coefficient of 1 ppm/°C, then this means that for a
change in temperature of 1°C, the output of the ADC would drift by less than 1 bit.
Temporal Masking: The human ear may not perceive quiet sounds which occur a short time
before or after a louder sound. This masking effect is called temporal masking. When the quiet
sound occurs just after the louder sound (forward temporal masking) it may be interpreted that the
ear has not “recovered” from the louder sound. If the quiet sound comes just before the louder
sound then backward temporal masking may occur; a simple interpretation of this effect is less
obvious. The effects of temporal masking are still a topic of debate and research [30].
For forward temporal masking, the closer together the loud and quiet sound, then the more of a
masking effect that is likely to be present. The amount of masking is influenced by the frequency
and sound pressure levels of the two sounds, and masking effects may occur for up to 200ms.
Temporal masking can be useful for perceptual coding of audio whereby the first few milliseconds
of sounds (such as after loud drumbeats) are not fully coded.
[Figure: masking effect (dB) plotted against time around a loud sound of finite duration; a short backward masking region precedes the sound and a longer forward masking region follows it. Sounds occurring just after the loud sound may in fact be (forward) masked (i.e. rendered perceptually inaudible) to the listener. A less pronounced backward masking effect also occurs.]
See also Audiology, Audiometer, Binaural Unmasking, Moving Picture Experts Group - Audio,
Psychoacoustics, Psychoacoustic Subband Coding (PASC), Sound Pressure Level, Spectral
Masking, Temporary Threshold Shift, Threshold of Hearing
Temporary Threshold Shift (TTS): When the threshold of hearing is raised temporarily (i.e., the
threshold eventually returns to normal) due to exposure to excessive noise a temporary threshold
shift is said to have occurred. Recovery can be within a few minutes or take several hours. Many
people have experienced this effect by attending a loud concert or shooting a gun. See also
Audiology, Audiometry, Threshold of Hearing, Permanent Threshold Shift.
Terrestrial Broadcast: TV and radio signals are sent to consumers in one of three ways:
terrestrial, satellite, or cable. Terrestrial broadcasts transmit electromagnetic waves modulated with
the radio or TV signal from earth based transmitters, and are received by earth based aerials or
antennas.
Third Octave Band: A typical bandwidth measure used when making measurements of sound
intensity over a few octaves of frequency. For example, choosing octave frequencies at 125, 250,
500Hz and so on, the bandwidths of the associated third octave bands are approximately 42Hz, 86Hz,
and 166Hz. To compute a third octave frequency band around a frequency f0, note that from
2^(-1/6) f0 up to 2^(1/6) f0 the ratio of the high and low frequencies is 2^(1/3), i.e. one third of an
octave (an octave being a doubling of frequency). The third octave bandwidth is therefore computed
as (2^(1/6) − 2^(-1/6)) f0. Three consecutive third octaves make an octave.
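A small sketch of the band edge and bandwidth calculation described above (Python; the centre frequency used in the example is illustrative):

def third_octave_band(f0):
    # Returns (lower edge, upper edge, bandwidth) of the third octave band centred on f0.
    lower = f0 * 2 ** (-1.0 / 6.0)
    upper = f0 * 2 ** (1.0 / 6.0)        # upper/lower = 2^(1/3), one third of an octave
    return lower, upper, upper - lower   # bandwidth = (2^(1/6) - 2^(-1/6)) * f0

print(third_octave_band(1000.0))   # approximately (890.9 Hz, 1122.5 Hz, 231.6 Hz)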
Third Order: Usually meaning three of a particular device cascaded together. Used in a non-consistent way. See also Second Order.
Threshold Detection: One of the most rudimentary forms of signal analysis, where a particular
signal is monitored to find at what points it has a magnitude larger than some predefined threshold.
For example an ECG signal may be monitored using threshold detection in order to calculate the
heart rate (the inverse of the R to R time).
[Figure: an ECG waveform (amplitude against time in seconds) with a threshold level drawn across it; all occurrences of the signal above the threshold level are detected. Thresholding an ECG waveform to determine the heart rate.]
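A minimal sketch of threshold detection and heart rate estimation (Python/NumPy; the sampling rate, threshold level and crude synthetic "ECG" are purely illustrative):

import numpy as np

def threshold_crossings(x, threshold):
    # Indices where the signal first rises above the threshold (rising edges only)
    above = x > threshold
    return np.where(above[1:] & ~above[:-1])[0] + 1

fs = 200.0                                        # samples/sec (illustrative)
t = np.arange(0, 10, 1 / fs)
ecg_like = np.where((t % 0.8) < 0.02, 4.0, 0.2)   # crude train of R-like spikes every 0.8 s
peaks = threshold_crossings(ecg_like, threshold=2.0)
rr_seconds = np.mean(np.diff(peaks)) / fs         # mean R to R time
print(60.0 / rr_seconds)                          # heart rate in beats/minute (75 here)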
Threshold of Audibility: The level of a tone that is just audible defines the threshold of audibility
for that frequency. For a more general sound, the threshold of audibility is the level at which it
becomes just audible. See also Audiogram.
Threshold of Hearing: The threshold of hearing or minimum audible field (MAF) is a curve of the
minimum detectable sound pressure level (SPL) of pure frequency tones plotted against frequency.
There are a number of different methods for obtaining the lower threshold of hearing depending on
the actual point on/in the ear where SPL is measured, whether headphones or loudspeakers were
used, and of course the cross section of population over which the averaged curve is obtained, i.e.
different age groups, including/excluding hearing impaired persons and so on. (Note that although
SPL was originally defined as a sound pressure level relative to the minimum detectable 1000Hz
tone, established at 10^-12 W/m^2, the average threshold of hearing at 1000Hz is actually around
5dB.)
[Figure: Threshold of Hearing. The approximate threshold of hearing curve, SPL (dB) against frequency from 20 Hz to 20 kHz; the region above the curve is audible and the region below it is inaudible.]
The curve shown above is based on the Fletcher-Munson [73] and Robinson-Dadson [126] curves
and is now a well established shape showing clearly that the ear is most sensitive to the range
1000-5000Hz where speech is found. At very low and very high frequencies the minimum
thresholds increase rapidly. It is worthwhile noting that the threshold of pain is around 120dB, and
prolonged exposure to such high intensities will damage the ear. The upper frequency limit of
hearing can be as high as 20kHz for very young children, but in adults is about 12-15kHz. The lower
limit of hearing is often quoted as 20Hz as further reduction on frequency is not perceived as a
further reduction in pitch. Also at these frequencies high SPL sounds can be “felt” as well as heard
[30]. Many animals have hearing ranges well above 20kHz, the most noted example being dogs
who respond to the sound made from dog whistles which humans cannot hear.
Given that the bandwidth of hi-fidelity digital audio systems is up to 22.05kHz for CD and 24kHz for
DAT it would appear that the full range of hearing is more than covered. However this is one of the
key issues of the CD-analogue records debate. The argument of some analog purists is that
although humans cannot perceive individual tones above 20kHz, when listening to musical
instruments which produce harmonic frequencies above the human range of hearing these high
frequencies are perceived in some “collective” fashion. This adds to the perception of live music;
the debate will doubtless continue into the next century.
See also Audiogram, Audiometry, Auditory Filters, Binaural Unmasking, Ear, Equal Loudness
Contours, Equivalent Sound Continuous Level, Frequency Range of Hearing, Habituation, Hearing
Aids, Hearing Impairment, Hearing Level, Infrasound, Permanent Threshold Shift,
Psychoacoustics, Sensation Level, Sound Pressure Level (SPL), Spectral Masking, Temporal
Masking, Temporary Threshold Shift (TTS), Ultrasound.
Timbre: (Pronounced tam-ber). The characteristic sound that distinguishes one musical
instrument from another. Key components of timbre are the signal amplitude envelope and the
harmonic content of the signal [14]. See also Attack-Decay-Sustain-Release, Music, Western Music
Scale.
Time Invariant: A quantity that is constant over time. For example if the mean of a stochastic
signal is described as being time invariant, then this means that the measured value of the mean
will be the same if measured today, and then tomorrow.
TMS320: The part number prefix for Texas Instruments series of DSP processors. One of the early
members of the family was the TMS320C10 in 1984.
Toeplitz Matrix: See Matrix Structured - Toeplitz.
Tonal Distortion: If an analogue signal with periodic or quasi-periodic components is converted
to a digital signal and the output contains harmonics of the periodic signal that were not present in
the original, then this is referred to as tonal or harmonic distortion. For example, the following digital
signal is a 200Hz sine wave sampled at 48000Hz with an amplitude of 100. A 16384 point FFT
confirms that there is no tonal distortion present.
[Figure: the time (ms) and frequency (kHz) representations of a 200Hz sine wave of amplitude 100, sampled at 48000 Hz, i.e. y(k) = 100 sin((2π·200)k ⁄ 48000). The 16384 point FFT shows that there is no tonal distortion. Note that on the frequency graph an amplitude of 100 corresponds to about −50 dB (= 20 log(100 ⁄ 32767)), where the full scale amplitude of 32767 (= 2^15 − 1) is 0 dB.]
However when the signal is clipped at an amplitude of 80, then this non-linear operation causes
tonal distortion as can be seen in the frequency domain representation:
[Figure: the time (ms) and frequency (kHz) representations of a 200Hz sine wave of amplitude 100, sampled at 48000 Hz, i.e. d(k) = 100 sin((2π·200)k ⁄ 48000), which has been clipped at ±80. The 16384 point FFT shows that there is clearly tonal distortion at integer multiples of the signal frequency.]
Also when a very low level periodic signal is converted from an analog to a digital representation,
the quantisation error will be correlated with the signal which will manifest itself as tonal distortion:
[Figure: the time (ms) and frequency (kHz) representations of a 100Hz sine wave of amplitude 5, sampled at 48000 Hz, i.e. v(k) = 5 sin((2π·100)k ⁄ 48000). The 16384 point FFT shows that there is clearly tonal distortion present.]
When a speech or music signal is converted from analog to digital then the quasi-periodic nature of
the signals may result in tonal distortion components. This tonal distortion may be due either to
non-linearities in the system or to analog-to-digital conversion of very low level signals. See also
Dithering, Total Harmonic Distortion.
Tone (1): A pure sine wave (existing for all time, t ).
Tone (2): In music theory each adjacent note in the chromatic scale differs by one semitone, which
corresponds to multiplying the lower frequency by the twelfth root of 2, i.e. 2^(1/12) = 1.0594631… .
A difference of two semitones is a tone. Coincidentally (or perhaps by design!) “tone” is an anagram
of “note”, as in musical note. See also Western Music Scale.
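A small sketch of the semitone relationship (Python; the reference pitch of A4 = 440 Hz is the usual convention but is an assumption here):

SEMITONE = 2 ** (1 / 12)          # 1.0594631...

def note_frequency(f_ref, semitones):
    # Frequency a given number of semitones above (or below) a reference frequency
    return f_ref * SEMITONE ** semitones

a4 = 440.0                        # assumed reference pitch
print(note_frequency(a4, 2))      # one tone (two semitones) up: about 493.88 Hz
print(note_frequency(a4, 12))     # twelve semitones up: 880.0 Hz, exactly one octave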
Tone Generation: See Dual Tone Multifrequency - Tone Generation.
Total Error Budget: Virtually every component in a standard input/output DSP system will
contribute some error, or noise to a signal passing through. If a designer knows the tolerable error
in the final system output, then from this total error budget, tolerances and allowable errors can be
assigned to components. In a DSP system the designer will need to consider both analog and digital
components in the total error budget.
Total Harmonic Distortion (THD): If a pure tone signal of M Hz is played into a system and the
output is found to contain not only the original signal, but also small components at harmonic
frequencies of 2M, 3M, and so on, then distortion has occurred. The THD is calculated as the ratio
of the total energy contained in the harmonics to the energy of the signal itself, usually expressed
as a percentage or in dB. See also Total Harmonic Distortion plus Noise.
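A rough sketch of estimating THD from a block of samples (Python/NumPy; the test signal, block length and number of harmonics examined are illustrative, and a practical measurement would also need windowing and care with spectral leakage):

import numpy as np

def thd_db(x, fs, f0, n_harmonics=5):
    # Ratio of energy at the harmonics (2*f0, 3*f0, ...) to energy at the fundamental, in dB
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    bin_of = lambda f: int(round(f * len(x) / fs))
    fundamental = spectrum[bin_of(f0)]
    harmonics = sum(spectrum[bin_of(m * f0)] for m in range(2, n_harmonics + 2))
    return 10.0 * np.log10(harmonics / fundamental)

fs, f0, n = 48000.0, 200.0, 4800                   # 4800 samples = 20 whole cycles of 200 Hz
k = np.arange(n)
clean = np.sin(2 * np.pi * f0 * k / fs)
clipped = np.clip(100 * clean, -80, 80) / 100      # symmetrical clipping adds odd harmonics
print(thd_db(clipped, fs, f0))                     # clearly higher than for the clean sine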
Total Harmonic Distortion plus Noise (THD+N): A measure often associated with ADCs and
DACs defining the ratio of all spectral components over the specified bandwidth, excluding the input
signal, to the rms value of the signal. See also Total Harmonic Distortion.
TP Algorithm: The Turning Point algorithm was a technique to reduce the sampling frequency of
an ECG signal from 200 to 100 samples/sec. The algorithm was developed from the observation that,
except for the QRS portion of the ECG with large amplitudes and slopes, a sampling rate of 100
samples/sec was more than adequate. The algorithm processes three points at once in order to identify
where a significant turning point occurs.
Trace of a Matrix: See Matrix Properties - Trace.
Transceiver: A data communications device that can both transmit and receive data.
Transcoding: Converting from one form of coded information to another. For example converting
from MPEG1 compressed video to H.261 compressed video can be termed as transcoding.
Transducer: A device for converting one form of energy into another, e.g. a microphone converts
sound energy into electrical energy.
Transform Coding: For some signals, mathematical transformation of the data into another
domain may yield a data set that is more amenable to compression techniques than the original
signal. The transform is usually applied to small blocks of data which are compared with a standard
set of blocks to produce a correlation function for each. The signal is decompressed by applying the
correlation functions as a weighting to each standard block. It is possible to combine transform
coding and predictive coding to yield powerful compression algorithms. The disadvantage is that
the algorithms are computation intensive. See also JPEG, MPEG, DCT.
Transfer Function: A description (usually in the mathematical Z-domain) of the function a
particular linear system will perform on signals. For example, the transfer function of a very simple
low pass filter, y ( n ) = x ( n ) + x ( n – 1 ) , could be given as the transfer function H(z):
H ( z ) = Y ( z ) ⁄ X ( z ) = 1 + z^(-1)      (549)
See also Impulse Response.
Transients: When an impulse is applied to a system, the resulting signal is often referred to as a
transient. For example when a piano key is struck, the piano wire creates a transient as it continues
to vibrate long after the key was struck.
Sometimes, unexplained small currents and voltages within a system are described (and perhaps
dismissed) as transients.
Transpose Matrix: See Matrix Operations - Transpose.
Transpose Vector: See Vector Properties and Definitions - Transpose.
Transputer: A microprocessor designed by INMOS Ltd. The first and original parallel processing
chips (T212, T414, and T800) had four serial links to allow intercommunication with other
Transputers. Since its launch in 1984 the Transputer, despite its catchy name, failed to set the
computing world on fire. Although the Transputer was used for many DSP applications, its slow
arithmetic restricted its use and it never became a general purpose DSP.
Trellis Coded Modulation (TCM): TCM is a digital modulation technique that combines
convolutional coding and decoding techniques (including the Viterbi algorithm) with signal design
to reduce transmission errors in a digital communication system while retaining the same average
symbol energy and system bandwidth. TCM increases the number of signals in a signal set by some
factor of two without increasing the signal space dimension (i.e., the system bandwidth). The coder
and decoder exploit the increase in the number of signals by separating signals both by Euclidean
distance in signal space as well as free distance in the convolutional code trellis. The Viterbi
algorithm is used with a Euclidean distance rather than a Hamming distance as the appropriate
metric to minimize probability of error (for the additive white gaussian noise channel). Trellis Codes
are often referred to as Ungerboeck Codes, after G. Ungerboeck who is credited with their
development. See also Viterbi Algorithm, Euclidean Distance, Hamming Distance.
Tremolo: Tremolo is the effect where a low frequency amplitude modulation is applied to the
musical output of an instrument. Tremolo can be performed digitally using simple multiplicative DSP
techniques [32]:
Tremolo Signal = cos ( 2π ( f t ⁄ f s )k ) s ( k )
(550)
where, f s is the sampling frequency, f t is tremolo frequency of modulation and s ( k ) is the original
digital music signal. In practice however the tremolo effect may require more subtle forms of
modulation to produce an aesthetic sound. See also Music, Vibrato.
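A minimal sketch of equation (550) applied to a digital signal (Python/NumPy; the 5 Hz tremolo rate and the 440 Hz test tone are illustrative choices):

import numpy as np

def tremolo(s, fs, f_t):
    # Amplitude modulation: output(k) = cos(2*pi*(f_t/fs)*k) * s(k), as in (550)
    k = np.arange(len(s))
    return np.cos(2 * np.pi * (f_t / fs) * k) * s

fs = 44100.0
k = np.arange(int(fs))                         # one second of samples
music = np.sin(2 * np.pi * 440.0 * k / fs)     # stand-in for the music signal s(k)
out = tremolo(music, fs, f_t=5.0)              # 5 Hz tremolo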
Triangular Pulse (Continuous and Discrete Time): The continuous time triangular pulse can
be defined as:

tri ( ( t − t0 ) ⁄ τ ) = 1 − |t − t0| ⁄ τ   if |t − t0| ≤ τ,   and 0 otherwise      (continuous time)      (551)

[Figure: the continuous triangular pulse g(t) = tri((t − t0) ⁄ τ), rising linearly from 0 at t = t0 − τ to 1 at t = t0 and falling back to 0 at t = t0 + τ.]

The discrete time triangular pulse can be defined as:

tri ( ( k − k0 ) ⁄ κ ) = 1 − |k − k0| ⁄ κ   if |k − k0| ≤ κ,   and 0 otherwise      (discrete time)      (552)

[Figure: the discrete triangular pulse g(k) = tri((k − k0) ⁄ κ), with samples rising from 0 at k = k0 − κ to 1 at k = k0 and falling back to 0 at k = k0 + κ.]
See also Elementary Signals, Rectangular Pulse, Square Wave, Unit Impulse Function, Unit Step
Function.
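A minimal sketch of the discrete time definition (552) (Python/NumPy):

import numpy as np

def tri(k, k0, kappa):
    # Discrete triangular pulse: 1 - |k - k0|/kappa where |k - k0| <= kappa, and 0 elsewhere
    k = np.asarray(k, dtype=float)
    pulse = 1.0 - np.abs(k - k0) / kappa
    return np.where(np.abs(k - k0) <= kappa, pulse, 0.0)

k = np.arange(0, 21)
print(tri(k, k0=10, kappa=4))   # ramps up to 1.0 at k = 10 and back down to 0 at k = 14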
Triangularization: See Matrix Decompositions - Cholesky/LU/QR.
Tridiagonal Matrix: See Matrix Structured - Tridiagonal.
Truncation Error: When two N bit numbers are multiplied together, the result is a number with 2N
bits. If a fixed point DSP processor with N bits resolution is used, the 2N bit number cannot be
accommodated for future computations which can operate on only N bit operands. Therefore, if we
assume that the original N bit numbers were both constrained to be less than 1 in magnitude by
using a binary point, then the 2N bit result is also less than 1. Hence if we throw away the last N bits,
then this is equivalent to losing precision. This loss of precision is referred to as truncation error.
Although the truncation error for a single computation is usually not significant, many errors added
together can be significant. Furthermore if the result of a computation yields the value of 0 (zero)
after truncation, and this result is to be used as a divisor, a divide by zero error will occur. See also
Round-Off Error, Fractional Binary.
Binary:   0.1101011 × 0.1000100 = 0.011100011011000  →  (truncation)  →  0.0111000
Decimal:  0.8359375 × 0.53125 = 0.444091796875  →  (truncation)  →  0.4375
After multiplication of two 8 bit numbers the 16 bit result is truncated to 8 bits introducing a binary
round off error of 0.000000011011000 which in decimal is 0.006591796875. If rounding had been
used, then the result would have been 0.0111001, which is an error of 0.000000000101000, and
in decimal an error of 0.001220703125.
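A small sketch reproducing the worked example (Python; the 7 fractional bit operand format is taken from the example above):

FRAC_BITS = 7                        # fractional bits kept after truncation

def truncate(value, frac_bits):
    # Keep only frac_bits fractional bits by discarding the rest (round towards zero)
    scale = 1 << frac_bits
    return int(value * scale) / scale

a = 0.8359375                        # 0.1101011 in binary
b = 0.53125                          # 0.1000100 in binary
product = a * b                      # full precision: 0.444091796875
truncated = truncate(product, FRAC_BITS)
rounded = round(product * (1 << FRAC_BITS)) / (1 << FRAC_BITS)

print(truncated, product - truncated)    # 0.4375   0.006591796875 (truncation error)
print(rounded, rounded - product)        # 0.4453125   0.001220703125 (rounding error)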
Truncation Noise: When truncation errors are considered in terms of their mean power, this
results in a measure of the truncation noise. See also Truncation Error.
Tweeter: The section of a loudspeaker that reproduces high frequencies is often called the
tweeter. The name is derived from the high pitched tweet of a bird. See also Woofer.
Twisted Pair: The name given to a pair of twisted copper wires used for telephony. The gauge
(and, consequently, the frequency response) of this type of transmission line will depend on the
precise purpose and location. The “twist” is to improve common mode noise rejection.
Two’s Complement: The type of arithmetic used by most DSP processors which allows a very
convenient way of representing negative numbers, and imposes no overhead on arithmetic
operations. In two’s complement the most significant bit is given a negative weighting, e.g.
1001 0000 0000 0001 (binary) = −2^15 + 2^12 + 2^0 = −32768 + 4096 + 1 = −28671      (553)
See also Sign bit.
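A small sketch of interpreting a 16 bit two's complement pattern (Python; reproduces the value in (553)):

def twos_complement_value(bits, width=16):
    # Interpret an unsigned bit pattern as a two's complement number of the given width
    value = bits & ((1 << width) - 1)
    if value & (1 << (width - 1)):        # most significant bit set: negative weighting
        value -= 1 << width
    return value

print(twos_complement_value(0b1001000000000001))   # -28671, as in equation (553)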
Two-wire Circuit: A circuit formed of two conductors insulated from each other, providing a send
and return path. Signals may pass in one or both directions although not at the same time. See also
Four Wire Circuit, Half Duplex, Full Duplex.
U
Ungerboeck Codes: See Trellis Coded Modulation.
Ultrasonic: Acoustic signals (speed in air approximately 330 m/s) having frequencies above 20kHz, the
upper limit of human hearing. The ultrasonic spectrum extends up to MHz frequencies.
Underdetermined System: See Matrix Properties - Underdetermined System of Equations.
Unit Impulse Function (Continuous Time and Discrete Time): The mathematical definition of
the continuous time unit impulse function is a signal with an infinite magnitude, but with an
infinitesimal duration and that has a unit area. The continuous time unit impulse function is often
referred to as the Dirac impulse (or Dirac delta function) and is not physically realisable. The
mathematical representation for the continuous time unit impulse function occurring at time t 0 , is
usually denoted by the Greek letter δ (delta) in the form:
δ(t - t_0) = { 0,  if t ≠ t_0 ;  undefined,  if t = t_0 }    (554)
Graphically the Dirac impulse, δ ( t – t 0 ) , can be represented as the following rectangular or
triangular models where ε → 0 :
[Figure: rectangular and triangular models of the continuous time unit impulse function at t = t_0, each of height 1/ε (with base widths ε and 2ε respectively). As ε → 0 both models become infinitely tall and infinitesimally thin, but continue to maintain a unit area.]
Although the Dirac impulse does not exist in the real physical world, it does have significant
importance in the mathematical analysis of signals and systems. A useful mathematical definition
of the continuous time unit impulse function is:
δ(t) = du(t)/dt    (555)
where u(t) is the unit step function. (To be mathematically correct the impulse function is actually a
distribution rather than a function of time. The distinction is that a function must be single valued
and for any time, t, the function has one and only one value.)
The discrete time unit impulse function has a magnitude of 1 at a specific (discrete) time. The
unit impulse function is bounded for all time and is therefore physically realizable. The discrete
time unit impulse function is often referred to as the Kronecker impulse or (Kronecker delta
function). The mathematical representation for the discrete time unit impulse function occurring at
(discrete) time k_0, is usually denoted by the Greek letter δ (delta):

δ(k - k_0) = { 0,  if k ≠ k_0 ;  1,  if k = k_0 }    (discrete time)    (556)
Graphically the discrete time unit impulse function, δ ( k – k 0 ) , can be represented as:
[Figure: the discrete time unit impulse function δ(k - k_0) — a single sample of amplitude 1 at k = k_0, zero elsewhere.]
Both the discrete time and continuous time unit impulse functions exhibit a sampling property when
a signal is multiplied by a unit impulse function and integrated (or summed) over time. Hence they are
extremely useful mathematical tools for the analysis and definition of DSP sampled data systems.
See also Elementary Signals, Fourier Transform Properties, Impulse, Rectangular Pulse, Sampling
Property, Unit Step Function.
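A quick way to see the discrete time sampling (sifting) property is numerically, as in the short sketch below (an illustrative example only, not from the original text): multiplying a signal by a Kronecker impulse at k_0 and summing over all k returns the signal value at k_0.

    import numpy as np

    # Minimal sketch of the discrete time sifting property.
    k0 = 5
    k = np.arange(20)
    delta = (k == k0).astype(float)      # Kronecker impulse delta(k - k0)
    x = np.sin(0.2 * np.pi * k)          # an arbitrary test signal

    print(np.sum(x * delta))             # equals x[k0]
    print(x[k0])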
Unit Step Function (Continuous Time and Discrete Time): The mathematical representation
for the continuous time unit step function occurring at time t 0 , is usually denoted by the letter u,
and defined by:

u(t - t_0) = { 0,  if t < t_0 ;  1,  if t ≥ t_0 }    (continuous time)    (557)
Graphically the continuous time unit step function, u ( t – t 0 ) , can be represented as:
[Figure: the continuous time unit step function u(t - t_0) — zero for t < t_0, stepping to 1 at t = t_0.]
The unit step function can be mathematically derived from the unit impulse function, δ(t), as:
u(t) = ∫_{-∞}^{t} δ(τ) dτ    (558)
The discrete time unit step function is denoted by:

u(k - k_0) = { 0,  if k < k_0 ;  1,  if k ≥ k_0 }    (discrete time)    (559)
Graphically the discrete time unit step function, u ( k – k 0 ) , can be represented as:
[Figure: the discrete time unit step function u(k - k_0) — zero for k < k_0 and 1 for k ≥ k_0.]
Rectangular, or pulse, functions can be generated by combining (adding and subtracting) unit step functions, as in the example below:
[Figure: the rectangular pulse x(k) = u(k - 4) - u(k - 10), equal to 1 for k = 4, 5, …, 9 and zero elsewhere.]
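As an illustration (a small sketch, not part of the original text), the pulse above can be generated directly from two discrete time unit steps:

    import numpy as np

    # Minimal sketch: build a rectangular pulse from two unit step functions.
    k = np.arange(13)
    u = lambda n: (n >= 0).astype(float)   # discrete time unit step u(k)

    x = u(k - 4) - u(k - 10)               # x(k) = u(k - 4) - u(k - 10)
    print(x)                               # 1 for k = 4..9, 0 elsewhere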
See also Elementary Signals, Fourier Transform Properties, Impulse, Rectangular Pulse, Sampling
Property, Step Response, Unit Impulse Function.
Unit Step Response: See Step Response.
Unit Pulse Function: See Rectangular Pulse, Unit Step Pulse.
Unit Vector: See Vector Properties and Definitions - Unit Vector.
Unitary Matrix: See Matrix Properties - Unitary.
Unstable: See Instability.
Upper Triangular Matrix: See Matrix Structured - Upper Triangular.
Upsampling: Increasing the sampling rate of a digital signal by inserting zeroes between adjacent
samples. To upsample a digital signal, xk, sampled at fs Hz to Mfs Hz would require that M-1 zeroes
are inserted between adjacent samples in the original signal. Upsampling in combination with a low
pass filter to remove the aliased portions of the frequency spectra gives interpolation. Up-sampling
has no effect on the shape of the frequency spectrum of the signal. (If up sampling was performed
using a digital zero order hold, i.e. the value of xk is inserted instead of zeroes, then the frequency
spectrum of the output signal is modulated by a sinc function.) See also Downsampling,
Decimation, Fractional Sampling Rate Converter, Interpolation, Sigma Delta Converter, Zero Order
Hold.
[Figure: upsampling by 4 — the input x(k), sampled at f_s = 1/t_s, has M - 1 = 3 zeroes inserted between adjacent samples to give the output y(k) at the new rate f_u = 4f_s; the magnitude spectrum |Y(f)| has the same shape as |X(f)|, with the baseband images now lying at multiples of f_s within the new Nyquist range.]
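A minimal zero insertion sketch (illustrative only, not from the original text) for upsampling a signal by an integer factor M:

    import numpy as np

    # Minimal sketch: upsample x(k) by M by inserting M-1 zeroes between adjacent samples.
    def upsample(x, M):
        y = np.zeros(len(x) * M)
        y[::M] = x          # keep the original samples, zeroes elsewhere
        return y

    x = np.array([1.0, 2.0, 3.0])
    print(upsample(x, 4))   # [1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0]

Interpolation would follow this with a low pass filter at the original Nyquist frequency to remove the spectral images.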
V
V-Series Recommendations: The V-series recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T, and formerly known as CCITT) propose a number of standards for telecommunication based data transmission. Among the more well known of these standards from a DSP perspective are V.22bis (2400 bit/s modem), V.32bis (14400 bit/s modem), V.34 (up to 28800 bit/s modem), and V.42bis (data compression procedures for DCEs, giving effective rates above 28800 bit/s), which all feature advanced adaptive signal processing techniques for echo control and data equalisation. Some of the current ITU-T V-series recommendations (http://www.itu.ch) can be summarised as:
V.1	Equivalence between binary notation symbols and the significant conditions of a two-condition code.
V.2	Power levels for data transmission over telephone lines.
V.4	General structure of signals of International Alphabet No. 5 code for character oriented data transmission over public telephone networks.
V.7	Definitions of terms concerning data communication over the telephone network.
V.8	Procedures for starting sessions of data transmission over the general switched telephone network.
V.10	Electrical characteristics for unbalanced double-current interchange circuits operating at data signalling rates nominally up to 100 kbit/s.
V.11	Electrical characteristics for balanced double-current interchange circuits operating at data signalling rates up to 10 Mbit/s.
V.13	Simulated carrier control.
V.14	Transmission of start-stop characters over synchronous bearer channels.
V.15	Use of acoustic coupling for data transmission.
V.16	Medical analogue data transmission modems.
V.17	A 2-wire modem for facsimile applications with rates up to 14 400 bit/s.
V.18	Operational and interworking requirements for modems operating in the text telephone mode.
V.19	Modems for parallel data transmission using telephone signalling frequencies.
V.21	300 bits per second duplex modem standardized for use in the general switched telephone network.
V.22	1200 bits per second duplex modem standardized for use in the general switched telephone network and on point-to-point 2-wire leased telephone-type circuits.
V.22bis	2400 bits per second duplex modem using the frequency division technique standardized for use on the general switched telephone network and on point-to-point 2-wire leased telephone-type circuits.
V.23	600/1200-baud modem standardized for use in the general switched telephone network.
V.24	List of definitions for interchange circuits between terminal equipment (DTE) and data circuit-terminating equipment (DCE). The V.24 standard is very similar to the RS232 standard.
V.25	Automatic answering equipment and/or parallel automatic calling equipment on the general switched telephone network including procedures for disabling of echo control devices for both manual and automatic operation.
V.25bis	Automatic calling and/or answering equipment on the general switched telephone network (GSTN) using the 100-series interchange circuits.
V.26	2400 bits per second modem standardized for use on 4-wire leased telephone-type circuits.
V.26bis	2400/1200 bits per second modem standardized for use in the general switched telephone network.
V.26ter	2400 bits per second duplex modem using the echo cancellation technique standardized for use on the general switched telephone network and on point-to-point 2-wire leased telephone-type circuits.
V.27	4800 bits per second modem with manual equalizer standardized for use on leased telephone-type circuits.
V.27bis	4800/2400 bits per second modem with automatic equalizer standardized for use on leased telephone-type circuits.
V.27ter	4800/2400 bits per second modem standardized for use in the general switched telephone network.
V.28	Electrical characteristics for unbalanced double-current interchange circuits.
V.29	9600 bits per second modem standardized for use on point-to-point 4-wire leased telephone-type circuits.
V.31	Electrical characteristics for single-current interchange circuits controlled by contact closure.
V.31bis	Electrical characteristics for single-current interchange circuits using optocouplers.
V.32	A family of 2-wire, duplex modems operating at data signalling rates of up to 9600 bit/s for use on the general switched telephone network and on leased telephone-type circuits.
V.32bis	A duplex modem operating at data signalling rates of up to 14400 bit/s for use on the general switched telephone network and on leased point-to-point 2-wire telephone-type circuits.
V.33	14400 bits per second modem standardized for use on point-to-point 4-wire leased telephone-type circuits.
V.34	A modem operating at data signalling rates of up to 28800 bit/s for use on the general switched telephone network and on leased point-to-point 2-wire telephone-type circuits.
V.36	Modems for synchronous data transmission using 60-108 kHz group band circuits.
V.37	Synchronous data transmission at a data signalling rate higher than 72 kbit/s using 60-108 kHz group band circuits.
V.38	A 48/56/64 kbit/s data circuit terminating equipment standardized for use on digital point-to-point leased circuits.
V.41	Code-independent error-control system.
V.42	Error-correcting procedures for DCEs using asynchronous-to-synchronous conversion.
V.42bis	Data compression procedures for data circuit terminating equipment (DCE) using error correction procedures.
V.50	Standard limits for transmission quality of data transmission.
V.51	Organization of the maintenance of international telephone-type circuits used for data transmission.
V.52	Characteristics of distortion and error-rate measuring apparatus for data transmission.
V.53	Limits for the maintenance of telephone-type circuits used for data transmission.
V.54	Loop test devices for modems.
V.55	Specification for an impulsive noise measuring instrument for telephone-type circuits.
V.56	Comparative tests of modems for use over telephone-type circuits.
V.57	Comprehensive data test set for high data signalling rates.
V.58	Management information model for V-series DCEs.
V.100	Interconnection between public data networks (PDNs) and the public switched telephone networks (PSTN).
V.110	Support of data terminal equipments with V-Series type interfaces by an integrated services digital network.
V.120	Support by an ISDN of data terminal equipment with V-series type interfaces with provision for statistical multiplexing.
V.230	General data communications interface layer 1 specification.
For additional detail consult the appropriate standard document or contact the ITU. See also Bell 103/113, Bell 202, Bell 212, International Telecommunication Union, ITU-T Recommendations, Modem, Standards.
Variable Step Size LMS: See Least Mean Squares Algorithm Variants.
Variance: The variance of a signal is the mean of the square of the signal about the mean value.
If the signal is ergodic the statistical averages will equal the time averages and then:
m = Mean of x(k) = E{x(k)} = ∑_k x(k) p{x(k)} ≅ (1/N) ∑_{k=0}^{N-1} x(k)    (560)

and

Variance of x(k) = E{(x(k) - m)^2} = ∑_k (x(k) - m)^2 p{x(k)} ≅ (1/N) ∑_{k=0}^{N-1} [x(k) - m]^2    (561)
for large N. In a practical DSP situation where real signals are being used, the variance is often
calculated using time averages. Variance gives a measure of the AC power in a signal. See also
Ergodic, Expected Value, Mean Value, Mean Squared Value, Wide Sense Stationarity.
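A minimal time average sketch of these estimates (illustrative only, not from the original text; note that an unbiased variance estimate often uses 1/(N-1) rather than 1/N):

    import numpy as np

    # Minimal sketch: estimate mean and variance of a signal by time averaging.
    x = np.random.randn(10000) * 2.0 + 0.5   # example ergodic signal
    N = len(x)

    m = np.sum(x) / N                        # equation (560)
    var = np.sum((x - m) ** 2) / N           # equation (561), a measure of the AC power

    print(m, var)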
Vector: A vector is a set of ordered information. A vector is usually denoted in texts using boldface
lower case letters, v (cf. matrices, denoted by upper case boldface) or with an underscore, v. A column vector has n rows and one column, i.e. n × 1 dimension, and a row vector has one row and n columns, i.e. 1 × n dimension.
In DSP a vector is usually a set of ordered elements conveying information or data. For example
the last N samples of a signal, g_k, may be stored in a contiguous array of memory and referred to
and operated on as a (data) vector:
g_k = [g_k, g_{k-1}, g_{k-2}, …, g_{k-N+2}, g_{k-N+1}]^T    (562)
Vectors can be added, subtracted, multiplied, scaled and transposed. See also Data Vector, Matrix,
Vector Operations, Vector Properties, Weight Vector.
Vector Addition: See Vector Operations - Addition.
Vector-Matrix Multiplication: See Vector Operations - Matrix-Vector Multiplication.
Vector Multiplication: See Vector Operations - Multiplication.
Vector Operations: Vectors of the appropriate dimension can be added, subtracted, multiplied,
scaled, and transposed.
• Addition (Subtraction): If two vectors are to be added (or subtracted) then they must be of exactly the
same dimension. Each element in one vector is added (subtracted) to the analogous element in the other
vector. For example:
[1, 6, 2]^T + [3, 0, 3]^T = [(1 + 3), (6 + 0), (2 + 3)]^T = [4, 6, 5]^T    (563)
Vector addition is commutative, i.e. a + b = b + a .
• Dot Product: See Vector Operations - Inner Product.
• Inner Product: When a row vector is multiplied by a column vector of the same dimension, the result is a
scalar called the inner product. For example an FIR filter forms an inner product by multiplying the weight
vector by the data vector. The inner product is sometimes referred to as the dot product. See also Outer
Product.
w^T x = [w_0 w_1 w_2 w_3] [x_k, x_{k-1}, x_{k-2}, x_{k-3}]^T    (564)
• Multiplication: Two vectors, w and v, can be multiplied either to form the inner product, w T v or the outer
product, wv T .
The inner product (also known as the dot product) of a 1 × n and an n × 1 vector is a scalar. For
example:
[w_0 w_1 w_2] [x_0, x_1, x_2]^T = w_0 x_0 + w_1 x_1 + w_2 x_2    (565)
The outer product of an n × 1 and a 1 × n vector is a square n × n matrix. For example
[w_0, w_1, w_2]^T [v_0 v_1 v_2] = | v_0 w_0  v_1 w_0  v_2 w_0 |
                                  | v_0 w_1  v_1 w_1  v_2 w_1 |    (566)
                                  | v_0 w_2  v_1 w_2  v_2 w_2 |
The inner product (also known as the dot product) is widely used in digital filter representation, and the outer product is found in a number of linear algebra derived DSP algorithms such as Recursive Least Squares.
• Matrix-Vector Multiplication: An n × 1 vector can be premultiplied by an m × n matrix to give an m × 1 vector. For example:

[[a_11, a_12], [a_21, a_22], [a_31, a_32]] [b_1, b_2]^T = [a_11 b_1 + a_12 b_2,  a_21 b_1 + a_22 b_2,  a_31 b_1 + a_32 b_2]^T    (567)
A 1 × n vector can be postmultiplied by an n × m matrix to give a 1 × m vector:

[b_1, b_2] [[a_11, a_21, a_31], [a_12, a_22, a_32]] = [a_11 b_1 + a_12 b_2,  a_21 b_1 + a_22 b_2,  a_31 b_1 + a_32 b_2]    (568)
Note that if Ab = c then b T A T = c T . See also Matrix Operations.
• Scaling: A vector, a, is scaled by multiplying each element by a scale factor, s.
s a = s [a_1, a_2, a_3]^T = [s a_1, s a_2, s a_3]^T    (569)
• Transpose: The transpose of a row vector is obtained by writing the top to bottom elements as the left to
right elements of a column vector, and vice-versa for the transpose of a column vector. The transpose of
a vector a is denoted as aT. For example, if:
b = [b_1, b_2, b_3, b_4]^T  ⇒  b^T = [b_1 b_2 b_3 b_4]    (570)
Note that ( b T ) T = b .
• Subtraction: See Vector Operations - Addition.
• Vector-Matrix Multiplication: See Vector Operations - Matrix-Vector Multiplication.
See also Matrix, Vector Properties and Definitions.
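These operations map directly onto array operations. The short sketch below (illustrative only, not from the original text) demonstrates the operations defined in equations (563)-(568):

    import numpy as np

    # Minimal sketch of the vector operations above.
    a = np.array([1.0, 6.0, 2.0]); b = np.array([3.0, 0.0, 3.0])
    print(a + b)                      # vector addition, equation (563): [4, 6, 5]

    w = np.array([1.0, 2.0, 3.0]); x = np.array([4.0, 5.0, 6.0])
    print(np.dot(w, x))               # inner (dot) product, a scalar
    print(np.outer(w, x))             # outer product, a 3 x 3 matrix

    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    v = np.array([1.0, -1.0])
    print(A @ v)                      # matrix-vector multiplication: a 3 element vector
    print(2.5 * a)                    # scaling
    print(a.T)                        # transpose (a no-op for a 1-D NumPy array)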
Vector Properties and Definitions: A number of vector properties can be defined:
• Basis: A basis is a minimal set of linearly independent vectors which spans a particular subspace.
Representations of any vector in that subspace spanned by the basis vectors can be achieved by a unique
linear combination of the basis vectors.
• Cauchy-Schwartz Inequality: The Cauchy-Schwartz inequality, as it applies to the 2-norm of two vectors, is given by:

|w^T x| ≤ ||w||_2 ||x||_2    (571)
A useful interpretation of this inequality is that the output of an FIR digital filter will have a magnitude less than or equal to the product of the 2-norms of the weight vector and the data vector; this information can be useful in deciding the wordlength required by a DSP processor. See also Vector Properties - Norm,
FIR Filter.
• ∞ -norm: See Matrix Properties - Norm.
• Linearly Dependent: See Linearly Independent Entry.
• Linearly Independent: A set of vectors, { x 1, x 2, …, x N } , is linearly independent if:
∑_{j=1}^{N} α_j x_j = 0    (572)

implies that α_i = 0, for i = 1 to N. If this condition is not true, then the vector set {x_1, x_2, …, x_N} is said to be linearly dependent.

As an example consider the vector set:
{x_1, x_2, x_3} = { [1, 0, 0]^T, [0, 1, 0]^T, [0, 0, 1]^T }    (573)
There is clearly no linear combination of {x_1, x_2, x_3} such that

∑_{j=1}^{3} α_j x_j = α_1 x_1 + α_2 x_2 + α_3 x_3 = 0    (574)

other than the trivial solution of α_1 = α_2 = α_3 = 0. The set of vectors {x_1, x_2, x_3} is therefore
linearly independent. However the set of vectors:
{w_1, w_2, w_3} = { [1, 0, 0]^T, [0, 1, 0]^T, [1, 2, 0]^T }    (575)
are not linearly independent (and are therefore linearly dependent) as:
∑_{j=1}^{3} α_j w_j = α_1 w_1 + α_2 w_2 + α_3 w_3 = 0    (576)

if α_1 = 1, α_2 = 2, α_3 = -1. See also Vector Properties - Basis, Subspace, Rank.
• Minimum Norm: A system of linear equations can be defined as:
Ax = b    (577)

where A is a known m × n matrix and has rank(A) = min(m, n), x is an unknown n element vector, and b is a known m element vector. If multiple solutions exist that give the same error between Ax and b, then the solution with the minimum 2-norm is typically desirable. This solution is referred to as the minimum norm solution and is given by:

x_LS = A^+ b    (578)

where A^+ is the pseudoinverse. See also Matrix Properties - Underdetermined/Overdetermined, Pseudo-Inverse, Vector Properties - Norm.
• Norm: The vector norm provides a measure of the magnitude or distance spanned by an n element vector
in n-dimensional space. The most useful class of norms are the p-norms defined by:
||v||_p = (|v_1|^p + |v_2|^p + … + |v_n|^p)^{1/p} = ( ∑_{i=1}^{n} |v_i|^p )^{1/p}    (579)

The most often used of these norms is the 2-norm, also referred to as the magnitude of the vector v:

||v||_2 = (v_1^2 + v_2^2 + … + v_n^2)^{1/2} = √(v_1^2 + v_2^2 + … + v_n^2)    (580)

The square of the 2-norm is denoted as ||v||_2^2. For example, the 2-norm of a vector, x:

x = [3, 4, -7]^T,    ||x||_2 = √(9 + 16 + 49) = √74 = 8.602    (581)

Other norms occasionally used are the 1-norm, which is the sum of the magnitudes of all of the elements, and the ∞-norm, which returns the magnitude of the element with the largest absolute value:

||v||_1 = |v_1| + |v_2| + … + |v_n|    (582)

||v||_∞ = max |v_i|  for i = 1 to n    (583)

For the above 3 element vector x, ||x||_1 = 14 and ||x||_∞ = 7. A p-norm unit vector is one for which ||x||_p = 1. (A short numerical sketch of these norms appears at the end of this entry.) See also Matrix Properties - Invariant Norm.
• One Norm: See Vector Properties - Norm.
• Orthogonal: A set of vectors ( v 1, v 2, v 3, …, v n ) is said to be orthogonal if:
v_i^T v_j = 0  for all i ≠ j    (584)
• Orthonormal: A set of vectors ( v 1, v 2, v 3, …, v n ) is said to be orthonormal if:
v_i^T v_j = δ_ij  for all i, j    (585)
where δ ij is the Kronecker delta (i.e., δ ij =1 if i=j and δ ij =0 otherwise). Orthogonal and orthonormal sets
of vectors seem closely related and they are. The important distinction between an orthogonal set of
vectors and an orthonormal set of vectors is that the vectors from the orthonormal set all have a norm of
one. This is not necessarily the case for the set of orthogonal vectors.
• Outer Product: When a column vector ( n × 1 ) is post-multiplied by a row vector ( 1 × n ) the result is a
matrix ( n × n elements). For example for n = 3:
x x^T = [x_1, x_2, x_3]^T [x_1 x_2 x_3] = | x_1^2    x_1 x_2  x_1 x_3 |
                                          | x_2 x_1  x_2^2    x_2 x_3 |    (586)
                                          | x_3 x_1  x_3 x_2  x_3^2   |
The outer product is used to realise estimates of the covariance matrix and/or correlation matrix and is
widely used in adaptive digital signal processing formulations. See also Vector Properties - Inner product.
• Subspace: Given an m-dimensional space, ℜ^m, and a set of m-dimensional vectors (v_1, v_2, v_3, …, v_n), the set of all possible linear combinations of these vectors forms a subspace of ℜ^m. The form of the linear combination is given by:

∑_{i=1}^{n} α_i v_i,  where α_i ∈ ℜ    (587)

The subspace defined by the linear combination of the vectors is said to be the span of (v_1, v_2, v_3, …, v_n). For example consider the space ℜ^3. The set of vectors:
v_1 = [1, 0, 0]^T  and  v_2 = [0, 0, 2]^T    (588)
can only specify points on the x-z plane within the three dimensional [x, y, z] space. Hence v 1, v 2 specify
a subspace.
[Figure: the subspace of ℜ^3 spanned by the vectors [1, 0, 0]^T and [0, 0, 2]^T is the x-z plane.]
There are effectively an infinite number of (plane) subspaces of ℜ 3 . Note that a subspace of ℜ 3 could
also be a straight line in three dimensional space if, for example, only v 1 = [ 1, 0, 0 ] T is used to define
the subspace. Since the form of the linear combination in Eq. [587] allows the scalars to be any value
(including all zeros), it is clear that the origin has to be a point in any valid subspace.
• Space: Given a vector, v = [v_1, v_2, …, v_m]^T, of dimension or length m, then if v_i ∈ ℜ for i = 1, 2, …, m, where ℜ is the set of real numbers, v is contained in the space (or m-dimensional space) denoted as ℜ^m.
As examples, the space ℜ^2 can be visualised as the space consisting of all points on a two dimensional plane, and the space ℜ^3 as all possible points in three dimensional space. For spaces ℜ^4 and above it is impossible to visualise their physical existence, however their mathematical existence is assured to the reader! See also Vector Properties - Subspace, Matrix Properties - Range.
[Figure: the space ℜ^2 consists of all points on the x-y plane — for v = [x_i, y_i]^T with x_i, y_i ∈ ℜ, v ∈ ℜ^2 (v spans the space ℜ^2); the space ℜ^3 consists of all points in the x-y-z three dimensional space — for w = [x_i, y_i, z_i]^T with x_i, y_i, z_i ∈ ℜ, w ∈ ℜ^3.]
• Span: Given a linearly independent set of m-dimensional vectors { x 1, x 2, …x n } , the set of all linear
combinations of these vectors is referred to as the span of { x 1, x 2, …x n } , i.e.
span{x_1, x_2, …, x_n} = ∑_{i=1}^{n} α_i x_i,  where α_i ∈ ℜ    (589)
Note that the span will define a subspace of ℜ m , where m > n . Note that if m = n then the vectors span
the entire space ℜ m . See also Vector Properties - Space/Subspace.
• Transpose Vector: The transpose of a vector is formed by interchanging the rows and columns and is
denoted by the superscript T. For example for a vector, x:
x = [a, b, c]^T  then  x^T = [a b c]    (590)
• 2-norm: See Vector Properties - Norm.
• Unit Vector: A unit vector with respect to the p-norm is one for which ||x||_p = 1. See also Vector Properties - Norm.
• Weight Vector: The name given to the vector formed by the weights of an FIR filter.
See also Matrix, Vector Operations.
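As mentioned under Norm above, here is a minimal numerical sketch (illustrative only, not from the original text) of the 1-, 2- and ∞-norms and of the Cauchy-Schwartz inequality, using the example vector of equation (581):

    import numpy as np

    # Minimal sketch of vector norms and the Cauchy-Schwartz inequality.
    x = np.array([3.0, 4.0, -7.0])
    print(np.linalg.norm(x, 1))        # 1-norm   = 14
    print(np.linalg.norm(x, 2))        # 2-norm   = sqrt(74) = 8.602...
    print(np.linalg.norm(x, np.inf))   # inf-norm = 7

    w = np.array([1.0, -2.0, 0.5])
    lhs = abs(np.dot(w, x))
    rhs = np.linalg.norm(w, 2) * np.linalg.norm(x, 2)
    print(lhs <= rhs)                  # Cauchy-Schwartz: always True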
Vector Scaling: See Vector Operations - Scaling.
Vector Sum Excited Linear Prediction (VSELP): Similar to CELP vocoders except that VSELP
uses more than one codebook. VSELP also has the additional advantage that it can be run on fixed
point DSP processors, unlike CELP which requires floating point computation.
Vector Transpose: See Vector Operations - Transpose.
Vibration: A continuous to and fro motion, or reciprocating motion. Vibrations at audible
frequencies give rise to sound.
Vibrato: This is a simple frequency modulating effect applied to the output of a musical instrument.
For example a mechanical arm on a guitar can be used to frequency modulate the output to produce
a warbling effect. Vibrato can also be performed digitally by simple frequency modulation of a
signal. See also Music, Tremolo.
Virtual Instrument: The terminology used by some companies for a measuring instrument that is
implemented on a PC but is presented in a form that resembles the well known analog version of the
instrument. For example a virtual oscilloscope forms all of the normal controls as buttons and dials
actually drawn on the screen in order that the instrument can immediately be used by an engineer
whether they are familiar with DSP or not.
Virtual Reality: A virtual instrument (substitute) for living. Ultimately, this application of DSP image
and audio may prove to be very addictive.
Visually Evoked Potential: See Evoked Potentials.
Viterbi Algorithm: This algorithm is a means of solving an optimization problem (that can be
framed on a trellis -- or structured set of pathways) by calculating the cost (or metric) for each
possible path and selecting the path with the minimum metric [103]. The algorithm has proven
extremely useful for decoding convolutional codes and trellis coded modulation. For these
applications, the paths are defined on a trellis and the metrics are Hamming distance for
convolutional codes and Euclidean distance for trellis coded modulation. These metrics result in the
smallest possible probability of error when signals are transmitted over an additive white Gaussian
noise channel (this is a common modelling assumption in communications). See also Additive
White Gaussian Noise (AWGN), Channel Coding, Trellis Coded Modulation, Euclidean Distance,
Hamming Distance.
Viterbi Decoder: A technique for decoding convolutionally encoded data streams that uses the
Viterbi algorithm (with a Hamming distance metric) to minimize the probability of data errors in a
digital receiver. See Viterbi Algorithm. See also Channel Coding.
VLSI: Very Large Scale Integration. The name given to the process of integrating millions of
transistors on a single silicon chip to realize various digital devices (logic gates, flip-flops) which in
turn are used to make system level components such as microprocessors, all on a single chip.
VME Bus: A bus found in SUN workstations, VAXs and others. Many DSP board manufacturers
make boards for VME bus, although they are usually a little more expensive than for the PC-Bus.
Vocoders: A vocoder analyzes the spectral components of speech to try to identify the parameters
of the speech waveform that are perceived by the human ear. These parameters are then extracted,
transmitted and used at the receiver to synthesize (approximately) the original speech pattern. The
resulting waveform may differ considerably from the original, although it will sound like the original
speech signal. Vocoders have become popular at very low bit rates (2.4kbits/sec).
Volatile: Semiconductor Memory that loses its contents when the power is removed is volatile.
See also Non-Volatile, Dynamic RAM, Static RAM.
Volterra Filter: A filter based on the non linear Volterra series, and used in DSP to model certain
types of non-linearity. The second order Volterra filter includes second order terms such that the
output of the filter is given by:
y(k) = ∑_{n=0}^{N-1} w_n x(k - n) + ∑_{i=0}^{N-1} ∑_{j=0}^{N-1} w_ij x(k - i) x(k - j)    (591)
where w n are the linear weights and w ij are the quadratic weights. Adaptive LMS based Volterra
filters are also widely investigated and a good tutorial article can be found in [109].
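A direct (and computationally naive) evaluation of equation (591) can be sketched as below (illustrative only, not from the original text); here w_lin holds the N linear weights and W_quad the N × N quadratic weights:

    import numpy as np

    # Minimal sketch: output of a second order Volterra filter, equation (591).
    def volterra2(x_buffer, w_lin, W_quad):
        # x_buffer = [x(k), x(k-1), ..., x(k-N+1)]
        linear = np.dot(w_lin, x_buffer)
        quadratic = x_buffer @ W_quad @ x_buffer   # sum_i sum_j w_ij x(k-i) x(k-j)
        return linear + quadratic

    N = 4
    x_buffer = np.array([0.5, -0.2, 0.1, 0.3])
    w_lin = np.ones(N) * 0.25
    W_quad = np.eye(N) * 0.1
    print(volterra2(x_buffer, w_lin, W_quad))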
Voice Grade Channel: A communications channel suitable for transmission of speech, analog
data, or facsimile, generally over a frequency band from 300Hz to 3400Hz.
Volume Unit (VU): VU meters have been used in recording for many years and give a measure of
the relative loudness of a sound [14], [46]. In general a sound of long duration is actually perceived
by the human ear as louder than a short duration burst of the same sound. VU meters have rather
a “sluggish” mechanical response, and therefore have an in built capability to model the human ear
temporal loudness response. An ANSI standard exists for the design of VU meters. See also Sound
Pressure Level.
Von Hann Window: See Windows.
VXI Bus: A high performance bus used with instruments that can fit on a single PCB card. This
standard is capable of transmitting data at up to 10Mbytes/sec.
W
Waterfall Plot: A graphical 3-D plot that shows frequency plotted on the X-axis, signal power on
the Y-axis, and time elapsing on the Z-axis (into the computer screen). As time elapses and
segments of data are transformed by the FFT, the screen can appear like a waterfall as the 2-D spectra pass along the Z-axis.
Warble Tone: If an audible pure tone is frequency modulated (FM) by a smaller pure tone (typically
a few Hz) the perceived signal is often referred to as a warble tone, i.e. the signal is perceived to
be varying between two frequencies around the carrier tone frequency. Warble tones are often used
in audiometric testing where stimuli signals are played to a subject through a loudspeaker in a
testing room. If pure tones were used there is a possibility that a zone of acoustic destructive
interference would occur at or near the patient’s head thus making the test erroneous. The use of
warble tones greatly reduces this possibility as the zones of destructive interference will not be
static.
To produce a warble tone, consider a carrier tone at frequency f c , frequency modulated by another
tone at frequency f m :
w(t) = sin(2πf_c t + β sin 2πf_m t) = sin θ(t),  i.e.  θ(t) = 2πf_c t + β sin 2πf_m t    (592)
where β is the modulation index which controls the maximum frequency deviation from the carrier
frequency. For example if a carrier tone f_c = 1000 Hz is to be modulated by a tone f_m = 5 Hz such that the warble tone signal frequency varies between 900Hz and 1100Hz at a rate 5 times per
second, then noting that the instantaneous frequency of an FM tone, f, is given by:

f = (1/2π) dθ(t)/dt = f_c + β f_m cos 2πf_m t    (593)

the modulation index required is β = 20 to give the required frequency swing. See also
Audiometer, Audiometry, Binaural Beats, Constructive Interference, Destructive Interference.
[Figure: amplitude vs time (secs) of a warble tone, where an audible frequency carrier tone is modulated by a lower frequency modulating tone usually of a few Hz.]
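A minimal sketch (illustrative only, not from the original text) generating the warble tone of equation (592) with the example values above (f_c = 1000 Hz, f_m = 5 Hz, β = 20):

    import numpy as np

    # Minimal sketch: generate a warble tone, w(t) = sin(2*pi*fc*t + beta*sin(2*pi*fm*t)).
    fs = 8000.0                          # sampling rate (Hz), an assumed value
    t = np.arange(0, 1.0, 1.0 / fs)      # one second of samples
    fc, fm, beta = 1000.0, 5.0, 20.0     # carrier, modulating tone and modulation index

    w = np.sin(2 * np.pi * fc * t + beta * np.sin(2 * np.pi * fm * t))
    # instantaneous frequency f = fc + beta*fm*cos(2*pi*fm*t), i.e. 1000 Hz +/- 100 Hz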
Watt: The surname of the Scottish engineer James Watt who gave his name to the unit of power.
In an electrical system power is calculated from:
P = V · I = I^2 R = V^2 / R    (594)
Waveform: The representation of a signal plotted (usually) as voltage against time, where the
voltage will represent some analog time varying quantity (e.g. audio, speech and so on).
Waveform Averaging: (Ensemble Averaging) The process of taking a number of measurements
of a periodic signal, summing the respective elements in each record and dividing by the number
of measurements. Waveform Averaging is often used to reduce the noise when the noise and
periodic signal are uncorrelated. As an example, averaging is widely used in ECG signal analysis
where the process retains the correlated components of the periodic signal and removes the uncorrelated noise to reveal the distinctive ECG complex.
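A minimal sketch of waveform (ensemble) averaging (illustrative only, not from the original text): R noisy records of the same periodic waveform are summed element by element and divided by R, so the uncorrelated noise is greatly reduced.

    import numpy as np

    # Minimal sketch of waveform (ensemble) averaging.
    np.random.seed(0)
    n = np.arange(200)
    clean = np.sin(2 * np.pi * n / 50)                 # one period-aligned record of the signal
    records = clean + 0.5 * np.random.randn(64, 200)   # 64 noisy measurements of the same waveform

    average = records.mean(axis=0)                     # sum respective elements, divide by the count
    print(np.std(records[0] - clean), np.std(average - clean))   # residual noise is much smaller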
Wavelet Transform: The wavelet transform is an operation that transforms a signal by integrating it against specific functions, often known as the kernel functions. These kernel functions may be referred to as the mother wavelet and the associated scaling function. Using the scaling function and mother
wavelet, multi-scale translations and compressions of these functions can be produced. The
wavelet transform actually generalizes the time frequency representation of the short time Fourier
Transform (STFT). Compared to the STFT the wavelet transform allows non-uniform bandwidths or
frequency bins and allows resolution to be different at different frequencies. Over the last few years
DSP has seen considerable interest and application of the wavelet transform, and the interested
reader is referred to [49].
Web: See World Wide Web.
Weighted Moving Average (WMA): See Finite Impulse Response (FIR) filter. See also Moving Average.
Weight Vector: The weights of an FIR digital filter can be expressed in vector notation such that
the output of a digital filter can be conveniently expressed as a row-column vector product (or inner
product).
[Figure: a 4 weight FIR filter with input samples x_k, x_{k-1}, x_{k-2}, x_{k-3}, weights w_0 … w_3 and output y_k.]

y_k = ∑_{n=0}^{3} w_n x_{k-n} = w_0 x_k + w_1 x_{k-1} + w_2 x_{k-2} + w_3 x_{k-3}

⇒ y_k = w^T x = [w_0 w_1 w_2 w_3] [x_k, x_{k-1}, x_{k-2}, x_{k-3}]^T
If the digital filter is IIR, then two weight vectors can be defined: one for the feedforward weights and one for the feedback weights. For further notational brevity the two weight vectors and two data vectors can be respectively combined into a single weight vector, and a data vector consisting of past input data and past output samples:
[Figure: an IIR filter with feedforward weights a_0, a_1, a_2 operating on x_k, x_{k-1}, x_{k-2}, and feedback weights b_1, b_2, b_3 operating on y_{k-1}, y_{k-2}, y_{k-3}.]

y_k = ∑_{n=0}^{2} a_n x_{k-n} + ∑_{n=1}^{3} b_n y_{k-n} = a_0 x_k + a_1 x_{k-1} + a_2 x_{k-2} + b_1 y_{k-1} + b_2 y_{k-2} + b_3 y_{k-3}

⇒ y_k = a^T x_k + b^T y_{k-1} = [a_0 a_1 a_2] [x_k, x_{k-1}, x_{k-2}]^T + [b_1 b_2 b_3] [y_{k-1}, y_{k-2}, y_{k-3}]^T

⇒ y_k = [a^T  b^T] [x_k ; y_{k-1}] = w^T u_k
See also Vector Properties and Definitions - Weight Vector.
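A minimal sketch (illustrative only, not from the original text) of computing an FIR filter output as the inner product of the weight vector and the data vector of the most recent samples:

    import numpy as np

    # Minimal sketch: FIR output y(k) = w^T x(k) using a sliding data vector.
    w = np.array([0.25, 0.5, 0.25, 0.1])        # weight vector, w0..w3
    x = np.random.randn(100)                    # input signal x(k)
    N = len(w)

    y = np.zeros(len(x))
    for k in range(N - 1, len(x)):
        data_vector = x[k - N + 1:k + 1][::-1]  # [x(k), x(k-1), x(k-2), x(k-3)]
        y[k] = np.dot(w, data_vector)           # inner product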
Weighting Curves: See Sound Pressure Level Weighting Curves.
Weights: The name given to the multipliers of a digital filter. For example, a particular FIR may be
described as having 32 weights. The terms weights and coefficients are used interchangeably. See
also FIR filter, IIR filter, Adaptive Filter.
Well-Conditioned Matrix: See Matrix Properties - Well Conditioned.
Western Music Scale: The Western music scale is based around musical notes separated by
octaves [14]. If a note, X, is an octave higher than another note, Y, then the fundamental frequency
of X is twice that of Y. From one octave frequency to the next in the Western music scale, there are
twelve equitempered frequencies which are spaced one semitone apart, where a semitone is a
logarithmic increase in frequency (If the two octave frequencies are counted then there are thirteen
notes). The Western music scale can be best illustrated on the well known piano keyboard which
comprises a full chromatic scale:
[Figure: a section of the piano keyboard from F3 to G5, showing the white notes F3 G3 A3 B3 C4 D4 E4 F4 G4 A4 B4 C5 D5 E5 F5 G5 and the black sharp/flat keys between them; one octave is marked, with fundamental frequency increasing from left to right.]
A section of the familiar piano keyboard with the names of the notes marked. One octave is
twelve equitempered notes (sometimes called the chromatic scale), or eight notes of a major
scale. The black keys represent various sharps (#) and flats (b). The piano keyboard extends
in both directions repeating the same twelve note scale. Neighboring keys (black or white) are
defined as being a semitone apart. If one note separates two keys, then they are a tone apart.
The letters A to G are the names given to the notes.
The International Pitch Standard defines the fundamental frequency of the note A4 as being 440
Hz. The note A4 is the first A above middle C (C4) which is located near the middle of a piano
keyboard. Each note on the piano keyboard is characterised by its fundamental frequency, f 0 ,
which is usually the loudest component caused by the fundamental mode of vibration of the piano
string being played. The “richness” of the sound of a single note is caused by the existence of other
modes of vibration which occur at harmonics (or integer multiples) of the fundamental, i.e. 2f 0, 3f 0
and so on. The characteristic sound of a musical instrument is produced by the particular harmonics
that make up each note.
On the equitempered Western music scale the logarithmic difference between the fundamental
frequencies of all notes is equal. Therefore noting that in one octave the frequency of the thirteenth
note in sequence is double that of the first note, then if the notes are equitempered the ratio of the
fundamental frequencies of adjacent notes must be 2^{1/12} = 1.0594631… . As defined the ratio between the first and thirteenth note is then of course (2^{1/12})^{12} = 2, or an octave. The actual logarithmic difference in frequency between two adjacent notes on the keyboard is:

log 2^{1/12} = 0.025085…    (595)
Two adjacent notes in the Western music scale are defined as being one semitone apart, and two
notes separated by two semitones are a tone apart. For example, musical notes B and C are a
semitone apart, whereas G and A are a tone apart as they are separated by Ab.
Therefore the fundamental frequencies of 3 octaves of the Western music scale can be
summarised in the following table, where the fundamental frequency of the next semitone is
calculated by multiplying the current note fundamental frequency by 1.0594631...:
Note   Fundamental frequency (Hz)    Note   Fundamental frequency (Hz)    Note   Fundamental frequency (Hz)
C3     130.812                       C4     261.624                       C5     523.248
C3#    138.591                       C4#    277.200                       C5#    554.400
D3     146.832                       D4     293.656                       D5     587.312
Eb3    155.563                       Eb4    311.124                       Eb5    622.248
E3     164.814                       E4     329.648                       E5     659.296
F3     174.614                       F4     349.228                       F5     698.456
F#3    184.997                       F#4    370.040                       F#5    740.080
G3     195.998                       G4     392.040                       G5     784.080
Ab3    207.652                       Ab4    415.316                       Ab5    830.632
A3     220                           A4     440                           A5     880
Bb3    233.068                       Bb4    466.136                       Bb5    932.327
B3     246.928                       B4     493.856                       B5     987.767
A correctly tuned musical instrument will therefore produce notes with the frequencies as stated
above. However it is the existence of subtle fundamental frequency harmonics that gives every
instrument its unique sound qualities. It is also worth noting that certain instruments may have some
or all notes tuned “sharp” or “flat” to create a desired effect. Also, noting that the relationship between pitch perception and frequency is not linear, the high notes of certain instruments may be tuned slightly “sharp”.
Music is rarely represented in terms of its fundamental frequencies and instead music staffs are
used to represent the various notes that make up a particular composition. A piece of music is
usually played in a particular musical key which is a subset of eight notes of an octave and where
those eight notes have aesthetically pleasing perceptible qualities. The major key scales are
realised by starting at a root note and selecting the other notes of the key in intervals of 1, 1, 1/2, 1,
1, 1, 1/2 tones (where 1/2 tone is a semitone). For example the C-major and G-major scales are:
[Figure: the C-major and G-major scales picked out on the chromatic scale. Starting at any root note, X, of the chromatic scale, the X-major scale can be produced by selecting notes in steps of 1, 1, 1/2, 1, 1, 1, 1/2 tones. The figure shows examples of the C- and G-major scales; there are a total of 12 major scales possible.]
There are many other forms of musical keys, such as the natural minors which are formed by the
root note and then choosing in steps of 1, 1/2, 1, 1, 1/2, 1, 1. For more information on the rather
elegant and simple mathematics of musical keys, refer to a text on music theory.
[Figure: the C-major scale written on treble and bass staffs.]
Music notation for the C major scale which has no sharps or flats (i.e., only the white notes of
the piano keyboard). Different notes are represented by different lines and spaces on the staff
(the five parallel lines). The treble clef (the “g” like letter marking the G-line on the top left hand
side of the staff) usually defines the melody of a tune, whereas the bass clef (the “f” like letter
marking the F-line on the bottom left hand side of the staff) defines the bass line. Note that
middle C (C4) is represented on a “ledger” line existing between the treble and bass staffs. On
a piano the treble is played with the right hand, and the bass with the left hand. For other scales
(major or minor), the required sharps and flats are shown next to the bass and treble clefs.
Many musical instruments only have the capability of playing either the treble or bass, e.g. the
flute can only play the treble clef, or the double bass can only play the bass clef.
[Figure: the G-major scale written on treble and bass staffs, with the F notes sharpened to F#.]
Music notation for the G major scale which has one sharp (sharps and flats are the black notes
of the piano keyboard). Therefore whenever an F note is indicated by the music, then an F#
should be played in order to ensure that the G-major scale is used.
So what are the qualities of the Western music scale that make it pleasurable to listen to? The first
reason is familiarity. We are exposed to music from a very early age and most people can recognise
and recall a simple major scale or a tune composed of notes from a major scale. The other reasons
are that the ratios of the frequencies of certain notes when played together are “almost” low integer
ratios and these chords of more than one note take on a very “full” sound.
For example the C-major chord is composed of the 1st, 3rd and 5th notes of the C-major scale, i.e.
C,E,G. If we consider the ratios of the fundamental frequencies of these notes:
E/C = 2^{4/12} = 1.2599… ≈ 5/4
G/C = 2^{7/12} = 1.4983… ≈ 3/2    (596)
G/E = 2^{3/12} = 1.189…  ≈ 6/5
they can be approximated by “almost” integer ratios of the fundamental frequencies. (Note that on
the very old scales -- the Just scale and the Pythagorean scale -- these ratios were exact). When
these three notes are played together the frequency differences actually reinforce the fundamental
which produces a rich strong sound. This can be seen by considering the simple trigonometric
identities:
C + E = cos C_0 + cos(2^{1/3} C_0) ≈ 2 cos((1 - 5/4)/2 · C_0) cos((1 + 5/4)/2 · C_0) = 2 cos(1/8 C_0) cos(9/8 C_0)    (597)

and
C + G = cos C_0 + cos(2^{7/12} C_0) ≈ 2 cos((1 - 3/2)/2 · C_0) cos((1 + 3/2)/2 · C_0) = 2 cos(1/4 C_0) cos(5/4 C_0)    (598)
where C_0 = 2πf_C t and f_C is the fundamental frequency of the C note. Adding together the C and E results in a sound that may be interpreted as a C three octaves below C0 modulating a D. Similarly the addition of the C and G results in a sound that may be interpreted as a C two octaves below C0 modulating an E. The existence of these various modulating subharmonics leads to the “full” and aesthetically pleasing sound of the chord. In addition to major chords, there are many others such as the minor, the seventh and so on. All of the chords have their own distinctive sound to which we have become accustomed and which we associate with certain styles of music.
Prior to the existence of the equitempered scale there were other scales which used perfect integer
ratios between note frequencies. Also around the world there are still many other music scales to be
found, particularly in Asia. See also Digital Audio, Just Music Scale, Music, Music Synthesis,
Pythagorean Scale.
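The equitempered frequencies in the table above can be generated directly from the A4 = 440 Hz pitch standard, as in the minimal sketch below (illustrative only, not from the original text):

    # Minimal sketch: equitempered note frequencies from the A4 = 440 Hz pitch standard.
    A4 = 440.0
    semitone = 2 ** (1 / 12)            # ratio between adjacent notes, 1.0594631...

    def note_freq(semitones_from_A4):
        return A4 * semitone ** semitones_from_A4

    print(note_freq(3))     # C5 = 523.25 Hz (3 semitones above A4)
    print(note_freq(-9))    # C4 = 261.63 Hz (9 semitones below A4)
    print(note_freq(12))    # A5 = 880 Hz (one octave above A4)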
White Noise: A signal that (in theory) contains all frequencies and is (for most purposes)
completely unpredictable. Most white noise is defined as being Gaussian, which means that it has
definable properties of mean (average value) and variance (a measure of its power). White noise
has a constant power per unit bandwidth, and is labelled white because of the analogy with white
light (containing all visible light frequencies with nearly equal power). In a digital system, a white
noise sequence has a flat spectrum from 0Hz to half the sampling frequency.
Wide Sense Stationarity: If a discrete time signal, x ( k ) , has a time invariant mean:
E{x(k)} = ∑_k x(k) p{x(k)}    (599)
and a time invariant autocorrelation function:

r(n) = ∑_k x(k) x(k - n) p{x(k)}    (600)

that is a function only of the time separation, n, and not of the absolute time, k, then the signal is said to be wide sense stationary. Therefore if the signal, x(k), is also ergodic, then:
E{x(k)} ≅ (1/(M_2 - M_1)) ∑_{n=M_1}^{M_2-1} x(n),    for any M_1 and M_2 where M_2 » M_1    (601)
and

E{x^2(k)} ≅ (1/(M_2 - M_1)) ∑_{n=M_1}^{M_2-1} [x(n)]^2,    for any M_1 and M_2 where M_2 » M_1    (602)
For derivation and subsequent implementation of least mean squares DSP algorithms using
stochastic signals, assuming wide sense stationarity is usually satisfactory. See Autocorrelation,
Expected Value, Least Mean Squares, Mean Value, Mean Squared Value, Strict Sense Stationary,
Variance, Wiener-Hopf Equations.
Wideband: A signal that uses a large portion of a particular frequency band may be described as
wideband. The classification into wideband and narrowband depends on the particular application
being described. For example, the noise from a reciprocating (piston) engine may be described as
narrowband as it consists of one main frequency (the drone of the engine) plus some frequency components around this frequency, whereas the noise from a jet engine could be described as
wideband as it covers a much larger frequency band and is more white (random) in its make-up.
In telecommunications wideband or broadband may describe a circuit that provides more
bandwidth than a voice grade telephone line (300-3000Hz), i.e. a circuit or channel that allows frequencies of up to 20kHz to pass. These types of telecommunication broadband channels are used
for voice, high speed data communications, radio, TV and local area data networks.
[Figure: sound pressure (dB) vs frequency (kHz, 0.1 to 25.6) for narrowband engine noise (a dominant drone frequency plus nearby components) and for wideband engine noise (energy spread across the whole band).]
Widrow: Professor Bernard Widrow of Stanford University, USA, generally credited with
developing the LMS algorithm for adaptive digital signal processing systems. The LMS algorithm is
occasionally referred to as Widrow’s algorithm.
Wiener-Hopf Equations: Consider the following architecture based on a FIR filter and a
subtraction element:
[Figure: an N weight FIR filter (weights w_0 … w_{N-1}) with input x(k) and output y(k); the output is subtracted from the desired signal d(k) to give the error e(k).]
The output of an FIR filter, y ( k ) is subtracted from a desired signal, d ( k ) to produce
an error signal, e ( k ) . If there is some correlation between the input signal, x ( k ) and
the desired signal, d ( k ) then values can be calculated for the filter weights,
w ( 0 ) to w ( N – 1 ) in order to minimize the mean squared error, E { e 2 ( k ) } .
If the signals x(k) and d(k) are in some way correlated, then certain applications and systems may
require that the digital filter weights, w ( 0 ) to w ( N – 1 ) are set to values such that the power of the
error signal, e ( k ) is minimised. If weights are found that minimize the error power in the mean
squared sense, then this is often referred to as the Wiener-Hopf solution.
To derive the Wiener Hopf solution it is useful to use a vector notation for the input vector and the
weight vector. The output of the filter, y(k), is the convolution of the weight vector and the input
vector:
y(k) = ∑_{n=0}^{N-1} w_n x(k - n) = w^T x(k)    (603)

where,

w = [w_0  w_1  w_2  …  w_{N-2}  w_{N-1}]^T    (604)

and,

x(k) = [x(k)  x(k-1)  x(k-2)  …  x(k-N+2)  x(k-N+1)]^T    (605)
Assuming that x ( k ) and d ( k ) are wide sense stationary processes and are correlated in some
sense, then the error, e ( k ) = d ( k ) – y ( k ) can be minimised in the mean squared sense.
To derive the Wiener-Hopf equations consider first the squared error:

e^2(k) = [d(k) - y(k)]^2 = d^2(k) + [w^T x(k)]^2 - 2 d(k) w^T x(k) = d^2(k) + w^T x(k) x^T(k) w - 2 w^T d(k) x(k)    (606)

Taking expected (or mean) values we can write the mean squared error (MSE), E{e^2(k)}, as:

E{e^2(k)} = E{d^2(k)} + w^T E{x(k) x^T(k)} w - 2 w^T E{d(k) x(k)}    (607)
Writing in terms of the N × N correlation matrix,

R = E{x(k) x^T(k)} = | r_0      r_1      r_2      …  r_{N-1} |
                     | r_1      r_0      r_1      …  r_{N-2} |
                     | r_2      r_1      r_0      …  r_{N-3} |    (608)
                     | ⋮        ⋮        ⋮        …  ⋮       |
                     | r_{N-1}  r_{N-2}  r_{N-3}  …  r_0     |

and the N × 1 cross correlation vector,
p = E{d(k) x(k)} = [p_0, p_1, p_2, …, p_{N-1}]^T    (609)
gives,
ζ = E{e^2(k)} = E{d^2(k)} + w^T R w - 2 w^T p    (610)

where ζ is used for notational convenience to denote the MSE performance surface. Given that this equation is quadratic in w then there is only one minimum value. The minimum mean squared error (MMSE) solution, w_opt, can be found by setting the (partial derivative) gradient vector, ∇, to zero:

∇ = ∂ζ/∂w = 2Rw - 2p = 0    (611)

⇒ w_opt = R^{-1} p    (612)

[Figure: a simple block diagram for the Wiener-Hopf calculation — the input signal x(k) drives the FIR digital filter, y(k) = w^T x(k); the output y(k) is subtracted from the desired signal d(k) to give the error e(k), and the weights are calculated directly as w = R^{-1} p. Note that there is no feedback and therefore, assuming R is non-singular, the algorithm is unconditionally stable.]
To appreciate the quadratic and single minimum nature of the error performance surface consider
the trivial case of a one weight filter:
ζ = E{d^2(k)} + r w^2 - 2wp    (613)
where E{d^2(k)}, r, and p are all constant scalars. Plotting the mean squared error (MSE), ζ, against the weight, w, produces an upward facing parabola:

[Figure: the mean square error (MSE) performance surface, ζ, for a single weight filter — a parabola whose minimum (the MMSE) occurs at the point of zero gradient, ∇ = dζ/dw = 2rw - 2p = 0, i.e. w_opt = r^{-1} p.]
If the filter has two weights the performance surface is a paraboloid which can be drawn in 3
dimensions:
[Figure: the mean square error (MSE) performance surface, ζ, for a two weight filter — a paraboloid over the (w_0, w_1) plane whose minimum (the MMSE) occurs at the point of zero gradient, ∇ = dζ/dw = 2Rw - 2p = 0, i.e. w_opt = [w_0, w_1]^T_opt = [[r_0, r_1], [r_1, r_0]]^{-1} [p_0, p_1]^T.]
If the filter has more than three weights then we cannot draw the performance surface in three
dimensions, however, mathematically there is only one minimum point which occurs when the
gradient vector is zero. A performance surface with more than three dimensions is often called a
hyperparaboloid.
To actually calculate the Wiener-Hopf solution, w_opt = R^{-1} p, requires that the R matrix and p vector are realised from the data x(k) and d(k), and the R matrix is then inverted prior to premultiplying vector p. Given that we assumed that x(k) and d(k) are stationary and ergodic, then we can estimate all elements of R and p from:

r_n = (1/M) ∑_{i=0}^{M-1} x_i x_{i+n}    and    p_n = (1/M) ∑_{i=0}^{M-1} d_i x_{i+n}    (614)
Calculation of R and p requires approximately 2MN multiply and accumulate (MAC) operations where M is the number of samples in a “suitably” representative data sequence, and N is the adaptive filter length. The inversion of R requires around N^3 MACs, and the matrix-vector multiplication, N^2 MACs. Therefore the total number of computations in performing this one step algorithm is 2MN + N^3 + N^2 MACs. The computation load is therefore very high and real time
operation is computationally expensive. More importantly, if the statistics of signals x ( k ) or d ( k )
change, then the filter weights will need to be recalculated, i.e. the algorithm has no tracking
capabilities. Hence direct implementation of the Wiener-Hopf solution is not practical for real time
DSP implementation because of the high computational load, and the need to recalculate when the
signal statistics change. For this reason real time systems which need to minimize an error signal
power use gradient descent based adaptive filters such as the least mean squares (LMS) or
recursive least squares (RLS) type algorithms. See also Adaptive Filter, Correlation Matrix,
Correlation Vector, Least Mean Squares Algorithm, Least Squares.
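A minimal block based sketch of the direct Wiener-Hopf calculation (illustrative only, not from the original text): R and p are estimated by time averages as in equation (614) and the weights obtained as w_opt = R^{-1} p. Here the "desired" signal is simply a delayed, noisy copy of the input, so the optimum weights approximate that delay.

    import numpy as np

    # Minimal sketch of the direct Wiener-Hopf solution w_opt = R^-1 p.
    np.random.seed(1)
    N, M = 8, 5000                                    # filter length and data record length
    x = np.random.randn(M)                            # input signal x(k)
    d = np.roll(x, 3) + 0.05 * np.random.randn(M)     # desired signal: x delayed by 3 samples plus noise

    # Estimate the autocorrelations r_n and cross correlations p_n, equation (614).
    r = np.array([np.mean(x[:M - n] * x[n:]) for n in range(N)])
    p = np.array([np.mean(d[n:] * x[:M - n]) for n in range(N)])

    R = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])   # Toeplitz correlation matrix
    w_opt = np.linalg.solve(R, p)       # equivalent to R^-1 p, but better conditioned numerically

    print(np.round(w_opt, 2))           # close to [0, 0, 0, 1, 0, 0, 0, 0]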
Whitening Filter: A filter that takes a stochastic signal and produces a white noise output [77]. If
the input stochastic signal is an autoregressive process, the whitening filters are all-zero FIR filters.
See also Autoregressive Model.
Window: A window is a set of numbers that multiply a set of N adjacent data samples. If the data was sampled at frequency f_s, then the window weights N/f_s seconds of data. There are a number of semi-standardized data weighting windows used to pre-weight data prior to frequency domain calculations (FFT/DFT). The most common are the Bartlett, Von Hann, Blackman, Blackmann-harris, Hamming, and Hanning (a short sketch of computing two of these windows follows the list below):
• Bartlett Window: A data weighting window used prior to frequency transformation (FFT) to reduce
spectral leakage. Compared to the uniform window (no weighting) the Bartlett window doubles the width
of the main lobe, while attenuating the main sidelobe by 26dB, compared to the 13dB of the uniform
window. For N data samples, the Bartlett window is defined by:

h(n) = 1.0 - |n|/(N/2)    for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (615)
• Blackmann Window: A data weighting window used prior to frequency transformation (FFT) providing
improvements over the Bartlett and Von Hann windows by increasing spectral leakage rejection. For N
data samples, the Blackmann window is defined by:
h(n) = ∑_{k=0}^{2} a(k) cos(2knπ/N)    for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (616)
with coefficients:
a(0) = 0.42659701, a(1) = 0.49659062, a(2) = 0.07684867
• Blackmann-harris Window: A type of data window often used in the calculation of FFTs/DFTs for
reducing spectral leakage. Similar to the Blackman window, but with four cosine terms:
h(n) = ∑_{k=0}^{3} a(k) cos(2knπ/N)    for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (617)
with coefficients:
a(0) = 0.3635819, a(1) = 0.4891775, a(2) = 0.1365995, a(3) = 0.0106411
• Hamming Window: A data weighting window used prior to frequency transformation (FFT) to reduce spectral leakage. Compared to the uniform window (no weighting) the Hamming window doubles the width of the main lobe, while attenuating the main sidelobe by 46dB, compared to the 13dB of the uniform window. Compared to the similar Von Hann window, the Hamming window sidelobes do not decay as rapidly. For N data samples, the Hamming window is defined by:

h(n) = 0.54 + 0.46 cos(2nπ/N)    for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (618)
• harris Window: A data weighting window used prior to frequency transformation (FFT) to reduce spectral
leakage (similar to the Bartlett and Von Hann windows). For N data samples, the harris window is defined
by:
h(n) = ∑_{k=0}^{3} a(k) cos(2knπ/N)    for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (619)
with coefficients:
a(0) = 0.3066923, a(1) = 0.4748398, a(2) = 0.1924696, a(3) = 0.0259983
• Von Hann Window: A data weighting window used prior to frequency transformation (FFT). Compared to the uniform window (no weighting) the Von Hann window doubles the width of the main lobe, while attenuating the main sidelobe by 32dB, compared to the 13dB of the uniform window. For N data samples, the Von Hann window is defined by:

h(n) = 0.5 + 0.5 cos(2nπ/N)    for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (620)
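As mentioned above, the sketch below (illustrative only, not from the original text) evaluates the Hamming and Von Hann definitions of equations (618) and (620) over n = -N/2 … N/2 and applies one of them to a data block before an FFT:

    import numpy as np

    # Minimal sketch: Hamming and Von Hann windows as defined in equations (618) and (620).
    N = 64
    n = np.arange(-N // 2, N // 2 + 1)

    hamming = 0.54 + 0.46 * np.cos(2 * np.pi * n / N)
    von_hann = 0.5 + 0.5 * np.cos(2 * np.pi * n / N)

    data = np.random.randn(len(n))          # a block of data samples
    spectrum = np.fft.fft(data * hamming)   # pre-weight the data before the FFT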
Wold Decomposition: H. Wold showed that any stationary stochastic discrete time process,
x ( n ) , can be decomposed into two components: (1) a general linear regression of white noise; and
(2) a predictable process. The general linear regression of white noise is given by:
u(k) = 1 + ∑_{n=1}^{∞} b_n v(k - n),    with    ∑_{n=1}^{∞} |b_n| < ∞    (621)
and the predictable process, s ( n ) , can be entirely predicted from its own past samples. s ( n ) and
v ( n ) are uncorrelated, i.e. E { v ( n )s ( k ) } = 0 for all n, k [77]. See also Autoregressive Modelling,
Yule Walker Equations.
Woodbury’s Identity: See Matrix Properties - Inversion Lemma.
Wordlength: The size of the basic unit of arithmetic computation inside a DSP processor. For a
fixed point DSP processor the wordlength is at least 16 bits, and in the case of the DSP56000, it is
24 bits. Floating point DSP processors usually use 32 bit wordlengths. See also DSP Processor,
Parallel Multiplier.
World Wide Web (WWW): The World Wide Web (or the web) has become the de facto standard
on the internet for storing, finding and transferring open information; hypertext (with text, graphics
and audio) is used to access information. Most universities and companies involved in DSP now
have web servers with home pages where the information available on a particular machine is
summarised. There are also likely to be hypertext links available for cross referencing to additional
information. The best way to understand the existence and usefulness of the World Wide Web is to
use it with tools such as Mosaic or Netscape. Speak to your system manager or call up your phone
company or internet service provider for more information.
Woofer: The section of a loudspeaker that reproduces low frequencies is often called the woofer.
The name is derived from the low pitched woof of a dog. The antithesis to the woofer is the tweeter.
See also Tweeter.
X
X-Series Recommendations: The X-series telecommunication recommendations from the
International Telecommunication Union (ITU), advisory committee on telecommunications
(denoted ITU-T and formerly known as CCITT) provide standards for data networks and open
system communication. For details on this series of recommendations consult the appropriate
standard document or contact the ITU.
The well known X.400 standards are defined for the exchange of multimedia messages by store-and-forward transfer. The X.400 standards therefore provide an international service for the
movement of electronic messages without restriction on the types of encoded information
conveyed. The ITU formed a collaborative partnership with the International Organization for
Standards for the development and continued definition of X.400 in 1988 (see ISO 10021, Parts 1-7). A joint technical committee was also formed by the ISO and the International Electrotechnical
Commission (IEC). See also International Electrotechnical Commission, International Organization
for Standards, International Telecommunication Union, ITU-T Recommendations, Standards.
xk: x k or x(k) is often the name assigned to the input signal of a DSP system.
x(k) → [DSP System] → y(k)
Y
yk: y k or y(k) is usually the name assigned to the output signal of a DSP system.
x(k) → [DSP System] → y(k)
Yule Walker Equations: Consider a stochastic signal, u ( k ) produced by inputting white noise,
v ( k ) to an all-pole filter:
White Noise v(k) → [Autoregressive Model {b1, b2, ..., bM}] → u(k)   (Modelled Signal, or Autoregressive Process)
The output signal u(k) is referred to as an autoregressive process, and was generated by a white
noise input at v(k).
If the inverse problem is posed such that you are given the autoregressive signal u ( k ) and the order
of the process (say M), then the autoregressive filter weights {b1, b2, ... bM} that produced the given
process from a white noise signal, v ( n ) can be found by solving the Yule Walker equations:
b_{AR} = R^{-1} r    (622)

where the vector b = [b_1 \; b_2 \; \cdots \; b_{M-1} \; b_M]^T, R is the M × M correlation matrix:

R = E\{u(k-1) u^T(k-1)\} = \begin{bmatrix} r_0 & r_1 & \cdots & r_{M-2} & r_{M-1} \\ r_1 & r_0 & \cdots & r_{M-3} & r_{M-2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ r_{M-1} & r_{M-2} & \cdots & r_1 & r_0 \end{bmatrix}    (623)

and r is the M × 1 correlation vector:

r = E\{u(k) u(k-1)\} = [r_1 \; r_2 \; \cdots \; r_M]^T    (624)

where r_n = E\{u(k) u(k-n)\} = E\{u(k-n) u(k)\}, and E\{.\} is the expectation operator.
See also Autoregressive Modelling.
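As a minimal numerical sketch of the inverse problem (Python with numpy assumed; function and variable names are illustrative), the weights can be estimated from sample autocorrelations of a measured autoregressive signal:

import numpy as np

def yule_walker(u, M):
    # solve b = R^{-1} r using biased sample autocorrelations r_n = E{u(k)u(k-n)}
    K = len(u)
    r = np.array([np.dot(u[n:], u[:K - n]) / K for n in range(M + 1)])
    R = np.array([[r[abs(i - j)] for j in range(M)] for i in range(M)])  # Toeplitz
    return np.linalg.solve(R, r[1:M + 1])

# generate an AR(2) process u(k) = 0.75 u(k-1) - 0.5 u(k-2) + v(k) and re-estimate the weights
rng = np.random.default_rng(1)
v = rng.standard_normal(50000)
u = np.zeros_like(v)
for k in range(2, len(v)):
    u[k] = 0.75 * u[k - 1] - 0.5 * u[k - 2] + v[k]
print(yule_walker(u, 2))    # close to [0.75, -0.5]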
Z
Z-1: Derived from the z-transform of signal, z – 1 is taken to mean a delay of one sample period.
Sometimes denoted simply as ∆ .
Zeroes: A sampled impulse response (e.g. of a digital filter) can be transferred into the Z-domain,
and the zeroes of the function can be found by factorizing the polynomial to find the roots:
H(z) = 1 - 3z^{-1} + 2z^{-2} = (1 - z^{-1})(1 - 2z^{-1})
(625)
i.e. the zeros are z = 1 and z = 2.
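For example, the zeros can be found numerically by treating the coefficients of z^0, z^{-1}, z^{-2} as a polynomial (Python with numpy assumed):

import numpy as np

print(np.roots([1, -3, 2]))    # [2., 1.], i.e. zeros at z = 2 and z = 1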
Zero Order Hold: If a signal is upsampled or reconstructed by holding the same value until the
next sample value, then this is a zero order hold. Also called step reconstruction. See First Order
Hold, Reconstruction Filter.
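A zero order hold is trivially expressed in software by repeating each sample; for example, in Python with numpy (illustrative only):

import numpy as np

x = np.array([0.0, 1.0, 0.5, -1.0])    # original samples
L = 4                                   # upsampling factor
x_zoh = np.repeat(x, L)                 # each value is held for L output samples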
Zero-Padding: See Fast Fourier Transform - Zero Padding.
Zoran: A manufacturer and designer of special purpose DSP devices.
Z-transform: A mathematical transformation used for theoretical analysis of discrete systems.
Transforming a signal or a system into the z-domain can greatly facilitate the understanding of a
particular system [10].
Common Numbers Associated with DSP
In this section numerical values which are in some way associated with DSP and its applications
are listed. The entries are given in an alphabetical type order, where 0 is before 1, 1 is before 2 and
so on, with no regard to the actual magnitude of the number. Decimal points are ignored.
0 dB: If a system attenuates a signal by 0 dB then the signal output power is the same as the signal
input power, i.e.
10 \log_{10}(P_{out} / P_{in}) = 10 \log_{10}(1) = 0 dB    (626)
0x: Used as a prefix by Texas Instruments processors to indicate hexadecimal numbers.
0.0250858... : The base 10 logarithm of the ratio of the fundamental frequency of any two
neighboring notes (one semi-tone apart) on a musical instrument tuned to the Western music scale.
See also Western Music Scale.
0.6366197: An approximation of 2 ⁄ π . See also 3.92dB.
1 bit A/D: An alternative name for a Sigma-Delta ( Σ-∆ ) A/D.
1 bit D/A: An alternative name for a Sigma-Delta ( Σ-∆ ) D/A.
1 bit idea: An alternative name for a really stupid concept.
10^-12 W/m^2: See entry for 2 × 10^-5 N/m^2.
1004Hz: When measuring the bandwidth of a telephone line, the 0dB point is taken at 1004 Hz.
10149: The ISO/IEC standard number for the compact disc read only system description. Sometimes
referred to as the Yellow Book. See also Red Book.
10918: The ISO/IEC standard number for JPEG compression.
1024: 2^10. The number of elements in 1k, when referring to memory sizes, i.e. 1 kbyte = 1024 bytes.
1.024 Mbits/sec: The bit rate of a digital audio system sampling at f s = 32000 Hz with 2 (stereo)
channels and 16 bits per sample.
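The bit rate figures quoted in this and the related entries (1.4112 Mbits/sec, 1.536 Mbits/sec, and so on) all follow from the same simple product; a small Python check (function name illustrative):

def audio_bit_rate(fs_hz, bits_per_sample, channels):
    # uncompressed PCM bit rate in bits per second
    return fs_hz * bits_per_sample * channels

print(audio_bit_rate(32000, 16, 2))    # 1024000 -> 1.024 Mbits/sec
print(audio_bit_rate(44100, 16, 2))    # 1411200 -> 1.4112 Mbits/sec (CD)
print(audio_bit_rate(48000, 16, 2))    # 1536000 -> 1.536 Mbits/sec (DAT)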
1070 Hz: One of the FSK (frequency shift keying) carrier frequencies for the Bell 103, 300 bits/sec
modem. Other frequencies are 1270 Hz, 2025 Hz and 2225 Hz.
103: The Bell 103 was a popular 300 bits/sec modem standard.
1.05946...: The twelfth root of 2, i.e. 2^(1/12). This number is the basis of the modern western music
scale whereby the ratio of the fundamental frequencies of any two adjacent notes on the scale is
1.05946... See also Music, Western Music Scale.
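A quick Python check ties this entry to the 0.0250858..., 261.624 Hz and 440 Hz entries (values are approximate):

import math

semitone = 2 ** (1 / 12)                 # 1.05946...
print(math.log10(semitone))              # 0.0250858...
a4 = 440.0
print(a4 * semitone ** (-9))             # about 261.63 Hz, middle C
for i in range(13):                      # the 12 semitones above A440
    print(i, round(a4 * semitone ** i, 2))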
10.8dB: Used in relation to quantisation noise power calculations; 10 log10(1/12) = -10.8 dB.
11.2896 MHz: 2 × 5.6448 MHz and used as a clock for oversampling sigma delta ADCs and DACs.
5.6448 MHz sampling frequency can be decimated by a factor of 128 to 44.1kHz, a standard
hifidelity audio sampling frequency for CD players.
115200 bits/sec: The 115200 bits/sec modem is an eight times speed version of the very popular 14400 modem and became available in the mid 1990s. This modem uses echo cancellation, data equalisation, and data compression techniques to achieve this data rate. See also 300, 2400, V-series recommendations.
11544: The ISO/IEC standard number for JBIG compression.
11172: The ISO/IEC standard number for MPEG-1 video compression.
120 dB SPL: The nominal threshold of pain from a sound expressed as a sound pressure level.
1200 Hz: The carrier frequency of the originating end of the ITU V22 modem standard. The
answering end uses a carrier frequency of 2400Hz. Also one of the carrier frequencies for the FSK
operation of the Bell 202 and 212 standards, the other one being 2400Hz.
1209 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
12.288 MHz: 2 × 6.144 MHz and used as a clock for oversampling sigma delta ADCs and DACs.
6.144 MHz sampling frequency can be decimated by a factor of 128 to 48kHz, a standard hifidelity
audio sampling frequency for DAT.
128: 2^7
12.8 MHz: 2 × 6.4 MHz and used as a clock for oversampling sigma delta ADCs and DACs. 6.4
MHz sampling frequency can be decimated by a factor of 64 to a sampling frequency of 100kHz.
13 dB: The attenuation of the first sidelobe of the function 10 log10 (sin x ⁄ x)^2 is approximately 13 dB.
See also Sine Function.
1336 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
13522: The ISO/IEC standard number for MHEG multimedia coding.
13818: The ISO/IEC standard number for MPEG-2 video compression.
-13 dB: See 13 dB.
1.4112 Mbits/sec: The bit rate of a CD player sampling at fs = 44100Hz, with 2 (stereo) channels
and 16 bits per sample.
14400 bits/sec: The 14400 bits/sec modem was a six times speed version of the very popular 2400
modem and became available in the early 1990s, with the cost falling dramatically in a few years.
See also 300, 2400, V-series recommendations.
1.452 - 1.492 GHz: The 40 MHz radio frequency band allocated for satellite DAB (digital audio
broadcasting) at the 1992 World Administrative Radio Conference in Spain. Due to other plans for
this bandwidth, a number of countries selected other bandwidths such as 2.3 GHz in the USA, and
2.5 GHz in fifteen other countries.
147: The number of the European digital audio broadcasting (DAB) project started in 1987, and
formally named Eureka 147. This system has been adopted by ETSI (the European
Telecommunication Standards Institute) for DAB and currently uses MPEG Audio Layer 2 for
compression.
147:160: The ratio of the sampling rates of a CD player and a DAT player when both are divided by their
greatest common divisor (300), i.e.

44100/300 : 48000/300 = 147 : 160    (627)
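A two line Python check of this ratio:

import math

g = math.gcd(44100, 48000)
print(g, 44100 // g, 48000 // g)    # 300 147 160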
1477 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
1.536 Mbits/sec: The bit rate of a DAT player sampling at fs = 48000Hz, with 2 (stereo) channels
and 16 bits per sample.
160: See 147.
1633 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
16384: 2^14
1.76 dB: Used in relation to quantisation noise power calculations; 10 log10(1.5) = 1.76 dB.
176.4kHz: The sample rate when 4 ×’s oversampling a CD signal where the sampling frequency
f s = 44.1kHz .
1800 Hz: The carrier frequency of the QAM (quadrature amplitude modulation) ITU V32 modem
standard.
2 bits: American slang for a quarter (dollar).
2-D FFT: The extension of the (1-D) FFT into two dimensions to allow Fourier transforms on
images.
2 × 10^-5 N/m^2: The reference pressure, sometimes denoted as p_ref, for the measurement of sound
pressure levels (SPL). This pressure can also be expressed as 20 µPa (micropascals), and corresponds
to a reference intensity, I_ref, of 10^-12 W/m^2. The reference was chosen as it is close to the absolute
level of a tone at 1000Hz that can just be detected by the human ear; the average human threshold of
hearing at 1000Hz is about 6.5dB SPL. The displacement of the eardrum at this sound level is suggested
to be 1/10th the diameter of a hydrogen molecule!
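Sound pressure level in dB SPL is computed relative to this reference; a short Python sketch (function name illustrative):

import math

P_REF = 2e-5    # reference pressure in N/m^2 (20 micropascals)

def spl(p):
    # sound pressure level in dB SPL for an rms pressure p in N/m^2
    return 20 * math.log10(p / P_REF)

print(spl(2e-5))    # 0 dB SPL, around the threshold of hearing
print(spl(20.0))    # 120 dB SPL, around the threshold of pain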
20 dB/octave: Usually used to indicate how well a low pass filter attenuates at frequencies above
the 3dB point. 20dB per octave means that each time the frequency doubles the attenuation
of the filter increases by a factor of 10 (in amplitude), since 20 dB = 20 log10(10). 20dB/decade is the
same roll-off as 6dB/octave. See also Decibels, Roll-off.
20 µPa (micropascals): See entry for 2 × 10^-5 N/m^2.
205: The number of data points used in Goertzel’s algorithm (a form of discrete Fourier transform
(DFT)) for tone detection.
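As a minimal sketch of such a detector (Python; the 8 kHz sampling rate and the tone frequencies are taken from the related DTMF entries, and the tone pair 770 Hz + 1336 Hz is used purely as an example):

import math

def goertzel_power(x, fs, f):
    # squared magnitude of the DFT of x at (approximately) frequency f
    N = len(x)
    k = round(N * f / fs)               # nearest DFT bin
    coeff = 2 * math.cos(2 * math.pi * k / N)
    s1 = s2 = 0.0
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

fs, N = 8000, 205
x = [math.sin(2 * math.pi * 770 * n / fs) + math.sin(2 * math.pi * 1336 * n / fs)
     for n in range(N)]
for f in (697, 770, 852, 941, 1209, 1336, 1477, 1633):
    print(f, round(goertzel_power(x, fs, f), 1))   # peaks at 770 Hz and 1336 Hz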
2025 Hz: One of the FSK (frequency shift keying) carrier frequencies for the Bell 103, 300 bits/sec
modem. Other frequencies are 1070 Hz, 1270 Hz and 2225 Hz.
2048: 2^11
2100: The part number of most Analog Devices fixed point DSP processors.
21000: The part number of most Analog Devices floating point DSP processors.
2225 Hz: One of the FSK (frequency shift keying) carrier frequencies for the Bell 103, 300 bits/sec
modem. Other frequencies are 1070 Hz, 1270 Hz and 2025 Hz.
24 bits: The fixed point wordlength of some members of the Motorola DSP56000 family of DSP
processors.
2400 bits/sec: The 2400 bits/sec modems appeared in the early 1990s as low cost communication
devices for remote computer access and FAX transmission. The bit rate of 2400 was chosen as it
is a factor of 8 faster than the previous 300 bits/sec modem. Data rates of 2400 were achieved by
using echo cancellation and data equalisation techniques. The 2400 bits/sec modem dominated the
market until the cost of the 9600 modems started to fall in about 1992. To ensure simple
backwards compatibility all modems are now produced at multiples of 2400 bits/sec, i.e. 4800, 7200,
9600, 14400, 28800, 57600, 115200. See also V-series recommendations.
2400 Hz: The carrier frequency of the answering end of the ITU V22 modem standard. The
originating end uses a carrier frequency of 1200Hz. Also one of the carrier frequencies for the FSK
operation of the Bell 202 and 212 standards, the other one being 1200Hz.
256: 2^8
26 dB: The attenuation of the first sidelobe of the function 20 log10 (sin x ⁄ x)^2 is approximately 26 dB.
See also Sine Function.
261.624 Hz: The fundamental frequency of middle C on a piano tuned to the Western music scale.
See also 440 Hz.
2.718281... : The (truncated) value of e, the base of the natural logarithm.
28800 bits/sec: The 28800 bits/sec modem is a double speed version of the very popular 14400 modem and became available in the mid 1990s. This modem uses echo cancellation, data equalisation, and data compression techniques to achieve this data rate. See also 300, 2400, V-series recommendations.
2.8224 MHz: An intermediate oversampling frequency used for sigma delta ADCs and DACs used
with CD audio systems. 2.8224 MHz can be decimated by a factor of 64 to 44.1 kHz, the standard
sampling frequency of CD players.
3 dB: See 3.01dB.
3.01 dB: The approximate magnitude of 10 log10(0.5) = -3.0103 dB. If a signal is attenuated by 3dB then
its power is halved.
300: The greatest common divisor of the sampling rates of a CD player and a DAT player, i.e.

44100/300 : 48000/300 = 147 : 160    (628)
300 bits/sec: The bit rate of the first commercial computer modems. Although 28800 bits/sec is
now easily achievable, 300 bits/sec modems probably outsell all other speeds of modems by virtue
of the fact that most credit card telephone verification systems can perform the verification task at
300 bits/sec in a few seconds. See also Bell 103, 2400, V-series recommendations.
3.072 MHz: An intermediate oversampling frequency used for sigma delta ADCs and DACs used
with DAT and other professional audio systems. 3.072 MHz can be decimated by a factor of 64 to
48kHz, the current standard professional hifidelity audio sampling frequency.
32 kHz: A standard hifidelity audio sampling rate. The sampling rate of NICAM for terrestrial
broadcasting of stereo audio for TV systems in the United Kingdom.
32 bits: The wordlength of most floating point DSP processors. 24 bits are used for the mantissa,
and 8 bits for the exponent.
3.2 MHz: An intermediate oversampling frequency for sigma delta ADCs and DACs that can be
decimated by a factor of 32 to 100 kHz.
320: The part number for most Texas Instruments DSP devices.
32768: 2^15
3.3 Volt Devices: DSP processor manufacturers are now releasing devices that will function with
3.3 volt power supplies, leading to a reduction of power consumption.
350 Hz: Tones at 350 Hz and 440 Hz make up the dialing tone for telephone systems.
35786 km: The height above the earth of a satellite geostationary orbit. This leads to between 240
and 270ms one way propagation delay for satellite enabled telephone calls. On a typical
international telephone connection the round-trip delay can be as much as 0.6 seconds making
voice conversation difficult. In the likely case of additional echoes voice conversation is almost
impossible without the use of echo cancellation strategies.
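A quick check of these figures, assuming a propagation speed of 3 × 10^8 m/s and an earth station directly below the satellite (real paths are longer, hence the 240 to 270 ms range):

C = 3.0e8               # speed of light in m/s (approximate)
h = 35786e3             # geostationary altitude in metres
one_way = 2 * h / C     # ground -> satellite -> ground
print(round(one_way * 1000), "ms")     # about 239 ms one way
print(round(2 * one_way, 2), "s")      # about 0.48 s round trip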
352.8 kbits/sec: One quarter of the bit rate of hifidelity CD audio sampled at 44.1 kHz, with 16
bit samples and stereo channels ( 44100 × 16 × 2 = 1411200 bits/sec ). The data compression
scheme known as PASC (psychoacoustic subband coding) used on DCC (digital compact cassette)
compresses by a factor 4:1 and therefore has a data rate of 352.8 kbits/sec when used on data
sampled at 44.1kHz.
352.8kHz: The sample rate when 8 ×’s oversampling a CD signal where the sampling frequency
is f s = 44.1kHz .
384 kbits/sec: One quarter of the bit rate of hifidelity audio sampled at 48kHz, with 16 bit
samples and stereo channels ( 48000 × 16 × 2 = 1536000 bits/sec ). The data compression
scheme known as PASC (psychoacoustic subband coding) used on DCC (digital compact cassette)
compresses by a factor 4:1 and therefore has a data rate of 384 kbits/sec when used on data
sampled at 48kHz.
3.92dB: The attenuation of the frequency response of a step reconstructed signal at f_s ⁄ 2. The
attenuation is the result of the zero order hold “step” reconstruction which is equivalent to
convolving the signal with a unit pulse of time duration t_s = 1 ⁄ f_s, or in the frequency domain,
multiplying by the sinc function, H(f):

H(f) = \frac{\sin(\pi f t_s)}{\pi f t_s}    (629)

Therefore at f_s ⁄ 2, the droop in the output signal spectrum has a value of:

H(f_s ⁄ 2) = \frac{\sin(\pi/2)}{\pi/2} = \frac{2}{\pi} = 0.63662    (630)

which in dB’s can be expressed as:

20 \log_{10}(2/\pi) = -3.922398 dB    (631)
4 dB: Sometimes used as an approximation to 3.92dB. See also 3.92dB
4096: 2^12
4294967296: 2^32
440 Hz: The fundamental frequency of the first A note above middle C on a piano tuned to the
Western music scale. Definition of the frequency of this one note allows the fundamental tuning
frequency of all other notes to be defined.
Also the pair of tones at 440 Hz and 350 Hz make up the telephone dialing tone, and 440 Hz and
480 Hz make up the ringing tone for telephone systems.
44.1kHz: The sampling rate of Compact Disc (CD) players. This sampling frequency was originally
chosen to be compatible with U-matic video tape machines which had either a 25 or 30Hz frame
rate, i.e. 25 and 30 are both factors of 44100.
44.056kHz: A variant of the Compact Disc (CD) sampling rate. The CD sampling rate of 44.1kHz was
originally chosen to be compatible with U-matic video tape machines which had either a 25 or 30Hz
frame rate, i.e. 25 and 30 are both factors of 44100. When master recording was done on a 29.97Hz
frame rate video machine, this required the sampling rate to be modified to a nearby number compatible
with a 29.97Hz frame rate, i.e. 44.056kHz. This sampling rate is redundant now.
4.76cm/s: The tape speed of compact cassette players, and also of digital compact cassette
players (DCC).
4.77 dB: 10 log10(3) ≈ 4.77dB, i.e. a signal that has its power amplified by a factor of 3, has an
amplification of 4.77dB.
48kHz: The sampling rate of digital audio tape (DAT) recorders, and the sampling rate used by
most professional audio systems.
480 Hz: The tone pair 480 Hz and 620 Hz make up the busy signal on telephone systems.
4800 bits/sec: The 4800 bits/sec modem was a double speed version of the very popular 2400
modem. Data rates of 4800 were achieved using echo cancellation and data equalisation
techniques. See also 2400, V-series recommendations.
512: 2^9
56000: The part number for most Motorola fixed point DSP devices.
5.6448 MHz: An oversampling frequency for sigma delta ADCs and DACs used with CD players.
5.6448 MHz can be decimated by a factor of 128 to 44.1kHz the standard hifidelity audio sampling
frequency for CD players.
57600 bits/sec: The 57600 bits/sec data rate modem is a 4 times speed version of the very
popular 14400 modem and became available in the mid 1990s. This modem uses echo
cancellation, data equalisation, and data compression techniques to achieve this data rate. See also
300, 2400, V-series recommendations.
6dB/octave: The “6” is an approximation for 20 log10(2) = 6.0206. Usually used to indicate how
well a low pass filter attenuates at frequencies above the 3dB point. 6dB per octave means that
each time the frequency doubles the attenuation of the filter increases by a factor of 2 (in amplitude),
since 20 log10(2) ≈ 6 dB. 6dB/octave is the same roll-off as 20dB/decade. See also Decibels, Roll-off.
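For illustration, the magnitude response of a simple first order low pass filter shows this 6dB/octave (20dB/decade) behaviour well above the cut-off; a short Python sketch (function name illustrative):

import math

def first_order_lowpass_db(f, fc):
    # magnitude response in dB of a first order low pass filter with cut-off fc
    return -10 * math.log10(1 + (f / fc) ** 2)

fc = 1000.0
for f in (1000.0, 2000.0, 10000.0, 20000.0, 100000.0):
    print(f, round(first_order_lowpass_db(f, fc), 2))
# well above fc each doubling of frequency adds about 6 dB of attenuation,
# and each decade adds about 20 dB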
6.144 MHz: An oversampling frequency for sigma delta ADCs and DACs used with DAT and other
professional audio systems. 6.144 MHz can be decimated by a factor of 128 to 48kHz, the current
standard professional hifidelity audio sampling frequency.
620 Hz: The tone pair 480 Hz and 620 Hz make up the busy signal on telephone systems.
6.4 MHz: An oversampling frequency for sigma delta ADCs and DACs that can be decimated by a
factor of 64 to 100 kHz.
64kBits/sec: A standard channel bandwidth for data communications. If a channel has a
bandwidth of approximately 4kHz, then the Nyquist sampling rate would be 8kHz, and data of 8 bit
wordlength is sufficient to allow good fidelity of speech to be transmitted. Note that 64000 bits/sec
= 8000Hz × 8 bits.
6.4 MHz: A common sampling rate for a 64 times oversampled sigma-delta ( Σ-∆ ) A/D, resulting
in up to 16 or more bits of resolution at 100kHz after decimation by 64.
65536: 2^16
697 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
705600 bits/sec: The bit rate of a single channel of a CD player, with 16 bit samples, and
sampling at f_s = 44100 Hz.
705.6 kHz: The sample rate when 16 ×’s oversampling a CD signal where the sampling
frequency f_s = 44100 Hz.
7200 bits/sec: The 7200 bits/sec modem was a three times speed version of the very popular
2400 modem and became available in the early 1990s, with the cost falling dramatically in a few
years. Data rates of 7200 were achieved using echo cancellation and data equalisation techniques.
See also 2400, V-series recommendations.
741 Op-Amp: The part number of a very popular operational amplifier chip widely used for signal
conditioning, amplification, and anti-alias, reconstruction filters.
768000 bits/sec: The bit rate of a single channel DAT player with 16 bits per sample, and sampling
at f s = 48000 Hz .
770 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
8 kHz: The sampling rate of most telephonic based speech communication.
8192: 2^13
852 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
941 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.
9.54dB: 20 log10(3) ≈ 9.54dB, i.e. a signal that has its voltage amplified by a factor of 3, has an
amplification of 9.54 dB.
9600 bits/sec: The 9600 bits/sec modem was a four times speed version of the very popular
2400 modem and became available in the early 1990s, with the cost falling dramatically in a few
years. Data rates of 9600 were achieved by using echo cancellation and data equalisation
techniques. See also 2400, V-series recommendations.
96000: The part number for most Motorola 32 bit floating point devices.
Acronyms:
ADC - Analogue to Digital Converter.
ADSL - Asymmetric Digital Subscriber Line
ADSR - Attack-Decay-Sustain-Release.
AES/EBU - Audio Engineering Society/European Broadcast Union.
A/D - Analogue to Digital Converter.
ADPCM - Adaptive Differential Pulse Code Modulation.
ANC - Active noise cancellation.
ANSI - American National Standards Institute.
AIC - Analogue Interfacing Chip.
ARB - Arbitrary Waveform Generation.
ASCII - American Standard Code for Information Interchange.
ASIC - Application Specific Integrated Circuit.
ASK - Amplitude Shift Keying.
ASPEC - Adaptive Spectral Perceptual Entropy Coding .
ASSP - Acoustics, Speech and Signal Processing.
AVT - Active Vibration Control.
AWGN - Additive White Gaussian Noise.
BER - Bit Error Rate.
BISDN - Broadband Integrated Services Digital Network.
BPF - Band pass filter.
BPSK - Binary Phase Shift Keying.
CCR - Condition Code Register.
CCITT - Comité Consultatif International Télégraphique et Téléphonique. (International
Consultative Committee on Telegraphy and Telecommunication, now known as ITU-T.)
CCIR - Comité Consultatif International Radiocommunication. (International Consultative
Committee on Radiocommunication, now known as ITU-R.)
CD - Compact Disc
CD-DV: Compact Disc Digital Video.
CELP - Coded Excited Linear Prediction Vocoders.
CENELEC - Comité Européen de Normalisation Electrotechnique (European Committee for
Electrotechnical Standardization).
CIF - Common Intermediate Format.
CIRC - Cross Interleaved Reed Solomon code.
CISC - Complex Instruction Set Computer.
CPM - Continuous Phase Modulation.
CPU - Central Processing Unit.
CQFP - Ceramic Quad Flat Pack.
CRC - Cyclic Redundancy Check.
CVSD - Continuous variable slope delta modulator.
D/A - Digital to analogue converter.
DAB - Digital Audio Broadcasting.
DAC - Digital to analogue converter.
dB - decibels.
DECT - Digital European Cordless Telephone.
DL - Difference Limen.
DARS - Digital Audio Radio Services.
DBS - Direct Broadcast Satellites.
DCC - Digital Compact Cassette.
DCT - Discrete Cosine Transform.
DDS - Direct Digital Synthesis.
DECT - Digital European Cordless Telephone.
DFT - Discrete Fourier Transform.
DLL - Dynamic Link Library.
DMA - Direct Memory Access.
DPCM - Differential Pulse Code Modulation.
DPSK - Differential Phase Shift Keying.
DRAM - Dynamic Random Access Memory.
DSL - Digital Subscriber Line
DSP - Digital Signal Processing.
DTMF - Dual tone Multifrequency.
DSfP - Digital Soundfield Processing.
ECG - Electrocardiograph.
EEG - Electroencephalograph.
EFM - Eight to Fourteen Modulation.
EMC - Electromagnetic compatibility.
EPROM - Electrically programmable read only memory.
EEPROM - Electrically Erasable Programmable Read Only Memory.
EQ - Equalization (usually in acoustic applications).
ETSI - European Telecommunications Standards Institute.
FIR - Finite Impulse Response.
FFT - Fast Fourier Transform.
FSK - Frequency Shift Keying.
G - prefix meaning 10^9, as in GHz, thousands of millions of Hertz.
GII - Global Information Infrastructure.
GIF - Graphic Interchange Format.
GSM - Global System For Mobile Communications (Group Speciale Mobile).
HDSL - High speed Digital Subscriber Line
http - Hypertext Transfer Protocol.
IEEE - Institute of Electrical and Electronic Engineers (USA).
IEE - Institute of Electrical Engineers (UK).
IEC - International Electrotechnical Commission.
IIR - Infinite impulse response.
IIF - Image Interchange Facility.
INMARSAT - International Mobile Satellite Organization.
ISDN - Integrated Services Digital Network.
ISO - International Organisation for Standards.
ISO/IEC JTC - International Organization for Standards/International Electrotechnical Commission Joint Technical Committee.
ITU - International Telecommunications Union.
ITU-R - International Telecommunications Union - Radiocommunication.
ITU-T - International Telecommunications Union - Telecommunication.
I/O - Input/Output.
JBIG - Joint Binary Image Group.
JND - Just Noticeable Difference.
JPEG - Joint Photographic Expert Group.
JTC - Joint Technical Committee.
k - prefix meaning 10^3, as in kHz, thousands of Hertz.
LFSR - Linear Feedback Shift Register Coding.
LPC - Linear Predictive Coding.
LSB - Least Significant Bit.
M - prefix meaning 10^6, as in MHz, millions of Hertz.
MAC - Multiply Accumulate.
MFLOPS - Millions of Floating Point Operations per Second.
MIDI - Musical Instrument Digital Interface.
MAF - Minimum Audible Field.
MAP - Minimum Audible Pressure.
MIPS - Millions of Instructions per second.
MLPC - Multipulse Linear Predictive Coding.
MA - Moving Average.
MD - Mini-Disc.
MMSE - Minimum Mean Squared Error.
MHEG - Multimedia and Hypermedia Experts Group.
MPEG - Moving Picture Experts Group.
MRELP - M..
ms - millisecond (10^-3 seconds).
MSB - Most Significant Bit.
MSE - Mean Squared Error.
MSK - Minimum Shift Keying.
MIX - Modular Interface eXtension.
MUSICAM - Masking pattern adapted Universal Subband Integrated Coding And Multiplexing.
NRZ - Non Return to Zero.
ns - nanosecond ( 10 –9 seconds).
OKPSK - Offset-Keyed Phase Shift Keying.
OKQAM - Offset-Keyed Quadrature Amplitude Modulation.
OOK - On Off Keying.
OPSK - Offset-Keyed Phase Shift Keying.
OQAM - Offset-Keyed Quadrature Amplitude Modulation.
PAM - Pulse Amplitude Modulation.
PASC - Precision Adaptive Subband Coding.
PCM - Pulse Code Modulation.
PCMCIA - Personal Computer Memory Card International Association.
PN - Pseudo-Noise.
ppm - Parts per million.
PPM - Pulse Position Modulation.
PRBS - Pseudo Random Binary Sequence.
PSK - Phase Shift Keying.
PSTN - Public Switched Telephone Network.
PTS - Permanent Threshold Shift.
PWM - Pulse Width Modulation.
PDA - Personal Digital Assistant.
PGA - Pin Grid Array.
PID - Proportional Integral Derivative (controller).
PQFP - Plastic Quad Flat Pack.
PRNS - Pseudo Random Noise Sequence.
QAM - Quadrature Amplitude Modulation.
QPSK - Quadrature Phase Shift Keying.
RAM - Random access memory.
RBDS - Radio Broadcast Data System.
RELP - Residual Excited Linear Prediction Vocoder.
RIFF - Resource Interchange File Format.
RISC - Reduced Instruction Set Computer.
RLC - Run Length Coding.
RLE - Run Length Encoding.
ROM - Read only memory.
RPE - Recursive Predictor Error or Regular Pulse Excitation
RZ - Return to Zero.
Rx - Receive.
SBM - Super Bit Mapping (A trademark of Sony).
SCMS - Serial Copy Management System.
SFG - Signal Flow Graph.
SGML - Standard Generalized Markup Language.
S/H - Sample and Hold.
SINR - Signal to Interference plus Noise Ratio.
SNR - Signal to Noise Ratio.
S/N - Signal to Noise ratio.
S/P-DIF - Sony/Philips Digital Interface Format.
SR - Status Register.
SPL - Sound Pressure Level.
SRAM - Static random access memory.
SRC - Sample Rate Converter.
TPDF - Triangular Probability Density Function.
TCM - Trellis Coded Modulation.
THD - Total Harmonic Distortion.
THD+N - Total Harmonic Distortion plus Noise.
TTS - Temporary Threshold Shift.
Tx - Transmit.
VSELP - Vector Sum Excited Linear Prediction.
VU - Volume Unit.
WMA - Weighted Moving Average.
WWW - World Wide Web.
µsec - microsecond (10^-6 seconds).
Standards Organisation
ANSI - American National Standards Institute.
BS - British Standard.
IEC - International Electrotechnical Commission.
IEEE - Institute of Electrical and Electronics Engineers.
ISO - International Organisation for Standards.
References and Further Reading
Textbooks
[1] S. Banks. Signal Processing, Image Processing and Pattern Recognition. Prentice Hall, Englewood Cliffs, NJ, 1990.
[2] T.P. Barnwell III, K. Nayebi, C.H. Richardson. Speech Coding. A Computer Laboratory Textbook. John Wiley and Sons, 1996.
[3] A. Bateman and W. Yates. Digital Signal Processing Design. Pitman Publishing 1988.
[4] E.H. Berger, W.D. Ward, J.C. Morrill, L.H. Royster. Noise and Hearing Conservation Manual, 4th Edition. American Industrial Hygiene Association.
[5] R.L. Brewster. ISDN Technology. Chapman & Hall, London, 1993.
[6] R.G. Brown, P.Y.C. Hwang. Introduction to Random Signals and Applied Kalman Filtering, John Wiley and Sons, 1992.
[7] C.S. Burrus, J.H. McLellan, A.V. Oppenheim, T.W. Parks, R.W. Schafer, H.W. Schuessler. Computer Based Exercises for Signal Processing Using Matlab. Prentice Hall, 1994.
[8] J.C. Candy, G.C. Temes. Oversampling Delta-Sigma Data Converters. Piscataway, NJ; IEEE Press, 1992.
[9] L.W. Couch II. Modern Communication Systems: Principles and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1995.
[10] D.J. DeFatta, J.G. Lucas, W.S. Hodgkiss. Digital Signal Processing: A System Design Approach. John Wiley, New
York, 1988.
[11] J.R. Deller, J.G. Proakis, J.H.K. Hansen. Discrete Time Processing of Speech Signals. MacMillan, New York,
1993.
[12] P.D. Denyer and D. Renshaw. VLSI Signal Processing - A Bit Serial Approach. Addison-Wesley, 1995.
[13] G. De Poli, A Piccialli, C. Roads. Representations of Musical Signals. The MIT Press, Boston, USA, 1991.
[14] J.M. Eargle. Music Sound and Technology. Van Nostrand Reinhold, 1990.
[15] G.H. Golub, C.F. Van Loan. Matrix Computations. John Hopkins University Press, 1989.
[16] J.G. Gibson, The Mobile Communications Handbook. CRC Press/IEEE Press, 1996.
[17] S. Haykin. Adaptive Filter Theory (2nd Edition). Prentice Hall, Englewood Cliffs, NJ, 1990.
[18] S. Haykin. Neural Networks: A Comprehensive Foundation. MacMillan College, 1994.
[19] D.R. Hush and B.G. Horne. Progress in supervised neural networks. IEEE Signal Processing Magazine, Vol. 10,
No. 1, pp. 8-39, January 1993.
[20] K. Hwang, F. Briggs. Computer Architecture and Parallel Processing. McGraw-Hill, 1985.
[21] E.C. Ifeachor, B.W. Jervis. Digital Signal Processing: A Practical Approach. Addison-Wesley, 1993.
[22] N. Kalouptsidis,Theodoridis. Adaptive System Identification and Signal Processing Algorithms. Prentice Hall,
1993.
[23] A. Kamas, E.A. Lee. Digital Signal Processing Experiments. Prentice-Hall, Englewood Cliffs, NJ, 1989.
[24] S.Y. Kung. Digital Neurocomputing. Prentice-Hall, Englewood Cliffs, NJ, 1992.
[25] S.Y. Kung. VLSI Array Processors. Prentice-Hall, Englewood Cliffs, NJ, 1987.
[26] P.A. Lynn. An Introduction to the Analysis and Processing of Signals, 1982.
[27] J.D. Martin. Signals and Processes: A Foundation Course.
[28] C. Marven, G. Ewers. A Simple Approach to Digital Signal Processing. Texas Instruments Publication, 1993.
[29] R.M. Mersereau, M.J.T. Smith. Digital Filtering. John Wiley, New York, 1993.
[30] B.C.J Moore. An Introduction to the Psychology of Hearing.
[31] A.V. Oppenheim, R.W. Schafer. Discrete Time Signal Processing. Prentice Hall, Englewood Cliffs, NJ, 1989.
[32] R.A. Penfold. Synthesizers for Musicians. PC Publishing, London, 1989.
[33] K. Pohlmann. Advanced Digital Audio. Howards Sams, Indiana, 1991.
[34] K. Pohlmann. An Introduction to Digital Audio, Howard Sams, Indiana, 1989.
[35] T.S. Rappaport. Wireless Communications. IEEE Press, New York, 1996.
[36] P. Regalia. Adaptive IIR Filtering. Marcel Dekker, 1995.
[37] F. Rumsey. Digital Audio. Butterworth-Heinemann, 1991
[38] E. Rogers and Y. Li. Parallel Processing in a Control Systems Environment. Prentice Hall, Englewood Cliffs, NJ,
1993.
[39] K. Sayood. Introduction to Data Compression. Morgan-Kaufman, 1995.
[40] M. Schwartz. Information, Transmission, and Modulation Noise. McGraw-Hill.
[41] N.J.A. Sloane, A.D. Wyner (Editors). Claude Elwood Shannon: Collected Papers. IEEE Press, 1993, Piscataway,
NJ. ISBN 0-7803-0434-9.
[42] M.J.T. Smith, R.M Mersereau. Introduction to Digital Signal Processing: A Computer Laboratory Textbook. John
Wiley and Sons, 1992.
[43] K. Steiglitz. A Digital Signal Processing Primer. Addison-Wesley, 1996.
[44] C.A. Stewart and R. Atkinson. Basic Analogue Computer Techniques. McGraw-Hill, London, 1967.
[45] N. Storey. Electronics: A Systems Approach. Addison-Wesley, 1992.
[46] M. Talbot-Smith (Editor). Audio Engineer’s Reference Book, Focal Press, ISBN 0 7506 0386 0, 1994.
[47] F.J. Taylor. Principles of Signals and Systems. New York; McGraw-Hill, 1994.
[48] W.J. Tompkins. Biomedical Digital Signal Processing. Prentice Hall, Englewood Cliffs, NJ, 1993.
[49] P.P. Vaidyanathan. Multirate Systems and Filter Banks. Prentice Hall, Englewood Cliffs, NJ,1993.
[50] S.V. Vaseghi. Advanced Signal Processing and Digital Noise Reduction. John Wiley/B.G. Tuebner, 1996.
[51] J. Watkinson. An Introduction to Digital Audio. Focal Press, ISBN 0 240 51378 9, 1994.
[52] J. Watkinson. Compression in Video and Audio. ISBN 0240513940, Focal Press, April 1994.
[53] B. Widrow and S. Stearns. Adaptive Signal Processing. Prentice Hall, 1985.
[54] J. Watkinson. The Art of Digital Audio, 2nd Edition. ISBN 0240 51320 7, 1993.
Technical Papers
[55] P.M. Aziz, H.V. Sorenson, J.V. Der Spiegel. An overview of sigma delta converters. IEEE Signal Processing
Magazine, Vol. 13, No. 1, pp. 61-84, January 1996.
[56] J.W. Arthur. Modern SAW-based pulse compression systems for radar application. Part 2: Practical systems. IEE
Electronics Communication Engineering Journal, Vol. 8, No. 2, pp. 57-78, April 1996.
[57] G.M. Blair. A Review of the Discrete Fourier Transform. Part 1: Manipulating the power of two. IEE Electronics and
Communication Engineering Journal, Vol. 7, No.4, pp. 169-176, August 1995.
[58] G.M. Blair. A review of the discrete Fourier Transform. Part 2: Non-radix algorithms, real transforms and noise.
IEE Electronics Communication Engineering Journal, Vol. 7, No. 5, pp. 187-194, October 1995.
[59] J.A. Cadzow. Blind deconvolution via cumulant extrema. IEEE Signal Processing Magazine, Vol. 13, No. 3, pp.
24-42, May 1996.
[60] J. Cadzow. Signal processing via least squares error modelling. IEEE ASSP Magazine, Vol. 7, No. 4, pp 12-31,
October 1990.
[61] G. Cain, A. Yardim, D. Morling. All-Thru DSP Provision, Essential for the modern EE. Proceedings of IEEE
International Conference on Acoustics, Speech and Signal Processing 93, pp. I-4 to I-9, Minneapolis, 1993.
[62] C. Cellier. Lossless audio data compression for real time applications. 95th AES Convention, New York, Preprint
3780, October 1993.
[63] S. Chand and S.L. Chiu (editors). Special Issue on Fuzzy Logic with Engineering Applications. Proceedings of the
IEEE, Vol. 83, No. 3, pp. 343-483, March 1995.
[64] R. Chellappa, C.L. Wilson and S. Sirohey. Human and machine recognition of faces. Proceedings of the IEEE,
Vol. 83, No. 5 pp. 705-740, May 1995.
[65] J. Crowcroft. The Internet: a tutorial. IEE Electronics Communication Engineering Journal, Vol. 8, No. 3, pp. 113-122, June 1996.
[66] J.W. Cooley. How the FFT gained acceptance. IEEE Signal Processing Magazine, Vol. 9, No. 1, pp. 10-13,
January 1992.
[67] J.R. Deller, Jr. Tom, Dick and Mary discover the DFT. IEEE Signal Processing Magazine, Vol. 11, No. 2, pp. 36-50, April 1994.
[68] S.J. Elliot and P.A. Nelson. Active Noise Control. IEEE Signal Processing Magazine, Vol. 10, No. 4, pp 12-35,
October 1993.
[69] L.J. Eriksson. Development of the filtered-U algorithm for active noise control. Journal of the Acoustical Society of
America, Vol. 89, No. 1, pp. 257-265, 1991.
[70] H. Fan. A (new) Ohio yankee in King Gustav’s country. IEEE Signal Processing Magazine, Vol. 12, No. 2, pp. 38-40, March 1995.
[71] P.L. Feintuch. An adaptive recursive LMS filter. Proceedings of the IEEE, Vol. 64, No. 11, pp. 1622-1624,
November 1976.
[72] D. Fisher. Coding MPEG1 image data on compact discs. Electronic Product Design (UK), Vol. 14, No. 11, pp. 26-33, November 1993.
[73] H. Fletcher and W.A. Munson. Loudness, its definition, measurement and calculation. Journal of the Acoustical
Society of America, Vol. 70, pp. 1646-1654, 1933.
[74] M. Fontaine and D.G. Smith. Bandwidth allocation and connection admission control in ATM networks. IEE
Electronics Communication Engineering Journal, Vol. 8, No. 4, pp. 156-164, August 1996.
[75] W. Gardner. Exploitation of spectral redundancy in cyclostationary signals. IEEE Signal Processing Magazine,
Vol. 8, No. 2, pp 14-36, April 1991.
[76] H. Gish and M. Schmidt. Text independent speaker identification. IEEE Signal Processing Magazine, Vol. 11, No. 4, pp. 18-32, October 1994.
[77] P.M. Grant. Signal processing hardware and software. IEEE Signal Processing Magazine, Vol. 13, No. 1, pp. 86-88, January 1996.
[78] S. Harris. The effects of sampling clock jitter on Nyquist sampling analog to digital converters, and on oversampling
delta sigma ADCs. Journal of the Audio Engineering Society, July 1990.
[79] S. Heath. Multimedia standards and interoperability. Electronic Product Design (UK), Vol. 15, No. 9, pp. 33-37.
November 1993.
[80] F. Hlawatsch and G.F. Boudreaux-Bartels. Linear and quadratic time-frequency signal representations. IEEE
Signal Processing Magazine, Vol. 9, No. 2, pp. 21-67, April 1992.
[81]D.R. Hush and B.G. Horne. Progress in Supervised Neural Networks. IEEE Signal Processing Magazine, Vol. 10,
No. 1, pp. 8-39, January 1993.
[82] Special Issue on DSP Education, IEEE Signal Processing Magazine, Vol. 9, No.4, October 1992.
[83] Special Issue on Fuzzy Logic with Engineering Applications. Proceedings of the IEEE, Vol. 83, No. 3, March 1995.
[84] A. Hoogendoorn. Digital Compact Cassette. Proceedings of the IEEE, Vol. 82, No. 10, pp. 1479-1489, October
1994.
[85] B. Jabbari (editor). Special Issue on Wireless Networks for Mobile and Personal Communications. Vol. 82, No. 9,
September 1994.
[86] D.L. Jaggard (editor). Special Section on Fractals in Electrical Engineering. Proceedings of the IEEE, Vol. 81, No.
10, pp. 1423-1523, October 1993.
[87] N. Jayant, J. Johnston, R. Safranek. Signal compression based on models of human perception. Proceedings of
the IEEE, Vol. 81, No. 10, pp. 1385-1382, October 1993.
[88] C.R. Johnson. Yet still more on the interaction of adaptive filtering, identification, and control. IEE Signal
Processing Magazine, Vol.12, No. 2, pp. 22-37, March 1995.
[89] R.K. Jurgen. Broadcasting with Digital Audio. IEEE Spectrum, Vol. 33, No. 3, pp. 52-59. March 1996.
[90] S.M. Kay and S.L. Marple. Spectrum Analysis - A Modern Perspective. Proceedings of the IEEE, Vol. 69, No. 11,
pp 1380-1419, November 1981.
[91] K. Karnofsky. Speeding DSP algorithm design. IEEE Spectrum, Vol. 33, No. 7, pp. 79-82, July 1996.
[92] W. Klippel. Compensation for non-linear distortion of horn loudspeakers by digital signal processing. Journal of the
Audio Engineering Society, Vol. 44, No. 11, pp 964-972, November 1996.
[93] P. Kraniauskas. A plain man’s guide to the FFT. IEEE Signal Processing Magazine, Vol. 11, No. 2, pp. 24-36, April
1994.
[94] F. Kretz and F. Cola. Standardizing Hypermedia Information Objects. IEEE Communications Magazine, May
1992.
[95] M. Kunt (Editor). Special Issue on Digital Television, Part 1: Technologies. Proceedings of the IEEE, Vol. 83, No.
6, June 1995.
[96] M. Kunt (Editor). Special Issue on Digital Television, Part 2: Hardware and Applciations. Proceedings of the IEEE,
Vol. 83, No. 7, July 1995.
[97] T.I. Laakso, V. Valimaki, M. Karjalainen and U.K. Laine. Splitting the unit delay. IEEE Signal Processing Magazine,
Vol. 13, No. 1, pp. 30-60, January 1996.
[98] T.I. Laakso, V. Valimaki, M. Karjalainen, U.K. Laine. Splitting the Unit Delay. IEEE Signal Processing Magazine,
Vol. 13, No. 1, pp. 30-60, January 1996.
[99] P. Lapsley and G. Blalock. How to estimate DSP processor performance. IEEE Spectrum, Vol. 33, No. 7, pp. 74-78, July 1996.
[100]V.O.K. Li and X. Qui. Personal communication systems. Proceedings of the IEEE, Vol. 83, No. 9, pp. 1210-1243,
September 1995.
[101]R.P. Lippmann. An introduction to computing with neural nets. IEEE ASSP Magazine, Vol. 4, No. 2, pp. 4-22, April
1987.
[102]G.C.P. Lokhoff. DCC: Digital Compact Cassette. IEEE Transactions on Consumer Electronics, Vol. 37, No. 3 pp
702-706, August 1991.
[103]H. Lou. Implementing the Viterbi Algorithm. IEEE Signal Processing Magazine. Vol 12, No. 5 pp. 42-52,
September 1995.
[104].J. Lipoff. Personal communications networks bridging the gap between cellular and cordless phones.
Proceedings of the IEEE, Vol. 82, No. 4, pp. 564-571, April 1994.
[105] M. Liou. Overview of the p*64 kbit/s Video Coding Standard. Communications of the ACM, April 1991.
[106]G-K. Ma and F.J. Taylor. Multiplier policies for digital signal processing. IEEE ASSP Magazine, Vol. 7, No. 1,
January 1990.
[107]Y. Mahieux, G. Le Tourneur and A. Saliou. A microphone array for multimedia workstations. Journal of the Audio
Engineering Society, Vol. 44, No. 5, pp. 331-353, May 1996.
[108]D.T. Magill, F.D. Natali and G.P. Edwards. Spread-spectrum technology for commercial applications. Proceedings
of the IEEE, Vol. 82, No. 4, pp. 572-584, April 1994.
[109]V.J. Mathews. Adaptive Polynomial Filters. IEEE Signal Processing Magazine, Vol. 8, No. 3, pp. 10-26, July 1991.
[110]N. Morgan and H. Bourland. Neural networks for statistical recognition of continuous speech. Proceedings of the
IEEE, Vol. 83, No. 5, pp. 742-770, May 1995.
[111]N. Morgan and H. Bourland. Continuous speech recognition. IEEE Signal Processing Magazine, Vol. 12, No. 3,
pp. 24-42, May 1995.
[112]N. Morgan and H. Bourland. Neural networks for statistical recognition of speech. Proceedings of the IEEE, Vol
83, No. 5, pp 742-770, May 1995.
[113]A. Miller. From here to ATM. IEEE Spectrum, Vol. 31, No. 6, pp 20-24, June 1994.
[114]Y.K. Muthusamy, E. Barnard and R.A. Cole. IEEE Signal Processing Magazine, Vol. 11, No. 4, pp. 33-41, October
1994.
[115] R.N. Mutagi. Pseudo noise sequences for engineers. IEE Electronics Communication Engineering Journal, Vol.
8, No. 2, pp. 79-87, April 1996.
[116] R.N. Mutagi. Pseudo noise sequences for engineers. IEE Electronics and Communication Engineering Journal,
Vol. 8. No. 2, pp. 79-87, April 1996.
[117]C.L. Nikias and J.M Mendel. Signal processing with higher order statistics. IEEE Signal Processing Magazine, Vol.
10, No. 3, p 10-37, July 1993.
[118]P.A. Nelson, F. Orduna-Bustamante, D. Engler, H. Hamada. Experiments on a system for the synthesis of virtual
acoustic sources. Journal of the Audio Engineering Society, Vol. 44, No. 11, pp 973-989, November 1996.
[119]P.A. Nelson, F. Orduna-Bustamante, H. Hamada. Multichannel signal processing techniques in the reproduction
of sound. Journal of the Audio Engineering Society, Vol. 44, No. 11, pp 973-989, November 1996.
[120] P. Noll. Digital audio coding for visual communications. Proceedings of the IEEE, Vol. 83, No. 6, pp. 925-943,
June 1995
[121]K.J. Olejniczak and G.T. Heydt. (editors). Special Section on the Hartley Transform. Proceedings of the IEEE, Vol.
82, No . 3, pp. 372-447, March 1994.
[122]J. Picone. Continuous speech recognition using hidden Markov models. IEEE ASSP Magazine, Vol. 7, No. 3, pp.
26-41, July 1990.
[123]M. Poletti. The design of encoding functions for stereophonic and polyphonic sound systems. Journal of the Audio
Engineering Society, Vol. 44, No. 11, pp 948-963, November 1996.
[124]P.A. Ramsdale. The development of personal communications. IEE Electronics Communication Engineering
Journal, Vol. 8, No. 3, pp. 143-151, June 1996.
[125]P. Regalia, S.K. Mitra, P.P. Vaidynathan. The digital all-pass filter: a versatile building block. Proceedings of the
IEEE, Vol. 76, No. 1, pp. 19-37, January 1988.
[126]D.W. Robinson and R.S. Dadson. A redetermination of the equal loudness relations for pure tones. British Journal
of Applied Physics, Vol. 7, pp. 166-181, 1956.
[127]R.W. Robinson. Tools for Embedded Digital Signal Processing. IEEE Spectrum, Vol. 29, No. 11, pp 81-84,
November 1992.
[128]C.W. Sanchez. An Understanding and Implementation of the SCMS Serial Copy Management System for Digital
Audio Transmission. 94th AES Convention, Preprint #3518, March 1993. R. Schafer and T. Sikora. Digital video
coding standards and their role in video communications. Proceedings of the IEEE, Vol. 83, No. 6, pp. 907-924,
June 1995.
[129] C.E. Shannon. A mathematical theory of Communication. The Bell System Technical Journal, Vol. 27, pp. 379-423, July 1948. (Reprinted in Claude Elwood Shannon: Collected Papers [41].)
[130] C.E. Shannon. The Bandwagon (Editorial). Institute of Radio Engineers, Transactions on Information Theory, Vol.
IT-2, p. 3 March 1956. (Reprinted in Claude Elwood Shannon: Collected Papers [41].)
[131]J.J. Shynk. Frequency domain and multirate adaptive filtering. IEEE Signal Processing Magazine, Vol. 9, No. 1,
pp. 10-37, January 1992.
[132]J.J. Shynk. Adaptive IIR filtering. IEEE ASSP Magazine, Vol. 6, No. 2, pp. 4-21, April 1989.
[133] H.F. Silverman and D.P. Morgan. The application of dynamic programming to connected speech recognition. IEEE
ASSP Magazine, Vol. 7, No. 3, pp. 6-25, July 1990.
[134]J.L. Smith. Data compression and perceived quality. IEEE Signal Processing Magazine, Vol. 12, No. 5, pp. 58-59,
September 1995.
[135]A.S. Spanias. Speech coding: a tutorial review. Proceedings of the IEEE, Vol. 82, No. 10, pp. 1541-1582, October
1994.
[136]A.O. Steinhardt. Householder transforms in signal processing. IEEE Signal Processing Magazine, Vol. 5, No. 3,
pp. 4-12, July 1988.
[137] R.W. Stewart. Practical DSP for Scientist. Proceedings of IEEE International Conference on Acoustics, Speech and
Signal Processing 93, pp. I-32 to I-35, Minneapolis, 1993.
[138] C. Stone. Infrasound. Audio Media, Issue 55, AM Publishing Ltd, London, June 1995.
[139]J.A. Storer. Special Section on Data Compression. Proceedings of the IEEE, Vol. 82, No. 6, pp. 856-955, June
1994.
[140]JP. Strobach. New forms of Levinson and Schur algorithms. IEEE Signal Processing Magazine, Vol. 8, No. 1, pp.
12-36, January 1991.
[141] J.R. Treichler, I. Fijalkow, and C.R. Johnson, Jr. Fractionally spaced equalizers. IEEE Signal Processing Magazine,
Vol. 13, No. 3, pp. 65-81, May 1996.
[142]B.D.Van Veen and K. Buckley. Beamforming: A Versatile Approach to spatial filtering. IEEE ASSP Magazine, Vol.
5, No.2, pp. 4-24, April 1988.
[143]V. Valimaki, J. Huopaniemi, M. Karjalainen and Z. Janosy. Physical modeling of plucked string instruments with
application to real time sound synthesis. Journal of the Audio Engineering Society, Vol. 44, No. 5, pp. 331-353,
May 1996.
[144]V.D. Vaughn and T.S. Wilkinson. System considerations for multispectral image compression designs. IEEE
Signal Processing Magazine, Vol. 12, No. 1, pp. 19-31, January 1995.
[145]S.A. White. Applications of distributed arithmetic to digital signal processing: a tutorial review.
[146]IEEE ASSP Magazine, Vol. 6, No. 3, pp. 4-19, July 1989.
[147]W.H.W. Tuttlebee. Cordless telephones and cellular radios: synergies of DECT and GSM. IEE Electronics
Communication Engineering Journal, Vol. 8, No. 5, pp. 213-223, October 1996.
[148]Working Group on Communication Aids for the Hearing Impaired. Speech perception aids for hearing impaired
people: current status and needed research. Journal of Acoustical Society of America, Vol. 90, No.2, 1991
[149]R.D. Wright. Signal processing hearing aids. Hearing Aid Audiology Group, Special Publication, British Society of
Audiology, London, 1992.
[150]F. Wylie. Digital audio data compression. IEE Electronics and Communication Engineering Journal, pp. 5-10,
February 1995.
[151]I. Wickelgren. The Strange Senses of Other Species. IEEE Spectrum, Vol. 33, No. 3, pp. 32-37. March 1996.
[152]B. Widrow et al. Adaptive Noise Cancellation: Principles and Applications. Proceedings of the IEEE, Vol. 63, pp.
1692-1716, 1975.
[153]B. Widrow et al. Stationary and Non-stationary learning characteristics of the LMS adaptive filter. Proc. IEEE, Vol
64, pp. 1151-1162, 1976.
[154]T. Yamamoto, K. Koguchi, M. Tsuchida. Proposal of a 96kHz sampling digital audio. 97th AES Convention,
October 1994, Audio Engineering Society preprint 3884 (F-5).
[155]T. Yoshida. The rewritable minidisc system. Proceedings of the IEEE, Vol. 82, No. 10, pp. 1492-1500 October
1994.
[156]Y.Q. Zhang, W. Li, M.L. Liou (Editors). Special Issue on Advances in Image and Video Compression. Proceedings
of the IEEE, Vol. 83, No. 2, February 1995.
[157]British Society of Audiology. Recommended procedures for pure tone audiometry. British Journal of Audiometry,
Vol. 15, pp. 213-216, 1981.
[158]IEC-958/ IEC-85, Digital Audio Interface / Amendment. International Electrotechnical Commission, 1990.
[159] DSP Education Session. Proceedings of IEEE International Conference on Acoustics, Speech and Signal
Processing 92, pp. 73-109, San Francisco, 1992.
[160]Special Section on the Hartley Transform (Edited by K.J. Olejniczak and G.T. Heydt). Proceedings of the IEEE,
Vol. 82, No. 3, March 1994.
[161]Special Issue on Advances in Image and Video Compression (Edited by Y.Q. Zhang, W. Li and M.L. Liou).
Proceedings of the IEEE, Vol. 83, No. 2, February 1995.
[162]Special Issue on Digital Television Part 2: Hardware and Applications (Editor M. Kunt). Proceedings of the IEEE,
Vol. 83, No. 7, July 1995.
[163]Special Issue on Electrical Therapy of Cardiac Arrhythmias (Edited by R.E. Ideker and R.C. Barr). Proceedings of
the IEEE, Vol. 84, No. 3, March 1996.
[164]Special Section on Data Compression (Editor J.A. Storer). Proceedings of the IEEE, Vol. 82, No. 6, June 1994.
[165]Special Section on Field Programmable Gate Arrays (Editor A. El Gamal). Proceedings of the IEEE, Vol. 81, No.
7, July 1993.
[166]Special Issue on Wireless Networks for Mobile and Personal Communications (Editor B. Jabbari). Proceedings of
the IEEE, Vol. 82, No. 9, September 1994.
[167]Special Issue on Digital Television, Part 1: Technologies (Editor M. Kunt). Proceedings of the IEEE, Vol. 83, No.
6, June 1995.
[168]Special Issue on Time-Frequency Analysis (Editor P.J. Loughlin). Proceedings of the IEEE, Vol. 84, No. 9,
September 1996.
[169]Technology 1995. IEEE Spectrum, Vol. 32, No.1, January 1995.