DSP A-Z http://www.unex.ucla.edu

Digital Signal Processing: An "A" to "Z"

R.W. Stewart
Signal Processing Division
Dept. of Electronic and Electrical Eng.
University of Strathclyde
Glasgow G1 1XW, UK
Tel: +44 (0) 141 548 2396
Fax: +44 (0) 141 552 2487
E-mail: [email protected]

M.W. Hoffman
Department of Electrical Eng.
209N Walter Scott Eng. Center
PO Box 880511
Lincoln, NE 68588 0511, USA
Tel: +1 402 472 1979
Fax: +1 402 472 4732
E-mail: [email protected]

© BlueBox Multimedia, R.W. Stewart 1998

The DSPedia: An A-Z of Digital Signal Processing

This text aims to present relevant, accurate and readable definitions of common and not so common terms, algorithms, techniques and information related to DSP technology and applications. It is hoped that the information presented will complement the formal teachings of the many excellent DSP textbooks available and bridge the gaps that often exist between advanced DSP texts and introductory DSP. Some entries are particularly detailed, most often where the concept, application or term is particularly important in DSP; other entries are short, and perhaps even dismissive, where the term is not directly relevant to DSP or would not benefit from an extensive description.

There are 4 key sections to the text:

• DSP terms A-Z (page 1)
• Common Numbers associated with DSP (page 427)
• Acronyms (page 435)
• References (page 443)

Any comment on this text is welcome, and the authors can be emailed at [email protected] or [email protected].

Bob Stewart, Mike Hoffman, 1998. Published by BlueBox Multimedia.

A-series Recommendations: Recommendations from the International Telecommunication Union (ITU) telecommunications committee (ITU-T) outlining the work of the committee. See also International Telecommunication Union, ITU-T Recommendations.
A-law Compander: A defined standard nonlinear (logarithmic in fact) quantiser characteristic useful for certain signals. Non-linear quantisers are used in situations where a signal has a large dynamic range, but where signal amplitudes are distributed logarithmically rather than linearly. This is the case for normal speech. Speech signals have a very wide dynamic range: harsh "oh" and "b" type sounds have a large amplitude, whereas softer sounds such as "sh" have small amplitudes. If a uniform quantization scheme were used then, although the loud sounds would be represented adequately, the quieter sounds may fall below the threshold of the LSB and therefore be quantized to zero and the information lost. Therefore non-linear quantizers are used such that the quantization step at low input levels is much smaller than for higher level signals. To some extent this also exploits the logarithmic nature of human hearing.

[Figure: A linear, and a non-linear (A-law in fact) input-output characteristic for two 4 bit ADCs. Note that the linear ADC has uniform quantisation, whereas the non-linear ADC has more resolution for low level signals by having a smaller step size for low level inputs.]

A-law quantizers are often implemented by using a nonlinear circuit followed by a uniform quantizer. Two schemes are widely in use: the µ-law in the USA,

y = ln(1 + µx) / ln(1 + µ)    (1)

and the A-law in Europe and Japan,

y = (1 + ln(Ax)) / (1 + ln A)    (2)

where "ln" is the natural logarithm (base e), and the input signal x is in the range 0 to 1. The ITU have defined standards (G.711) for these quantisers where µ = 255 and A = 87.56. The input/output characteristics of Eqs. 1 and 2 for these two values are virtually identical.
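As a check on this near-identity, the two characteristics can be compared numerically. The sketch below (Python with NumPy) evaluates Eq. 1 directly; for the A-law it assumes the standard G.711 piecewise form, whose lower linear segment Ax/(1 + ln A) for x < 1/A is omitted from the single formula of Eq. 2:

```python
import numpy as np

MU = 255.0      # ITU G.711 mu-law parameter
A = 87.56       # ITU G.711 A-law parameter

def mu_law(x):
    """Eq. 1: mu-law compressor characteristic for x in [0, 1]."""
    return np.log(1.0 + MU * np.asarray(x, dtype=float)) / np.log(1.0 + MU)

def a_law(x):
    """A-law compressor: Eq. 2 above the knee at x = 1/A, with the
    G.711 linear segment below the knee (an assumption; the entry's
    Eq. 2 only states the logarithmic segment)."""
    x = np.asarray(x, dtype=float)
    lin = A * x / (1.0 + np.log(A))                             # x < 1/A
    log_seg = (1.0 + np.log(np.maximum(A * x, 1.0))) / (1.0 + np.log(A))
    return np.where(x < 1.0 / A, lin, log_seg)

xs = np.linspace(0.0, 1.0, 1001)
max_diff = float(np.max(np.abs(mu_law(xs) - a_law(xs))))        # small over [0, 1]
```

Both curves map full scale input to full scale output, and differ by less than 0.1 of full scale everywhere in between.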
Although a non-linear quantiser can be produced with analogue circuitry, it is more usual that a linear quantiser will be used, followed by a digital implementation of the compressor. For example, if a signal has been digitised by a 12 bit linear ADC, then digital µ-law compression can be performed to compress to 8 bits using a modified version of Eq. 1:

y = 2^7 ln(1 + µx/2^11) / ln(1 + µ) = 128 ln(1 + µx/2048) / ln(1 + µ)    (3)

where y is rounded to the nearest integer. After a signal has been compressed and transmitted, at the receiver it can be expanded back to its linear form by using an expander with the inverse characteristic to the compressor.

[Figure: The ITU µ-law characteristic for compression from 12 bits input to 8 bits output by a digital A-law compressor. Note that if a value of µ = 0 was used then the characteristic would be linear, and for µ → ∞ the characteristic tends to a sigmoid/step function.]

Listening tests for µ-law encoded speech reveal that compressing a linear resolution 12 bit speech signal (sampled at 8 kHz) to 8 bits, and then expanding back to a linearly quantised 12 bit signal, does not degrade the speech quality to any significant degree. This can be quantitatively shown by considering the actual quantisation noise signals for the compressed and uncompressed speech signals. In practice Eq. 3 is not evaluated directly by DSP routines; instead a piecewise linear approximation (defined in G.711) to the µ- or A-law characteristic is used. See also Companders, Compression, G-series Recommendations, µ-law.
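The 12-to-8 bit compression of Eq. 3 and its inverse expander can be sketched as follows (the function names are illustrative, and real G.711 codecs use the piecewise linear approximation mentioned above rather than these logarithms):

```python
import numpy as np

MU = 255.0

def compress(x):
    """Eq. 3: map a 12 bit magnitude (0..2047) to a mu-law code,
    y = 128*ln(1 + mu*x/2048)/ln(1 + mu), rounded to nearest integer."""
    return np.rint(128.0 * np.log(1.0 + MU * np.abs(x) / 2048.0)
                   / np.log(1.0 + MU)).astype(int)

def expand(y):
    """Expander with the inverse characteristic, back to a 12 bit
    linear estimate."""
    return np.rint(2048.0 * ((1.0 + MU) ** (y / 128.0) - 1.0) / MU).astype(int)

x = np.arange(0, 2048)      # all 12 bit magnitudes
y = compress(x)             # 8 bit (sign handled separately) codes
x_hat = expand(y)           # round-trip reconstruction
```

The characteristic is monotonic, and the round-trip error stays below half of the largest (top-of-scale) quantisation step, i.e. a small fraction of full scale.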
Absolute Error: Consider the following example: if an analogue voltage of exactly v = 6.285 volts is represented to only one decimal place by rounding, then v′ = 6.3, and the absolute error, ∆v, is defined as the difference between the true value and the estimated value. Therefore,

v = v′ + ∆v    (4)

and

∆v = v − v′    (5)

For this case ∆v = −0.015 volts. Notice that absolute error does not refer to a positive valued error, but only that no normalization of the error has occurred. See also Error Analysis, Quantization Error, Relative Error.

Absolute Pitch: See entry for Perfect Pitch.

Absolute Value: The absolute value of a quantity, x, is usually denoted as |x|. If x ≥ 0, then |x| = x, and if x < 0 then |x| = −x. For example |12123| = 12123, and |−234.5| = 234.5. The absolute value function y = |x| is non-linear and is non-differentiable at x = 0.

[Figure: The absolute value function y = |x|.]

Absorption Coefficient: When sound is absorbed by materials such as walls, foam etc., the amount of sound energy absorbed can be predicted by the material's absorption coefficient at a particular frequency. The absorption coefficients for a few materials are shown below. A coefficient of 1.0 indicates that all sound energy is absorbed, and 0 that none is absorbed. Sound that is not absorbed is reflected. The amplitude of reflected sound waves is given by (1 − A) times the amplitude of the impinging sound wave.

[Figure: Absorption coefficient against frequency (0.1 to 5 kHz) for polyurethane foam, glass-wool, thick carpet and brick; incident sound at a wall splits into absorbed and reflected components.]

Accelerometer: A sensor that measures acceleration, often used for vibration sensing and attitude control applications.

Accumulator: Part of a DSP processor which can add two binary numbers together. The accumulator is part of the ALU (arithmetic logic unit). See also DSP Processor.

Accuracy: The accuracy of a DSP system refers to the error of a quantity compared to its true value.
See also Absolute Error, Relative Error, Quantization Noise.

Acoustic Echo Cancellation: For teleconferencing applications or hands free telephony, the loudspeaker and microphone set up in both locations causes a direct feedback path which can cause instability and therefore failure of the system. To compensate for this echo, acoustic echo cancellers can be introduced:

[Figure: Two rooms linked by a full duplex speakerphone, with an adaptive filter at each end. When speaker A in room 1 speaks into microphone 1, the speech will appear at loudspeaker 2 in room 2. However the speech from loudspeaker 2 will be picked up by microphone 2, and transmitted back into room 1 via loudspeaker 1, which in turn is picked up by microphone 1, and so on. Hence unless the loudspeaker and microphones in each room are acoustically isolated (which would require headphones), there is a direct feedback path which may cause stability problems and hence failure of the full duplex speakerphone. Setting up an adaptive filter at each end will attempt to cancel the echo at each outgoing line. Amplifiers, ADCs, DACs, communication channels etc. have been omitted to allow the problem to be clearly defined.]

Teleconferencing is very dependent on adaptive signal processing strategies for acoustic echo control. Typically teleconferencing will sample at 8 or 16 kHz and the length of the adaptive filters could be thousands of weights (or coefficients), depending on the acoustic environments where they are being used. See also Adaptive Signal Processing, Echo Cancellation, Least Mean Squares Algorithm, Noise Cancellation, Recursive Least Squares.

Acoustics: The science of sound. See also Absorption, Audio, Echo, Reverberation.

Actuator: Devices which take electrical energy and convert it into some other form, e.g. loudspeakers, AC motors, light emitting diodes (LEDs).
Active Filter: An analog filter that includes amplification components such as op-amps is termed an active filter; a filter that only has resistive, capacitive and inductive elements is termed a passive filter. In DSP systems analog filters are widely used for anti-alias and reconstruction filters, where good roll-off characteristics above fs/2 are required. A simple RC circuit forms a first order (single pole) passive filter with a roll-off of 20 dB/decade (or 6 dB/octave). By cascading RC circuits with an (active) buffer amplifier circuit, higher order filters (with more than one pole) can be easily designed. See also Anti-alias Filter, Filters (Butterworth, Chebyshev, Bessel etc.), Knee, Reconstruction Filter, RC Circuit, Roll-off.

Active Noise Control (ANC): By introducing anti-phase acoustic waveforms, zones of quiet can be created at specified areas in space by the destructive interference of the offending noise and an artificially induced anti-phase noise:

[Figure: The simple principle of active noise control. An ANC loudspeaker emits anti-phase noise to create a quiet zone (destructive interference) in the path of a periodic noise source.]

ANC works best for low frequencies up to around 600 Hz. This can be intuitively argued from the fact that the wavelength of low frequencies is very long, and it is easier to match peaks and troughs to create relatively large zones of quiet. Current applications for ANC can be found inside aircraft, in automobiles, in noisy industrial environments, in ventilation ducts, and in medical MRI equipment. Future applications include mobile telephones and maybe even noisy neighbors! The general active noise control problem is:

[Figure: The general set up of an active noise controller as a feedback loop where the aim is to minimize the error signal power. A reference microphone senses the noise n(t) through path T(f), the adaptive noise controller drives a secondary loudspeaker, and an error microphone in the desired zone of quiet measures e(t) = d(t) + ye(t).]
To implement an ANC system in real time the filtered-X LMS or filtered-U LMS algorithms can be used [68], [69]:

[Figure: The filtered-U LMS algorithm for active noise control. The filter zeroes a are updated as a(k+1) = a(k) + 2µe(k)f(k), and the filter poles b as b(k+1) = b(k) + 2µe(k)g(k), where f(k) and g(k) are produced by filtering through the secondary path estimate He_hat(z). Note that if there are no poles, this architecture simplifies to the filtered-X LMS.]

The figure below shows the time and frequency domains for the ANC of an air conditioning duct. Note that the signals shown represent the sound pressure level at the error microphone. In general the zone of quiet does not extend much more than λ/4 around the error microphone (where λ is the noise wavelength):

[Figure: Time analysis and power spectra (0 - 1 kHz) of ANC inside an air conditioning duct. The sound pressure levels shown represent the noise at an error microphone before and after switching on the noise canceller. The noise canceller clearly reduces the low frequency (periodic) noise components.]

Sampling rates for ANC can be as low as 1 kHz if the offending noise is very low in frequency (say 50-400 Hz), but can be as high as 50 kHz for certain types of ANC headphones where very rapid adaption is required, even though the maximum frequency being cancelled is not more than a few kHz, which would make the Nyquist rate considerably lower. See also Active Vibration Control, Adaptive Line Enhancer, Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean Squares Filtered-X Algorithm Convergence, Noise Cancellation.
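A single channel filtered-X LMS controller along the lines of the update equation above can be sketched as a simulation. All signals and paths here are invented for illustration (a tonal noise, a one sample delay secondary path he, and a secondary path estimate he_hat assumed perfect); a real system would work from microphone and loudspeaker samples:

```python
import numpy as np

N = 5000
L = 8                                   # adaptive FIR controller length
mu = 0.05                               # LMS step size
he = np.array([0.0, 0.5])               # secondary path: one sample delay, gain 0.5
he_hat = he.copy()                      # secondary path estimate (assumed perfect)

k = np.arange(N)
x = np.sin(2 * np.pi * 0.05 * k)        # reference microphone: periodic noise
d = 0.9 * np.sin(2 * np.pi * 0.05 * (k - 2))   # noise reaching the error microphone

w = np.zeros(L)                         # controller weights
xbuf = np.zeros(L)                      # recent reference samples
ybuf = np.zeros(len(he))                # recent loudspeaker samples
fbuf = np.zeros(L)                      # recent filtered-reference samples
e = np.zeros(N)

for n in range(N):
    xbuf = np.concatenate(([x[n]], xbuf[:-1]))
    y = w @ xbuf                        # anti-noise sent to the loudspeaker
    ybuf = np.concatenate(([y], ybuf[:-1]))
    e[n] = d[n] + he @ ybuf             # error mic: e = d + (secondary path * y)
    f = he_hat @ xbuf[:len(he_hat)]     # reference filtered by the path estimate
    fbuf = np.concatenate(([f], fbuf[:-1]))
    w = w - 2 * mu * e[n] * fbuf        # filtered-X LMS update, minimising e^2
```

The sign of the update follows from e = d + (S * y); with the subtractive convention drawn in the figure it becomes the +2µe(k)f(k) form quoted above. The error power at the (simulated) error microphone collapses once the controller converges.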
Active Vibration Control (AVT): DSP techniques for AVT are similar to active noise cancellation (ANC) algorithms and architectures. Actuators are employed to introduce anti-phase vibrations in an attempt to reduce the vibrations of a mechanical system. See also Active Noise Cancellation.

AC-2: An audio compression algorithm developed by Dolby Labs and intended for applications such as high quality digital audio broadcasting. AC-2 claims compression ratios of 6:1 with sound quality almost indistinguishable from CD quality sound under almost all listening conditions. AC-2 is based on psychoacoustic modelling of human hearing. See also Compression, Precision Adaptive Subband Coding (PASC).

Adaptation: Adaptation is the auditory effect whereby a constant and noisy signal is perceived to become less loud or noticeable after prolonged exposure. An example would be the adaptation to the engine noise in a (loud!) propeller aircraft. See also Audiology, Habituation, Psychoacoustics.

Adaptive Differential Pulse Code Modulation (ADPCM): ADPCM is a family of speech compression and decompression algorithms which use adaptive quantizers and adaptive predictors to compress data (usually speech) for transmission. The CCITT standard of ADPCM allows an analog voice conversation sampled at 8 kHz to be carried within a 32 kbits/second digital channel. Three or four bits are used to describe each sample, representing the difference between two adjacent samples. See also Differential Pulse Code Modulation (DPCM), Delta Modulation, Continuously Variable Slope Delta Modulation (CVSD), G.721.

Adaptive Beamformer: A spatial filter (beamformer) that has time-varying, data dependent (i.e., adaptive) weights. See also Beamforming.

Adaptive Equalisation: If the effects of a signal being passed through a particular system are to be "removed" then this is equalisation. See Equalisation.
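The differencing idea behind ADPCM (see the Adaptive Differential Pulse Code Modulation entry above) can be illustrated with a fixed step DPCM sketch. This is a deliberate simplification: a real ADPCM coder such as G.721 additionally adapts the quantiser step size and uses an adaptive predictor, and the tone and step size below are chosen purely for illustration:

```python
import numpy as np

fs = 8000.0                        # 8 kHz sampling, as in the ADPCM entry
step = 0.025                       # fixed quantiser step (ADPCM would adapt this)
x = np.sin(2 * np.pi * 200.0 * np.arange(400) / fs)   # slowly varying test tone

codes = np.zeros(len(x), dtype=int)
x_rec = np.zeros(len(x))
pred = 0.0                         # simplest predictor: the previous decoded sample
for k in range(len(x)):
    diff = x[k] - pred                              # prediction error to be sent
    q = int(np.clip(np.rint(diff / step), -8, 7))   # 4 bit code (16 levels)
    codes[k] = q
    pred = pred + q * step         # decoder reconstruction, tracked at the encoder
    x_rec[k] = pred
```

Because the encoder predicts from the *decoded* signal, quantisation errors do not accumulate: each 4 bit code keeps the reconstruction within half a step of the input, provided the signal never changes faster than the quantiser range allows (slope overload).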
Adaptive Filter: The generic adaptive filter can be represented as:

[Figure: The generic adaptive filter. The input x(k) is filtered to give y(k) = Filter{x(k), w(k)}, which is subtracted from the desired signal d(k) to give the error e(k); the adaptive algorithm updates the weights as w(k+1) = w(k) + e(k)f{d(k), x(k)}.]

In the generic adaptive filter architecture the aim can intuitively be described as being to adapt the impulse response of the digital filter such that the input signal x(k) is filtered to produce y(k) which, when subtracted from the desired signal d(k), will minimize the power of the error signal e(k). The adaptive filter output y(k) is produced by the filter weight vector, w(k), convolved (in the linear case) with x(k). The adaptive filter weight vector is updated based on a function of the error signal e(k) at each time step k to produce a new weight vector, w(k+1), to be used at the next time step. The adaptive algorithm is chosen such that the input signal of the filter, x(k), is filtered to produce an output, y(k), which is similar to the desired signal, d(k), so that the power of the error signal, e(k) = d(k) − y(k), is minimized. This minimization is essentially achieved by exploiting the correlation that should exist between d(k) and y(k).

The adaptive digital filter can be an FIR, IIR, lattice or even a non-linear (Volterra) filter, depending on the application. The most common by far is the FIR. The adaptive algorithm can be based on gradient techniques such as the LMS, or on recursive least squares techniques such as the RLS. In general different algorithms have different attributes in terms of minimum error achievable, convergence time, and stability. There are at least four general architectures that can be set up for adaptive filters: (1) System identification; (2) Inverse system identification; (3) Noise cancellation; (4) Prediction.
Note that all of these architectures use the same generic adaptive filter shown above (the "Adaptive Algorithm" block explicitly drawn above has been left out for illustrative convenience and clarity):

[Figure: Four adaptive signal processing architectures: system identification (the adaptive filter and the unknown system share the input x(k)); inverse system identification (the adaptive filter follows the unknown system, with a delay in the desired signal path); noise cancellation (the desired signal is s(k) + n(k) and the filter input is a correlated noise reference n′(k)); and prediction (the filter input is a delayed version of the desired signal).]

Consider first the system identification; at an intuitive level, if the adaptive algorithm is indeed successful at minimizing the error to zero, then by simple inspection the transfer function of the "Unknown System" must be identical to the transfer function of the adaptive filter. Given that the error of the adaptive filter is now zero, the adaptive filter's weights are no longer updated and will remain in a steady state. As long as the unknown system does not change its characteristics we have now successfully identified (or modelled) the system. If the adaption was not perfect and the error is "very small" rather than zero (which is more likely in real applications), then it is fair to say that we have a good model rather than a perfect model.

Similarly for the inverse system identification, if the error adapts to zero over a period of time, then by observation the transfer function of the adaptive filter must be the exact inverse of the "Unknown System". (Note that the "Delay" is necessary to ensure that the problem is causal and therefore solvable with real systems, i.e. given that the "Unknown System" may introduce a time delay in producing x(k), then if the "Delay" was not present in the path to the desired signal the system would be required to produce an anti-delay, or look ahead in time - clearly this is impossible.)
For the noise cancellation architecture, if the input signal is s(k) corrupted by additive noise, n(k), then the aim is to use a correlated noise reference signal, n′(k), as the input to the adaptive filter. When performing the adaption there is only information available to implicitly model the noise signal, n(k), and therefore when the filter adapts to a steady state we would expect that e(k) ≈ s(k).

Finally, for the prediction filter, if the error is to be adapted to zero, then the adaptive filter must predict future elements of the input s(k) based only on past observations. This can be performed if the signal s(k) is periodic and the filter is long enough to "remember" past values. One application of the prediction architecture could therefore be to extract periodic signals from stochastic noise signals. The prediction filter can be extended to a "smoothing filter" if data are processed off-line -- this means that samples before and after the present sample are filtered to obtain an estimate of the present sample. Smoothing cannot be done in real-time, however there are important applications where real-time processing is not required (e.g., geophysical seismic signal processing).
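The system identification architecture discussed above can be sketched with an LMS weight update (the "unknown system" h, the step size mu and the white noise excitation are invented for illustration; see the Least Mean Squares (LMS) Algorithm entry for the algorithm itself):

```python
import numpy as np

rng = np.random.default_rng(0)

h = np.array([0.5, -0.3, 0.1])      # the "unknown system" (hypothetical FIR)
L = len(h)
mu = 0.01                           # LMS step size
N = 20000

x = rng.standard_normal(N)          # white noise excitation x(k)
d = np.convolve(x, h)[:N]           # desired signal: the unknown system's output

w = np.zeros(L)                     # adaptive filter weights w(k)
xbuf = np.zeros(L)
for k in range(N):
    xbuf = np.concatenate(([x[k]], xbuf[:-1]))
    y = w @ xbuf                    # adaptive filter output y(k)
    e = d[k] - y                    # error e(k) = d(k) - y(k)
    w = w + 2 * mu * e * xbuf       # LMS update: w(k+1) = w(k) + 2*mu*e(k)*x(k)
```

With no measurement noise the error adapts towards zero and, as argued above, the converged weights match the unknown system's impulse response.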
A particular application may have elements of more than one single architecture. For example in the following, if the adaptive filter is successful in modelling "Unknown System 1" and inverse modelling "Unknown System 2", then if s(k) is uncorrelated with r(k) the error signal is likely to be e(k) ≈ s(k):

[Figure: An adaptive filtering architecture incorporating elements of system identification, inverse system identification and noise cancellation. The signal s(k) passes through Unknown System 1, while r(k) passes through Unknown System 2 and a delay before forming the adaptive filter input x(k).]

In the four general architectures shown above the unknown systems being investigated will normally be analog in nature, and therefore suitable ADCs and DACs would be used at the various analog input and output points as appropriate. For example, if an adaptive filter was being used to find a model of a small acoustic enclosure, the overall hardware set up would be:

[Figure: The analog-digital interfacing for a system identification, or modelling, of an acoustic transfer path using a loudspeaker and microphone. A DAC drives the loudspeaker with x(t), an ADC digitises the microphone signal to give d(k), and the adaptive filter runs on the digital signal processor.]

See also Adaptive Signal Processing, Acoustic Echo Cancellation, Active Noise Control, Adaptive Line Enhancer, Echo Cancellation, Least Mean Squares (LMS) Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares, Wiener-Hopf Equations.

Adaptive Infinite Impulse Response (IIR) Filters: See Least Mean Squares IIR Algorithms.

Adaptive Line Enhancer (ALE): An adaptive signal processing structure that is designed to enhance or extract periodic (or predictable) components:

[Figure: An adaptive line enhancer. The input signal consists of a periodic component, p(k), and a stochastic component, n(k); the desired signal is d(k) = p(k) + n(k), and the adaptive filter input is the delayed signal p(k − ∆) + n(k − ∆).]
The delay, ∆, is long enough that the stochastic component at the input to the adaptive filter, n(k − ∆), is decorrelated from the input n(k). For a periodic signal the delay does not decorrelate p(k) and p(k − ∆). When the adaptive filter adapts it will therefore only cancel the periodic signal. The delay, ∆, should be long enough to decorrelate the broadband "noise-like" signal, resulting in an adaptive filter which extracts the narrowband periodic signal at the filter output y(k) (or removes the periodic noise from a wideband signal at e(k)). An ALE exploits the knowledge that the signal of interest is periodic, whereas the additive noise is stochastic. If the decorrelation delay, ∆, is long enough then the stochastic noise presented to the d(k) input is uncorrelated with the noise presented to the x(k) input, however the periodic component remains correlated:

[Figure: The correlation r(n) = E{p(k)p(k + n)} of a periodic (sine wave) signal persists over all lags, whereas the correlation q(n) = E{n(k)n(k + n)} of a stochastic signal falls to zero within lags of ±∆.]

Typically an ALE may be used in communication channels or in radar and sonar applications where a low level sinusoid is masked by white or colored noise. In a telecommunications system, an ALE could be used to extract periodic DTMF signals from very high levels of stochastic noise. Alternatively note that the ALE can be used to extract the periodic noise from the stochastic signal by observing the signal e(k). See also Adaptive Signal Processing, Least Mean Squares Algorithm, Noise Cancellation.

Adaptive Noise Cancellation: See Adaptive Signal Processing, Noise Cancellation.

Adaptive Signal Processing: The discrete mathematics of adaptive filtering, originally based on the least squares minimization theory of the celebrated 19th century German mathematician Gauss. Least squares is of course widely used in statistical analysis and virtually every branch of science and engineering.
For many DSP applications, however, least squares minimization is applied to real time data and therefore presents the challenge of producing a real time implementation to operate on data arriving at high data rates (from 1 kHz to 100 kHz), and with loosely known statistics and properties. In addition, other cost functions besides least squares are also used. One of the first suggestions of adaptive DSP algorithms was in Widrow and Hoff's classic paper on adaptive switching circuits and the least mean squares (LMS) algorithm at the IRE WESCON Conference in 1960. This paper stimulated great interest by providing a practical and potentially real time solution for least squares implementation. Widrow followed up this work with two definitive and classic papers on adaptive signal processing in the 1970s [152], [153]. Adaptive signal processing has found many applications. A generic breakdown of these applications can be made into the following categories of signal processing problems: signal detection (is it there?), signal estimation (what is it?), parameter or state estimation, signal compression, signal synthesis, signal classification, etc. The common attributes of adaptive signal processing applications include time varying (adaptive) computations (processing) using sensed input values (signals). See also Acoustic Echo Cancellation, Active Noise Control, Adaptive Filter, Adaptive Line Enhancer, Echo Cancellation, Least Mean Squares (LMS) Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares, Wiener-Hopf Equations.

Adaptive Spectral Perceptual Entropy Coding (ASPEC): ASPEC is a means of providing psychoacoustic compression of high-fidelity audio and was developed by AT&T Bell Labs, Thomson and the Fraunhofer Society amongst others. In 1990 features of the ASPEC coding system were incorporated into the International Organization for Standardization (ISO) MPEG-1 standard in combination with MUSICAM.
See also Masking Pattern Adapted Universal Subband Integrated Coding and Multiplexing (MUSICAM), Precision Adaptive Subband Coding (PASC), Spectral Masking, Psychoacoustics, Temporal Masking.

Adaptive Step Size: See Step Size Parameter.

Adaptive Transform Acoustic Coding (ATRAC): ATRAC coding is used for compression of high-fidelity audio (usually starting with 16 bit data at 44.1 kHz) to reduce the storage requirement on recording mediums such as the mini-disc (MD) [155]. ATRAC achieves a compression ratio of almost 5:1 with very little perceived difference from uncompressed PCM quality. ATRAC exploits the psychoacoustic (spectral) masking properties of the human ear and effectively compresses data by varying the bit resolution used to code different parts of the audio spectrum. More information on the mini-disc (and also ATRAC) can be found in [155]. ATRAC has three key coding stages. First is the subband filtering which splits the signal into three subbands (low: 0 - 5.5 kHz; mid: 5.5 - 11 kHz; high: 11 - 22 kHz) using a two stage quadrature mirror filter (QMF) bank. The second stage then performs a modified discrete cosine transform (MDCT) to produce a frequency representation of the signal. The actual length (number of samples) of the transform is controlled adaptively via an internal decision process, and uses time frame lengths of either 11.6 ms (when in long mode) for all frequency bands, or 1.45 ms (when in short mode) for the high frequency band and 2.9 ms (also called short mode) for the low and mid frequency bands. The choice of mode is usually long, however if a signal has rapidly varying instantaneous power (when say a cymbal is struck) short mode may be required in the low and mid frequency bands to adequately code the rapid attack portion of the waveform.
Finally, the third stage is to consider the spectral characteristics of the three subbands and allocate bit resolution such that spectral components below the threshold of hearing are not encoded, and the spectral masking attributes of the signal spectrum are exploited such that the number of bits required to code certain frequency bands is greatly reduced. (See entry for Precision Adaptive Subband Coding (PASC) for a description of quantization noise masking.) ATRAC splits the frequencies from the MDCT into a total of 52 frequency bins which are of varying bandwidth based on the width of the critical bands in the human auditory mechanism. ATRAC then compands and requantizes using a block floating point representation. The wordlength is determined by the bit allocation process based on psychoacoustic models. Each input 11.6 ms time frame of 512 × 16 bit samples, or 1024 bytes, is compressed to 212 bytes (a 4.83:1 compression ratio).

[Figure: The three stages of adaptive transform acoustic coding (ATRAC): (1) quadrature mirror filter (QMF) subband coding into the 0 - 5.5125 kHz, 5.5125 - 11.025 kHz and 11.025 - 22.05 kHz bands; (2) modified discrete cosine transform (MDCT) of each band; (3) bit allocation and spectral masking/quantization decision. Digital audio input at 44.1 kHz, 16 bits (1.4112 Mbits/s) is coded in time frames of 512 samples (1024 bytes), each compressed into 212 bytes, giving a compressed output of 292 kbits/s.]

ATRAC decoding from compressed format back to 44.1 kHz PCM format is achieved by first performing an inverse MDCT on the three subbands (using long mode or short mode data lengths as specified in the coded data). The three time domain signals produced are then reconstructed back into a time domain signal using QMF synthesis filters for output to a DAC.
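The frame bookkeeping quoted above can be checked with a few lines of arithmetic. Interpreting the 292 kbits/s output figure as a stereo pair is an assumption here, but it is consistent with the 1.4112 Mbits/s stereo input rate:

```python
# ATRAC frame bookkeeping: an 11.6 ms long-mode frame of 512 x 16 bit
# samples (1024 bytes) is coded into 212 bytes per channel.
fs = 44100
frame_samples = 512
frame_in_bytes = frame_samples * 16 // 8       # 1024 bytes in
frame_out_bytes = 212                          # 212 bytes out

ratio = frame_in_bytes / frame_out_bytes       # ~4.83:1 compression
frame_time = frame_samples / fs                # ~11.6 ms per long-mode frame
rate_mono = frame_out_bytes * 8 / frame_time   # ~146 kbits/s per channel
rate_stereo = 2 * rate_mono                    # ~292 kbits/s for a stereo pair
```

The 4.83:1 ratio and the 292 kbits/s figure quoted in the entry both drop out directly.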
See also Compact Disc, Data Compression, Frequency Range of Hearing, MiniDisc (MD), Psychoacoustics, Precision Adaptive Subband Coding (PASC), Spectral Masking, Subband Filtering, Temporal Masking, Threshold of Hearing.

Additive White Gaussian Noise: The most commonly assumed noise channel in the analysis and design of communications systems. Why is this so? Well, for one, this assumption allows analysis of the resulting system to be tractable (i.e., we can do the analysis). In addition, it is a very good model of electronic circuit noise. In communication systems the modulated signal is often so weak that this circuit noise becomes a dominant effect. The model of a flat (i.e., white) spectrum is good in electronic circuits up to about 10^12 Hz. See also White Noise.

Address Bus: A collection of wires that are used for sending memory address information either inter-chip (between chips) or intra-chip (within a chip). Typically DSP address buses are 16 or 32 bits wide. See also DSP Processor.

Address Registers: Memory locations inside a DSP processor that are used as temporary storage space for addresses of data stored somewhere in memory. The address register width is always greater than or equal to (normally the same as) the width of the DSP processor address bus. Most DSP processors have a number of address registers. See also DSP Processor.

AES/EBU: See Audio Engineering Society, European Broadcast Union.

Aliasing: An irrecoverable effect of sampling a signal too slowly. High frequency components of a signal (over one-half the sampling frequency) cannot be accurately reconstructed in a digital system. Intuitively, the problem of sampling too slowly (aliasing) can be understood by considering that rapidly varying signal fluctuations that take place in between samples cannot be represented at the output.
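This folding of high frequencies can be demonstrated numerically: a 100 Hz tone sampled at only 80 Hz produces exactly the same sample sequence as a 20 Hz tone sampled at that rate, so the two are indistinguishable after sampling:

```python
import numpy as np

fs = 80.0                       # sampling rate (Hz), deliberately below Nyquist
n = np.arange(64)               # sample indices
tone_100 = np.sin(2 * np.pi * 100.0 * n / fs)   # 100 Hz tone sampled at 80 Hz
tone_20 = np.sin(2 * np.pi * 20.0 * n / fs)     # 20 Hz tone sampled at 80 Hz
# The sequences are identical: 100 Hz aliases to |100 - 80| = 20 Hz.
```

Once sampled, no processing can tell which tone was present, which is why aliasing must be prevented before the sampler.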
The distortion created by sampling these high frequency signals too slowly is not reversible and can only be avoided by proper aliasing protection as provided by an anti-alias filter or an oversampled analog to digital converter.

[Figure: Sampling a 100 Hz sine wave at only 80 Hz causes aliasing; the output signal is interpreted as a 20 Hz sine wave.]

See also Anti-alias Filter, Oversampling.

Algorithm: A mathematically based computational method which forms a set of well defined rules or equations for performing a particular task. For example, the FFT algorithm can be coded into a DSP processor assembly language and then used to calculate FFTs from stored (or real-time) digital data.

All-pass Filter: An all-pass filter passes all input frequencies with the same gain, although the phase of the signal will be modified. (A true all-pass filter has a gain of one.) All-pass filters are used for applications such as group delay equalisation, notch filter design, Hilbert transform implementation, and musical instrument synthesis [43]. The simplest all-pass filter is a simple delay! This "filter" passes all frequencies with the same gain, has a linear phase response and introduces a group delay of one sample at all frequencies:

y(k) = x(k − 1)  (time domain)

Y(z) = z^{-1} X(z), and therefore H(z) = Y(z)/X(z) = z^{-1}  (z-domain)

A simple all-pass filter: all frequencies are passed with the same gain.
A more general representation of an all-pass filter is given by the z-domain transfer function of an infinite impulse response (IIR) N pole, N zero filter:

    H(z) = Y(z)/X(z) = (a_N* + a_{N-1}* z^-1 + ... + a_1* z^-(N-1) + a_0* z^-N) / (a_0 + a_1 z^-1 + ... + a_{N-1} z^-(N-1) + a_N z^-N)
         = z^-N A*(z^-1) / A(z)      (6)

where a* is the complex conjugate of a. Usually the filter weights are real, therefore a = a*, and we set a_0 = 1:

    H(z) = Y(z)/X(z) = (a_N + a_{N-1} z^-1 + ... + a_1 z^-(N-1) + z^-N) / (1 + a_1 z^-1 + ... + a_{N-1} z^-(N-1) + a_N z^-N)
         = z^-N A(z^-1) / A(z)      (7)

We can easily show that |H(e^jw)| = 1 (see below) for all frequencies. Note that the numerator polynomial z^-N A(z^-1) is simply the order reversed z-polynomial of the denominator A(z). For an input signal x(k) the discrete time output of an all-pass filter is:

    y(k) = a_N x(k) + a_{N-1} x(k-1) + ... + a_1 x(k-N+1) + x(k-N)
           - a_1 y(k-1) - ... - a_{N-1} y(k-N+1) - a_N y(k-N)      (8)

In order to be stable, the poles of the all-pass filter must lie within the unit circle. Therefore, for the denominator polynomial, if the N roots of the polynomial A(z) are given by:

    A(z) = (1 - p_1 z^-1)(1 - p_2 z^-1) ... (1 - p_N z^-1)      (9)

then |p_n| < 1 for n = 1 to N in order to ensure all poles are within the unit circle.
The poles and zeroes of the all-pass filter are therefore:

    H(z) = a_N (1 - p_1^-1 z^-1)(1 - p_2^-1 z^-1) ... (1 - p_N^-1 z^-1) / [ (1 - p_1 z^-1)(1 - p_2 z^-1) ... (1 - p_N z^-1) ]      (10)

where the roots of the order reversed numerator polynomial z^-N A(z^-1) are easily calculated to be the inverses of the poles (see the following example). To illustrate the relationship between the roots of a z-domain polynomial and those of its order reversed polynomial, consider a polynomial of order 3 with roots at z = p_1, z = p_2 and z = p_3:

    1 + a_1 z^-1 + a_2 z^-2 + a_3 z^-3 = (1 - p_1 z^-1)(1 - p_2 z^-1)(1 - p_3 z^-1)
        = 1 - (p_1 + p_2 + p_3) z^-1 + (p_1 p_2 + p_2 p_3 + p_1 p_3) z^-2 - p_1 p_2 p_3 z^-3

Then replacing z^-1 with z gives:

    1 + a_1 z + a_2 z^2 + a_3 z^3 = (1 - p_1 z)(1 - p_2 z)(1 - p_3 z)

and therefore multiplying both sides by z^-3 gives:

    z^-3 (1 + a_1 z + a_2 z^2 + a_3 z^3) = z^-3 (1 - p_1 z)(1 - p_2 z)(1 - p_3 z)

    z^-3 + a_1 z^-2 + a_2 z^-1 + a_3 = (z^-1 - p_1)(z^-1 - p_2)(z^-1 - p_3)
        = -p_1 p_2 p_3 (1 - p_1^-1 z^-1)(1 - p_2^-1 z^-1)(1 - p_3^-1 z^-1)
        = a_3 (1 - p_1^-1 z^-1)(1 - p_2^-1 z^-1)(1 - p_3^-1 z^-1)

hence revealing the roots of the order reversed polynomial to be at z = 1/p_1, z = 1/p_2 and z = 1/p_3. Of course, if all of the poles of Eq. 10 lie within the z-domain unit circle then all of the zeroes of the numerator of Eq. 10 will necessarily lie outside of the unit circle, i.e. when |p_n| < 1 for n = 1 to N then |p_n^-1| > 1 for n = 1 to N. Therefore an all-pass filter is maximum phase.
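The reciprocal pole/zero structure of Eq. 10 is what makes the magnitude response flat, and this is easy to verify numerically. A minimal sketch using the Python standard library, for an arbitrary (illustrative, stable) denominator A(z) with real weights:

```python
import cmath

# Denominator coefficients a_0 .. a_N of A(z), with a_0 = 1 (illustrative values;
# the roots of this A(z) have magnitude 0.5, so the filter is stable).
a = [1.0, -0.5, 0.25]
# The all-pass numerator is the order reversed denominator, z^-N A(z^-1).
b = a[::-1]

def poly_at(coeffs, zinv):
    """Evaluate c_0 + c_1 z^-1 + ... + c_N z^-N at a given value of z^-1."""
    total = 0j
    for n, c in enumerate(coeffs):
        total += c * zinv ** n
    return total

# |H(e^jw)| should equal 1 at every frequency on the unit circle.
gains = []
for i in range(100):
    w = 2 * cmath.pi * i / 100
    zinv = cmath.exp(-1j * w)
    gains.append(abs(poly_at(b, zinv) / poly_at(a, zinv)))
```

The roots of the reversed polynomial could similarly be confirmed to be the reciprocals of the poles using any polynomial root finder.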
The magnitude frequency response of the section formed by the pole at z = p_i and the zero at z = p_i^-1 is:

    |H_i(e^jw)| = | (1 - p_i^-1 z^-1) / (1 - p_i z^-1) |  evaluated at z = e^jw  =  1/|p_i|      (11)

(For a complex pole this holds once the conjugate symmetry of the zeroes of a real-weight filter is taken into account: the zero set {1/p_n} is closed under conjugation, so each pole p_i may be paired with the zero at z = (p_i*)^-1, which gives the simplest algebra.) If we let p_i = x + jy, then with this pairing the frequency response is found by evaluating the transfer function at z = e^jw:

    H_i(e^jw) = (1 - (p_i*)^-1 e^-jw) / (1 - p_i e^-jw) = (1/p_i*) (p_i* - e^-jw) / (1 - p_i e^-jw) = (1/p_i*) G(e^jw)

where |G(e^jw)| = 1. This can be shown by first considering that:

    G(e^jw) = (x - jy - (cos w - j sin w)) / (1 - (x + jy)(cos w - j sin w))
            = [ (x - cos w) - j(y - sin w) ] / [ (1 - x cos w - y sin w) + j(x sin w - y cos w) ]

and therefore the (squared) magnitude frequency response of G(e^jw) is:

    |G(e^jw)|^2 = [ (x - cos w)^2 + (y - sin w)^2 ] / [ (1 - (x cos w + y sin w))^2 + (x sin w - y cos w)^2 ]

    = [ (x^2 - 2x cos w + cos^2 w) + (y^2 - 2y sin w + sin^2 w) ]
      / [ 1 - 2x cos w - 2y sin w + (x cos w + y sin w)^2 + x^2 sin^2 w + y^2 cos^2 w - 2xy sin w cos w ]

    = [ (sin^2 w + cos^2 w) + x^2 + y^2 - 2x cos w - 2y sin w ]
      / [ 1 + x^2 (sin^2 w + cos^2 w) + y^2 (sin^2 w + cos^2 w) - 2x cos w - 2y sin w ]

    = (1 + x^2 + y^2 - 2x cos w - 2y sin w) / (1 + x^2 + y^2 - 2x cos w - 2y sin w) = 1
Hence:

    |H_i(e^jw)| = 1/|p_i| = 1/sqrt(x^2 + y^2)

Therefore the magnitude frequency response of the all-pass filter in Eq. 10 is indeed “flat” and given by:

    |H(e^jw)| = |a_N| |H_1(e^jw)| |H_2(e^jw)| ... |H_N(e^jw)| = |a_N| / |p_1 p_2 ... p_N| = 1      (12)

since from Eqs. 7 and 10 it is easy to show that |a_N| = |p_1 p_2 ... p_N|. Consider the poles and zeroes of a simple 2nd order all-pass filter transfer function (found by simply using the quadratic formula):

    H(z) = (1 + 2z^-1 + 3z^-2) / (3 + 2z^-1 + z^-2)
         = p_1 p_2 (1 - p_1^-1 z^-1)(1 - p_2^-1 z^-1) / [ (1 - p_1 z^-1)(1 - p_2 z^-1) ]

where p_1 = -1/3 + j(sqrt(2)/3) and p_2 = -1/3 - j(sqrt(2)/3) (so that p_1 p_2 = 1/3), and the zeroes are at p_1^-1 = -1 - j sqrt(2) and p_2^-1 = -1 + j sqrt(2).

[Figure: z-domain pole-zero plot of the 2nd order all-pass filter, with poles inside and zeroes outside the unit circle.]

This example demonstrates that, given that the poles must be inside the unit circle for a stable filter, the zeroes will always be outside of the unit circle, i.e. maximum phase. Any non-minimum phase system (i.e. with zeroes outside the unit circle) can always be described as a cascade of a minimum phase filter and a maximum phase all-pass filter. Consider the non-minimum phase filter:

    H(z) = (1 - alpha_1 z^-1)(1 - alpha_2 z^-1)(1 - alpha_3 z^-1)(1 - alpha_4 z^-1) / [ (1 - beta_1 z^-1)(1 - beta_2 z^-1)(1 - beta_3 z^-1) ]      (13)

where the poles beta_1, beta_2 and beta_3 are inside the unit circle (to ensure a stable filter), the zeroes alpha_1 and alpha_2 are inside the unit circle, but the zeroes alpha_3 and alpha_4 are outside of the unit circle.
This filter can be written in the form of a minimum phase system cascaded with an all-pass filter by multiplying and dividing by (1 - alpha_3^-1 z^-1)(1 - alpha_4^-1 z^-1) and rewriting as:

    H(z) = [ (1 - alpha_1 z^-1)(1 - alpha_2 z^-1)(1 - alpha_3^-1 z^-1)(1 - alpha_4^-1 z^-1) / ( (1 - beta_1 z^-1)(1 - beta_2 z^-1)(1 - beta_3 z^-1) ) ]
           x [ (1 - alpha_3 z^-1)(1 - alpha_4 z^-1) / ( (1 - alpha_3^-1 z^-1)(1 - alpha_4^-1 z^-1) ) ]      (14)

where the first factor is the minimum phase filter and the second factor is the maximum phase all-pass filter. The minimum phase filter has its zeroes inside the unit circle at z = alpha_3^-1 and z = alpha_4^-1, and has the same shape of magnitude frequency response as the original filter, since the gain of the all-pass factor is constant at all frequencies. See also All-pass Filter - Phase Compensation, Digital Filter, Infinite Impulse Response Filter, Notch Filter.

All-pass Filter, Phase Compensation: All-pass filters are often used for phase compensation or group delay equalisation, where the aim is to cascade an all-pass filter with a particular filter in order to achieve a linear phase response in the passband while leaving the magnitude frequency response unchanged. (Given that signal information in the stopband is unwanted, there is usually no need to phase compensate there!)

[Figure: Magnitude and phase responses of a non-linear phase filter G(z), and of the phase compensated cascade G(z)H_A(z).]
Therefore, if a particular filter has a non-linear phase response, and therefore a non-constant group delay, it may be possible to design a phase compensating all-pass filter:

[Figure: Cascading an all-pass filter H_A(z) with a non-linear phase filter G(z) in order to linearise the phase response and therefore produce a constant group delay. The magnitude frequency response of the cascaded system is the same as that of the original system.]

See also Digital Filter, Infinite Impulse Response Filter, Notch Filter.

All-pass Filter, Fractional Sample Delay Implementation: If it is required to delay a digital signal by an integer number of sample periods, this is easily accomplished using delay elements:

    y(k) = x(k - 3)

[Figure: Delaying a signal by 3 samples using simple delay elements, where T = 1/fs secs.]

Using DSP techniques to delay a signal by a time that is an integer number of sample periods t_s = 1/f_s is therefore relatively straightforward. However, delaying by a time that is not an integer number of sample periods (i.e. a fractional delay) is less straightforward. One method uses a simple first order all-pass filter to “approximately” implement a fractional sample delay.
Consider the all-pass filter:

    H(z) = (z^-1 + a) / (1 + a z^-1)      (15)

To find the phase response, we first calculate:

    H(e^jw) = (e^-jw + a) / (1 + a e^-jw) = (cos w - j sin w + a) / (1 + a cos w - j a sin w)
            = [ (a + cos w) - j sin w ] / [ (1 + a cos w) - j a sin w ]      (16)

and therefore:

    angle H(e^jw) = tan^-1 [ -sin w / (a + cos w) ] - tan^-1 [ -a sin w / (1 + a cos w) ]      (17)

For small values of x the approximations tan^-1 x ~ x, cos x ~ 1 and sin x ~ x hold. Therefore in Eq. 17, for small values of w, we get:

    angle H(e^jw) ~ -w/(1 + a) + aw/(1 + a) = -[(1 - a)/(1 + a)] w = -delta w      (18)

where delta = (1 - a)/(1 + a). Therefore at “small” frequencies the phase response is linear, thus giving a constant group delay of delta samples. Hence if a signal at a low frequency f_i, where:

    2 pi f_i / f_s << 1      (19)

is required to be delayed by a fraction delta of a sample period (t_s = 1/f_s), then:

    delta = (1 - a)/(1 + a)  =>  a = (1 - delta)/(1 + delta)      (20)

Therefore, for the sine wave input signal x(k) = sin(2 pi f_i k / f_s), the output signal is approximately y(k) ~ sin(2 pi f_i (k - delta) / f_s). The parameters associated with creating delays of 0.1, 0.4, and 0.9 of a sample are: for delta = 0.1, a = 0.9/1.1; for delta = 0.4, a = 0.6/1.4; for delta = 0.9, a = 0.1/1.9.

[Figure: Phase response and group delay for a first order all-pass filter implementing a fractional delay at low frequencies.]
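The fractional delay behaviour can be checked by simulation. A sketch using only the Python standard library (parameter values illustrative), applying the difference equation y(k) = a x(k) + x(k-1) - a y(k-1) implied by Eq. 15, with delta = 0.4:

```python
import math

fs, f_i = 48000.0, 100.0        # sample rate and a low test frequency, Hz
delta = 0.4                     # desired fractional delay, in samples
a = (1 - delta) / (1 + delta)   # Eq. 20

# Run the first order all-pass H(z) = (z^-1 + a)/(1 + a z^-1) over a sine wave.
x = [math.sin(2 * math.pi * f_i * k / fs) for k in range(2000)]
y = [0.0] * len(x)
for k in range(1, len(x)):
    y[k] = a * x[k] + x[k - 1] - a * y[k - 1]

# After the start-up transient dies away, y(k) ~ sin(2 pi f_i (k - delta) / fs).
expected = [math.sin(2 * math.pi * f_i * (k - delta) / fs) for k in range(2000)]
worst = max(abs(y[k] - expected[k]) for k in range(500, 2000))
```

At this frequency (2 pi f_i / f_s is about 0.013 radians) the small-angle approximation of Eq. 18 is extremely accurate, so the simulated output matches the ideally delayed sine wave very closely.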
For frequencies below 0.1 f_s the phase response is “almost” linear, and therefore the group delay is effectively constant. Note of course that for a stable filter, |a| < 1. The gain at all frequencies is 1 (a feature of all-pass filters of course). One area where fractional delays are useful is musical instrument synthesis, where accurate control of the feedback loop delay is desirable to allow accurate generation of musical notes with rich harmonics using “simple” filters [43]. If a digital audio system is sampling at f_s = 48000 Hz, then for frequencies up to around 4000 Hz very accurate control is available over the loop delay, thus allowing accurate generation of musical note frequencies. More detail on fractional delay methods and applications can be found in [97]. See All-pass Filter - Phase Compensation, Equalisation, Finite Impulse Response Filter - Linear Phase.

All-Pole Filter: An all-pole filter is another name for a digital infinite impulse response (IIR) filter which features only a recursive (feedback) section, i.e. it has no feedforward (non-recursive) finite impulse response (FIR) section. The discrete time equation for an all-pole filter (with the sign convention matching Eq. 21) is:

    y(k) = x(k) - sum_{n=1}^{M} b_n y(k - n)
         = x(k) - b_1 y(k-1) - b_2 y(k-2) - ... - b_{M-1} y(k-M+1) - b_M y(k-M)

[Figure: Signal flow graph of an all-pole filter, which has a feedback (recursive) section but no feedforward (non-recursive) section.]

As for all IIR filters, care must be taken to ensure that the filter is stable and all poles are within the unit circle of the z-domain. (In this example we have used b’s to specify the recursive weights and, where appropriate, a’s to specify the non-recursive weights. Some authors use precisely the reverse notation!) An M th order all-pole filter has M weights (b_1 to b_M).
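A direct implementation of the all-pole difference equation can be sketched in a few lines of Python (using the sign convention of Eq. 21, i.e. H(z) = 1/(1 + b_1 z^-1 + ... + b_M z^-M); the coefficient value below is illustrative):

```python
def all_pole_filter(x, b):
    """All-pole filter: y(k) = x(k) - b_1 y(k-1) - ... - b_M y(k-M)."""
    M = len(b)
    y = []
    for k in range(len(x)):
        acc = x[k]
        for n in range(1, M + 1):
            if k - n >= 0:
                acc -= b[n - 1] * y[k - n]
        y.append(acc)
    return y

# Single pole at z = 0.5 (i.e. b_1 = -0.5, H(z) = 1/(1 - 0.5 z^-1)):
# the impulse response should be the decaying sequence 0.5^k.
impulse = [1.0] + [0.0] * 9
h = all_pole_filter(impulse, [-0.5])
```

The pole magnitude below one guarantees the impulse response decays, illustrating the stability condition noted above.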
The z-domain transfer function of the all-pole filter can be represented by an M th order z-polynomial:

    B(z) = Y(z)/X(z) = 1 / (1 + b_1 z^-1 + ... + b_{M-1} z^-(M-1) + b_M z^-M)
         = 1 / (1 + sum_{n=1}^{M} b_n z^-n)      (21)

The all-pole filter weights are also referred to as the autoregressive parameters if the all-pole filter is used to generate an AR process. See also All-Zero Filter, Autoregressive Model, Autoregressive-Moving Average Filter, Digital Filter, Finite Impulse Response Filter, Infinite Impulse Response Filter.

All-Zero Filter: An all-zero filter is another name for a finite impulse response (FIR) digital filter:

    y(k) = w_0 x(k) + w_1 x(k-1) + w_2 x(k-2) + ... + w_{N-1} x(k-N+1)
         = sum_{n=0}^{N-1} w_n x(k-n) = w^T x(k)

where w = [w_0 w_1 ... w_{N-1}]^T and x(k) = [x(k) x(k-1) ... x(k-N+1)]^T.

[Figure: The signal flow graph and the discrete time output equation for an all-zero digital filter.]

An all-zero filter is non-recursive and therefore contains no feedback components. An (N-1)th order all-zero filter has N weights (w_0 to w_{N-1}) and can be represented as an (N-1)th order polynomial in the z-domain:

    W(z) = Y(z)/X(z) = w_0 + w_1 z^-1 + ... + w_{N-2} z^-(N-2) + w_{N-1} z^-(N-1) = sum_{n=0}^{N-1} w_n z^-n
         = z^-(N-1) [ w_0 z^(N-1) + w_1 z^(N-2) + ... + w_{N-1} ]      (22)

An all-zero filter is often also referred to as a moving average filter, although the name “moving average filter” is (usually) more specifically used to mean an all-zero filter where all of the filter weights are 1/N (or 1). See also All-Pole Filter, Comb Filter, Digital Filter, Finite Impulse Response Filter, Infinite Impulse Response Filter, Moving Average Filter.

Ambience Processing: The addition of echoes or reverberation to warm a particular sound, or to mimic the effect of a certain type of hall or other acoustic environment.
Another popular term used by HiFi companies is Digital Soundfield Processing (DSfP).

Amplifier: A device used to amplify, or linearly increase, the value of an analog voltage signal. Amplifiers are usually denoted by a triangle symbol. The amplification factor is stated as a ratio V_out/V_in, or in dBs as 20 log10 (V_out/V_in). For any real time input/output DSP system some form of amplifier interface is required at the input and the output. A good amplifier should have a very high input impedance and a very low output impedance. Some systems require an amplification factor of 1 to protect or isolate a source; this type of amplifier is often called a buffer. See also Operational Amplifier, Digital Amplifier, Buffer Amplifier, Pre-amplifier, Attenuation.

[Figure: An amplifier increasing the amplitude of an input voltage waveform.]

Amplitude: The value (or magnitude) of a signal at a specific time. Prior to analog to digital conversion (ADC) the instantaneous amplitude will be given as a voltage value, and after the ADC the amplitude of a particular sample will be given as a binary number. Note that a few authors use amplitude to mean the peak (plus/minus) magnitude of a signal.

[Figure: Signal amplitude at t1: V = 3.7 volts and at t2: V = -3.1 volts; after A/D conversion, at n1: value = 30976 and at n2: value = -20567.]

Amplitude Modulation: One of the three ways of modulating a sine wave signal to carry information. The sine wave or carrier has its amplitude changed in accordance with the information signal to be transmitted. See also Frequency Modulation, Phase Modulation.

Amplitude Response: See Fourier Series - Amplitude/Phase Representation, Fourier Series - Complex Exponential Representation.

Amplitude Shift Keying (ASK): A digital modulation technique in which the information bits are encoded in the amplitude of a symbol.
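A toy illustration of ASK in Python, mapping bits to two carrier amplitudes and recovering them by thresholding (the amplitudes, symbol length and threshold are illustrative choices, not from any particular standard):

```python
import math

def ask_modulate(bits, amp0=0.2, amp1=1.0, samples_per_symbol=8):
    """Encode each bit as a burst of carrier at one of two amplitudes."""
    signal = []
    for bit in bits:
        amp = amp1 if bit else amp0
        for n in range(samples_per_symbol):
            signal.append(amp * math.sin(2 * math.pi * n / samples_per_symbol))
    return signal

def ask_demodulate(signal, samples_per_symbol=8, threshold=0.6):
    """Recover bits by thresholding the peak amplitude of each symbol."""
    bits = []
    for i in range(0, len(signal), samples_per_symbol):
        peak = max(abs(s) for s in signal[i:i + samples_per_symbol])
        bits.append(1 if peak > threshold else 0)
    return bits

tx_bits = [1, 0, 1, 1, 0, 0, 1]
rx_bits = ask_demodulate(ask_modulate(tx_bits))
```

A real receiver would of course face noise and timing uncertainty; this sketch only shows the amplitude-to-bit mapping idea.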
On-Off Keying (OOK) is a special case of ASK in which the two possible symbols are zero (Off) and V volts (On). See also Frequency Shift Keying, Phase Shift Keying, Pulse Amplitude Modulation, Quadrature Amplitude Modulation.

Analog: Analog means the “same as”. Therefore, as an example, an analog voltage for a sound signal means that the voltage has the same characteristics of amplitude and phase variation as the sound. Using the appropriate sensor, analog voltages can be created for light intensity (a photovoltaic cell), vibrations (accelerometer), sound (microphone), fluid level (potentiometer and floating ball), and so on.

Analog Computer: Before the availability of low cost, high performance DSP processors, analog computers were used for the analysis of signals and systems. The basic linear elements for analog computers were the summing amplifier, the integrator, and the differentiator [44]. By the judicious use of resistor and capacitor values, and the input of appropriate signals, analog computers could be used for solving differential equations, exponential and sine wave generation, and the development of control system transfer functions.

    Integrator:      V_out = -(1/RC) integral_0^t V_in dt
    Differentiator:  V_out = -RC dV_in/dt
    Summer:          V_out = -( (Rf/R1) V_1 + (Rf/R2) V_2 + (Rf/R3) V_3 )

[Figure: Op-amp based integrator, differentiator and three input summing amplifier circuits.]

Analog Differentiator: See Analog Computer.

Analog Integrator: See Analog Computer.

Analog to Digital Converter (A/D or ADC): An analog to digital converter takes an analog input voltage (a real number) and converts it (or “quantizes” it) to a binary number (i.e., to one of a finite set of values). The number of conversions per second is governed by the sampling rate. The input to an ADC is usually from a sample and hold circuit, which holds the input voltage constant for one sampling period while the ADC performs the actual analog to digital conversion. Most ADCs used in DSP use 2’s complement arithmetic.
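An ideal uniform quantiser producing 2's complement codes can be sketched in a few lines (an idealised model only; a real converter also exhibits offset, gain and linearity errors, and the 5 bit, plus/minus 2 volt range is illustrative):

```python
def adc_quantize(v, bits=5, full_scale=2.0):
    """Ideal uniform ADC: map a voltage to a 2's complement integer code.

    The step size is full_scale / 2^(bits-1); out-of-range inputs are
    clipped to the representable code range.
    """
    step = full_scale / 2 ** (bits - 1)
    code = round(v / step)
    # Clip to the 2's complement range [-2^(bits-1), 2^(bits-1) - 1].
    return max(-(2 ** (bits - 1)), min(2 ** (bits - 1) - 1, code))
```

With these settings one step is 0.125 V, so +1 V maps to code 8, -1 V to code -8, and +2 V saturates at the top code 15.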
For audio applications 16 bit ADCs are used, whereas for telecommunications and speech coding 8 bit ADCs are usually used. Modern ADCs can achieve almost 20 bits of accuracy at sampling rates of up to 100 kHz. See also Anti-alias Filter, Digital to Analog Converter, Quantizer, Sample and Hold, Sigma Delta.

[Figure: Example of a 5 bit ADC converting the output from a sample and hold circuit to two’s complement binary values.]

Anechoic: An acoustic condition in which (virtually) no reflected echoes exist. This would occur if two people were having a conversation while suspended very high in the air. In practice, anechoic chambers can be built where the walls are made of specially constructed cones which do not reflect any sound but absorb it all. Having a conversation in an anechoic chamber can be awkward, as the human brain expects some echo to occur.

ANSI: American National Standards Institute. A group affiliated with the International Standards Organization (ISO) that prepares and establishes standards for a wide variety of science and engineering applications, including transmission codes such as ASCII and companding standards like µ-law, among other things. See also Standards.

ANSI/IEEE Standard 754: See IEEE Standard 754.

Anti-alias Filter: A filter used at the input to an A/D converter to block any frequencies above f_s/2, where f_s is the sampling frequency of the A/D (analog to digital) converter. The anti-alias filter is analog and usually composed of resistive and capacitive components to provide good attenuation above f_s/2. With the introduction of general oversampling techniques, and more specifically sigma-delta techniques, the specification for analog anti-alias filters is traded off against using oversampling and digital low pass filters.
See also Aliasing, Analog to Digital Converter, Oversampling, Sampling, Sample and Hold.

[Figure: Frequency spectra of an analog signal before and after being filtered by an anti-alias filter with cutoff at fs/2.]

Aperture: The physical distance spanned by an array of sensors or an antenna dish. Aperture is a fundamental quantity in DSP applications ranging from RADAR processing to SONAR array processing to geophysical remote sensing. See also Beamforming, Shading Weights.

Aperture Taper: See Shading Weights.

Application Specific Integrated Circuit (ASIC): A custom designed integrated circuit targeted at a specific application. For example, an ASIC could be designed that implements a 32 tap digital filter with weights set up to provide high pass filtering for a digital audio system.

Architecture: The hardware set up of a particular DSP system. For example, a system which uses four DSP processors may be referred to as a parallel processing DSP architecture. At the chip level, inside most DSP processors a control bus, address bus and data bus arrangement is used that is often referred to generically as the Harvard architecture. See also DSP Board, DSP Processor.

Arpanet: The name for the US Defense Department’s Advanced Research Projects Agency network (circa 1969), which was the first distributed communications network and has now “probably” evolved into the Internet.

Array (1): The name given to a set of quantities stored in a tabular or list type form. For example, a 3 x 5 matrix could be stored as a 3 x 5 array in memory.

Array (2): The general name given to a group of sensors/receivers (antennas, microphones, or hydrophones for example) arranged in a specific pattern in order to improve the reception of a signal impinging on the array sensors.
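For a plane wave arriving at angle theta from broadside, the relative arrival delay at the n-th sensor of a uniform linear array is n d sin(theta)/c. A small sketch (spacing and propagation speed are illustrative; 1500 m/s is roughly the speed of sound in water) also shows the direction ambiguity inherent in a 1-D array: theta and 180 degrees minus theta produce identical delay sets:

```python
import math

def element_delays(theta_deg, num_sensors=8, spacing=0.5, c=1500.0):
    """Relative arrival delays (seconds) across a uniform linear array.

    theta is measured from broadside; spacing is in metres; c is the
    propagation speed of the wavefront.
    """
    theta = math.radians(theta_deg)
    return [n * spacing * math.sin(theta) / c for n in range(num_sensors)]

# A source at 30 degrees and one at 150 degrees give identical delays,
# so a 1-D array alone cannot tell them apart.
d30, d150 = element_delays(30.0), element_delays(150.0)
```

This pairwise ambiguity is one slice of the full cone of ambiguous arrival directions that a single linear array suffers from.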
The simplest form of array is the linear, or 1-D (one dimensional), array, which consists of a set of (often equally spaced) sensors. This array can be used to discriminate angles of arrival in any plane containing the array, but is limited because of a cone of confusion. This cone is the cone of angles of arrival that all give rise to identical time differences at the array.

[Figure: A linear equi-spaced array and its cone of confusion.]

The 2-D array has a set of elements distributed in a plane and can be used to discriminate signals in two dimensions of arrival angle. A similar, but less severe, confusion results, since signals from opposite sides of the plane containing the array (top-bottom) give rise to the same time delays at each of the elements. This may or may not be a problem depending on the geometry of the array and the particular application of the array. 3-D arrays can also be used to eliminate this ambiguity. See also Beamforming.

Array Multiplier: See Parallel Multiplier.

ASCII: American Standard Code for Information Interchange. A 7 bit binary code that defines 128 standard characters for use in computers and data transmission. See also EBCDIC.

Assembler: A program which takes mnemonic codes for operations on a DSP chip and assembles them into machine code which can actually be run on the processor. See also Cross Compiler, Machine Code.

Assembly Language: This is a mnemonic code used to program a DSP processor at a relatively low level. The assembly language is then assembled into actual machine code (1’s and 0’s) that can be downloaded to the DSP system for execution. The assembly language differs between DSP processors from the various DSP chip manufacturers. See also Cross Compiler, Machine Code.
    movep   y:input, x:(r0)                            ; input sample
    clr     a            x:(r0)+,x0   y:(r4)+,y0
    rep     #19
    mac     x0,y0,a      x:(r0)+,x0   y:(r4)+,y0
    macr    x0,y0,a      (r0)
    movep   a, y:output                                ; output filtered sample

A segment of Motorola DSP56000 assembly language to realize a 20 tap FIR filter.

Asymptotic: When a variable x converges to a solution m, with the error e = x - m reducing with increasing time but never (in theory) reaching exactly m, the convergence is asymptotic. For example the function:

    x_n = 2^-n      (23)

will asymptotically approach zero as n increases, but will never reach exactly zero. (Of course, if finite precision arithmetic is used then the quantization error may allow this particular result to converge exactly.)

[Figure: The function x_n plotted against iteration n, decaying asymptotically towards zero.]

See also Adaptive Signal Processing, Convergence, Critically Damped, Overdamped, Underdamped.

Asynchronous: Meaning not synchronized. An asynchronous system does not work to the regular beat of a clock, and is likely to use handshaking techniques to communicate with other systems. See also Handshaking.

[Figure: A simple protocol for handshaking. DSP System 1 sends an RTS signal (request to send data) to DSP System 2, which replies with a CTS signal (clear to send data) if it is ready to receive data. After the handshake using RTS and CTS, the data can be transmitted on the Tx line.]

Asynchronous Transfer Mode (ATM): A protocol for digital data transmission (e.g., voice or video data) that breaks data from higher levels in a network into 53 byte cells comprising a 5 byte header and 48 data bytes. The protocol allows for virtual circuit connections (i.e., like a telephone circuit) and can be used to support a datagram network (i.e., like some electronic mail systems). In spite of the word Asynchronous, ATM can be used over the ubiquitous synchronous optical network (SONET).
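The 53 byte cell structure can be sketched as a simple segmentation routine (the header bytes below are placeholders; a real ATM header carries VPI/VCI routing and control fields):

```python
def segment_into_cells(payload: bytes, header: bytes = b"\x00" * 5):
    """Split a byte stream into 53 byte ATM-style cells (5 header + 48 data).

    The last cell's data field is zero padded to 48 bytes. The header
    content here is a placeholder, not a real ATM header.
    """
    assert len(header) == 5
    cells = []
    for i in range(0, len(payload), 48):
        chunk = payload[i:i + 48].ljust(48, b"\x00")
        cells.append(header + chunk)
    return cells

cells = segment_into_cells(b"A" * 100)   # 100 data bytes -> 3 cells
```

The fixed 53 byte cell size is what lets ATM switches make fast, predictable forwarding decisions.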
Attack-Decay-Sustain-Release (ADSR): In general, the four phases of the sound pressure level envelope of a musical note comprise: (1) the attack, when the note is played; (2) the decay, when the note starts to reduce in volume from its peak; (3) the sustain, where the note holds its volume and the decay is slow; and (4) the release, after the note is released and the volume rapidly decays away. The ADSR profile of most musical instruments is different and varies widely for different classes of instrument such as woodwind, brass, and strings.

[Figure: The amplitude envelope of a musical instrument can usually be characterized by four different phases. The relative duration of each phase depends of course on the instrument being played.]

Specification of the ADSR values is a key element for the synthesis of musical instruments. See also Music, Music Synthesis.

Attenuation: A signal is attenuated when its magnitude is reduced. Attenuation is often measured as a (modulus) ratio |V_out/V_in|, or in dBs as 20 log10 |V_out/V_in|. Note that an attenuation factor of 10 is equivalent to a gain factor of 1/10 and, expressed in dB, an attenuation of 20 dB is equivalent to a gain of -20 dB, i.e.,

    Attenuation Factor = 1/Gain Factor,  or  Attenuation (dB) = -Gain (dB)      (24)

Therefore an attenuation factor of 0.1 is actually a gain factor of 10! The simplest form of attenuator for analog circuits is a resistor bridge. (Of course, to avoid loading the source it is more advisable to use an op-amp based attenuator.) See also Amplifier.

[Figure: An attenuator reducing the amplitude of an input voltage waveform.]

Audio: Audio is the Latin word for “I hear” and is usually used in the context of electronic systems and devices that produce and affect what we hear.

Audio Evoked Potential: See Evoked Potentials.
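The attenuation/gain relationship of Eq. 24 can be confirmed with a couple of lines of Python:

```python
import math

def gain_db(ratio):
    """Gain in dB of a (modulus) amplitude ratio V_out/V_in."""
    return 20 * math.log10(abs(ratio))

# An attenuation factor of 10 (V_out/V_in = 1/10) is a gain of -20 dB,
# and Attenuation (dB) = -Gain (dB).
attenuation_factor = 10.0
gain = gain_db(1.0 / attenuation_factor)   # -20 dB
attenuation_db = -gain                     # +20 dB
```

Conversely, a "gain" of 10 is +20 dB, which is why describing a factor of 0.1 as an attenuation of 0.1 invites exactly the confusion the entry warns about.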
Audio Engineering Society/European Broadcast Union (AES/EBU): AES/EBU is the acronym used to describe a popular digital audio standard defining a bit serial communications protocol for transmitting two channels of digital audio data on a single transmission line. The standard requires the use of 32 kHz, 44.1 kHz or 48 kHz sample rates. See also Standards.

Audio Engineering Society (AES): The Audio Engineering Society is a professional organization whose area of interest is all aspects of audio. The international headquarters are at 60 East 42nd Street, Room 2520, New York, NY 10165-2520, USA. The British section is at AES British Section, Audio Engineering Society Ltd, PO Box 645, Slough SL1 8BJ, UK.

Audiogram: An audiogram is a graph showing the deviation of a person’s hearing from the defined “average threshold of hearing” or “Hearing Level”. The audiogram plots hearing level, dB (HL), against logarithmic frequency for both ears. dB (HL) are used in preference to dB (SPL) - sound pressure level - in order to allow a person’s hearing profile to be compared with a straight line average unimpaired hearing threshold.

[Figure: Two audiograms plotted from 125 Hz to 8000 Hz against the 0 dB (HL) threshold of hearing line: a “reasonably” healthy ear, and an impaired ear with high frequency hearing loss (o - right ear, x - left ear).]

An audiogram is produced by an audiologist using a calibrated audiometer to find the lowest level of aural stimuli just detectable by a patient’s left and right ear respectively. See also Audiometry, Auditory Filters, Ear, Equal Loudness Contours, Frequency Range of Hearing, Hearing Impairment, Hearing Level, Permanent Threshold Shift, Sound Pressure Level, Temporary Threshold Shift, Threshold of Hearing.

Audiology: The scientific study of hearing.
See also Audiometry, Auditory Filters, Beat Frequencies, Binaural Beats, Binaural Unmasking, Dichotic, Diotic, Ear, Equal Loudness Contours, Equivalent Sound Continuous Level, Frequency Range of Hearing, Habituation, Hearing Aids, Hearing Impairment, Hearing Level, Loudness Recruitment, Psychoacoustics, Sensation Level, Sound Pressure Level, Spectral Masking, Temporal Masking, Temporary Threshold Shift, Threshold of Hearing.

Audiometer: An instrument used to measure the sensitivity of human hearing using various forms of aural stimuli at calibrated sound pressure levels (SPL). An audiometer is usually a desktop instrument with a selection of potentiometric sliders, dials and switch controls to specify the frequency range, signal characteristics and intensity of various aural stimuli. Audiometers connect to calibrated headphones (for air conduction tests) or a bone-phone (to stimulate the mastoid bone behind the ear with vibrations if tests are being done to detect the presence of nerve deafness). Occasionally free-field loudspeaker tests may be done using narrowband frequency modulated tones or warble tones. (If pure tones were used, nodes and anti-nodes would be set up in the test room at various points.)

The most basic form of audiometer is likely to produce only pure tones at the frequencies 125 Hz, 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz, and 8000 Hz. More complex audiometers will be able to produce intermediate frequencies and also frequency modulated (FM) or warble tones, bandlimited noise, and spectral masking noise. Because of the dynamic range of human hearing and the severity of some impairments, an audiometer may need to be able to generate tones over a 130 dB (SPL) range. Computer based DSP audiometers are likely to completely displace the traditional analogue electronic instruments over the next few years.
DSP audiometers may even be integrated into PC notebook style "DSP Audiometric Workstations", capable of all forms of audiometric testing, hearing aid testing, and programming of the coming generation of DSP hearing aids. See also Audiogram, Audiometry, Auditory Filters, Frequency Range of Hearing, Hearing Impairment, Hearing Level, Sound Pressure Level, Spectral Masking, Threshold of Hearing.

Audiometry: Audiometry is the measurement of the sensitivity of the human ear [30], [157]. For audiometric testing, audiologists use electronic instruments called audiometers to generate various forms of aural stimuli. A first test of any patient's hearing is usually done with pure tone audiometry, using tones with less than 0.05% total harmonic distortion (THD) at test frequencies of 125Hz, 250Hz, 500Hz, 1000Hz, 2000Hz, 4000Hz and 8000Hz, and dynamic ranges of almost 130dB (SPL) for the most sensitive human hearing frequencies between 2-4kHz. Each ear is presented with a tone lasting (randomly) between 1 and 3 seconds; the randomness avoids giving rhythmic clues to the patient. The loudness of the tones is varied in steps of 5 and 10dB until a threshold can be determined. The patient indicates whether a tone was heard by clicking a switch. As an example of a test procedure, the British Society of Audiology Test B [157] determines the threshold at a particular frequency as follows:

1. Reduce the tone level in 10dB steps until the patient no longer responds;

2. Present three further tones at this level. If none or only one of these is heard, that level is taken as unheard;

3. If all tones in stage 2 were heard, reduce the level by 5dB, repeating the stage 2 procedure, until a level is unheard;

4. If stage 2 was not heard, raise the level by 5dB and present as many tones as are necessary to deduce whether at least 2 out of 4 presentations were heard. If this level is heard it is taken as the threshold for that frequency;

5.
If stage 4 was not heard, raise the level by 5dB and repeat stage 4 until a threshold is found.

The results of an audiometric test are usually plotted as an audiogram, a graph of dB Hearing Level (HL) versus logarithmic frequency. An audiometric procedure using (spectral) masking is particularly important where one ear is suspected to be much more sensitive than the other. Most audiometers will provide a facility to produce spectral masking noise. Masking noise is generally white and is played into the ear that is not being tested in cases where the tone presented to the test ear is very loud. If masking were not used, the tone conducted through the skull would be heard by the other ear, giving a false impression of the sensitivity of the ear under test. More complex audiometers provide a wider range of frequencies, and also facilities for producing narrowband frequency modulated tones, narrowband noise, white noise, and speech noise, thus providing a more comprehensive facility for investigation of hearing loss. Audiometry is specified in IEC 645, ISO 6189: 1983, ISO 8253: 1989. See also Audiogram, Audiology, Ear, Frequency Range of Hearing, Hearing Impairment, Sensation Level, Sound Pressure Level, Spectral Masking, Temporal Masking, Threshold of Hearing.

Auditory Filters: It is conjectured that a suitable model of the front end of the auditory system is composed of a series of overlapping bandpass filters [30]. When trying to detect a signal of interest in broadband background noise, the listener is thought to make use of a filter with a centre frequency close to that of the signal of interest. The perception to the listener is that the background noise is somewhat filtered out and only the components of the background noise that lie in the auditory filter passband remain. The threshold of hearing of the signal of interest is thus determined by the amount of noise passing through the filter.
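The threshold search of the Test B audiometry procedure above can be sketched in Python. This is an illustrative simplification, not part of the original standard: the repeated-presentation checks of stages 2-4 are collapsed into single responses, and the `heard(level)` callback standing in for the patient's switch response is hypothetical:

```python
def find_threshold(heard, start_level, coarse=10, fine=5):
    """Simplified sketch of a descending/ascending threshold search.

    `heard(level)` is a hypothetical callback returning True if the
    patient responds to a tone presented at `level` dB (HL).
    """
    level = start_level
    # Stage 1 in essence: reduce the tone level in 10dB steps until
    # the patient no longer responds.
    while heard(level):
        level -= coarse
    # Stages 4/5 in essence: raise the level in 5dB steps until the
    # tone is heard again; take that level as the threshold.
    while not heard(level):
        level += fine
    return level
```

For a patient who reliably responds at 35 dB (HL) and above, `find_threshold(lambda lv: lv >= 35, 80)` returns 35.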
This auditory filter can be demonstrated by presenting a tone in the presence of noise centered around the tone and gradually increasing the noise bandwidth while maintaining a constant noise power spectral density. The threshold of the tone increases at first, but starts to flatten off as the noise extends beyond the bandwidth of the auditory filter. The bandwidth at which the tone threshold stops increasing is known as the critical bandwidth (CB) or equivalent rectangular bandwidth (ERB). These filters are often assumed to have constant percent critical bandwidths (i.e., constant fractional bandwidths). For normal hearing individuals this bandwidth may be about 18 percent, so an auditory filter centered at 1000 Hz would have a critical bandwidth of about 180 Hz. The entire hearing range can be covered by about 24 (non-overlapping) critical bandwidths. See also Audiology, Audiometry, Ear, Fractional Bandwidth, Frequency Range of Hearing, Psychoacoustics, Spectral Masking, Temporal Masking, Threshold of Hearing.

Aural: Relating to the process of hearing. The terms monaural and binaural are related to hearing with one and two ears respectively. See also Audiology, Binaural, Ear, Monaural, Threshold of Hearing.

Auralization: The acoustic simulation of virtual spaces; for example, simulating the sound of a stadium (an open sound with large echo and long reverberation times) in a small room using DSP.

Autocorrelation: When dealing with stochastic (random) signals, the autocorrelation, r(n), provides a measure of the randomness of a signal, x(k), and is calculated as:

r(n) = E{ x(k) x(k+n) } = Σ_k x(k) x(k+n) p{ x(k), x(k+n) }     (25)

where p{ x(k), x(k+n) } is the joint probability density function of the signal or random process, x(k), at times k and k+n.
For ergodic signals using 2M available samples, the autocorrelation can be estimated as a time average:

r(k) = (1/M) Σ_{n=0}^{M-1} x(n) x(n+k),  for large M     (26)

If the mean and autocorrelation of a signal are constant then the signal is said to be wide sense stationary. In many least mean squares DSP algorithms the assumption of wide sense stationarity is necessary for algorithm derivations and proofs of convergence.

If a signal is highly correlated from sample to sample, then for a particular sample at time, i, the next sample at time i+1 will have a value that can be predicted with a small amount of error. If a signal has almost no sample to sample correlation (almost white noise) then the sample value at time i+1 cannot be reliably predicted from values of the sequence occurring at or before time i. Calculating the autocorrelation function, r(n), therefore gives a measure of how well correlated (or "similar") a signal is with itself, by comparing the difference between samples at time lags of n = 0, 1, 2,... and so on. Taking the discrete Fourier transform of the autocorrelation function yields the Power Spectral Density (PSD) function, which gives a measure of the frequency content of a stochastic signal. See also Ergodic, Power Spectral Density.

[Figure: the time waveforms of two signals, A and B, their autocorrelation functions r(n), and their power spectral densities.]

Signal A is more highly correlated than Signal B, and therefore from sample to sample, Signal A varies less than Signal B. The autocorrelation function of Signal A is wider than for Signal B because, as n increases, samples are correlated with previous values and the signal does not change its magnitude by a large amount. Signal B makes larger and less predictable changes, and as the lag value n increases the correlation between the i-th sample and the (i+n)-th sample reduces rapidly.
By inspection Signal B has the wider frequency content, which is confirmed on calculation of the Power Spectral Density function.

Autoregressive (AR) Model: An autoregressive model is a means of generating an autoregressive stochastic process. Autoregressive refers to the fact that the signal is the output of an all-pole infinite impulse response (IIR) filter that has been driven by a white noise input [17], [90]. An autoregressive process can be generated by the signal flow graph and discrete time equation below:

[Figure: signal flow graph of an M-th order all-pole filter with white noise input v(k), output u(k), and feedback weights b1 to bM.]

u(k) = v(k) − Σ_{n=1}^{M} b_n u(k−n) = v(k) − b1 u(k−1) − b2 u(k−2) − … − b_{M−1} u(k−M+1) − b_M u(k−M)

An autoregressive model has a feedback (recursive) section but no feedforward (nonrecursive) section. The input signal, v(k), is assumed to be white Gaussian noise. The output signal, u(k), is referred to as an autoregressive process. When setting the filter weight values, {b_n}, care must be taken to ensure that the filter is stable, i.e. all filter poles are within the unit circle of the z-domain. In addition, since the autoregressive model is generated with a feedback system, it is necessary to let the AR system reach steady state before using the output samples. An M-th order autoregressive model is generated from an all-pole digital filter that has M weights (b1 to bM). These weights are also referred to as the autoregressive parameters. The z-domain transfer function can be represented by an M-th order z-polynomial:

H(z) = U(z)/V(z) = 1 / (1 + b1 z^−1 + … + b_{M−1} z^−(M−1) + b_M z^−M) = 1 / (1 + Σ_{n=1}^{M} b_n z^−n)     (27)

If a stochastic signal is produced by using white noise as an input to an all-pole filter, then this is referred to as autoregressive modelling.
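As a sketch (not part of the original text), an autoregressive process with the transfer function of Eq. 27 can be generated in Python with NumPy. The second order coefficients below are illustrative assumptions, chosen so that the poles lie inside the unit circle, and a burn-in period is discarded so that the output is taken in steady state:

```python
import numpy as np

def ar_process(b, n_samples, burn_in=500, seed=0):
    """Generate u(k) = v(k) - b1*u(k-1) - ... - bM*u(k-M), an AR
    process driven by white Gaussian noise v(k)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n_samples + burn_in)
    u = np.zeros(n_samples + burn_in)
    M = len(b)
    for k in range(len(u)):
        for n in range(1, M + 1):
            if k - n >= 0:
                u[k] -= b[n - 1] * u[k - n]   # feedback (all-pole) section
        u[k] += v[k]                           # white noise input
    return u[burn_in:]                         # discard start-up transient

u = ar_process([0.5, -0.2], 10000)             # illustrative stable AR(2) process
```

The poles of 1/(1 + 0.5z^−1 − 0.2z^−2) are at roughly 0.26 and −0.76, inside the unit circle, so this example model is stable.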
The name "autoregressive" comes from the Greek prefix "auto-", meaning self or one's own, and "regression", referring to what is previous or past; hence the combined meaning of a process whose output is generated from its own past outputs. Autoregressive models are sometimes loosely referred to as all-pole models. In addition, sometimes the input to the all-pole model is something other than white noise; for example, in modelling voiced speech a pulse train with the desired pitch period drives the all-pole model. Autoregressive models are widely used in speech processing and other DSP applications in which a stochastic signal is to be modelled by taking the output of an all-pole filter driven by a stochastic signal. See also All-Zero Filter, Autoregressive Modelling, Autoregressive-Moving Average Filter, Digital Filter, Infinite Impulse Response Filter.

Autoregressive Modelling (inverse): Given an M-th order autoregressive process, the inverse problem is to generate the AR model parameters which can be used to produce this process from a white noise input:

[Figure: white noise v(k) input to an autoregressive model {b1, b2,..., bM} produces the modelled signal, or autoregressive process, u(k).]

The output signal u(k) is referred to as an autoregressive process, and was generated by a white noise input at v(k). The autoregressive coefficients can be found using statistical signal processing least squares techniques such as Yule-Walker or the LMS algorithm. To do this, one common approach uses the AR process as the input to an M-th order (or greater) all-zero filter with weights {1, b1, b2, ... bM}. If the M adjustable weights are selected to minimize the output power, the output will be a white noise process. In addition, the feed-forward coefficients of the all-zero model will correspond to the parameters of the autoregressive input process.
This use of an adaptive FIR predictor is referred to as autoregressive modelling [6], [10], [17]:

[Figure: the modelled signal, or autoregressive process, u(k) input to an all-zero filter {1, b1, b2,..., bM} reproduces the white noise v(k).]

The white noise signal v(k) can be reproduced by using the modelled stochastic signal as an input to an all-zero (FIR) filter with weights {1, b1, b2,..., bM}, the first weight being 1.

[Figure: generation of white noise from an autoregressive process using an all-zero filter with input u(k), taps u(k−1) to u(k−M) weighted by b1 to bM, and output v(k).]

To see that the AR parameters are recovered we can rewrite Eq. 27 (see Autoregressive Model) as:

V(z)/U(z) = 1/H(z) = 1 + b1 z^−1 + … + b_{M−1} z^−(M−1) + b_M z^−M     (28)

If a given stochastic signal, u(k), was in fact generated by an autoregressive process then we can use mean square minimization techniques to find the autoregressive parameters (i.e., the all-pole filter weights) that would produce that signal from a white noise input. First note that the output of the all-zero filter is given by:

v(k) = u(k) + Σ_{m=1}^{M} b_m u(k−m) = u(k) + b^T u(k−1)     (29)

where the vector b = [b1 … b_{M−1} b_M]^T and the vector u(k−1) = [u(k−1) … u(k−M+1) u(k−M)]^T. If we attempt to minimize the signal v(k) at the output of the filter, then this is implicitly done by predicting the predictable components present in the stationary stochastic signal u(k) (assuming the filter is of sufficient order), which means that the output v(k) will consist of the completely unpredictable part of the signal which is, in fact, white noise (see Wold Decomposition and [17]). To use MMSE techniques, first note that the squared output signal is:

v²(k) = [u(k) + b^T u(k−1)]² = u²(k) + [b^T u(k−1)]² + 2 u(k) b^T u(k−1) = u²(k) + b^T u(k−1) u^T(k−1) b + 2 b^T [u(k) u(k−1)]     (30)

Taking expected (or mean) values using the expectation operator E{ .
} we can write the mean squared value, E{v²(k)}, as:

E{v²(k)} = E{u²(k)} + b^T E{u(k−1) u^T(k−1)} b + 2 b^T E{u(k) u(k−1)}     (31)

Writing this in terms of the M × M correlation matrix,

R = E{u(k−1) u^T(k−1)} =
    [ r_0       r_1       …   r_{M−1} ]
    [ r_1       r_0       …   r_{M−2} ]
    [  :         :        ·.     :    ]
    [ r_{M−1}   r_{M−2}   …   r_0     ]     (32)

and the M × 1 correlation vector,

r = E{u(k) u(k−1)} = [r_1  r_2  …  r_M]^T     (33)

where r_n = E{u(k) u(k−n)} = E{u(k−n) u(k)}, gives:

E{v²(k)} = E{u²(k)} + b^T R b + 2 b^T r     (34)

Given that this equation is quadratic in b, there is only one minimum value (see the entry for Wiener-Hopf Equations for more details on quadratic surfaces). The minimum mean squared error (MMSE) solution occurs when the predictable component in the signal u(k) is completely predicted, leaving only the unpredictable white noise as the output. The autoregressive coefficients, b_AR, can be found by setting the (partial derivative) gradient vector, ∇, to zero:

∇ = ∂/∂b E{v²(k)} = 2R b_AR + 2r = 0     (35)

⇒ b_AR = −R^−1 r     (36)

Therefore, given a signal that was generated by an autoregressive process, Eq. 36 (known as the Yule-Walker equations) can be used to find the parameters of the autoregressive process that would generate the signal u(k) given a white noise input signal, v(k). To calculate the Yule-Walker solution in practice, the R matrix and r vector are estimated from the stochastic signal u(k), and the R matrix is then inverted prior to premultiplying the vector r. Assuming that the signal u(k) is ergodic, then in the real world we can calculate the elements of R and r from:

r_n ≅ (1/N) Σ_{k=0}^{N−1} u(k) u(k−n)     (37)

where N is a large number of samples that adequately represent the signal.
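A numerical sketch of this procedure, using illustrative values that are not from the original text: generate an AR(2) process with known parameters, estimate the autocorrelations r_n as in Eq. 37, form the Toeplitz matrix R of Eq. 32 and vector r of Eq. 33, and recover the parameters from Eq. 36:

```python
import numpy as np

rng = np.random.default_rng(1)
b_true = [0.5, -0.2]                      # illustrative stable AR(2) parameters
M, N = 2, 200_000

# Generate the AR process u(k) = v(k) - b1*u(k-1) - b2*u(k-2).
v = rng.standard_normal(N + 1000)
u = np.zeros_like(v)
for k in range(2, len(v)):
    u[k] = v[k] - b_true[0] * u[k - 1] - b_true[1] * u[k - 2]
u = u[1000:]                              # discard start-up transient

def acorr(u, n):
    """Time-average estimate of r_n = E{u(k)u(k-n)}, as in Eq. 37."""
    if n == 0:
        return float(np.mean(u * u))
    return float(np.mean(u[n:] * u[:-n]))

r_est = [acorr(u, n) for n in range(M + 1)]
R = np.array([[r_est[abs(i - j)] for j in range(M)] for i in range(M)])  # Eq. 32
r = np.array(r_est[1:M + 1])                                             # Eq. 33
b_AR = -np.linalg.solve(R, r)                                            # Eq. 36
# b_AR approaches b_true as N grows
```

Note that `np.linalg.solve` is used rather than an explicit matrix inverse; for large M a real implementation would exploit the Toeplitz structure (Levinson-Durbin, discussed next).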
Clearly, solving the Yule-Walker equations directly requires a very large number of computations, and this is usually not done in real time systems (see the Wiener-Hopf entry for more details). Instead, the Levinson-Durbin algorithm is used: an efficient technique for solving equations of the form of Eq. 36. In many systems the LMS (least mean squares) algorithm [53] is used in a predictor architecture:

[Figure: LMS predictor. The autoregressive process u(k) is input, via a delay, to an adaptive filter, w, whose output y(k) = Filter{u(k), w(k)} is subtracted from u(k) to form the error v(k), which drives the update w(k+1) = w(k) + 2µ v(k) u(k−1).]

The signal that was generated by an autoregressive process is input to the delay and thereafter the adaptive filter. The adaptive filter attempts to minimize the signal v(k) and will therefore set the coefficients to values such that the predictable component of the signal is predicted by the autoregressive filter weights.

Autoregressive modelling is widely used in speech processing, where speech is assumed to be generated by an autoregressive process; by extracting the autoregressive filter weights (parameters), these can be used for later generation of unvoiced speech components (speech synthesis) or for speech vocoding [11]. For model based speech coding, the linear prediction problem of Eq. 36 is solved using the Levinson-Durbin algorithm. For speech coding techniques based on waveform coding, the predictor is more likely to be of the simple LMS form.

Other stochastic linear filter models include the moving average (MA) model and the autoregressive moving average (ARMA) model. However, the autoregressive filter is by far the most popular for modelling, for the main reasons that finding its weights requires only the solution of a set of linear equations, and that it is a generally good model for many applications. The MA and ARMA models, on the other hand, require the solution of a (more difficult to solve) set of non-linear equations.
See also Adaptive Filtering, Autoregressive Model, Autoregressive Moving Average Filter, Autoregressive Parametric Spectrum Estimation, Least Mean Squares Algorithm, Moving Average Model.

Autoregressive Moving Average (ARMA) Model: An autoregressive moving average model uses a combination of an autoregressive model and a moving average model. If white noise is input to an ARMA model, the output is the desired process signal u(k). Unfortunately, fitting an ARMA model requires the solution of a set of non-linear equations. See also Autoregressive Model, Moving Average FIR Filter.

Autoregressive Parametric Spectral Analysis: Using an autoregressive model we can perform parametric power spectral analysis. From the coefficients of the all-pole filter, we can generate the power spectrum of the autoregressive process output, u(k) (see the figure in Autoregressive Model), by exploiting the fact that the white noise input has a flat spectrum and a total power of σ² [17], [90]. Noting that the filter frequency response is:

H(f) = 1 / (1 + b1 e^−jω + … + b_{M−1} e^−j(M−1)ω + b_M e^−jMω) = 1 / (1 + Σ_{n=1}^{M} b_n e^−jωn)     (38)

then the power spectrum of the autoregressive filter output is:

|Y(f)|² = σ² |H(f)|²     (39)

(assuming frequency is normalized so that fs = 1). See also Autoregressive Model, Autoregressive Modelling.

Autoregressive (AR) Power Spectrum: See Autoregressive Model.

Autoregressive (AR) Process: See Autoregressive Model.

Averaging: See Waveform Averaging, Exponential Averaging, Moving Average, Weighted Moving Average.

AZTEC Algorithm: Amplitude Zone Time Epoch Coding (AZTEC) is an algorithm used for data compression of ECGs. The algorithm very simply decomposes a signal into plateaus and slopes which are then coded in a data array.
Compression ratios of a factor of 10 can be achieved; however, the algorithm can cause PRD (Percent Root-mean-square Difference) error levels of almost 30% [48].

B

Back Substitution: See Matrix Algorithms - Back Substitution.

Band Matrix: See Matrix Structured - Band.

Bandpass Filter: A filter (analog or digital) that preserves portions of an input signal between two frequencies. See also Bandstop Filter, Digital Filter, Low Pass Filter, High Pass Filter.

[Figure: bandpass filter magnitude response |G(f)| showing the bandwidth between the lower and upper cut-off frequencies.]

Bandstop Filter: A filter (analog or digital) that removes portions of an input signal between two frequencies. See also Bandpass Filter, Low Pass Filter, High Pass Filter.

[Figure: bandstop filter magnitude response |G(f)| showing the stopband between the lower and upper cut-off frequencies.]

Bandwagon: The general English definition is a party, cause or group that people may jump on, or become involved with, when it looks likely to succeed. The term was used by the famous information theorist Claude Shannon in 1956 [130] to describe the explosion of interest in his then recently published (1948) information theory paper. In referring to that particular bandwagon Shannon commented that: "Research rather than exposition is the keynote, and our critical thresholds should be raised. Authors should submit only their best efforts, and these only after careful criticism by themselves and their colleagues. A few first rate papers are preferable to a large number that are poorly conceived or half finished. The latter are no credit to their writers and a waste of time to their readers."

Bartlett Window: See Window.

Baseband: Typically, a signal prior to any form of digital or analog modulation. A baseband signal extends from 0Hz contiguously over an increasing frequency range.
For example, if a radio station produces a baseband audio signal (typically music, 0 - 12kHz) in either a digital or analog form, the baseband signal is then modulated onto a carrier (such as 102.5MHz for an FM radio station) for transmission and subsequent reception by radio receivers. At the radio receiver the signal will be demodulated back to its original frequency band. Baseband can also refer to a naturally bandpass signal that has been mixed down to DC.

Basis: See Vector Properties and Definitions - Basis.

Basis Function: A periodic signal, x(t), with period T can be expressed as a series of periodic basis functions, {φ_n}, such that:

x(t) = Σ_{n=−∞}^{∞} c_n φ_n(t)     (40)

A basis is said to be orthogonal if:

⟨φ_i(t), φ_j(t)⟩ = ∫_a^b φ_i(τ) φ_j*(τ) dτ = 0   for i ≠ j     (41)

where "*" denotes the complex conjugate. An orthogonal basis is useful because, when a set of functions is used to approximate a given signal, there should be as little similarity as possible between the various functions, to avoid providing redundant information. The complex exponentials used in the Fourier series are an orthogonal set of functions, and if φ_k(t) = e^{jkω0t}, where ω0 = 2π/T, then this is the complex or exponential Fourier series. See also Fourier Transform, Matrix Operations.

Baud: A measure of data transmission rate, meaning symbols per second. Baud is often misused to mean bits per second; a baud is actually equal to the number of discrete events or transitions per second. There is potential confusion over the proper use of the word baud since, at high data transmission speeds where data compression techniques are used (V.42bis), the number of character bits per second transmitted does not necessarily equal the transmitted data rate in symbols per second.

Baugh-Wooley Multiplier: A type of parallel multiplier which operates on 2's complement data and is widely used in DSP [106]. See also Parallel Multiplier.

Bayes Theorem: See Probability.
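The orthogonality of the complex exponential Fourier basis (Eq. 41 with a = 0, b = T) can be checked numerically. This sketch approximates the integral by a Riemann sum over one period; T = 1 and the sample count are arbitrary assumptions:

```python
import numpy as np

def inner(i, j, T=1.0, n=10_000):
    """Riemann-sum approximation of <phi_i, phi_j>, the integral over
    [0, T) of phi_i(t) * conj(phi_j(t)) dt, for the Fourier basis
    phi_k(t) = exp(j*k*w0*t) with w0 = 2*pi/T."""
    t = np.arange(n) * (T / n)
    w0 = 2 * np.pi / T
    return np.sum(np.exp(1j * i * w0 * t)
                  * np.conj(np.exp(1j * j * w0 * t))) * (T / n)
```

`inner(2, 3)` is numerically zero, since distinct harmonics are orthogonal, while `inner(2, 2)` equals T, the squared norm of a basis function.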
Beamforming: A technique to enhance the sensitivity of a device towards a given direction (the look direction) by exploiting the spatial separation of an array of sensors (microphones or hydrophones, for example). The array could be a linear 1-D array, a 2-D array or even a 3-D array. The primary motivation behind beamforming is often a desire to copy a signal of interest while suppressing spatially disparate interfering signals. Delay-and-sum beamformers simply combine the outputs of a number of sensors (after the signals are delayed to allow constructive interference in the look direction). More advanced adaptive beamforming techniques go further by attempting to null out any signals arriving at the array that are not in the desired look direction. The key mechanisms responsible for the spatial sensitivity of a beamformer are constructive and destructive interference. Bearing estimation is related to beamforming, but not necessarily the same: a bearing estimator extracts Direction of Arrival (DOA) information for signals of interest, while a beamformer produces an enhanced copy of a signal of interest. See also Adaptive Beamformer, Bearing Estimation, Broadside, Endfire, Constructive Interference, Interference, Localization, Spatial Filtering.

[Figure: a delay-and-sum beamformer (DSP beamforming implementation) shown with its resultant beampattern (polar plot of spatial sensitivity); the desired signal impinges on the mainlobe while an interfering signal falls in a null region.]

Beampattern: A plot of the spatial sensitivity of a beamformer (or antenna) as a function of direction. The main lobe and sidelobes are often easily distinguished. Any nulls (directions with virtually no sensitivity) are also clearly distinguished. Beampatterns can be plotted for a single frequency (useful for a narrowband application) or as a broadband measure where the sensitivity in each direction is integrated over the frequency span of interest.
Broadband patterns seldom contain the deep nulls that are present in narrowband patterns. See also Beamformer, Localization.

[Figure: typical beampattern showing the mainlobe and sidelobes; array gain as a function of angle, with -15, -10, -5 and 0 dB contours.]

Bearing Estimation: A classic signal processing problem where it is required to find the angular direction of a number of incoming source signals. In bearing estimation, source signal copy is not a concern. See also Beamforming, Localization.

Beat Frequencies: When two audible tones of similar frequencies are played together they will effectively go in and out of phase with each other and alternately constructively and destructively interfere. Depending on the frequencies and the magnitude of the difference between the tones, they may be aurally perceived as beat frequencies rather than two distinct tones. If the frequency difference is no greater than about 10Hz then the ear will follow the amplitude fluctuations and therefore perceive a low beat frequency. Beat frequencies are heard most clearly for tones between around 300Hz and 600Hz. As the frequency of the tones increases above 1000-1500Hz the tones will be heard distinctly rather than as beats. This phenomenon is consistent with the fact that the neural firings of the auditory system lose synchrony with the incoming sine wave at these frequencies.
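The beating described above can be demonstrated numerically: the trigonometric identity derived below (Eq. 43) predicts that summing 100Hz and 110Hz tones is equivalent to a 105Hz carrier multiplied by a 5Hz envelope. A short NumPy check (the time grid is an arbitrary choice):

```python
import numpy as np

t = np.linspace(0.0, 0.2, 2000)            # 0.2 s span of the two tones
lhs = np.cos(2 * np.pi * 100 * t) + np.cos(2 * np.pi * 110 * t)
rhs = 2 * np.cos(2 * np.pi * 5 * t) * np.cos(2 * np.pi * 105 * t)
# lhs and rhs agree to floating point precision; the 5Hz envelope
# term produces the 10-per-second amplitude fluctuations heard as beats.
```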
Simple trigonometry shows that:

cos A + cos B = 2 cos((A − B)/2) cos((A + B)/2)     (42)

Therefore, if a 100Hz tone and a 110Hz tone are played simultaneously, the composite tone can be written as:

cos(2π100t) + cos(2π110t) = 2 cos(2π10t/2) cos(2π210t/2) = 2 cos(2π5t) cos(2π105t)     (43)

which can be represented as:

[Figure: time waveforms over 0 to 0.2 seconds of a 100Hz tone, a 110Hz tone, and the 100Hz + 110Hz composite tone.]

The composite tone clearly shows the amplitude fluctuation at 10 times per second caused by the 5Hz modulation effect.

A phenomenon called binaural beats (as distinct from the above description of monaural beats) occurs when a tone of one frequency is presented to one ear, and a slightly different tone frequency is presented to the other ear [30]. The sound will appear to fluctuate at a rate corresponding to the difference between the frequencies. See also Audiology, Binaural Beats, Binaural Unmasking, Psychoacoustics.

Bell 103/113: The Bell 103/113 is a modem standard for communication at 300 bits/sec. The Bell 103/113 is a full duplex modem using FSK (frequency shift keying) modulation. The frequencies used are:

                     Originate End (Hz)   Answer End (Hz)
Transmit:  Space          1070                 2025
           Mark           1270                 2225
Receive:   Space          2025                 1070
           Mark           2225                 1270

The transmit level is 0 to -12 dBm and the receive level is 0 to -50 dBm. Although in the mid 1990s modem speeds of 14400 bits/sec are standard and (compressed) bit data rates of 115200 bits/sec are achievable for remote computer communication, the 300 baud modem is still one of the top selling modems! This is due to low rate modems being used for short-time connection applications where only a few bytes of data are exchanged, such as telephone credit card verification, traffic light control, remote metering and security systems.
See also Bell 202, Standards, V-Series Recommendations.

Bell 202: The Bell 202 is a modem standard for communication at 1200 bits/sec. The Bell 202 is a half duplex modem using FSK (frequency shift keying) modulation. The frequencies used are:

           Transmit (Hz)
Space          2200
Mark           1200

See also Bell 103/113, Bell 212, Standards, V-Series Recommendations.

Bell 212: The Bell 212 is a modem standard for communication at 1200 bits/sec. The Bell 212 is a full duplex modem using QPSK (quadrature phase shift keying) modulation. The carrier frequencies used are:

           Originate End (Hz)   Answer End (Hz)
Transmit:       1200                 2400
Receive:        2400                 1200

Each keying carries two bits:

Message (2 bits)   Phase Angle
      00               90°
      01                0°
      10              180°
      11              270°

See also Bell 103/113, Bell 202, Standards, V-Series Recommendations.

Bento: Bento is a multimedia data storage and interchange format whose development was sponsored primarily by Apple Inc., probably with the intention that it would become a de facto standard. The standard is available from ftp://ftp.apple.com/apple/standards/. See also Standards.

BER vs. S/N Test: (Bit Error Rate vs. Signal to Noise Ratio). A test used to measure the ability of a modem (or a digital communication system) to operate over noisy lines with a minimum of data transfer errors. Since even on the best of telephone lines there is always some level of noise, the modem should work with the lowest S/N ratio possible.

[Figure: plot of BER (10^-2 down to 10^-6) vs. S/N (4 to 16 dB) for a typical modem operating at 1200 bits/second.]

Other modem performance characteristics include BER vs. Phase Jitter, which demonstrates the tolerance to phase jitter; BER vs. Receive Level, which measures the sensitivity to the received signal dynamic range (typically 36dB is the minimum desirable); and BER vs.
Carrier Offset, which indicates how the modem performance is affected by the shifts in the carrier frequency encountered in normal public telephone networks (ITU-T specifications allow up to a 7Hz offset).

Bessel Filter: See Filters.

Bidiagonal Matrix: See Matrix Structured - Bidiagonal.

Binary: Base 2, where only the digits 0 and 1 are used to represent numbers, e.g.

MSB                                            LSB
2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128    64    32    16     8     4     2     1
  0     1     0     1     1     0     1     0   = 90
  1     0     0     0     0     0     0     0   = 128
  1     1     0     0     0     0     0     0   = 192

The decimal equivalents of the unsigned 8 bit numbers 01011010, 10000000, and 11000000. See also Binary Point, Two's Complement.

Binary Phase Shift Keying (BPSK): A special case of PSK in which two signals with differing phase exist in the signal set. See also Phase Shift Keying.

Binary Point: The binary point is the base 2 equivalent of the decimal point. Bits after the binary point have a fractional value. See also Fractional Binary, Integer Arithmetic, Two's Complement.

MSB                                                                  LSB
−2^0    2^-1   2^-2    2^-3     2^-4      2^-5       2^-6       2^-7
−1      0.5    0.25    0.125    0.0625    0.03125    0.015625   0.0078125
 0 .     1      0       1        1         0          1          0         = 0.703125
 1 .     0      0       0        0         0          0          0         = −1
 1 .     1      0       0        0         0          0          0         = −0.5

The decimal equivalents of 0.1011010, 1.0000000, and 1.1000000. Note that 2's complement notation can still be used, with the most significant bit having a weighting of −1.

Binaural: Binaural processing refers to an audio system that processes signals for presentation to two ears. See also Monaural, Monophonic, Stereophonic.

Binaural Beats: A phenomenon called binaural beats occurs when a tone of one frequency is presented to one ear, and a slightly different tone frequency is presented to the other ear using headphones. The sound will appear to fluctuate at a rate corresponding to the difference between the frequencies. Binaural beats are a result of interaction within the nervous system between the outputs of the two ears on the way to the brain.
Binaural beats would appear to indicate that the auditory nerve preserves phase information about the acoustic stimulus [30]. See also Audiology, Beat Frequencies, Binaural Unmasking, Psychoacoustics.

    [Figure: A 300Hz tone in one ear and a 310Hz tone in the other; the listener will experience 10 binaural beats per second.]

Binaural Unmasking: If a tone masked by white noise is played into one ear or both ears (diotic stimulus) then the auditory mechanism will not perceive the tone without either increasing the tone sound pressure level (SPL) or decreasing the white noise SPL. However, if the tone + white noise is played into one ear, and the white noise only into the other ear (dichotic stimulus), then the auditory effect of binaural unmasking will actually make the tone more readily detectable. (Binaural unmasking will also occur when noise + tone is input to both ears, but the phase of one of the tones is shifted by 180° relative to the other.)

    [Figure: With noise + tone in both ears the tone is completely masked by the white noise and therefore not perceived; if noise only is played into the right ear, the tone becomes readily detectable.]

Hence the auditory mechanism is providing a form of noise cancellation. As a crude DSP analogy, compare this effect to the adaptive noise canceller, whereby if a (correlated) noise reference is available the noise in a speech + noise signal can be attenuated, thus providing an improved SNR at the canceller output. See also Adaptive Noise Cancellation, Audiometry, Dichotic, Diotic.

Biomedical Signals: Over the last few years biomedical signals such as ECGs, EEGs, Evoked Potentials, and EMGs have been recorded using DSP acquisition hardware, sampling at a few hundred Hertz. There is now considerable work to develop DSP algorithms for analysis, classification, and compression of sampled biomedical signals [48]. IEEE Transactions on Biomedical Engineering is a good source for further information.
See also ECG, EMG, Evoked Potentials.

Bipolar (1): A type of integrated circuit that uses NPN or PNP bipolar transistors in its construction [45].

Bipolar (2): Bipolar refers to the type of signalling method used for digital data transmission, in which either the marks or the spaces are indicated by successively alternating positive and negative polarities. See also Non-return to Zero, Polar.

Bit: A single binary digit; a 0 (a space) or 1 (a mark).

Bit Error Rate (BER): The fraction of bits in error occurring in a received bit stream. BER is calculated as the average number of bits in error divided by the total number of bits in a given data stream. See also BER vs. S/N Test.

Bit Reverse Addressing: Due to the nature of the FFT algorithm it is often required to access data from memory not in an arithmetic sequence (i.e. not 0, 1, 2, etc.) but in a sequence generated by reversing the address bits. As this type of addressing is very common when a DSP processor computes FFTs, this special addressing mode is available in some DSP processors to make programming easier and algorithm execution faster. See also Decimation-in-Time, Decimation-in-Frequency, FFT.

Bit Serial Multiplier: See Parallel Multiplier.

Bitstream: Bitstream (Philips technology) DACs use sigma-delta technology to produce low cost and precise digital to analog converters. See Sigma Delta.

Blackman Window: See Window.

Blackman-Harris Window: See Window.

Blue Book: Shorthand name for the ITU-T regulations published in 1988 in 20 volumes and 61 fascicles with a blue cover! (The ITU were known as the CCITT in 1988.) See also International Telecommunication Union, Red Book, Standards.

Board: See DSP Board.

Bounded: When the upper and lower values of specific parameters of a signal (or function) are known, or can be calculated or inferred from prior knowledge, then that parameter is said to be bounded.

Boxcar Filter: See Moving Average.
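The bit-reversed ordering described in the Bit Reverse Addressing entry above can be sketched in a few lines of Python. This is an illustrative sketch of the index sequence only, not a DSP processor's hardware addressing mode; the function name is our own:

```python
def bit_reverse(index, num_bits):
    """Reverse the num_bits-bit binary representation of index."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)  # shift the LSB of index into result
        index >>= 1
    return result

# Data access order for an 8-point FFT (3 address bits):
order = [bit_reverse(n, 3) for n in range(8)]
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

A DSP processor with a bit-reverse addressing mode generates this sequence in hardware, so the FFT reordering costs no extra instructions.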
Brick Wall Filter: This is a filter having a frequency response that falls off to zero with infinite slope at some specified frequency. Although such filters are desirable in various DSP applications, a true brick wall filter does not exist, and approximations with tolerable errors must be made.

    [Figure: In the ideal brick wall filter, all frequencies below f0 are passed by the filter, and all frequencies above f0 are completely removed.]

Broadband: See Wideband.

Broadband Hiss: If a speech or music signal has a relatively low level of superimposed white noise then this is referred to as broadband hiss. The term hiss is onomatopoeic -- the prolonged sound of the "ss's" gives a good simulation of the phenomenon. See also Dithering, White Noise.

Broadband Integrated Services Digital Network (BISDN): Generally, BISDN refers to the information infrastructure provided by communications companies and institutions. The term BISDN evolved from the Integrated Services Digital Network (ISDN) to be a superset of the hardware and protocols provided by a previously adequate network infrastructure.

Broadside: A beamformer configuration in which the desired signal is located at right angles to the line or plane containing an array of sensors. See also Beamforming, Endfire.

    [Figure: Broadside direction indicated, at 90° to a linear array of sensors.]

Buffer: Usually an area of memory used to store data temporarily. For example, a large stream of sampled data is buffered in memory as 1000 sample chunks prior to digital signal processing. Buffers are also used in data communications to compensate for changes in the rate of data flow (e.g., rate fluctuations due to data compression algorithms).

Buffer Amplifier: An amplifier with a high input impedance and low output impedance that has a voltage gain of one.
If, for example, a sensor outputs an analog voltage that is of the appropriate magnitude to input to an ADC, but it cannot deliver or sink enough current, then a buffer amplifier can be used prior to the ADC. The simplest form of buffer amplifier to build is a voltage follower with gain 1, implemented using an op-amp.

    [Figure: A voltage follower op-amp buffering a very low power voltage to give a high power voltage of the same magnitude.]

Burst Errors: When a large number of bits are incorrect in a relatively short segment of data bits then a burst error has occurred. In burst errors the average bit error rate is greatly exceeded by multiple bit errors. When the number of bits in error is very high then non-interleaved error correction schemes are unlikely to be successful and retransmission of the data may be required. See also Channel Coding, Interleaving, Cross-Interleaved Reed-Solomon Coding.

Bus: The generic name given to a set of wires used to transmit digital information from one point to another. A bus can be on-chip or off-chip. See also DSP Processor.

Busy Tone: Tones at 480 Hz and 620 Hz make up the busy tone for telephone systems.

Butterfly: The name given to the signal flow graph (SFG) element which can be used as a basic computational element to construct an N point fast Fourier transform (FFT) computation. See also FFT.

    [Figure: Butterfly signal flow graph, with branches labelled with the twiddle factor W_N^k and −1.]

Butterworth Filter: See Filters.

Byte: 8 bits. 2 nibbles.

C

Cable (1): One or more conductors (such as copper wire) or other transmission media (such as optical fiber) within a protective sheath (usually plastic) to allow the efficient propagation of signals.

Cable (2): A generic name for cable TV systems using coaxial cable and/or optical fibers to transmit signals. Cable was first introduced into areas of the USA where geographical features prevented normal terrestrial TV reception. Within a few years of its introduction it proved so popular, flexible and reliable that cable became widely available all over the USA.
Currently cable companies are involved in developing digital broadcast systems and interactive TV viewing features.

Cache: A useful means of keeping often used data or information handy, a cache is simply a buffer of memory whose contents are updated according to an algorithm that is designed to minimize the number of data accesses that require looking beyond the cache memory. Both hardware and software implementations of cache algorithms are common in DSP systems.

Call Progress Detection (CPD): A technique for monitoring the connection status during initiation of a telephone call by detecting the presence of call progress signalling tones such as the dialing tone, or the engaged (busy) signals as commonly found in the telephone network.

Carrier Board: A printed circuit board that can host a number of daughter modules providing facilities such as a DSP processor, memory, and I/O channels. A carrier board without daughter modules has no real functionality. See also DSP Board, DSP Processor.

Carry Look-Ahead Adder: See entry for Parallel Adder.

Cassette Tape: See Compact Cassette Tape.

Cauchy-Schwartz Inequality: See Vector Properties - Cauchy-Schwartz.

Causal: A signal produced by a real device or system is said to be causal. If a signal generating device is turned on at time t = t0, then the resultant signal produced exists only after time t = t0:

    y(t) = x(t)   if t ≥ t0
    y(t) = 0      if t < t0                                    (44)

Signals that are not causal are said to be non-causal. Although in the real world all signals are necessarily causal, from a mathematical viewpoint non-causal signals can be useful for the analysis of signals and systems.

Central Processing Unit (CPU): The part of the processor that performs the actual processing operations of addition, multiplication, comparison etc. The size of the arithmetic in the CPU usually defines the processor wordlength. For example the DSP56002 has a 24 bit CPU, meaning that it is a 24 bit processor.
Usually the CPU wordlength matches the data bus width. If a DSP processor is floating point, then the CPU will also be capable of floating point arithmetic. See also DSP Processor.

Channel: The generic name given to the transmission path of any signal, which usually changes the signal characteristics, e.g. a telephone channel. Also used to mean the input or output port of a DSP system. For example, a DSP board with two ADCs and one DAC would be described as a twin channel input, single channel output system.

Channel Coding: This refers to the coding of information data that introduces structured redundancy so that inevitable errors introduced by transmitting symbols over noisy channels will be correctable (or at least detectable) at the receiver. The simplest channel codes are single bit parity checks (a simple block code). Other, more involved block codes and convolutional codes exist. In block coding a block of k data bits is encoded into n code bits to yield a rate k/n code. Block codes tend to have large k and large n. In convolutional coding the coder maintains a memory of previous data bits and outputs n code bits for each k input bits (using not only the input data bits but also those data bits stored in the coder memory) to yield a rate k/n code. Convolutional codes tend to have small values of k and n, with coding strength determined by the amount of memory in the coder. Block and/or convolutional coding techniques can be combined to produce very strong (often cross-interleaved) codes. See also Source Coding, Interleaving, Cross-Interleaved Reed-Solomon Code.

Characteristic Polynomial: In order to conveniently specify the code used for cyclic redundancy coding (CRC) or a pseudo random binary sequence, a characteristic polynomial is often referred to.
For example, the divisor used in ITU-T V.41 error control, 10001000000100001, is easier to represent as:

    X^16 + X^12 + X^5 + 1                                      (45)

The index of each term in this polynomial indicates a 1 in the divisor (i.e. the divisor has 1's at positions 0, 5, 12 and 16). See also Pseudo-Random Binary Sequence.

Chebyshev Filter: See Filters.

Character: Letter, number, punctuation or other symbol. Characters are the basic unit of textual information. In DSP enabled data communication most characters are represented by ASCII codes. See also ASCII, EBCDIC.

Chip: Integrated Circuit.

Chip Interval: The clocking period of a pseudo random binary sequence generator. See also Pseudo Random Binary Sequence Generator.

Cholesky Decomposition: See Matrix Decompositions - Cholesky.

Chorus: A music effect where a delayed, and perhaps low pass filtered, version of a signal is added to the original signal to create a chorus or echoic sound. See also Music, Music Synthesis.

Chromatic Scale: The complete set of 12 notes in one octave of the Western music scale is often referred to as the chromatic scale. Each adjacent note in the chromatic scale differs by one semitone, which corresponds to multiplying the lower frequency by the twelfth root of 2, i.e. 2^(1/12) = 1.0594631... The chromatic scale is also known as the equitempered scale. See also Western Music Scale.

Circulant Matrix: See Matrix Structured - Circulant.

Circular Buffers: This is effectively a programming concept that allows fast and efficient implementation of shift registers in memory, to allow convolutions, FIR filters, and correlations to be executed with a minimum of data movement as each new data sample arrives. Modulo registers and indirect pointers facilitate circular buffers.

Circular Reasoning: See Reasoning, Circular.

CISC: Complex Instruction Set Computer (see RISC definition).

Clipping: The nonlinear process whereby the value of an input voltage is limited to some maximum and minimum value.
An analog signal with a magnitude larger than the upper and lower bounds ±Vmax of an ADC chip will be clipped: any voltage above Vmax will be limited to Vmax and the information lost. Clipping effects frequently occur in amplifiers when the amplification of the input signal results in a value greater than the power rail voltages.

    [Figure: A clipping circuit, with Vout = Vin for |Vin| < Vmax, and Vout = ±Vmax for |Vin| ≥ Vmax.]

Clock: A device which produces a periodic square wave that can be used to synchronize a DSP system. Current technology can produce extremely accurate clocks into the MHz range of frequencies.

Clock Jitter: If the clock edges of a clock vary in time about their nominal position in a stochastic manner, then this is clock jitter. In ADCs and DACs clock jitter will manifest itself as a raising of the noise floor [78]. See also Quantization Noise.

CMOS (Complementary Metal Oxide Silicon): The (power efficient) integration technology used to fabricate most DSP processors.

Cochlea: The mechanics of the cochlea convert the vibrations from the bones of the middle ear (i.e., the ossicles, often called the hammer, anvil and stirrup) into excitation of the acoustic nerve endings. This excitation is perceived as sound by the brain. See also Ear.

Codebook Coding: A technique for data compression based on signal prediction. The compressed estimate is derived by finding the model that most closely matches the signal based on previous signals. Only the error between the selected model and the actual signal needs to be transmitted. For many types of signal this provides excellent data compression since, provided the codebook is sufficiently large, errors will be small. See also Compression.

Codec: A COder and DECoder. Often used to describe a matched pair of A/D and D/A converters on a single CODEC chip, usually with logarithmic quantizers (A-law for Europe and µ-law for the USA).
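The Circular Buffers entry above can be illustrated with a short Python sketch, in which a modulo index stands in for the modulo address register of a DSP processor (the class and variable names here are illustrative only):

```python
class CircularBuffer:
    """FIR delay line held in a circular buffer: each new sample
    overwrites the oldest, so no samples are ever shifted in memory."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.samples = [0.0] * len(self.weights)
        self.newest = 0  # modulo pointer to the most recent sample

    def filter(self, x):
        self.newest = (self.newest + 1) % len(self.samples)  # modulo wrap
        self.samples[self.newest] = x
        # y(k) = sum over n of w[n] * x(k - n), reading back from the newest sample
        return sum(w * self.samples[(self.newest - n) % len(self.samples)]
                   for n, w in enumerate(self.weights))

fir = CircularBuffer([2.0, 1.0, 0.5])
out = [fir.filter(x) for x in [1.0, 0.0, 0.0, 0.0]]
print(out)  # [2.0, 1.0, 0.5, 0.0] -- a unit impulse returns the weights in turn
```

Feeding in a unit impulse returns the filter weights one by one, confirming that the delay line works although no data is ever moved.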
Coded Excited Linear Prediction Vocoders (CELP): The CELP vocoder is a speech encoding scheme that can offer good quality speech at relatively low bit rates (4.8kbits/sec) [133]. The drawback is that this vocoder scheme has a very high computational requirement. CELP is essentially a vector quantization scheme using a codebook at both analyzer and synthesizer. Using CELP, a 200Mbyte hard disk drive could store close to 100 hours of digitized speech. See also Compression.

Coherent: Refers to a detection or demodulation technique that exploits and requires knowledge of the phase of the carrier signal. Incoherent or Noncoherent refers to techniques that ignore or do not require this phase information.

Color Subsampling: A technique widely used in video compression algorithms such as MPEG1. Color subsampling exploits the fact that the eye is less sensitive to the color (or chrominance) part of an image compared to the luminance part. Since the eye is not as sensitive to changes in color in a small neighborhood of a given pixel, this information is subsampled by a factor of two in each dimension. This subsampling results in one-fourth of the number of chrominance pixels (for each of the two chrominance fields) as are used for the luminance field (or brightness). See also Moving Picture Experts Group.

Column Vector: See Vector.

Comb Filter: A comb digital filter is so called because the magnitude frequency response is periodic and resembles that of a comb. (It is worth noting that the term "comb filter" is not always used consistently in the DSP community.) Comb filters are very simple to implement, either as an FIR filter type structure where all weights are either 1 or 0, or as single pole IIR filters. Consider a simple FIR comb filter:

    y(k) = x(k) ± x(k − N)

    [Figure: N delay elements feed x(k − N) to an adder/subtractor together with x(k).]

The simple comb filter can be viewed as an FIR filter where the first and last filter weights are 1, and all other weights are zero.
The comb filter can be implemented with only a shift register and an adder; multipliers are not required. If the two samples are added then the comb filter has a linear gain factor of 2 (i.e. 6 dB) at 0 Hz (DC), thus in some sense giving a low pass characteristic at low frequencies. If they are subtracted, the filter has a gain of 0 at DC, giving in some sense a band stop filter characteristic at low frequencies. The transfer function for the FIR comb filters can be found as:

    Y(z) = X(z) ± z^-N X(z) = (1 ± z^-N) X(z)

    ⇒ H(z) = Y(z)/X(z) = 1 ± z^-N                              (46)

The zeroes of the comb filter are the N roots of the z-domain polynomial 1 ± z^-N. For the case where the samples are subtracted:

    1 − z^-N = 0  ⇒  z_n^N = 1,              n = 0 ... N−1
                  ⇒  z_n^N = e^(j2πn),       noting e^(j2πn) = 1
                  ⇒  z_n = e^(j2πn/N)                          (47)

And for the case where the samples are added:

    1 + z^-N = 0  ⇒  z_n^N = −1,                  n = 0 ... N−1
                  ⇒  z_n^N = e^(j2π(n + 1/2)),    noting e^(j2π(n + 1/2)) = −1
                  ⇒  z_n = e^(j2π(n + 1/2)/N)                  (48)

As an example, consider a comb filter H(z) = 1 + z^-8 and a sampling rate of fs = 10000 Hz. The impulse response h(n), frequency response H(f), and zeroes of the filter can be illustrated as:

    [Figure: The impulse response, z-domain plot of the zeroes, and magnitude frequency response of the comb filter H(z) = 1 + z^-8. Note that the comb filter is like a set of frequency selective bandpass filters, with the first half-band filter having a low pass characteristic. The number of bands from 0 Hz to fs/2 is N/2.]
The zeroes are spaced equally around the unit circle and symmetrically about the x-axis, with no zero at z = 1. (There is a zero at z = −1 only if N is odd.)

For the comb filter H(z) = 1 − z^-8 and a sampling rate of fs = 10000 Hz, the impulse response h(n), frequency response H(f), and zeroes of the filter are:

    [Figure: The impulse response, z-domain plot of the zeroes, and magnitude frequency response of the comb filter H(z) = 1 − z^-8. The zeroes are spaced equally around the unit circle and symmetrically about the x-axis. There is a zero at z = 1; there is no zero at z = −1 if N is odd.]

FIR comb filters have linear phase and are unconditionally stable (as are all FIR filters). For more information on unconditional stability and linear phase see the entry for Finite Impulse Response Filters.

Another type of comb filter magnitude frequency response can be produced from a single pole IIR filter:

    [Figure: A single pole IIR comb filter, with feedback weight b applied to y(k − N) via N delay elements.]

The closer the weight value b is to 1, the sharper the teeth of the comb filter in the frequency domain (see below). b must of course be less than 1, or instability results. This type of comb filter is often used in music synthesis and for soundfield processing [43]. Unlike the FIR comb filter, note that this comb filter does require at least one multiplication operation.
Consider the difference equation of the above single pole IIR comb filter:

    y(k) = x(k) ± b y(k − N)

    ⇒ G(z) = Y(z)/X(z) = 1 / (1 ∓ b z^-N)                      (49)

For a sampling rate of fs = 10000 Hz, N = 8 and b = 0.6, the impulse response g(n), frequency response G(f), and poles of the filter G(z) = 1/(1 − 0.6z^-8) are:

    [Figure: The z-domain plot of the filter poles and magnitude frequency response of the one pole comb filter G(z) = 1/(1 − 0.6z^-8). The poles are inside the unit circle and lie on a circle of radius 0.6^(1/8) = 0.938...]

As the feedback weight value b is decreased (closer to 0), the poles move away from the unit circle towards the origin, and the peaks of the magnitude frequency response become less sharp and provide less gain. Increasing the feedback weight b to be very close to 1, the "teeth" of the filter become sharper and the gain increases:

    [Figure: The z-domain plot of the filter poles and magnitude frequency response of the one pole comb filter G(z) = 1/(1 − 0.9z^-8). The poles are just inside the unit circle and lie on a circle of radius 0.9^(1/8) = 0.987...]

Of course if b is increased such that b ≥ 1 then the filter is unstable. The IIR comb filter is mainly used in computer music [43] for simulation of musical instruments and in soundfield processing [33] to simulate reverberation. Finally, it is worth noting again that the term "comb filter" is used by some to refer to the single pole IIR comb filter, and the term "inverse comb filter" to the FIR comb filter, both described above.
Other authors refer to both as comb filters. The uniting feature of all comb filters, however, is the periodic (comb like) magnitude frequency response. See also Digital Filter, Finite Impulse Response Filter, Finite Impulse Response Filter - Linear Phase, Infinite Impulse Response Filter, Moving Average Filter.

Comité Consultatif International Télégraphique et Téléphonique (CCITT): The English translation of this French name is the International Consultative Committee on Telegraphy and Telephony; the committee is now known as the ITU-T. The ITU-T (formerly CCITT) is an advisory committee to the International Telecommunications Union (ITU) whose recommendations covering telephony and telegraphy have international influence among telecommunications engineers and manufacturers. See also International Telecommunication Union, ITU-T.

Comité Consultatif International Radiocommunication (CCIR): The English translation of this French name is the International Consultative Committee on Radiocommunication; the committee is now known as the ITU-R. The ITU-R (formerly CCIR) is an advisory committee to the International Telecommunications Union (ITU) whose recommendations covering radiocommunications have international influence among radio engineers and manufacturers. See also International Telecommunication Union, ITU-R.

Comité Européen de Normalisation Electrotechnique (CENELEC): CENELEC is the European Committee for Electrotechnical Standardization. It provides European standards over a wide range of electrotechnology. CENELEC has drawn up an agreement with the European Telecommunications Standards Institute (ETSI) to study telecommunications, information technology and broadcasting. See also European Telecommunications Standards Institute, International Telecommunication Union, International Organisation for Standards, Standards.
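Returning to the Comb Filter entry above, the gain of H(z) = 1 ± z^-N at any frequency can be checked numerically by evaluating H on the unit circle. This is a minimal sketch using only the standard library; the function name is illustrative:

```python
import cmath

def comb_response(f, N, fs, sign):
    """Frequency response of y(k) = x(k) +/- x(k-N), i.e. H(z) = 1 +/- z^-N,
    evaluated on the unit circle at frequency f (Hz) for sampling rate fs."""
    z = cmath.exp(2j * cmath.pi * f / fs)
    return 1 + sign * z ** (-N)

fs, N = 10000.0, 8
# H(z) = 1 + z^-8: linear gain 2 (6 dB) at DC, first null at fs/(2N) = 625 Hz
print(abs(comb_response(0, N, fs, +1)))    # gain ~2 at DC
print(abs(comb_response(625, N, fs, +1)))  # ~0 (a zero of the filter)
# H(z) = 1 - z^-8: band stop (gain ~0) at DC
print(abs(comb_response(0, N, fs, -1)))
```

The nulls land at the zero locations derived in Eqs. 47 and 48, equally spaced around the unit circle.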
Common Intermediate Format (CIF): The CIF image format has 288 lines by 360 pixels/line of luminance information and 144 x 180 of chrominance information, and is used in the ITU-T H261 digital video recommendation. A reduced version of CIF called quarter CIF (QCIF) is also defined in H261. The choice between CIF and QCIF depends on channel bandwidth and desired quality. See also H-series Recommendations, International Telecommunication Union, Quarter Common Intermediate Format.

Compact Cassette Tape: Compact cassette tapes were first introduced in the 1960s for convenient home recording and audio replay. By the end of the 1970s compact cassette was one of the key formats for the reproduction of music. Currently available compact cassettes afford a "good" response of about 65dB dynamic range from 100Hz to 12000Hz or better. Compact cassette outlived vinyl records, and is still a very popular format for music, particularly in automobile audio systems. In the early 1990s DCC (Digital Compact Cassette) was introduced, which had backwards compatibility with compact cassette. See also Digital Compact Cassette.

Compact Disc (CD): The digital audio system that stores two channels (stereo) of 16-bit music sampled at 44.1kHz. Current CDs allow almost 70 minutes of music to be stored on one disc (without compression). This is equivalent to a total of:

    2 × 44100 × 70 × 60 × 16 = 5927040000 bits of information.     (50)

CDs use cross-interleaved Reed-Solomon coding for error protection. See also Digital Audio Tape (DAT), Red Book, Cross-Interleaved Reed-Solomon Coding.

Compact Disc-Analogue Records Debate: Given that the bandwidth of hi-fidelity digital audio systems is up to 22.05kHz for compact disc (CD) and 24kHz for DAT, it would appear that the full range of hearing is more than covered. However this is one of the key issues of the CD-analogue records debate.
The argument of some analog purists is that although humans cannot perceive individual tones above 20kHz, when listening to musical instruments which produce harmonic frequencies above the human range of hearing these high frequencies are perceived in some "collective" fashion. This adds to the perception of live as opposed to recorded music; the debate will probably continue into the next century. See also Compact Disc, Frequency Range of Hearing, Threshold of Hearing.

Compact Disc ROM (CD-ROM): As well as music, CDs can be used to store general purpose computer data, or even video. Thus the disc acts like a Read Only Memory (ROM).

Companders: Compressor and expander (compander) systems are used to improve the SNR of channels. Such systems initially attenuate high level signal components and amplify low level signals (compression). When the signal is received, the lower level signals appear at the receiving end at a level above the channel noise, and when expansion (the inverse of the compression function) is applied an improved signal to noise ratio is maintained. In addition, the original signal is preserved by the inverse relationship between the compression and expansion functions. In the absence of quantization, companders provide two inverse 1-1 mappings that allow the original signal to be recovered exactly. Quantization introduces an irreversible distortion, of course, that does not allow exact recovery of the original signal. See also A-law and µ-law.

Comparator: A device which compares two inputs, and gives an output indicating which input was the largest.

Complex Base: In everyday life base 10 (decimal) is used for numerical manipulation, and inside computers base 2 (binary) is used. When complex numbers are manipulated inside a DSP processor, the real parts and complex parts are treated separately. Therefore to perform a complex multiplication:

    (a + jb)(c + jd) = (ac − bd) + j(ad + bc)                  (51)

where 16 bit numbers are used to represent a, b, c, and d will require four separate real number multiplications and two additions. Therefore an interesting alternative (although not, to the authors' knowledge, used in practice) is to use the complex base (1 + j), where only the digits 0, 1, and j are used. Setting up a table of the powers of this base gives:

    (1+j)^4   (1+j)^3   (1+j)^2   (1+j)^1   (1+j)^0    Complex Decimal
      −4      −2+2j       2j        1+j        1
       0        0          1         1         0          1 + 3j
       0        0          0         j         0         −1 + j
       0        1          0         j         0         −3 + 3j

Numbers in the complex base (1 + j) can then be arithmetically manipulated (addition, subtraction, multiplication), although this is not as straightforward as for binary!
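The digit expansions in the complex base table above can be verified with a short script. This is a curiosity sketch only (as the entry notes, the scheme is not used in practice), and the function name is our own:

```python
def from_complex_base(digits):
    """Evaluate a digit string, most significant digit first,
    in the complex base (1+j); digits are drawn from {0, 1, 1j}."""
    value = 0 + 0j
    for d in digits:
        value = value * (1 + 1j) + d  # Horner evaluation in base (1+j)
    return value

print(from_complex_base([1, 0, 0, 0, 0]))    # (1+j)^4 = (-4+0j)
print(from_complex_base([0, 0, 1, 1, 0]))    # (1+j)^2 + (1+j) = (1+3j)
print(from_complex_base([0, 0, 0, 1j, 0]))   # j(1+j) = (-1+1j)
```

Each multiplication by (1+j) here is exact in floating point for these small values, so the results match the table entries exactly.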
Therefore to perform a complex multiplication of: ( a + jb ) ( c + jd ) = ( ac – bd ) + j ( ad + bc ) (51) where 16 bit numbers are used to represent a, b, c, and d will require four separate real number multiplications and two additions. Therefore an interesting alternative (although not used in an practice to the authors’ knowledge) is to use the complex base ( 1 + j ) , where only the digits 0, 1, and j are used. Setting up a table of the powers of this base gives: (1+j)4 (1+j)3 (1+j)2 (1+j)1 (1+j)0 -4 -2+2j 2j 1+j 1 0 0 1 1 0 1 + 3j 0 0 0 j 0 -1 - j 0 j 1 1 1 j 1 0 1 1 0 -3 + 3j Complex Decimal Numbers in the complex base ( 1 + j ) can then be arithmetically manipulated (addition, subtraction, multiplication) although this is not as straightforward as for binary! Complex Conjugate: 63 Complex Conjugate: A complex number is conjugated by negating the complex part of the number. The complex conjugate is often denoted by a "*". For example, if a = 5 + 7j , then a∗ = 5 – 7j . (A complex number and its conjugate are often called a conjugate pair.) Note that the product of aa* is always a real number: aa∗ = ( 5 + 7j ) ( 5 – 7j ) = 25 + 35j – 35j + 49 = 25 + 49 = 74 (52) and can clearly be calculated by summing the squares of the real and complex parts. (Taking the square root of the product aa∗ is often referred to as the magnitude of a complex number.) The conjugate of a complex number expressed as complex exponential is obtained by negating the exponential power: ( e jω )∗ = e – jω (53) e jω = cos ω + j sin ω , and (54) e –jω = cos ( – ω ) + j sin ( – ω ) = cos ω – j sin ω (55) This can be easily seen by noting that: given that cosine is an even function, and sine is an odd function. Therefore: e jω e – jω = e 0 = cos2 ω + sin2 ω (56) A simple rule for taking a complex conjugate is: “replace any j by -j “. See also Complex Numbers. 
Complex Conjugate Reciprocal: The complex conjugate reciprocal of a complex number is found by taking the reciprocal of the complex conjugate of the number. For example, if z = a + bj, then the complex conjugate reciprocal is:

    1/z* = 1/(a − bj) = (a + bj)/(a² + b²)                     (57)

See also Complex Numbers, Pole-Zero Flipping.

Complex Exponential Functions: An exponent of a complex number times t, the time variable, provides a fundamental and ubiquitous signal type for linear systems analysis: the damped exponential. These signals describe many electrical and mechanical systems encountered in everyday life, like the suspension system of an automobile. See also Damped Exponential.

Complex LMS: See LMS Algorithm.

Complex Numbers: A complex number contains both a real part and an imaginary part. The imaginary part is multiplied by the imaginary number j, where j is the square root of −1. (In other branches of applied mathematics i is usually used to represent the imaginary number; however in electrical engineering j is used because the letter i is used to denote electrical current.) For the complex number:

    a + jb                                                     (58)

a is the real part, where a ∈ ℜ (ℜ is the set of real numbers), and jb is the imaginary part, where b ∈ ℜ. Complex arithmetic can be performed and the result expressed as a real part and an imaginary part. For addition:

    (a + jb) + (c + jd) = (a + c) + j(b + d)                   (59)

and for multiplication:

    (a + jb)(c + jd) = (ac − bd) + j(ad + bc)                  (60)

Complex number notation is used to simplify Fourier analysis by allowing the expression of complex sinusoids using the complex exponential e^jω = cos ω + j sin ω. Also, in DSP complex numbers represent a convenient way of representing a two dimensional space, for example in an adaptive beamformer (a two dimensional space), or in an adaptive decision feedback equaliser, where the in-phase component is represented as a real number and the quadrature phase component as an imaginary number.
See also Complex Conjugate, Complex Sinusoid.

Complex Plane: The complex plane allows the representation of complex numbers by plotting the real part of a complex number on the x-axis, and the imaginary part of the number on the y-axis.

[Figure: the complex plane, with the real part ℜ on the x-axis and the imaginary part ℑ on the y-axis, showing the plotted points 2 + 3j and −3.51 − 3.49j.]

If a complex number is written as a complex exponential, then the complex plane plot can be interpreted as a phasor diagram, such that for the complex number a + jb:

a + jb = Me^jθ   (61)

where

M = √(a² + b²)  and  θ = tan⁻¹(b/a)   (62)

If θ is a time dependent function such that θ = ωt, then the phasor will rotate in a counter-clockwise direction with angular frequency of ω radians per second (or ω/(2π) rotations per second, i.e., cycles per second or Hertz). See also z-plane, Complex Exponential.

[Figure: a phasor of magnitude M at angle θ to the real axis, with real part a and imaginary part b, rotating counter-clockwise with angular frequency ω.]

Conjugate Reciprocal: See Complex Conjugate Reciprocal.

Complex Roots: When the roots of a polynomial are calculated, if there is no real solution, then the roots are said to be complex. As an example consider the following quadratic polynomial:

y = x² + x + 1   (63)

The roots of this polynomial are where y = 0. Geometrically this is where the graph of y cuts the x-axis. However, plotting this polynomial it is clear that the graph does not cut the x-axis:

[Figure: plot of y = x² + x + 1, which lies entirely above the x-axis and therefore has no real roots.]

In this case the roots of the polynomial are not real. Using the quadratic formula we can calculate the roots as:

x = (−1 ± √(1² − 4))/2 = (−1 ± √3 j)/2   (64)

and therefore:

x² + x + 1 = (x + 1/2 − (√3/2)j)(x + 1/2 + (√3/2)j)   (65)

This example indicates the fundamental utility of complex number systems. Note that the coefficients of the polynomial are real numbers. It is obvious from the plot of the polynomial that no real solution to y(x) = 0 exists. However, the solution does exist if we choose x from the larger set of complex numbers.
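The quadratic-formula calculation above can be sketched in code (an illustration of ours, not from the original text), using Python's `cmath` module so that a negative discriminant yields complex roots:

```python
# Sketch: complex roots of y = x^2 + x + 1 via the quadratic formula.
import cmath

def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0, complex if the discriminant < 0."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

r1, r2 = quadratic_roots(1, 1, 1)   # the conjugate pair -1/2 +/- (sqrt(3)/2)j
print(r1, r2)
print(abs(r1 * r1 + r1 + 1))        # ~0: r1 really is a root
```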
In applications involving linear systems, these complex solutions provide a tremendous amount of information about the nature of the problem. Thus real world phenomena can be understood and predicted simply and accurately in a way not possible without the intuition provided by complex mathematics. See also Poles, Zeroes.

Complex Sinusoid: See Damped Exponential.

Compression: Over the last few years compression has emerged as one of the largest areas of real time DSP application for digital audio and video. The simple motivation is that the bandwidth required to transmit digital audio and video signals is considerably higher than that required for analogue transmission of the baseband analogue signal, and also that storage requirements for digital audio and video are very high. Therefore data rates are reduced by essentially reducing the data required to transmit or store a signal, while attempting to maintain the signal quality. For example, the data rate of a stereo CD sampling at 44.1kHz, using 16 bit samples on stereo channels, is:

Data Rate = 44100 × 16 × 2 = 1411200 bits/sec   (66)

The often quoted CD transmission bandwidth (assuming binary signalling) is 1.5MHz. Compare this bandwidth with the equivalent analog bandwidth of around 30kHz for two 15kHz analog audio channels. The storage requirements for 60 minutes of music in CD format are:

CD Storage Requirement = 44100 × 2 × 2 × 60 × 60 = 635 Mbytes/60 minutes   (67)

In general, therefore, CD quality PCM audio is difficult to transmit, and storage requirements are very high. As discussed above, if the sampling rate is reduced or the data wordlength reduced, then of course the data rate will be reduced; however, the audio quality will also be affected. Therefore there is a requirement for audio compression algorithms which will reduce the quantity of data, but will not reduce the perceived quality of the audio.
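The CD figures of Eqs. 66 and 67 can be reproduced with a few lines of arithmetic (an illustrative sketch of ours, not from the original text):

```python
# Sketch: data rate and storage for 44.1 kHz, 16 bit, stereo PCM audio.
SAMPLE_RATE = 44100        # samples per second per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2

bit_rate = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS
print(bit_rate)            # 1411200 bits/sec, as in Eq. 66

# 60 minutes of audio in bytes (16 bits = 2 bytes per sample)
storage = SAMPLE_RATE * 2 * CHANNELS * 60 * 60
print(storage / 1e6)       # ~635 Mbytes, as in Eq. 67
```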
For telecommunications where speech is coded at 8kHz using, for example, 8 bit words, the data rate is 64000 bits per second. The typical bandwidth of a telephone line is around 4000Hz, and therefore powerful compression algorithms are clearly necessary. Similarly teleconferencing systems need to compress speech coded at the higher rate of 16 kHz, and a video signal. Ideally no information will be lost by a compression algorithm (i.e. lossless). However, the compression achievable with lossless techniques is typically quite limited. Therefore most audio compression techniques are lossy, such that the aim of the compression algorithm is to remove the components of the signal that do not matter, such as periods of silence, or sounds that will not be heard due to the psychoacoustic behaviour of the ear whereby loud sounds mask quieter ones. For hi-fidelity audio the psychoacoustic or perceptual coding technique is now widely used to compress by factors between 2:1 and almost 12:1. Two recent music formats, the mini-disc and DCC (digital compact cassette), both use perceptual coding techniques and produce compression of 5:1 and 4:1 respectively with virtually no (perceptual) degradation in the quality of the music. Digital audio compression will continue to be a particularly large area of research and development over the next few years.
Applications that will be enabled by real time DSP compression techniques include:

Telecommunications: Using toll-quality telephone lines to transmit compressed data and speech;

Digital Audio Broadcasting (DAB): DAB data rates must be as low as possible to minimise the required bandwidth;

Teleconferencing/Video-phones: Teleconferencing or videophones via telephone circuits and cellular telephone networks;

Local Video: Using image/video compression schemes, medium quality video broadcast for organisations such as the police, hospitals etc. is feasible over telephones, ISDN lines, or AM radio channels;

Audio Storage: If a signal is compressed by a factor of M, then the amount of data that can be stored on a particular medium increases by a factor of M.

The table below summarises a few of the well known audio compression techniques for both hi-fidelity audio and telecommunications. Currently there exist many different “standard” compression algorithms, and different algorithms have different performance attributes, some remaining proprietary to certain companies.

  Algorithm        Compression   Bit rate,        Audio       Example
                   Ratio         kbits/sec        Bandwidth   Application
  PASC             4:1           384              20kHz       DCC
  Dolby AC-2       6:1           256              20kHz       Cinema Sound
  MUSICAM          4:1 to 12:1   192 to 256       20kHz       Professional Audio
  NICAM            2:1           676              16kHz       Stereo TV audio
  ATRAC            5:1           307              20kHz       Mini-disc
  ADPCM (G721)     8:5 to 4:1    16, 24, 32, 40   4kHz        Telecommunications
  IS-54 VSELP      8:1           8                4kHz        Telecommunications
  LD-CELP (G728)   4:1           16               4kHz        Telecommunications

Video compression schemes are also widely researched, developed and implemented. The best known schemes are the Moving Picture Experts Group (MPEG) standards, which in fact cover both audio and video, and the ITU H-Series Recommendations (H261 etc.). The Joint Photographic Experts Group (JPEG) standards and Joint Bi-level Image Group (JBIG) consider the compression of still images.
See also Adaptive Differential Pulse Code Modulation, Adaptive Transform Acoustic Coding (ATRAC), Entropy Coding, Huffman Coding, Arithmetic Coding, Differential Pulse Code Modulation, Digital Compact Cassette, G-Series Recommendations, H-Series Recommendations, Joint Photographic Experts Group, MiniDisc, Moving Picture Experts Group, Transform Coding, Precision Adaptive Subband Coding, Run Length Encoding.

Condition Code Register (CCR): The register inside a DSP processor which contains information on the result of the last instruction executed by the processor. Typically bits (or flags) in the CCR will indicate whether the previous instruction had a zero result or a positive result, whether overflow occurred, and the value of the carry bit. The CCR bits are then used to make conditional decisions (branching). The CCR is sometimes called the Status Register (SR). See also DSP Processor.

Condition Number: See Matrix Properties - Condition Number.

Conditioning: See Signal Conditioning.

Conductive Hearing Loss: If there is a defect in the middle ear this can often reduce the transmission of sound to the inner ear [30]. A conductive hearing loss can be caused by as simple a problem as excessive wax in the ear. The audiogram of a person with a conductive hearing loss will often indicate that the hearing loss is relatively uniform over the hearing frequency range. In general a conductive hearing loss can be alleviated with an amplifying hearing aid. See also Audiology, Audiometry, Ear, Hearing Aids, Hearing Impairment, Loudness Recruitment, Sensorineural Hearing Loss, Threshold of Hearing.

Conjugate: See Complex Conjugate.

Conjugate Pair: See Complex Conjugate.

Conjugate Transpose: See Matrix Properties - Hermitian Transpose.

Constructive Interference: The addition of two waveforms with nearly identical phase. Constructive interference is exploited to produce resonance in physical and electrical systems.
Constructive interference is also responsible for energy peaks in diffraction patterns. See also Destructive Interference, Beamforming, Diffraction.

[Figure: incident waves reflecting at a boundary. Where wave peaks (or wave valleys) coincide, constructive interference occurs; where a wave peak meets a wave valley, destructive interference, i.e., cancellation, occurs.]

Continuous Phase Modulation (CPM): A type of modulation in which abrupt phase changes are avoided to reduce the bandwidth of the modulated signal. CPM requires increased decoder complexity. See also Minimum Shift Keying, Viterbi Algorithm.

Continuous Variable Slope Delta Modulator (CVSD): A speech compression technique that was used before ADPCM became popular and standardized by the ITU [133]. Although CVSD generally produces lower quality speech, it is less sensitive to transmission errors than ADPCM. See also Compression, Delta Modulation.

Control Bus: A collection of wires on a DSP processor used to transmit control information on chip and off chip. An example of control information is stating whether memory is to be read from, or written to. This would be indicated by the single R/W line. See also DSP Processor.

Convergence: Algorithms, such as adaptive algorithms, attempt to find a particular solution to a problem by converging or iterating to the correct solution. Convergence implies that the correct solution is found by continuously reducing the error between the current iterated value and the true solution. When the error is zero (or, more practically, relatively small), the algorithm is said to have converged. For example, consider an algorithm which will update the value of a variable xn to converge to the square root of a number, a. The iterative update is given by:

xn+1 = (1/2)(xn + a/xn)   (68)

where the initial guess, x0, is a/2. The error en = xn − √a will reduce at each iteration, and converge to zero.
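The square-root iteration of Eq. 68 can be sketched as follows (an illustration of ours, not from the original text):

```python
# Sketch: the iteration x_{n+1} = (x_n + a/x_n)/2 of Eq. 68,
# starting from the initial guess x_0 = a/2.
import math

def iterate_sqrt(a, iterations=6):
    x = a / 2.0                  # initial guess x_0
    history = [x]
    for _ in range(iterations):
        x = 0.5 * (x + a / x)    # Eq. 68 update
        history.append(x)
    return history

xs = iterate_sqrt(30.0)
print(xs[-1], math.sqrt(30.0))   # converges towards sqrt(30), about 5.477
```

Each iteration roughly squares the accuracy, so a handful of updates is enough for a close answer.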
Because most algorithms converge asymptotically, convergence is often stated to have occurred when a specified error quantity is less than a particular value.

[Figure: finding the square root of a = 30 using the iterative algorithm, which converges to √30 = 5.477. The left plot shows the variable xn and the right plot the error en against iteration n; after only 6 iterations the algorithm has converged to within 0.03 of the correct answer.]

Another example is a system identification application using an adaptive LMS FIR filter to model an unknown system. Convergence is said to have occurred when the mean squared error between the output of the actual system and the modelled one (given the same input) is less than a certain value determined by the application designer. Algorithms that do not converge, and perhaps diverge, are usually labelled as unstable. See also Adaptive Signal Processing, Iterative Techniques.

Convolution: When a signal is input to a particular linear system the impulse response of the system is convolved with the input signal to yield the output signal. For example, when a sampled speech signal is operated on by a digital low pass filter, then the output is formed from the convolution of the input signal and the impulse response of the low pass filter:

y(n) = h(n) ⊗ x(n) = Σk h(k) x(n − k)   (69)

[Figure: graphical illustration of the convolution of a signal x(n) with an impulse response h(n). For n < 0 both x(n) and h(n) are zero, and the convolution output is 0. At each output time n the summation is performed over the summation variable k, with the time-reversed sequence x(n − k) slid past h(k); the figure shows the overlap and the resulting output y(n) for n = −1, 0, 1, 2 and 7.]

Cooley-Tukey: J.W. Cooley and J.W. Tukey published a noteworthy paper in 1965 highlighting that the discrete Fourier transform (DFT) could be computed in fewer computations by using the fast Fourier transform (FFT) [66]. Reference to the Cooley-Tukey algorithm usually means the FFT.
See also Fast Fourier Transform, Discrete Fourier Transform.

Co-processor: Inside a PC, a processor that is additional to the general purpose processor (such as the Intel 80486) is described as a co-processor and will usually only perform demanding computational tasks. For multi-media applications, DSP processors inside the PC to facilitate speech processing, video and communications are co-processors.

CORDIC: An arithmetic technique that can be used to calculate sin, cos, tan and other trigonometrical values using only shifts and adds of binary operands [25].

Core: All DSP applications require very fast MAC operations to be performed; however, the algorithms to be implemented, and the necessary peripherals to input data, memory requirements, timers and on-chip CODEC requirements are all slightly different. Therefore companies like Motorola are releasing DSP chips which have a common core but have on-chip special purpose modules and interfaces. For example, Motorola’s DSP56156 has a 5616 core but with other modules, such as an on-chip CODEC and PLL, to tailor the chip for telecommunications applications. See also DSP Processor.

Correlation: If two signals are correlated then this means that they are in some sense similar. Depending on how similar they are, signals may be described as being weakly correlated or strongly correlated. If two signals, x(k) and y(k), are ergodic then the correlation function, rxy(n), can be estimated as:

r̂xy(n) = (1/(2M+1)) Σ (from k = −M to M) x(k) y(n+k),  for large M   (70)

Taking the discrete Fourier transform (DFT) of the cross correlation function gives the cross spectral density. See also Autocorrelation.

Correlation Matrix: Assuming that a signal x(k) is a wide sense stationary ergodic process, a 3 × 3 correlation matrix can be formed by taking the expectation, E{.}, of the elements of the matrix formed by multiplying the signal vector, x(k) = [x(k) x(k−1) x(k−2)]T, by its transpose to produce the correlation matrix:

  R = E[ x(k) x(k)T ]

        | x²(k)          x(k)x(k−1)      x(k)x(k−2)    |
    = E | x(k)x(k−1)     x²(k−1)         x(k−1)x(k−2)  |      (71)
        | x(k)x(k−2)     x(k−1)x(k−2)    x²(k−2)       |

        | r0  r1  r2 |
    =   | r1  r0  r1 |
        | r2  r1  r0 |

where rn = E[x(k)x(k−n)]. The correlation matrix, R, is Toeplitz symmetric, and for a more general N point data vector the matrix will be N × N in dimension:

        | r0     r1     r2     ...  rN−1 |
        | r1     r0     r1     ...  rN−2 |
  R =   | r2     r1     r0     ...  rN−3 |      (72)
        | :      :      :      ...  :    |
        | rN−1   rN−2   rN−3   ...  r0   |

The Toeplitz structure (i.e., constant diagonal entries) results from the fact that the diagonal entries all correspond to the same time lag estimate of the correlation, that is, the lag k − (k − n) = n is constant. To calculate rn statistical averages should be used, or if the signal is ergodic then time averages can be used. See also Adaptive Signal Processing, Cross Correlation Vector, Ergodic, Expected Value, Matrix, Matrix Structured - Toeplitz, Wide Sense Stationarity, Wiener-Hopf Equations.

Correlation Vector: See Cross Correlation Vector.

CORTES Algorithm: Coordinate Reduction Time Encoding Scheme (CORTES) is an algorithm for the data compression of ECG signals. CORTES is based on the AZTEC and TP algorithms, using the AZTEC algorithm to discard clinically insignificant data in the isoelectric region, and applying the TP algorithm to clinically significant high frequency regions of the ECG data [48]. See also AZTEC, Electrocardiogram, TP.

Critical Bands: It is conjectured that a suitable model of the human auditory system is composed of a series of (constant fractional bandwidth) bandpass filters [30] which comprise critical bands.
When trying to detect a signal of interest in broadband background noise the listener is thought to make use of a bandpass filter with a centre frequency close to that of the signal of interest. The perception to the listener is that the background noise is somewhat filtered out and only the components within the background noise that lie in the critical band remain. The threshold of hearing of the signal of interest is thus determined by the amount of noise passing through the filter. See also Auditory Filters, Audiology, Audiometry, Fractional Bandwidth, Threshold of Hearing. Critical Distance: In a reverberant environment, the critical distance is defined as the separation between source and receiver that results in the acoustic energy of the reflected waveforms being equal to the acoustic energy in the direct path. A single number is often used to classify a given environment, although the specific acoustics of a given room may produce different critical distances for alternate source (or receiver) positions. Roughly, the critical distance characterizes how much reverberation exists in a given room. See also Reverberation. Cross Compiler: This is a piece of software which allows a user to program in a high level language (such as ‘C’) and generate cross compiled code for the target DSP’s assembly language. This code can in turn be assembled and the actual machine code program downloaded to the DSP processor. Although cross-compilers can make program writing much easier, they do not always produce efficient code (i.e. using minimal instructions) and hence it is often necessary to write in assembly language (or hand code) either the entire program or critical sections of the program (via in-line assembly commands in the higher level language program). Motorola produce a C cross compiler for the DSP56000 series, and Texas Instruments produce one for the TMS320 series of DSP processors. 
Cross Correlation Vector: A 3 element cross correlation vector, p, for a signal d(k) and a signal x(k) can be calculated from:

                        | d(k)x(k)   |   | p0 |
  p = E{ d(k)x(k) } = E | d(k)x(k−1) | = | p1 |      (73)
                        | d(k)x(k−2) |   | p2 |

Hence for an N element vector:

      | p0   |
  p = | p1   |      (74)
      | :    |
      | pN−1 |

where pn = E{d(k)x(k−n)}, and E{.} is the expected value function. To calculate pn statistical averages should be used, or if the signal is ergodic then time averages can be used. See also Adaptive Signal Processing, Correlation Matrix, Ergodic, Matrix, Expected Value, Wide Sense Stationarity, Wiener-Hopf Equations.

Cross Interleaved Reed Solomon Coding (CIRC): CIRC is an error correcting scheme which was adopted for use in compact disc (CD) systems [33]. CIRC is an interleaved combination of block (Reed-Solomon) and convolutional error correcting schemes. It is used to correct both burst errors and random bit errors. On a CD player errors can be caused by manufacturing defects, dust, scratches and so on. CIRC coding can be decoded to correct several thousand consecutive bit errors. It is safe to say that without the signal processing that goes into CD error correction and error concealment, the compact discs we see today would be substantially more expensive to produce and, subsequently, the CD players would not be nearly the ubiquitous appliance we see today. See also Compact Disc.

Cross-Talk: The interference of one channel upon another causing the signal from one channel to be detectable (usually at a reduced level) on another channel.

Cut-off Frequency: The cut-off frequency of a filter is the point at which the attenuation of the filter drops by 3dB. Although the term cut-off conjures up the image of a sharp attenuation, 3dB is equivalent to 20log10(√2) = 10log10(2), i.e. the filtered signal output has half of the power of the input signal.
For example, the cut-off frequency of a low pass filter is the frequency at which the filter attenuation drops by 3dB when plotted on a log magnitude scale, and by a factor of √2 on a linear scale. A bandpass filter will have two cut-off frequencies. See also Attenuation, Decibels.

[Figure: the cut-off frequency, or 3dB point, of a filter. The left hand side shows the gain in dB, illustrating the cut-off followed by the slow roll-off characteristic; the right hand side shows the same filter plotted as gain factor (linear scale, not decibel) against frequency. The cut-off occurs when the gain factor is 1/√2.]

Cyberspace: The name given to the virtual dimension that the world wide network (internet) of connected computers gives rise to in the minds of people who spend a large amount of time “there”. Without the DSP modems there would be no cyberspace! See also Internet.

Cyclic Redundancy Check (CRC): A cyclic redundancy check can be performed on digital data transmission systems whereby it is required at the receiver end to check the integrity of the data transmitted. This is most often used as an error detection scheme; detected errors require retransmission. If both ends know the algebraic method of encoding the original data, the raw data can be CRC coded at the transmitting end, and then at the receiving end the cyclic (i.e., efficient) redundancy can be checked. This redundancy check highlights whether bit transmission errors have occurred. CRC techniques can be easily implemented using shift registers [40]. See also Characteristic Polynomial, V-series Recommendations.

Cyclostationary: If the autocorrelation function (or second order statistics) of a signal fluctuates periodically with time, then this signal is cyclostationary. See [75] for a tutorial article.
D

Damped Sinusoid: A common solution to linear system problems takes the form:

e^(a + jb)t = e^at e^jbt = e^at [cos(bt) + j sin(bt)]   (75)

where the complex exponent gives rise to two separate components, an exponential decay term, e^at, and a sinusoidal variation term, [cos(bt) + j sin(bt)]. Common examples of systems that give rise to damped sinusoidal solutions are the suspension system in an automobile or the voltage in a passive electrical circuit that has energy storage elements (capacitors and inductors). Because many physical phenomena can be accurately described by coupled differential equations (for which damped sinusoids are common solutions), real world experiences of damped sinusoids are quite common.

Data Acquisition: The general name given to the reading of data using an analog-to-digital converter (ADC) and storing the sampled data on some form of computer memory (e.g., a hard disk drive).

Data Bus: The data bus is a collection of wires on a DSP processor that is used to transmit actual data values between chips, or within the chip itself. See also DSP Processor.

Data Compression: See Compression.

Data Registers: Memory locations inside a DSP processor that can be used for temporary storage of data. The data registers are at least as long as the wordlength of the processor. Most DSP processors have a number of data registers. See also DSP Processor.

Data Vector: The most recent N data values of a particular signal, x(k), can be conveniently represented as a vector, xk, where k denotes the most recent element in the vector. For example, if N = 5:

[Figure: a signal x(k) plotted against time k, from which the example vector x7 below is taken.]

       | xk   |                           | x7 |   | −23 |
       | xk−1 |                           | x6 |   | −20 |
  xk = | xk−2 |   then, for example, x7 = | x5 | = |  −9 |
       | xk−3 |                           | x4 |   |  11 |
       | xk−4 |                           | x3 |   |  29 |

More generally any type of data stored or manipulated as a vector can reasonably be referred to as a data vector. See also Vector, Vector Properties, Weight Vector.

Data Windowing: See Window.
Daughter Module: Most DSP boards are designed to be hosted by an IBM PC. To provide input/output facilities or additional DSP processors, some DSP boards (then called motherboards) have spaces for optional daughter modules to be inserted.

Decade: A decade refers to the interval between two frequencies where one frequency is ten times the other. Therefore, as an example, from 10Hz to 100Hz is a decade, and from 100Hz to 1000Hz is a decade, and so on. See also Logarithmic Frequency, Octave, Roll-off.

Decibels (dB): The logarithmic unit of decibels is used to quantify the power of any signal relative to a reference signal. A power signal dB measure is calculated as 10log10(P1/P0). In DSP, since input signals are voltages, and Power = (Voltage)² divided by Resistance, we conventionally convert a voltage signal into its logarithmic value by calculating 20log10(V1/V0). Decibels are widely used to represent the attenuation or amplification of signals:

Attenuation = 10 log10(P1/P0) = 20 log10(V1/V0)   (76)

where P0 is the reference power, and V0 is the reference voltage. dB’s are used because they often provide a more convenient measure for working with signals (e.g., plotting power spectra) than do linear measures. Often the symbol dB is followed by a letter that indicates how the decibels were computed. For example, dBm indicates a power measurement relative to a milliwatt, whereas dBW indicates power relative to a watt. In acoustics applications, dB can be measured relative to various perceptually relevant scales, such as A-weighting. In this case, noise levels are reported as dB(A) to indicate the relative weighting (A) selected for the measurement. See Sound Pressure Level Weighting Curves, Decibels SPL.
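The power and voltage forms of Eq. 76 can be sketched as follows (an illustration of ours, not from the original text):

```python
# Sketch: converting power and voltage ratios to decibels, as in Eq. 76.
import math

def power_db(p1, p0):
    """10 log10(P1/P0): dB from a power ratio."""
    return 10 * math.log10(p1 / p0)

def voltage_db(v1, v0):
    """20 log10(V1/V0): dB from a voltage ratio."""
    return 20 * math.log10(v1 / v0)

print(power_db(2, 1))      # ~3.01 dB: doubling the power
print(voltage_db(2, 1))    # ~6.02 dB: doubling the voltage
```

Note that a doubled voltage into the same resistance quadruples the power, which is why the voltage form uses a factor of 20.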
The decibel rating for a particular sound is calculated relative to a reference power W0:

10 log10(W1/W0)   (77)

dB SPL is sound pressure measured relative to 20 µPascals (2 × 10⁻⁵ Newtons/m²). Acoustic power is proportional to pressure squared, so pressure based dB are computed via 20log10 pressure ratios. Intensity (or power) based dB computations use 10log10 intensity ratios. The sound level 0dB SPL is a low sound level that was selected to be around the absolute threshold of average human hearing for a pure 1000Hz sinusoid [30]. Normal speech has an SPL value of about 70dB SPL. The acoustic energy 200 feet from a jet aircraft at take-off is about 125dB SPL, which is above the threshold of feeling (meaning you can feel the noise as well as hear it). See also Sound Pressure Level.

Decibels (dB) HL: Hearing Level (HL). See Hearing Level, Audiogram.

Decimation: Decimation is the process of reducing the sampling rate of a signal that has been oversampled. When a signal is bandlimited to a bandwidth that is less than half of the sampling frequency (fs/2), then the sampling rate can be reduced without loss of information. Oversampling simply means that a signal has been sampled at a rate higher than dictated by the Nyquist criterion. In DSP systems oversampling is usually done at integral multiples of the Nyquist rate, fn, and usually by a power of two factor such as 4 x’s, 8 x’s or 64 x’s.
For a discrete signal oversampled by a factor R, the sampling frequency, fs, is:

fs ≡ fovs = Rfn   (78)

For an R x’s oversampled signal the only portion of interest is the baseband signal extending from 0 to fn/2 Hz. Therefore decimation is required. The oversampled signal is first digitally low pass filtered to fn/2 using a digital filter with a sharp cut-off. The resulting signal is therefore now bandlimited to fn/2 and can be downsampled by retaining only every R-th sample. Decimation for a system oversampling by a factor of R = 4 can be illustrated as:

[Figure: decimation of a 4 x’s oversampled signal, fovs = 4fn, by low pass digital filtering then downsampling by 4, which retains every 4th sample. The analog input passes through an analog anti-alias filter and an oversampling ADC running at fovs; a digital low pass filter with a sharp cut-off at fn/2 removes frequencies between fn/2 and fovs/2, and the downsampler passes every 4th sample, at rate fn, to the DSP processor.]

The decimation process is essentially a technique whereby anti-alias filtering is being done partly in the analog domain and partly in the digital domain. Note that the decimated Nyquist rate or baseband signal will be delayed by the group delay, td, of the digital low pass filter (which we assume to be linear phase). For the oversampling example above where R = 4, any frequencies that exist between fn/2 Hz and fovs/2 = 2fn after the analog anti-alias filter can be removed with a digital low pass filter prior to downsampling by a factor of 4. Hence the complexity of the analogue low pass anti-alias filter has been reduced by effectively adding a digital low pass stage of anti-alias filtering. So why not just oversample, but not decimate?
To illustrate the requirement for decimation where possible: linear digital FIR filtering using an oversampled signal will require RN filter weights (corresponding to T secs), whereas the number of weights in the functionally equivalent Nyquist rate filter will only be N (also corresponding to T secs). Hence the oversampled DSP processing would require R²Nfn multiply/adds per second (RN weights computed at the sample rate Rfn), compared to the Nyquist rate DSP processing which requires Nfn multiply/adds per second, a factor of R² more. This is clearly not very desirable and a considerable disadvantage of an oversampled system compared to a Nyquist rate system. Therefore this is why an oversampled signal is usually decimated to the Nyquist rate, first by digital low pass filtering, then by downsampling (retaining only every R-th sample).

The word decimation originally comes from a procedure within the Roman armies, where for acts of cowardice the legionnaires were lined up, and every 10th man was executed. Hence the prefix “dec” meaning ten. See also Anti-alias Filter, Downsampling, Oversampling, Upsampling, Interpolation, Sigma Delta.

Decimation-in-Frequency (DIF): The DFT can be reformulated to give the FFT either as a DIT or a DIF algorithm. Since the input data and output data values of the FFT appear in bit-reversed order, decimation-in-frequency computation of the FFT provides the output frequency samples in bit-reversed order. See also Bit Reverse Addressing, Discrete Fourier Transform, Fast Fourier Transform, Cooley-Tukey.

Decimation-in-Time (DIT): The DFT can be reformulated to give the FFT either as a DIF or a DIT algorithm. Since the input data and output data values of the FFT appear in bit-reversed order, decimation-in-time computation of the FFT provides the output frequency samples in proper order when the input time samples are arranged in bit-reversed order. See also Bit Reverse Addressing, Discrete Fourier Transform, Fast Fourier Transform, Cooley-Tukey.
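The filter-then-downsample decimation chain described under Decimation can be sketched as follows (an illustration of ours, not from the original text; a real design would use a sharp cut-off FIR at fn/2, whereas the moving-average coefficients here are only a crude low pass stand-in):

```python
# Sketch: decimation by R, i.e. low pass filter an oversampled signal,
# then keep every R-th sample.
def fir_filter(x, h):
    """Convolve signal x with impulse response h (direct form)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(len(x))]

def decimate(x, R, h):
    """Low pass filter, then downsample by retaining every R-th sample."""
    filtered = fir_filter(x, h)
    return filtered[::R]

x = list(range(32))              # a 4 x's oversampled ramp, say
h = [0.25, 0.25, 0.25, 0.25]     # crude low pass (moving average)
y = decimate(x, 4, h)
print(len(y))                    # 8 samples: rate reduced by a factor of 4
```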
Delay and Sum Beamformer: A relatively simple beamformer in which the outputs from an array of sensors are subject to independent time delays and then summed together. The delays are typically selected to provide a look direction from which the desired signal will constructively interfere at the summer while signals from other directions are attenuated because they tend to destructively interfere. The delays are dictated by the geometry of the array of sensors and the speed of propagation of the wavefront. See also Adaptive Beamformer, Beamformer, Broadside, Endfire.

[Figure: a delay-and-sum beamformer with M sensors. The output from each sensor in the array is delayed an appropriate amount, τn = dn/c where c is the propagation velocity, to time-align the desired signal arriving from the look direction θ, and then combined via a summation to generate the beamformed output. No amplitude weighting of the sensors is performed.]

Delay LMS: See Least Mean Squares Algorithm Variants.

Delta Modulation: Delta modulation is a technique used to take a sampled signal, x(n), encode the magnitude change from the previous sample, and transmit only the single bit difference (∆) between adjacent samples [2]. If the signal has increased from the previous sample then a 1 is encoded; if it has decreased then a −1 is encoded. The received signal is then demodulated by taking successive delta samples and summing to reconstruct the original signal using an integrator. Delta modulation can reduce the number of bits per second to be transmitted down a channel, compared to PCM. However, when using a delta modulator the sampling rate and step size must be carefully chosen or slope overload and/or granularity problems may occur. See also Adaptive Differential Pulse Code Modulation, Continuously Variable Slope Delta Modulation, Differential Pulse Code Modulation, Integrator, Slope Overload, Granularity Effects.
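A fixed-step delta modulator and its integrating demodulator can be sketched as follows (an illustration of ours, not from the original text; the function names are hypothetical):

```python
# Sketch: one-bit delta modulation with a fixed step size, and
# reconstruction by summation (an integrator).
def delta_modulate(x, step=1.0):
    """Encode each sample as +1 or -1 against the running estimate."""
    estimate, bits = 0.0, []
    for sample in x:
        bit = 1 if sample >= estimate else -1
        bits.append(bit)
        estimate += bit * step     # the tracker follows the input
    return bits

def delta_demodulate(bits, step=1.0):
    """Integrate the +/-1 stream to reconstruct the signal."""
    estimate, out = 0.0, []
    for bit in bits:
        estimate += bit * step
        out.append(estimate)
    return out

x = [0.0, 1.2, 2.4, 3.1, 2.0, 0.5]
bits = delta_modulate(x)
print(bits)                        # [1, 1, 1, 1, -1, -1]
print(delta_demodulate(bits))      # [1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
```

Choosing the step size too small causes slope overload (the staircase cannot keep up with a fast input), and too large causes granularity noise, as noted above.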
[Figure: delta modulation system. In the modulator, the difference between x(n) and an integrator-fed-back staircase approximation xd(n) is quantized to a 1-bit value ∆(n) and sent over the channel at rate fs; the demodulator integrates ∆(n) and low pass filters the result to reconstruct x(n). The waveform plots show xd(n) tracking x(n) and the corresponding ±1 sequence ∆(n).]

Delta-Sigma: Synonymous term with Sigma Delta. See Sigma-Delta.

Descrambler: See Scrambler/Descrambler.

Destructive Interference: The addition of two waveforms with nearly opposite phase. Destructive interference is exploited to cancel unwanted noise, vibrations, and interference in physical and electrical systems. Destructive interference is also responsible for energy nulls in diffraction patterns. See also Diffraction, Constructive Interference, Beamforming.

Determinant: See Matrix Properties - Determinant.

Diagonal Matrix: See Matrix Structured - Diagonal.

Dial Tone: Tones at 350 Hz and 440 Hz make up the dialing tone for telephone systems. See also Dual Tone Multifrequency, Busy Tone, Ringing Tone.

[Figure: spectrum of a telephone dial tone, showing components at approximately 350 Hz and 440 Hz together with 50 Hz mains hum.]

Dichotic: A situation where the aural stimulation reaching both ears is not the same. For example, setting up a demonstration of binaural beats is a dichotic stimulus. The human ear essentially provides dichotic hearing whereby it is possible for the auditory mechanism to process the differing information arriving at both ears and subsequently localize the source. See also Audiometry, Binaural Unmasking, Binaural Beats, Diotic, Lateralization.

Difference Limen (DL): The smallest noticeable difference between two audio stimuli, or the Just Noticeable Difference (JND) between these stimuli. Determination of DL's usually requires that subjects be given a discrimination task. Typically, DL's (or JND's) are computed for two signals that are identical in all respects save the parameter being tested for a DL. For example, if the DL is desired for sound intensity discrimination, two stimuli differing only in intensity would be presented to the subject under test. These stimuli could be tones at a given frequency that are presented for a fixed period.
It is interesting to note that the DL for sound intensity (measured in dB) is generally found to be constant over a very wide range (this is known as Weber's law). To have meaning, a DL must be specified along with the set-up and conditions used to establish the value. For example, stating that the frequency DL for the human ear is 1 Hz between the frequencies of 1-4 kHz requires that sound pressure levels, stimuli duration, and stimuli composition are clearly stated, as varying these parameters will cause variation in the measured frequency DL. See also Audiology, Audiometry, Frequency Range of Hearing, Threshold of Hearing.

Differentiation: See Differentiator.

Differential Phase Shift Keying (DPSK): A type of modulation in which the information bits are encoded in the change of the relative phase from one symbol to the next. DPSK is useful for communicating over time varying channels. DPSK also removes the need for absolute phase synchronization, since the phase information is encoded in a relative way. See also Phase Shift Keying.

Differentiator: A (linear) device that produces an output that is the derivative of the input. In digital signal processing terms a differentiator is quite straightforward. The output of a differentiator, y(t), is the rate of change of the signal curve, x(t), at time t. For sampled digital signals the input is constant for one sampling period, and therefore to differentiate the signal the previous sample value is subtracted from the current value and divided by the sampling period. If the sampling period is normalized to one, then a signal is differentiated in the discrete domain by subtracting consecutive input samples.
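With the sampling period normalized to one, the subtract-the-previous-sample operation just described can be sketched in Python as:

```python
def differentiate(x):
    """First-difference differentiator: y(n) = x(n) - x(n-1),
    with the sampling period normalized to 1 and x(-1) taken as 0."""
    prev, y = 0.0, []
    for sample in x:
        y.append(sample - prev)   # delay element plus subtraction
        prev = sample
    return y

# A ramp input has a constant derivative (after the start-up sample).
slope = differentiate([0.0, 2.0, 4.0, 6.0])
```

Note that the first difference has a high pass character, so any high frequency noise on the input is emphasized by this operation.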
A differentiator is implemented using a digital delay element and a summing element to calculate:

y(n) = x(n) - x(n-1)   (79)

In the z-domain the transfer function of a differentiator is:

Y(z) = X(z) - z^(-1) X(z)  =>  Y(z)/X(z) = 1 - z^(-1)   (80)

When viewed in the frequency domain a differentiator has the characteristics of a high pass filter. Thus differentiating a signal with additive noise tends to emphasize or enhance the high frequency components of the additive noise. See also Analog Computer, Integrator, High Pass Filter.

[Figure: analog differentiation of x(t) giving y(t) = dx(t)/dt; discrete differentiation of x(n) giving y(n) = ∆x(n)/∆t, implemented with a delay element and a subtracting summer; and the z-domain signal flow graph representation Y(z) = (1 - z^(-1))X(z).]

Differential Pulse Code Modulation (DPCM): DPCM is an extension of delta modulation that makes use of redundancy in analog signals to quantize the difference between a discrete input signal and a predicted value to one of P values [2]. (Note that a delta modulator has only one level, ±1.) The integrator shown below performs a summation of all input values as the predictor.

[Figure: DPCM modulator and demodulator. The modulator quantizes the difference between x(n) and the prediction x̂(n) to one of P levels, ∆(n), for transmission over the channel at rate fs; an integrator (or, in more complex systems, a linear predictor) forms the prediction at both ends, and the demodulator output is low pass filtered to give x̃(n).]

More complex DPCM systems require a predictor filter in place of the simple integrator. Note that the predictor at the modulator end uses as inputs the same quantized error values that are available to the predictor at the demodulator end. If the unquantized error values were used at the modulator end then there would be an accumulated error between the demodulator output and the modulator input with a strictly increasing variance. This does not happen in the above configuration.
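A minimal Python sketch of this lockstep behaviour, using the integrator predictor and an assumed uniform 4-level (P = 4) mid-rise quantizer in place of a standardized design:

```python
def quantize(e, step=0.5, levels=4):
    """Uniform mid-rise quantizer (an assumed illustrative design)
    mapping the prediction error e to one of `levels` values:
    ..., -3*step/2, -step/2, +step/2, +3*step/2, ..."""
    half = levels // 2
    idx = max(-half, min(half - 1, int(e // step)))
    return idx * step + step / 2.0

def dpcm_encode_decode(x, step=0.5):
    """DPCM loop: both ends run the same integrator predictor driven
    by the *quantized* errors, so their predictions stay in lockstep."""
    decoded, pred = [], 0.0
    for sample in x:
        dq = quantize(sample - pred, step)  # quantized prediction error
        pred += dq                          # integrator predictor update
        decoded.append(pred)                # demodulator output (pre-LPF)
    return decoded

out = dpcm_encode_decode([0.0, 0.4, 0.9, 1.0])
```

Because the modulator's predictor is fed the quantized error dq rather than the raw error, the decoded sequence is exactly the one the demodulator reconstructs, with no accumulating mismatch.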
See also Adaptive Differential Pulse Code Modulation (ADPCM), Delta Modulation, Continuously Variable Slope Delta Modulation (CVSD), Slope Overload, Granularity.

Diffraction: Diffraction is the bending of waves around an object, or through an opening, arising from the wave propagation of the incident and reflected waves impinging on the object. See also Constructive Interference, Destructive Interference, Head Shadow.

[Figure: example of diffraction of incident waves through an opening in a boundary.]

Digital: Represented as a discrete countable quantity. When an analog voltage is passed through an ADC the output is a digitized and sampled version of the input. Note that digitization implies quantization.

Digital Audio: Any aspect of audio reproduction or recording that uses a digital representation of analogue acoustic signals is often referred to generically as digital audio [33], [34], [37]. Over the last 10-20 years digital audio has evolved into three distinguishable groups of application dependent quality:

1. Telephone Speech, 300 - 3400Hz: Typically speech down a telephone line is carried over a channel with a bandwidth extending from around 300Hz to 3400Hz. This bandwidth is adequate for good coherent and intelligible conversation. Music is coherent but unattractive. Clearly intelligible speech can be obtained by sampling at 8kHz with 8 bit PCM samples, corresponding to an uncompressed bit rate of 64kbits/s.

2. Wideband Speech, 50 - 7000Hz: For applications such as teleconferencing, prolonged conversation requires a speech quality that has more naturalness and presence. This is accomplished by retaining low and high frequency components of speech compared to a telephone channel. Music with the same bandwidth will have almost AM radio quality. Good quality speech can be obtained by sampling at 16kHz with 12 bit PCM samples, corresponding to a bit rate of 192kbits/s.

3.
High Fidelity Audio, 20 - 20000Hz: For high fidelity music reproduction the reproduced sound should be of comparable quality to the original sound. Wideband audio is sampled at one of the standard frequencies of 32 kHz, 44.1 kHz, or 48 kHz using 16 bit PCM. A stereo compact disc (44.1kHz, 16 bits) has a data rate of 1.4112 Mbits/s.

Generally, when one refers to digital audio applications involving speech materials only (e.g., speech coding) the term speech is directly included in the descriptive term. Consequently, digital audio has come to connote high fidelity audio, with speech applications more precisely defined. The table below summarizes the key parameters for a few well known digital audio applications. Note that to conserve bandwidth and storage requirements, DSP enabled compression techniques are applied in a number of these applications.

Technology                      Example Application     Sampling Rate (kHz)  Compression  Single Channel Bit Rate (kbits/s)
Digital Audio Tape (DAT)        Professional recording  48                   No           768
Compact Disc (CD)               Consumer audio          44.1                 No           705.6
Digital Compact Cassette (DCC)  Consumer audio          32, 44.1, 48         Yes          192
MiniDisc (MD)                   Consumer audio          44.1                 Yes          146
Dolby AC-2                      Cinema sound            48                   Yes          128
MUSICAM (ISO Layer II)          Consumer broadcasting   32, 44.1, 48         Yes          16 - 192
NICAM                           TV audio                32                   Yes          338
PCM A/µ-law (G711)              Telephone               8                    Yes          64
ADPCM (G721)                    Telephone               8                    Yes          16, 24, 32, 40
LD-CELP (G728)                  Telephone               8                    Yes          16
RPE-LTP (GSM)                   Telephone               8                    Yes          13.3
Subband ADPCM (G722)            Teleconferencing        16                   Yes          64
Digital Audio Systems

Although the digital audio market is undoubtedly very mature, the power of DSP systems is stimulating research and development in a number of areas:

1. Improved compression strategies based on perceptual and predictive coding; compression ratios of up to 20:1 for high fidelity audio may eventually be achievable.

2. The provision of surround sound using multichannel systems to allow cinema and "living room" audiences to experience 3-D sound.

3.
DSP effects processing: remastering, de-scratching recordings, sound effects, soundfield simulation, etc.

4. Noise reduction systems such as adaptive noise controllers, echo cancellers, acoustic echo cancellers and equalization systems.

5. Super-fidelity systems sampling at 96kHz to provide ultrasound [154] (above 20kHz, and which is perhaps more tactile than audible), and systems to faithfully reproduce infrasound [138] (below 20Hz, and which is most definitely tactile and in some cases rather dangerous!).

Real-time digital audio systems are of one of three types: (1) input/output systems (e.g. a telephone/teleconferencing system); (2) output only (e.g. a CD player); or (3) input only (e.g. DAT professional recording). The figure below shows the key elements of a single channel input/output digital audio system. The input signal from a microphone is signal conditioned/amplified as appropriate to the input/output characteristic of the analogue to digital converter (ADC), which samples at a rate of fs Hz. Prior to being input to the ADC stage the analogue signal is low pass filtered by the analogue anti-alias filter to remove all frequencies above fs/2. The output from the ADC is a stream of binary numbers, which are then compressed, coded and modulated for transmission, broadcasting or recording via/to a suitable medium (e.g. FM radio broadcast, telephone call or CD mastering). When a digital audio signal is received or read it is a stream of binary numbers which are demodulated and decoded/decompressed with DSP processing into a sampled data PCM format for input to a digital to analogue converter (DAC), which outputs to an analogue low pass reconstruction filter stage (also cutting off at fs/2) prior to being amplified and output to a loudspeaker (e.g. reception of digital audio FM radio or a telephone call, or playback of a CD).
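The uncompressed PCM bit rates quoted for systems such as DAT, CD and telephone speech follow directly from sampling rate × sample word length; as a quick Python check:

```python
def pcm_bit_rate(fs_hz, bits_per_sample):
    """Uncompressed single channel PCM bit rate in kbits/s."""
    return fs_hz * bits_per_sample / 1000.0

dat = pcm_bit_rate(48000, 16)      # DAT, per channel: 768 kbits/s
cd = pcm_bit_rate(44100, 16)       # CD, per channel: 705.6 kbits/s
phone = pcm_bit_rate(8000, 8)      # telephone PCM: 64 kbits/s
cd_stereo = 2 * cd                 # stereo CD: 1411.2 kbits/s = 1.4112 Mbits/s
```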
[Figure: the generic single input, single output channel digital audio signal processing system. Input chain: acoustic input, signal conditioning and amplification, anti-alias filter and ADC (sampling at fs), DSP processing (coding/compression/modulation), then data transmission/broadcasting/recording. Output chain: playback/reception, DSP processing (decoding/decompression/demodulation), DAC and reconstruction filter, amplification, signal conditioning and acoustic reproduction.]

See also Compact Disc, Data Compression, Digital Audio Tape, Digital Compact Cassette, MiniDisc, Speech Coding.

Digital Audio Broadcasting (DAB): The transmission of electromagnetic carriers modulated by digital signals. DAB will permit the transmission of high fidelity audio and is more immune to noise and distortion than conventional techniques. Repeater transmitters can receive a DAB signal, clean the signal and retransmit a noise free version. Currently there is a large body of interest in developing DAB consumer systems using a combination of satellite, terrestrial and cable transmission. For terrestrial DAB, however, there is currently no large bandwidth specifically allocated, and therefore FM radio station owners may be required to volunteer their bands for digital audio broadcasting. See also Compression, Standards.

Digital Audio Tape (DAT): An audio format introduced in the late 1980s to compete with compact disc. DAT samples at 48kHz and uses 16 bit data with stereo channels. Although DAT was a commercial failure in the consumer market it has been adopted as a professional studio recording medium. A very similar format of 8mm digital tape is also quite commonly used for data storage. See also Digital Compact Cassette, MiniDisc.

Digital Communications: The process of transmitting and receiving messages (information) by sending and decoding one of a finite number of symbols during a sequence of symbol periods.
One primary requirement of a digital communication system is that the information must be represented in a digital (or discrete) format. See also Message, Symbol, Symbol Period.

Digital Compact Cassette (DCC): DCC was introduced by Philips in the early 1990s as a combination of the physical format of the popular compact cassette with new digital audio signal processing and magnetic head technology [83], [52], [150]. Because of physical constraints DCC uses psychoacoustic data compression techniques to increase the amount of data that can be stored on a tape. The DCC mechanism allows it to play both (analog) compact cassette tapes and DCC tapes. The tape speed is 4.75cm/s for both types of tape, and a carefully designed thin film head is used to achieve both digital and analog playback. The actual tape quality is similar to that used for video tapes. DCC is a competing format to Sony's MiniDisc, which also uses psychoacoustic data compression techniques.

If normal stereo 16 bit, 48kHz (1.536 Mbits/s) PCM digital recording were done on a DCC tape, only about 20 minutes of music could be stored due to the physical restrictions of the tape. Therefore, to allow more than an hour of music on a single tape, data compression is required. DCC uses precision adaptive subband coding (PASC) to compress the audio by a factor of 4:1 to a data rate of 384 kbits/s (192 kbits/s per channel), thus allowing more than an hour of music to be stored. PASC is based on psychoacoustic compression principles and is similar to the ISO/MPEG layer 1 standard. The input to a PASC encoder can be PCM data of up to 20 bits resolution at sampling rates of 48kHz, 44.1kHz or 32kHz. The quality of music from a PASC encoded DCC is arguably as good as a CD, and in fact for some parameters, such as dynamic range, a prerecorded DCC tape can have improved performance over a CD (see Precision Adaptive Subband Coding).
Eight-to-ten modulation and cross interleaved Reed-Solomon coding (CIRC) are used for the DCC tape channel coding and error correction. In addition to the audio tracks DCC features an auxiliary channel capable of storing 6.75kbits/s, which can be used for storing timing, textual information and copyright protection codes.

[Figure: DCC record/playback chain: stereo (L, R) ADC/DAC and digital I/O, 32 channel subband filter, PASC psychoacoustic coding, error coding/error correction, data modulation, and the read/write head. The DCC compresses PCM encoded 48kHz, 44.1kHz or 32kHz digital audio to a bit rate of 384 kbits/s; the PCM input data can have up to 20 bits precision.]

In terms of DSP algorithms the DCC also uses an IIR digital filter for equalization of the thin film magnetic head frequency response, and a 12 weight FIR filter to compensate for the high frequency roll-off of the magnetic channel. See also Compact Disc, Digital Audio, Digital Audio Tape (DAT), MiniDisc, Precision Adaptive Subband Coding (PASC), Psychoacoustics.

Digital European Cordless Telephone (DECT): A telephone in which a wireless radio connection at 1.9GHz communicates with a base station that is normally connected to the public switched telephone network. One or more handsets can communicate with each other or with the outside world.

Digital Filter: A DSP system that will filter a digital input (i.e., selectively discriminate signals in different frequency bands) according to some pre-designed criteria is called a digital filter. In some situations digital filters are used to modify phase only [10], [7], [21], [31], [29]. A digital filter's characteristics are usually viewed via the frequency response and, for some applications, the phase response (discussed in Finite Impulse Response Filter and Infinite Impulse Response Filter).

[Figure: gain response of a filter plotted in dB, where Attenuation = 10 log(Pout/Pin) = 20 log|Y(f)/X(f)|, with the -3dB point marked.]
For the frequency response, the filter attenuation or gain characteristic can be specified either on a linear gain scale or, more commonly, on a logarithmic gain scale:

[Figure: linear and logarithmic magnitude responses of a digital filter with transfer function H(f) = Y(f)/X(f). The filter shown is a low pass filter cutting off at 1000Hz.]

The cut-off frequency of a filter is usually denoted as the "3dB frequency", i.e. at f3dB = 1000 Hz the filter attenuates the power of a sinusoidal component at this frequency by a factor of 0.5:

10 log (Pout/Pin) = 20 log |Y(f3dB)/X(f3dB)| = 10 log 0.5 = 20 log 0.707... = -3 dB

The power of the output signal relative to the input signal at f3dB is therefore 0.5, and the signal amplitude is attenuated by 1/sqrt(2) = 0.707... For a low pass filter, signals with a frequency higher than f3dB are attenuated by more than 3dB.

Digital filters are usually designed as either low pass, high pass, band-pass or band-stop:

[Figure: typical gain responses of low pass, high pass, band-pass and band-stop filters.]

A number of filter design packages give the user the facility to design a filter with an arbitrary frequency response by "sketching" it graphically:

[Figure: a user defined frequency response.]

There are two types of linear digital filter: the FIR (finite impulse response) filter and the IIR (infinite impulse response) filter. An FIR filter is a digital filter that performs a moving, weighted average on a discrete input signal, x(n), to produce an output signal. (For a more intuitive discussion of the FIR filtering operation see the entry for Finite Impulse Response Filter.)
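The moving, weighted average that an FIR filter performs can be sketched directly in Python (the three-weight averaging filter here is an arbitrary illustrative choice):

```python
def fir_filter(x, w):
    """Direct-form FIR filter: y(k) = sum over n of w[n] * x(k - n),
    with x taken as zero for negative indices (zero initial state)."""
    y = []
    for k in range(len(x)):
        acc = 0.0
        for n, wn in enumerate(w):
            if k - n >= 0:
                acc += wn * x[k - n]   # multiply-accumulate (MAC)
        y.append(acc)
    return y

# A 3-weight averaging filter smooths a unit step input.
y = fir_filter([1.0] * 5, w=[1 / 3, 1 / 3, 1 / 3])
```

After the transient of the first two samples, the output settles at the steady-state value of the step, since the weights sum to 1 (unity DC gain).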
The arithmetic computation required by the digital filter is of course performed on a DSP processor or equivalent:

[Figure: a complete filtering chain: analogue input x(t), anti-alias filter, ADC (at fs), DSP processor implementing the digital filter equations on the time sampled data x(k) to produce y(k), DAC (at fs), reconstruction filter, analogue output y(t).]

The actual frequency and phase response of the filter is found by taking the discrete Fourier transform (DFT) of the weight values w0 to wN-1. An FIR digital filter is usually represented by a signal flow graph or by a summation (convolution) equation:

[Figure: signal flow graph of an N weight FIR filter. The tapped delay line holds x(k), x(k-1), ..., x(k-N+1); each tap is weighted by w0, w1, ..., wN-1 and the products are summed to give y(k).]

y(k) = w0 x(k) + w1 x(k-1) + w2 x(k-2) + w3 x(k-3) + ... + wN-1 x(k-N+1)
     = sum_{n=0}^{N-1} wn x(k-n) = w^T xk

where w = [w0 w1 w2 ... wN-1]^T and xk = [x(k) x(k-1) x(k-2) ... x(k-N+1)]^T. The filter output y(k) can be expressed as a summation equation, a difference equation or in vector notation.

The signal flow graph can be drawn in a more modular fashion by splitting the N element summer into a series of two element summers:

[Figure: modularized signal flow graph for an FIR filter, in which the large N element summer is broken down into a series of N-1 two element summing nodes. The operation of this filter is, of course, identical to the above.]

An IIR digital filter utilizes feedback (or recursion) in order to achieve a longer impulse response, and therefore the possible advantage of a filter with a sharper cut-off frequency (i.e., a smaller transition bandwidth, see below) but with fewer weights than an FIR digital filter with an analogous frequency response.
(For a more intuitive discussion of the operation of an IIR filter see the entry for Infinite Impulse Response Filter.) The attraction of fewer weights is that the filter is cheaper to implement (in terms of power consumption, DSP cycles and/or cost of DSP hardware). The signal flow graph and output equation for an IIR filter are:

[Figure: signal flow graph of a 2 zero, 3 pole IIR digital filter, with feedforward weights a0, a1, a2 acting on x(k), x(k-1), x(k-2) and feedback weights b1, b2, b3 acting on y(k-1), y(k-2), y(k-3).]

y(k) = sum_{n=0}^{2} an x(k-n) + sum_{n=1}^{3} bn y(k-n)
     = a0 x(k) + a1 x(k-1) + a2 x(k-2) + b1 y(k-1) + b2 y(k-2) + b3 y(k-3)
     = a^T xk + b^T yk-1

where a = [a0 a1 a2]^T, xk = [x(k) x(k-1) x(k-2)]^T, b = [b1 b2 b3]^T and yk-1 = [y(k-1) y(k-2) y(k-3)]^T. The filter output y(k) can be expressed as a summation equation, a difference equation or in vector notation.

Design algorithms to find suitable weights for digital FIR filters are incorporated into many DSP software packages and typically allow the user to specify the parameters of:

• Sampling frequency;
• Passband;
• Transition band;
• Stopband;
• Passband ripple;
• Stopband attenuation;
• No. of weights in the filter.

These parameters allow variations from the ideal (brick wall) filter, with the trade-offs being made by the design engineer.
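The IIR difference equation given above generalizes to any feedforward/feedback weight sets; a minimal Python sketch follows (the one-pole example values are illustrative, not from any particular design):

```python
def iir_filter(x, a, b):
    """IIR filter: y(k) = sum_n a[n] x(k-n) + sum_m b[m] y(k-m),
    with a = [a0, a1, ...] feedforward weights and b = [b1, b2, ...]
    feedback weights (note the feedback weights are indexed from 1)."""
    xs = [0.0] * len(a)            # delay line for past inputs
    ys = [0.0] * len(b)            # delay line for past outputs
    out = []
    for sample in x:
        xs = [sample] + xs[:-1]
        y = sum(an * xn for an, xn in zip(a, xs))
        y += sum(bm * ym for bm, ym in zip(b, ys))
        ys = [y] + ys[:-1]
        out.append(y)
    return out

# Impulse response of a simple one-pole example (a = [1], b = [0.5]):
# the feedback makes the response decay forever, never reaching zero.
h = iir_filter([1.0, 0.0, 0.0, 0.0], a=[1.0], b=[0.5])
```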
In general, the less stringent the bounds on the various parameters, the fewer weights the digital filter will require:

[Figure: parameters for specifying low pass, high pass, band-pass and band-stop filters: passband, transition band and stopband regions, passband ripple and stopband attenuation (in dB), shown against the "ideal" filter response over the range 0 to fs/2.]

After the filter weights are produced by DSP filter design software, the impulse response of the digital filter can be plotted, i.e. the filter weights shown against time:

w0 = w30 = 0.00378...    w8 = w22 = -0.04748...
w1 = w29 = 0.00977...    w9 = w21 = -0.05394...
w2 = w28 = 0.01809...    w10 = w20 = -0.03487...
w3 = w27 = 0.02544...    w11 = w19 = 0.01214...
w4 = w26 = 0.02715...    w12 = w18 = 0.07926...
w5 = w25 = 0.01900...    w13 = w17 = 0.14972...
w6 = w24 = 0.00003...    w14 = w16 = 0.20316...
w7 = w23 = -0.02538...   w15 = 0.22319...
(Truncated to 5 decimal places)

[Figure: DESIGN 1 low pass FIR filter impulse response h(n) = wn, plotted against sample index n with sampling period T = 1/10000 secs.] The impulse response h(n) = wn of the low pass filter specified in the SystemView design dialogs: cut-off frequency 1000 Hz; passband gain 0dB; stopband attenuation 60dB; transition band 500 Hz; passband ripple 5dB; and sampling at fs = 10000 Hz. The filter is linear phase and has 31 weights, and therefore an impulse response of duration 31/10000 seconds. For this particular filter the weights are represented with floating point real numbers. Note that the filter was designed with 0dB gain in the passband.
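The weights of a linear phase low pass filter such as DESIGN 1 can also be produced by the classical windowed-sinc method; a NumPy sketch follows (the Hamming window and parameter choices are illustrative, and the resulting weights will not exactly match the DESIGN 1 values produced by a dedicated design package):

```python
import numpy as np

def windowed_sinc_lowpass(fc, fs, num_weights):
    """Design a linear phase low pass FIR filter by truncating the
    ideal sinc impulse response and applying a Hamming window."""
    n = np.arange(num_weights) - (num_weights - 1) / 2.0
    h = 2 * fc / fs * np.sinc(2 * fc / fs * n)   # ideal low pass response
    h *= np.hamming(num_weights)                 # taper to reduce ripple
    return h / h.sum()                           # normalize to 0 dB at DC

# 31 weights, 1000 Hz cut-off, fs = 10 kHz, matching the DESIGN 1 spec.
w = windowed_sinc_lowpass(fc=1000.0, fs=10000.0, num_weights=31)
```

The weights are symmetric about the centre tap, which is what makes the filter linear phase.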
As a quick check, the sum of all of the coefficients is approximately 1, meaning that if a 0 Hz (DC) signal were input the output would be neither amplified nor attenuated, i.e. gain = 1, or 0 dB.

From the impulse response, the DFT (or FFT) can be used to produce the filter magnitude frequency response, and the actual filter characteristics can be compared with the original desired specification:

[Figure: linear magnitude response |H(f)| and logarithmic magnitude response 20 log|H(f)|, obtained from a 1024 point (zero padded) FFT of the DESIGN 1 low pass filter impulse response. The passband ripple is easier to see in the linear plot, whereas the stopband ripple is easier to see in the logarithmic plot.]

To illustrate the operation of the above digital filter, a chirp signal starting at a frequency of 900 Hz and linearly increasing to 1500 Hz over 0.05 seconds (500 samples) can be input to the filter and the output observed:

[Figure: the chirp input (period decreasing from 1/900 secs to 1/1500 secs) applied to the 1000 Hz cut-off low pass digital filter, and the resulting output waveform (individual samples are not shown).]

As the chirp frequency reaches about 1000 Hz, the digital filter attenuates the output signal amplitude by a factor of around 0.7 (3dB), until at 1500 Hz the signal amplitude is attenuated by more than 60 dB, or a factor of 0.001. If a low pass filter with less passband ripple and a sharper cut-off is required then another filter can be designed, although more weights will be required and the implementation cost of the filter therefore increases.
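The quick DC check (the sum of the weights is the gain at 0 Hz) and the FFT-based magnitude response can be reproduced for any weight vector; a NumPy sketch (the weights here are an illustrative symmetric 31-point windowed-sinc set, not the exact DESIGN 1 values):

```python
import numpy as np

# An illustrative linear phase low pass weight set (not the exact
# DESIGN 1 values): windowed-sinc, 31 weights, fc = 1000 Hz, fs = 10 kHz.
n = np.arange(31) - 15
w = 0.2 * np.sinc(0.2 * n) * np.hamming(31)
w /= w.sum()                      # normalize to 0 dB gain at DC

# DC check: the sum of the weights is the gain at 0 Hz.
dc_gain = w.sum()                 # = H(f = 0)

# Magnitude response via a zero padded 1024 point FFT.
H = np.fft.fft(w, 1024)
H_db = 20 * np.log10(np.abs(H) + 1e-12)       # in dB, floored
f = np.fft.fftfreq(1024, d=1 / 10000.0)       # frequency axis in Hz
```

Bin 0 of H_db should sit at 0 dB, and bins well beyond the transition band (above roughly 1500 Hz here) should show heavy attenuation.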
To illustrate this point, if the above low pass filter is redesigned, but this time with a stopband attenuation of 80dB, a passband ripple of 0.1dB and a transition band of, again, 500 Hz, the impulse response produced by the DSP design software now requires 67 weights:

[Figure: DESIGN 2 low pass FIR filter impulse response h(n) = wn, plotted against sample index n with T = 1/10000 secs. The filter is specified by: cut-off frequency 1000 Hz; passband gain 0dB; stopband attenuation 80dB; transition band 500 Hz; passband ripple 0.1dB; and sampling at fs = 10000 Hz. The filter is linear phase and has 67 weights (compare to DESIGN 1 above, which had 31 weights) and therefore an impulse response of duration 67/10000 seconds.]

The frequency response of this DESIGN 2 filter can be found by taking the FFT of the digital filter impulse response:

[Figure: linear magnitude response |H(f)| and logarithmic magnitude response 20 log|H(f)| from a 1024 point (zero padded) FFT of the DESIGN 2 impulse response. Note that, as specified, the filter roll-off is now steeper, the stopband attenuation is almost 80 dB and the in-band ripple is only fractions of a dB.]

Therefore low pass, high pass, band-pass and band-stop digital filters can all be realised by using the formal digital filter design methods that are available in a number of DSP software packages. (Or, if you have a great deal of time on your hands, you can design them yourself with paper and pencil and reference to one of the classic DSP textbooks!) There are of course many filter design trade-offs.
For example, as already illustrated above, a filter with a fast transition between stopband and passband requires more filter weights than a low pass filter with a slow roll-off in the transition band. However, the more filter weights, the higher the computational load on the DSP processor, and the larger the group delay through the filter is likely to be. Care must therefore be taken to ensure that the computational load of the digital filter does not exceed the maximum processing rate of the DSP processor (which can be loosely measured in multiply-accumulates, MACs) being used to implement it. The minimum computational load of a DSP processor implementing a digital filter in the time domain is at least:

Computational Load of Digital Filter = (Sampling Rate × No. of Filter Weights) MACs/second   (81)

and is likely to be higher by a factor greater than 1 due to the additional overhead of other assembly language instructions to read data in/out, implement loops, etc. Therefore a 100 weight digital filter sampling at 8000 Hz requires a computational load of 800,000 MACs/second (readily achievable in the mid-1990s), whereas a two channel digital audio tape (DAT) system sampling at 48kHz and using stereo digital filters with 1000 weights requires a DSP processor capable of performing almost 100 million MACs per second (verging on the "just about" achievable with late-1990s DSP processor technology). See also Adaptive Filter, Comb Filter, Finite Impulse Response (FIR) Filter, Infinite Impulse Response (IIR) Filter, Group Delay, Linear Phase.

Digital Filter Order: The order of a digital filter is specified from the degree of the z-domain polynomial.
For example, an N weight FIR filter:

y(k) = w0 x(k) + w1 x(k-1) + ... + wN-1 x(k-N+1)   (82)

can be written as an (N-1)th order z-polynomial:

Y(z) = X(z) [w0 + w1 z^(-1) + ... + wN-1 z^(-N+1)]
     = X(z) z^(-N+1) [w0 z^(N-1) + w1 z^(N-2) + ... + wN-1]   (83)

For an IIR filter, the orders of the feedforward and feedback sections of the filter can both be specified. For example, an IIR filter with a 0th order feedforward section (i.e. N = 1 above, meaning w0 = 1 and all other weights are 0) and an (M-1)th order feedback section is given by the difference equation:

y(k) = x(k) + b1 y(k-1) + b2 y(k-2) + ... + bM-1 y(k-M+1)   (84)

and the (M-1)th order denominator polynomial is shown below as:

Y(z)/X(z) = 1 / (1 - b1 z^(-1) - ... - bM-2 z^(-M+2) - bM-1 z^(-M+1))
          = z^(M-1) / (z^(M-1) - b1 z^(M-2) - ... - bM-2 z - bM-1)   (85)

(note the sign change on the feedback coefficients when they are moved into the denominator). It is worth noting that for an IIR filter the feedback coefficients are indexed starting at 1, i.e. b1. If a b0 coefficient were added in the signal flow graph then this would simply introduce a scaling of the output, y(k). See also Digital Filter, Finite Impulse Response Filter, Infinite Impulse Response Filter.

Digital Soundfield Processing (DSfP): The name given to the artificial addition of echo and reverberation to a digital audio signal. For example, a car audio system can add echo and reverberation to the digital signal prior to playing it through the speakers, thus giving the impression of the acoustics of a large theatre or a stadium.

Digital Television: The enabling technologies of digital television are presented in detail in [95], [96].
Digital to Analog Converter (D/A or DAC): A digital to analog converter is a device which takes a stream of digital numbers and converts it to a continuous voltage signal. Every digital to analog converter has an input-output characteristic that specifies the output voltage for a given binary number input. The output of a DAC is very steppy, and the steps produce frequency components above half the sampling frequency. Therefore a reconstruction filter should be used at the output of a DAC to smooth out the steps. Most D/As used in DSP operate using 2's complement arithmetic. See also Reconstruction Filter, Analog to Digital Converter.

[Figure: example of a 5 bit DAC converting a train of binary values to an analog waveform. The input-output characteristic maps 2's complement binary inputs (10000 = -16 up to 01111 = 15) to output voltages in the range of about -2 to 2 volts.]

Digital Video Interactive (DVI): Intel Inc. has produced a proprietary digital video compression technology which is generally known as DVI. Files that are encoded as DVI usually have the suffix ".dvi" (as do LaTeX device independent files -- these are different). See also Standards.

Diotic: A situation where the aural stimulation reaching both ears is the same. For example, diotic audiometric testing would play exactly the same sounds into both ears. See also Audiometry, Dichotic, Monauralic.

Dirac Impulse or Dirac Delta Function: The continuous time analog of the unit impulse function. See Unit Impulse Function.

Direct Broadcast Satellite (DBS): Satellite transmission of television and radio signals may be received directly by a consumer using a (relatively small) parabolic antenna (dish) and a digital tuner. This form of broadcasting is gaining popularity in Europe, Japan, the USA and Australia.

Direct Memory Access: Allowing access to read or write RAM without interrupting normal operation of the processor.
The TMS320C40 DSP processor has 6 independent DMA channels that are 8 bits wide and allow access to memory without interrupting the DSP computation. See also DSP Processor.

Directivity: A measure of the spatial selectivity of an array of sensors, or of a single microphone or antenna. Loosely, directivity is the ratio of the gain in the look direction to the average gain in all directions. The higher the directivity, the more concentrated the spatial selectivity of a device is in the look direction compared to all other directions. Mathematically, directivity is defined for a (power) gain function G(θ, φ, f) as:

    D(f) = G(0, 0, f) / [ (1/4π) ∫_FOV G(θ, φ, f) dΩ ]    (86)

where the look direction (and the maximum of the gain function) is assumed to be θ = 0 and φ = 0, and the field of view (FOV) is assumed to be Ω = 4π steradians (units of solid angle). Note that the directivity defined above is a function of frequency, f, only. If directivity as a function of frequency, D(f), is averaged (i.e., integrated) over frequency, then a single directivity number can be obtained for a wideband system. See also Superdirectivity, Sidelobe, Main Lobe, Endfire.

Discrete Cosine Transform (DCT): The DCT is given by the equation:

    X(k) = Σ_{n=0}^{N-1} x(n) cos(2πkn/N),  for k = 0 to N - 1    (87)

The DCT is essentially the discrete Fourier transform (DFT) evaluated using only the real part of the complex exponential:

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},  for k = 0 to N - 1    (88)

The DCT is used in a number of speech and image coding algorithms. See also Discrete Fourier Transform.
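The cosine-kernel transform of Eq. 87 can be sketched directly. This is an illustrative sketch (the function name is assumed); note that standard DCT-II implementations use a cos(π(2n+1)k/2N) kernel rather than the simplified kernel printed in this entry:

```python
import math

# Sketch of Eq. (87) as printed in this entry: X(k) = sum_n x(n) cos(2*pi*k*n/N),
# i.e. the real part of the DFT kernel applied to the signal.
def dct_entry87(x):
    N = len(x)
    return [sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

# For a constant input all the "energy" lands in the k = 0 term; the
# remaining terms are (numerically) zero.
print(dct_entry87([1.0, 1.0, 1.0, 1.0]))
```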
Discrete Fourier Transform: The Fourier transform [57], [58], [93] for continuous signals can be defined as the transform pair:

    x(t) = ∫_{-∞}^{∞} X(f) e^{j2πft} df    (synthesis)
    X(f) = ∫_{-∞}^{∞} x(t) e^{-j2πft} dt   (analysis)    (89)

[Figure: sampling an analogue signal, x(t), to produce a discrete time signal, x(nT_s), written as x(n). The sampling period is T_s and the sampling frequency is therefore f_s = 1/T_s. The total time duration of the N samples is NT_s seconds.]

Just as there exists a continuous time Fourier transform, we can also derive a discrete Fourier transform (DFT) in order to assess what sinusoidal frequency components comprise a signal. In the case where a signal is sampled at intervals of T_s seconds and is therefore discrete, the Fourier transform analysis equation becomes a summation over the samples:

    X(f) = Σ_{n=-∞}^{∞} x(nT_s) e^{-j2πfnT_s}    (90)

and hence, noting that T_s = 1/f_s, we can write:

    X(f) = Σ_{n=-∞}^{∞} x(nT_s) e^{-j2πfn/f_s}    (91)

To further simplify we can write the discrete time signal simply in terms of its sample number:

    X(f) = Σ_{n=-∞}^{∞} x(n) e^{-j2πfn/f_s}    (92)

Of course if our signal is causal then the first sample is at n = 0, and the last sample is at n = N - 1, giving a total of N samples:

    X(f) = Σ_{n=0}^{N-1} x(n) e^{-j2πfn/f_s}    (93)

By using a finite number of data points we also force the implicit assumption that our signal is now periodic, with a period of N samples, or NT_s seconds (see above figure). Therefore, noting that Eq. 93 is actually calculated for a continuous frequency variable, f, in actual fact we need only evaluate this equation at specific frequencies: zero frequency (DC) and harmonics of the "fundamental" frequency, f_0 = 1/NT_s = f_s/N, i.e. the N discrete frequencies 0, f_0, 2f_0, up to (N-1)f_0.
    X(kf_s/N) = Σ_{n=0}^{N-1} x(n) e^{-j2πkf_s n/(Nf_s)},  for k = 0 to N - 1    (94)

Simplifying to use only the time index, n, and the frequency index, k, gives the discrete Fourier transform:

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},  for k = 0 to N - 1    (95)

If we recall that the discrete signal x(n) was sampled at f_s, then the signal has image (or alias) components above f_s/2, and when evaluating Eq. 95 it is only necessary to evaluate up to f_s/2; the DFT is therefore further simplified to:

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},  for k = 0 to N/2    (96)

Clearly, because we have evaluated the DFT at only N frequencies, the frequency resolution is limited to DFT "bins" of frequency width f_s/N Hz. Note that the discrete Fourier transform requires only multiplications and additions, since each complex exponential is computed in its rectangular (complex number) form:

    e^{-j2πkn/N} = cos(2πkn/N) - j sin(2πkn/N)    (97)

If the signal x(n) is real valued, then the DFT computation requires approximately N² real multiplications and adds (noting that a real value multiplied by a complex value requires two real multiplies). If the signal x(n) is complex, then a total of 2N² MACs are required (noting that the multiplication of two complex values requires four real multiplications).
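The DFT of Eq. 95 can be computed directly from its definition. A sketch (function and variable names are illustrative):

```python
import cmath
import math

# Direct implementation of Eq. (95): X(k) = sum_n x(n) * e^{-j 2 pi k n / N}.
def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# 8 samples of a sine wave with exactly one period across the record:
# all the energy falls into bins k = 1 and k = N - 1 = 7, each with
# magnitude N/2 = 4.
x = [math.sin(2 * math.pi * n / 8) for n in range(8)]
X = dft(x)
print([round(abs(Xk), 6) for Xk in X])
```

The double loop makes the N² multiply count of the direct DFT explicit; the fast Fourier transform removes the redundancy discussed in the following entries.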
From the DFT we can calculate a magnitude and a phase response:

    X(k) = |X(k)| ∠X(k)    (98)

From a given DFT sequence, we can of course calculate the inverse DFT from:

    x(n) = (1/N) Σ_{k=0}^{N-1} X(k) e^{j2πnk/N}    (99)

As an example, consider taking the DFT of 128 samples of an 8 Hz sine wave sampled at 128 Hz.

[Figure: the time signal shows 128 samples of an 8 Hz sine wave sampled at 128 Hz: x(n) = sin(16πn/128). There are exactly an integral number of periods (eight) present over the 128 samples, and taking the DFT exactly identifies the signal as an 8 Hz sinusoid with a single line in the magnitude response. The DFT magnitude spectrum has an equivalent negative frequency portion which is identical to that of the positive frequencies when the time signal is real valued.]

If we take the DFT of the slightly more complex signal consisting of an 8 Hz sine wave plus a 24 Hz sine wave of half the amplitude of the 8 Hz component, then:

[Figure: the time signal shows 128 samples of the sum of an 8 Hz and a 24 Hz sine wave sampled at 128 Hz: x(n) = sin(16πn/128) + 0.5 sin(48πn/128). There are exactly an integral number of periods present for both sinusoids over the 128 samples, and the magnitude response shows the two components at their correct frequencies and relative amplitudes.]
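The DFT/inverse-DFT pair of Eqs. 95 and 99 can be checked numerically; transforming and inverse-transforming recovers the original samples up to floating point rounding. A sketch (names illustrative):

```python
import cmath

# DFT of Eq. (95) and inverse DFT of Eq. (99).
def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

# Round trip: idft(dft(x)) reproduces x (imaginary parts are ~1e-16 residue).
x = [1.0, 2.0, 0.0, -1.0]
x_rec = idft(dft(x))
print([round(v.real, 6) for v in x_rec])
```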
Now consider taking the DFT of 128 samples of an 8.5 Hz sine wave sampled at 128 Hz.

[Figure: the time signal shows 128 samples of an 8.5 Hz sine wave sampled at 128 Hz: x(n) = sin(17πn/128). Because the 8.5 Hz sine wave does not lie exactly on a frequency bin, its energy appears spread over a number of frequency bins around 8 Hz.]

So why is the signal energy now spread over a number of frequency bins? We can interpret this by recalling that the DFT implicitly assumes that the signal is periodic, and that the N data points being analysed form one full period of the signal. If there are an integral number of sine wave periods in the N samples input to the DFT computation, then the spectral peaks fall exactly on the frequency bins, as shown earlier: the DFT result has assumed that the N samples form one period of the signal, and that thereafter the period repeats indefinitely (the discrete samples are not shown in the illustration for clarity).

If there are not an integral number of periods in the signal (as for the 8.5 Hz example), then the spectral peaks will not fall exactly on the frequency bins. As the DFT computation has assumed that the signal is periodic, the DFT interprets the signal as undergoing a "discontinuity" jump at the end of each block of N samples.
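The spectral leakage of the 8.5 Hz example can be demonstrated numerically by comparing the DFT magnitudes of an on-bin and an off-bin sine wave. A sketch (names illustrative):

```python
import cmath
import math

# DFT magnitudes of an on-bin (8 Hz) and an off-bin (8.5 Hz) sine wave,
# both with N = 128 samples at fs = 128 Hz, as in the examples above.
def dft_mag(x):
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)))
            for k in range(N)]

N, fs = 128, 128.0
on_bin = dft_mag([math.sin(2 * math.pi * 8.0 * n / fs) for n in range(N)])
off_bin = dft_mag([math.sin(2 * math.pi * 8.5 * n / fs) for n in range(N)])

# The 8 Hz tone puts essentially all its energy in bin k = 8 (magnitude N/2);
# the 8.5 Hz tone leaks significant energy into the neighbouring bins.
print(round(on_bin[7], 3), round(on_bin[8], 3), round(on_bin[9], 3))
print(round(off_bin[7], 3), round(off_bin[8], 3), round(off_bin[9], 3))
```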
Hence the result of the DFT interprets the time signal as if this discontinuity were part of it. More than one single sine wave is required to produce such a waveform, and thus a number of frequency bins indicate sine wave components being present. In order to address the problem of spectral leakage, the DFT is often used in conjunction with a windowing function. See also Basis Function, Discrete Cosine Transform, Discrete Fourier Transform - Redundant Computation, Fast Fourier Transform, Fourier, Fourier Analysis, Fourier Series, Fourier Transform, Frequency Response.

Discrete Fourier Transform, Redundant Computation: We can rewrite the form of the DFT in Eq. 96 as:

    X(k) = Σ_{n=0}^{N-1} x(n) W_N^{-kn},  for k = 0 to N/2    (100)

where W_N = e^{j2π/N}. Therefore calculating the DFT of a (trivial) signal with 8 samples requires:

    X(0) = x(0) + x(1) + x(2) + x(3) + x(4) + x(5) + x(6) + x(7)
    X(1) = x(0) + x(1)W_8^{-1} + x(2)W_8^{-2} + x(3)W_8^{-3} + x(4)W_8^{-4} + x(5)W_8^{-5} + x(6)W_8^{-6} + x(7)W_8^{-7}
    X(2) = x(0) + x(1)W_8^{-2} + x(2)W_8^{-4} + x(3)W_8^{-6} + x(4)W_8^{-8} + x(5)W_8^{-10} + x(6)W_8^{-12} + x(7)W_8^{-14}
    X(3) = x(0) + x(1)W_8^{-3} + x(2)W_8^{-6} + x(3)W_8^{-9} + x(4)W_8^{-12} + x(5)W_8^{-15} + x(6)W_8^{-18} + x(7)W_8^{-21}    (101)

However, note that there is redundant computation in Eq. 101. Consider the third term in the second line of Eq. 101:

    x(2)W_8^{-2} = x(2) e^{-j2π(2/8)} = x(2) e^{-jπ/2}    (102)

Now consider the computation of the third term in the fourth line of Eq. 101:

    x(2)W_8^{-6} = x(2) e^{-j2π(6/8)} = x(2) e^{-j3π/2} = x(2) e^{-jπ/2} e^{-jπ} = -x(2) e^{-jπ/2}    (103)

Therefore we can save one multiply operation by noting that x(2)W_8^{-6} = -x(2)W_8^{-2}. In fact, because of the periodicity of W_N^{-kn}, every term in the fourth line of Eq. 101 is available from the terms in the second line of the equation.
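The twiddle-factor redundancy of Eqs. 102 and 103 is easily verified numerically. A sketch:

```python
import cmath

# With W_8 = e^{j 2 pi / 8}, the term W_8^{-6} equals -W_8^{-2} (Eqs. 102-103).
W8 = cmath.exp(2j * cmath.pi / 8)

t2 = W8 ** -2   # e^{-j pi/2}
t6 = W8 ** -6   # e^{-j 3pi/2}
print(abs(t6 - (-t2)) < 1e-12)   # True

# More generally W_N^{-(k + N/2)} = -W_N^{-k}; this halving of distinct
# twiddle factors is what a radix-2 FFT exploits at every stage.
ok = all(abs(W8 ** -(k + 4) - (-(W8 ** -k))) < 1e-12 for k in range(4))
print(ok)   # True
```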
Hence a considerable saving in multiplicative computation can be achieved. This is the basis of the fast (discrete) Fourier transform, discussed under the entry Fast Fourier Transform.

Discrete Fourier Transform, Spectral Aliasing: Note that the discrete Fourier transform of a signal x(n) is periodic in the frequency domain. If we assume that the signal is real and was sampled above the Nyquist rate f_s, then there are no frequency components of interest above f_s/2. From the Fourier transform, if we calculate the frequency components up to frequency f_s/2, this is equivalent to evaluating the DFT for the first N/2 discrete frequency samples:

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},  for k = 0 to N/2 - 1    (104)

Of course if we evaluate for the next N/2 discrete frequencies (i.e. from f_s/2 to f_s) then:

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},  for k = N/2 to N - 1    (105)

In Eq. 105, if we substitute the variable i = N - k (i.e. k = N - i) and calculate over the range i = 1 to N/2 (equivalent to the range k = N/2 to N - 1) then:

    X(i) = Σ_{n=0}^{N-1} x(n) e^{-j2πin/N},  for i = 1 to N/2    (106)

and we can write:

    X(N - k) = Σ_{n=0}^{N-1} x(n) e^{-j2π(N-k)n/N}
             = Σ_{n=0}^{N-1} x(n) e^{-j2πn} e^{j2πkn/N}
             = Σ_{n=0}^{N-1} x(n) e^{j2πkn/N},  for k = N/2 to N - 1    (107)

since e^{-j2πn} = 1 for all integer values of n. For a real valued signal x(n), the final sum in Eq. 107 is simply the complex conjugate of X(k), and therefore:

    X(N - k) = X*(k),  and hence  |X(k)| = |X(N - k)|    (108)

Hence when we plot the DFT magnitude it is symmetrical about the N/2 frequency sample, i.e. about the frequency value f_s/2 Hz, depending on whether we plot the x-axis as a frequency index or a true frequency value. We can further easily show that if we take a value of frequency index k above N - 1 (i.e.
evaluate the DFT above the frequency f_s), then:

    X(k + mN) = Σ_{n=0}^{N-1} x(n) e^{-j2π(k+mN)n/N}
              = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N} e^{-j2πmn}
              = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N} = X(k)    (109)

where m is a positive integer, and we note that e^{-j2πmn} = 1. Therefore we can conclude that when evaluating the magnitude response of the DFT, the components of specific interest cover the (baseband) frequencies from 0 to f_s/2, and the magnitude spectrum will be symmetrical about the f_s/2 line and periodic with period f_s.

[Figure: spectral aliasing. The main portion of interest of the magnitude response is the "baseband" from 0 to f_s/2 Hz. The baseband spectrum is symmetrical about the point f_s/2 and thereafter periodic with period f_s Hz.]

See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Zero Padding, Fourier Analysis, Fourier Series, Fourier Transform.

Discrete Time: After an analog signal has been sampled at regular intervals, each sample corresponds to the signal magnitude at a particular discrete time. If the sampling period is τ seconds, then sampling a continuous time analog signal:

    x(t)    (110)

every τ seconds would produce the samples:

    x_n = x(n) = x(nτ),  for n = 0, 1, 2, 3, ...    (111)

For notational convenience the τ is usually dropped, and only the discrete time index, n, is used. Of course, any letter can be used to denote the discrete time index, although the most common are "n", "k" and "i".

[Figure: sampling an analog signal x(t) at 1000 Hz to produce the digital signal x(n).]
The sampling interval is therefore:

    τ = 1/1000 seconds

The sampled signal is denoted as x(n), where the explicit reference to τ has been dropped for notational convenience.

Distortion: If the output of a system differs from the input in a non-linear fashion then distortion has occurred. For example, if a signal is clipped by a DSP system then the output is said to be distorted. By the very nature of non-linear functions, a distorted signal will contain frequency components that were not present in the input signal. Distortion is also sometimes used to describe linear frequency shaping. See also Total Harmonic Distortion.

Distribution Function: See Random Variable.

Dithering (audio): Dithering is a technique whereby a very low level of noise is added to a signal in order to improve the quality of the psychoacoustically perceived sound. Although the addition of dithering noise to a signal clearly reduces the signal to noise ratio (SNR), because it actually adds more noise to the original signal, the overall sound is likely to be improved by breaking up the correlation between the various signal components and the quantization error (which, without dithering, results in the quantization noise being manifested as harmonic or tonal distortion).

One form of dithering adds a white noise dither signal, d(t), with a power of q²/12, where q is the quantization step size of the analog to digital converter (ADC), to the audio signal, x(t), prior to conversion:

[Figure: the analog dither signal d(t) is added to the input signal x(t) before the analog to digital converter (ADC), producing the dithered sampled output signal y(k).]

Note that without dithering, the quantization noise power introduced by the ADC is q²/12; after dithering, the noise power in the digital signal is q²/6, i.e. the noise power has doubled, or increased by 3 dB (10 log 2).
However, the dithered output signal will have decorrelated the quantization error of the ADC from the input signal, thus reducing the harmonic distortion components. This reduction improves the perceived sound quality. The following example illustrates dithering. A 600 Hz sine wave of amplitude 6.104 × 10^-5 (= 2/32767) volts was sampled at 48000 Hz with a 16 bit ADC with binary outputs from -32768 to 32767 over a voltage input range of ±1 volt.

After analog to digital conversion (with d(t) = 0, i.e. no dithering) the digital output has an amplitude of 2. On a full scale logarithmic plot, 2 corresponds to -84 dB (= 20 log(2/32767)), where the full scale amplitude of 32767 (= 2^15 - 1) is 0 dB. Time and frequency representations of the output of the ADC were computed, using a 16384 point FFT of the ADC output. The frequency representation of the 600 Hz sine wave clearly shows that the quantization noise manifests itself as harmonic distortion. Therefore, when this signal is reconverted to analog and replayed, the harmonic distortion may be audible. The magnitude frequency spectrum of the (undithered) signal clearly highlights the tonal distortion components which result from the conversion of this low level signal. The main distortion components are at 1800 Hz, 3000 Hz, 4200 Hz, and so on (i.e. at 3, 5, 7, ..., times the signal's fundamental frequency of 600 Hz).
However, if the signal is first dithered by adding an analog white noise dithering signal, d(t), of power q²/12 prior to ADC conversion, then the time and frequency representations of the ADC output change markedly. The frequency representation of the dithered 600 Hz sine wave clearly shows that the correlation between the signal and the quantization error has been removed. Therefore, if the signal is reconverted to analog and replayed, the quantization noise is now effectively whitened and harmonic distortion of the signal is no longer perceived. Note that the magnitude frequency spectrum of the dithered signal has a higher average noise floor, but the tonal nature of the quantization noise has been removed. This dithered signal is more perceptually tolerable to listen to, as the background white noise is less perceptually annoying than the harmonic noise generated without dithering.

Note that a common misconception is that dithering can be used to improve the quality of prerecorded 16 bit high fidelity audio signals. There are, however, no techniques by which a 16 bit CD output can be dithered to remove or reduce harmonic distortion, other than adding levels of noise to mask it! It may appear as if simply perturbing the quantized values would be a relatively simple and effective dithering technique, but there are a number of important differences between dithering before and after the quantizer. First, after the quantizer the noise is simply additive, and the spectra of the dither and the harmonically distorted signal add (this is the masking of the harmonic distortion referred to above, requiring a relatively high power dither). Additive dithering before quantization does not result in additive spectra, because the quantization is nonlinear.
Another difference can be thought of this way: the dither signal is much more likely to cause a change in the quantized level when the input analog signal is close to a quantization boundary (i.e., it does not have to move the signal value very far). After quantization, we have no way of knowing (in the general case) how close an input signal was to a quantization boundary, so mimicking the dither effect after the event is not, in general, possible. However, if a master 20 bit (or higher) resolution recording exists and is to be remastered to 16 bits, then digital dithering is appropriate, whereby the 20 bit signal can be dithered prior to requantizing to 16 bits. The benefits will be similar to those described above for ADCs.

Some simple mathematical analysis of the benefits of dithering for breaking up correlation between the signal and the quantization noise can be done.

[Figure: the correlation coefficient between a sine wave input signal and the quantization error, and the corresponding SNR, for 1 to 8 bits of signal resolution, plotted with no dither and with single bit dither.]

For low resolution signals the correlation between the signal and the quantization error is high; this will be heard as tonal or harmonic distortion. However, if a simple dithering scheme is applied prior to analog to digital conversion, the correlation can be greatly reduced. For less than 8 bits of resolution the correlation between the signal and quantization noise rises towards 0.4, and the signal will sound very (harmonically) distorted. Clearly the dither is successful at breaking up the correlation between signal and quantization noise, and the benefits are greatest at low resolutions.
However, the total quantization noise in the digital signal after dithering is increased by 3 dB for all bit resolutions.

A dither signal with a uniformly distributed probability density function (PDF) and maximum amplitude of half a bit (±q/2) is often used. Adding a single such dither signal successfully decorrelates the expected error; however, the second moment of the error remains correlated. To decorrelate the second order moment, a second uniformly distributed signal can be added. Higher order moments can be decorrelated by adding further uniformly distributed signals; however, it is found in practice that two uniform random variables (combining to give a triangular probability density function) are sufficient.

[Figure: when two uniformly distributed random variables d1 and d2, each with a uniform PDF over (-q/2, q/2), are added together, the PDF of the result, y, is a triangular PDF (TPDF) over (-q, q), obtained by a convolution of the PDFs of d1 and d2.]

The noise power added to the output signal by one uniform PDF dither signal is q²/12, and therefore with two of these dithering signals q²/6 of noise power is added to the output signal. Noting that the quantization noise power of the ADC is q²/12, the total noise power of an audio signal dithered with a TPDF is q²/4, i.e. the total noise power in the output signal has increased by a factor of 3, or by 4.8 dB (10 log 3), over the noise power from the ADC being used without dither.
Despite this increase in total noise, the noise power is now more uniformly distributed over frequency (i.e., more white, sounding like broadband hissing), and the harmonic distortion components caused by correlation between the quantization error and the input signal have been effectively attenuated.

In order to mathematically illustrate why dither works, an extreme case of low bit resolution will be addressed. For a single bit ADC (stochastic conversion) the quantizer is effectively reduced to a comparator, where:

    x(n) = sign(v(n)) = {  1,  v(n) ≥ 0
                          -1,  v(n) < 0    (112)

For a constant (dc) input signal of v(t) = V_0, then x(n) = 1 if V_0 > 0, regardless of the exact magnitude. However, by adding a dither signal d(n) with a uniform probability density function over the range -Q/2 to Q/2 before performing the conversion, such that:

    x(n) = {  1,  v(n) + d(n) ≥ 0
             -1,  v(n) + d(n) < 0    (113)

and taking the mean (expected) value of x(n) gives:

    E[x(n)] = E[sign(v(n) + d(n))] = E[sign(n'(n))]    (114)

where n'(n) is a random variable uniformly distributed over the range V_0 - Q/2 to V_0 + Q/2. Assuming |V_0| < Q/2, we can therefore show that the expected or mean value of the quantizer output is:

    E[x(n)] = (-1) ∫_{V_0 - Q/2}^{0} (1/Q) dn' + (+1) ∫_{0}^{V_0 + Q/2} (1/Q) dn'
            = (1/Q)(V_0 - Q/2) + (1/Q)(V_0 + Q/2) = (2/Q) V_0    (115)

Therefore, in the mean, the quantizer's average dithered output is proportional to V_0. The same intuitive argument holds for a time varying v(n), as long as the sampling rate is sufficiently fast compared to the changes in the signal. Dither can be further combined with oversampling techniques to perform noise shaped dithering. See also Analog to Digital Conversion, Digital to Analog Conversion, Digital Audio, Noise Shaping, Tonal Distortion.
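The single bit dithered quantizer of Eqs. 112 to 115 can be simulated with a simple Monte Carlo sketch (function name, trial count and test values are illustrative, not from the text):

```python
import random

# Monte Carlo sketch of Eqs. (112)-(115): a 1-bit quantizer (comparator)
# driven by a dc level V0 plus uniform dither over (-Q/2, Q/2). In the mean,
# the comparator output is proportional to V0: E[x] = 2*V0/Q, even though
# each individual output sample is only +1 or -1.
def dithered_mean(V0, Q=1.0, trials=200_000, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        d = rng.uniform(-Q / 2, Q / 2)
        total += 1 if V0 + d >= 0 else -1
    return total / trials

# With V0 = 0.1 and Q = 1, theory predicts E[x] = 0.2; the estimate is close.
print(dithered_mean(0.1))
```

Without the dither term the comparator would output a constant +1 for any V0 > 0, and all amplitude information would be lost; the dither converts the dc level into a duty cycle.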
Divergence: When an algorithm does not converge to a stable solution and instead progresses ever further away from a solution, it may be said to be diverging. See also the Convergence entry.

Divide and Conquer: The name given to the general problem solving strategy of first dividing the overall problem into a series of smaller subproblems, solving these subproblems, and finally using the solutions to the subproblems to give the overall solution. Some people also use this as an approach to competing against external groups or managing people within their own organization.

Division: Division is rarely required by real time DSP algorithms such as filtering, FFTs, correlation, adaptive algorithms and so on. Therefore DSP processors do not provide hardware for performing fast division in the same way that single cycle parallel multipliers are provided. Division is usually performed using a serial algorithm producing a bit-at-a-time result, or using an iterative technique such as Newton-Raphson. Processors such as the DSP56002 can perform a fixed point division in around 12 clock cycles. It is worth pointing out, however, that some DSP algorithms, such as the QR algorithm for adaptive signal processing, have excellent convergence and stability properties and do require division. Therefore it is possible that in the future some DSP devices may incorporate fast divide and square root units to allow these techniques to be implemented in real time. See also DSP Processor, Parallel Adder, Parallel Multiplier.

Dosemeter: See Noise Dosemeter.

Dot Product: See Vector Properties - Inner Product.

Downsampling: The sampling rate of a digital signal sampled at f_s can be downsampled by a factor of M to a sampling frequency f_d = f_s/M by retaining only every M-th sample. Downsampling can lead to aliasing problems and should be performed in conjunction with a low pass filter that cuts off at f_s/2M; this combination is usually referred to as a decimator.
See also Aliasing, Upsampling, Decimation, Interpolation, Fractional Sampling Rate Conversion.

[Figure: downsampling by a factor of M = 4. The input x(k) at sampling rate f_s = 1/t_s is reduced to the output y(k) at rate f_d = f_s/M, and the output spectrum |Y(f)| is periodic with period f_d.]

Dr. Bub: The electronic bulletin board operated by Motorola, providing public domain source code, and Motorola DSP related information and announcements.

Driver: The power output from a DAC is usually insufficient to drive an actuator such as a loudspeaker. Although the voltage may be at the correct level, the DAC cannot source enough current to deliver the required power. Therefore a driver in the form of an amplifier is required. See also Signal Conditioning.

[Figure: signal chain from DSP processor through DAC to driver amplifier.]

DSP Board: A DSP board is a generic name for a printed circuit board (PCB) which has a DSP processor, memory, A/D and D/A capabilities, and digital input ports (parallel and serial). For development work most DSP boards are plug-in modules for computers such as the IBM-PC and Macintosh. The computer is used as a host to allow assembly language programs to be conveniently developed and tested using assemblers and cross compilers. When an application has been fully developed, a stand-alone DSP board can be realized. See also Daughter Module, DSP Processor, Motherboard.

[Figure: a typical DSP board. The DSP processor connects via address and data buses to ROM, RAM, a digital to analog converter, an analog to digital converter, parallel and serial I/O, and an interface to the host computer.]

DSP Processor: A microprocessor that has been designed for implementing DSP algorithms. The main features of these chips are fast interrupt response times, a single cycle parallel multiplier, and a subset of the assembly language instructions found on a general purpose microprocessor (e.g. Motorola 68030), to save on silicon area and optimize DSP type instructions.
The main DSP processors are the families of the DSP56/96 (Motorola), TMS320 (Texas Instruments), ADSP 2100 (Analog Devices), and DSP16/32 (AT&T). DSP processors are either floating point or fixed point devices. See also DSP Board.

[Figure: a generic DSP processor, comprising data and address registers, a parallel multiplier, an arithmetic logic unit, an instruction decoder, an interrupt handler, timers, RAM, ROM and EPROM, connected by address, data and control buses.]

DSPLINK(TM): A bidirectional, parallel, 16 bit data interface path used on Loughborough Sound Images Ltd. (UK) and Spectron (USA) DSP boards to allow high speed communication between separate DSP boards and peripheral boards. The use of DSPLINK means that separate boards in a PC do not need to exchange data via the PC bus.

Dual: A prefix meaning "two of". For example, the Burr Brown DAC2814 chip is described as a Dual 12 Bit Digital to Analog Converter (DAC), meaning that the chip has two separate (or independent) DACs. In the case of DACs and ADCs, if the device is used for high fidelity audio, dual devices are often referred to as stereo. See also Quad.

Dual Slope: A type of A/D converter.

Dual Tone Multifrequency (DTMF): DTMF is the basis of operation of push button tone dialing telephones. Each button on a touch tone telephone is a combination of two frequencies, one from each of two groups of four; 4 × 4 = 16 possible combinations of tone pairs can be encoded using the two groups of four tones. The two groups of four frequencies are: (low) 697 Hz, 770 Hz, 852 Hz, 941 Hz, and (high) 1209 Hz, 1336 Hz, 1477 Hz, and 1633 Hz:

                1209 Hz   1336 Hz   1477 Hz   1633 Hz
    697 Hz        1          2          3        A
    770 Hz        4          5          6        B
    852 Hz        7          8          9        C
    941 Hz        *          0          #        D

    Each button on the keypad is a combination of two DTMF frequencies. (Note most telephones do not have keys A, B, C, D.)

The standards for DTMF signal generation and detection are given in the ITU (International Telecommunication Union) standards Q.23 and Q.24.
In current telephone systems, virtually every telephone now uses DTMF signalling to allow transmission of a 16 character alphabet for applications such as number dialing, data entry, voice mail access, password entry and so on. The DTMF specifications commonly adopted are:

Signal frequencies:
• Low group: 697, 770, 852, 941 Hz
• High group: 1209, 1336, 1477, 1633 Hz

Frequency tolerance:
• Operation: ≤ 1.5%

Power levels per frequency:
• Operation: 0 to -25 dBm
• Non-operation: -55 dBm max

Power level difference between frequencies:
• +4 dB to -8 dB

Signal reception timing:
• Signal duration (operation): 40 ms (min)
• Signal duration (non-operation): 23 ms (max)
• Pause duration: 40 ms (min)
• Signal interruption: 10 ms (max)
• Signalling velocity: 93 ms/digit (min)

See also Dual Tone Multifrequency - Tone Detection, Dual Tone Multifrequency - Tone Generation, Goertzel's Algorithm.

Dual Tone Multifrequency (DTMF), Tone Generation: One method to generate a tone is to use a sine wave look up table. For example, some members of the Motorola DSP56000 series of processors include a ROM encoded 256 element sine wave table which can be used for this purpose. Noting that each DTMF signal is a sum of two tones, it is possible to use a look up table stepped at different rates to produce a DTMF tone. An easier method is to design a "marginally stable" IIR (infinite impulse response) filter whereby the poles of the filter are on the unit circle and the filter impulse response is a sinusoid at the desired frequency. This method of tone generation requires only a few lines of DSP code, and avoids the requirement for "expensive" look up tables. The structure of an IIR filter suitable for tone generation is simple:

[Figure: a two pole "marginally stable" IIR filter with feedback coefficients b_1 and b_2 = -1. For an impulse input the filter begins to oscillate, producing a sinusoidal output.]
The operation of this 2 pole filter can be analysed by considering the z-domain representation. The discrete time equation for this filter is:

  y(k) = x(k) + Σ_{n=1}^{2} b_n y(k-n) = x(k) + b y(k-1) - y(k-2)    (116)

where we now write b_1 = b and b_2 = -1. Writing this in the z-domain gives:

  Y(z) = X(z) + b z^{-1} Y(z) - z^{-2} Y(z)    (117)

The transfer function, H(z), is therefore:

  H(z) = Y(z)/X(z) = 1 / (1 - b z^{-1} + z^{-2})
       = 1 / [(1 - p_1 z^{-1})(1 - p_2 z^{-1})]
       = 1 / [1 - (p_1 + p_2) z^{-1} + p_1 p_2 z^{-2}]    (118)

where p_1 and p_2 are the poles of the filter, and b = p_1 + p_2 and p_1 p_2 = 1. The poles of the filter, p_{1,2} (where the notation p_{1,2} means p_1 and p_2), can be calculated from the quadratic formula as:

  p_{1,2} = [b ± sqrt(b^2 - 4)] / 2 = [b ± j sqrt(4 - b^2)] / 2    (119)

Given that b is a real value, p_1 and p_2 are complex conjugates. Rewriting Eq. 119 in polar form gives:

  p_{1,2} = e^{±j tan^{-1}[sqrt(4 - b^2)/b]}    (120)

Considering the denominator polynomial of Eq. 118, the magnitudes of the complex conjugate values p_1 and p_2 are necessarily both 1, and the poles will lie on the unit circle. In terms of the frequency placement of the poles, noting that this is given by:

  p_{1,2} = 1 · e^{±j 2πf/f_s}    (121)

(where |e^{jω}| = 1 for any ω) for a sampling frequency f_s, from Eqs. 121 and 120 it follows that:

  2πf/f_s = tan^{-1}[sqrt(4 - b^2)/b]    (122)

For most telecommunication systems the sampling frequency is f_s = 8000 Hz. The values of b for the various desired DTMF frequencies of oscillation can therefore be calculated from Eq.
122 to be:

  frequency f (Hz)      b
  697                   1.707737809
  770                   1.645281036
  852                   1.568686984
  941                   1.478204568
  1209                  1.164104023
  1336                  0.996370211
  1477                  0.798618389
  1633                  0.568532707

For example, in order to generate the DTMF signal for the digit 1, it is required to produce two tones, one at 697 Hz and one at 1209 Hz. This can be accomplished by using two IIR filters in parallel:

[Figure: an IIR filter to produce the DTMF signal for the digit 1, consisting of two "marginally stable" two pole IIR filters, one with b = 1.707737... producing the 697 Hz tone and one with b = 1.164104... producing the 1209 Hz tone, whose outputs are added together. Both filters are driven by the same impulse input x(k) and the dual tone appears at the output y(k).]

Note that the filters will have different magnitude responses and therefore the two tones are unlikely to have the same amplitude. The ITU standard allows for this amplitude difference. See also Dual Tone Multifrequency (DTMF) - Tone Detection, Goertzel's Algorithm.

Dual Tone Multifrequency (DTMF), Tone Detection: DTMF tones can be detected by performing a discrete Fourier transform (DFT) and considering the level of power present in a particular frequency bin. Because DTMF tones are often used in situations where speech may also be present, it is important that any detection scheme used can distinguish between a tone and a speech signal that happens to have strong tonal components at a DTMF frequency. Therefore for a DTMF tone at f Hz, a detection scheme should check for the signal component at f Hz and also check that there is no discernible component at 2f Hz; quasi-periodic speech components (such as vowel sounds) are rich in (even) harmonics, whereas DTMF tones are not. The number of samples used in calculating the DFT should be shorter than the number of samples in half of a DTMF signalling interval (typically of 50 ms duration, equivalent to 400 samples at a sampling frequency of f_s = 8000 Hz), but be large enough to give a good frequency resolution.
The DTMF standards of the International Telecommunication Union (ITU) therefore suggest a value of 205 samples in standards Q.23 and Q.24. Using this 205 point DFT the DTMF fundamental and the second harmonics of the 8 possible tones can be successfully discerned. Simple decision logic is applied to the DFT output to specify which tone is present. The second harmonic is also detected in order that the tones can be discriminated from speech utterances that happen to include a frequency component at one of the 8 frequencies. Speech can have very strong harmonic content, whereas a DTMF tone will not. To add robustness against noise, the same DTMF tone must be detected in several consecutive analysis frames to give a valid DTMF signal.

If a 205 point DFT is used, then the frequency resolution will be:

  Frequency Resolution = 8000/205 = 39.02 Hz    (123)

The DTMF tones therefore do not all lie exactly on the frequency bins. For example the tone at 770 Hz will be detected at the frequency bin of 780 Hz (20 × 39.02 Hz). In general the frequency bin, k, in which to look for a single tone is the nearest integer to:

  k = f_tone · N / f_s    (124)

where f_tone is a DTMF frequency, N = 205 and f_s = 8000 Hz. The bins for all of the DTMF tones for these parameters are therefore:

  frequency f (Hz)    bin k
  697                 18
  770                 20
  852                 22
  941                 24
  1209                31
  1336                34
  1477                38
  1633                42

When the 2nd harmonic of a DTMF frequency is to be considered, the bin at twice the fundamental frequency bin value is examined (there should be no appreciable signal power there for a DTMF tone). When calculating the DFT for DTMF detection, because we are only interested in certain frequencies, it is only necessary to calculate the frequency components at the frequency bins of interest. Therefore an efficient algorithm based on the DFT, called Goertzel's algorithm, is usually used for DTMF tone detection. See also Dual Tone Multifrequency, Dual Tone Multifrequency - Tone Generation, Goertzel's Algorithm.
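Single-bin detection with Goertzel's algorithm can be sketched in Python as follows (an illustrative fragment; the function and variable names are ours):

```python
import math

def goertzel_power(samples, k, n=205):
    """Squared magnitude of DFT bin k of an n-point block, computed
    recursively with Goertzel's algorithm (one bin, no full DFT)."""
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples[:n]:
        s0 = x + coeff * s1 - s2   # second-order recursion
        s2, s1 = s1, s0
    # standard Goertzel power formula after n samples
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

fs, n = 8000.0, 205
tone = [math.sin(2 * math.pi * 770.0 * i / fs) for i in range(n)]
p_770 = goertzel_power(tone, 20)    # bin 20 (~780 Hz) holds the 770 Hz tone
p_1633 = goertzel_power(tone, 42)   # bin 42 (~1633 Hz) sees only leakage
```

A detector would compare the powers in the 8 DTMF bins (and their second harmonic bins) against thresholds over several consecutive 205-sample blocks.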
Dynamic Link Library: A library of compiled software routines in a separate file on disk that can be called by a Microsoft Windows program.

Dynamic RAM (DRAM): Random access memory that needs to be periodically refreshed (electrically recharged) so that information that is stored electrically is not lost. See also Non-volatile RAM, Static RAM.

Dynamic Range: Dynamic range specifies the numerical range, giving an indication of the largest and smallest values that can be correctly represented by a DSP system. For example if 16 bits are used in a system then the linear (amplitude) dynamic range is -2^15 to 2^15 - 1 (-32768 to +32767). Usually dynamic range is given in decibels (dB), calculated from 20 log10(linear range), e.g. for 16 bits 20 log10(2^16) ≈ 96 dB.

E

e: The natural logarithm base, e = 2.7182818... . e can be derived by taking the following limit:

  e ≡ lim_{n→∞} (1 + 1/n)^n    (125)

See also Exponential Function.

Ear: The ear is basically the system of flesh, bone, nerves and brain allowing mammals to perceive and react to sound. It is probably fair to say that a very large percentage of DSP is dealing with the processing, coding and reproduction of audio signals for presentation to the human ear.

[Figure: a simplified diagram of the human ear, showing the pinna, auditory canal, eardrum, inner ear bones, semicircular canals, cochlea and the cochlear nerves to the brain.]

The human ear can be generally described as consisting of three parts: the outer, middle and inner ear. The outer ear consists of the pinna and the ear canal. The shape of the external ear has evolved such that it has good sensitivity to frequencies in the range 2 - 4 kHz. Its complex shape provides a number of diffracted and reflected acoustic paths into the middle ear which will modify the spectrum of the arriving sound. As a result a single ear can actually discriminate the direction of arrival of broadband sounds. The ear canal leads to the ear drum (tympanic membrane) which can flex in response to sound.
Sound is then mechanically conducted through the middle ear's interconnection of bones (the ossicles) - the malleus (hammer), the incus (anvil) and the stapes (stirrup) - which act as an impedance matching network (with the ear drum and the oval window of the cochlea) to improve the transmission of acoustic energy to the inner ear. Muscular suppression of the ossicle movement provides for additional compression of very loud sounds. The inner ear consists mainly of the cochlea and the vestibular system, which includes the semicircular canals (these are primarily used for balance). The cochlea is a fluid filled snail-shell shaped organ that is divided along its length by two membranes. Hair cells attached to the basilar membrane detect the displacement of the membrane along the distance from the oval window to the end of the cochlea. Different frequencies are mapped to different spots along the basilar membrane: the further the distance from the oval window, the lower the frequency. The basilar membrane and its associated components can be viewed as acting like a series of bandpass filters sending information to the brain to interpret [30]. In addition, the output of these filters is logarithmically compressed. The combination of the middle and inner ear mechanics allows signals to be processed over the amazing dynamic range of 120 dB. See also Audiology, Audiometer, Audiometry, Auditory Filters, Hearing Impairment, Threshold of Hearing.

EBCDIC: See also ASCII.

Echo: When a sound is reflected off a nearby wall or object, this reflection is called an echo. Subsequent echoes (of echoes), as would be clearly heard in a large, empty room, are referred to collectively as reverberations. Echoes also occur on telecommunication systems where impedance mismatches reflect a signal back to the transmitter. Echoes can sometimes be heard on long distance telephone calls. See also Echo Cancellation, Reverberation.
Echo Cancellation: An echo canceller can be realised [53] with an adaptive signal processing system identification architecture. For example if a telephone line is causing an echo then by incorporating an adaptive echo canceller it should be possible to attenuate this echo:

[Figure: a simple adaptive echo canceller. The input signal A is fed both to the echo "generator" (e.g. a hybrid telephone connection to speaker B) and to an adaptive filter; the adaptive filter's simulated echo of A is subtracted from the returned signal (B + echo of A) to form the output. The success of the cancellation will depend on the statistics and relative powers of the signals A and B.]

When speaker A (or data source A) sends information down the telephone line, mismatches in the telephone hybrids can cause echoes to occur. Therefore speaker A will hear an echo of their own voice, which can be particularly annoying if the echo path from the near and far end hybrids is particularly long. (Some echo to the earpiece is often desirable for telephone conversation, and the local hybrid is deliberately mismatched. However for data transmission echo is very undesirable and must be removed.) If the echo generating path can be suitably modelled with an adaptive filter, then a negative simulated echo can be added to cancel out the signal A echo. At the other end of the line, telephone user B can also have an echo canceller. In general local echo cancellation (where the adaptive echo canceller is inside the consumer's telephone/data communication equipment) is only used for data transmission and not speech. Minimum specifications for the ITU V-series of recommendations can be found in the CCITT Blue Book. For V.32 modems (9600 bits/sec with trellis coded modulation) an echo reduction ratio of 52 dB is required. This is a power reduction of around 160,000 in the echo; hence the requirement for a powerful DSP processor.
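The system identification structure above can be sketched with a least mean squares (LMS) adaptive FIR filter. The Python below is an illustrative model only; the filter length and step size are assumed example values, not figures from any standard:

```python
def lms_echo_canceller(far_end, mic, n_taps=32, mu=0.01):
    """Illustrative LMS adaptive echo canceller (system identification).
    far_end: samples of signal A driving the echo path.
    mic: the returned signal, i.e. near-end signal B plus the echo of A.
    Returns the error signal: mic with the simulated echo of A removed."""
    w = [0.0] * n_taps        # adaptive FIR weights
    buf = [0.0] * n_taps      # delay line of recent far-end samples
    out = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # simulated echo
        e = d - y                                    # echo-cancelled output
        # LMS weight update: w <- w + mu * e * input vector
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

With a stationary echo path and a suitable step size mu, the residual echo power in the output decays as the filter converges towards the echo path response.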
For long distance telephone calls where the round trip echo delay is more than 0.1 seconds and suppressed by less than 40 dB (this is typical via satellite or undersea cables) line echo on speech can be a particularly annoying problem. Before adaptive echo cancellers were cost effective to implement, the echo problem was solved by setting up speech detectors and making the speech channel half duplex. This was inconvenient for speakers, who were required to take turns speaking. Adaptive echo cancellers at telephone exchanges have helped to solve this problem. The set up of the telephone exchange echo cancellers is a little different from the above example: the echo is cancelled on the outgoing signal line, rather than the incoming signal line. See also Acoustic Echo Cancellation, Adaptive Filtering, Least Mean Squares Algorithm.

Eigenanalysis: See Matrix Decompositions - Eigenanalysis.

Eigenvalue: See Matrix Decompositions - Eigenanalysis.

Eigenvector: See Matrix Decompositions - Eigenanalysis.

Eight to Fourteen Modulation (EFM): EFM is used in compact disc (CD) players to convert 8 bit symbols to a 14 bit word using a look-up table [33]. When the 14 bit words are used, fewer 1-0 and 0-1 transitions are needed than would be the case with the 8 bit words. In addition, the presence of the transitions is guaranteed. This allows required synchronization information to be placed on the disc for every possible data set. In addition, the forced presence of zeros allows the transitions (ones) to occur less frequently than would otherwise be the case. This increases the playing time since more bits can be put on a disc with a fixed minimum feature size (i.e., pit size). See also Compact Disc.

Electrocardiogram (ECG): The general name given to the electrical potentials of the heart sensed by electrodes placed externally on the body (i.e., surface leads) [48].
These potentials can also be sensed by placing electrodes directly on the heart, as is done with implantable devices (sometimes referred to as pacemakers). The bandwidth used for a typical clinical ECG signal is about 0.05 - 100 Hz. The peak amplitude of a sensed ECG signal is about 1 mV, and for use in a DSP system the ECG will typically need to be amplified by a low noise amplifier with a gain of about 1000 or more.

[Figure: an example ECG trace, amplitude (mV) against time (secs).]

Electroencephalogram (EEG): The EEG measures small microvolt potentials induced by the brain that are picked up by electrodes placed on the head [48]. The frequency range of interest is about 0.5 - 60 Hz. A number of companies are now making multichannel DSP acquisition boards for recording EEGs at sampling rates of a few hundred Hertz.

Electromagnetic Interference (EMI): Unwanted electromagnetic radiation resulting from energy sources that interfere with or modulate desired electrical signals within a system.

Electromagnetic Compatibility (EMC): With the proliferation of electronic circuit boards in virtually every walk of life, particular care must be taken at the design stage to avoid the electronics acting as a transmitter of high frequency electromagnetic waves. In general a strip of wire with a high frequency current passing through it can act as an antenna and transmit radio waves. The harmonic content from a simple clock in a simple microprocessor system can easily give off radio signals that may interfere with nearby radio communications devices, or other electronic circuitry. A number of EMC regulations have recently been introduced to guard against unwanted radio wave emissions from electronic systems.

Electromagnetic Spectrum: Electromagnetic waves travel through space at approximately 3 × 10^8 m/s, i.e. the speed of light. In fact, light is a form of electromagnetic radiation for which we have evolved sensors (eyes).
The various broadcasting bands are classified as very low (VLF), low (LF), medium (MF), high (HF), very high (VHF), ultra high (UHF), super high (SHF), and extremely high frequencies (EHF). One of the most familiar bands in everyday life is VHF (very high), used by FM radio stations.

  VLF   3 kHz - 30 kHz
  LF    30 kHz - 300 kHz
  MF    300 kHz - 3 MHz    (AM radio)
  HF    3 MHz - 30 MHz
  VHF   30 MHz - 300 MHz   (FM radio)
  UHF   300 MHz - 3 GHz
  SHF   3 GHz - 30 GHz     (satellite)
  EHF   30 GHz - 300 GHz

Above the EHF band lie the infrared and visible light regions of the spectrum.

Electromyogram (EMG): Signals sensed by electrodes placed inside muscles of the body. The frequency range of interest is 10 - 200 Hz.

Electroreception: Electroreception is a means by which fish, animals and birds use electric fields for navigation or communication. There are two types of electric fish: "strongly electric", such as the electric eel, which can use its electrical energy as a defense mechanism; and "weakly electric", which applies to many common sea and freshwater fish who use electrical energy for navigation and perhaps even communication [151]. Weakly electric fish can have one of two differing patterns of electric discharge: (1) continuous wave, where a tone-like signal is output at frequencies of between 50 and 1000 Hz, and (2) pulse wave, where trains of pulses lasting about a millisecond are spaced about 25 milliseconds apart. The signals are generated by a special tubular organ that extends almost from the fish's head to its tail. By sensing the variation in electrical conductivity caused by objects distorting the electric field, an electrical image of the surroundings can be conveyed to the fish via receptors on its body. The relatively weak electric field, however, means that fish are in general electrically short sighted and cannot sense distances of more than one or two fish lengths away. However this is enough to avoid rocks and other poor electrical conductors, which cast electrical shadows that the fish can pick up on. See also Mammals.
Elementary Signals: A set of elementary signals can be defined which have certain properties and can be combined in a linear or non-linear fashion with time shifts and periodic extensions to create more complicated signals. Elementary signals are useful for the mathematical analysis and description of signals and systems [47]. Although there is no universally agreed list of elementary signals, a list of the most basic functions is likely to include:

1. Unit Step;
2. Unit Impulse;
3. Rectangular Pulse;
4. Triangular Pulse;
5. Ramp Function;
6. Harmonic Oscillation (sine and cosine waves);
7. Exponential Functions;
8. Complex Exponentials;
9. Mother Wavelets and Scaling Functions.

Both analog and discrete versions of the above elementary signals can be defined. Elementary signals are also referred to as signal primitives. See also Convolution, Fourier Transform Properties, Impulse Response, Sampling Property, Unit Impulse Function, Unit Step Function.

Elliptic Filter: See Filters.

Embedded Control: DSP processors and associated A/D and D/A channels can be used for control of a mechanical system. For example a feedback control algorithm could be used to control the revolution speed of the blade in a sheet metal cutter. Typically the term embedded will imply a real-time system.

Emulator: A hardware board or device which has (hopefully!) the same functionality as an actual DSP chip, and can be used conveniently and effectively for developing and debugging applications before actual implementation on the DSP chip.

Endfire: A beamformer configuration in which the desired signal is located along a line that contains a linear array of sensors. The output of sensor n is delayed by τ_n = d_{1,n}/c, where d_{1,n} is the distance from the first sensor to sensor n and c is the propagation velocity, before the M delayed sensor signals are combined in a summer or DSP processor. See also Broadside, Superdirectivity.

[Figure: an endfire array of M sensors with the look direction along the array axis; each sensor output passes through a delay τ_1 ... τ_M into a summer or DSP processor, with τ_n = d_{1,n}/c.]

Engaged Tone: See Busy Tone.

Ensemble Averages: A term used interchangeably with statistical average. See Expected Value.
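The endfire steering delay τ_n = d_{1,n}/c above can be sketched as follows (an illustrative fragment; the 343 m/s sound speed and 0.1 m spacings are assumed example values):

```python
def endfire_delays(spacings, c=343.0):
    """Steering delays for an endfire delay-and-sum beamformer.
    spacings: distance d_{1,n} of each sensor from the first sensor (metres).
    c: propagation velocity (343 m/s, an assumed value for sound in air).
    Delaying sensor n by tau_n = d_{1,n}/c aligns a wavefront arriving
    along the array axis so that the sensor outputs sum coherently."""
    return [d / c for d in spacings]

# four sensors spaced 0.1 m apart along the look direction
taus = endfire_delays([0.0, 0.1, 0.2, 0.3])
```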
Entropy: See Information Theory.

Entropy Coding: Any type of data compression technique which exploits the fact that some symbols are likely to occur less often than others, and assigns fewer bits to the more frequent symbols. For example the letter "e" occurs more often in the English language than the letter "z"; therefore the transmission code for "e" may only use 2 bits, whereas the transmission code for "z" might require 8 bits. The technique can be further enhanced by assigning codes to common groups of letters such as "ch" or "sh". See also Huffman Coding.

Equal Loudness Contours: Equal loudness contours give a measure of the actual SPL of a sound compared to the perceived or judged loudness, i.e. a purely subjective measure. The equal loudness contours are therefore presented for equal phons (the subjective measure of loudness).

[Figure: equal loudness contours from the threshold of hearing up to 120 phons, plotted as SPL (dB) against frequency (Hz) over the range 50 Hz to 20 kHz.]

The curves are obtained by averaging over a large cross section of the population who do not have hearing impairments [30]. These measurements were first performed by Fletcher and Munson in 1933 [73], and later by Robinson and Dadson in 1956 [126]. See also Audiometry, Auditory Filters, Frequency Range of Hearing, Hearing, Loudness Recruitment, Sound Pressure Level, Sound Pressure Level Weighting Curves, Spectral Masking, Temporal Masking, Temporary Threshold Shift, Threshold of Hearing, Ultrasound.

Equal Tempered Scale: See Equitempered Scale.

Equalisation: If a signal is passed through a channel (e.g., it is filtered) and the effects of the channel on the signal are removed by making an inverse channel filter using DSP, then this is referred to as equalization.
Equalization attempts to restore the frequency and phase characteristic of the signal to the values prior to transmission, and is widely used in telecommunications to maximize the reliable transmission data rate and reduce errors caused by the channel frequency and phase response. Equalization implementations are now commonly found in FAX machines and telephone modems. Most equalization algorithms are adaptive signal processing least squares or least mean squares based. See also Inverse System Identification.

[Figure: a telephone channel between the USA and Scotland with frequency response T(f), followed at the receiver by an A/D converter and an equalization digital filter with response E(f); the combined response |T(f)E(f)|^2 is flat over the 4 kHz channel bandwidth.]

Equitempered Scale: Another name for the well known Western music scale of 12 musical notes in an octave, where the ratio of the fundamental frequencies of adjacent notes is a constant of value 2^{1/12} = 1.0594631... . The frequency difference between adjacent notes on the equitempered scale is therefore about 6%. The difference between the logarithms of the fundamental frequencies of adjacent notes is therefore a constant:

  log(2^{1/12}) = 0.0250858...    (126)

Hence if a piece of digital music is replayed at a sampling rate that mismatches the original by 6% or more, the key of the music will be changed (as well as everything sounding that little bit faster or slower!). See also Music, Music Synthesis, Western Music Scale.

Equivalent Sound Continuous Level (Leq): Sound pressure level, in units of dB (SPL), gives a measure of the instantaneous level of sound.
To produce a measure of averaged or integrated sound pressure level over a time interval T, the equivalent sound continuous level can be calculated [46]:

  L_{eq,T} = 10 log10 [ (1/T) ∫_0^T P^2(t) dt / P_ref^2 ]    (127)

where P_ref is the standard SPL reference pressure of 2 × 10^-5 N/m^2 = 20 µPa, and P(t) is the time varying sound pressure. If a particular sound pressure level weighting curve was used, such as the A-weighting scale, then this may be indicated as L_{Aeq,T}. Leq measurements can usually be calculated by good quality SPL meters which will average the sound over a specified time, typically from a few seconds to a few minutes. SPL meters which provide this facility will correspond to IEC 804: 1985 (and BS 6698 in the UK). See also Hearing Impairment, Sound Exposure Meters, Sound Pressure Level, Sound Pressure Level Weighting Curves, Threshold of Hearing.

Ergodic: If a stationary random process (i.e., a signal) is ergodic, then its statistical averages (or ensemble averages) equal the time averages of a single realization of the process. For example given a signal x(n), with a probability density function p{x(n)}, the mean or expected value is calculated from:

  Mean of x(n) = E{x(n)} = Σ_n x(n) p{x(n)}    (128)

and the mean squared value is calculated as:

  Mean Squared Value of x(n) = E{[x(n)]^2} = Σ_n [x(n)]^2 p{x(n)}    (129)

For a stationary signal the probability density function, or a number of realizations of the signal, may be difficult or inconvenient to obtain. Therefore if the signal is ergodic the time averages can be used:

  E{x(n)} ≈ [1/(M_2 - M_1)] Σ_{n=M_1}^{M_2-1} x(n)   for large (M_2 - M_1)    (130)

and

  E{[x(n)]^2} ≈ [1/(M_2 - M_1)] Σ_{n=M_1}^{M_2-1} [x(n)]^2   for large (M_2 - M_1)    (131)

See also Expected Value, Mean Value, Mean Squared Value, Variance, Wide Sense Stationarity.
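The time-average approximations of Eqs. 130 and 131 can be illustrated numerically (a sketch with an assumed uniformly distributed process on [-1, 1], for which the true mean is 0 and the true mean squared value is 1/3):

```python
import random

random.seed(1)
# one realization of a stationary, ergodic process: i.i.d. uniform on [-1, 1]
x = [random.uniform(-1.0, 1.0) for _ in range(200000)]

n = len(x)
mean_est = sum(x) / n                  # time-average estimate of E{x(n)}
msq_est = sum(v * v for v in x) / n    # time-average estimate of E{[x(n)]^2}

# for this process, mean_est should be close to 0 and msq_est close to 1/3
```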
Error Analysis: When the cumulative effect of arithmetic round-off errors in an algorithm is calculated, this is referred to as an error analysis. Most error analysis is performed from consideration of relative and absolute errors of quantities. For example, consider two real numbers x and y, that are estimated as x′ and y′ with absolute errors ∆x and ∆y. Therefore:

  x = x′ + ∆x,   y = y′ + ∆y    (132)

If x and y are added:

  w = x + y    (133)

then the error, ∆w, caused by adding the estimated quantities such that w′ = x′ + y′ is calculated by noting that:

  w = w′ + ∆w = x′ + ∆x + y′ + ∆y    (134)

and therefore:

  ∆w = ∆x + ∆y    (135)

Therefore the (worst case) error caused by adding (or subtracting) two values is calculated as the sum of the absolute errors. When the product z = xy is formed then:

  z = xy = (x′ + ∆x)(y′ + ∆y) = x′y′ + ∆x y′ + ∆y x′ + ∆x ∆y    (136)

Using the estimated quantities to calculate z′ = x′y′, the product error, ∆z, is given by:

  ∆z = z - z′ = ∆x y′ + ∆y x′ + ∆x ∆y    (137)

If we assume that the quantities ∆x and ∆y are small with respect to x′ and y′, then the term ∆x ∆y can be neglected and the error in the product given by:

  ∆z ≅ ∆x y′ + ∆y x′    (138)

Dividing both sides of the equation by z, we can express the relative error in z as the sum of the relative errors of x and y:

  ∆z/z ≅ ∆x/x + ∆y/y    (139)

The above two results can be used to simplify the error analysis of the arithmetic of many signal processing algorithms. See also Absolute Error, Quantization Noise, Relative Error.

Error Budget: See Total Error Budget.

Error Burst: See Burst Errors.

Error Performance Surface: See Wiener-Hopf Equations.

Euclidean Distance: Loosely, Euclidean distance is simply linear distance, i.e., distance "as the crow flies". More specifically, Euclidean distance is the square root of the sum of the squared differences between two vectors. One example would be the distance between the endpoints of the hypotenuse of a right triangle.
This distance satisfies the Pythagorean Theorem, i.e., the square root of the sum of the squares. See also Hamming Distance, Viterbi Algorithm.

Euler's Formula: An important mathematical relationship in dealing with complex numbers and harmonic relationships is given by Euler's formula:

  e^{jθ} = cos θ + j sin θ    (140)

If we think of e^{jθ} as being a 2-dimensional unit length vector (or phasor) that rotates around the origin as θ is varied, then the real part (cos θ) is given by the projection of that vector onto the x-axis, and the imaginary part (sin θ) is given by the projection of that vector onto the y-axis.

European Broadcast Union (EBU): The EBU define standards and recommendations for broadcast of audio, video and data. The EBU has a special relationship with the European Telecommunications Standards Institute (ETSI) through which joint standards are produced such as NICAM 728 (ETS 300 163). [ISDN is defined by the ITU-T as] "a network, in general evolving from a telephony integrated digital network (IDN), that provides end to end connectivity to support a wide range of services including voice and non-voice services, to which users have a limited set of standard multi-purpose user network interfaces." The ITU-T I-series of recommendations fully defines the operation and existence of ISDN. See also European Telecommunications Standards Institute, International Telecommunication Union, International Organisation for Standards, Standards, I-series Recommendations, ITU-T Recommendations.

European Telecommunications Standards Institute (ETSI): ETSI provides a forum at which all European countries sit to decide upon telecommunications standards. The institute was set up in 1988 for three main reasons: (1) the global (ISO/IEC) standards often left too many questions open; (2) they often do not prescribe enough detail to achieve interoperability; (3) Europe cannot always wait for other countries to agree or follow the standards of the USA and Asia.
ETSI has 12 committees covering telecommunications, wired fixed networks, satellite communications, radio communications for the fixed and mobile services, testing methodology, and equipment engineering. ETSI were responsible for the recommendations of GSM (Groupe Spécial Mobile, or Global System for Mobile Communications). See also Comité Européen de Normalisation Electrotechnique, International Telecommunication Union, International Organisation for Standards, Standards.

Evaluation Board: A printed circuit board produced in volume by a company, and intended for evaluation and benchmarking purposes. An evaluation board is often a cut down version of a production board available from the company. A DSP evaluation board is likely to have limited memory available, use a slow clock DSP processor, and be restricted in its convenient expandability. See also DSP Board.

Even Function: The graph of an even function is symmetric about the y-axis such that y = f(x) = f(-x). This simple 1-dimensional intuition is quickly extended to more complex functions by noting that the basic requirement is still f(x) = f(-x), whether x or f(x) are vectors or vector-valued functions or some combination. Example even functions include y = cos x and y = x^2. In contrast, an odd function has point symmetry about the origin such that f(-x) = -f(x). See also Odd Function.

Evoked Potentials: When the brain is excited by audio or visual stimuli, small voltage potentials can be measured on the head, emanating from the brain [48]. These Visually Evoked Potentials (VEP) and Audio Evoked Potentials (AEP) can be sampled and processed using a DSP system. Evoked potentials can also be measured directly on the brain or the brainstem.

Excess Mean Square Error: See Least Mean Squares (LMS) Algorithm.

Exp: Common notation used for the exponential function. See Exponential Function.

Expected Value: The expected value, E { .
} , of a random variable (or a function of a random variable) is simply the average value of the random variable (or of the function of a random variable). The statistical average or mean value of a signal x(n) is computed from:

  Mean of x(n) = E{x(n)} = Σ_n x(n) p{x(n)}    (141)

where E{x(n)} is "the expected value of x(n)", and p{x(n)} is the probability density function of the random variable x(n). As another example of an expected value, the mean squared value of x(n) is calculated as:

  Mean Squared Value of x(n) = E{x^2(n)} = Σ_n x^2(n) p{x(n)}    (142)

Expected value is a linear operation, i.e.:

  E{a x(n) + b y(n)} = a E{x(n)} + b E{y(n)}    (143)

where a and b are constants and x(n) and y(n) are random signals generated by known probability density functions, p_y{y(n)} and p_x{x(n)}. For most signals encountered in real time DSP the probability density function is unlikely to be known and therefore the expected value cannot be calculated as suggested above. However if the signal is ergodic, then time averages can be used to approximate the statistical averages. See also Ergodic, Mean Value, Mean Squared Value, Variance, Wide Sense Stationarity.

Exponential Averaging: An exponential averager with parameter α computes an average x̄(n) of a sequence {x(n)} as:

  x̄(n) = (1 - α) x̄(n-1) + α x(n)    (144)

where α is contained in the interval [0,1]. An exponential averager (a one pole lowpass filter) is simpler to compute than a moving rectangular window, since older data points are simply forgotten by the exponentially decreasing powers of (1 - α). A convenient rule of thumb approximation for the "equivalent rectangular window" of an exponential averager is 1/α data samples. See also Waveform Averaging, Moving Average, Weighted Moving Average.
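Eq. 144 can be sketched directly (an illustrative fragment; the step input and α = 0.1 are assumed example values):

```python
def exponential_average(samples, alpha):
    """Running exponential average: xbar(n) = (1 - alpha)*xbar(n-1) + alpha*x(n).
    The equivalent rectangular window is roughly 1/alpha samples."""
    xbar = 0.0
    out = []
    for x in samples:
        xbar = (1.0 - alpha) * xbar + alpha * x
        out.append(xbar)
    return out

# step input: the average rises towards 1 with a time constant of ~1/alpha samples
avg = exponential_average([1.0] * 100, alpha=0.1)
```

After the first sample the average is α = 0.1, and after 100 samples it has risen to within a fraction of a percent of the input level.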
Exponential Function: The simple exponential function is:

y = e^x = exp(x)      (145)

where “e” is the base of the natural logarithm, e = 2.7182818… . A key property of the exponential function is that the derivative of e^x is e^x itself, i.e.

(d/dx) e^x = e^x      (146)

[Figure: graph of y = e^x plotted for −1 ≤ x ≤ 3.]

Real causal exponential functions can be used to represent the natural decay of energy in a passive system, such as the voltage decay in an RC circuit. For example, consider the discrete time exponential:

x(k) = A e^(−λk·t_s) u(k)      (147)

where u(k) is the unit step function, t_s is the sampling period, and A and λ are constants. See also Complex Exponential Functions, Damped Sinusoid, RC Circuit.

F-Series Recommendations: The F-series telecommunication recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T, and formerly known as CCITT) provide standards for services other than telephony (operations, quality, service definitions and human factors). Some of the current recommendations (http://www.itu.ch) include: F.1 F.2 F.4 F.10 F.11 F.14 F.15 F.16 F.17 F.18 F.20 F.21 F.23 F.24 F.30 F.31 F.35 F.40 F.41 F.59 F.60 F.61 F.63 F.64 F.65 F.68 F.69 F.70 F.71 F.72 F.73 F.74 F.80 F.82 F.86 F.87 F.89 F.91 Operational provisions for the international public telegram service. Operational provisions for the collection of telegram charges. Plain and secret language. Character error rate objective for telegraph communication using 5-unit start-stop equipment. Continued availability of traditional services. General provisions for one-stop-shopping arrangements. Evaluating the success of new services. Global virtual network service. Operational aspects of service telecommunications. Guidelines on harmonization of international public bureau services. The international gentex service. Composition of answer-back codes for the international gentex service.
Grade of service for long-distance international gentex circuits. Average grade of service from country to country in the gentex service. Use of various sequences of combinations for special purposes. Telegram retransmission system. Provisions applying to the operation of an international public automatic message switching service for equipments utilizing the International Telegraph Alphabet No. 2. International public telemessage service. Interworking between the telemessage service and the international public telegram service. General characteristics of the international telex service. Operational provisions for the international telex service. Operational provisions relating to the chargeable duration of a telex call. Additional facilities in the international telex service. Determination of the number of international telex circuits required to carry a given volume of traffic. Time-to-answer by operators at international telex positions. Establishment of the automatic intercontinental telex network. The international telex service Service and operational provisions of telex destination codes and telex network identification codes. Evaluating the quality of the international telex service. Interconnection of private teleprinter networks with the telex network. The international telex service - General principles and operational aspects of a store and forward facility. Operational principles for communication between terminals of the international telex service and data terminal equipment on packet switched public data networks. Intermediate storage devices accessed from the international telex service using single stage selection answerback format. Basic requirements for interworking relations between the international telex service and other services. Operational provisions to permit interworking between the international telex service and the intex service. Interworking between the international telex service and the videotex service. 
Operational principles for the transfer of messages from terminals on the telex network to Group 3 facsimile terminals connected to the public switched telephone network. Status enquiry function in the international telex service. General statistics for the telegraph services. DSPedia 132 F.93 F.95 F.96 F.100 F.104 F.105 F.106 F.107 F.108 F.111 F.112 F.113 F.115 F.120 F.122 F.125 F.127 F.130 F.131 F.140 F.141 F.150 F.160 F.162 F.163 F.170 F.171 F.180 F.182 F.184 F.190 F.200 F.201 F.202 F.203 F.220 F.230 F.300 F.350 F.351 F.353 F.400 F.401 F.410 F.415 Routing tables for offices connected to the gentex service. Table of international telex relations and traffic. List of destination indicators. Scheduled radiocommunication services. International leased circuit services - Customer circuit designations. Operational provisions for phototelegrams. Operational provisions for private phototelegraph calls. Rules for phototelegraph calls established over circuits normally used for telephone traffic. Operating rules for international phototelegraph calls to multiple destinations. Principles of service for mobile systems. Quality objectives for 50-baud start-stop telegraph transmission in the maritime mobile-satellite service. Service provisions for aeronautical passenger communications supported by mobile-satellite systems. Service objectives and principles for future public land mobile telecommunication systems. Ship station identification for VHF/UHF and maritime mobile-satellite services. Operational procedures for the maritime satellite data transmission service. Numbering plan for access to the mobile-satellite services of INMARSAT from the international telex service. Operational procedures for interworking between the international telex service and the service offered by INMARSAT-C system. Maritime answer-back codes. Radiotelex service codes. Point-to-multipoint telecommunication service via satellite. 
International two-way multipoint telecommunication service via satellite. Service and operational provisions for the intex service. General operational provisions for the international public facsimile services. Service and operational requirements of store-and-forward facsimile service. Operational requirements of the interconnection of facsimile store-and-forward units. Operational provisions for the international public facsimile service between public bureaux (bureaufax). Operational provisions relating to the use of store-and-forward switching nodes within the bureaufax service. General operational provisions for the international public facsimile service between subscriber stations (telefax). Operational provisions for the international public facsimile service between subscribers' stations with Group 3 facsimile machines (Telefax 3). Operational provisions for the international public facsimile service between subscriber stations with Group 4 facsimile machines (Telefax 4). Operational provisions for the international facsimile service between public bureaux and subscriber stations and vice versa (bureaufax-telefax and vice versa). Teletex service. Interworking between teletex service and telex service - General principles. Interworking between the telex service and the teletex service - General procedures and operational requirements for the international interconnection of telex/teletex conversion facilities. Network based storage for the teletex service. Service requirements unique to the processable mode number eleven (PM11) used within teletex service. Service requirements unique to the mixed mode (MM) used within the teletex service. Videotex service. Application of T Series recommendations. General principles on the presentation of terminal identification to users of the telematic services. Provision of telematic and data transmission services on integrated services digital network (ISDN).
Message handling services: Message Handling System and service overview. X.400 Message handling services: naming and addressing for public message handling services. Message Handling Services: the public message transfer service. Message handling services: Intercommunication with public physical delivery services. 133 F.420 F.421 F.422 F.423 F.435 F.440 F.500 F.551 F.581 F.600 F.701 F.710 F.711 F.720 F.721 F.730 F.732 F.740 F.761 F.811 F.812 F.813 F.850 F.851 F.901 F.902 F.910 Message handling services: the public interpersonal messaging service. Message handling services: Intercommunication between the IPM service and the telex service. Message handling services: Intercommunication between the IPM service and the teletex service. Message Handling Services: intercommunication between the interpersonal messaging service and the telefax service. Message handling: electronic data interchange messaging service. Message handling services: the voice messaging service. International public directory services. Service for the telematic file transfer within Telefax 3, Telefax 4, Teletex services and message handling services. Guidelines for programming communication interfaces (PCIs) definition: Service Service and operational principles for public data transmission services. Teleconference service. General principles for audiographic conference service. Audiographic conference teleservice for ISDN. Videotelephony services - general. Videotelephony teleservice for ISDN. Videoconference service- general. Broadband Videoconference Services. Audiovisual interactive services. Service-oriented requirements for telewriting applications. Broadband connection-oriented bearer service. Broadband connectionless data bearer service. Virtual path service for reserved and permanent communications. Principles of Universal Personal Telecommunication (UPT). Universal personal telecommunication (UPT) - Service description (service set 1) Usability evaluation of telecommunication services. 
Interactive services design guidelines. Procedures for designing, evaluating and selecting symbols, pictograms and icons. For additional detail consult the appropriate standard document or contact the ITU. See also International Telecommunication Union, ITU-T Recommendations, Standards.

Far End Echo: Signal echo that is produced by components in far end telephone equipment. Far end echo arrives after near end echo. See also Echo Cancellation, Near End Echo.

Fast Fourier Transform (FFT): The FFT [66], [93] is a method of computing the discrete Fourier transform (DFT) that exploits the redundancy in the general DFT equation:

X(k) = Σ_{n=0}^{N−1} x(n) e^(−j2πkn/N),  for k = 0 to N − 1      (148)

The DFT computation of Eq. 148 requires approximately N² complex multiply accumulates (MACs); when N is a power of 2, the radix-2 FFT requires only N log₂N MACs. The computational saving achieved by the FFT is therefore a factor of N/log₂N. When N is large this saving can be considerable. The following table compares the number of MACs required for different values of N for the DFT and the FFT:

N        DFT MACs       FFT MACs
32       1024           160
1024     1048576        10240
32768    ~1 x 10^9      ~0.5 x 10^6

There are a number of different FFT algorithms, sometimes grouped via the names Cooley-Tukey, prime factor, decimation-in-time, decimation-in-frequency, radix-2 and so on. The bottom line for all FFT algorithms is, however, that they remove redundancy from the direct DFT computation of Eq. 148. We can highlight the existence of the redundant computation in the DFT by inspecting Eq. 148. First, for notational simplicity we can rewrite Eq.
148 as:

X(k) = Σ_{n=0}^{N−1} x(n) W_N^(−kn),  for k = 0 to N − 1      (149)

where W_N = e^(j2π/N) = cos(2π/N) + j sin(2π/N). Using the DFT algorithm to calculate the first four components of the DFT of a (trivial) signal with only 8 samples requires the following computations:

X(0) = x(0) + x(1) + x(2) + x(3) + x(4) + x(5) + x(6) + x(7)
X(1) = x(0) + x(1)W_8^(−1) + x(2)W_8^(−2) + x(3)W_8^(−3) + x(4)W_8^(−4) + x(5)W_8^(−5) + x(6)W_8^(−6) + x(7)W_8^(−7)
X(2) = x(0) + x(1)W_8^(−2) + x(2)W_8^(−4) + x(3)W_8^(−6) + x(4)W_8^(−8) + x(5)W_8^(−10) + x(6)W_8^(−12) + x(7)W_8^(−14)      (150)
X(3) = x(0) + x(1)W_8^(−3) + x(2)W_8^(−6) + x(3)W_8^(−9) + x(4)W_8^(−12) + x(5)W_8^(−15) + x(6)W_8^(−18) + x(7)W_8^(−21)

However, note that there is redundant (or repeated) arithmetic computation in Eq. 150. For example, consider the third term in the second line of Eq. 150:

x(2)W_8^(−2) = x(2)e^(−j2π·2/8) = x(2)e^(−jπ/2)      (151)

Now consider the computation of the third term in the fourth line of Eq. 150:

x(2)W_8^(−6) = x(2)e^(−j2π·6/8) = x(2)e^(−j3π/2) = x(2)e^(−jπ/2)e^(−jπ) = −x(2)e^(−jπ/2)      (152)

Therefore we can save one multiply operation by noting that x(2)W_8^(−6) = −x(2)W_8^(−2). In fact, because of the periodicity of W_N^(kn), every term in the fourth line of Eq. 150 is available from the computed terms in the second line of the equation. Hence a considerable saving in multiplicative computations can be achieved if the computational order of the DFT algorithm is carefully considered. More generally we can show that the terms in the second line of Eq. 150 are:

x(n)W_8^(−n) = x(n)e^(−j2πn/8) = x(n)e^(−jπn/4)      (153)

and for terms in the fourth line of Eq.
150:

x(n)W_8^(−3n) = x(n)e^(−j6πn/8) = x(n)e^(−j3πn/4) = x(n)e^(−j(π/2 + π/4)n) = x(n)e^(−jπn/2) e^(−jπn/4) = (−j)^n x(n)e^(−jπn/4) = (−j)^n x(n)W_8^(−n)      (154)

This exploitation of the computational redundancy is the basis of the FFT, which allows the same result as the DFT to be computed, but with fewer MACs. To more formally derive one version of the FFT (decimation-in-time radix-2), consider splitting the DFT equation into two “half signals” consisting of the even numbered and odd numbered samples, where the total number of samples is a power of 2 (N = 2^n):

X(k) = Σ_{n=0}^{N/2−1} x(2n) e^(−j2πk(2n)/N) + Σ_{n=0}^{N/2−1} x(2n+1) e^(−j2πk(2n+1)/N)
     = Σ_{n=0}^{N/2−1} x(2n) W_N^(−2nk) + Σ_{n=0}^{N/2−1} x(2n+1) W_N^(−(2n+1)k)      (155)
     = Σ_{n=0}^{N/2−1} x(2n) W_N^(−2nk) + W_N^(−k) Σ_{n=0}^{N/2−1} x(2n+1) W_N^(−2nk)

Notice in Eq. 155 that the N point DFT, which requires N² MACs in Eq. 148, is now accomplished by performing two N/2 point DFTs requiring a total of 2 × N²/4 MACs, a computational saving of 50%. The next logical step is therefore to split each N/2 point DFT into two N/4 point DFTs, saving 50% again, and so on. As the number of points we started with was a power of 2, we can perform this decimation of the signal a total of log₂N times, each time reducing the computation of each stage to that of a “butterfly” operation. If N = 2^n, the overall computational saving is a factor of:

N / log₂N      (156)

In general, equations for an FFT are awkward to write mathematically, and therefore the algorithm is very often represented as a “butterfly” based signal flow graph (SFG), the butterfly being a simple signal flow graph of the form:

[Figure: inputs a and b at splitting nodes; outputs c = a + W_N^k·b and d = a − W_N^k·b at summing nodes, with the multiplier W_N^k and a −1 branch on the lower path.] The butterfly signal flow graph.
The multiplier W_N^k is a complex number, and the input data, a and b, may also be complex. One butterfly computation requires one complex multiply and two complex additions (assuming the data is complex). A more complete SFG for an 8 point decimation-in-time radix-2 FFT computation is:

[Figure: A radix-2 decimation-in-time (DIT) Cooley-Tukey FFT for N = 8, with the inputs in bit-reversed order x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7) and the outputs X(0) to X(7) in natural order; the twiddle factor multipliers are powers of W_8, where W_N = e^(j2π/N) as in Eq. 149. Note that the butterfly computation is repeated through the SFG.]

See also Bit Reverse Addressing, Cooley-Tukey, Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform - Decimation-in-Time (DIT), Fast Fourier Transform - Decimation-in-Frequency (DIF), Fast Fourier Transform - Zero Padding, Fourier, Fourier Analysis, Fourier Series, Fourier Transform, Frequency Response, Phase Response.

Fast Fourier Transform, Decimation-in-Frequency (DIF): The DFT can be reformulated to give the FFT either as a DIT or a DIF algorithm. In the decimation-in-frequency computation of the FFT, the input time samples are taken in natural order and the output frequency samples are produced in bit-reversed order. See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Decimation-in-Time.

Fast Fourier Transform, Decimation-in-Time (DIT): The DFT can be reformulated to give the FFT either as a DIF or a DIT algorithm. Decimation-in-time computation of the FFT provides the output frequency samples in natural order when the input time samples are arranged in bit-reversed order. See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Decimation-in-Frequency.
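The decimation-in-time recursion of Eq. 155 can be sketched as a short recursive routine. The following is an illustrative implementation (not from the text) using the e^(−j2πkn/N) kernel of Eq. 148, checked against the direct DFT, together with the bit-reversed input ordering visible in the 8 point SFG:

```python
import cmath

def dft(x):
    """Direct DFT of Eq. 148: X(k) = sum_n x(n) e^(-j*2*pi*k*n/N); ~N^2 MACs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2])          # N/2 point DFT of even-numbered samples
    odd = fft(x[1::2])           # N/2 point DFT of odd-numbered samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]   # twiddle times odd part
        out[k] = even[k] + t                             # butterfly, top output
        out[k + N // 2] = even[k] - t                    # butterfly, bottom output
    return out

def bit_reverse_indices(N):
    """Bit-reversed index order used by in-place radix-2 FFTs."""
    bits = N.bit_length() - 1
    return [int(format(i, "0{}b".format(bits))[::-1], 2) for i in range(N)]

x = [0.5, 1.0, -0.25, 0.75, 0.0, -1.0, 0.3, 0.1]
err = max(abs(a - b) for a, b in zip(fft(x), dft(x)))
order = bit_reverse_indices(8)
```

Here `order` comes out as [0, 4, 2, 6, 1, 5, 3, 7], which is exactly the input ordering x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7) shown in the 8 point DIT signal flow graph.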
Fast Fourier Transform, Zero Padding: When performing an FFT, the number of data points used in the algorithm is a power of 2 (for radix-2 FFT algorithms). What if a particular process only produces 100 samples and the FFT is required? There are three choices: (1) truncate the sequence to 64 samples; (2) pad the signal out to 128 points by setting the last 28 values to be the same as the first 28 samples; (3) zero pad the data out to 128 points by setting the last 28 values to zero. Solution (1) will lose signal information, and solution (2) will add information which is not necessarily part of the signal (i.e. discontinuities). Solution (3), however, only increases the frequency resolution of the FFT by adding more harmonics and does not affect the integrity of the data.

Fast Given’s Rotations: See Matrix Decompositions - Square Root Free Given’s Rotations.

Filtered-U LMS: See Active Noise Cancellation.

Filtered-X LMS: See Least Mean Squares Filtered-X Algorithm.

Filters: A circuit designed to pass signals of certain frequencies and attenuate others. Filters can be analog or digital [45]. In general a filter with N poles (where N is usually the number of reactive circuit elements used, such as capacitors or inductors) will have a roll-off of 6N dB/octave or 20N dB/decade. Although a second order (two pole) active filter increases the final rate of roll-off, the sharpness of the knee (at the 3dB frequency) of the filter is not improved, and a further increase in order will not produce a filter that approaches the ideal filter. Other designs, such as the Butterworth, Chebyshev and Bessel filters, produce filters that have a flatter passband characteristic or a much sharper knee. In general, for a fixed order filter, the sharper the knee of the filter, the more variation in the gain of the passband. A simple active filter is illustrated below.

[Figure: a simple 3rd order active filter. The cut-off frequency can be changed by modifying the resistor values.]
This filter has a roll-off of 18dB/octave, meaning that if used as an anti-alias filter cutting off at fs/2, where fs is the sampling frequency, the filter would only provide 18 dB of attenuation at fs, and hence aliasing problems may occur.

[Figure: a first order (passive) RC lowpass filter, with f3dB = 1/(2πRC) and |Vout/Vin| = 1/√(1 + (f/f3dB)²), and a second order (active) filter built from two RC sections with a buffer amplifier, with |Vout/Vin| = 1/√(1 + 2(f/f3dB)² + (f/f3dB)⁴). The logarithmic magnitude plot shows roll-offs of 20dB/decade (1st order) and 40dB/decade (2nd order), compared against the ideal filter response.]

A popular (though not necessarily appropriate) rule of thumb is that anti-alias filters should provide at least the same attenuation at the sampling frequency as the dynamic range of the wordlength. For example, if using 16 bit arithmetic the dynamic range is 20 log₁₀(2¹⁶) ≈ 96dB, and so the roll-off of the filter above the 3dB frequency should be at least 96dB/octave. In designing anti-alias filters, the key requirement is limiting the significance of any aliased frequency components. Because it is the nature of lowpass filters to provide more attenuation at higher frequencies than at lower ones, the aliased components near fs/2 are usually the limiting factor. See also Active Filter, Anti-alias Filter, Bandpass Filter, Digital Filter, High Pass Filter, Low Pass Filter, Knee, Reconstruction Filter, RC Filter, Roll-off.

Bessel Filter: A filter that has a maximally flat phase response in its passband.

Butterworth Filter: This is a filter based on certain mathematical constraints and defining equations. These filters have been used for a very long time in designing stable analog filters.
In general the Butterworth filter has a passband that is very flat, at the expense of a slow roll-off. The gain of the order n (analog) Butterworth filter can be given as:

|Vout/Vin| = 1/√(1 + (f/f3dB)^2n)      (157)

Chebyshev Filter: A type of filter that has a certain amount of ripple in the passband, but a very steep roll-off. The gain of the order n (analog) Chebyshev filter can be given as below, where Cₙ is a special polynomial and ε is a constant that determines the magnitude of the passband ripple. The spelling of Chebyshev has many variants (such as Tschebyscheff).

|Vout/Vin| = 1/√(1 + ε² Cₙ²(f/f3dB))      (158)

Elliptic Filter: A type of filter that achieves the maximum possible roll-off for a particular filter order. The phase response of an elliptic filter is extremely non-linear.

Finite Impulse Response (FIR) Filter: (See first Digital Filter). An FIR digital filter performs a moving weighted average on an input stream of digital data to filter a signal according to some predefined frequency criterion, such as a low pass, high pass, band pass, or band-stop filter:

[Figure: generic gain versus frequency characteristics for low pass, high pass, band-pass and band-stop filters.]

FIR filters are usually designed with software to be low pass, high pass, band pass or band-stop. As discussed under Digital Filter, an FIR filter is interfaced to the real world via analogue to digital converters (ADC) and digital to analogue converters (DAC) and suitable anti-alias and reconstruction filters. An FIR digital filter can be conveniently represented in a signal flow graph:

[Figure: a delay line holding x(k), x(k−1), …, x(k−N+1), weighted by coefficients w₀, w₁, …, w_{N−1} and summed to produce the output y(k).] The signal flow graph and the output equation for an FIR digital filter.
The last N input samples are weighted by the filter coefficients to produce the output y(k). The general output equation (convolution) for an FIR filter is:

y(k) = w₀x(k) + w₁x(k−1) + w₂x(k−2) + w₃x(k−3) + … + w_{N−1}x(k−N+1) = Σ_{n=0}^{N−1} wₙ x(k−n)      (159)

The term finite impulse response refers to the fact that the impulse response has energy at only a finite number of samples, after which the output is zero. Therefore, if the input sequence is a unit impulse, the FIR filter output will have a finite duration:

[Figure: a unit impulse δ(k) input to a digital FIR filter sampled at fs Hz produces the finite impulse response h(k); the output decays to zero within a finite time.]

This can be illustrated by considering that the FIR filter is essentially a shift register which is clocked once per sampling period. For example, consider a simple 4 weight filter:

[Figure: at times k = 0, 1, 2, … the single 1 value of a unit impulse passes along the filter “shift register”, so that the weights w₀, w₁, w₂, w₃ appear in turn at the output — i.e. the filter impulse response is output.]

As an example, a simple low pass FIR filter can be designed using the DSP design software SystemView by Elanix, with a sampling rate of 10000 Hz, a cut-off frequency of around 1000 Hz, a stopband attenuation of about 40dB, passband ripple of less than 1 dB, and limited to 15 weights. The resulting filter, FIR1, has weights:

w₀ = w₁₄ = −0.01813…, w₁ = w₁₃ = −0.08489…, w₂ = w₁₂ = −0.03210…, w₃ = w₁₁ = −0.00156…, w₄ = w₁₀ = 0.07258…, w₅ = w₉ = 0.15493…, w₆ = w₈ = 0.22140…, w₇ = 0.25669…
(truncated to 5 decimal places).

[Figure: the impulse response h(n) = wₙ of the low pass filter FIR1, with 15 weights, a sampling rate of 10000 Hz, and a cut-off frequency designed at around 1000 Hz.]

Noting that a unit impulse contains “all frequencies”, the magnitude frequency response and phase response of the filter are found from the DFT (or FFT) of the filter weights:

[Figure: the 1024 point FFT (zero padded) of the low pass filter impulse response FIR1, plotted as a linear magnitude response |H(f)| and as a logarithmic magnitude response 20log₁₀|H(f)| in dB. As the sampling rate is 10000 Hz, the frequency response is only plotted up to 5000 Hz. Note that the y-axis is labelled Gain rather than Attenuation; this is because −10dB of gain is the same as 10dB of attenuation, so if attenuation were plotted the figures would simply be inverted.]

[Figure: the 1024 point FFT generated phase response (phase shift versus frequency) of the low pass filter impulse response FIR1, shown both unwrapped (0 down to −6π radians) and wrapped (between −π and π radians). The filter is linear phase, and the wrapped and unwrapped phase responses are different ways of representing the same information. The “wrapped” phase response is often produced by DSP software packages and gives phase values between −π and π only, since the phase is calculated modulo 2π, i.e. a phase shift of θ is the same as a phase shift of θ + 2π, and so on. Phase responses are also often plotted using degrees rather than radians.]

From the magnitude and phase response plots we can therefore calculate the attenuation and phase shift of different input signal frequencies.
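Reading gain and phase off such plots corresponds to evaluating H(f) = Σₙ wₙ e^(−j2πfn/fs) at the frequency of interest. A minimal sketch, using a simple 5 point moving average rather than the FIR1 coefficients (the function name and values are illustrative):

```python
import cmath
import math

def freq_response(weights, f, fs):
    """Evaluate H(f) = sum_n w_n e^(-j*2*pi*f*n/fs) for an FIR filter."""
    return sum(w * cmath.exp(-2j * cmath.pi * f * n / fs)
               for n, w in enumerate(weights))

# A 5 point moving average as a simple low pass example, fs = 10000 Hz
w = [0.2] * 5
H_dc = freq_response(w, 0.0, 10000.0)       # unity gain at DC
H_null = freq_response(w, 2000.0, 10000.0)  # a null of the moving average
gain_db = 20 * math.log10(abs(freq_response(w, 500.0, 10000.0)))
```

The magnitude at 500 Hz comes out slightly below 0 dB, and the phase there is −0.2π radians: exactly the linear phase of a symmetric filter with a group delay of 2 samples.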
For example, if a single frequency at 1500 Hz with an amplitude of 150 is input to the above filter, then the amplitude of the output signal will be around 30, phase shifted by a little over −2π radians. However, if a single frequency of 500 Hz is input, then the output signal amplitude is amplified by a factor of about 1.085 and phase shifted by about −0.7π radians. As a more intuitive and illustrative example of filtering, consider inputting the signal x(k) below to a suitably designed low pass filter to produce the output signal y(k):

[Figure: an FIR filter performing low pass filtering, i.e. removing high frequencies by performing a weighted moving average with suitable low pass characteristic weights. The remaining low frequencies are phase shifted (i.e. time delayed) as a result of passing through the filter.]

So, how long is a typical FIR filter? This of course depends on the requirements of the problem being addressed. For the generic filter characteristic shown below, more weights are required if:

• a sharper transition bandwidth is required;
• more stopband attenuation is required;
• very small passband ripple is required.

[Figure: generic low pass filter magnitude response, showing the passband ripple, the −3dB point, the transition band, the stopband attenuation and the ideal filter response up to fs/2. The more stringent the filter requirements of stopband attenuation, transition bandwidth and, to a lesser extent, passband ripple, the more weights are required.]

Consider again the design of the above FIR filter (FIR1), which was a low pass filter cutting off at about 1000 Hz. Using SystemView, the above criteria can be varied such that the number of filter weights can be increased and a more stringent filter designed.
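The text does not describe SystemView's design algorithm; as an illustration of the design trade-off, here is a sketch of the common windowed-sinc (Hamming) method — a standard textbook technique, not necessarily what SystemView uses:

```python
import math

def lowpass_sinc(num_taps, cutoff_hz, fs_hz):
    """Windowed-sinc lowpass FIR design using a Hamming window.

    Truncating the ideal (infinite) sinc to num_taps samples and applying a
    window trades transition bandwidth against filter length: roughly,
    halving the transition band doubles the number of taps required.
    """
    M = num_taps - 1
    fc = cutoff_hz / fs_hz            # normalised cutoff (cycles per sample)
    taps = []
    for n in range(num_taps):
        x = n - M / 2.0               # centre the (delayed) sinc
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / M)
        taps.append(ideal * window)
    return taps

# Same nominal specification as FIR1 above: 15 taps, 1000 Hz cutoff, fs = 10000 Hz
taps = lowpass_sinc(15, 1000.0, 10000.0)
```

The resulting taps are symmetric (hence linear phase), peak at the centre weight, and sum to approximately unity, giving roughly 0 dB gain at DC.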
Consider the design of three low pass filters cutting off at 1000 Hz, with stopband attenuation of 40dB and transition bandwidths of 500 Hz, 200 Hz and 50 Hz:

[Figure: magnitude responses of three low pass filters designed with stopband attenuation = 40dB and passband ripple = 1dB. FIR1: transition band 1000-1500 Hz, 29 weights; FIR2: transition band 1000-1200 Hz, 69 weights; FIR3: transition band 1000-1100 Hz, 269 weights. The sharper the transition band, the more filter weights are required.]

The respective impulse responses of FIR1, FIR2 and FIR3 are 15, 69 and 269 weights long, with group delays of 7, 34 and 134 samples respectively.

[Figure: the impulse responses of low pass filters FIR1, FIR2 and FIR3, all with 40 dB stopband attenuation and 1dB passband ripple, but transition bandwidths of 500, 200 and 50 Hz respectively. Clearly the more stringent the filter parameters, the longer the required impulse response.]

Similarly, if the stopband attenuation specification is increased, the number of filter weights required will again increase. For a low pass filter with a cut-off frequency again at 1000 Hz, a transition bandwidth of 500 Hz, and stopband attenuations of 40 dB, 60 dB and 80 dB:

[Figure: magnitude responses of low pass filters FIR1, FIR4 and FIR5, designed with transition bandwidth = 500 Hz and passband ripple = 1dB, with stopband attenuations of 40dB (29 weights), 60dB (41 weights) and 80dB (55 weights).]

[Figure: the impulse responses of low pass filters FIR1, FIR4 and FIR5, all with 1dB passband ripple and transition bandwidths of 500 Hz, and stopband attenuations of 40, 60 and 80dB respectively. Clearly the more stringent the filter parameters, the longer the required impulse response.]

Similarly, if the passband ripple parameter is reduced, a longer impulse response will be required. See also Adaptive Filter, Digital Filter, Low Pass Filter, High Pass Filter, Bandpass Filter, Bandstop Filter, IIR Filter.

Finite Impulse Response (FIR) Filter, Bit Errors: If we consider the possibility of a random single bit error in the weights of an FIR filter, the effect on the filter magnitude and phase response can be quite dramatic. Consider a simple 15 weight filter:

[Figure: “correct” filter — a fifteen weight low pass FIR filter cutting off at 800 Hz (sampling rate 8000 Hz), showing its impulse response h(n) and logarithmic magnitude response 20log₁₀|H(f)|.]

The 3rd coefficient has the value −0.0725…, which in 16 bit fractional binary notation is 0.000100101001010₂. If a single bit error occurs in the 3rd bit of this binary coefficient, the value becomes 0.001100101001010₂ = −0.1957…. The impulse response clearly changes only “a little”, whereas the effect on the frequency response is rather more substantial and causes a loss of about 5 dB of attenuation.

[Figure: bit error filter — the 15 weight low pass FIR filter cutting off at 800 Hz with the 3rd coefficient in error by a single bit.]
Note the change to the frequency response compared to the correct filter above. Also, because the impulse response is no longer symmetric, the phase response is no longer linear:

[Figure: Phase response (radians) versus frequency (0 to 4000 Hz) of the original ("correct") filter and of the bit error filter. The error in a single coefficient has caused the phase to be no longer exactly linear.]

Of course, the bit error may occur in one of the least significant bits, in which case the frequency domain effect would be much less pronounced. However, because of the excellent reliability of DSP processors, the occurrence of bit errors in filter coefficients is unlikely. See also Digital Filter, Finite Impulse Response Filter.

Finite Impulse Response (FIR), Group Delay: See Finite Impulse Response Filter - Linear Phase.

Finite Impulse Response Filter (FIR), Linear Phase: If the weights of an N weight real valued FIR filter are symmetric or anti-symmetric, i.e.:

$$w(n) = \pm w(N - 1 - n) \qquad (160)$$

then the filter has linear phase. This means that all frequencies passing through the filter are delayed by the same amount. The impulse response of a linear phase FIR filter can have either an even or odd number of weights.

[Figure: Symmetric impulse responses of an 11 (odd number) weight and an 8 (even number) weight linear phase FIR filter, each with its line of symmetry marked.]

[Figure: Anti-symmetric impulse responses of an 11 (odd number) weight and an 8 (even number) weight linear phase FIR filter, each with its location of anti-symmetry marked.]

The z-domain pole-zero plot of a linear phase filter will always have conjugate pair zeroes, i.e.
the zeroes are symmetric about the real axis. The desirable property of linear phase is particularly important in applications where the phase of a signal carries important information. To illustrate the linear phase response, consider inputting a cosine wave of frequency $f$, sampled at $f_s$ samples per second (i.e. $\cos(2\pi f k / f_s)$), to a symmetric impulse response FIR filter with an even number of weights $N$ (i.e. $w_n = w_{N-n}$ for $n = 0, 1, \dots, N/2 - 1$). For notational convenience let $\omega = 2\pi f / f_s$:

$$
\begin{aligned}
y(k) &= \sum_{n=0}^{N-1} w_n \cos\omega(k-n) \\
&= \sum_{n=0}^{N/2-1} w_n \left[\cos\omega(k-n) + \cos\omega(k-N+n)\right] \\
&= \sum_{n=0}^{N/2-1} 2 w_n \cos\omega(k - N/2)\cos\omega(n - N/2) \\
&= 2\cos\omega(k - N/2)\sum_{n=0}^{N/2-1} w_n \cos\omega(n - N/2) \\
&= M\cos\omega(k - N/2), \quad\text{where } M = \sum_{n=0}^{N/2-1} 2 w_n \cos\omega(n - N/2)
\end{aligned} \qquad (161)
$$

where the trigonometric identity $\cos A + \cos B = 2\cos((A+B)/2)\cos((A-B)/2)$ has been used. From this equation it can be seen that, regardless of the input frequency, the input cosine wave is delayed by exactly $N/2$ samples, often referred to as the group delay, and its magnitude is scaled by the factor $M$. Hence the phase response of such an FIR filter is simply a straight line, $\phi(\omega) = -\omega N/2$.

Group delay is defined more generally as the negative of the derivative of the phase response with respect to angular frequency, $\tau_g(\omega) = -d\phi(\omega)/d\omega$. Hence a filter with linear phase has a group delay that is constant for all frequencies. An all-pass filter with constant group delay (i.e., linear phase) produces a pure delay for any input time waveform.

Linear phase FIR filters can be implemented with $N/2$ multiplies and $N$ accumulates, compared to the $N$ MACs required by an FIR filter with a non-symmetric impulse response.
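The result of Eq. 161 can be checked numerically. This is a minimal sketch, not from the original text, using zero-based indexing with `w[n] == w[L-1-n]` for a length-8 filter (so the constant delay is (L-1)/2 = 3.5 samples; the entry's own indexing convention differs by one): for every input frequency, the filtered cosine equals a scaled, delayed cosine.

```python
import math

# Check that a symmetric FIR filter delays every cosine by the same
# (group delay) number of samples, scaling it by a frequency-dependent M.
w = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]   # symmetric: w[n] == w[7 - n]
L = len(w)
d = (L - 1) / 2                                 # group delay in samples
max_err = 0.0
for omega in (0.2, 0.7, 1.3):                   # input frequencies, rad/sample
    # M as in Eq. 161, adapted to the w[n] = w[L-1-n] indexing
    M = sum(2 * w[n] * math.cos(omega * (n - d)) for n in range(L // 2))
    for k in range(20):
        y = sum(w[n] * math.cos(omega * (k - n)) for n in range(L))
        max_err = max(max_err, abs(y - M * math.cos(omega * (k - d))))
```

The output matches $M\cos\omega(k - d)$ to within rounding error at every sample and every frequency tried.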
This can be illustrated by rewriting the output of a symmetric FIR filter with an even number of coefficients:

$$y(k) = \sum_{n=0}^{N-1} w_n x(k-n) = \sum_{n=0}^{N/2-1} w_n\left[x(k-n) + x(k-N+n)\right] \qquad (162)$$

Although the number of multiplies is halved, most DSP processors can perform a multiply-accumulate in the same time as an addition, so there is not necessarily a computational advantage in implementing a symmetric FIR filter this way on a DSP device. One drawback of a linear phase filter is, of course, that it always introduces a delay.

Linear phase FIR filters are non-minimum phase, i.e. they will always have zeroes that are on or outside the unit circle. In the z-domain plot of the z-transform of a linear phase filter, every zero that is not on the unit circle is accompanied by its conjugate reciprocal zero. For example:

[Figure: The impulse response of a simple 5 weight linear phase FIR filter and the corresponding z-domain plot. For the zeroes inside the unit circle at $z = -0.286 \pm 0.3526j$, there are conjugate reciprocal zeroes at $z = \dfrac{1}{-0.286 \mp 0.3526j} = -1.384 \pm 1.727j$.]

See also Digital Filter, Finite Impulse Response Filter.

Finite Impulse Response (FIR), Minimum Phase: If the zeroes of an FIR filter all lie within the unit circle on the z-domain plane, then the filter is said to be minimum phase. One useful property is that the inverse of a minimum phase FIR filter is a stable IIR filter, i.e. all of its poles lie within the unit circle. See also Finite Impulse Response Filter.
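The folded computation of Eq. 162 can be demonstrated directly. This sketch is not from the original text; it uses zero-based indexing with `w[n] == w[N-1-n]`, so the sample paired with `x(k-n)` is `x(k-(N-1-n))` (the entry's indexing convention differs by one). Both forms produce identical output, the folded form with half the multiplies.

```python
# An even-length symmetric FIR filter computed two ways: the full N-MAC
# direct form, and the "folded" form with N/2 multiplies.
w = [0.5, -1.0, 2.0, 3.0, 3.0, 2.0, -1.0, 0.5]   # w[n] == w[N-1-n], N = 8
N = len(w)
x = [((7 * k) % 5) - 2.0 for k in range(50)]      # arbitrary test input

def y_direct(k):
    return sum(w[n] * x[k - n] for n in range(N))

def y_folded(k):
    # Fold the delay line: each weight multiplies the sum of the two
    # input samples it applies to, halving the number of multiplies.
    return sum(w[n] * (x[k - n] + x[k - (N - 1 - n)]) for n in range(N // 2))

diff = max(abs(y_direct(k) - y_folded(k)) for k in range(N, 50))
```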
Finite Impulse Response (FIR) Filter, Order Reversed: Consider a general finite impulse response filter with transfer function denoted $H(z)$:

$$H(z) = a_0 + a_1 z^{-1} + \dots + a_{N-1} z^{-N+1} + a_N z^{-N} \qquad (163)$$

The order reversed FIR filter transfer function, $H_r(z)$, is given by:

$$H_r(z) = a_N + a_{N-1} z^{-1} + \dots + a_1 z^{-N+1} + a_0 z^{-N} \qquad (164)$$

The respective FIR filter signal flow graphs (SFGs) are simply:

[Figure: The signal flow graphs for an N+1 weight FIR filter and for the order reversed FIR filter. The order reversed FIR filter has the same order as the original FIR filter, but with the filter weights in the opposite order.]

From the z-domain functions above it is easy to show that $H_r(z) = z^{-N} H(z^{-1})$. The order reversed FIR filter has exactly the same magnitude frequency response as the original FIR filter:

$$\left|H_r(z)\right|_{z=e^{j\omega}} = \left|z^{-N} H(z^{-1})\right|_{z=e^{j\omega}} = \left|e^{-j\omega N}\right|\left|H(e^{-j\omega})\right| = \left|H(e^{-j\omega})\right| = \left|H(e^{j\omega})\right| = \left|H(z)\right|_{z=e^{j\omega}} \qquad (165)$$

(the last step uses the conjugate symmetry $H(e^{-j\omega}) = H^*(e^{j\omega})$ of a real coefficient filter). The phase responses of the two filters are, however, different. The difference can be seen by considering that the zeroes of the order reversed FIR filter are the reciprocals of the zeroes of the original FIR filter, i.e. if the zeroes of Eq.
163 are $\alpha_1, \alpha_2, \dots, \alpha_{N-1}, \alpha_N$:

$$H(z) = (1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1}) \cdots (1 - \alpha_{N-1} z^{-1})(1 - \alpha_N z^{-1}) \qquad (166)$$

then the zeroes of the order reversed polynomial are $\alpha_1^{-1}, \alpha_2^{-1}, \dots, \alpha_{N-1}^{-1}, \alpha_N^{-1}$, which can be seen from:

$$
\begin{aligned}
H_r(z) &= z^{-N} H(z^{-1}) = z^{-N}(1 - \alpha_1 z)(1 - \alpha_2 z)\cdots(1 - \alpha_{N-1} z)(1 - \alpha_N z) \\
&= (z^{-1} - \alpha_1)(z^{-1} - \alpha_2)\cdots(z^{-1} - \alpha_{N-1})(z^{-1} - \alpha_N) \\
&= (-1)^N \alpha_1 \alpha_2 \cdots \alpha_{N-1}\alpha_N\,(1 - \alpha_1^{-1} z^{-1})(1 - \alpha_2^{-1} z^{-1})\cdots(1 - \alpha_{N-1}^{-1} z^{-1})(1 - \alpha_N^{-1} z^{-1})
\end{aligned} \qquad (167)
$$

since each factor $(z^{-1} - \alpha_k) = -\alpha_k(1 - \alpha_k^{-1} z^{-1})$. As an example, consider the 8 weight FIR filter

$$H(z) = 10 + 5z^{-1} - 3z^{-2} - z^{-3} + 3z^{-4} + 2z^{-5} - z^{-6} + 0.5z^{-7} \qquad (168)$$

and the corresponding order reversed FIR filter:

$$H_r(z) = 0.5 - z^{-1} + 2z^{-2} + 3z^{-3} - z^{-4} - 3z^{-5} + 5z^{-6} + 10z^{-7} \qquad (169)$$

Assuming a sampling frequency of $f_s = 1$ Hz, the impulse responses of both filters are easily plotted:

[Figure: The impulse response $h(k)$ of the simple FIR filter, and the order reversed impulse response $h_r(k)$.]
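The two properties above, identical magnitude response and reciprocal zeroes, can be verified for the 8 weight example of Eqs. 168 and 169. This is a minimal numerical sketch, not from the original text, assuming NumPy is available.

```python
import numpy as np

# Order-reversal checks using the example coefficients of Eqs. 168/169.
a = np.array([10, 5, -3, -1, 3, 2, -1, 0.5])   # H(z) coefficients
ar = a[::-1]                                    # Hr(z): weights reversed

# 1. Magnitude responses are identical on the unit circle.
H = np.fft.fft(a, 64)
Hr = np.fft.fft(ar, 64)
mag_diff = float(np.max(np.abs(np.abs(H) - np.abs(Hr))))

# 2. Every zero of Hr(z) is the reciprocal of a zero of H(z).
z = np.roots(a)
zr = np.roots(ar)
zero_diff = float(max(np.min(np.abs(1 / z - r)) for r in zr))
```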
The corresponding magnitude and phase frequency responses of both filters are:

[Figure: Magnitude response (gain in dB) and phase response (radians) of the FIR filter $H(z) = 10 + 5z^{-1} - 3z^{-2} - z^{-3} + 3z^{-4} + 2z^{-5} - z^{-6} + 0.5z^{-7}$, plotted for frequencies 0 to 0.5 Hz.]

[Figure: Magnitude response and (wrapped) phase response of the order reversed FIR filter $H_r(z) = 0.5 - z^{-1} + 2z^{-2} + 3z^{-3} - z^{-4} - 3z^{-5} + 5z^{-6} + 10z^{-7}$. The magnitude response is identical to that of $H(z)$; the phase response is not.]

The z-domain plots of the zeroes of both filters show the reciprocal relationship. For a zero $\alpha = x + jy$ we note that $|\alpha| = \sqrt{x^2 + y^2}$, and therefore for the related order reversed filter zero at $1/\alpha$:

$$\left|\frac{1}{\alpha}\right| = \left|\frac{1}{x + jy}\right| = \left|\frac{x - jy}{x^2 + y^2}\right| = \frac{\sqrt{x^2 + y^2}}{x^2 + y^2} = \frac{1}{\sqrt{x^2 + y^2}} = \frac{1}{|\alpha|}$$

For this particular example $H(z)$ is clearly minimum phase (all zeroes inside the unit circle), and therefore $H_r(z)$ is maximum phase (all zeroes outside the unit circle). See also All-pass Filter, Digital Filter, Finite Impulse Response Filter.
Finite Impulse Response (FIR) Filter, Real Time Implementation: For each input sample, an FIR filter requires $N$ multiply accumulate (MAC) operations:

$$y(k) = \sum_{n=0}^{N-1} w_n x(k-n) \qquad (170)$$

Therefore, if a particular FIR filter is sampling data at $f_s$ Hz, the number of arithmetic operations per second is:

$$\text{MACs/sec} = N f_s \qquad (171)$$

Finite Impulse Response (FIR) Filter, Wordlength: For a real time implementation of a digital filter, the wordlength used to represent the filter weights will of course have some bearing on the achievable accuracy of the frequency response. Consider for example the design of a high pass digital filter using 16 bit filter weights:

[Figure: Gain (dB) versus frequency (0 to 5000 Hz) for the same high pass filter design implemented with 16 bit, 8 bit and 4 bit coefficients. As the coefficient wordlength is reduced, the realised frequency response degrades from the designed specification.]

Finite Impulse Response (FIR) Filter, Zeroes: An important way of representing an FIR digital filter is with a z-domain plot of the filter zeroes. By writing the transfer function of an FIR filter in the z-domain, the resulting polynomial in $z$ can be factorised to find the roots, which are in fact the "zeroes" of the digital filter. Consider a simple 5 weight FIR filter:

$$y(k) = -0.3x(k) + 0.5x(k-1) + x(k-2) + 0.5x(k-3) - 0.3x(k-4) \qquad (172)$$

[Figure: The signal flow graph for the 5 weight FIR filter, with weights -0.3, 0.5, 1, 0.5, -0.3 applied to $x(k)$ through $x(k-4)$.]

The z-domain transfer function of this filter is therefore:

$$H(z) = \frac{Y(z)}{X(z)} = -0.3 + 0.5z^{-1} + z^{-2} + 0.5z^{-3} - 0.3z^{-4} \qquad (173)$$

If the z-polynomial of Eq. 173 is factorised (using DSP design software rather than with paper and pencil!)
then this gives, for this example:

$$H(z) = -0.3(1 - 2.95z^{-1})(1 - (-0.811 + 0.584j)z^{-1})(1 - (-0.811 - 0.584j)z^{-1})(1 - 0.339z^{-1}) \qquad (174)$$

and the zeroes of the FIR filter (corresponding to the roots of the polynomial) are $z = 2.95,\ 0.339,\ -0.811 + 0.584j$ and $-0.811 - 0.584j$. (Note all quantities have been rounded to 3 decimal places.) The corresponding SFG of the FIR filter written in the zero form of Eq. 174 is therefore:

[Figure: The signal flow graph of four cascaded first order filters, with coefficients 2.95, 0.339, -0.811+0.584j and -0.811-0.584j and overall gain -0.3, corresponding to the same impulse response as the 5 weight filter shown above. The first order filter coefficients correspond to the zeroes of the 5 weight filter.]

The zeroes of the FIR filter can also be plotted on the z-domain plane:

[Figure: The zeroes of the FIR filter of Eq. 173 plotted on the z-domain plane. Note that some of the roots are complex.]

In the case of an FIR filter with real coefficients, any complex zeroes always occur in conjugate pairs (symmetric about the real axis), such that when the factorised polynomial is multiplied out there are no imaginary coefficient values. If all of the zeroes of the FIR filter are within the unit circle then the filter is said to be minimum phase.

FIR Filter: See Finite Impulse Response Filter.

First Order Hold: Interpolation between discrete samples using a straight line. First order hold is a crude form of interpolation. See also Interpolation, Step Reconstruction, Zero Order Hold.

Fixed Point: Numbers are represented as integers. 16 bit fixed point can represent a range of 65536 ($2^{16}$) numbers (including zero). 24 bit fixed point, as used by some Motorola fixed point DSP processors, can represent a range of 16777216 ($2^{24}$) numbers. See also Binary, Binary Point, Floating Point, Two's Complement.
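The factorisation of Eq. 173 quoted in Eq. 174 can be reproduced with a numerical root finder. This is a minimal sketch, not from the original text, assuming NumPy is available.

```python
import numpy as np

# Roots of the weight polynomial of Eq. 173 are the filter zeroes of Eq. 174.
w = [-0.3, 0.5, 1.0, 0.5, -0.3]
zeros = np.roots(w)

def nearest(target):
    """Distance from a quoted zero to the closest computed root."""
    return float(np.min(np.abs(zeros - target)))
```

Because the weights here are both real and symmetric, the zeroes occur in conjugate pairs and in reciprocal pairs (note 2.95 x 0.339 is approximately 1, and the complex pair lies on the unit circle).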
Fixed Point DSP: A DSP processor that can manipulate only fixed point numbers, such as the Motorola DSP56002, the Texas Instruments TMS320C50, the AT&T DSP16, or the Analog Devices ADSP2100. See also Floating Point DSP.

Flash Converter: A fast (and expensive) type of analog to digital converter.

Fletcher-Munson Curves: Fletcher and Munson's 1933 paper [73] studied the definition of sound intensity, the subjective loudness of human hearing, and associated measurements. Most notably they produced a set of equal loudness contours, which show the variation in SPL of tones at different frequencies that are perceived as having the same loudness. The work of Fletcher and Munson was re-evaluated a few years later by Robinson and Dadson [126]. See also Equal Loudness Contours, Frequency Range of Hearing, Loudness Recruitment, Sound Pressure Level, Threshold of Hearing.

Floating Point: Numbers are represented in a floating point notation with a mantissa and an exponent. 32 bit floating point numbers have a 24 bit mantissa and an 8 bit exponent. Motorola DSP processors use the IEEE 754 floating point number format, whereas Texas Instruments use their own floating point number format. Both formats give a dynamic range of approximately $2^{-128}$ to $2^{128}$ with a resolution of 24 bits.

fs: Abbreviation for the sampling frequency (in Hz) of a DSP system.

Floating Point Arithmetic Standards: See IEEE Standard 754.

Fourier: Jean Baptiste Fourier (died 1830) made a major contribution to modern mathematics with his work in using trigonometric functions to represent heat and diffusion equations. Fourier's work is now collectively referred to as Fourier analysis. See also Discrete Fourier Transform, Fourier Analysis, Fourier Series, Fourier Transform.

Fourier Analysis: The mathematical tools of the Fourier series, Fourier transform, discrete Fourier transform, magnitude response, phase response and so on can be collectively referred to as Fourier analysis tools.
Fourier analysis is widely used in science, engineering and business mathematics. In DSP, representing a signal in the frequency domain using Fourier techniques can bring a number of advantages:

Physical Meaning: Many real world signals are produced as a sum of harmonic oscillations, e.g. vibrating music strings, vibration induced by the reciprocating motion of an engine, vibration of the vocal tract, and other forms of simple harmonic motion. Hence reliable mathematical models can be produced.

Filtering: It is often useful to filter in a frequency selective manner, e.g. to filter out low frequencies.

Signal Compression: If a signal is periodic over a long time, then rather than transmit the time signal, we can transmit the frequency domain parameters (amplitudes, frequencies and phases), and the signal can be reconstructed at the other end of a communications line.

See also Discrete Fourier Transform, Fast Fourier Transform, Fourier Transform.

Fourier Series: The mathematical theory called the Fourier series allows any periodic waveform in time to be decomposed into a sum of harmonically related sine and cosine waves. The first requirement in realising the Fourier series is to identify the fundamental period, $T$, which is the shortest time over which the signal repeats, i.e. for a signal $x(t)$:

$$x(t) = x(t + T) = x(t + 2T) = \dots = x(t + kT) \qquad (175)$$

[Figure: The (fundamental) period of a signal $x(t)$ identified as $T = 1/f_0$. Clearly $x(t_0) = x(t_0 + T) = x(t_0 + 2T)$.]

The fundamental frequency, $f_0$, is calculated as $f_0 = 1/T$. For a periodic signal with fundamental period $T$ seconds, the Fourier series represents the signal as a sum of sine and cosine components that are harmonics of the fundamental frequency, $f_0 = 1/T$ Hz.
The Fourier series can be written in a number of different ways:

$$
\begin{aligned}
x(t) &= \sum_{n=0}^{\infty} A_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} B_n \sin\left(\frac{2\pi n t}{T}\right) \\
&= A_0 + \sum_{n=1}^{\infty}\left[ A_n \cos\left(\frac{2\pi n t}{T}\right) + B_n \sin\left(\frac{2\pi n t}{T}\right)\right] \\
&= A_0 + \sum_{n=1}^{\infty}\left[ A_n \cos(2\pi n f_0 t) + B_n \sin(2\pi n f_0 t)\right] \\
&= \sum_{n=0}^{\infty}\left[ A_n \cos(n\omega_0 t) + B_n \sin(n\omega_0 t)\right] \\
&= A_0 + A_1\cos(\omega_0 t) + A_2\cos(2\omega_0 t) + A_3\cos(3\omega_0 t) + \dots \\
&\qquad + B_1\sin(\omega_0 t) + B_2\sin(2\omega_0 t) + B_3\sin(3\omega_0 t) + \dots
\end{aligned} \qquad (176)
$$

where $A_n$ and $B_n$ are the amplitudes of the various cosine and sine waveforms, and angular frequency is denoted by $\omega_0 = 2\pi f_0$ radians/second. Depending on the actual problem being solved, we can choose to specify the fundamental periodicity of the waveform in terms of the period ($T$), frequency ($f_0$), or angular frequency ($\omega_0$), as shown in Eq. 176. Note that there is no requirement to include a $B_0$ term, since $\sin 0 = 0$; there is, however, an $A_0$ term, since $\cos 0 = 1$, which represents any DC component that may be present in the signal.

In more descriptive language, the above Fourier series says that any periodic signal can be reproduced by adding a (possibly infinite) series of harmonically related sinusoidal waveforms of amplitudes $A_n$ and $B_n$. Therefore, if a periodic signal with a fundamental period of say 0.01 seconds is identified, the Fourier series will allow this waveform to be represented as a sum of various cosine and sine waves at frequencies of 100 Hz (the fundamental frequency, $f_0$), 200 Hz, 300 Hz (the harmonic frequencies $2f_0$, $3f_0$) and so on. The amplitudes of these cosine and sine waves are given by $A_0, A_1, B_1, A_2, B_2, A_3, \dots$ and so on.

So how are the values of $A_n$ and $B_n$ calculated? The answer can be derived by some basic trigonometry. Taking the last line of Eq.
176, if we multiply both sides by $\cos(p\omega_0 t)$, where $p$ is an arbitrary positive integer, then we get:

$$\cos(p\omega_0 t)\,x(t) = \cos(p\omega_0 t)\sum_{n=0}^{\infty}\left[A_n\cos(n\omega_0 t) + B_n\sin(n\omega_0 t)\right] \qquad (177)$$

[Figure: Fourier series decomposition of a periodic signal $x(t)$: the DC term $A_0$ plus cosine/sine pairs of amplitudes $A_1, B_1$ (period $T$), $A_2, B_2$ (period $T/2$) and $A_3, B_3$ (period $T/3$) sum to reproduce the original signal. If we analyse a periodic signal and realise the Fourier coefficients of appropriate amplitudes $A_n$ and $B_n$, summing these components leads to exactly the original signal.]

If we now take the average of both sides over one fundamental period, by integrating the functions over any one period $T$:

$$
\begin{aligned}
\int_0^T \cos(p\omega_0 t)\,x(t)\,dt &= \int_0^T \cos(p\omega_0 t)\left[\sum_{n=0}^{\infty} A_n\cos(n\omega_0 t) + \sum_{n=1}^{\infty} B_n\sin(n\omega_0 t)\right] dt \\
&= \sum_{n=0}^{\infty}\int_0^T A_n\cos(p\omega_0 t)\cos(n\omega_0 t)\,dt + \sum_{n=1}^{\infty}\int_0^T B_n\cos(p\omega_0 t)\sin(n\omega_0 t)\,dt
\end{aligned} \qquad (178)
$$

The second term in the last line of Eq. 178 is zero:

$$
\begin{aligned}
\int_0^T B_n\cos(p\omega_0 t)\sin(n\omega_0 t)\,dt &= \frac{B_n}{2}\int_0^T \left[\sin((p+n)\omega_0 t) - \sin((p-n)\omega_0 t)\right] dt \\
&= \frac{B_n}{2}\int_0^T \sin\frac{(p+n)2\pi t}{T}\,dt - \frac{B_n}{2}\int_0^T \sin\frac{(p-n)2\pi t}{T}\,dt = 0
\end{aligned} \qquad (179)
$$

using the trigonometric identity $2\cos A\sin B = \sin(A+B) - \sin(A-B)$ and noting that the integral over one period, $T$, of any harmonic of $\sin(2\pi t/T)$ is zero:

[Figure: The integral over $T$ of any sine or cosine waveform of frequency $f_0 = 1/T$, or harmonics thereof ($2f_0, 3f_0, \dots$), is zero, regardless of the amplitude or phase of the signal.]

Eq. 179 is true for all values of the positive integers $p$ and $n$. For the first term in the last line of Eq. 178, the average is only zero if $p \neq n$, i.e.
:

$$\int_0^T A_n\cos(p\omega_0 t)\cos(n\omega_0 t)\,dt = \frac{A_n}{2}\int_0^T \left[\cos((p+n)\omega_0 t) + \cos((p-n)\omega_0 t)\right] dt = 0, \quad p \neq n \qquad (180)$$

this time using the trigonometric identity $2\cos A\cos B = \cos(A+B) + \cos(A-B)$. If $p = n$ then:

$$\int_0^T A_n\cos^2(n\omega_0 t)\,dt = \frac{A_n}{2}\int_0^T \left[1 + \cos(2n\omega_0 t)\right] dt = \frac{A_n}{2}\int_0^T 1\,dt = \frac{A_n T}{2} \qquad (181)$$

Therefore, using Eqs. 179, 180 and 181 in Eq. 178, we note that:

$$\int_0^T \cos(p\omega_0 t)\,x(t)\,dt = \frac{A_n T}{2} \qquad (182)$$

and therefore:

$$A_n = \frac{2}{T}\int_0^T x(t)\cos(n\omega_0 t)\,dt \qquad (183)$$

By premultiplying Eq. 176 by $\sin(p\omega_0 t)$ and time averaging, and using a similar set of simplifications to Eqs. 179, 180 and 181, we can similarly show that:

$$B_n = \frac{2}{T}\int_0^T x(t)\sin(n\omega_0 t)\,dt \qquad (184)$$

Hence the three key equations for calculating the Fourier series of a periodic signal with fundamental period $T$ are:

$$
\begin{aligned}
x(t) &= \sum_{n=0}^{\infty} A_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} B_n \sin\left(\frac{2\pi n t}{T}\right) \\
A_n &= \frac{2}{T}\int_0^T x(t)\cos(n\omega_0 t)\,dt \\
B_n &= \frac{2}{T}\int_0^T x(t)\sin(n\omega_0 t)\,dt
\end{aligned} \qquad (185)
$$

Fourier Series Equations. (Note that for the DC term the $p = n = 0$ case of Eq. 181 integrates $\cos^2 0 = 1$ to $T$ rather than $T/2$, so $A_0 = \frac{1}{T}\int_0^T x(t)\,dt$, half the value given by the general $A_n$ formula at $n = 0$.)

See also Basis Function, Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform, Fourier, Fourier Analysis, Fourier Series - Amplitude/Phase Representation, Fourier Series - Complex Exponential Representation, Fourier Transform, Frequency Response, Impulse Response, Gibbs Phenomenon, Parseval's Theorem.

Fourier Series, Amplitude/Phase Representation: It is often useful to abbreviate the notation of the Fourier series such that the series is a sum of cosine (or sine) only terms with a phase shift. To perform this notational simplification, first consider the simple trigonometric function:

$$A\cos\omega t + B\sin\omega t \qquad (186)$$

where $A$ and $B$ are real numbers.
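The analysis integrals of Eq. 185 can be evaluated numerically for a concrete signal. This sketch, not from the original text, applies a simple midpoint-rule integration to a square wave of period $T = 2$ that is 1 on $[0, 1)$ and 0 on $[1, 2)$; the expected coefficients are $A_0 = 1/2$, $A_n = 0$ for $n \geq 1$, and $B_n = 2/n\pi$ for odd $n$, 0 for even $n$.

```python
import math

# Midpoint-rule evaluation of the Fourier analysis integrals of Eq. 185.
T = 2.0
w0 = 2 * math.pi / T
STEPS = 20000
dt = T / STEPS

def x(t):
    """Square wave: 1 on [0, 1), 0 on [1, 2), period 2."""
    return 1.0 if (t % T) < 1.0 else 0.0

def integrate(f):
    return sum(f((i + 0.5) * dt) * dt for i in range(STEPS))

def A(n):
    if n == 0:                       # DC term: plain average over one period
        return integrate(x) / T
    return (2 / T) * integrate(lambda t: x(t) * math.cos(n * w0 * t))

def B(n):
    return (2 / T) * integrate(lambda t: x(t) * math.sin(n * w0 * t))
```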
If we introduce another variable, $M = \sqrt{A^2 + B^2}$, then:

$$
\begin{aligned}
A\cos\omega t + B\sin\omega t &= \frac{\sqrt{A^2+B^2}}{\sqrt{A^2+B^2}}\left(A\cos\omega t + B\sin\omega t\right) \\
&= M\left(\frac{A}{\sqrt{A^2+B^2}}\cos\omega t + \frac{B}{\sqrt{A^2+B^2}}\sin\omega t\right) \\
&= M(\cos\theta\cos\omega t + \sin\theta\sin\omega t) = M\cos(\omega t - \theta) \\
&= \sqrt{A^2+B^2}\,\cos\!\left(\omega t - \tan^{-1}(B/A)\right)
\end{aligned} \qquad (187)
$$

since $\theta$ is the angle of a right angled triangle with hypotenuse $M$ and sides $A$ and $B$, i.e. $\theta = \tan^{-1}(B/A)$.

[Figure: A right angled triangle with sides $A$ (adjacent) and $B$ (opposite) and hypotenuse $M = \sqrt{A^2+B^2}$. The sine of the angle $\theta$ is the ratio $B/M$, the cosine is $A/M$, and the tangent is $B/A$.]

This result shows that the sum of a sine and a cosine waveform of arbitrary amplitudes is a sinusoidal signal of the same frequency, but of different amplitude and phase from the original sine and cosine terms. Using the result of Eq. 187 to combine each sine and cosine term, we can rewrite the Fourier series of Eq. 176 as:

$$
x(t) = \sum_{n=0}^{\infty} A_n\cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} B_n\sin\left(\frac{2\pi n t}{T}\right) = \sum_{n=0}^{\infty} M_n\cos(n\omega_0 t - \theta_n)
$$
$$
M_n = \sqrt{A_n^2 + B_n^2}, \qquad \theta_n = \tan^{-1}(B_n/A_n) \qquad (188)
$$

where $A_n$ and $B_n$ are calculated as before using Eqs. 183 and 184.

[Figure: Fourier series of a periodic signal $x(t)$ in amplitude/phase form: the components $A_0$ and $M_n\cos(2\pi n t/T - \theta_n)$ for $n = 1, 2, 3$ sum to reproduce the signal. Compared with the sine/cosine form, the sine and cosine terms at each frequency have been combined into a single cosine waveform of amplitude $M_n = \sqrt{A_n^2 + B_n^2}$ and phase $\theta_n = \tan^{-1}(B_n/A_n)$.]
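The collapse of Eq. 187 is easy to confirm numerically. This sketch, not from the original text, uses `math.atan2(B, A)` rather than a bare arctangent so the phase lands in the correct quadrant for any signs of $A$ and $B$.

```python
import math

# Check Eq. 187: A*cos(wt) + B*sin(wt) equals a single cosine of
# amplitude M = sqrt(A^2 + B^2) and phase theta = atan2(B, A).
A, B, w = 3.0, 4.0, 2 * math.pi * 50
M = math.hypot(A, B)                 # 5.0 for this 3-4-5 triangle
theta = math.atan2(B, A)
err = max(abs(A * math.cos(w * t) + B * math.sin(w * t)
              - M * math.cos(w * t - theta))
          for t in [i / 1000 for i in range(100)])
```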
From this representation of the Fourier series, we can plot an amplitude line spectrum and a phase spectrum:

[Figure: The Fourier series components of the form $M_n\cos(2\pi n f_0 t - \theta_n)$ displayed as an amplitude spectrum ($M_1, M_2, M_3$ at 100, 200 and 300 Hz) and a phase spectrum showing the phase shift (in degrees in this example) of each cosine component. The combination of the amplitude and phase spectra completely defines the time signal.]

See also Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform - Zero Padding, Fourier, Fourier Analysis, Fourier Series, Fourier Series - Complex Exponential Representation, Fourier Transform, Impulse Response, Gibbs Phenomenon, Parseval's Theorem.

Fourier Series, Complex Exponential Representation: It can be useful and instructive to represent the Fourier series in terms of complex exponentials rather than sine and cosine waveforms. (In the derivation presented below we will assume that the signal under analysis is real valued, although the result extends easily to complex signals.) From Euler's formula, note that:

$$e^{j\omega} = \cos\omega + j\sin\omega \quad\Rightarrow\quad \cos\omega = \frac{e^{j\omega} + e^{-j\omega}}{2} \quad\text{and}\quad \sin\omega = \frac{e^{j\omega} - e^{-j\omega}}{2j} \qquad (189)$$

Substituting the complex exponential definitions for sine and cosine in Eq.
176 (defined in the Fourier Series entry) and rearranging gives:

$$
\begin{aligned}
x(t) &= A_0 + \sum_{n=1}^{\infty}\left[A_n\cos(n\omega_0 t) + B_n\sin(n\omega_0 t)\right] \\
&= A_0 + \sum_{n=1}^{\infty}\left[A_n\frac{e^{jn\omega_0 t} + e^{-jn\omega_0 t}}{2} + B_n\frac{e^{jn\omega_0 t} - e^{-jn\omega_0 t}}{2j}\right] \\
&= A_0 + \sum_{n=1}^{\infty}\left[\left(\frac{A_n}{2} + \frac{B_n}{2j}\right)e^{jn\omega_0 t} + \left(\frac{A_n}{2} - \frac{B_n}{2j}\right)e^{-jn\omega_0 t}\right] \\
&= A_0 + \sum_{n=1}^{\infty}\frac{A_n - jB_n}{2}e^{jn\omega_0 t} + \sum_{n=1}^{\infty}\frac{A_n + jB_n}{2}e^{-jn\omega_0 t}
\end{aligned} \qquad (190)
$$

For the second summation term, if the sign of the summation index is negated and the summation limits are reversed, then we can rewrite:

$$x(t) = A_0 + \sum_{n=1}^{\infty}\frac{A_n - jB_n}{2}e^{jn\omega_0 t} + \sum_{n=-\infty}^{-1}\frac{A_{|n|} + jB_{|n|}}{2}e^{jn\omega_0 t} = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t} \qquad (191)$$

Writing $C_n$ in terms of the Fourier series coefficients of Eqs. 183 and 184 gives:

$$C_0 = A_0, \qquad C_n = \frac{A_n - jB_n}{2} \ \text{ for } n > 0, \qquad C_n = \frac{A_{|n|} + jB_{|n|}}{2} \ \text{ for } n < 0 \qquad (192)$$

From Eq. 192, note that for $n > 0$:

$$
C_n = \frac{A_n - jB_n}{2} = \frac{1}{T}\int_0^T x(t)\cos(n\omega_0 t)\,dt - j\frac{1}{T}\int_0^T x(t)\sin(n\omega_0 t)\,dt = \frac{1}{T}\int_0^T x(t)e^{-jn\omega_0 t}\,dt \qquad (193)
$$

For $n < 0$ it is clear from Eq. 192 that $C_n = C_{-n}^*$, where "*" denotes complex conjugate. Therefore we have now defined the Fourier series of a real valued signal using a complex analysis equation and a complex synthesis equation:

$$
x(t) = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t} \quad\text{(Synthesis)} \qquad\qquad
C_n = \frac{1}{T}\int_0^T x(t)e^{-jn\omega_0 t}\,dt \quad\text{(Analysis)} \qquad (194)
$$

Complex Fourier Series Equations

The complex Fourier series also introduces the concept of "negative frequencies", whereby we view signals of the form $e^{j2\pi f_0 t}$ as a positive complex sinusoid of frequency $f_0$ Hz, and signals of the form $e^{-j2\pi f_0 t}$ as a complex sinusoid of frequency $-f_0$ Hz. Note that the complex Fourier series is more notationally compact, and probably simpler to work with, than the general Fourier series.
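The analysis equation of Eq. 194 and the coefficient relationships of Eq. 192 can be checked for a concrete signal. This sketch, not from the original text, evaluates $C_n$ by midpoint-rule integration for a square wave of period $T = 2$ that is 1 on $[0, 1)$ and 0 on $[1, 2)$; for this signal $C_0 = 1/2$ and $C_1 = (A_1 - jB_1)/2 = -j/\pi$, with $C_{-1} = C_1^*$.

```python
import cmath, math

# Midpoint-rule evaluation of the complex analysis equation of Eq. 194.
T = 2.0
w0 = 2 * math.pi / T
STEPS = 20000
dt = T / STEPS

def x(t):
    """Square wave: 1 on [0, 1), 0 on [1, 2), period 2."""
    return 1.0 if (t % T) < 1.0 else 0.0

def C(n):
    return sum(x((i + 0.5) * dt) * cmath.exp(-1j * n * w0 * (i + 0.5) * dt) * dt
               for i in range(STEPS)) / T
```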
(The "probably" depends on how comfortable you are dealing with complex exponentials!) Also, if the signal being analysed is in fact complex, the general Fourier series of Eq. 176 (see Fourier Series) is insufficient, but Eqs. 194 can still be used. (For complex signals the coefficient relationship $C_n = C_{-n}^*$ of Eq. 192 will not in general hold.) Assuming the waveform being analysed is real (usually the case), it is easy to convert the $C_n$ coefficients into $A_n$ and $B_n$. Also note, from Eq. 188 (see Fourier Series - Amplitude/Phase Representation) and Eq. 192, that:

$$M_n = \sqrt{A_n^2 + B_n^2} = 2|C_n| \qquad (195)$$

noting that $|C_n| = \sqrt{A_n^2 + B_n^2}/2$. Clearly we can also note that for the complex number $C_n$ (with $n > 0$):

$$\angle C_n = -\tan^{-1}\frac{B_n}{A_n} = -\theta_n, \quad\text{i.e.}\quad C_n = |C_n|e^{-j\theta_n} \qquad (196)$$

Therefore, although a complex exponential does not as such exist as a real world (single wire voltage) signal, we can easily convert from the complex exponential representation to real world sinusoids simply by extracting $A_n$ and $B_n$ from the complex Fourier coefficients and using them in the Fourier series equation (see Eq. 176, Fourier Series):

$$x(t) = \sum_{n=0}^{\infty}\left[A_n\cos(n\omega_0 t) + B_n\sin(n\omega_0 t)\right] \qquad (197)$$

There are of course certain time domain signals which can be considered as being complex, i.e. having separate real and imaginary components. This type of signal can be found in some digital communication systems, or may be created within a DSP system to allow certain types of computation to be performed. If a signal is decomposed into its complex Fourier series, the resulting values for the various components can be plotted as a line spectrum. As we now have both complex and real values and positive and negative frequencies, this requires two plots, one for the real components and one for the imaginary components:

[Figure: The complex Fourier series line spectra of a periodic signal: a real valued line spectrum ($A_n$) and an imaginary valued line spectrum ($B_n$), each with components at both positive and negative multiples of 100 Hz.]
Note that there are both positive and negative frequencies and, for the complex Fourier series of a real valued signal, the real line spectrum is symmetric about $f = 0$ and the imaginary spectrum has point symmetry about the origin. Rather than showing the real and imaginary line spectra, it is more usual to plot the magnitude spectrum and phase spectrum:

[Figure: Calculating the magnitude spectrum $\sqrt{A_n^2 + B_n^2}$ and phase spectrum $\tan^{-1}(B_n/A_n)$ from the complex Fourier series. For a real valued signal the result is identical, except for a magnitude scaling factor of 2, to that obtained from the amplitude/phase form of the Fourier series. As both spectra are symmetric about the y-axis, the negative frequency values are not plotted.]

The "ease" of working with complex exponentials rather than sines and cosines can be illustrated by asking the reader to simplify the following expression to a sum of sinusoids:

$$\sin(\omega_1 t)\sin(\omega_2 t) \qquad (198)$$

This requires the recollection (or re-derivation!) of trigonometric identities to yield:

$$\sin(\omega_1 t)\sin(\omega_2 t) = \frac{1}{2}\cos(\omega_1 - \omega_2)t - \frac{1}{2}\cos(\omega_1 + \omega_2)t \qquad (199)$$

While not particularly arduous, it is somewhat easier to simplify the corresponding product of complex exponentials:

$$e^{j\omega_1 t}e^{j\omega_2 t} = e^{j(\omega_1 + \omega_2)t} \qquad (200)$$

Although a seemingly simple observation, this is the basis of using complex exponentials rather than sines and cosines: they make the mathematics easier. Of course, in situations where the signal being analysed is complex, the complex exponential Fourier series must be used.
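The product-to-sum identity of Eq. 199 (note the minus sign on the sum-frequency term) can be spot-checked numerically. A minimal sketch, not from the original text:

```python
import math

# Check Eq. 199: sin(w1 t) sin(w2 t) = 0.5 cos((w1-w2)t) - 0.5 cos((w1+w2)t)
w1, w2 = 3.0, 5.0
identity_err = max(
    abs(math.sin(w1 * t) * math.sin(w2 * t)
        - 0.5 * math.cos((w1 - w2) * t)
        + 0.5 * math.cos((w1 + w2) * t))
    for t in [i / 100 for i in range(200)])
```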
See also Discrete Fourier Transform, Fast Fourier Transform, Fast Fourier Transform - Decimation-in-Time, Fourier, Fourier Analysis, Fourier Series, Fourier Series - Amplitude/Phase Representation, Fourier Transform, Frequency Response, Impulse Response, Gibbs Phenomenon, Parseval's Theorem.

Fourier Transform: The Fourier series (rather than transform) allows a periodic signal to be broken down into a sum of real valued sine and cosine waves (in the case of a real valued signal) or, more generally, a sum of complex exponentials. However, most signals are aperiodic, i.e. not periodic. The Fourier transform was therefore derived in order to analyse the frequency content of an aperiodic signal. Consider the complex Fourier series of a periodic signal:

$$x(t) = \sum_{n=-\infty}^{\infty} C_n e^{jn\omega_0 t}, \qquad C_n = \frac{1}{T}\int_0^T x(t)e^{-jn\omega_0 t}\,dt \qquad (201)$$

[Figure: A periodic signal $x(t)$ with period $T = 1/f_0$. Clearly $x(t_0) = x(t_0 + T) = x(t_0 + 2T)$.]

The period of the signal has been identified as $T$, and the fundamental frequency is $f_0 = 1/T$. Therefore the Fourier series harmonics occur at frequencies $f_0, 2f_0, 3f_0, \dots$

[Figure: Time signal and magnitude response of a (periodic) square wave of amplitude 1. The fundamental period is $T = 2$ s, so the fundamental frequency is $f_0 = 1/2 = 0.5$ Hz and the harmonics are 0.5 Hz apart when the Fourier series is calculated; $C_0 = 0.5$. The phase response is zero for all components.]

For the above square wave we can calculate the Fourier series using Eq.
201 as:

    C_0 = (1/T) ∫_0^T s(t) dt = (1/2) ∫_0^1 1 dt = 1/2     (202)

    C_n = (1/T) ∫_0^T s(t) e^(-jω0nt) dt = (1/2) ∫_0^1 e^(-jπnt) dt = [e^(-jπnt) / (-2jπn)]_0^1
        = (e^(-jπn) - 1) / (-2jπn) = ((e^(jπn/2) - e^(-jπn/2)) / 2jπn) e^(-jπn/2)
        = (sin(πn/2) / πn) e^(-jπn/2)     (203)

recalling that sin x = (e^(jx) - e^(-jx)) / 2j. Noting that e^(-jπn/2) = cos(πn/2) - j sin(πn/2) takes only the values ±1 or ±j (depending on the value of n), and recalling from Eqs. 190 and 191 (see Fourier Series) that C_n = A_n + jB_n, the square wave can be decomposed into a sum of harmonically related sine waves of amplitudes:

    A_0 = 1/2
    A_n = 1/(nπ) for odd n;  A_n = 0 for even n     (204)

The amplitude response of the Fourier series is plotted above. Now consider the case where the signal is aperiodic, and is in fact just a single pulse:

[Figure: A single aperiodic pulse. This signal is most definitely not periodic and therefore the Fourier series cannot be calculated.]

One way to obtain "some" information on the sinusoidal components comprising this aperiodic signal would be to assume the existence of a periodic "relative", or "pseudo-period", of this signal:

[Figure: A periodic signal that is clearly a relative of the single pulse aperiodic signal. By adding the pseudo-periods we essentially assume that the single pulse of interest is a periodic signal, and therefore we can now use the Fourier series tools to analyse it. The fundamental period is T_p = 4 and therefore the harmonics of the Fourier series are placed f0 = 0.25 Hz apart.]
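The closed form of Eq. 203 can be sanity checked by numerical integration; a short sketch in Python (the step count is an arbitrary accuracy choice):

```python
import math
import cmath

T = 2.0                  # fundamental period: s(t) = 1 for 0 <= t < 1, 0 for 1 <= t < 2
w0 = 2 * math.pi / T     # fundamental angular frequency, here pi

def C_numeric(n, steps=20000):
    # midpoint-rule evaluation of C_n = (1/T) * integral_0^1 e^(-j w0 n t) dt
    dt = 1.0 / steps
    return sum(cmath.exp(-1j * w0 * n * (k + 0.5) * dt) * dt for k in range(steps)) / T

def C_closed(n):
    # Eq. 203: C_n = (sin(pi n / 2) / (pi n)) e^(-j pi n / 2)
    return (math.sin(math.pi * n / 2) / (math.pi * n)) * cmath.exp(-1j * math.pi * n / 2)

for n in range(1, 6):
    print(n, abs(C_numeric(n) - C_closed(n)))  # all negligibly small
```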
If we assumed that the "periodicity" of the pulse was even longer, say 8 seconds, then the spacing between the signal harmonics would further decrease:

[Figure: If we increase the fundamental pseudo-period to T_p = 8 the harmonics of the Fourier series are more closely spaced, at f0 = 1/8 = 0.125 Hz apart.]

The magnitude of all the harmonics proportionally decreases with the increase in the pseudo-period. This is to be expected, since the power of the signal decreases as the pseudo-period increases (the same pulse energy is averaged over a longer period). If we further assumed that the period of the signal was such that T → ∞ then f0 → 0 and, given the finite energy in the signal, the magnitude of each of the Fourier series sine waves will tend to zero given that the harmonics are now so closely spaced! Hence if we multiply the magnitude response by T and plot the Fourier series we have now realised a graphical interpretation of the Fourier transform:

[Figure: If we increase the fundamental pseudo-period such that T → ∞ the frequency spacing between the harmonics of the Fourier series tends to zero, i.e. f0 → 0. Note that the magnitudes of the Fourier series components are scaled down in proportion to the "pseudo" period and in the limit as T → ∞ will tend to zero. Hence the y-axis is plotted in units of 1/T.]

To realise the mathematical version of the Fourier transform first define a new function based on the general Fourier series of Eq. 
201 such that:

    X(f) = C_n / f0 = C_n T     (205)

then:

    x(t) = Σ_{n=-∞}^{∞} C_n e^(j2πnf0t)

    X(f) = ∫_{-T/2}^{T/2} x(t) e^(-j2πnf0t) dt → ∫_{-∞}^{∞} x(t) e^(-j2πft) dt     (206)

where nf0 becomes the continuous variable f as f0 → 0 and n → ∞. This equation is referred to as the Fourier transform, and can of course be written in terms of the angular frequency:

    X(ω) = ∫_{-∞}^{∞} x(t) e^(-jωt) dt     (207)

Knowing the Fourier transform of a signal of course allows us to transform back to the original aperiodic signal:

    x(t) = Σ_{n=-∞}^{∞} C_n e^(j2πnf0t) = Σ_{n=-∞}^{∞} X(f) f0 e^(j2πnf0t)
      ⇒  x(t) = ∫_{-∞}^{∞} X(f) e^(j2πft) df     (208)

This equation is referred to as the inverse Fourier transform and can also be written in terms of the angular frequency:

    x(t) = (1/2π) ∫_{-∞}^{∞} X(ω) e^(jωt) dω     (209)

Hence we have realised the Fourier transform analysis and synthesis pair of equations:

    Synthesis:  x(t) = ∫_{-∞}^{∞} X(f) e^(j2πft) df
    Analysis:   X(f) = ∫_{-∞}^{∞} x(t) e^(-j2πft) dt     (210)

Therefore the Fourier transform of a continuous time signal, x(t), will be a continuous function in frequency. See also Discrete Cosine Transform, Discrete Fourier Transform, Fast Fourier Transform, Fourier Analysis, Fourier Series, Fourier Series - Complex Exponential Representation.

Forward Substitution: See Matrix Algorithms - Forward Substitution.

Fractals: Fractals can be used to define seemingly irregular 1-D signals or 2-D surfaces using, amongst other things, properties of self similarity. Self similarity occurs when the same pattern repeats itself at different scalings, and is often seen in nature. A good introduction and overview of fractals can be found in [86].

Fractional Binary: See Binary Point.

Fractional Bandwidth: A definition of (relative) bandwidth for a signal obtained by dividing the difference of the highest and lowest frequencies of the signal by its center frequency. 
The result is a number between 0 and 2. When this number is multiplied by 100, the relative bandwidth can be stated as a percentage. See also Bandwidth.

Fractional Delay Implementation: See All-pass Filter - Fractional Sample Delay Implementation.

Fractional Sampling Rate Conversion: Sometimes sampling rate conversions are needed between sampling rates that are not integer multiples of each other, and therefore simple integer downsampling or upsampling cannot be performed. One method of changing sampling rate is to convert a signal back to its analog form using a DAC, then resample the signal using an ADC sampling at the required frequency. In general this is not an acceptable solution, as two levels of noise are introduced by the DAC and ADC. Interpolation by a factor of N, followed by decimation by a factor of M, results in a sampling rate change of N/M. The higher the values of N and M, the more computation is required. For example, to convert from the CD sampling rate of 44100Hz to the DAT sampling rate of 48000Hz requires upsampling by a factor of 160 and downsampling by a factor of 147. When performing fractional sampling rate conversion the low pass anti-alias filter associated with decimation and the low pass filter used in interpolation can be combined into one digital filter. See also Upsampling, Downsampling, Decimation, Interpolation.

[Figure: A fractional sampling rate converter. The input at rate fs is upsampled by N, low pass filtered with a cut-off of Nfs/(2·max(N,M)), then downsampled by M to give an output rate of (N/M)fs.]

Frequency: Frequency is measured in Hertz (Hz) and gives a measure of the number of cycles per second of a signal. For example, if a sine wave has a frequency of 300Hz, this means that the signal has 300 single wavelength cycles in one second. Square waves can also be assigned a frequency, defined as 1/T where T is the period of one cycle of the square wave. See also Sine Wave. 
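The factors N = 160 and M = 147 quoted under Fractional Sampling Rate Conversion above come from cancelling the greatest common divisor of the two rates; a minimal sketch (Python, with an illustrative helper name):

```python
from math import gcd

def rate_change_factors(f_in, f_out):
    """Return (N, M) such that interpolating by N then decimating by M
    converts sampling rate f_in to f_out = f_in * N / M."""
    g = gcd(f_in, f_out)
    return f_out // g, f_in // g

# CD (44100 Hz) to DAT (48000 Hz): upsample by 160, downsample by 147
print(rate_change_factors(44100, 48000))  # (160, 147)
```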
Frequency Domain Adaptive Filtering: The LMS (and other adaptive algorithms) can be configured to operate on time series data that has been transformed into the frequency domain [53], [131].

Frequency, Logarithmic: See Logarithmic Frequency.

Frequency Modulation: One of the three ways of modulating a sine wave signal to carry information. The sine wave or carrier has its frequency changed in accordance with the information signal to be transmitted. See also Amplitude Modulation, Phase Modulation.

Frequency Range of Hearing: The frequency range of hearing typically extends from around 20Hz up to 20kHz in healthy young people. For adults the upper range of hearing is more likely to be in the range 11-16kHz, as age erodes the high frequency sensitivity. The threshold of hearing varies over the frequency range, with the most sensitive portion being from around 1-5kHz, where speech frequencies occur. Low frequencies, below 20Hz, are tactile and only audible at very high sound pressure levels; listening to frequencies below 20Hz does not produce any further perception of reducing pitch. Inaudible sound below the lowest perceptible frequency is termed infrasound, and above the highest perceptible frequency is known as ultrasound.

Discrimination between tones at similar frequencies (the JND - just noticeable difference, or DL - Difference Limen) depends on a number of factors such as the frequency, sound pressure level (SPL), and sound duration. The ear can discriminate by about 1Hz for frequencies in the range 1-2kHz, where the SPL is about 20dB above the threshold of hearing and the duration is at least 1/4 second [30]. See also Audiogram, Audiometry, Auditory Filters, Beat Frequencies, Binaural Beats, Difference Limen, Ear, Equal Loudness Contours, Hearing Aids, Hearing Impairment, Hearing Level, Infrasound, Sensation Level, Sound Pressure Level, Spectral Masking, Temporal Masking, Threshold of Hearing, Ultrasound. 
Frequency Response: The frequency response of a system defines how the magnitude and phase of signal components at different frequencies will be changed as the signal passes through, or is convolved with, a linear system. For example, the frequency response of a digital filter may attenuate low frequency magnitudes, but amplify those at high frequencies. The frequency response of a linear system is calculated by taking the discrete Fourier transform (DFT) of the impulse response, or by evaluating the z-transform of the linear system for z = e^(jω) = e^(j2πf). See also Discrete Fourier Transform, Fast Fourier Transform.

[Figure: The frequency response (magnitude only), |H(k)|, of a digital filter is obtained from the DFT of its impulse response h(n):  H(k) = Σ_{n=0}^{N-1} h(n) e^(-j2πnk/N),  plotted against frequency bin k.]

Frequency Shift Keying (FSK): A digital modulation technique in which the information bits are encoded in the frequency of a symbol. Typically, the frequencies are chosen so that the symbols are orthogonal over the symbol period. FSK demodulation can be either coherent (phase of carrier signal known) or noncoherent (phase of carrier signal unknown). Given a symbol period of T seconds, signals separated in frequency by 1/T Hz will be orthogonal and will have continuous phase. Signals separated by 1/(2T) Hz will be orthogonal (if demodulated coherently) but will result in phase discontinuities. See also Amplitude Shift Keying, Continuous Phase Modulation, Minimum Shift Keying, Phase Shift Keying.

Frequency Transformation: The transformation of any time domain signal into the frequency domain.

Frequency Weighting Curves: See Sound Pressure Level Weighting Curves.

Frobenius Norm: See Matrix Properties - Norm.

Formants: The vocal tract (comprising throat, mouth and lips) can act as an acoustic resonator with more than one resonant frequency. These resonant frequencies are known as formants, and they change in frequency as we move the tongue and lips in the process of joining speech sounds together (articulation). 
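The DFT evaluation described under Frequency Response above can be sketched directly (Python; the 5-point moving average filter and 64-point DFT are illustrative choices, not from the text):

```python
import math
import cmath

def magnitude_response(h, N):
    """|H(k)| for k = 0..N-1, where H(k) is the N-point DFT of the
    impulse response h(n) (implicitly zero padded to N points)."""
    return [abs(sum(h[n] * cmath.exp(-2j * math.pi * n * k / N)
                    for n in range(len(h))))
            for k in range(N)]

h = [0.2] * 5                 # 5-point moving average: a simple low pass filter
mag = magnitude_response(h, 64)
print(mag[0])                 # unity gain at DC (k = 0)
print(mag[32])                # much smaller gain at the half sampling rate bin
```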
Four Wire Circuit: A circuit containing two pairs of wires (or their logical equivalent) for simultaneous (full duplex) two-way transmission. See also Two Wire Channel, Full Duplex, Half Duplex, Simplex.

Fricatives: One of the elementary sounds of speech, the others being plosives, sibilant fricatives, semi-vowels, and nasals. Fricatives are formed by forcing air between the lower lip and teeth, as when "f" is used in the word "fin". See also Nasals, Plosives, Semi-vowels, Sibilant Fricatives.

Full Adder: The full adder is the basic single bit arithmetic building block for the design of multibit binary adders, multipliers and arithmetic logic units. The full adder has three single bit inputs and two single bit outputs:

    a  b  cin | cout  sout
    0  0  0   |  0     0
    0  0  1   |  0     1
    0  1  0   |  0     1
    0  1  1   |  1     0
    1  0  0   |  0     1
    1  0  1   |  1     0
    1  1  0   |  1     0
    1  1  1   |  1     1

    cout = a'bc + ab'c + abc' + abc = ab + bc + ac
    sout = a'b'c + a'bc' + ab'c' + abc = (a ⊕ b) ⊕ c

Boolean Algebra: (a+b) represents (a OR b); (ab) represents (a AND b); a ⊕ b represents (a Exclusive-OR b); a' represents (NOT a); here c denotes cin. The full adder (FA) simply adds three bits (0 or 1) together to produce a sum bit, sout, and a carry bit, cout. See also Arithmetic Logic Unit, Parallel Adder, Parallel Multiplier, DSP Processor.

Full Duplex: Pertaining to the capability to send and receive simultaneously. See also Half Duplex, Simplex.

Fundamental Frequency: The name of the lowest (and usually dominant) frequency component, which has associated with it various harmonics (integer multiples of the frequency). In music for example the fundamental frequency identifies the note being played, and the various harmonics (and occasionally sub-harmonics) give the note its rich characteristic quality pertaining to the instrument being played. See also Fourier Series, Harmonics, Music, Sub-Harmonic, Western Music Scale.

Fundamental Period: See Fourier Series. 
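The Full Adder truth table and Boolean equations above map directly to code; a small sketch (Python):

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sout, cout) for input bits a, b and cin."""
    sout = (a ^ b) ^ cin                        # sout = (a XOR b) XOR cin
    cout = (a & b) | (b & cin) | (a & cin)      # cout = ab + b.cin + a.cin
    return sout, cout

# Reproduce the truth table; the pair (cout, sout) is the 2-bit sum a + b + cin
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            sout, cout = full_adder(a, b, cin)
            print(a, b, cin, cout, sout)
```

A multibit ripple-carry adder is then just a chain of these, with each stage's cout feeding the next stage's cin.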
Fuzzy Logic: A mathematical set theory which allows systems to be described in natural language rules. Binary logic, for example, uses only two levels: 0 and 1. Fuzzy logic would still have the levels 0 and 1, but it would also be capable of describing all logic levels in between, perhaps ranging through: almost definitely low, probably low, maybe high or low, probably high, to almost definitely high. Control of systems defined by fuzzy logic is currently being implemented in conjunction with DSP algorithms. Essentially fuzzy logic is a technique for representing information and combining objective knowledge (such as mathematical models and precise definitions) with subjective knowledge (a linguistic description of a problem). One advantage often cited about fuzzy systems is that they can produce results almost as good as an "optimum" system, but they are much simpler to implement. A good introduction, with tutorial papers, can be found in [63].

G

G-Series Recommendations: The G-series recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T, and formerly known as CCITT) propose a number of standards for transmission systems and media, digital systems and networks. From a DSP perspective, G.164/165/166/167 define aspects of echo and acoustic echo cancellation, and some of the G.7xx recommendations define various coding and compression schemes which underpin digital audio telecommunication. The ITU-T G-series recommendations (http://www.itu.ch) can be summarised as:

G.100  Definitions used in Recommendations on general characteristics of international telephone connections and circuits.
G.101  The transmission plan.
G.102  Transmission performance objectives and Recommendations.
G.103  Hypothetical reference connections.
G.105  Hypothetical reference connection for crosstalk studies.
G.111  Loudness ratings (LRs) in an international connection.
G.113  Transmission impairments.
G.114  One-way transmission time.
G.117  Transmission aspects of unbalance about earth (definitions and methods).
G.120  Transmission characteristics of national networks.
G.121  Loudness ratings (LRs) of national systems.
G.122  Influence of national systems on stability and talker echo in international connections.
G.123  Circuit noise in national networks.
G.125  Characteristics of national circuits on carrier systems.
G.126  Listener echo in telephone networks.
G.132  Attenuation distortion.
G.133  Group-delay distortion.
G.134  Linear crosstalk.
G.135  Error on the reconstituted frequency.
G.141  Attenuation distortion.
G.142  Transmission characteristics of exchanges.
G.143  Circuit noise and the use of Companders.
G.151  General performance objectives applicable to all modern international circuits and national extension circuits.
G.152  Characteristics appropriate to long-distance circuits of a length not exceeding 2500 km.
G.153  Characteristics appropriate to international circuits more than 2500 km in length.
G.162  Characteristics of Companders for telephony.
G.164  Echo suppressors.
G.165  Echo cancellers.
G.166  Characteristics of syllabic Companders for telephony on high capacity long distance systems.
G.167  Acoustic echo controllers.
G.172  Transmission plan aspects of international conference calls.
G.173  Transmission planning aspects of the speech service in digital public land mobile networks.
G.174  Transmission performance objectives for terrestrial digital wireless systems using portable terminals to access the PSTN.
G.180  Characteristics of N + M type direct transmission restoration systems for use on digital and analogue sections, links or equipment.
G.181  Characteristics of 1 + 1 type restoration systems for use on digital transmission links.
G.191  Software tools for speech and audio coding standardization.
G.211  Make-up of a carrier link.
G.212  Hypothetical reference circuits for analogue systems.
G.213  Interconnection of systems in a main repeater station.
G.214  Line stability of cable systems.
G.215  Hypothetical reference circuit of 5000 km for analogue systems.
G.221  Overall recommendations relating to carrier-transmission systems.
G.222  Noise objectives for design of carrier-transmission systems of 2500 km.
G.223  Assumptions for the calculation of noise on hypothetical reference circuits for telephony.
G.224  Maximum permissible value for the absolute power level (power referred to one milliwatt) of a signalling pulse.
G.225  Recommendations relating to the accuracy of carrier frequencies.
G.226  Noise on a real link.
G.227  Conventional telephone signal.
G.228  Measurement of circuit noise in cable systems using a uniform-spectrum random noise loading.
G.229  Unwanted modulation and phase jitter.
G.230  Measuring methods for noise produced by modulating equipment and through-connection filters.
G.231  Arrangement of carrier equipment.
G.232  12-channel terminal equipments.
G.233  Recommendations concerning translating equipments.
G.241  Pilots on groups, supergroups, etc.
G.242  Through-connection of groups, supergroups, etc.
G.243  Protection of pilots and additional measuring frequencies at points where there is a through-connection.
G.322  General characteristics recommended for systems on symmetric pair cables.
G.325  General characteristics recommended for systems providing 12 telephone carrier circuits on a symmetric cable pair [(12+12) systems].
G.332  12 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs.
G.333  60 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs.
G.334  18 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs.
G.341  1.3 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
G.343  4 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
G.344  6 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
G.345  12 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
G.346  18 MHz systems on standardized 1.2/4.4 mm coaxial cable pairs.
G.352  Interconnection of coaxial carrier systems of different designs.
G.411  Use of radio-relay systems for international telephone circuits.
G.421  Methods of interconnection.
G.422  Interconnection at audio-frequencies.
G.423  Interconnection at the baseband frequencies of frequency-division multiplex radio-relay systems.
G.431  Hypothetical reference circuits for frequency-division multiplex radio-relay systems.
G.441  Permissible circuit noise on frequency-division multiplex radio-relay systems.
G.442  Radio-relay system design objectives for noise at the far end of a hypothetical reference circuit with reference to telegraphy transmission.
G.451  Use of radio links in international telephone circuits.
G.473  Interconnection of a maritime mobile satellite system with the international automatic switched telephone service; transmission aspects.
G.601  Terminology for cables.
G.602  Reliability and availability of analogue cable transmission systems and associated equipments.
G.611  Characteristics of symmetric cable pairs for analogue transmission.
G.612  Characteristics of symmetric cable pairs designed for the transmission of systems with bit rates of the order of 6 to 34 Mbit/s.
G.613  Characteristics of symmetric cable pairs usable wholly for the transmission of digital systems with a bit rate of up to 2 Mbit/s.
G.614  Characteristics of symmetric pair star-quad cables designed earlier for analogue transmission systems and being used now for digital system transmission at bit rates of 6 to 34 Mbit/s.
G.621  Characteristics of 0.7/2.9 mm coaxial cable pairs.
G.622  Characteristics of 1.2/4.4 mm coaxial cable pairs.
G.623  Characteristics of 2.6/9.5 mm coaxial cable pairs.
G.631  Types of submarine cable to be used for systems with line frequencies of less than about 45 MHz.
G.650  Definition and test methods for the relevant parameters of single-mode fibres.
G.651  Characteristics of a 50/125 µm multimode graded index optical fibre cable.
G.652  Characteristics of a single-mode optical fibre cable.
G.653  Characteristics of a dispersion-shifted single-mode optical fibre cable.
G.654  Characteristics of a 1550 nm wavelength loss-minimized single-mode optical fibre cable.
G.661  Definition and test methods for relevant generic parameters of optical fibre amplifiers.
G.662  Generic characteristics of optical fibre amplifier devices and sub-systems.
G.701  Vocabulary of digital transmission and multiplexing, and pulse code modulation (PCM) terms.
G.702  Digital hierarchy bit rates.
G.703  Physical/electrical characteristics of hierarchical digital interfaces.
G.704  Synchronous frame structures used at primary and secondary hierarchical levels.
G.705  Characteristics required to terminate digital links on a digital exchange.
G.706  Frame alignment and cyclic redundancy check (CRC) procedures relating to basic frame structures defined in Recommendation G.704.
G.707  Synchronous digital hierarchy bit rates.
G.708  Network node interface for the synchronous digital hierarchy.
G.709  Synchronous multiplexing structure.
G.711  Pulse code modulation (PCM) of voice frequencies.
G.712  Transmission performance characteristics of pulse code modulation.
G.720  Characterization of low-rate digital voice coder performance with non-voice signals.
G.722  7 kHz audio-coding within 64 kbit/s; Annex A: Testing signal-to-total distortion ratio for 7 kHz audio-codecs at 64 kbit/s.
G.724  Characteristics of a 48-channel low bit rate encoding primary multiplex operating at 1544 kbit/s.
G.725  System aspects for the use of the 7 kHz audio codec within 64 kbit/s.
G.726  40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM); Annex A: Extensions of Recommendation G.726 for use with uniform-quantized input and output.
G.727  5-, 4-, 3- and 2-bits sample embedded adaptive differential pulse code modulation (ADPCM).
G.728  Coding of speech at 16 kbit/s using low-delay code excited linear prediction; Annex G: 16 kbit/s fixed point specification.
G.731  Primary PCM multiplex equipment for voice frequencies.
G.732  Characteristics of primary PCM multiplex equipment operating at 2048 kbit/s.
G.733  Characteristics of primary PCM multiplex equipment operating at 1544 kbit/s.
G.734  Characteristics of synchronous digital multiplex equipment operating at 1544 kbit/s.
G.735  Characteristics of primary PCM multiplex equipment operating at 2048 kbit/s and offering synchronous digital access at 384 kbit/s and/or 64 kbit/s.
G.736  Characteristics of a synchronous digital multiplex equipment operating at 2048 kbit/s.
G.737  Characteristics of an external access equipment operating at 2048 kbit/s offering synchronous digital access at 384 kbit/s and/or 64 kbit/s.
G.738  Characteristics of primary PCM multiplex equipment operating at 2048 kbit/s and offering synchronous digital access at 320 kbit/s and/or 64 kbit/s.
G.739  Characteristics of an external access equipment operating at 2048 kbit/s offering synchronous digital access at 320 kbit/s and/or 64 kbit/s.
G.741  General considerations on second order multiplex equipments.
G.742  Second order digital multiplex equipment operating at 8448 kbit/s and using positive justification.
G.743  Second order digital multiplex equipment operating at 6312 kbit/s and using positive justification.
G.744  Second order PCM multiplex equipment operating at 8448 kbit/s.
G.745  Second order digital multiplex equipment operating at 8448 kbit/s and using positive/zero/negative justification.
G.746  Characteristics of second order PCM multiplex equipment operating at 6312 kbit/s.
G.747  Second order digital multiplex equipment operating at 6312 kbit/s and multiplexing three tributaries at 2048 kbit/s.
G.751  Digital multiplex equipments operating at the third order bit rate of 34368 kbit/s and the fourth order bit rate of 139264 kbit/s and using positive justification.
G.752  Characteristics of digital multiplex equipments based on a second order bit rate of 6312 kbit/s and using positive justification.
G.753  Third order digital multiplex equipment operating at 34368 kbit/s and using positive/zero/negative justification.
G.754  Fourth order digital multiplex equipment operating at 139264 kbit/s and using positive/zero/negative justification.
G.755  Digital multiplex equipment operating at 139264 kbit/s and multiplexing three tributaries at 44736 kbit/s.
G.761  General characteristics of a 60-channel transcoder equipment.
G.762  General characteristics of a 48-channel transcoder equipment.
G.763  Summary of Recommendation G.763.
G.764  Voice packetization - packetized voice protocols.
G.765  Packet circuit multiplication equipment.
G.766  Facsimile demodulation/remodulation for DCME.
G.772  Protected monitoring points provided on digital transmission systems.
G.773  Protocol suites for Q-interfaces for management of transmission systems.
G.774  Synchronous Digital Hierarchy (SDH) management information model for the network element view. G.774.01: SDH performance monitoring for the network element view. G.774.02: SDH configuration of the payload structure for the network element view. G.774.03: SDH management of multiplex-section protection for the network element view.
G.775  Loss of signal (LOS) and alarm indication signal (AIS) defect detection and clearance criteria.
G.780  Vocabulary of terms for synchronous digital hierarchy (SDH) networks and equipment.
G.781  Structure of Recommendations on equipment for the synchronous digital hierarchy (SDH).
G.782  Types and general characteristics of synchronous digital hierarchy (SDH) equipment.
G.783  Characteristics of synchronous digital hierarchy (SDH) equipment functional blocks.
G.784  Synchronous digital hierarchy (SDH) management.
G.791  General considerations on transmultiplexing equipments.
G.792  Characteristics common to all transmultiplexing equipments.
G.793  Characteristics of 60-channel transmultiplexing equipments.
G.794  Characteristics of 24-channel transmultiplexing equipments.
G.795  Characteristics of codecs for FDM assemblies.
G.796  Characteristics of a 64 kbit/s cross-connect equipment with 2048 kbit/s access ports.
G.797  Characteristics of a flexible multiplexer in a plesiochronous digital hierarchy environment.
G.801  Digital transmission models.
G.802  Interworking between networks based on different digital hierarchies and speech encoding laws.
G.803  Architectures of transport networks based on the synchronous digital hierarchy (SDH).
G.804  ATM cell mapping into plesiochronous digital hierarchy (PDH).
G.821  Error performance of an international digital connection forming part of an integrated services digital network.
G.822  Controlled slip rate objectives on an international digital connection.
G.823  The control of jitter and wander within digital networks which are based on the 2048 kbit/s hierarchy.
G.824  The control of jitter and wander within digital networks which are based on the 1544 kbit/s hierarchy.
G.825  The control of jitter and wander within digital networks which are based on the Synchronous Digital Hierarchy (SDH).
G.826  Error performance parameters and objectives for international, constant bit rate digital paths at or above the primary rate.
G.831  Management capabilities of transport networks based on the Synchronous Digital Hierarchy (SDH).
G.832  Transport of SDH elements on PDH networks: Frame and multiplexing structures.
G.901  General considerations on digital sections and digital line systems.
G.911  Parameters and calculation methodologies for reliability and availability of fibre optic systems.
G.921  Digital sections based on the 2048 kbit/s hierarchy.
G.931  Digital line sections at 3152 kbit/s.
G.950  General considerations on digital line systems.
G.951  Digital line systems based on the 1544 kbit/s hierarchy on symmetric pair cables.
G.952  Digital line systems based on the 2048 kbit/s hierarchy on symmetric pair cables.
G.953  Digital line systems based on the 1544 kbit/s hierarchy on coaxial pair cables.
G.954  Digital line systems based on the 2048 kbit/s hierarchy on coaxial pair cables.
G.955  Digital line systems based on the 1544 kbit/s and the 2048 kbit/s hierarchy on optical fibre cables.
G.957  Optical interfaces for equipments and systems relating to the synchronous digital hierarchy.
G.958  Digital line systems based on the synchronous digital hierarchy for use on optical fibre cables.
G.960  Access digital section for ISDN basic rate access.
G.961  Digital transmission system on metallic local lines for ISDN basic rate access.
G.962  Access digital section for ISDN primary rate at 2048 kbit/s.
G.963  Access digital section for ISDN primary rate at 1544 kbit/s.
G.964  V-Interfaces at the digital local exchange (LE): V5.1 interface (based on 2048 kbit/s) for the support of access network (AN).
G.965  V-Interfaces at the digital local exchange (LE): V5.2 interface (based on 2048 kbit/s) for the support of Access Network (AN).
G.971  General features of optical fibre submarine cable systems.
G.972  Definition of terms relevant to optical fibre submarine cable systems.
G.974  Characteristics of regenerative optical fibre submarine cable systems.
G.981  PDH optical line systems for the local network.

For additional detail consult the appropriate standard document or contact the ITU. See also International Telecommunication Union, ITU-T Recommendations, Standards.

Gabor Spectrogram: An algorithm to transform signals from the time domain to the joint time-frequency domain (similar to the Short Time FFT spectrogram). 
The Gabor spectrogram is most useful for analyzing signals whose frequency content is time varying, but whose variation does not show up with conventional spectrogram methods. For example, in a particular jet engine the casing vibrates at 50Hz when running at full speed. If the frequency actually fluctuates by about ±1Hz around 50Hz, then when using the conventional FFT the fluctuations may not have enough energy to be detected, or may be smeared due to windowing effects. The Gabor spectrogram on the other hand should be able to highlight the fluctuations.

Gain: An increase in the voltage or power level of a signal, usually accomplished by an amplifier. Gain is expressed as a factor, or in dB. See also Amplifier.

Gauss Transform: See Matrix Decompositions - Gauss Transform.

Gaussian Distribution: See Random Variable.

Gaussian Elimination: See Matrix Decompositions - Gaussian Elimination.

Gibbs Phenomenon: The Fourier series for a periodic signal with discontinuities (or near-discontinuities) will tend to an infinite series. If the signal is approximated using a finite series of harmonics, then the reconstructed signal will tend to oscillate near or on the discontinuities. For example, the Fourier series of a signal, x(t), is given by:

    x(t) = Σ_{n=0}^{∞} A_n cos(2πnt/T) + Σ_{n=1}^{∞} B_n sin(2πnt/T)     (211)

For a signal such as a square wave, the series will be infinite. If however we try to produce the signal using just the first few Fourier series coefficients up to M:

    x(t) = Σ_{n=0}^{M} A_n cos(2πnt/T) + Σ_{n=1}^{M} B_n sin(2πnt/T)     (212)
and relative amplitudes of 1, 1 ⁄ 3, 1 ⁄ 5, … If this series is truncated to the 15th harmonic, then the resulting “square wave” rings at the discontinuities. See also Discrete Fourier Transform, Fourier Series, Fourier Series - Amplitude/Phase Representation, Fourier Series - Complex Exponential Representation, Fourier Transform. Given’s Rotations: See Matrix Decompositions - Given’s Rotations. Global Information Infrastructure (GII): The Global Information Infrastructure will be jointly defined by the International Organization for Standards (ISO), International Electrotechnical Committee (IEC) and the International Telecommunication Union (ITU). The ISO, IEC and ITU have all defined various standards that have direct relevance to interchange of graphics, audio, video and data information via computer and telephone networks and all therefore have a relevant role to play in the definition of the GII. Global Minimum: The global minimum is the smallest value taken on by that function. For example for the function, f(x), the global minimum is at x = xg. The minima are x1, x2 and x3 are termed local minima: f(x) x1 xg x2 x3 x The existence of local minima can cause problems when using a gradient descent based adaptive algorithm. In these cases, the algorithm can get stuck in a local minimum. This is not a problem when the cost function is quadratic in the parameter of interest (e.g., the filter coefficients), since 181 quadratic functions (such as a parabola) have a unique minimum (or maximum) or, worst case, a set of continuous minima that all give the same cost. See also Hyperparaboloid, Local Minima, Adaptive IIR Filters, Simulated Annealing. Glue Logic: To connect different chips on printed circuit boards (PCBs) it is often necessary to use buffers, inverters, latches, logic gates etc. These components are often referred to a glue logic. 
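The ringing described under Gibbs Phenomenon above is easy to reproduce numerically. The following sketch (our own illustration; the sampling grid and function name are arbitrary) builds the truncated sine series of a ±1 square wave, as in Eq. 212, and measures the overshoot near the discontinuity at t = 0:

```python
import math

def truncated_square(t, M, T=1.0):
    """Partial Fourier series of a +/-1 square wave: sine terms at the odd
    harmonics f0, 3f0, 5f0, ... with amplitudes 4/(n*pi)."""
    return sum(4.0 / (math.pi * n) * math.sin(2.0 * math.pi * n * t / T)
               for n in range(1, M + 1, 2))

# The partial sum overshoots the +1 level near t = 0 by roughly 9% of the
# total jump of 2 (a peak of about 1.18); adding more harmonics narrows the
# ringing but does not reduce the peak overshoot.
peak_15 = max(truncated_square(k / 4000.0, 15) for k in range(1, 500))
peak_101 = max(truncated_square(k / 4000.0, 101) for k in range(1, 500))
```

Away from the discontinuity the partial sum does converge: at t = T/4, for instance, the truncated series with M = 101 is within about 1% of the true value of 1.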
Many DSP chip designers pride themselves on having eliminated glue logic for chip interfacing, especially between D/A and A/D type chips.

Golden Ears: A term often used to describe a person with excellent hearing, both in terms of frequency range and threshold of hearing. Golden ear individuals can be in demand from recording studios, audio equipment manufacturers, loudspeaker manufacturers and so on. Although a necessary qualification for golden ears is excellent hearing, these individuals most probably learn their trade from many years of audio industry experience. It would be expected that a golden ears individual could "easily" distinguish Compact Disc (CD) from analog records. The big irony is that golden eared individuals cannot distinguish recordings of REO Speedwagon from those of Styx. See also Audiometry, Compact Disc, Frequency Range of Hearing, Threshold of Hearing.

Goertzel's Algorithm: Goertzel's algorithm is used to calculate whether a frequency component is present at a particular frequency bin of a discrete Fourier transform (DFT). Consider the DFT equation calculating the discrete frequency domain representation, $X(m)$, of N samples of a discrete time signal $x(n)$:

$$X(m) = \sum_{n=0}^{N-1} x(n)\, e^{-j\frac{2\pi n m}{N}}, \quad \text{for } m = 0 \text{ to } N-1 \qquad (213)$$

This computation requires $N^2$ complex multiply accumulates (CMACs), and the frequency representation will have a resolution of $f_s/N$ Hz. If we require to calculate the frequency component at the p-th frequency bin only, just N CMACs are required. Of course the fast Fourier transform (FFT) is usually used instead of the DFT, and this requires $N\log_2 N$ CMACs. Therefore if a Fourier transform is being performed simply to find whether a tonal component is present at one frequency only, it makes more sense to use the DFT. Note that by the nature of the calculation data flow, the FFT cannot calculate a frequency component at one frequency only - it's all bins or none.
Goertzel's algorithm provides a formal algorithmic procedure for calculating a single bin DFT. Goertzel's algorithm to calculate the p-th frequency bin of an N point DFT is given by:

$$s_p(k) = x(k) + 2\cos\!\left(\frac{2\pi p}{N}\right) s_p(k-1) - s_p(k-2)$$
$$y_p(k) = s_p(k) - W_N^p\, s_p(k-1) \qquad (214)$$

where $W_N^p = e^{j\frac{2\pi p}{N}}$ and the initial conditions $s_p(-2) = s_p(-1) = 0$ apply.

Eq. 214 calculates the p-th frequency bin of the DFT after the algorithm has processed N data points, i.e. $X(p) = y_p(N)$. Goertzel's algorithm can be represented as a second order IIR filter:

[Figure: an IIR filter representation of Goertzel's algorithm, with input x(k), internal state s_p(k), output y_p(k), recursive weights 2cos(2πp/N) and -1, and non-recursive weight W_N^p.] Note that the non-recursive part of the filter has a complex weight, whereas the recursive part has only real weights. The recursive part of this filter is in fact a simple narrowband filter. For an efficient implementation it is best to compute $s_p(k)$ for N samples, and thereafter evaluate $y_p(N)$.

For tone detection (i.e. tone present or not present), only the signal power of the p-th frequency bin is of interest, i.e. $|X(p)|^2$. Therefore from Eq. 214, expanding $y_p(N)y_p^*(N)$ and using $W_N^p + W_N^{-p} = 2\cos(2\pi p/N)$:

$$|X(p)|^2 = X(p)X^*(p) = y_p(N)y_p^*(N) = s_p^2(N) + s_p^2(N-1) - 2\cos\!\left(\frac{2\pi p}{N}\right) s_p(N)\, s_p(N-1) \qquad (215)$$

Goertzel's algorithm is widely used for dual tone multifrequency (DTMF) tone detection because of its simplicity and because it requires less computation than the DFT or FFT. For DTMF tones, there are 8 separate frequencies which must be detected. Therefore a total of 8 frequency bins are required. The International Telecommunication Union (ITU) suggests in standards Q.23 and Q.24 that a 205 point DFT is performed for DTMF detection. To do a full DFT would require 205 × 205 = 42025 complex multiplies and adds (CMACs). To use a zero padded 256 point FFT would require 256 log₂256 = 2048 CMACs.
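The recursion of Eq. 214 reduces, for tone detection, to a few lines of code. The sketch below (our own illustration; the function name and test signal are not from any standard) computes the bin power of Eq. 215 in the equivalent form $s_1^2 + s_2^2 - 2\cos(2\pi p/N)s_1 s_2$, obtained by substituting the final recursion step:

```python
import math

def goertzel_power(x, p):
    """Squared magnitude |X(p)|^2 of the p-th DFT bin of x (Goertzel)."""
    N = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * p / N)
    s1 = s2 = 0.0                      # s_p(k-1) and s_p(k-2), initially zero
    for sample in x:
        s0 = sample + coeff * s1 - s2  # the recursion of Eq. 214
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

# DTMF-style check: a 697 Hz tone sampled at 8 kHz, analysed over N = 205
# points. The bin number follows from p = round(f * N / fs) = 18 for 697 Hz.
fs, N = 8000.0, 205
tone = [math.sin(2.0 * math.pi * 697.0 * k / fs) for k in range(N)]
present = goertzel_power(tone, 18)   # large: tone energy near this bin
absent = goertzel_power(tone, 31)    # small: nothing at the 1209 Hz bin
```

The same round(f·N/fs) rule reproduces all of the standard DTMF bin numbers for fs = 8000 Hz and N = 205.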
Given that we are only interested in 8 frequency bins (and not 205 or 256), the computation required by Goertzel's algorithm is 8 × 205 = 1640 CMACs. Compared to the FFT, Goertzel's algorithm is simple and requires little memory or assembly language code to program. For DTMF tone detection the frequency bins corresponding to the second harmonic of each tone are also calculated. Hence the total computation of Goertzel's algorithm in this case is 3280 CMACs, which is more than for the FFT. However the simplicity of Goertzel's algorithm means it is still the technique of choice. In order to detect the tones at the DTMF frequencies, using a 205 point DFT with fs = 8000 Hz, the frequency bins to evaluate via Goertzel's algorithm are:

frequency, f / Hz    bin
 697                  18
 770                  20
 852                  22
 941                  24
1209                  31
1336                  34
1477                  38
1633                  42

Note that if the sampling frequency is not 8000 Hz, or a different number of data points is used, then the bin numbers will be different from above. See also Discrete Fourier Transform, Dual Tone Multifrequency, Fast Fourier Transform.

Gram-Schmidt: See Matrix Decompositions - Gram-Schmidt.

Granular Synthesis: A technique for musical instrument sound synthesis [13], [14], [32]. See also Music, Western Music Scale.

Granularity Effects: If the step size is too large in a delta modulator, then the delta modulated signal will give rise to a large error and completely fail to encode signals with a magnitude less than the step size. See also Delta Modulation, Slope Overload.

[Figure: a delta modulated signal x(n) against time, illustrating granularity effects.]

Graphic Interchange Format (GIF): The GIF format has become a de facto industry standard for the interchange of raster graphic data. GIF was first developed by Compuserve Inc, USA. GIF essentially defines a protocol for on-line transmission and interchange of raster graphic data such that it is completely independent of the hardware used to create or display the image.
GIF has a limited, non-exclusive, royalty-free license and has widespread use on the Internet and in many DSP enabled multimedia systems. See also Global Information Infrastructure, Joint Photographic Experts Group, Standards.

Graphical Compiler: A system that allows you to draw your algorithm and application architecture on a computer screen using a library of icons (FIR filters, FFTs etc.), which will then be compiled into executable code, usually 'C', which can then be cross compiled to an appropriate assembly language for implementation on a DSP processor. See also Cross Compiler.

Graphical Equalizer: A device used in music systems to control the frequency content of the output. A graphic equalizer is therefore effectively a set of bandpass filters with independent gain settings that can be implemented in the analog or digital domains.

Group Delay: See Finite Impulse Response Filter.

Group Delay Equalisation: A technique to equalise the phase response of a system to be linear (i.e. constant group delay) by cascading the output of the system with an all pass filter designed to have suitable phase shifting characteristics. The magnitude frequency response of the system cascaded with the all pass filter is the same as that of the system on its own.

[Figure: gain (dB) and phase plots of a non-linear phase filter G(z), an all-pass filter H_A(z), and their cascade. Group delay equalisation by cascading an all pass filter H_A(z) with a non-linear phase filter G(z) in order to linearise the phase response and therefore produce a constant group delay. The magnitude frequency response of the cascaded system, |G(e^jω)H_A(e^jω)|, is the same as that of the original system, |G(e^jω)|.]

The design of group delay equalisers is not a trivial procedure.
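The property relied on above - that an all-pass section has unit magnitude response at every frequency, so only the phase of the cascade changes - can be checked numerically. A short sketch (our own illustration, using NumPy; the coefficient values are arbitrary) cascades a first-order all-pass H_A(z) = (a + z⁻¹)/(1 + a·z⁻¹) with a short FIR filter G(z):

```python
import numpy as np

a = 0.5                                  # all-pass coefficient (arbitrary, |a| < 1)
w = np.linspace(0.0, np.pi, 256)         # frequency grid from 0 to Nyquist
z = np.exp(1j * w)

H_A = (a + z**-1) / (1.0 + a * z**-1)    # first-order all-pass: |H_A(e^jw)| = 1

b = [1.0, 0.4, 0.2]                      # an arbitrary example filter G(z)
G = b[0] + b[1] * z**-1 + b[2] * z**-2   # G(e^jw)

mag_before = np.abs(G)                   # |G(e^jw)|
mag_after = np.abs(G * H_A)              # |G(e^jw) H_A(e^jw)|: identical
phase_shift = np.angle(H_A)              # the phase (and group delay) does change
```

A practical equaliser cascades several such sections and optimises their coefficients so that the total phase is as close to linear as possible; whatever coefficients are chosen, the magnitude response is untouched, which is what makes the approach attractive.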
See also All-pass Filter, Equalisation, Finite Impulse Response Filter - Linear Phase.

Groupe Spécial Mobile (GSM): The European mobile communication system that implements 13.5 kbps speech coding (with half-rate 6.5 kbps channels optional) and uses Gaussian Minimum Shift Keying (GMSK) modulation [85]. Data transmission is also available at rates slightly below the speech rates. See also Minimum Shift Keying.

H

H261: See H-Series Recommendations - H261.

H320: See H-Series Recommendations - H320.

H-Series Recommendations: The H-series recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T, and formerly known as CCITT) propose a number of standards for the line transmission of non-telephone signals. Some of the current ITU-T H-series recommendations (http://www.itu.ch) can be summarised as:

H.100 Visual telephone systems.
H.110 Hypothetical reference connections for videoconferencing using primary digital group transmission.
H.120 Codecs for videoconferencing using primary digital group transmission.
H.130 Frame structures for use in the international interconnection of digital codecs for videoconferencing or visual telephony.
H.140 A multipoint international videoconference system.
H.200 Framework for Recommendations for audiovisual services.
H.221 Frame structure for a 64 to 1920 kbit/s channel in audiovisual teleservices.
H.224 A real time control protocol for simplex application using the H.221 LSD/HSD/MLP channels.
H.230 Frame-synchronous control and indication signals for audiovisual systems.
H.231 Multipoint control units for audiovisual systems using digital channels up to 2 Mbit/s.
H.233 Confidentiality system for audiovisual services.
H.234 Encryption key management and authentication system for audiovisual services.
H.242 System for establishing communication between audiovisual terminals using digital channels up to 2 Mbit/s.
H.243 Procedures for establishing communication between three or more audiovisual terminals using digital channels up to 2 Mbit/s.
H.261 Video codec for audiovisual services at p x 64 kbit/s.
H.281 A far end camera control protocol for videoconferences using H.224.
H.320 Narrow-band visual telephone systems and terminal equipment.
H.331 Broadcasting type audiovisual multipoint systems and terminal equipment.

From the point of view of DSP and multimedia systems and algorithms, the above title descriptions of H.242, H.261 and H.320 can be expanded upon as per http://www.itu.ch:

• H.242: The H.242 recommendation defines audiovisual communication using digital channels up to 2 Mbit/s. This recommendation should be read in conjunction with ITU-T recommendations G.725, H.221 and H.230. H.242 is suitable for applications that can use narrowband (3 kHz) and wideband (7 kHz) speech together with video, such as video-telephony, audio and videoconferencing and so on. H.242 can produce speech, and optionally video and/or data at several rates, in a number of different modes. Some applications will require only a single channel, whereas others may require two or more channels to provide the higher bandwidth.

• H.261: The H.261 recommendation describes video coding and decoding methods for the moving picture component of audiovisual services at the rate of p x 64 kbit/s, where p is an integer in the range 1 to 30, i.e. 64 kbit/s to 1.92 Mbit/s. H.261 is suitable for transmission of video over ISDN lines, for applications such as videophones and videoconferencing. The videophone application can tolerate a low image quality, which can be achieved with p = 1 or 2. For videoconferencing applications, where the transmitted image is likely to include a few people and last for a long period, higher picture quality is required and p > 6 is needed.
H.261 defines two picture formats: CIF (Common Intermediate Format), which has 288 lines by 352 pixels/line of luminance information and 144 x 176 of chrominance information; and QCIF (Quarter Common Intermediate Format), which is 144 lines by 176 pixels/line of luminance and 72 x 88 of chrominance. The choice of CIF or QCIF depends on available channel capacity and desired quality. The H.261 encoding algorithm is similar in structure to that of MPEG, however they are not compatible. It is also worth noting that H.261 requires considerably less CPU power for encoding than MPEG. The algorithm also makes best use of the available bandwidth by trading picture quality against motion: a fast moving image will have a lower quality than a static image. H.261 used in this way is thus a constant-bit-rate encoding rather than a constant-quality, variable-bit-rate encoding.

• H.320: H.320 specifies narrow-band visual telephone services for use in channels where the data rate cannot exceed 1920 kbit/s.

For additional detail consult the appropriate standard document or contact the ITU. See also International Telecommunication Union, ITU-T Recommendations, Standards.

Haas Effect: In a reverberant environment the sound energy received by the direct path can be much lower than the energy received by indirect reflective paths. However the human ear is still able to localize the sound location correctly by localizing the first components of the signal to arrive. Later echoes arriving at the ear increase the perceived loudness of the sound as they will have the same general spectrum. This psychoacoustic effect is commonly known as the precedence effect, the law of the first wavefront, or sometimes the Haas effect [30]. The Haas effect applies mainly to short duration sounds or those of a discontinuous or varying form. See also Ear, Lateralization, Source Localization, Threshold of Hearing.
Habituation: Habituation is the effect of the auditory mechanism not perceiving a repetitive noise (which is above the threshold of hearing), such as the ticking of a nearby clock or the passing of nearby traffic, until attention is directed towards the sound. See also Adaptation, Psychoacoustics, Threshold of Hearing.

Hamming Distance: Often used in channel coding applications, Hamming distance refers to the number of bit locations in which two binary codewords differ. For example, the binary words 10100011 and 10001011 differ in two positions (the third and the fifth from the left), so the Hamming distance between these words is 2. See also Euclidean Distance, Channel Coding, Viterbi Algorithm.

Hamming Window: See Windows.

Half Duplex: Pertaining to the capability to send and receive data on the same line, but not simultaneously. See also Full Duplex, Simplex.

Hand Coding: When writing programs for DSP processors, 'C' cross compilers are often available. Although algorithm development with cross compilers is faster than when using assembly language, the machine code produced is usually less efficient and compact than would be achieved by writing in assembler. Cleaning up this less efficient assembly code is sometimes referred to as hand-coding. Coding directly in machine code is also referred to as hand-coding. See also Assembly Language, Cross-Compiler, Machine Code.

Handshaking: A communication technique whereby one system acknowledges receipt of data from another system by sending a handshaking signal.

Harmonic: Given a signal with a fundamental frequency of M Hz, harmonics of this signal are at integer multiples of M, i.e. at 2M, 3M, 4M, and so on. See also Fundamental Frequency, Music, Sub-harmonic, Total Harmonic Distortion.

[Figure: the frequency domain representation of a tone at M Hz (the fundamental frequency) with associated harmonics at 2M, 3M and 4M Hz.]

harris Window: See Windows.
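The Hamming Distance entry above amounts to an XOR followed by a count of the set bits, e.g. in Python (a sketch; the function name is our own):

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the binary words a and b differ."""
    return bin(a ^ b).count("1")   # XOR leaves a 1 in each differing position

# The entry's example: 10100011 and 10001011 differ in the third and fifth
# positions from the left, so the distance is 2.
d = hamming_distance(0b10100011, 0b10001011)
```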
Hartley Transform: The Hartley transform is "similar" in computational structure (although different in properties) to the Fourier transform. One key difference is that the Hartley transform uses real numbers rather than complex numbers. A good overview of the mathematics and application of the Hartley transform can be found in [121].

Harvard Architecture: A type of microprocessor (and microcomputer) architecture where the memory used to store the program and the memory used to store the data are separate, therefore allowing both program and data to be accessed simultaneously. Some DSPs are described as being a modified Harvard architecture where both program and data memories are separate, but with cross-over links. See also DSP Processor.

Head Shadow: Due to the shape of the human head, incident sounds can be diffracted before reaching the ears. Hence the actual waveform arriving at the ears is different from what would have been received by an ear without the head present. Head shadow is an important consideration in the design of virtual sound systems and in the design of some types of advanced DSP hearing aids. See also Diffraction.

Hearing: The mechanism and process by which mammals perceive changes in acoustic pressure waves, or sound. See also Audiology, Audiometry, Ear, Psychoacoustics, Threshold of Hearing.

Hearing Aids: A hearing aid can be described as any device which aids the wearer by improving the audibility of speech and other sounds. The simplest form of hearing aid is an acoustic amplification device (such as an ear trumpet), and the most complex is probably a cochlear implant system (surgically inserted) which electrically stimulates nerves using acoustically derived signals received from a body worn radio transmitter and microphone. More commonly, hearing aids are recognizable as analogue electronic amplification devices consisting of a microphone and amplifier connected to an acoustic transducer usually just inside the ear.
However a hearing aid which simply makes sounds louder is not all that is necessary to allow hearing impaired individuals to hear better. In everyday life we are exposed to a very wide range of sounds coming from all directions, with varying intensities and various degrees of reverberation. Clearly hearing aids are required to be very versatile instruments that are carefully designed around known parameters and functions of the ear, providing compensation techniques that are suitable for the particular type of hearing loss in particular acoustic environments.

Simple analogue electronic hearing aids can typically provide functions of volume and tone control. More advanced devices may incorporate multi-band control (i.e., simple frequency shaping) and automatic gain control amplifiers to adjust the amplification when loud noises are present. Hearing aids offering multi-band compression with a plethora of digitally adjustable parameters, such as attack and release times, have become more popular. Acoustic feedback reduction techniques have also been employed to allow more amplification to be provided before the microphone/transducer loop goes unstable due to feedback (this instability is often detected as an unsatisfied hearing aid wearer with a screeching howl in their ear). Acoustic noise reduction aids that exploit the processing power of advanced DSP processors have also been designed.

Digital audio signal processing based hearing aids may have advantages over traditional analogue audio hearing aids. They provide a greater accuracy and flexibility in the choice of electroacoustic parameters and can be easily interfaced to a computer based audiometer. More importantly, they can use powerful adaptive signal processing techniques for enhancing speech intelligibility and reducing the effects of background noise and reverberation. Currently however, power and physical size constraints are limiting the availability of DSP hearing aids.
See also Audiology, Audiometry, Beamforming, Ear, Head Shadow, Hearing Impairment, Threshold of Hearing.

Hearing Impairment: A reduction in the ability to perceive sound, as compared to the average capability of a cross section of unimpaired young persons. Hearing impairment can be caused by exposure to high sound pressure levels (SPL), can be drug induced or virus induced, or can arise simply as a result of having lived a long time. A hearing loss can be simply quantified by an audiogram, and qualified with more exact audiological language such as sensorineural loss or conductive loss, etc. [4], [30]. See also Audiology, Audiometry, Conductive Hearing Loss, Ear, Hearing, Loudness Recruitment, Sensorineural Hearing Loss, Sound Pressure Level, Threshold of Hearing.

Hearing Level (HL): When the hearing of a person is to be tested, the simplest method is to play pure tones through headphones (using a calibrated audiometer) over a range of frequencies, and determine the minimum sound pressure level (SPL) at which the person can hear the tone. The results could then be plotted as minimum perceived SPL versus frequency. To ascertain if the person has a hearing impairment, the plot can be compared with the average minimum level of SPL for a cross section of healthy young people with no known hearing impairments. However if the minimum level of SPL (the threshold of hearing) is plotted as SPL versus frequency, the curve obtained is not a straight line and comparison can be awkward. Therefore for Hearing Level (dB) plots (or audiograms), the deviation from the average threshold of hearing of young people is plotted, with hearing loss indicated by a positive measurement that is plotted lower on the audiogram. The threshold of hearing is therefore the 0 dB line on the Hearing Level (dB) scale.
The equivalent dB (HL) and dB (SPL) for some key audiometric frequencies in the UK are [157]:

Frequency (Hz)    250    500   1000   2000   4000   8000
dB (HL)             0      0      0      0      0      0
dB (SPL)           26   15.6    8.2    5.2      7     20

See also Audiogram, Audiometry, Equal Loudness Contours, Frequency Range of Hearing, Hearing Impairment, Loudness Recruitment, Sensation Level, Sound Pressure Level, Threshold of Hearing.

Hearing Loss: See Hearing Impairment.

Hermitian: See Matrix Properties - Hermitian Transpose.

Hermitian Transpose: See Matrix Properties - Hermitian Transpose.

Hertz (Hz): The unit of frequency measurement, named after Heinrich Hertz. 1 Hz is 1 cycle per second.

Hexadecimal, Hex: Base 16. Conversion from binary to hex is very straightforward and therefore hex digits have become the standard way of representing binary quantities to programmers. A 16 bit binary number can be easily represented in 4 hex digits by grouping four bits together starting from the binary point and converting each group to the corresponding hex digit. The hex digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. Hexadecimal entries in DSP assembly language programs are prefixed by either $ or 0x to differentiate them from decimal entries. An example (with base indicated as subscript):

0010 1010 0011 1111₂ = 2A3F₁₆ = (2 x 16³) + (10 x 16²) + (3 x 16¹) + (15 x 16⁰) = 10815₁₀

High Pass Filter: A filter which passes only the portions of a signal that have frequencies above a specified cut-off frequency. Frequencies below the cut-off frequency are highly attenuated. See also Digital Filter, Low Pass Filter, Bandpass Filter, Filters.

[Figure: a high pass filter G(f), showing the magnitude response |G(f)| passing the bandwidth above the cut-off frequency.]

Higher Order Statistics: Most stochastic DSP techniques, such as the power spectrum, least mean squares algorithm and so on, are based on first and second order statistical measures such as mean, variance and autocorrelation.
The higher order moments, such as the 3rd order moment (note that the first order moment is the mean, and the second order central moment is the variance), are usually not considered. However there is information to be gathered from a consideration of these higher order statistics. One example is detecting the baud rate of PSK signals. Recently there has been considerable interest in higher order statistics within the DSP community. For more information refer to the tutorial article [117]. See also Mean, Variance.

Hilbert Transform: Simply described, a Hilbert transform introduces a phase shift of 90 degrees at all frequencies for a given signal. A Hilbert transform can be implemented by an all-pass phase shift network. Mathematically, the Hilbert transform of a signal x(t) can be computed by linear filtering (i.e., convolution) with a special function:

$$x_h(t) \equiv x(t) \otimes \frac{1}{\pi t} \qquad (216)$$

It may be more helpful to think about the Hilbert transform as a filtered version of a signal rather than a "transform" of a signal. The Hilbert transform is useful in constructing single sideband signals (thus conserving bandwidth in communications examples). The transform is also useful in signal analysis by allowing real bandpass signals (such as a radio signal) to be analyzed and simulated as an equivalent complex baseband (or lowpass) process. Virtually all system simulation packages exploit this equivalent representation to allow for timely completion of system simulations. Not obvious from the definition above is the fact that the Hilbert transform of the Hilbert transform of x(t) is -x(t). This may be expected from the heuristic description of the Hilbert transform as a 90 degree phase shift -- i.e., two 90 degree phase shifts are a 180 degree phase shift, which means multiplying by minus one.

Host: Most DSP boards can be hosted by a general purpose computer, such as an IBM compatible PC.
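The two Hilbert transform properties described above - the 90 degree phase shift and the sign inversion when the transform is applied twice - can be verified with a frequency domain implementation that multiplies X(f) by -j·sgn(f). A sketch (our own illustration, using NumPy; the exact results below rely on the test tone being zero mean with a whole number of cycles in the window):

```python
import numpy as np

def hilbert_transform(x):
    """Hilbert transform via the DFT: scale each frequency component by
    -j*sgn(f), i.e. shift every positive-frequency component by -90 degrees."""
    X = np.fft.fft(np.asarray(x, dtype=float))
    sgn = np.sign(np.fft.fftfreq(len(x)))
    return np.fft.ifft(-1j * sgn * X).real

N = 1024
t = np.arange(N) / N
x = np.cos(2.0 * np.pi * 8.0 * t)   # zero-mean tone, 8 whole cycles
xh = hilbert_transform(x)           # the 90 degree shifted version: sin(...)
xhh = hilbert_transform(xh)         # applying the transform twice gives -x
```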
The host allows a DSP designer to develop code using the PC, and then download the DSP program to the DSP board. The DSP board therefore has a host interface. The host usually supplies power (analog, 12V and digital, 5V) to the board. See also DSP Board.

Householder Transformation: See Matrix Decompositions - Householder Transformation.

Huffman Coding: This type of coding exploits the fact that the discrete amplitudes of a quantized signal may not occur with equal probability. Variable length codewords can therefore be assigned to a particular data sequence according to their frequency of occurrence. Data that occur frequently are assigned shorter codewords, hence data compression is possible.

Hydrophone: An underwater transducer of acoustic energy for sonar applications.

Hyperchief: A Macintosh program developed by a DSP graduate student from 1986 - 1991, somewhere on the west coast of the USA, to simulate the wisdom of a Ph.D. supervisor. However, while accurately simulating the wisdom of a Ph.D. supervisor, Hyperchief precisely illustrated the pitfalls of easy access to powerful computers. Hyperchief is sometimes spelled as Hypercheif (pronounced Hi-per-chife).

Hyperparaboloid: Consider the equation:

$$e = \mathbf{x}^T \mathbf{R} \mathbf{x} + 2\mathbf{p}^T \mathbf{x} + s \qquad (217)$$

where x is an n × 1 vector, R is a positive definite n × n matrix, p is an n × 1 vector, and s is a scalar. The equation is quadratic in x. If n = 1, then e will form a simple parabola, and if n = 2, e can be represented as a (solid) paraboloid:

[Figure: e as a parabola in x for n = 1, and as a paraboloid in (x1, x2) for n = 2.]

The positive definiteness of R ensures that the parabola is up-facing. Note that in both cases e has exactly one minimum point (a global minimum) at the bottom of the parabolic shape. For systems with n ≥ 3, e cannot be shown diagrammatically, as four or more dimensions are required! Hence we are asked to imagine the existence of a hyperparaboloid for n ≥ 3, which will also have exactly one minimum point for e.
The existence of the hyperparaboloid is much referred to for 191 least squares, and least mean squares algorithm derivations. See also Global Minimum, Local Minima. Hypersignal: An IBM PC based program for DSP written by Hyperception Inc. Hypersignal provides facilities for real time data acquisition in conjunction with various DSP processors, and a menu driven system to perform off-line processing of real-time FFTs, digital filtering, signal acquisition, signal generation, power spectra and so on. DOS and Windows versions are available. HyTime: HyTime (Hypermedia/Time-Based Structuring Language) is a standardised infrastructure for the representation of integrated, open hypermedia documents produced by the International Organization for Standards (ISO), Joint Technical Committee, Sub Committee (SC) 18, Working Group (WG) 8 (ISO JTC1/SC18/WG8). See also Bento, Multimedia and Hypermedia Information Coding Experts Group, Standards. 192 DSPedia 193 I i: ”i” (along with “k” and “n”) is often used as a discrete time index for in DSP notation. See Discrete Time. I: Often used to denoted the identity matrix. See Matrix. I-Series Recommendations: The I-series telecommunication recommendations from the International Telecommunication (ITU), advisory committee on telecommunications (denoted ITUT and formerly known as CCITT) provide standards for Integrated Services Digital Networks. Some of the current recommendations (http://www.itu.ch) include: I.112 I.113 I.114 I.120 I.121 I.122 I.140 I.141 I.150 I.200 I.210 I.211 I.220 I.221 I.230 I.231 I.231.9 I.231.10 I.232 I.232.3 I.233 I.233.1-2 I.241.7 I.250 I.251.1-9 I.252.2-5 I.253.1-2 I.254.2 I.255.1 I.255.3-5 I.256 I.257.1 I.258.2 I.310 I.311 I.312 I.320 I.321 I.324 Vocabulary of terms for ISDNs. Vocabulary of terms for broadband aspects of ISDN. Vocabulary of terms for universal personal telecommunication. Integrated services digital networks (ISDNs). Broadband aspects of ISDN. Framework for frame mode bearer services. 
Attribute technique for the characterization of telecommunication services supported by an ISDN and network capabilities of an ISDN. ISDN network charging capabilities attributes. B-ISDN asynchronous transfer mode functional characteristics. Guidance to the I.200-series of Recommendations. Principles of telecommunication services supported by an ISDN and the means to describe them. B-ISDN service aspects. Common dynamic description of basic telecommunication services. Common specific characteristics of services. Definition of bearer service categories. Circuit-mode bearer service categories. Circuit mode 64 kbit/s 8 kHz structured multi-use bearer service category. Circuit-mode multiple-rate unrestricted 8 kHz structured bearer service category. Packet-mode bearer services categories. User signalling bearer service category (USBS). Frame mode bearer services. ISDN frame relaying bearer service/ ISDN frame switching bearer service. Telephony 7 kHz teleservice. Definition of supplementary services. Direct-dialling-in/ Multiple subscriber number/ Calling line identification presentation/ Calling line identification restriction/ Connected Line Identification Presentation (COLP)/ Connected Line Identification Restriction (COLR)/ Malicious call identification/ Sub-addressing supplementary service. Call forwarding busy/ Call forwarding no reply/ Call forwarding unconditional/ Call deflection. Call waiting (CW) supplementary service/ Call hold. Three-party supplementary service. Closed user group. Multi-level precedence and preemption service (MLPP)/ Priority service/ Outgoing call barring. Advice of charge User-to-user signalling. In-call modification (IM). ISDN Network functional principles. B-ISDN general network aspects. (See also Q.1201.) Principles of intelligent network architecture. ISDN protocol reference model. B-ISDN protocol reference model and its application. ISDN network architecture. 
I.325 Reference configurations for ISDN connection types.
I.327 B-ISDN functional architecture.
I.328 Intelligent Network - Service plane architecture.
I.329 Intelligent Network - Global functional plane architecture.
I.330 ISDN numbering and addressing principles.
I.331 Numbering plan for the ISDN era.
I.333 Terminal selection in ISDN.
I.334 Principles relating ISDN numbers/subaddresses to the OSI reference model network layer addresses.
I.350 General aspects of quality of service and network performance in digital networks, including ISDNs.
I.351 Relationships among ISDN performance recommendations.
I.352 Network performance objectives for connection processing delays in an ISDN.
I.353 Reference events for defining ISDN performance parameters.
I.354 Network performance objectives for packet mode communication in an ISDN.
I.355 ISDN 64 kbit/s connection type availability performance.
I.356 B-ISDN ATM layer cell transfer performance.
I.361 B-ISDN ATM layer specification.
I.362 B-ISDN ATM Adaptation Layer (AAL) functional description.
I.363 B-ISDN ATM adaptation layer (AAL) specification.
I.364 Support of broadband connectionless data service on B-ISDN.
I.365.1 Frame relaying service specific convergence sublayer (FR-SSCS).
I.370 Congestion management for the ISDN frame relaying bearer service.
I.371 Traffic control and congestion control in B-ISDN.
I.372 Frame relaying bearer service network-to-network interface requirements.
I.373 Network capabilities to support Universal Personal Telecommunication (UPT).
I.374 Framework Recommendation on "Network capabilities to support multimedia services".
I.376 ISDN network capabilities for the support of the teleaction service.
I.410 General aspects and principles relating to Recommendations on ISDN user-network interfaces.
I.411 ISDN user-network interfaces - reference configurations.
I.412 ISDN user-network interfaces - Interface structures and access capabilities.
I.413 B-ISDN user-network interface.
I.414 Overview of Recommendations on layer 1 for ISDN and B-ISDN customer accesses.
I.420 Basic user-network interface.
I.421 Primary rate user-network interface.
I.430 Basic user-network interface - Layer 1 specification.
I.431 Primary rate user-network interface - Layer 1 specification.
I.432 B-ISDN user-network interface - Physical layer specification.
I.460 Multiplexing, rate adaption and support of existing interfaces.
I.464 Multiplexing, rate adaption and support of existing interfaces for restricted 64 kbit/s transfer capability.
I.470 Relationship of terminal functions to ISDN.
I.500 General structure of the ISDN interworking Recommendations.
I.501 Service interworking.
I.510 Definitions and general principles for ISDN interworking.
I.511 ISDN-to-ISDN layer 1 internetwork interface.
I.515 Parameter exchange for ISDN interworking.
I.520 General arrangements for network interworking between ISDNs.
I.525 Interworking between ISDN and networks which operate at bit rates of less than 64 kbit/s.
I.530 Network interworking between an ISDN and a public switched telephone network (PSTN).
I.555 Frame relaying bearer service interworking.
I.570 Public/private ISDN interworking.
I.580 General arrangements for interworking between B-ISDN and 64 kbit/s based ISDN.
I.601 General maintenance principles of ISDN subscriber access and subscriber installation.
I.610 B-ISDN operation and maintenance principles and functions.

For additional detail consult the appropriate standard document or contact the ITU. See also International Telecommunication Union, ITU-T Recommendations, Standards.

Ideal Filter: The ideal filter for a DSP application is one which will give absolute discrimination between passband and stopband. The impulse response of an ideal filter is always non-causal, and therefore impossible to build. See also Brick Wall Filter, Digital Filter.
A brick wall filter cutting off at 4000Hz is the ideal anti-alias filter for a DSP application with fs = 8000Hz. All frequencies below 4000Hz are passed perfectly with no amplitude or phase distortion, and all frequencies above 4000Hz are removed. In practice the ideal filter cannot be achieved as it would be non-causal. In an FIR implementation, the more weights that are used, the closer the frequency response will be to the ideal.

Identity Matrix: See Matrix Structured - Identity.

IEEE 488 GPIB: Many DSP laboratory instruments such as data loggers and digital oscilloscopes are equipped with a GPIB (General Purpose Interface Bus). Note that this bus is also referred to as HPIB by Hewlett-Packard, developers of the original bus on which the standard is based. Different devices can then communicate through cables of maximum length 20 metres using an 8-bit parallel protocol with a maximum data transfer rate of 2 Mbytes/sec.

IEEE Standard 754: The IEEE Standard for binary floating point arithmetic specifies basic and extended floating-point number formats, and add, subtract, multiply, divide, remainder and square root operations. It also provides magnitude compare operations, conversion from/to integer and floating-point formats and conversions between different floating-point formats and decimal strings. Finally the standard also specifies floating-point exceptions and their handling, including non-numbers and divide by zero. The Motorola DSP96000 is an IEEE 754 compliant floating point processor. Devices such as the Texas Instruments TMS320C30 use their own similar (but different!) floating point format. The IEEE Standard 754 has also been adopted by ANSI and is therefore often referred to as ANSI/IEEE Standard 754. See also Standards.

IEEE Standards: The IEEE publish standards in virtually every conceivable area of electronic and electrical engineering.
These standards are available from the IEEE and the titles, classifications and a brief synopsis can be browsed at http://stdsbbs.ieee.org. See also Standards.

Ill-Conditioned: See Matrix Properties - Ill-Conditioned.

Image Interchange Facility (IIF): The IIF has been produced by the International Organization for Standards (ISO), Joint Technical Committee (JTC) 1, sub-committee (SC) 24 (ISO/IEC JTC1/SC24) which is responsible for standards on "Computer graphics and image processing". The IIF standard is ISO 12087-3 and is the definition of a data format for exchanging image data of an arbitrary structure. The IIF format is designed to allow easy integration into international telecommunication services. See also International Organisation for Standards, JBIG, JPEG, Standards.

Imaginary Number: The imaginary number, denoted by j for electrical engineers (and by i in most other branches of science and mathematics), is the square root of -1. Using imaginary numbers allows the square root of any negative number to be expressed. For example, √-25 = 5j. See also Complex Numbers, Fourier Analysis, Euler's Formula.

Impulse: An impulse is a signal with very large magnitude which lasts only for a very short time. A mechanical impulse could be applied by striking an object with a hammer; a very large force for a very short time. A voltage impulse would be a very large voltage signal which only lasts for a few milli- or even microseconds. A digital impulse has magnitude of 1 for one sample, then zero at all other times, and is sometimes called the unit impulse or unit pulse. The mathematical notation for an impulse is usually δ(t) for an analog signal, and δ(n) for a digital impulse. For more details see Unit Impulse Function. See also Convolution, Elementary Signals, Fourier Transform Properties, Impulse Response, Sampling Property, Unit Impulse Function, Unit Step Function.
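The digital unit impulse described above is trivial to construct; a minimal sketch in plain Python (the function name is illustrative only, not part of any standard library):

```python
def unit_impulse(length, position=0):
    """Return a digital unit impulse: 1 at sample `position`, 0 elsewhere."""
    return [1 if n == position else 0 for n in range(length)]

delta = unit_impulse(5)
print(delta)  # [1, 0, 0, 0, 0]
```

Feeding such a sequence into a digital system and recording the output samples yields the system's impulse response, as discussed in the next entry.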
Impulse Response: When any system is excited by an impulse, the resulting output can be described as the impulse response (or the response of the system to an impulse). For example, striking a bell with a hammer gives rise to the familiar ringing sound of the bell which gradually decays away. This ringing can be thought of as the bell's impulse response, which is characterized by a slowly decaying signal at a fundamental frequency plus harmonics. The bell's physical structure supports certain modes of vibration and suppresses others. The impulsive input has energy at all frequencies -- the frequencies associated with the supported modes of vibration are sustained while all other frequencies are suppressed. These sustained vibrations give rise to the bell's ringing sound that we hear (after the extremely brief "chink" of the impulsive hammer blow). We can also realize the digital impulse response of a system by applying a unit impulse and observing the output samples that result. From the impulse response of any linear system we can calculate the output signal for any given input signal simply by calculating the convolution of the impulse response with the input signal. Taking the Fourier transform of the impulse response of a system gives the frequency response. See also Convolution, Elementary Signals, Fourier Transform Properties, Impulse, Sampling Property, Unit Impulse Function, Unit Step Function.

Incoherent: See Coherent.

Infinite Impulse Response (IIR) Filter: A digital filter which employs feedback to allow sharper frequency responses to be obtained for fewer filter coefficients. Unlike FIR filters, IIR filters can exhibit instability and must therefore be very carefully designed [10], [42]. The term infinite refers to the fact that the output from a unit pulse input will exhibit nonzero outputs for an arbitrarily long time.
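As a minimal sketch of this infinite response, consider a single-pole recursion y(k) = x(k) + b·y(k-1), with the feedback coefficient b = 0.9 chosen here purely for illustration; driven by a unit pulse the output decays geometrically but never reaches exactly zero:

```python
def one_pole_iir(x, b=0.9):
    """y(k) = x(k) + b*y(k-1): one feedback (recursive) coefficient."""
    y, prev = [], 0.0
    for sample in x:
        prev = sample + b * prev
        y.append(prev)
    return y

# Excite the filter with a unit pulse and observe the impulse response.
impulse = [1.0] + [0.0] * 9
response = one_pole_iir(impulse)
print([round(v, 4) for v in response[:4]])  # [1.0, 0.9, 0.81, 0.729]
```

Every output sample is nonzero (0.9 raised to the sample index), illustrating why the response is termed infinite; with |b| ≥ 1 the same recursion would be unstable.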
If the digital filter is IIR, then two weight vectors can be defined: one for the feedforward weights and one for the feedback weights:

yk = ∑(n=0 to 2) an·xk-n + ∑(n=1 to 3) bn·yk-n = a0·xk + a1·xk-1 + a2·xk-2 + b1·yk-1 + b2·yk-2 + b3·yk-3

⇒ yk = aT·xk + bT·yk-1

where a = [a0 a1 a2]T, xk = [xk xk-1 xk-2]T, b = [b1 b2 b3]T and yk-1 = [yk-1 yk-2 yk-3]T.

A signal flow graph and equation for a 3 zero, 4 pole infinite impulse response filter. The feedforward weights an implement the zeroes (non-recursive part) and the feedback weights bn implement the poles (recursive part).

See also Digital Filter, Finite Impulse Response Filter, Least Mean Squares IIR Algorithms.

Infinite Impulse Response (IIR) LMS: See Least Mean Squares IIR Algorithms.

Infinity (∞) Norm: See Matrix Properties - ∞ Norm.

Information Theory: The name given to the general study of the coding of information. In 1948 Claude E. Shannon presented a mathematical theory describing, among other things, the average amount of information, or the entropy of an information source. For example, a given alphabet is composed of N symbols (s1, s2, s3, s4,......., sN). Symbols from a source that generates random elements from this alphabet are encoded and transmitted via a communication line. The symbols are decoded at the other end. Shannon described a useful relationship between information and the probability distribution of the source symbols: if the probability of receiving a particular symbol is very high then it does not convey a great deal of information, and if low, then it does convey a high degree of information. In addition, his measure was logarithmically based. According to Shannon's measure, the self information conveyed by a single symbol that occurs with probability Pi is:

I(si) = log2(1/Pi)    (218)

The average amount of information, or first order entropy, of a source can then be expressed as:

H(s) = ∑(i=1 to N) Pi·log2(1/Pi)    (219)

Infrasonic: Of, or relating to, infrasound. See Infrasound.
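Shannon's self information and first order entropy measures from the Information Theory entry above can be computed directly; a minimal sketch:

```python
import math

def entropy(probabilities):
    """First order entropy H = sum(P_i * log2(1/P_i)), in bits per symbol."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

# Four equiprobable symbols each carry 2 bits of information on average.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
```

A source that always emits the same symbol (probability 1) has zero entropy: a certain outcome conveys no information.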
Infrasound: Acoustic signals (speed in air, 330 ms-1) having frequencies below 20Hz, the low frequency limit of human hearing, are known as infrasound. Although sounds as low as 3Hz have been shown to be aurally detectable, there is no perceptible reduction in pitch and the sounds will also be tactile. Infrasound is a topic close to the heart of a number of professional recording engineers who believe that it is vitally important to the overall sound of music. In general CDs and DATs can record down to around 5Hz. Exposure to very high levels of infrasound can be extremely dangerous, and certain frequencies can cause organs and other body parts to resonate:

Area of Body | Approximate Resonance Range (Hz)
Motion sickness | 0.3-0.6
Abdomen | 3-5
Spine/pelvis | 4-6
Testicle/Bladder | 10
Head/Shoulders | 20-30
Eyeball | 60-90
Jaw/Skull | 120-200

Infrasound has been considered as a weapon for the military and also as a means of crowd control, whereby the bladder is irritated. See also Sound, Ultrasound.

Inner Product: See Vector Operations - Inner Product.

In-Phase: See Quadrature.

Instability: A system or algorithm goes unstable when feedback (either physical or mathematical) causes the system output to oscillate uncontrollably. For example if a microphone is connected to an amplifier and then to a loudspeaker, and the microphone is brought close to the speaker, then the familiar feedback howl occurs; this is instability. Similarly in a DSP algorithm mathematical feedback in the equations being implemented (recursion) may cause instability. Therefore to ensure a system is stable, feedback must be carefully controlled.

Institute of Electrical Engineers (IEE): The IEE is a UK based professional body representing electronic and electrical engineers. The IEE publish a number of signal processing related publications each month, and also organize DSP related colloquia and conferences.

Institute of Electrical and Electronic Engineers, Inc.
(IEEE): The IEEE is a USA based professional body covering every aspect of electronic and electrical engineering. The IEEE publishes a very large number of journals each month, which include a number of notable signal processing journals such as Transactions on Signal Processing, Transactions on Speech and Audio Processing, Transactions on Biomedical Engineering, Transactions on Image Processing and so on.

Integration (1): The simplest mathematical interpretation of integration is taking the area under a graph.

Integration (2): The generic term for the implementation of many transistors on a single substrate of silicon. The technology refers to the actual process used to produce the transistors: CMOS is the integration technology for MOSFET transistors; Bipolar is the integration technology for TTL. The number of transistors on a single device is often indicated by one of the acronyms SSI, MSI, LSI, VLSI, or ULSI.

Acronym | No. of Transistors | First Circuits | Example
SSI (Small Scale Integration) | < 10 | 1960s | NPN junction
MSI (Medium Scale Integration) | < 1000 | 1970s | 4 NAND gates
LSI (Large Scale Integration) | < 10000 | Early 1980s | 8086 microprocessor
VLSI (Very Large Scale Integration) | < 1000000 | Mid 1980s | DSP56000
ULSI (Ultra Large Scale Integration) | < 100000000 | 1990s | TMS320C80

Integrated Circuit (IC): The name given to a single silicon chip containing many transistors that collectively realize some system level component such as an A/D converter or microprocessor.

Integrated Services Digital Network (ISDN): See I-Series Recommendations.

Integrator: A device which performs the function of computing the integral as an output for an arbitrary input signal. In digital signal processing terms an integrator is quite straightforward. Consider the simple mathematical definition of integration, which is the area under a graph. The output of an integrator, y(t), will be the cumulative area under the input signal curve, x(t).
For sampled digital signals the input will be constant for one sampling period, and therefore to approximately integrate the signal we can simply add the areas of the sampling rectangles together. If the sampling period is normalized to one, then a signal can be integrated in the discrete domain by adding together the input samples. An integrator is implemented using a digital delay element and a summing element which calculates the function:

y(n) = x(n) + y(n - 1)    (220)

In the z-domain the transfer function of a discrete integrator is:

Y(z) = X(z) + z⁻¹Y(z)  ⇒  Y(z)/X(z) = 1/(1 - z⁻¹) = z/(z - 1)    (221)

When viewed in the frequency domain an integrator has the characteristics of a simple low pass filter. See also Differentiator, Low Pass Filter.

Analog integration, discrete integration, and the time domain and z-domain signal flow graph representations of a discrete integrator.

Intensity: See Sound Intensity.

Interchannel Phase Deviation: The difference in timing between the left and right channel sampling times of a stereo ADC or DAC.

Interleaving: In channel coding interleaving is used to enhance the performance of a coder over a channel that is prone to error bursts. The basic idea behind interleaving is to spread a block of coded bits over a large number of dispersed channel symbols to allow the correction of just a few errors in each block in spite of the fact that many consecutive channel symbols are corrupted. Interleaving is best illustrated by an example.
The interleaving is accomplished by placing symbols from each block into a separate column of an array and then transmitting the symbols sequentially from the rows. For this block coding example, interleaving places symbols from separate blocks of a single error correcting code next to each other. In this way, when a burst error of 3 consecutive symbols occurs, all 3 symbols can be corrected because they come from separately coded blocks. See also Channel Coding, Cross-Interleaved Reed-Solomon Coding.

International Electrotechnical Commission (IEC): The IEC was founded in 1906 with the object of promoting "international co-operation on all questions of standardization and related matters in the fields of electrical and electronic engineering and thus to promote international understanding." The IEC is composed of a number of committees made up from members from the main industrial countries of the world. The IEC publishes a wide variety of international standards and technical reports. The IEC works with other international organizations, particularly with the International Organization for Standards (ISO), and also with the European Committee for Electrotechnical Standardization (CENELEC). Standards resulting from cooperation are often prefixed with the letters JTC - Joint Technical Committee. Some of the JTC standards relevant to DSP are discussed under International Organization for Standards. More information on the IEC can be found at the WWW site http://133.82.181.177/ikeda/IEC/.
See also International Organization for Standards (ISO), International Telecommunication Union, Standards.

International Mobile (Maritime) Satellite Organization (Inmarsat): Inmarsat provides mobile satellite communications world-wide for the maritime community. This satellite communication system supports services such as telephone, telex, facsimile, e-mail and data connections. Inmarsat's compact land mobile telephones (an essential tool for workers in remote parts of the world) can fit inside a briefcase and provide an excellent means of worldwide emergency communications. The various communication modes of Inmarsat rely on powerful DSP systems and the use of various coding standards.

International Organisation for Standards (ISO): ISO is not in fact an acronym for the International Organisation for Standards; that would be IOS. "ISO" is a word derived from the Greek word isos, meaning "equal", as in words like isotropic or isosceles. However it is quite commonplace for ISO to be assumed to be an acronym for International Standards Organisation, which it is not! But, on average, only one out of two authors would care. ISO is an autonomous organization established in 1947 to promote the development of standardization worldwide. ISO standards essentially contain technical criteria and other detail to ensure that the specification, design, manufacture and use of materials, products, processes and services are fit for their purpose. One common example of standardization in everyday life is the woodscrew, which should be produced to common ISO standards defining thread size, width, length etc. Another example is credit cards, which should all be produced according to ISO standard widths, heights and lengths. Standards on coding of audio and video are of particular relevance to DSP.
ISO is made up of various committees, sub-committees (SC) and working groups which oversee the definition of new standards, and ensure that current standards maintain their relevance. Some of the work most relevant to DSP is actually performed by joint technical committees (JTC) with other standards organisations such as the International Electrotechnical Commission (IEC). The ISO/IEC JTC 1 is on information technology and has the scope of standardization within established and emerging areas of information technology. Some of the key subcommittees that have been set up include:

SC 1: Vocabulary
SC 2: Coded character sets
SC 6: Telecommunications and information exchange between systems
SC 7: Software engineering
SC 11: Flexible magnetic media for digital data interchange
SC 14: Data element principles
SC 15: Volume and file structure
SC 17: Identification cards and related devices
SC 18: Document processing and related communication
SC 21: Open systems interconnection, data management and open distributed processing
SC 22: Programming languages, their environments and system software interfaces
SC 23: Optical disk cartridges for information interchange
SC 24: Computer graphics and image processing
SC 25: Interconnection of information technology equipment
SC 26: Microprocessor systems
SC 27: IT Security techniques
SC 28: Office equipment
SC 29: Coding of audio, picture, multimedia and hypermedia information
SC 30: Open electronic data interchange

Of most relevance to DSP is the work of SC6, 24 and 29. SC29 is currently of particular interest and is responsible for standards on "Coding of Audio, Picture, Multimedia and Hypermedia Information".
SC29 is further subdivided into working groups (WG) which have already defined various standards:

WG 1: Coding of Still Pictures
ISO/IEC 11 544: JBIG (Progressive Bi-level Compression)
ISO/IEC 10 918: JPEG (Continuous-tone Still Image)
Part 1: Requirements and Guidelines
Part 2: Compliance Testing
Part 3: Extensions

WG 11: Coding of Moving Pictures and Associated Audio
ISO/IEC 11 172: MPEG-1 (Moving Picture Coding up to 1.5 Mbit/s)
Part 1: Systems
Part 2: Video
Part 3: Audio
Part 4: Compliance Testing (CD)
Part 5: Technical Report on Software for ISO/IEC 11 172
ISO/IEC 13 818: MPEG-2 (Generic Moving Picture Coding)
Part 1: Systems (CD)
Part 2: Video (CD)
Part 3: Audio (CD)
Part 4: Compliance Testing
Part 5: Technical Report on Software for ISO/IEC 13 818
Part 6: Systems Extensions
Part 7: Audio Extensions

There is also work on MPEG-4 (Very-low Bitrate Audio-Visual Coding).

WG 12: Coding of Multimedia and Hypermedia Information
ISO/IEC 13 522: MHEG (Coding of Multimedia and Hypermedia Information)
Part 1: Base Notation (ASN.1) (CD)
Part 2: Alternate Notation (SGML) (WD)
Part 3: MHEG Extensions for Scripting: Language Support

More information on the ISO and ISO JTC standards can be found in the relevant ISO publications which are summarized on http://www.iso.ch. See also International Electrotechnical Commission (IEC), International Telecommunication Union (ITU), Standards.

International Standards Organization: See International Organisation for Standards.

International Telecommunication Union (ITU): The ITU is an agency of the United Nations who operate a world-wide organization from which governments and private industry from various countries coordinate the definition, implementation and operation of telecommunication networks and services. The responsibilities of the ITU extend to regulation, standardization, coordination and development of international telecommunications.
They also have a general responsibility to ensure the integration of the differing policies and systems in various countries. The headquarters of the ITU is currently International Telecommunication Union, Place des Nations, CH-1211 Geneva 20, Switzerland. They can be contacted on the world wide web at address http://www.itu.ch. The recommendations and various standards of the ITU are divided into two key areas resulting from the output of two advisory committees on: (1) Telecommunication, denoted as ITU-T recommendations (formerly known as CCITT); and (2) Radiocommunications, denoted as ITU-R recommendations (formerly known as CCIR). See also International Organisation for Standards, ITU-R Recommendations, ITU-T Recommendations, Multimedia Standards, Standards.

Internet: The name given to the worldwide connection of computers, each having a unique identifying internet number. The internet currently allows interchange of electronic mail and general computer files containing anything from text to images and audio. Useful tools for navigating the internet and exploring information available from other users on machines other than your own include ftp (file transfer protocol), Gopher, Netscape, Mosaic, Lynx [169], etc.

Interpolation: Interpolation is the creation of intermediate discrete values between two samples of a signal. For example, if 3 intermediate and equally spaced samples are created, then the sampling rate has increased by a factor of 4. Interpolation is usually accomplished by first up-sampling to insert zeroes between existing samples, and then filtering with a low pass digital filter.
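The zero insertion (upsampling) stage just described is simple to sketch in plain Python; the low pass digital filtering that completes the interpolation is assumed to follow as a separate stage:

```python
def upsample(x, factor):
    """Insert factor-1 zeros after each input sample. A low pass filter
    applied afterwards removes the spectral images, completing the
    interpolation."""
    y = []
    for sample in x:
        y.append(sample)
        y.extend([0] * (factor - 1))
    return y

print(upsample([1, 2, 3], 4))  # [1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0]
```

Note the output sampling rate is 4 times the input rate, matching the factor-of-4 example in the text.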
Interpolation of a 4× oversampled signal by upsampling by 4 (zero insertion) and low pass digital filtering.

The interpolation process is essentially a technique whereby the reconstruction filtering is being done partly in the analog domain and partly in the digital domain. Note that the digital oversampled baseband signal will be delayed by the group delay, td, of the digital low pass filter (which is usually linear phase). Other types of curve fitting interpolators can also be produced, although these are less common. Interpolators are widely found in digital audio systems such as CD players, where oversampling filters (typically 4×) are used to increase the sampling rate in order to allow a simpler reconstruction filter to be used at the output of the digital to analog converter (DAC). See also Upsampling, Decimation, Downsampling, First Order Hold, Fractional Sampling Rate Conversion, Zero Order Hold.

Interrupt: Inside a DSP processor an interrupt will temporarily halt the processor and force it to perform an interrupt routine. For example an interrupt may happen every 1/fs seconds in order that a DSP processor executes the interrupt service routine, whereby it reads the value from an A/D converter at a rate of fs samples every second.

Inverse, Matrix: See Matrix Operations - Inverse.

Inverse System Identification: Using adaptive filtering techniques, the approximate inverse of an unknown filter, plant or data channel can be identified. In an adaptive signal processing inverse system identification architecture, when the error, ε(k), has adapted to a minimum value (ideally zero) then this means that in some sense y(k) ≈ s(k), where s(k) is the input to the unknown channel. Therefore the transfer function of the adaptive filter is now an approximate inverse of the unknown system. Inverse system identification is widely used for equalizing data transmission channels.
See also Adaptive Filtering, Adaptive Line Enhancer, Equalisation, Least Mean Squares Algorithm, System Identification.

Generic Adaptive Signal Processing Inverse System Identification Architecture.

Inversion Lemma: See Matrix Properties - Inversion Lemma.

ITU-R Recommendations: The International Telecommunication Union (ITU) have produced a very comprehensive set of regulatory, standardizing and coordination documents for radiocommunication systems. The ITU-Radiocommunications (ITU-R) advisory committee are responsible for the generation, upkeep and amendment of the ITU-R recommendations. These recommendations are classified into various subgroups or series identified by the letters:

BO Broadcasting satellite service (sound and television);
BR Sound and television recording;
BS Broadcasting service (sound);
BT Broadcasting service (television);
F Fixed service;
IS Inter-service sharing and compatibility;
M Mobile, radiodetermination, amateur and related satellite services;
PI Propagation in ionized media;
PN Propagation in non-ionized media;
RA Radioastronomy;
S Fixed satellite service;
SA Space applications;
SF Frequency sharing between the fixed satellite service and the fixed service;
SM Spectrum management techniques;
SNG Satellite news gathering;
TF Time signals and frequency standards emissions;
V Vocabulary and related subjects.

In addition to the ITU-R (radiocommunication) recommendations, there are also the ITU-T (telecommunication) recommendations. See also International Organization for Standards, International Telecommunication Union, ITU-T Recommendations, Standards.

ITU-T Recommendations: The International Telecommunication Union (ITU) have produced a very comprehensive set of regulatory, standardizing and coordination documents for telecommunication systems.
The ITU-Telecommunications (ITU-T) advisory committee are responsible for the generation, upkeep and amendment of the ITU-T recommendations. These standards, definitions and recommendations are classified into various subgroups or series identified by a letter:

A Organization of the work of the ITU-T;
B Means of expression (definitions, symbols, classification);
C General telecommunication statistics;
D General tariff principles;
E Overall network operation (numbering, routing, network management, etc.);
F Services other than telephone (ops, quality, service definitions and human factors);
G Transmission systems and media, digital systems and networks;
H Line transmission of non-telephone signals;
I Integrated Services Digital Networks;
J Transmission of sound programmes and television signals;
K Protection against interference;
L Construction, installation and protection of cable and other elements of outside plant;
M Maintenance: international systems, telephone, telegraphy, fax & leased circuits;
N Maintenance: international sound programme and television transmission circuits;
O Specifications of measuring equipment;
P Telephone transmission quality, telephone installations, local line networks;
Q Switching and Signalling;
R Telegraph transmission;
S Telegraph services terminal equipment;
T Terminal characteristics protocols for telematic services, document architecture;
U Telegraph switching;
V Data communication over the telephone network;
X Data networks and open system communication;
Z Programming languages.

These recommendations were formerly known as CCITT (the former name of the ITU) regulations, and are available from the ITU (usually for a price) in published book form (20 volumes and 61 fascicles), or electronic form (http://www.itu.ch). The book form is also sometimes referred to as the "blue book".
The work of the committee is clearly outlined in the A-series recommendations:

A.1   Presentation of contributions relative to the study of questions assigned to the ITU-T;
A.10  Terms and definitions;
A.12  Collaboration with the International Electrotechnical Commission (IEC) on the subject of definitions for telecommunications;
A.13  Collaboration with the IEC on graphical symbols and diagrams used in telecommunications;
A.14  Production, maintenance and publication of ITU-T terminology;
A.15  Elaboration and presentation of texts for Recommendations of the ITU Telecommunication Standardization Sector;
A.20  Collaboration with other international organizations over data transmission;
A.21  Collaboration with other international organizations on ITU-T defined telematic services;
A.22  Collaboration with other international organizations on information technology;
A.23  Collaboration with other international organizations on information technology, telematic services and data transmission;
A.30  Major degradation or disruption of service.

From a DSP algorithm and implementation perspective, the G-series specifies a variety of algorithms for audio digital signal coding and compression, the H-series specifies video compression techniques, and the V-series specifies modem data communications strategies including echo cancellation, equalisation and data compression. In addition to the ITU-T (telecommunication) recommendations, there are also the ITU-R (radiocommunication) recommendations. See also G-Series Recommendations, H-Series Recommendations, International Organization for Standards, International Telecommunication Union, ITU-R Recommendations, MPEG, Standards, V-Series Recommendations.

i860: Intel's powerful RISC processor which has been used in many DSP applications.

J

j: The electrical engineering representation of √-1, the imaginary number that mathematicians denote as "i". However, electrical engineers use "i" to denote current.
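The same convention carries into many programming languages; Python, for instance, follows the engineering style and writes the imaginary unit as "j". A small illustration:

```python
# Python adopts the engineering convention: the imaginary unit is
# written "j", and must be attached to a numeric literal (hence "1j").
z = 3 + 4j

print(z.real)     # real part, 3.0
print(z.imag)     # imaginary part, 4.0
print(abs(z))     # magnitude |z| = 5.0
print(1j * 1j)    # j squared is -1
```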
JND: Just Noticeable Difference. See Difference Limen.

J-Series Recommendations: The J-series telecommunication recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T and formerly known as CCITT) provide standards for transmission of sound programme and television signals. Some of the current recommendations (http://www.itu.ch) include:

J.11  Hypothetical reference circuits for sound-programme transmissions.
J.12  Types of sound-programme circuits established over the international telephone network.
J.13  Definitions for international sound-programme circuits.
J.14  Relative levels and impedances on an international sound-programme connection.
J.15  Lining-up and monitoring an international sound-programme connection.
J.16  Measurement of weighted noise in sound-programme circuits.
J.17  Pre-emphasis used on sound-programme circuits.
J.18  Crosstalk in sound-programme circuits set up on carrier systems.
J.19  A conventional test signal simulating sound-programme signals for measuring interference in other channels.
J.21  Performance characteristics of 15 kHz-type sound-programme circuits - circuits for high quality monophonic and stereophonic transmissions.
J.23  Performance characteristics of 7 kHz type (narrow bandwidth) sound-programme circuits.
J.31  Characteristics of equipment and lines used for setting up 15 kHz type sound-programme circuits.
J.33  Characteristics of equipment and lines used for setting up 6.4 kHz type sound-programme circuits.
J.34  Characteristics of equipment used for setting up 7 kHz type sound-programme circuits.
J.41  Characteristics of equipment for the coding of analogue high quality sound-programme signals for transmission on 384 kbit/s channels.
J.42  Characteristics of equipment for the coding of analogue medium quality sound-programme signals for transmission on 384 kbit/s channels.
J.43  Characteristics of equipment for the coding of analogue high quality sound-programme signals for transmission on 320 kbit/s channels.
J.44  Characteristics of equipment for the coding of analogue medium quality sound-programme signals for transmission on 320 kbit/s channels.
J.51  General principles and user requirements for the digital transmission of high quality sound programmes.
J.52  Digital transmission of high-quality sound-programme signals using one, two, or three 64 kbit/s channels per mono signal (and up to six per stereo signal).
J.61  Transmission performance of television circuits designed for use in international connections.
J.62  Single value of the signal-to-noise ratio for all television systems.
J.63  Insertion of test signals in the field-blanking interval of monochrome and colour television signals.
J.64  Definitions of parameters for simplified automatic measurement of television insertion test signals.
J.65  Standard test signal for conventional loading of a television channel.
J.66  Transmission of one sound programme associated with analogue television signal by means of time division multiplex in the line synchronizing pulse.
J.67  Test signals and measurement techniques for transmission circuits carrying MAC/packet signals or HD-MAC signals.
J.73  Use of a 12 MHz system for the simultaneous transmission of telephony and television.
J.74  Methods for measuring the transmission characteristics of translating equipments.
J.75  Interconnection of systems for television transmission on coaxial pairs and on radio-relay links.
J.77  Characteristics of the television signals transmitted over 18 MHz and 60 MHz systems.
J.80  Transmission of component-coded digital television signals for contribution-quality applications at bit rates near 140 Mbit/s.
J.81  Transmission of component-coded television signals for contribution-quality applications at the third hierarchical level of ITU-T Recommendation G.702.
J.91  Technical methods for ensuring privacy in long-distance international television transmission.
For additional detail consult the appropriate standard document or contact the ITU. See also International Telecommunication Union, ITU-T Recommendations, Standards.

Joint Bi-level Image Group (JBIG): JBIG is the name for a lossless compression algorithm for binary (one bit/pixel) images which results from the International Organization for Standards (ISO) sub-committee (SC) 29, which is responsible for standards on "Coding of Audio, Picture, Multimedia and Hypermedia Information". Working Group (WG) 1 of SC29 (ISO/IEC JTC1/SC29/WG1) considered the problem of coding of still binary images and produced a joint standard with the International Electrotechnical Commission (IEC): ISO/IEC 11544 - JBIG (Progressive Bi-level Compression). JBIG is intended to replace the current (and less effective) Group 3 and 4 fax algorithms which are primarily used for document text transmission (i.e., fax). JBIG achieves compression by modelling the redundancy in the image as the correlation of the pixel currently being coded with a set of nearby pixels, and coding with arithmetic coding techniques. See also JPEG, MPEG Standards, Standards.

Joint Photographic Experts Group (JPEG): JPEG is the general name for a lossy compression algorithm for continuous tone still images. JPEG is the original name of the committee who drafted the standard for the International Organization for Standards (ISO) sub-committee (SC) 29, which is responsible for standards on "Coding of Audio, Picture, Multimedia and Hypermedia Information". Working Group (WG) 1 (ISO/IEC JTC1/SC29/WG1) considered the problem of coding of continuous tone still images and produced the JPEG joint standard with the International Electrotechnical Commission (IEC): ISO/IEC 10918 - JPEG (Continuous Tone Still Image). JPEG is designed for compressing full 24 bit colour or gray-scale digital images of "natural" (real-world) scenes (as opposed to, for example, complex geometrical patterns).
JPEG does not cater for motion picture compression (see MPEG) or for black and white image compression (see JBIG), and it does not cope well with the sharp edges formed at black-white boundaries. The primary compression scheme in JPEG consists of a two dimensional discrete cosine transform (DCT) of image blocks, a coefficient quantizer, and a zig-zag scan of the quantized DCT coefficients (which typically produces long runs of zeros) that is subsequently run-length encoded using a Huffman code designed from a set of training image zig-zag scan fields [39]. JPEG is a lossy algorithm; however, most of the compression is achieved by exploiting known limitations of the human eye, for example the fact that small colour details are not perceived by the eye and brain as well as small details of light and dark. The degree of information loss from JPEG compression can be varied by adjusting the values of certain compression parameters. Therefore file size can be traded off against image quality, the appropriate balance of course depending on the actual application. Extremely small files (thumbnails) can be produced using JPEG which are useful for icons or image indexing and archive purposes. The closely related ITU-T T-series standards are T.80-T.83:

• T.80  Common components for image compression and communication; basic principles.
• T.81  Digital compression and encoding of continuous tone still images.
• T.82  Progressive compression techniques for bi-level images.
• T.83  Compliance testing.

Additional information is available from the independent JPEG group at [email protected]. JPEG software and file specifications are available from a number of FTP sites, including ftp://ftp.uu.net:/graphics/jpeg. See also JBIG, MPEG, Standards, T-Series Recommendations.
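To make the transform-and-scan stage concrete, the following sketch (illustrative only: the test block and quantiser step are arbitrary, and real JPEG adds standardised quantisation tables and Huffman entropy coding on top of this) applies an orthonormal 8x8 2-D DCT, quantises the coefficients, and zig-zag scans them:

```python
import numpy as np

N = 8  # JPEG operates on 8x8 pixel blocks

# Orthonormal DCT-II matrix (row u = frequency, column n = sample).
n = np.arange(N)
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
C[0, :] /= np.sqrt(2.0)

def dct2(block):
    """Two dimensional DCT of a block (separable: rows, then columns)."""
    return C @ block @ C.T

def zigzag_indices(size):
    """(u, v) index pairs in JPEG zig-zag order, anti-diagonal by anti-diagonal."""
    return sorted(((u, v) for u in range(size) for v in range(size)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

block = np.outer(np.arange(N), np.ones(N)) * 16.0    # smooth "natural" test block
q = 10.0                                             # illustrative quantiser step
coeffs = np.round(dct2(block) / q)                   # transform and quantise
scan = [coeffs[u, v] for u, v in zigzag_indices(N)]  # zeros bunch into long runs
```

In a full coder the scan would then be run-length and Huffman encoded; a decoder reverses the steps, ending with the inverse DCT `C.T @ (coeffs * q) @ C`.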
Joint Stereo Coding: When compressing hi-fidelity stereo audio, higher levels of compression can be obtained by exploiting the commonalities between the audio on the left and right channels than would be gained by compressing the left and right channels independently. MPEG-Audio has a joint stereo coding facility. See Compression, Moving Picture Experts Group (MPEG) - Audio.

Just (Music) Scale: A few hundred years ago, prior to the existence of the equitempered or Western music scale, a (major) musical key was formed using certain carefully chosen frequency ratios between adjacent notes, rather than the constant tone and semitone ratios of the modern Western music scale. The C-major just scale would have had the following frequency ratios:

  Note:             C    D    E    F    G    A    B    C
  Frequency ratio:  1/1  9/8  5/4  4/3  3/2  5/3  15/8  2/1

The frequency ratio gives the ratio of the fundamental frequency of the current note to that of the root note. The above ratios correspond to the just music scale. Any note can be used to realise a just major key or scale. However, using the just scale it is difficult to form other major or minor keys without a complete retuning of the instrument, as all of the fundamental frequencies in other keys are different. Instruments that are tuned and played using the just scale will probably sound in some sense "medieval", as our modern appreciation of music is now firmly based on the equitempered Western music scale. See also Digital Audio, Music, Music Synthesis, Pythagorean Scale, Western Music Scale.

Just Noticeable Difference: See Difference Limen.

K

k: "k" (along with "i" and "n") is often used as a discrete time index in DSP notation. It is also often used as the frequency index in the DFT. See Discrete Time, Discrete Fourier Transform.

Karaoke DSP: For professionally recorded stereo music on CDs, DATs and so on, the vocal track, v(k), of a song is usually centered on the left and right channels, i.e.
the same signal appears in the left track L(k) and the right track R(k), and is perceived as coming from between the two loudspeakers if the listener is sitting equidistant from both. The musical instruments are likely to be laid out in some off-centre set up, which means that they are unlikely to be identical signals on both left and right channels, i.e.:

  Left  = L(k) = v(k) + M_L(k)
  Right = R(k) = v(k) + M_R(k)    (222)

By digitally subtracting the left and right channels:

  L(k) - R(k) = M_L(k) - M_R(k)    (223)

the vocal track may be somewhat attenuated, enabling the song to be played with the vocals de-emphasised by a few dBs, all ready for the bellowing tones of a Karaoke singer! See My Way by Frank Sinatra.

Knee: The knee is the part of a magnitude-frequency graph of a filter where the transition from passband to stopband is made. A soft knee is where the transition realises a filter with very low roll-off, and a harder knee approaches the ideal filter. See also Roll-off.

[Figure: 10log10 Vout/Vin (dB) plotted against log10 f, comparing a soft knee (roll-off of 20dB/decade, simple first order RC circuit) with a hard knee (roll-off of 80dB/decade, using a 4th order active filter).]

Khoros: Khoros is a block diagram simulator for image and video processing which runs on a variety of computer platforms such as Sun workstations.

Kronecker Impulse, or Kronecker Delta Function: See Unit Impulse Function.

Kronecker Product: See Matrix Operations - Kronecker Product.

L

LA (Linear Arithmetic) Synthesis: A technique for synthesis of the sound of musical instruments [32]. See also Music, Music Synthesis.

LabView: A software package from National Instruments Inc. which allows powerful PC based DSP instrumentation front-ends to be designed. LabView also convincingly presents the Virtual Instrument concept. See also Virtual Instrument.
Laplace: A mathematical transform used for the analysis of analog systems.

Laplacian: A probability distribution that is often used to model the differences between adjacent pixels in an image.

Lateralization: Lateralization refers to a psychoacoustic task in which a sound is determined to be at some point within the head, either near one ear or the other along a line separating the two ears. Very much like localization, lateralization differs in that the sound source is perceived within the head rather than outside of the head. The common experience of listening to stereophonic music via headphones (lateralization) versus listening to the same music via loudspeakers in a normal room (localization) emphasizes the difference between the two tasks. See also Localization.

Law of First Wavefront: In a reverberant environment the sound energy received by the direct path can be very much lower than the energy received by indirect reflective paths. However the human ear is still able to localize the sound location correctly by localizing the first components of the signal to arrive. Later echoes arriving at the ear increase the perceived loudness of the sound as they will have the same general spectrum. This psychoacoustic effect is sometimes known as the law of the first wavefront or the Haas effect, and more commonly the precedence effect. The precedence effect applies mainly to short duration sounds or those of a discontinuous or varying form. See also Ear, Lateralization, Source Localization, Threshold of Hearing.

LDU: See Matrix Decompositions - LDU Decomposition.

Leaky LMS: See Least Mean Squares Algorithm Variants.

Least Mean Squares (LMS) Algorithm: The LMS is an algorithm that is very widely used in adaptive signal processing applications such as system identification, inverse system identification, noise cancellation and prediction. The LMS algorithm is very simple to implement in real time and in the mean will adapt to a neighborhood of the Wiener-Hopf least mean square solution. The LMS algorithm can be summarised as follows:

  y(k) = sum_{n=0}^{N-1} w_n(k) x(k-n) = w^T(k) x(k)

  where  x(k) = [x(k), x(k-1), x(k-2), ..., x(k-N+2), x(k-N+1)]^T
         w(k) = [w_0(k), w_1(k), w_2(k), ..., w_{N-2}(k), w_{N-1}(k)]^T

  e(k) = d(k) - y(k) = d(k) - w^T(k) x(k)

  w(k+1) = w(k) + 2µ e(k) x(k)

[Figure: the generic adaptive filtering architecture, in which the input x(k) drives an adaptive FIR filter w(k) producing y(k), which is subtracted from the desired signal d(k) to form the error e(k) used by the LMS algorithm.] In the generic adaptive filtering architecture the aim can intuitively be described as adapting the impulse response of the FIR digital filter such that the input signal x(k) is filtered to produce y(k) which, when subtracted from the desired signal d(k), will minimise the error signal e(k). If the filter weights are updated using the LMS weight update then the adaptive FIR filter will adapt to the minimum mean squared error, assuming d(k) and x(k) to be wide sense stationary signals.

To derive the LMS algorithm, first consider plotting the mean squared error (MSE) performance surface (i.e. E{e²(k)} as a function of the weight values), which gives an (N+1)-dimensional hyperparaboloid with a single minimum. It is assumed that x(k) (the input data sequence) and d(k) (the desired signal) are wide sense stationary signals (see Wiener-Hopf Equations). For discussion and illustration purposes the three dimensional paraboloid for a two weight FIR filter can be drawn:

[Figure: the mean square error (MSE) performance surface E{e²(k)} for a two weight FIR filter, plotted against w_0 and w_1, showing the adaption trajectories for a large and a small step size µ. The Wiener-Hopf solution is denoted as w(opt), which is where the minimum MSE (MMSE) occurs.]
The gradient based LMS algorithm will (on average) adapt towards the MMSE by taking "jumps" in the direction of the negative of the gradient of the surface (therefore "downhill").

To find the minimum mean squared error (MMSE) we could use the Wiener-Hopf equation; however this is an expensive solution in computation terms. As an alternative we can use gradient based techniques, whereby we traverse down the inside of the paraboloid using an iterative algorithm which always updates the filter weights in the direction opposite to the steepest gradient. The iterative algorithm is often termed gradient descent and has the form:

  w(k+1) = w(k) + µ(-∇_k)    (224)

where ∇_k is the gradient of the performance surface:

  ∇_k = ∂E{e²(k)}/∂w(k) = 2Rw(k) - 2p    (225)

where R is the correlation matrix, p is the cross correlation vector (see Correlation Matrix and Cross Correlation Vector) and µ is the step size (used to control the speed of adaption and the achievable minimum or misadjustment). In the above figure a small step size "jumps" in small steps towards the minimum and is therefore slow to adapt; however the small jumps mean that it will arrive very close to the MMSE and continue to jump back and forth close to the minimum. For a large step size the jumps are larger and adaption towards the MMSE is faster; however when the weight vector reaches the bottom of the bowl it will jump back and forth around the MMSE with a larger magnitude than for the small step size. The error caused by this traversing of the bottom of the bowl is usually called the excess mean squared error (EMSE).

To calculate the MSE performance surface gradient directly is (like the Wiener-Hopf equation) very expensive as it requires that both R, the correlation matrix, and p, the cross correlation vector, are known (see Wiener-Hopf Equations). In addition, if we knew R and p, we could directly compute the optimum weight vector. But in general, we do not have access to R and p.
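In a simulation, where R and p can be chosen, the steepest descent iteration of Eq. 224 can be demonstrated directly. The following sketch (the values of R, p, µ and the iteration count are arbitrary illustrative choices) descends the MSE paraboloid to the Wiener-Hopf solution:

```python
import numpy as np

# Illustrative 2-weight example: R symmetric positive definite,
# so the MSE surface is a paraboloid with a single minimum.
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # correlation matrix (chosen arbitrarily)
p = np.array([1.0, 0.7])     # cross correlation vector (chosen arbitrarily)
w_opt = np.linalg.solve(R, p)  # Wiener-Hopf solution R^-1 p

mu = 0.1                     # step size, satisfying mu < 1/lambda_max
w = np.zeros(2)
for _ in range(500):
    grad = 2 * R @ w - 2 * p   # true gradient of the MSE surface (Eq. 225)
    w = w - mu * grad          # steepest descent update (Eq. 224)

print(np.allclose(w, w_opt))   # the iteration has reached w(opt)
```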
Therefore a subtle innovation, first defined for DSP by Widrow et al [152], was to replace the actual gradient with an instantaneous (noisy) gradient estimate. One approach to generating this noisy gradient estimate is to take the gradient of the actual squared error (rather than the mean squared error), i.e.

  ∇̂_k = ∂e²(k)/∂w(k) = 2e(k) ∂e(k)/∂w(k) = -2e(k) ∂y(k)/∂w(k) = -2e(k)x(k)    (226)

Using this estimated gradient, ∇̂_k, in the gradient descent equation yields the LMS algorithm:

  w(k+1) = w(k) + 2µe(k)x(k)    (227)

The LMS is very straightforward to implement and only requires N multiply-accumulates (MACs) to perform the FIR filtering, and N MACs to implement the LMS equation. A typical signal flow graph for the LMS is shown below:

[Figure: a simple signal flow graph for an adaptive FIR filter with weights w_0, w_1, ..., w_{N-1}, output y(k) subtracted from d(k) to form e(k), and LMS weight updates w(k+1) = w(k) + 2µe(k)x(k); the adaptive nature of the filter weights is explicitly illustrated.]

The LMS is very widely used in many applications such as telecommunications, noise control, control systems, biomedical DSP, and so on. Its properties have been very widely studied and a good overview can be found in [77], [53]. From a practical implementation point of view the algorithm designer must carefully choose the filter length to suit the application. In addition, the step size must be chosen to ensure stability and a good convergence rate. For the LMS, upper and lower bounds for the adaptive step size can be calculated as:

  0 < µ < 1/(N E{x²(k)}) ≅ 1/(N {Input Signal Power})    (228)

A more formal bound can be defined in terms of the eigenvalues of the input signal correlation matrix [53]. However for practical purposes these values are not calculated and the above practical bound is used (see Least Mean Squares Algorithm Convergence).
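The complete algorithm is compact enough to simulate in a few lines. The following sketch (the "unknown" filter coefficients, step size and signal length are arbitrary illustrative choices) runs the LMS in the system identification configuration, where the desired signal is produced by an unknown FIR system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown system to be identified (arbitrary illustrative coefficients).
h = np.array([0.5, -0.3, 0.2, 0.1])
N = len(h)                          # adaptive filter length matches h here

mu = 0.01                           # step size, well inside 1/(N * signal power)
w = np.zeros(N)                     # adaptive filter weights, w(0) = 0
x = rng.standard_normal(5000)       # white input with unit power

for k in range(N - 1, len(x)):
    xk = x[k - N + 1:k + 1][::-1]   # x(k) = [x(k), x(k-1), ..., x(k-N+1)]
    d = h @ xk                      # desired signal from the unknown system
    y = w @ xk                      # adaptive filter output, w^T(k) x(k)
    e = d - y                       # error e(k) = d(k) - y(k)
    w = w + 2 * mu * e * xk         # LMS weight update (Eq. 227)

print(np.round(w, 2))               # w has converged close to h
```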
In general the time taken to adapt is inversely proportional to the step size, and the excess MSE or steady state error is proportional to the step size. A simple example of a 20 weight FIR filter being used to identify an unknown filter (i.e., system identification) was simulated to produce the error plots below for two different step sizes of 0.001 and 0.01:

[Figure: amplitude of e(k) and 20log|e(k)| (dB) against time index k for the small step size. Adapting with a step size of µ = 0.001 the error signal e(k) adapts slowly; however the steady state error of about -35dB that is reached is about 10dB smaller than for the larger step size of µ = 0.01.]

[Figure: amplitude of e(k) and 20log|e(k)| (dB) against time index k for the large step size. Adapting with a step size of µ = 0.01 the error signal e(k) adapts quickly; however the steady state error of about -25dB that is reached is about 10dB larger than for the smaller step size of µ = 0.001.]

Clearly a trade-off exists; once again the responsibility of choosing this parameter is in the domain of the algorithm designer. See also Acoustic Echo Cancellation, Active Noise Control, Adaptive Line Enhancer, Adaptive Signal Processing, Adaptive Step Size, Correlation Matrix, Correlation Vector, Echo Cancellation, Least Mean Squares Algorithm Convergence, Least Mean Squares Algorithm Misadjustment/IIR Algorithms/Time Constant/Variants, Least Mean Squares Filtered-X Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares, Wiener-Hopf Equations, Volterra Filter.

Least Mean Squares (LMS) Algorithm Convergence: It can be shown that the (noisy) gradient estimate used in the LMS algorithm (see Least Mean Squares Algorithm) is an unbiased estimate of the true gradient:

  E{∇̂_k} = E[-2e(k)x(k)] = E[-2(d(k) - w^T(k)x(k))x(k)] = 2Rw(k) - 2p = ∇_k    (229)

where we have assumed that w(k) and x(k) are statistically independent.
It can be shown that in the mean the LMS will converge to the Wiener-Hopf solution if the step size, µ, is limited by the inverse of the largest eigenvalue of R. Taking the expectation of both sides of the LMS equation gives:

  E{w(k+1)} = E{w(k)} + 2µE[e(k)x(k)] = E{w(k)} + 2µ(E[d(k)x(k)] - E[x(k)x^T(k)w(k)])    (230)

and again assuming that w(k) and x(k) are statistically independent:

  E{w(k+1)} = E{w(k)} + 2µ(p - R E{w(k)}) = (I - 2µR)E{w(k)} + 2µR w_opt    (231)

where w_opt = R⁻¹p and I is the identity matrix. Now, defining v(k) = w(k) - w_opt, we can rewrite the above in the form:

  E{v(k+1)} = (I - 2µR)E{v(k)}    (232)

For convergence of the LMS to the Wiener-Hopf solution, we require that w(k) → w_opt as k → ∞, and therefore v(k) → 0 as k → ∞. If the eigenvalue decomposition of R is given by Q^T Λ Q, where Q^T Q = I and Λ is a diagonal matrix of eigenvalues, then writing the vector v(k) in terms of the linear transformation Q, such that E{v(k)} = Q^T E{u(k)}, and multiplying both sides of the above equation by Q, we realise the decoupled equations:

  E{u(k+1)} = (I - 2µΛ)E{u(k)}    (233)

and therefore:

  E{u(k)} = (I - 2µΛ)^k E{u(0)}    (234)

where (I - 2µΛ) is a diagonal matrix:

  (I - 2µΛ) = diag(1 - 2µλ_0, 1 - 2µλ_1, 1 - 2µλ_2, ..., 1 - 2µλ_{N-1})    (235)

For convergence of this equation to the zero vector, we require that

  (1 - 2µλ_n)^k → 0 as k → ∞, for all n = 0, 1, 2, ..., N-1    (236)

Therefore the step size, µ, must cater for the largest eigenvalue, λ_max = max(λ_0, λ_1, λ_2, ..., λ_{N-1}), such that |1 - 2µλ_max| < 1, and therefore:

  0 < µ < 1/λ_max    (237)

This bound is a necessary and sufficient condition for convergence of the algorithm in the mean. However, this bound is not convenient to calculate, and hence not particularly useful for practical purposes.
A more useful sufficient condition for bounding µ can be found using the linear algebraic result that:

  trace[R] = sum_{n=0}^{N-1} λ_n    (238)

i.e. the sum of the diagonal elements of the correlation matrix R is equal to the sum of the eigenvalues, so that the inequality:

  λ_max ≤ trace[R]    (239)

will hold. If the signal x(k) is wide sense stationary, then the diagonal elements of the correlation matrix R are all E{x²(k)}, which is a measure of the signal power. Hence:

  trace[R] = N E{x²(k)} = N<Signal Power>    (240)

and the well known LMS stability bound (sufficient condition) of:

  0 < µ < 1/(N E[x²(k)])    (241)

is the practical result. See also Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean Squares Algorithm Misadjustment, Least Mean Squares Algorithm Time Constant, Wiener-Hopf Equations.

Least Mean Squares (LMS) Algorithm Misadjustment: Misadjustment is a term used in adaptive signal processing to indicate how close the achieved mean squared error (MSE) is to the minimum mean square error (MMSE). It is defined as the ratio of the excess MSE to the minimum MSE, and therefore gives a measure of how well the filter can adapt. For the LMS:

  Misadjustment = (excess MSE)/MMSE ≈ µ trace[R] ≈ µN<Signal Power>    (242)

Therefore misadjustment from the MMSE solution is proportional to the LMS step size, the filter length, and the input signal power of x(k). See also Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean Squares Algorithm Convergence, Least Mean Squares Algorithm Time Constant, Wiener-Hopf Equations.

Least Mean Squares (LMS) Algorithm Time Constant: The speed of convergence to a steady state error (expressed as an exponential time constant) can be precisely defined in terms of the eigenvalues of the correlation matrix, R (see Least Mean Squares Algorithm Convergence).
A commonly used (if less accurate) measure is given by:

  τ_mse ≈ N/(4µ trace[R]) = 1/(4µ<Signal Power>)    (243)

Therefore the convergence time constant is inversely proportional to the signal power and to the step size. A large step size will adapt quickly but with a large MSE, whereas a small step size will adapt slowly but achieve a small MSE. The design trade-off in selecting µ is a requirement of the algorithm designer, and will, of course, depend on the particular application. See also Adaptive Signal Processing, Adaptive Step Size, Least Mean Squares Algorithm, Least Mean Squares Algorithm Convergence, Least Mean Squares Algorithm Misadjustment, Wiener-Hopf Equations.

Least Mean Squares (LMS) Algorithm Variants: A number of variants of the LMS exist. These variants can be split into three families: (1) algorithms derived to reduce the computation requirements compared to the standard LMS; (2) algorithms derived to improve the convergence properties over the standard LMS; (3) modifications of the LMS to allow a more efficient implementation. In order to reduce computational requirements, the sign-error, sign-data and sign-sign LMS algorithms circumvent multiplies and replace them with shifting operations (which are essentially power of two multiplies or divides). The relevance of the sign variants of the standard LMS, however, is now somewhat dated due to the low cost availability of modern DSP processors where a multiply can be performed in the same time as a bit shift (and faster than multiple bit shifts). The convergence speed and achievable mean squared error for all of the sign variants of the LMS are less desirable than for the standard LMS algorithm. To improve convergence speed and the stability properties, and to ensure a small excess mean squared error, the normalized, the leaky and the variable step size LMS algorithms have been developed.
A summary of some of the LMS variants:

• Delay LMS: The delay LMS simply delays the error signal in order that a "systolic" timed application specific circuit can be implemented:

  w(k+1) = w(k) + 2µe(k-n)x(k-n)    (244)

Note that the delay-LMS is in fact a special case of the more general filtered-X LMS.

• Filtered-X LMS: See Least Mean Squares Filtered-X Algorithm.

• Filtered-U LMS: See Active Noise Control.

• Infinite Impulse Response (IIR) LMS: See Least Mean Squares - IIR Filter Algorithms.

• Leaky LMS: A leakage factor, c, can be introduced to improve the numerical behaviour of the standard LMS:

  w(k+1) = cw(k) + 2µe(k)x(k)    (245)

By continually leaking the weight vector, w(k), even if the algorithm has found the minimum mean squared error solution it must continue adapting to compensate for the error introduced by the leakage factor. The advantage of the leakage is that the sensitivity to potentially destabilizing round off errors is reduced. In addition, in applications where the input occasionally becomes very small, the leaky LMS drives the weights toward zero (this can be an advantage in noise cancelling applications). The disadvantage of the leaky LMS is that the achievable mean squared error is not as good as for the standard LMS. Typically c has a value between 0.9 (very leaky) and 1 (no leakage).

• Multichannel LMS: See [68].

• Newton LMS: This algorithm improves the convergence properties of the standard LMS. There is quite a high computational overhead to calculate the matrix-vector product (and, possibly, the estimate of the inverse correlation matrix, R⁻¹) at each iteration:

  w(k+1) = w(k) + 2µR⁻¹e(k)x(k)    (246)

• Normalised Step Size LMS: The normalised LMS calculates an approximation of the input signal power at each iteration and uses this value to ensure that the step size is appropriate for rapid convergence. The normalized step size, µ_n, is therefore time varying. The normalised LMS is very useful in situations where the input signal power fluctuates rapidly and the input signal is a slowly varying non-stationary signal:
The normalised LMS is very useful in situations where the input signal power fluctuates rapidly and the input signal is slowly varying non-stationary: w ( k + 1 ) = w ( k ) + 2µ n e ( k )x ( k ), 1 µ n = --------------------------ε + x( k) 2 (247) where ε is a small constant to ensure that in conditions of a zero input signal, x ( k ) , a divide by zero does not occur. x ( k ) is 2-norm of the vector x ( k ) . • Sign Data/Regressor LMS: The sign data (or regressor) LMS was first developed to reduce the number of multiplications required by the LMS. The step size, µ, is carefully chosen to be a power of two and only bit shifting multiplies are required: w ( k + 1 ) = w ( k ) + 2µe ( k )sign [ x ( k ) ], 1, x ( k ) > 0 sign [ x ( k ) ] = 0, x ( k ) = 0 – , ( ) < 1 x k 0 (248) • Sign Error LMS: The sign error LMS was first developed to reduce the number of multiplications required by the LMS. The step size, µ, is carefully chosen to be a power of two and only bit shifting multiplies are required: w ( k + 1 ) = w ( k ) + 2µsign [ e ( k ) ]x ( k ), 1, e ( k ) > 0 sign [ e ( k ) ] = 0, e ( k ) = 0 – , ( ) < 1 e k 0 • Sign-SIgn LMS: The sign-sign error LMS was first presented in 1966 to reduce the number of multiplications required by the LMS. The step size, µ, is carefully chosen to be a power of two and only bit shifting multiplies are required: (249) DSPedia 224 w ( k + 1 ) = w ( k ) + 2µsign [ e ( k ) ]sign [ x ( k ) ], 1, z ( k ) > 0 sign [ z ( k ) ] = 0, z ( k ) = 0 – 1, z ( k ) < 0 (250) • Variable Step Size LMS: The variable step size LMS was developed in order that when the LMS algorithm first starts to adapt, the step size is large and convergence is fast. 
However, as the error reduces the step size is automatically decreased in magnitude so that smaller steps can be taken to ensure that a small excess mean squared error is achieved:

	w(k+1) = w(k) + 2µ_v e(k)x(k),   µ_v ∝ E{e²(k)}		(251)

Alternatively, variable step size algorithms can be set up with deterministic schedules for the modification of the step size. For example:

	w(k+1) = w(k) + 2µ_v e(k)x(k),   µ_v = µ·2^(−int(λk))		(252)

such that as time, k, passes, the step size, µ_v, gets smaller in magnitude. µ is the step size calculated for the standard LMS, λ is a positive constant, and int(λk) is the closest integer to λk.

Note that a hybrid of more than one of the above LMS algorithm variants could also be implemented. See also Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean Squares IIR Algorithms, Recursive Least Squares.

Least Mean Squares (LMS) Filtered-X Algorithm: In certain control applications the adaptive architecture has a transfer function at the output of the adaptive filter:

[Figure: The adaptive filter w(k) produces y(k) from the input x(k); y(k) passes through the transfer function G(f) to give z(k), which is subtracted from d(k) to produce the error e(k).] This adaptive filtering architecture has a known transfer function at the output of the adaptive filter which filters y(k) before subtraction from d(k) to produce the error. Compare this to the generic adaptive filtering described previously (see Adaptive Filtering). Note that the DAC and ADC at the input and output respectively of the transfer function G(f) are not shown, for diagrammatic clarity.

In deriving the standard LMS algorithm the gradient of the instantaneous squared error was calculated. Note, however, that in the above architecture the instantaneous error is given by:

	e(k) = d(k) − z(k) = d(k) − {y(k)*g(k)}		(253)

where g(k) is the perfectly sampled impulse response of the transfer function at the output of the adaptive filter, and the term {y(k)*g(k)} is the result of y(k) being convolved with g(k).
Therefore calculating the derivative of the instantaneous squared error produces:

	∇̂(k) = ∂e²(k)/∂w(k) = −2e(k)f(k)		(254)

where

	f(k) = [f(k), f(k−1), f(k−2), …, f(k−M+1)]ᵀ		(255)

and

	f(k) = {x(k)*g(k)}		(256)

Therefore this algorithm requires that the impulse response g(k) is known exactly in order to convolve with the input vector to create the f(k) vector. Clearly it is unlikely that g(k) will be known exactly; however an estimate, ĝ(k), can be found by a priori system identification. Therefore the filtered-X LMS algorithm is:

	w(k+1) = w(k) + 2µe(k)f(k),   f(k) = Σ_{n=0}^{M−1} ĝ(n)x(k−n)		(257)

where M is the number of filter weights used in the FIR filter estimate of g(k). Note that the number of weights in this estimate will influence the performance of the algorithm; too few weights may not adequately model the transfer function and could degrade performance. Therefore M must be carefully chosen by the algorithm designer. The filtered-X LMS can be summarised as:

[Figure: As above, the adaptive filter output y(k) passes through the transfer function g(t) to give z(k), with e(k) = d(k) − z(k); the input x(k) is also filtered by the estimate ĝ(k) to produce f(k), which is used in the weight update w(k+1) = w(k) + 2µe(k)f(k).] The filtered-X LMS prefilters the x(k) vector using an estimate, ĝ(k), of the impulse response of the transfer function g(t). The accuracy of this estimate will influence the performance of the algorithm. See also Active Noise Control, Adaptive Signal Processing, Adaptive Step Size, Inverse System Identification, Least Mean Squares (LMS) Algorithm.

Least Mean Squares (LMS) IIR Algorithms: Recently adaptive filtering algorithms based on IIR filters have been investigated for a number of applications. A good overview of adaptive IIR filters can be found in [36], [132].
The very simplest form of adaptive IIR LMS, sometimes referred to as Feintuch’s algorithm [71], can be represented as:

[Figure: Output error IIR LMS. The input x(k) feeds an FIR filter a(k), and the delayed output y(k−1) feeds an FIR filter b(k); their sum gives y(k), and e(k) = d(k) − y(k).]

	y(k) = Σ_{n=0}^{N−1} a_n(k)x(k−n) + Σ_{m=1}^{M−1} b_m(k)y(k−m) = aᵀ(k)x(k) + bᵀ(k)y(k−1)

	a(k+1) = a(k) + 2µe(k)x(k),   b(k+1) = b(k) + 2µe(k)y(k−1)

This is the simplest form of output error adaptive IIR LMS, where the filter poles and zeroes are updated by independent pole and zero weight updates. In addition to the normal step size stability concerns of adaptive filters, the adaptive IIR LMS filter can also become unstable if the poles of the filter migrate outside of the unit circle. Therefore extreme care is necessary when choosing the adaptive step size for both recursive and non-recursive weight updates. While this simple (some would say simple-minded) algorithm appears to be useless, it is surprisingly robust in a wide variety of applications. In order to address the problem of poles migrating outside of the unit circle, one suggestion has been the equation error adaptive IIR LMS filter, which is actually the updating of two independent FIR filters:

[Figure: Equation error IIR LMS. The input x(k) feeds an FIR filter a(k), and the desired signal d(k) feeds an FIR filter b(k); their sum gives y(k), and e(k) = d(k) − y(k).]

	y(k) = Σ_{n=0}^{N−1} a_n(k)x(k−n) + Σ_{m=1}^{M−1} b_m(k)d(k−m) = aᵀ(k)x(k) + bᵀ(k)d(k−1)

	a(k+1) = a(k) + 2µe(k)x(k),   b(k+1) = b(k) + 2µe(k)d(k−1)

This is the simplest form of equation error adaptive IIR LMS, where the filter poles and zeroes are updated by independent weight updates on two FIR filters. In conditions of high observation noise the equation error approach will give a biased (and very poor!) solution. See also Active Noise Control, Adaptive Signal Processing, Least Mean Squares Algorithm.

Least Significant Bit (LSB): The bit in a binary number with the least arithmetic numerical significance. See also Most Significant Bit, Sign Bit.
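As an illustration of bit significance and the negative MSB weighting used in 2’s complement notation, a short Python sketch (the function name here is illustrative, not from the text):

```python
def twos_complement_value(bits):
    """Interpret a list of bits (MSB first) as a 2's complement integer.
    The MSB carries a negative weight; all other bits are positive."""
    n = len(bits)
    value = -bits[0] * 2 ** (n - 1)           # MSB: negative weighting
    for i, b in enumerate(bits[1:], start=1):
        value += b * 2 ** (n - 1 - i)         # remaining bits: positive weightings
    return value

print(twos_complement_value([0, 1, 0, 1, 1, 0, 1, 1]))  # 64+16+8+2+1 = 91
print(twos_complement_value([1, 0, 0, 0, 0, 0, 0, 0]))  # MSB alone = -128
```

The LSB of an 8 bit word therefore contributes only ±1 to the value, while the MSB contributes −128.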
[Figure: An 8-bit 2’s complement word with bit weightings −128, 64, 32, 16, 8, 4, 2, 1 from MSB to LSB. In 2’s complement notation the MSB has a negative weighting. For example, 01011011₂ = 64 + 16 + 8 + 2 + 1 = 91₁₀.]

Least Squares: Given the overdetermined linear set of equations, Ax = b, where A is a known m × n matrix of rank n (with m > n), b is a known m element vector, and x is an unknown n element vector, then the least squares solution is given by:

	x_LS = (AᵀA)⁻¹Aᵀb		(258)

(Note that if the problem is underdetermined, m < n, then Eq. 258 is not the solution, and in fact there is no unique solution; a good (i.e., close) solution can often be found, however, using the pseudoinverse obtained via singular value decomposition.) The least squares solution can be derived as follows. Consider again the overdetermined linear set of equations:

	Ax = b, where A = [a11 a12 … a1n; a21 a22 … a2n; … ; am1 am2 … amn], x = [x1, x2, …, xn]ᵀ, b = [b1, b2, …, bm]ᵀ		(259)

If A is a nonsingular square matrix, i.e. m = n, then the solution can be calculated as:

	x = A⁻¹b		(260)

However, if m ≠ n then A is a rectangular matrix and therefore not invertible, and the above equation cannot be solved to give an exact solution for x. If m < n then the system is often referred to as underdetermined and an infinite number of solutions exist for x (as long as the m equations are consistent). If m > n then the system of equations is overdetermined and we can look for a solution by striving to make Ax as close as possible to b, by minimizing Ax − b in some sense. The most mathematically tractable way to do this is by the method of least squares, performed by minimizing the squared 2-norm, denoted by e:

	e = ||Ax − b||₂² = (Ax − b)ᵀ(Ax − b)		(261)

Plotting e against the n-dimensional vector x gives a hyperparabolic surface in (n+1) dimensions. If n = 1, x has only one element and the surface is a simple parabola. For example consider the case where A is a 2 × 1 matrix, then from Eq.
261:

	e = ([a1; a2]x − [b1; b2])ᵀ([a1; a2]x − [b1; b2]) = (a1² + a2²)x² − 2(a1b1 + a2b2)x + (b1² + b2²) = Px² − Qx + R		(262)

where P = a1² + a2², Q = 2(a1b1 + a2b2) and R = b1² + b2². Clearly the minimum point, e_min, on the surface lies at the bottom of the parabola, where the gradient is zero:

	de/dx = 2Px_LS − Q = 0  ⇒  x_LS = Q/(2P) = (a1b1 + a2b2)/(a1² + a2²) = (AᵀA)⁻¹Aᵀb

If n = 2, x = [x1 x2]ᵀ and the error surface is a paraboloid. This surface has one minimum point at the bottom of the paraboloid, where the gradient of the surface with respect to both the x1 and x2 axes is zero:

	de/dx = [de/dx1, de/dx2]ᵀ = [0, 0]ᵀ

If the x vector has three or more elements (n ≥ 3) the surface will be in four or more dimensions and cannot be shown diagrammatically. To find the minimum value of e for the general case of an n-element x vector, the “bottom” of the hyperparaboloid can be found by finding the point where the gradient in every dimension is zero (cf. the above 1- and 2-dimensional examples). Therefore differentiating e with respect to the vector x:

	de/dx = [de/dx1, de/dx2, de/dx3, …, de/dxn]ᵀ = 2Aᵀ(Ax − b)		(263)

and setting the gradient vector to the zero vector,

	de/dx = 0		(264)

to find the minimum point, e_min, on the surface gives the least squares error solution for x_LS:

	2Aᵀ(Ax_LS − b) = 0  ⇒  AᵀAx_LS − Aᵀb = 0  ⇒  x_LS = (AᵀA)⁻¹Aᵀb		(265)

If the rank of matrix A is less than n, then the inverse matrix (AᵀA)⁻¹ does not exist and the least squares solution cannot be found using Eq. 265; the pseudoinverse then requires to be calculated using singular value decomposition techniques.
Note that if A is an invertible square matrix, then the least squares solution simplifies to:

	x = (AᵀA)⁻¹Aᵀb = A⁻¹A⁻ᵀAᵀb = A⁻¹b		(266)

See also Matrix Decompositions - Singular Value Decompositions, Matrix Inversion, Minimum Residual, Normal Equations, Least Mean Squares, Least Squares Residual, Square System of Equations, Overdetermined System, Recursive Least Squares.

Least Squares Residual: The least squares error solution to the overdetermined system of equations, Ax = b, is given by:

	x_LS = (AᵀA)⁻¹Aᵀb		(267)

where A is a known m × n matrix of rank n and with m > n, b is a known m element vector, and x is an unknown n element vector. The least squares residual, given by:

	r_LS = b − Ax_LS		(268)

is a measure of the error obtained when using the method of least squares. The smaller the value of r_LS, the more accurately b can be generated from the columns of the matrix A. The magnitude or size of the least squares residual is calculated by finding the squared magnitude, or squared 2-norm, of r_LS:

	ρ_LS = ||Ax_LS − b||₂²		(269)

As an example, for a system with n = 2 the least squares residual ρ_LS can be shown as the height of the minimum point of the least squares error surface e = ||Ax − b||₂², located at (x1_LS, x2_LS). Note that if m = n, and A is a non-singular matrix, then ρ_LS = 0. See also Least Squares, Matrix, QR Algorithm, Recursive Least Squares.

Leq: See Equivalent Continuous Level.

Linear Algebra: Linear algebra is a long-established branch of mathematics that uses matrix based equations. The computer has spawned a rebirth of interest in linear algebra and changed what was thought to be an arcane, obsolete and strictly academic area into a ubiquitous, fundamental tool in virtually every applied, pure or social science field. Over the last few years the advent of fast DSP processors has led to the solution of many DSP problems using numerical linear algebra [15]. See also Matrix, Matrix Algorithms, Matrix Decompositions, Matrix Properties.
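The least squares solution and residual described above can be sketched numerically with NumPy; the matrix and vector values here are arbitrary illustrations:

```python
import numpy as np

# Overdetermined system: m = 3 equations, n = 2 unknowns, rank 2
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Least squares solution via the normal equations: x_LS = (A^T A)^-1 A^T b
x_ls = np.linalg.solve(A.T @ A, A.T @ b)

# The same answer from NumPy's numerically preferred solver
x_qr, *_ = np.linalg.lstsq(A, b, rcond=None)

# Least squares residual r_LS = b - A x_LS and its squared 2-norm
r_ls = b - A @ x_ls
rho_ls = r_ls @ r_ls

print(x_ls, rho_ls)
assert np.allclose(x_ls, x_qr)
```

Forming AᵀA explicitly squares the condition number of the problem, which is why library routines such as lstsq use an orthogonal (QR or SVD based) factorization internally.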
Linear Feedback Shift Register (LFSR): A simple shift register with feedback and combinational logic used for the generation of pseudo random binary noise. See Pseudo Random Binary Sequence.

Linear Phase Filter: See Finite Impulse Response Filter.

Linear Predictive Coding (LPC): Linear predictive coding is a compression algorithm for reducing the storage requirements of digitized speech. In LPC the vocal tract is modelled as an all-pole digital filter (IIR) and the calculated filter coefficients are used to code the speech down to rates of 2400 bits/sec from speech sampled at 8 kHz with 8 bit resolution.

Linear System: A system is said to be linear if the weighted sum of the system outputs given two distinct inputs equals the system output given a single input equal to the weighted sum of the two distinct inputs. In general, for a linear system y(n) = f(x(n)), if:

	y1(n) = f[x1(n)] and y2(n) = f[x2(n)]		(270)

then:

	a1·y1(n) + a2·y2(n) = f[a1·x1(n) + a2·x2(n)]		(271)

for all values of a1 and a2. For example consider the linear system:

	y(n) = 4.3x(n) + 6.01x(n−1)		(272)

If x1(n) = sin 100nt, then the output, which will be denoted as y1(n), is given by:

	y1(n) = 4.3 sin 100nt + 6.01 sin 100(n−1)t		(273)

For a different input x2(n) = sin 250nt, the output denoted as y2(n) is given by:

	y2(n) = 4.3 sin 250nt + 6.01 sin 250(n−1)t		(274)

Therefore, given that the system is linear, if x3(n) = sin 100nt + sin 250nt, then:

	y3(n) = 4.3(sin 100nt + sin 250nt) + 6.01(sin 100(n−1)t + sin 250(n−1)t) = y1(n) + y2(n)		(275)

In general, inputting a sine wave to a linear system will yield an output that is a sine wave at exactly the same frequency but with modified phase and magnitude. If any other frequencies are output (e.g., if the sine wave is distorted in any way) then the system is nonlinear.
(Note that this is not true for other waveforms; inputting a square wave to a linear system is unlikely to produce a square wave at the output. If the square wave is viewed as its sine wave components (from Fourier analysis) then the output of the linear system should only contain sine waves at those frequencies, but the modification of their amplitudes and phases means that their superposition no longer gives a square wave.) DSP systems such as digital filters (IIR and FIR) are linear filters. Any filter that has time varying weights, however, is non-linear. See also Distortion, Non-Linear System, Poles, Transfer Function, Frequency Response.

Linearity: Linearity is the property possessed by any system which is linear.

Linearly Dependent Vectors: See Vector Properties and Definitions - Linearly Dependent.

LLT: See Matrix Decompositions - Cholesky Decomposition.

Local Minima: The global minimum of a function is the smallest value taken on by that function. For example, for a function f(x) with its global minimum at x = x_g, any other minima, say at x1, x2 and x3, are termed local minima. When attempting to use least squares, or least mean squares based algorithms, to find the global minimum of a function, the point of zero gradient of the function is found. For a quadratic surface with only one minimum the method works very well. However, if the surface is not quadratic, then the solution obtained is not necessarily the global minimum, as the gradient is also zero at the local minima (and at local maxima and inflection points). See also Adaptive IIR Filters, Hyperparaboloid, Global Minima, Least Squares, Simulated Annealing.

Localization: When used in the context of acoustics, localization is the ability to perceive the direction from which sounds are coming. For animals the two ears provide excellent instruments of localization.
Localization problems are also found in radar and sonar systems where arrays of sensors are used to sense the direction from which signals are radiating. Generally, a minimum of two sensors is required to accurately localize a sound source. A current focus of research is in producing arrays of microphones using DSP algorithms to improve sound quality for applications such as hands-free telephony, hearing aids, and concert hall microphone pick-ups. Some applications require that a desired source be located before it can be extracted or filtered from the rest of the sound field. See also Audiology, Beamforming, Lateralization.

Logarithmic Amplitude: If the amplitude range of a signal or system is very large then it is often convenient to plot the magnitude on a logarithmic scale rather than a linear scale. The most common form of logarithmic magnitude uses the logarithmic decibel scale, which represents a ratio of two powers. See also Decibels (dB).

Logarithmic Frequency: When the frequency range of a signal or system is very large, it is often convenient to plot the frequency axis on a logarithmic rather than a linear scale. The human ear, for example, has a sensitivity range from around 70 Hz to 15000 Hz and is often described as having a logarithmic frequency response. Logarithmically spaced frequencies correspond to equally spaced distances on the basilar membrane within the cochlea. The perception of frequency change is such that a doubling of frequency from 200 Hz to 400 Hz is perceived as being the same change as a doubling of frequency from 2000 Hz to 4000 Hz, i.e., both sounds have increased by an octave. In DSP systems everything from digital filter frequency responses to spectrograms may be represented with a logarithmic frequency scale. See also Wavelet Transform. The most common logarithmic scales are decade (log10 f) and octave (log2 f), although clearly any logarithmic base can be used.
If the y-axis is also plotted on a logarithmic scale (such as dB), then the graph is log-log. See also Decibels, Roll-off.

[Figure: Graphs of the second order system 10log10(1/(1 + f²)) plotted over the range 1 to 100 Hz with linear, decade (log10 f) and octave (log2 f) frequency axes; the roll-off is 20 dB/decade, or equivalently 6 dB/octave.] The range of 1 to 100 Hz is the same width on all three graphs. Clearly using a logarithmic scale allows much greater frequency ranges to be represented than with a linear scale. More resolution is available at the lower frequencies (0 to 1 Hz), although at higher frequencies there is less resolution.

Lossless Compression: If a compression algorithm is lossless, then the signal information (or entropy) after the signal has been compressed and decompressed has not changed, i.e. all signal information has been retained. Hence, the uncompressed signal is identical to the original signal. Lossless compression for digital audio signals is not particularly successful and is likely to achieve at best a 2.5:1 compression ratio [61]. See also Compression, Lossy Compression.

Lossy Compression: If a compression algorithm is lossy, then the signal information (or entropy) after the signal has been compressed and decompressed is reduced, i.e. some signal information has been lost. However, if the lossy algorithm is carefully designed then the elements of the signal that are lost are not particularly important to the integrity of the signal.
For example, the precision adaptive subband coding (PASC) algorithm compresses a hi-fidelity digital audio signal by a factor of 4; however, the information that is “lost” would not have been perceived by the listener due to masking effects of the human ear. Alternatively, if very high levels of compression are being attempted then the lossy effects of the algorithm may be quite noticeable. See also Compression, Lossless Compression.

Loudness Recruitment: Defects in the auditory mechanism can lead to a hearing impairment whereby the dynamic range from the threshold of audibility to the threshold of discomfort is greatly reduced [30]. Loudness recruitment is the abnormally rapid growth in perceived loudness (versus intensity) in individuals with reduced dynamic range of audibility. The range of hearing is nominally 120 dB(SPL). However, in persons with hearing loss, the range may be as low as 40 dB. These individuals have a raised threshold of audibility, but after sounds exceed that threshold the perceived loudness grows rapidly until it reaches normal perceived loudness for sounds near the threshold of discomfort. This growth in perceived loudness is termed loudness recruitment. One common misconception is that individuals with loudness recruitment are more sensitive to changes in intensity (i.e., that they have smaller intensity JNDs or DLs). When tested, however, their JNDs for intensity are very near normal -- this indicates that they have fewer different perceptible difference limens (DLs) over the normal range of loudness than normal hearing individuals. See also Ear, Equal Loudness Contours, Hearing Aids, Threshold of Hearing.

Low Noise Components: All electronic components introduce certain levels of unwanted and potentially interfering noise. Low noise components introduce lower levels of noise than standard components, but the cost is usually higher.
Low Pass Filter: A filter which passes only the portions of a signal that have frequencies between DC (0 Hz) and a specified cut-off frequency. Frequencies above the cut-off frequency are highly attenuated. [Figure: A low pass filter G(f), with a magnitude response that passes the bandwidth from DC up to the cut-off frequency.] See also Bandpass Filter, Digital Filter, Filters, High Pass Filter.

Lower Triangular Matrix: See Matrix Structured - Lower Triangular.

LU: See Matrix Decompositions - LU Decomposition.

M

m-sequences: Shorthand term for a maximum length sequence. See Maximum Length Sequences, Pseudo-Random Binary Sequence.

Machine Code: The binary codes that are stored in memory and are fetched by the DSP processor to be executed on the chip and perform some useful function, such as multiplication of two numbers. Collectively machine code instructions form a meaningful program. Machine code is usually generated (by the assembler program) from source code written in the assembly language. This machine code can then be downloaded onto the DSP processor. Machine code has a one to one correspondence with assembly language. See also Assembly Language, Cross Compiler.

Main Lobe: In an antenna or sensor array processing system, main lobe refers to the primary lobe of sensitivity in the beampattern. For a filter or a data window, main lobe refers to the primary passband lobe of sensitivity. The narrower the main lobe, the more selective or sensitive a given system is said to be. Main lobes are best illustrated by an example. [Figure: A typical beampattern showing array gain as a function of angle, with the mainlobe and sidelobes marked on −15, −10, −5 and 0 dB contours.] See also Beamformer, Beampattern, Sidelobes, Windows.

Magnitude Response: See Fourier Series - Complex Exponential Representation.

Mammals: While not using digital signal processing capabilities, many mammals do of course use analog signals for communication and navigation.
Most obviously mammals (including humans) use acoustic signals for communication via, for example, speech (humans), barking (dogs), and so on. Elephants communicate with very low frequencies (around 100 Hz and well below -- even down to a few Hz), and can therefore communicate over very long distances via acoustic waves travelling in the ground. These ground-borne waves suffer less attenuation than airborne acoustic waves. It was this low frequency rumble communication that caused many early elephant watchers to believe that elephants had ESP (extra sensory perception) abilities. Light signals (from the electromagnetic family) are used by most animals for navigation and communication purposes. Another well known use of signal processing is by the bat, which uses sonar blips to avoid objects in its path during night flying. The magnetic field sensing ability of birds and bees is another well known, though not fully understood, use of signal processing for navigation. Some mammals (mainly antipodean), such as the platypus and the echidna, have electroreception abilities. See also Electroreception.

Marginally Stable: If a discrete system has poles on the unit circle then it can be described as marginally stable. See Dual Tone Multifrequency.

Masking: Masking refers to the process whereby one particular sound is close to inaudible in the presence of another louder signal. Masking is more precisely defined as spectral or temporal, although in audio and speech coding the term is usually used in reference to spectral masking. For spectral masking, a loud signal raises the threshold of hearing of signals of a lower level but with slightly higher or lower frequencies. This effectively leaves these other signals inaudible. For temporal masking, sounds that occur a short time before or after a louder sound are not perceived.
Simultaneous masking is also used in audiometry in order to minimize the perceivable conductance of test tones from the ear under test, by injecting noise into the ear not being tested. See also Audiometry, Spectral Masking, Temporal Masking, Threshold of Hearing.

Masking Pattern Adapted Universal Subband Integrated Coding and Multiplexing (MUSICAM): MUSICAM was developed jointly by CCETT (France), IRT (Germany) and Philips (the Netherlands), amongst others, originally for the application of digital audio broadcasting (DAB). MUSICAM is based on subband psychoacoustic compression techniques and has been incorporated into MPEG-1 in combination with the ASPEC compression system. See also Adaptive Spectral Perceptual Entropy Coding (ASPEC), Precision Adaptive Subband Coding (PASC), Psychoacoustics, Spectral Masking, Temporal Masking.

Matlab: A program produced by The MathWorks that allows high level simulation of matrix and DSP systems, with excellent post-processing graphics facilities for data presentation. Libraries containing virtually every DSP operation are widely available for Matlab.

Matrix: A matrix is a set of numbers stored in a 2-dimensional array, usually to represent data in an ordered structure. If ℜ denotes the set of real numbers, then the vector space of all m × n real matrices is denoted by ℜ^(m×n), and if A ∈ ℜ^(m×n) then:

	A = [a11 … a1n; … ; am1 … amn], with a_ij ∈ ℜ for 1 ≤ i ≤ m, 1 ≤ j ≤ n		(276)

where the symbol ∈ simply means “is an element of” -- so A is an m × n matrix. The ordering of the data values is important to the information being conveyed by the matrix. The dimensions of a matrix are specified as the number of rows by the number of columns (the rows running from left to right, and the columns from top to bottom). Matrices are usually denoted in an upper case boldface font, or an upper case font with an underscore, e.g. M or M. (Note that vectors are usually represented in a lower case boldface font, or a lower case font with an underscore, e.g. v or v.)
As an example, a particular 4 × 3 matrix, A, is:

	A = [4 9 2; 10 1 13; 3 4 5; 1 2 2], a 4 (row) by 3 (column) matrix		(277)

Clearly each element in the matrix can be denoted by a subscript which refers to its row and column position:

	A = [a11 a12 a13; a21 a22 a23; a31 a32 a33; a41 a42 a43]		(278)

In the example, a12 = 9, and a32 = 4. In DSP algorithms and analysis, matrices are extremely useful for compact and convenient mathematical representation of data and algorithms. For example the Wiener Hopf solution, and the Recursive Least Squares algorithm, are expressed using matrix equations. See also Matrix Algorithms, Matrix - Complex, Matrix Decompositions, Matrix Identities, Matrix Properties, Vector.

Matrix - Complex: Each element in an m × n complex matrix is a complex number. The complex vector space is often denoted as C^(m×n), where every element of that space is a complex number c_ij ∈ C. Scaling, addition, subtraction and multiplication of a complex matrix is performed in the same way as for real matrices, except that the arithmetic is complex. For example:

	Cd + a = [1+2j 2+j; 3 2−0.5j][1−3j; −2j] + [4−4j; 1+3j] = [9−5j; 2−13j] + [4−4j; 1+3j] = [13−9j; 3−10j]		(279)

Simple row-column transposition (i.e. the transpose operation) of complex matrices is not normally performed; instead the Hermitian transpose is used, where the matrix is transposed in the normal row-column style, but every element is complex conjugated. In DSP applications such as beamforming and digital communications, complex representation of information is often used for convenience. See also Matrix, Matrix Properties - Hermitian Transpose.

Matrix Algorithms: There are a number of well known matrix algorithms used in DSP for solving structured systems of equations. These algorithms are invariably used after a suitable decomposition has been performed on a matrix in order to produce a structured matrix/system of equations. See also Matrix, Matrix Decompositions, Matrix - Partitioning.
• Back Substitution: If an upper triangular system of linear equations:

	Ux = b, where U = [u11 u12 … u1n; 0 u22 … u2n; … ; 0 … 0 unn]		(280)

has to be solved for the unknown n element vector x, where U is an n × n non-singular upper triangular matrix, then the last element of the unknown vector, xn, can be calculated from multiplication of the last row of U with the vector x:

	unn·xn = bn  ⇒  xn = bn/unn		(281)

The second last element can therefore be calculated from multiplication of the second last row of U with vector x, and substitution of xn from Eq. 281:

	u(n−1,n−1)·x(n−1) + u(n−1,n)·xn = b(n−1)  ⇒  x(n−1) = (b(n−1) − u(n−1,n)·(bn/unn))/u(n−1,n−1)		(282)

In general it can be shown that all elements of x can be calculated recursively from:

	x_i = (b_i − Σ_{j=i+1}^{n} u_ij·x_j)/u_ii		(283)

This method of solving an upper triangular system of linear equations is called back substitution. Note that if the diagonal elements of U are very small relative to the off-diagonal elements, then the arithmetic required for the computation may require a large dynamic range. See also Matrix Decompositions - Cholesky/Forward Substitution/Gaussian Elimination/QR.
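The back substitution recursion of Eq. 283 can be sketched in a few lines of Python (the matrix values here are arbitrary illustrations):

```python
def back_substitution(U, b):
    """Solve U x = b, where U is an n x n non-singular upper triangular
    matrix (list of lists). Implements x_i = (b_i - sum_{j>i} u_ij x_j)/u_ii,
    working from the last row upwards."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

# 3 x 3 example constructed so that x = [1, 2, 3]
U = [[2.0, 1.0, 1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 4.0]]
b = [7.0, 12.0, 12.0]
print(back_substitution(U, b))  # [1.0, 2.0, 3.0]
```

Forward substitution for a lower triangular system is the mirror image, working from the first row downwards.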
• Forward Substitution: If a system of lower triangular linear equations:

	Lx = b, where L = [l11 0 … 0; l21 l22 … 0; … ; ln1 ln2 … lnn]		(284)

has to be solved for the unknown n element vector x, where L is an n × n non-singular lower triangular matrix, then the first element of the unknown vector, x1, can be calculated from multiplication of the first row of L with the vector x:

	l11·x1 = b1  ⇒  x1 = b1/l11		(285)

The second element can therefore be calculated from multiplication of the second row of L with vector x, and substitution of x1 from Eq. 285:

	l21·x1 + l22·x2 = b2  ⇒  x2 = (b2 − l21·(b1/l11))/l22		(286)

In general it can be shown that all elements of x can be calculated sequentially from:

	x_i = (b_i − Σ_{j=1}^{i−1} l_ij·x_j)/l_ii		(287)

This method of solving a lower triangular system of linear equations is called forward substitution. Note that if the diagonal elements of L are very small relative to the off-diagonal elements, then the arithmetic required for the computation may require a large dynamic range. See also Matrix Decompositions - Back Substitution/Cholesky/Gaussian Elimination/QR.

Matrix Decompositions: There are a number of methods which allow a matrix to be decomposed into structured matrices. The reason for performing a matrix decomposition is to either extract certain parameters from the matrix, or to provide a computationally cost effective and, ideally, numerically stable method of solving a set of linear equations. A number of decompositions often performed in DSP can be identified.

• Back Substitution: See Matrix Algorithms - Back Substitution.
• Cholesky: The Cholesky decomposition or factorization can be applied to an n × n non-singular symmetric positive definite matrix, A, such that:

	A = LLᵀ, where L = [l11 0 … 0; l21 l22 … 0; … ; ln1 ln2 … lnn] is lower triangular		(288)

If a system of equations, Ax = b, is to be solved for the unknown n element vector x, where A is an n × n symmetric matrix, and b a known n element vector, the solution can be found by Cholesky factoring matrix A, and performing a forward substitution followed by a back substitution:

	Ax = b  ⇒  LLᵀx = b  ⇒  Ly = b (solve by forward substitution), then Lᵀx = y (solve by back substitution)		(289)

The elements of the Cholesky matrix, L, are well bounded and in general Cholesky factorization is a numerically well behaved algorithm with fixed point arithmetic. The Cholesky factorization may also be written in the form of the LDLᵀ factorization, where L is now a unit lower triangular matrix, and D is a diagonal matrix. See also Matrix Decompositions - Back Substitution/Forward Substitution/Gaussian Elimination/LDU/LU/LDLT, Recursive Least Squares - Square Root Covariance.

• Complete Pivoting: See entry for Matrix Decompositions - Pivoting.

• Eigenanalysis: Eigenanalysis allows a square n × n matrix, A, to be broken down into components of an eigenvector and an eigenvalue which satisfy the condition:

	Ax = λx		(290)

where x is an n × 1 vector, referred to as a (right) eigenvector of A, and the scalar λ is an eigenvalue of A. In order to calculate the eigenvalues, Eq. 290 can be rearranged to give:

	(A − λI)x = 0		(291)

and if x is to be a non-zero vector, then the solution to Eq. 291 requires that the matrix (A − λI) is singular (i.e. has linearly dependent columns) and therefore the determinant is zero, i.e.
$$\det(A - \lambda I) = 0 \qquad (292)$$

This equation is often referred to as the characteristic equation of the matrix $A$, and can be expressed as a polynomial of order $n$, which in general has $n$ distinct roots. (If the characteristic polynomial does not have $n$ distinct roots, then the matrix $A$ is said to be degenerate.) Therefore we can note that there are $n$ instances of Eq. 290:

$$Ax_i = \lambda_i x_i \quad \text{for } i = 1 \text{ to } n \qquad (293)$$

Writing the eigenvalues as a diagonal matrix, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_n)$, and each vector $x_i$ as a column of an $n \times n$ matrix $X$:

$$A(x_1, x_2, x_3, \ldots, x_n) = AX = X\Lambda \qquad (294)$$

and therefore $X$ is a similarity transform matrix:

$$X^{-1}AX = \Lambda \qquad (295)$$

and matrices $A$ and $\Lambda$ are said to be similar. Note also that

$$\mathrm{trace}(A) = \mathrm{trace}(\Lambda) = \lambda_1 + \lambda_2 + \ldots + \lambda_n \qquad (296)$$

which is easily seen by noting that $\mathrm{trace}(BC) = \mathrm{trace}(CB)$, so that:

$$\mathrm{trace}(\Lambda) = \mathrm{trace}(X^{-1}(AX)) = \mathrm{trace}((AX)X^{-1}) = \mathrm{trace}(A) \qquad (297)$$

For the general eigenvalue problem, techniques such as the QL algorithm (not to be confused with the QR decomposition) are used to reduce the matrix $A$ to various structured intermediate forms before ultimately extracting eigenvalues and eigenvectors. Note that although the eigenvalues could be found by solving the polynomial in Eq. 292, this is in general not a good method either numerically or computationally. For DSP systems a particularly relevant problem is the symmetric eigenvalue problem, whereby a (symmetric) correlation matrix is to be decomposed. For a symmetric $n \times n$ matrix $R$,

$$Rq_i = \lambda_i q_i \quad \text{for } i = 1 \text{ to } n \qquad (298)$$

it is relatively straightforward to show for the symmetric case that the eigenvectors, $q_i$, will be orthogonal to each other, and Eq. 295 can be written in the form:

$$Q^TRQ = \Lambda \quad \text{or} \quad R = Q\Lambda Q^T \qquad (299)$$

where $Q^TQ = I$. Another useful property of the symmetric eigenanalysis problem is that the condition number of $R$ can be calculated as the eigenvalue spread:

$$\kappa(R) = \frac{\lambda_{max}}{\lambda_{min}} \qquad (300)$$

See also Matrix Decompositions - Singular Value, QL, QR Algorithm.
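The Cholesky factorization of Eq. 288 can be sketched as a short Python routine; `cholesky` is a hypothetical helper for illustration, assuming a symmetric positive definite input:

```python
import math

def cholesky(A):
    """Return the lower triangular L with A = L L^T (Eq. 288)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # inner product of the partial rows already computed
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)   # diagonal element
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]  # off-diagonal element
    return L

A = [[4.0, 2.0],
     [2.0, 10.0]]
L = cholesky(A)   # L = [[2.0, 0.0], [1.0, 3.0]], since L L^T = A
```

Solving $Ax = b$ then reduces to the forward and backward substitutions of Eq. 289, with the square roots confined to the $n$ diagonal elements.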
• Schur Form: A canonical form of a matrix that displays the eigenvalues, but not the eigenvectors, of the matrix.

• Eigenvalue: See Matrix Decompositions - Eigenanalysis.

• Eigenvector: See Matrix Decompositions - Eigenanalysis.

• Fast Given's Rotations: See Matrix Decompositions - Square Root Free Givens.

• Forward Substitution: See Matrix Algorithms - Forward Substitution.

• Gauss Transform: In general the Gauss transform, $G_k$, is an $n \times n$ unit lower triangular matrix used to zero the $n - k$ elements below the main diagonal in column $k$ of a non-singular $n \times n$ matrix $A$. Note that when $A$ is premultiplied by $G_k$, only elements in the rows being zeroed actually change. As an example of zeroing a matrix column below the main diagonal, the elements $a_{21}$ and $a_{31}$ can be "zeroed" by premultiplying a $3 \times 3$ matrix $A$ with a $3 \times 3$ Gauss transform matrix, $G_1$:

$$G_1A = \begin{bmatrix} 1 & 0 & 0 \\ g_{21} & 1 & 0 \\ g_{31} & 0 & 1 \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & (a_{12}g_{21} + a_{22}) & (a_{13}g_{21} + a_{23}) \\ 0 & (a_{12}g_{31} + a_{32}) & (a_{13}g_{31} + a_{33}) \end{bmatrix} \qquad (301)$$

where $g_{21} = -\dfrac{a_{21}}{a_{11}}$ and $g_{31} = -\dfrac{a_{31}}{a_{11}}$.

In general the Gauss transform matrix which will zero all elements below the diagonal in the $k$-th column of an $n \times n$ matrix $A$ is an identity matrix with the multipliers $g_{ik}$ placed below the diagonal in column $k$:

$$G_k = \begin{bmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & g_{k+1,k} & 1 & & \\ & & \vdots & & \ddots & \\ & & g_{nk} & & & 1 \end{bmatrix}, \qquad g_{ik} = -\frac{a_{ik}}{a_{kk}}, \quad i = k+1, \ldots, n \qquad (302)$$

Premultiplying $A$ by $G_k$ leaves rows 1 to $k$ unchanged, replaces the elements below the diagonal in column $k$ with zeros, and updates the remaining elements of rows $k+1$ to $n$.
The inverse of a Gauss transform matrix, $G_k^{-1}$, is simply calculated by negating the $g$ entries:

$$G_k^{-1} = \begin{bmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & -g_{k+1,k} & 1 & & \\ & & \vdots & & \ddots & \\ & & -g_{nk} & & & 1 \end{bmatrix} \qquad (303)$$

Gauss transforms are used in the main for performing LU matrix decomposition. Gauss transforms are not in general numerically well behaved, and if the pivot element (the divisor $a_{kk}$) is very small in magnitude, then very large values may occur in the resulting transformed matrix; hence "pivoting" strategies are often used, whereby rows and/or columns of the matrix are interchanged while the integrity of the problem being solved is maintained. See also Matrix Decompositions - Gaussian Elimination/LU/Pivoting, Matrix Structured - Lower Triangular/Upper Triangular.

• Gaussian Elimination: Gaussian elimination is a technique used to find the solution of a square set of linear equations, $Ax = b$, for the unknown $n$ element vector $x$, where $A$ is an $n \times n$ non-singular matrix, and $b$ a known $n$ element vector. Gaussian elimination converts a square non-singular system into an equivalent, and easier to solve, system of equations in which $A$ has been implicitly premultiplied by a matrix $G$ to produce an upper triangular matrix $U$ and a new vector $y$. (The premultiplication is described as "implicit" as it is not necessary to explicitly form the matrix $G$ - the Gaussian elimination is done in stages.)
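This staged elimination, followed by back substitution on the resulting upper triangular system, might be sketched in Python as follows; `gaussian_eliminate` is a hypothetical helper with no pivoting, so it assumes all pivots are non-zero:

```python
def gaussian_eliminate(A, b):
    """Reduce A x = b to an upper triangular system U x = y using the
    Gauss-transform multipliers g_ik = -a_ik / a_kk, then back-substitute."""
    n = len(b)
    A = [row[:] for row in A]   # work on copies, leave the inputs intact
    b = b[:]
    for k in range(n - 1):
        for i in range(k + 1, n):
            g = -A[i][k] / A[k][k]          # the multiplier g_ik
            for j in range(k, n):
                A[i][j] += g * A[k][j]      # update row i (zeros A[i][k])
            b[i] += g * b[k]
    # back substitution on the upper triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

x = gaussian_eliminate([[2.0, 1.0], [4.0, 3.0]], [5.0, 11.0])   # x = [2.0, 1.0]
```

In a robust implementation the inner loop would be preceded by a row interchange (partial pivoting), as discussed under Pivoting below.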
Performing Gaussian elimination on the $n \times n$ system $Ax = b$ produces the equivalent upper triangular system of equations $Ux = y$, where $GA = U$ and $Gb = y$. Gaussian elimination can be formally described in terms of Gauss transforms, which are used to "zero" the elements below the main diagonal of a matrix, one column at a time, to ultimately convert it to an upper triangular form. The Gauss transform matrix, $G_k$, can be specified which will zero all elements below the diagonal in the $k$-th column of an $n \times n$ matrix, $A$. Therefore to solve the system of linear equations, $Ax = b$, the transforms $G_1$ to $G_{n-1}$ can be used to premultiply matrix $A$ (in the correct order) such that:

$$Ax = b \;\Rightarrow\; G_{n-1}\ldots G_2G_1Ax = G_{n-1}\ldots G_2G_1b \;\Rightarrow\; Ux = y \qquad (304)$$

and the equivalent system of equations, $Ux = y$, is solved by back substitution. In general Gaussian elimination is not numerically well behaved, and will fail if $A$ is singular. In particular, small pivot elements $a_{kk}$ on the diagonal of matrix $A$ may lead to very small and very large values appearing in the $L$ and $U$ matrices respectively. Therefore pivoting techniques are often used, whereby the rows and/or columns of $A$ are interchanged using (orthogonal) permutation matrices. In fact, where Gaussian elimination is to be used for solving a set of linear equations, it is recommended that pivoting is always used. See also Matrix Decompositions - Gauss Transforms/LU/Pivoting, Matrix Structured - Lower Triangular/Upper Triangular.

• Givens Rotations: Given's rotations (also known as plane rotations, and Jacobi rotations) represent an orthogonal transformation for introducing zero elements into a matrix.
The element $a_{21}$ of the following (full rank) matrix can be zeroed by applying the appropriate Givens rotation as follows:

$$\begin{bmatrix} c & s \\ -s & c \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \end{bmatrix} = \begin{bmatrix} \sqrt{a_{11}^2 + a_{21}^2} & (ca_{12} + sa_{22}) & \cdots & (ca_{1n} + sa_{2n}) \\ 0 & (-sa_{12} + ca_{22}) & \cdots & (-sa_{1n} + ca_{2n}) \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ 0 & b_{22} & \cdots & b_{2n} \end{bmatrix} \qquad (305)$$

where

$$c = \frac{a_{11}}{\sqrt{a_{11}^2 + a_{21}^2}} \quad \text{and} \quad s = \frac{a_{21}}{\sqrt{a_{11}^2 + a_{21}^2}} \qquad (306)$$

More generally, if a zero is to be introduced in the $i$-th row and $j$-th column of an $m \times n$ matrix $A$ by rotating with the element in the $k$-th row and $j$-th column, then an $m \times m$ Given's rotation matrix, $G$, can be applied. $G$ is an identity matrix except for the four elements $G_{kk} = c$, $G_{ki} = s$, $G_{ik} = -s$ and $G_{ii} = c$, where:

$$c = \frac{a_{kj}}{\sqrt{a_{kj}^2 + a_{ij}^2}} \quad \text{and} \quad s = \frac{a_{ij}}{\sqrt{a_{kj}^2 + a_{ij}^2}} \qquad (307)$$

When $A$ is premultiplied by $G$, only rows $k$ and $i$ change, and the element in position $(i, j)$ becomes zero.

Given's rotations are particularly useful for realizing the upper triangular $R$ matrix in a QR decomposition algorithm. Consider that a $5 \times 3$ full rank matrix is to be decomposed into its $Q$ and $R$ components. All of the elements below the main diagonal in column 1 are first rotated with the $a_{11}$ element, and after four Given's rotations (zeroing elements $(5,1)$, $(4,1)$, $(3,1)$, $(2,1)$ in turn) all appropriate elements are zeroed. For column 2, all elements below the main diagonal are rotated with the (updated) $(2,2)$ element, and after three Given's rotations all appropriate elements are zeroed.
Finally for column 3, all elements below the main diagonal are rotated with the (updated) $(3,3)$ element, and after two Given's rotations the upper triangular matrix $R$ is realized. Note that the order of element rotation is important in order that previously zeroed elements are retained as zeros when subsequent columns are rotated. Also note that when a matrix is rotated, the only elements that change are the ones in the row with the element being zeroed, and the row with which the element is being rotated. Finally, if the $Q$ matrix is specifically required, then the (sparse) Given's rotation matrices of the form in Eq. 307 can be retained and multiplied together at a later stage.

The name Given's rotations is after W. Givens, and the word rotation is used because the transform corresponds to a rotation of a vector $[x, y]^T$ in the x-y plane to the vector $[x_r, y_r]^T$ through an angle $\theta$; this also explains the name "plane rotation":

$$\begin{bmatrix} x_r \\ y_r \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, \qquad \cos\theta = \frac{x}{\sqrt{x^2 + y^2}}, \quad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}}, \quad \theta = \tan^{-1}\frac{y}{x}$$

Because of the orthogonal nature of the Given's rotations, the technique is numerically well behaved. From an intuitive consideration of Eq. 306 it can be seen that the magnitudes of $c$ and $s$ will never exceed one (i.e. $|c| \le 1$ and $|s| \le 1$), and therefore elements in the transformed matrix will have adequately bounded values. Over the last few years Given's rotations have been widely used for adaptive signal processing problems where fast, numerically stable, parallel algorithms have been required. See also Matrix Decompositions - QR, Recursive Least Squares - QR.
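A single rotation of the form of Eqs. 305-307 can be sketched in Python; `givens` and `apply_givens` are hypothetical helpers for illustration:

```python
import math

def givens(a, b):
    """Return the (c, s) pair of Eq. 306 that zeros b when rotated against a."""
    r = math.hypot(a, b)            # sqrt(a^2 + b^2), computed robustly
    return a / r, b / r

def apply_givens(A, k, i, j):
    """Rotate rows k and i of A (in place) to zero element A[i][j] (Eq. 307)."""
    c, s = givens(A[k][j], A[i][j])
    for col in range(len(A[0])):
        t = c * A[k][col] + s * A[i][col]
        A[i][col] = -s * A[k][col] + c * A[i][col]
        A[k][col] = t
    return A

A = [[3.0, 1.0],
     [4.0, 2.0]]
apply_givens(A, 0, 1, 0)   # A[1][0] -> 0, A[0][0] -> sqrt(3^2 + 4^2) = 5
```

Note that only the two rows involved in the rotation are touched, which is exactly the property that makes sequences of rotations composable into a full QR triangularization.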
• Householder Transformation: The Householder transformation is an $m \times m$ matrix, $H$, used to zero the elements below the main diagonal in the $k$-th column of a full rank $m \times n$ matrix $A$. Note that when $A$ is premultiplied by $H$, only elements in rows $k$ to $m$ actually change. Householder matrices are orthogonal, i.e. $HH^T = I$, and also symmetric, i.e. $H = H^T$. The Householder transformation can be illustrated by noting that the $k - 1$ lower elements of a $k \times 1$ vector, $x$, can be zeroed by premultiplying with a suitable Householder matrix:

$$Hx = \left(I - \frac{2vv^T}{v^Tv}\right)x = \begin{bmatrix} -\|x\|_2 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (308)$$

where

$$v = \begin{bmatrix} x_1 + \|x\|_2 \\ x_2 \\ x_3 \\ \vdots \\ x_k \end{bmatrix} \qquad (309)$$

and the 2-norm $\|x\|_2 = \sqrt{x_1^2 + x_2^2 + x_3^2 + \ldots + x_k^2}$. (Choosing $v$ with $x_1 - \|x\|_2$ instead yields $+\|x\|_2$ in the first position; the sign matching $x_1$ is preferred as it avoids numerical cancellation.)

Therefore the general Householder matrix, $H_k$, to zero the elements in column $k$ below the main diagonal of a matrix $A$, can be written in a partitioned matrix form:

$$H_k = \begin{bmatrix} I & 0 \\ 0 & H_{kk} \end{bmatrix}, \qquad H_k\begin{bmatrix} A_{11} & A_{1k} \\ A_{k1} & A_{kk} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{1k} \\ H_{kk}A_{k1} & H_{kk}A_{kk} \end{bmatrix} \qquad (310)$$

where

$$H_{kk} = I - \frac{2v_kv_k^T}{v_k^Tv_k} \quad \text{with} \quad v_k = \begin{bmatrix} a_{kk} + \|a_k\|_2 \\ a_{k+1,k} \\ \vdots \\ a_{mk} \end{bmatrix} \quad \text{and} \quad \|a_k\|_2 = \sqrt{a_{kk}^2 + a_{k+1,k}^2 + \ldots + a_{mk}^2}$$

where $a_k$ denotes the vector of column elements $a_{kk}$ to $a_{mk}$. A sequence of Householder matrices is very useful for performing certain matrix transforms such as the QR decomposition.
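The reflection of Eqs. 308-309 can be sketched in Python; `householder_vector` and `apply_householder` are hypothetical helpers for illustration, and the reflected vector has $-\|x\|_2$ in its first position because of the cancellation-avoiding sign choice:

```python
import math

def householder_vector(x):
    """The v of Eq. 309: a reflector mapping x onto a multiple of e_1."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    v = x[:]
    v[0] += norm if x[0] >= 0 else -norm   # sign chosen to avoid cancellation
    return v

def apply_householder(x):
    """Compute H x = (I - 2 v v^T / (v^T v)) x  (Eq. 308) without forming H."""
    v = householder_vector(x)
    vtv = sum(vi * vi for vi in v)
    vtx = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - 2.0 * vtx * vi / vtv for xi, vi in zip(x, v)]

y = apply_householder([3.0, 4.0])   # y = [-5.0, 0.0]; |y[0]| equals ||x||_2
```

Note that $H$ is never formed explicitly: applying the rank-one update $x - 2(v^Tx/v^Tv)v$ costs only $O(k)$ operations per vector.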
Consider an example where a $5 \times 3$ full rank matrix $A$ is to be decomposed into its $Q$ and $R$ components (for clarity all matrix variable row-column subscripts have been omitted). The transform $H_1$ zeros all elements below the diagonal in column 1, $H_2$ then zeros those below the diagonal in column 2, and $H_3$ those below the diagonal in column 3, such that:

$$H_3H_2H_1A = \begin{bmatrix} R \\ 0 \end{bmatrix} \qquad (311)$$

is upper triangular. Compared to Given's rotations, which zero a column vector element by element, Householder transformations require fewer arithmetic operations; however, Given's rotations have become more popular for modern DSP techniques as a result of their suitability for parallel array implementation [77], [88], unlike the Householder transformation which has no recursive implementation. The zeroing of column elements in a matrix can also be performed by the Gauss transform, typically for implementation of algorithms such as LU decomposition. However, unlike the Householder transform, Gauss transforms are not orthogonal. Because the Householder transform does not produce matrices with very large or very small elements (which may happen with the Gauss transform), its numerical behavior is in general good [136]. See also Matrix Decompositions - Given's Rotations/QR/SVD, Recursive Least Squares - QR.

• LDLT: See Matrix Decompositions - Cholesky.

• LDU: LDU decomposition is a special case of LU decomposition, whereby a non-singular $n \times n$ matrix $A$ can be factored into a unit lower triangular matrix $L$, a unit upper triangular matrix $U$, and a diagonal matrix $D$, such that $A = LDU$. See also Matrix Decompositions - Cholesky/LU.

• LLT: See Matrix Decompositions - Cholesky.
• LU: The LU decomposition is used to convert a non-singular $n \times n$ matrix $A$ into the product of a lower and an upper triangular matrix:

$$A = LU = \begin{bmatrix} l_{11} & 0 & 0 & \cdots & 0 \\ l_{21} & l_{22} & 0 & \cdots & 0 \\ l_{31} & l_{32} & l_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & l_{n3} & \cdots & l_{nn} \end{bmatrix}\begin{bmatrix} u_{11} & u_{12} & u_{13} & \cdots & u_{1n} \\ 0 & u_{22} & u_{23} & \cdots & u_{2n} \\ 0 & 0 & u_{33} & \cdots & u_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & u_{nn} \end{bmatrix} \qquad (312)$$

Gaussian elimination (or factorization), via a series of Gauss transforms, can be used to produce the LU decomposition. The $k$-th Gauss transform matrix, $G_k$, will zero all of the elements below the main diagonal in the $k$-th column of an $n \times n$ matrix, $A$. After applying the Gauss transforms $G_1$ to $G_{n-1}$, an upper triangular matrix is produced:

$$G_{n-1}\ldots G_2G_1A = U \qquad (313)$$

To obtain the lower triangular matrix, the above equation can be rearranged to give:

$$A = G_1^{-1}G_2^{-1}\ldots G_{n-1}^{-1}U \;\Rightarrow\; A = LU \qquad (314)$$

where $L = G_1^{-1}G_2^{-1}\ldots G_{n-1}^{-1}$. Note that the inverse Gauss transform matrices, $G_i^{-1}$, are trivial to compute from $G_i$, and they will also be lower triangular matrices (the product of two lower triangular matrices is always lower triangular). If a system of equations, $Ax = b$, is to be solved for the unknown $n$ element vector $x$, where $A$ is an $n \times n$ non-singular matrix and $b$ a known $n$ element vector, the solution can be found by LU factoring matrix $A$, and performing a forward substitution followed by a back substitution:

$$Ax = b \;\Rightarrow\; LUx = b \;\Rightarrow\; \begin{aligned} Ly &= b \quad &\text{solve by forward substitution} \\ Ux &= y \quad &\text{solve by backward substitution} \end{aligned} \qquad (315)$$

It is, however, less computation to perform Gaussian elimination, which forms the $U$ matrix but does not explicitly form the $L$ matrix. In general, using LU decomposition (or Gaussian elimination) to solve a system of linear equations does not have good numerical behavior, and the existence of small elements on the diagonals of $L$ and $U$, and large values elsewhere, may lead to the computation requiring a very large dynamic range.
Therefore pivoting techniques are usually used in the Gaussian elimination computation in an attempt to circumvent the effects of small and large values. See also Matrix Decompositions - Back Substitution/Cholesky/Forward Substitution/Gaussian Elimination/LDU/LDLT/Pivoting.

• Partial Pivoting: See entry for Matrix Decompositions - Pivoting.

• Pivoting: When performing certain forms of matrix decomposition such as LU, elements on the main diagonal are used as divisors when producing matrices such as Gauss transforms to zero certain elements in the matrix. If these pivot elements are very small, then they can result in very large numbers appearing in the matrices resulting from the decomposition. For example, consider the LU decomposition of the following $3 \times 3$ matrix:

$$A = \begin{bmatrix} 0.0001 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 10000 & 1 & 0 \\ 10000 & 1 & 1 \end{bmatrix}\begin{bmatrix} 0.0001 & 1 & 1 \\ 0 & -9999 & -9998 \\ 0 & 0 & 1 \end{bmatrix} = LU \qquad (316)$$

If fixed point arithmetic is used, then the dynamic range of numbers required for the $L$ and $U$ matrices is twice that for the $A$ matrix. Small pivot elements can be avoided by rearranging the rows of the $A$ matrix using orthogonal permutation matrices. Therefore for the above example:

$$PA = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 0.0001 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 3 \\ 0.0001 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0.0001 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & 3 \\ 0 & 0.9999 & 0.9997 \\ 0 & 0 & -1 \end{bmatrix} = L_pU_p \qquad (317)$$

and the LU factors now contain suitably small elements. In general when performing pivoting, prior to applying the Gauss transform on the $k$-th column, the column is scanned to find the largest magnitude element, and the permutation matrix is set up to swap that element's row into the pivot position, thus ensuring that small pivots are avoided.
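LU decomposition with this row-swapping strategy (partial pivoting) can be sketched in Python; `lu_partial_pivot` is a hypothetical helper, and note that its pivot choice may produce a different (but equally valid) permutation than the one shown in Eq. 317:

```python
def lu_partial_pivot(A):
    """Return P, L, U with P A = L U, swapping rows so that the largest
    available column element is always used as the pivot."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(U[r][k]))   # best pivot row
        U[k], U[p] = U[p], U[k]                            # interchange rows
        P[k], P[p] = P[p], P[k]
        L[k], L[p] = L[p], L[k]
        L[k][k] = 1.0
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]                    # multiplier <= 1
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return P, L, U

P, L, U = lu_partial_pivot([[0.0001, 1.0, 1.0],
                            [1.0, 1.0, 2.0],
                            [1.0, 1.0, 3.0]])
# with pivoting, every multiplier in L has magnitude at most 1
```

The key property is visible in the inner loop: because the pivot is the largest element in its column, each stored multiplier `L[i][k]` is bounded by one in magnitude, which is what keeps the factors' dynamic range small.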
If a system of linear equations:

$$Ax = b \qquad (318)$$

is to be solved using Gaussian elimination (or more exactly, LU decomposition with one stage of pivoting), where $A$ is a non-singular $n \times n$ matrix, $b$ is a known $n$ element vector, and $x$ is an unknown $n$ element vector, then:

$$PAx = Pb \;\Rightarrow\; LUx = Pb \;\Rightarrow\; \begin{aligned} Ly &= Pb \quad &\text{solve by forward substitution} \\ Ux &= y \quad &\text{solve by backward substitution} \end{aligned} \qquad (319)$$

If both the rows and the columns are scanned to circumvent small pivots, then this is often referred to as complete pivoting. Column swapping is achieved by postmultiplication of matrix $A$ with a suitable permutation matrix $Q$. Pivoting can be used in many other linear algebraic decompositions where small pivot/divisor elements need to be avoided. Note that because the pivot matrix $P$ (and also $Q$) is orthogonal, for least squares type operations the 2-norm of the pivoted matrix, $PA$, is not affected. See also Matrix Decomposition - Gaussian Elimination/LU, Vector Properties - Norm.

• Plane Rotations: See entry for Matrix Decompositions - Given's Rotations.

• QR: The QR matrix decomposition is an extremely useful technique in least squares signal processing systems, where a full rank $m \times n$ matrix $A$ ($m > n$) is decomposed into an $n \times n$ upper triangular matrix $R$ and an $m \times m$ orthogonal matrix $Q$ (with $Q^TQ = I$):

$$A = Q\begin{bmatrix} R \\ 0 \end{bmatrix}$$

If the least squares solution is required for the overdetermined linear set of equations:

$$Ax = b \qquad (320)$$

where $A$ is an $m \times n$ matrix, $b$ is a known $m$ element vector, and $x$ is an unknown $n$ element vector, then the minimum norm solution is required, i.e. minimize $\varepsilon$, where $\varepsilon = \|Ax - b\|_2$.
This can be found from the least squares solution:

$$x_{LS} = (A^TA)^{-1}A^Tb \qquad (321)$$

However, noting that the 2-norm (or Euclidean norm) is invariant under orthogonal transforms, the QR decomposition allows a different computational method to find the solution. Using a suitable sequence of Given's rotations, or Householder transformations, for a full rank $m \times n$ matrix $A$ (where $m > n$), the QR decomposition yields:

$$A = Q\begin{bmatrix} R \\ 0 \end{bmatrix} \qquad (322)$$

where $Q$ is an $m \times m$ orthogonal matrix (i.e. $QQ^T = I$), $R$ is an $n \times n$ upper triangular matrix with elements $r_{ij}$, and $0$ is an $(m - n) \times n$ zero matrix; written out elementwise, Eq. 322 expresses each of the $m \times n$ elements $a_{ij}$ of $A$ as a product of the elements $q_{ij}$ of $Q$ with the upper triangular array $r_{ij}$ (Eq. 323). Then:

$$\varepsilon = \|Ax - b\|_2 = \|Q^TAx - Q^Tb\|_2 = \left\|\begin{bmatrix} R \\ 0 \end{bmatrix}x - \begin{bmatrix} c \\ d \end{bmatrix}\right\|_2 = \|v\|_2 \qquad (324)$$

where $c$ is an $n$ element vector, $d$ an $m - n$ element vector, and the vector $v$ is therefore computed as:

$$v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \\ v_{n+1} \\ \vdots \\ v_m \end{bmatrix} = \begin{bmatrix} r_{11}x_1 + r_{12}x_2 + \ldots + r_{1n}x_n \\ r_{22}x_2 + \ldots + r_{2n}x_n \\ \vdots \\ r_{nn}x_n \\ 0 \\ \vdots \\ 0 \end{bmatrix} - \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \\ d_1 \\ \vdots \\ d_{m-n} \end{bmatrix} \qquad (325)$$

In order to minimize $\|v\|_2$, note that:

$$\|v\|_2^2 = \|Rx - c\|_2^2 + \|d\|_2^2 \qquad (326)$$

Therefore solving the system of equations $Rx - c = 0$ will give the desired least squares solution (note that the sub-vector norm $\|d\|_2$ cannot be minimized), i.e.

$$x_{LS} = R^{-1}c \qquad (327)$$

which can be conveniently solved using back substitution rather than performing the explicit inverse. The least squares residual is simply the value $\|d\|_2$.
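The whole procedure of Eqs. 322-327 can be sketched in Python by applying Givens rotations to the augmented matrix $[A \mid b]$ (so that $c$ and $d$ are formed alongside $R$), then back-substituting; `qr_least_squares` is a hypothetical helper for illustration:

```python
import math

def qr_least_squares(A, b):
    """Least squares solution of overdetermined A x = b via Givens rotations
    applied to the augmented matrix [A | b] (Eqs. 322-327)."""
    m, n = len(A), len(A[0])
    M = [A[i][:] + [b[i]] for i in range(m)]       # augmented system
    for j in range(n):                             # zero below each diagonal
        for i in range(j + 1, m):
            r = math.hypot(M[j][j], M[i][j])
            if r == 0.0:
                continue
            c, s = M[j][j] / r, M[i][j] / r        # Eq. 306 pair
            for col in range(n + 1):
                t = c * M[j][col] + s * M[i][col]
                M[i][col] = -s * M[j][col] + c * M[i][col]
                M[j][col] = t
    # back substitution on R x = c (the top n rows of the reduced system)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Fit the line y = a + b*t to the points (0,1), (1,3), (2,5): exactly a=1, b=2
x = qr_least_squares([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 3.0, 5.0])
```

Because only rotated rows of the augmented matrix are ever stored, $Q$ is never formed explicitly, mirroring the observation above that $Q$ need only be retained if it is specifically required.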
Because of the orthogonal nature of the algorithm, the QR decomposition is numerically well behaved and represents an extremely powerful and versatile basis for least squares signal processing techniques. A brief comparison of the solution obtained in Eq. 321 and that of Eqs. 322-327 will show that the QR approach operates directly on the data matrix, whereas the pseudoinverse form in Eq. 321 requires the squaring of the matrix $A$. Therefore a simplistic argument is that twice the dynamic range is required to accommodate the spread of numerical values in the pseudoinverse method, as compared to the QR based least squares solution. (Note that both solutions are identical if infinite precision arithmetic is used.) See also Least Squares, Matrix Decompositions - Back Substitution/Given's Rotation/Pseudoinverse, Matrix Properties - Overdetermined, Recursive Least Squares - QR.

• Similarity Transform: Two non-singular $n \times n$ matrices $A$ and $B$ are said to be similar if there exists a similarity transform matrix $X$ such that:

$$B = X^{-1}AX \qquad (328)$$

See also Matrix Decompositions - Eigenanalysis.

• Singular Value: The singular value decomposition (SVD) is one of the most important and useful decompositions in linear algebraic theory. The SVD allows an $m \times n$ matrix $A$, with $r = \mathrm{rank}(A) \le \min(m, n)$, to be transformed in the following manner:

$$U^TAV = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} \qquad (329)$$

and therefore:

$$A = U\begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}V^T \qquad (330)$$

where $U$ is an $m \times m$ orthogonal matrix, i.e. $U^TU = I$, $V$ is an $n \times n$ orthogonal matrix, i.e. $V^TV = I$, and $\Sigma$ is a diagonal sub-matrix containing the singular values of $A$:

$$\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_r) \qquad (331)$$

The $\Sigma$ matrix is usually written such that $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_r$. The singular value decomposition can be illustrated in a more diagrammatic form.
If for matrix $A$, $m > n$ and $r = \mathrm{rank}(A) = n$, the $\Sigma$ matrix has all non-zero elements on its main diagonal. If $r < n$, then $A$ has linearly dependent columns and there will be only $r$ non-zero diagonal elements. Similarly, if $m < n$ and $r = \mathrm{rank}(A) = m$, the $\Sigma$ matrix has all non-zero elements on its main diagonal (and again, if $r < m$ there will be only $r$ non-zero elements).

For signal processing algorithms, one of the main uses of the SVD is the definition of the pseudoinverse, $A^+$, which can be used to provide the least squares solution to a system of linear equations of the form:

$$Ax = b \qquad (332)$$

where $A$ is an $m \times n$ matrix, $b$ is a known $m$ element vector, and $x$ is an unknown $n$ element vector. The least squares, minimum norm solution is given by:

$$x = A^+b \qquad (333)$$

where

$$A^+ = V\begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix}U^T \qquad (334)$$

If it is assumed that $A$ has full rank, i.e. $\mathrm{rank}(A) = \min(m, n)$, there are three possible cases for the dimensions of matrix $A$:

- $m = n$ (square matrix): then $A^+ = A^{-1}$;
- $m > n$ (the overdetermined problem): then $A^+ = (A^TA)^{-1}A^T$; and
- $m < n$ (the underdetermined problem): then $A^+ = A^T(AA^T)^{-1}$.

The transformation of the pseudoinverse in Eq. 334 into the three forms shown above can be confirmed with straightforward linear algebra. Note that if $A$ is rank deficient then none of the above three cases apply, and the solution can only be found using the pseudoinverse of Eq. 334. In DSP systems the overdetermined problem (such as found in adaptive DSP) is by far the most common and recognizable "least squares solution". However, the pseudoinverse also provides a minimum norm solution for the underdetermined problem and when $A$ is rank deficient (e.g., inverse modelling problems such as are found in biomedical imaging and seismic data processing).
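The overdetermined case $A^+ = (A^TA)^{-1}A^T$ can be sketched in Python; `pinv_overdetermined` is a hypothetical helper restricted to full-rank $m \times 2$ matrices, so that the $2 \times 2$ inverse of the Gram matrix can be written in closed form:

```python
def pinv_overdetermined(A):
    """A+ = (A^T A)^{-1} A^T for a full-rank m x 2 matrix A (m > 2)."""
    m = len(A)
    # G = A^T A, the 2 x 2 Gram matrix
    g11 = sum(A[i][0] * A[i][0] for i in range(m))
    g12 = sum(A[i][0] * A[i][1] for i in range(m))
    g22 = sum(A[i][1] * A[i][1] for i in range(m))
    det = g11 * g22 - g12 * g12            # non-zero because A has full rank
    Ginv = [[g22 / det, -g12 / det],
            [-g12 / det, g11 / det]]
    # A+ = Ginv A^T, a 2 x m matrix
    return [[Ginv[r][0] * A[i][0] + Ginv[r][1] * A[i][1] for i in range(m)]
            for r in range(2)]

Ap = pinv_overdetermined([[1.0, 0.0],
                          [0.0, 1.0],
                          [1.0, 1.0]])
# defining property of the full-rank overdetermined pseudoinverse: A+ A = I
```

As noted above, forming $A^TA$ squares the dynamic range of the data, which is exactly the numerical argument in favour of the QR (or SVD) route for ill-conditioned problems.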
Note that if a non-singular square $n \times n$ matrix $R$ is symmetric, then the eigenvalue decomposition can be written as:

$$R = Q\Lambda Q^T \qquad (335)$$

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_n)$, and the singular values equal the magnitudes of the eigenvalues. If in fact $R = A^TA$, where $A$ is a full rank $m \times n$ matrix, then the singular values of $A$ are the square roots of the eigenvalues of $R$. This can be seen by noting that:

$$A^TA = V\begin{bmatrix} \Sigma \\ 0 \end{bmatrix}^TU^TU\begin{bmatrix} \Sigma \\ 0 \end{bmatrix}V^T = V\Sigma^2V^T \qquad (336)$$

where for illustration purposes $m > n$. To calculate the singular value decomposition, there are two useful techniques - the Jacobi algorithm and the QR algorithm [15], [77]. See also Least Squares, Matrix Properties - Pseudoinverse, Vector Properties - Minimum Norm.

• Spectral Decomposition: The eigenvalue-eigenvector decomposition of a matrix is often referred to as the spectral decomposition. See also Matrix Decomposition - Eigenanalysis.

• Square Root Free Given's Rotations: Square root free Given's rotations (also known as fast Given's) are simply a rearranged version of the Given's rotation, where the square root operation has been circumvented and an additional diagonal matrix introduced [15]. The motivation is that most DSP processors are not optimized for the square root operation, and hence its implementation can be slow. It is worth pointing out that stable versions of the square root free Given's require more divisions per rotation than standard Given's, and DSP processors usually perform square roots faster than divides! Hence the alternative name of fast Given's is not a wholly representative name. It is also worth noting that the square root free Given's may have numerical problems of overflow and underflow, unlike the standard Given's rotations. Unless square rooting is impossible, there is probably no good reason to use square root free Given's rotations.

• Square Root Decomposition: See entry for Matrix Decompositions - Cholesky.
• Triangularization: There are a number of matrix decompositions and algorithms which produce factors of a matrix that have upper and lower triangular forms. Any such procedure can therefore be referred to as a triangularization. See Matrix Decompositions - Cholesky/LU/QR.

Matrix Identities: See Matrix Properties.

Matrix Inverse: See Matrix Properties - Inversion.

Matrix Inversion Lemma: See Matrix Properties - Inversion Lemma.

Matrix Addition: See Matrix Operations - Addition.

Matrix Multiplication: See Matrix Operations - Multiplication.

Matrix Postmultiplication: See Matrix Operations - Postmultiplication.

Matrix Premultiplication: See Matrix Operations - Premultiplication.

Matrix Operations: Matrices can be added, subtracted, multiplied, scaled, transposed, and inverted. See also Matrix Operation Complexity.

• Addition (Subtraction): If two matrices are to be added (or subtracted) then they must be of exactly the same dimensions. Each element in one matrix is added to (or subtracted from) the corresponding element in the other matrix. For example:

$$\begin{bmatrix} 1 & 5 & 4 \\ 6 & 2 & 3 \\ 2 & 8 & 7 \end{bmatrix} + \begin{bmatrix} 3 & 5 & 1 \\ 0 & 2 & 1 \\ 3 & 2 & 0 \end{bmatrix} = \begin{bmatrix} (1+3) & (5+5) & (4+1) \\ (6+0) & (2+2) & (3+1) \\ (2+3) & (8+2) & (7+0) \end{bmatrix} = \begin{bmatrix} 4 & 10 & 5 \\ 6 & 4 & 4 \\ 5 & 10 & 7 \end{bmatrix} \qquad (337)$$

Matrix addition is commutative, i.e.

$$A + B = B + A \qquad (338)$$

• Hermitian Transpose: When the Hermitian transpose of a complex matrix is found, the $n$-th row of the matrix is written as the $n$-th column and each (complex) element of the matrix is conjugated. The Hermitian transpose of a matrix $A$ is denoted as $A^H$. Note that the matrix product $AA^H$ will always produce a Hermitian matrix (i.e. $(AA^H)^H = AA^H$), with real elements on the main diagonal:

$$A = \begin{bmatrix} (1+2j) & (-2+j) & (-1+4j) \\ (3+j) & (3+7j) & (1+5j) \end{bmatrix} \;\Leftrightarrow\; A^H = \begin{bmatrix} (1-2j) & (3-j) \\ (-2-j) & (3-7j) \\ (-1-4j) & (1-5j) \end{bmatrix} \qquad (339)$$

$$\Rightarrow\; AA^H = \begin{bmatrix} 27 & (25+31j) \\ (25-31j) & 94 \end{bmatrix}$$

Note that if a matrix, $B$, has only real number elements, then $B^H = B^T$. See also Matrix Properties - Hermitian, Complex Matrix, Matrix.
• Inverse: If for two square matrices $A$ and $B$:

$$AB = I \qquad (340)$$

then $B$ can be referred to as the inverse of $A$, i.e. $B = A^{-1}$. If $A^{-1}$ exists, then $A$ is non-singular. Note that

$$AA^{-1} = A^{-1}A = I \qquad (341)$$

and

$$(AB)^{-1} = B^{-1}A^{-1} \qquad (342)$$

For example:

$$A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 3 \\ 0 & 1 & 2 \end{bmatrix} \;\Rightarrow\; A^{-1} = \begin{bmatrix} -1 & 1 & -1 \\ -4 & 2 & -1 \\ 2 & -1 & 1 \end{bmatrix} \qquad (343)$$

$$AA^{-1} = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 3 \\ 0 & 1 & 2 \end{bmatrix}\begin{bmatrix} -1 & 1 & -1 \\ -4 & 2 & -1 \\ 2 & -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (344)$$

Inversion of matrices is useful for analytical procedures in DSP; however, its use in real time computation is rare because of the very large computation requirements and the potential numerical instability of the algorithm. In general, the explicit inversion of matrices is circumvented by the use of linear algebraic methods such as LU decomposition (with pivoting), QR decomposition and Cholesky decomposition (for symmetric matrices), which have improved numerical properties [15].

• Kronecker Product: This is a useful mathematical operator for generating vectors and matrices. It is particularly useful in interpretive programming languages such as Matlab™ for implementing simple DSP operations such as upsampling. In general, the Kronecker product multiplies every element of one matrix by a second matrix and arranges these matrices into the same shape as the first matrix.

• Multiplication: The multiplication of two matrices $AB$ is only possible when the number of columns in $A$ is the same as the number of rows in $B$. Each row of matrix $A$ is multiplied by each column of $B$ in a sum of products (or vector inner product) form. If $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, the result will be $C$, an $m \times p$ matrix. (Note that because of the dimensions, the product $BA$ cannot be formed unless $m = p$.) Matrix-matrix multiplication is not a commutative operation, i.e.
in general $AB \ne BA$. For example, if we form the matrix product $C = AB$, where $A$ is a $3 \times 4$ and $B$ is a $4 \times 2$ matrix:

$$A = \begin{bmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \end{bmatrix}, \qquad B = \begin{bmatrix} m & n \\ o & p \\ q & r \\ s & t \end{bmatrix} \qquad (345)$$

then

$$C = AB = \begin{bmatrix} (am + bo + cq + ds) & (an + bp + cr + dt) \\ (em + fo + gq + hs) & (en + fp + gr + ht) \\ (im + jo + kq + ls) & (in + jp + kr + lt) \end{bmatrix}$$

In general, for an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$, the $m \times p$ product matrix $C$ has elements:

$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} \qquad (346)$$

• Matrix-Vector Multiplication: Multiplication of a vector by a matrix is a special case of matrix multiplication, where one of the matrices to be multiplied is a vector, i.e. an $n \times 1$ matrix. Premultiplication of an $n \times 1$ vector by an $m \times n$ matrix yields an $m \times 1$ vector:

$$y = Rx = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} (a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4) \\ (a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4) \\ (a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4) \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \qquad (347)$$

• Premultiplication: See Postmultiplication.

• Postmultiplication: Note that in general for two matrices, $A$ and $B$ (of dimension $n \times m$ and $m \times n$ respectively):

$$AB \ne BA \qquad (348)$$

and therefore when multiplying two matrices it is important to specify the order. The order can be verbosely described using the terms postmultiplication and premultiplication. To state that matrix $C$ is formed by $A$ being postmultiplied by $B$, i.e.:

$$C = AB \qquad (349)$$

is equivalent to stating that $B$ is premultiplied by $A$.

• Scaling: A matrix, $A$, is scaled by multiplying every element by a scale factor, $c$:

$$cA = c\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} ca_{11} & ca_{12} & ca_{13} \\ ca_{21} & ca_{22} & ca_{23} \\ ca_{31} & ca_{32} & ca_{33} \end{bmatrix} \qquad (350)$$

• Transpose: The transpose of a matrix is obtained by writing the $n$-th column (top to bottom) of the matrix as the $n$-th row (left to right). The transpose of a matrix, $A$, is denoted as $A^T$.
For example, if:

        | a11 a12 a13 |
    A = | a21 a22 a23 |   ⇒  AT = | a11 a21 a31 a41 |
        | a31 a32 a33 |           | a12 a22 a32 a42 |         (351)
        | a41 a42 a43 |           | a13 a23 a33 a43 |

Therefore if B = AT, then for every element of A and B, a_ij = b_ji. Note also the identities:

    (AB)T = BT AT                                             (352)

and

    (AT)T = A                                                 (353)

The product AT A is frequently found in DSP, particularly in least squares derived algorithms. See also Hermitian Transpose.
• Subtraction: See Matrix-Vector Addition.
• Vector-Matrix Multiplication: See Matrix-Vector Multiplication.
Matrix Operation Complexity: The number of arithmetic operations to perform the fundamental matrix operations of addition (subtraction), multiplication and inversion can be given in terms of the number of multiplies, adds, divisions and square roots that are required.

    Matrix Operation     Matrix Dimension     Additions   Multiplies   Divides/Sqrts
    Addition A + B       (m × n) + (m × n)    mn          0            0
    Multiplication AB    (m × n).(n × p)      mnp         mnp          0
    Inversion A^-1       (n × n)              O(n^3)      O(n^3)       O(n^2)

In general if a matrix is sparse (e.g. upper triangular, diagonal etc.) then the number of arithmetic operations will be reduced since operations with one or more zero arguments need not be performed. For example multiplication of two diagonal matrices both of dimension n × n requires only n multiplies (and no additions). Also inversion of a diagonal matrix only requires n divisions. It is worth noting that the matrix inverse is rarely calculated explicitly and systems of linear equations of the form Ax = b are usually solved via Gaussian Elimination, or QR decomposition type algorithms [15].
Matrix, Partitioning: It is often convenient to group the elements of a matrix into smaller submatrices either for notational convenience or to highlight a logical division between two quantities represented in the same matrix.
For example the 6 × 4 matrix A can be partitioned into four 3 × 2 submatrices:

        | a11 a12 a13 a14 |
        | a21 a22 a23 a24 |
    A = | a31 a32 a33 a34 |  =  | A11 A12 |                   (354)
        | a41 a42 a43 a44 |     | A21 A22 |
        | a51 a52 a53 a54 |
        | a61 a62 a63 a64 |

A partitioned matrix is often referred to as a block matrix, i.e. a matrix in which the elements are submatrices, rather than scalars. The use of block matrices is often exploited in the development of DSP algorithms for notational convenience. The specification of an algorithm using partitioned matrices (block matrices) is often referred to as a block algorithm. Block algorithms (such as block matrix multiplication and addition etc.) should be expressed such that the block dimensions and the submatrix dimensions are consistent with the normal procedures of the matrix operation. QR decomposition and the matrix-vector form of an IIR filter can be conveniently represented as block matrix algorithms. For example consider the multiplication of the 6 × 4 matrix partitioned into 3 × 2 blocks (or submatrices) by a 4 × 4 matrix partitioned into 2 × 2 blocks or submatrices. The product C = AB can be expressed in terms of the submatrices. Note that the dimensions of the submatrices Aim and Bmj must be such that they can be matrix multiplied. In this example the result gives submatrices Cij of dimension 3 × 2:

    A = | A11 A12 | ,   B = | B11 B12 |                       (355)
        | A21 A22 |         | B21 B22 |

    C = | (A11 B11 + A12 B21)  (A11 B12 + A12 B22) |  =  | C11 C12 |      (356)
        | (A21 B11 + A22 B21)  (A21 B12 + A22 B22) |     | C21 C22 |

Matrix Properties: In this entry properties of a matrix include useful identities and general forms of information that can be extracted from or stated about a matrix. See also Matrix Decompositions, Matrix Operations.
• Condition Number: The condition number provides a measure of the ill-condition or poor numerical behavior of a matrix.
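The block multiplication rule above can be checked numerically. The sketch below (assuming NumPy is available) partitions random 6 × 4 and 4 × 4 matrices as in the example and confirms that the block-wise products assemble into the ordinary product:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))   # partitioned into four 3 x 2 blocks
B = rng.standard_normal((4, 4))   # partitioned into four 2 x 2 blocks

A11, A12 = A[:3, :2], A[:3, 2:]
A21, A22 = A[3:, :2], A[3:, 2:]
B11, B12 = B[:2, :2], B[:2, 2:]
B21, B22 = B[2:, :2], B[2:, 2:]

# Block algorithm: C_ij = A_i1 B_1j + A_i2 B_2j (Eq. 356)
C_block = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
                    [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])

assert np.allclose(C_block, A @ B)   # matches the ordinary product
```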
Consider the following set of equations where A is a known n × n non-singular matrix, and b is a known n × 1 vector:

    Ax = b                                                    (357)

The solution to this system of equations is well known to be:

    x = A^-1 b                                                (358)

Using a processor with “infinite” arithmetic precision an exact answer will be obtained. If however the equation is to be solved using finite precision arithmetic, then this can be modelled as a small error added to the elements of A and b, where this error is such that:

    ||δA|| / ||A|| ≈ ε   and   ||δb|| / ||b|| ≈ ε ,   with ε « 1          (359)

Therefore the problem is now one of solving:

    x + δx = (A + δA)^-1 (b + δb)                             (360)

where δA and δb represent the error (or perturbation) matrix and vector of A and b respectively. It can be shown that the relative error of the norm (perturbation) of the vector x is bounded by:

    ||δx|| / ||x|| ≤ ε κ(A)                                   (361)

where for a square matrix A the condition number, κ(A), is defined as:

    κ(A) = ||A|| ||A^-1||                                     (362)

The norm of a matrix, ||A||, gives information in some sense of the magnitude of the matrix. One measure of matrix norm is its largest singular value. If the matrix A is decomposed using the singular value decomposition (SVD):

    A = U Σ V^T                                               (363)

where Σ = diag(σ1, σ2, σ3, …, σn) is a diagonal matrix denoting the singular values of A, and U and V are orthogonal matrices, then the condition number of matrix A, denoted κ(A), is defined as the ratio of the largest singular value to the smallest singular value (in accordance with Eq. 362):

    κ(A) = max(σi) / min(σi)   for 1 ≤ i ≤ n                  (364)

Therefore if a matrix has a very large condition number a simple interpretation is that when solving equations of the form in Eq. 358 even very small errors in the matrix A, as modelled in Eq. 360, may lead to very large errors in the solution vector x; hence “numerical” care must be taken.
To state the relevance of κ(A) in another way, if the condition number is very large then this implies that when calculating the inverse matrix:

    A^-1 = V Σ^-1 U^T                                         (365)

the dynamic range of numbers in the inverse will be very large. This is easily seen by noting that Σ^-1 = diag(σ1^-1, σ2^-1, σ3^-1, …, σn^-1). For example if:

    A = | 1 0 |   then   A^-1 = | 1  0  |   and   κ(A) = 2    (366)
        | 0 2 |                 | 0 0.5 |

the matrix A is well-conditioned and a numerical dynamic range of around 0.1 to 10 (40dB = 20 log(10 ⁄ 0.1)) is “suitable” for the arithmetic. However for a matrix B:

    B = | 1    0   |   then   B^-1 = | 1    0   |   and   κ(B) = 10000    (367)
        | 0 0.0001 |                 | 0  10000 |

the condition number highlights the ill-conditioning of the matrix, and this time a numerical dynamic range of around 0.0001 to 10000 (160dB) is required for reliable arithmetic. Therefore matrix A could be reliably inverted by a 16 bit DSP processor (96dB dynamic range), whereas matrix B would require a 32 bit floating point DSP processor (764dB dynamic range). Note that the larger the condition number the “closer” the matrix is to singularity. A singular matrix has a condition number of ∞.
For analysis of many DSP algorithms note that the condition number is often given as the ratio of the largest eigenvalue to the smallest eigenvalue:

    κ(A) = Largest Eigenvalue / Smallest Eigenvalue = λmax / λmin         (368)

This is because in most DSP problems solved using linear algebra techniques the matrix A is square and very often symmetric positive definite; in that case the eigenvalue decomposition is a special case of the more general singular value decomposition, and the eigenvalues are the same as the singular values. See also Adaptive Signal Processing, Matrix Decompositions - Eigenvalue/Singular Value, Matrix Properties - Norm/Eigenvalue Ratio, Vector Properties - Norm, Recursive Least Squares.
• Conjugate Transpose: See Matrix Properties - Hermitian Transpose.
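The link between condition number and numerical sensitivity can be illustrated with a short Python/NumPy sketch (the 2 × 2 matrices here are hypothetical examples): solving Ax = b with a slightly perturbed b produces a small change in x for a well-conditioned A, but a large change for an ill-conditioned A, as Eq. 361 predicts.

```python
import numpy as np

# Hypothetical 2x2 systems: one well-conditioned, one nearly singular
A_good = np.array([[1.0, 0.0], [0.0, 2.0]])      # kappa(A) = 2
A_bad  = np.array([[1.0, 1.0], [1.0, 1.0001]])   # kappa(A) ~ 4e4

b  = np.array([1.0, 1.0])
db = np.array([0.0, 1e-4])        # a small perturbation of b

def relative_change(A):
    """Relative change in the solution x when b is perturbed by db."""
    x = np.linalg.solve(A, b)
    dx = np.linalg.solve(A, b + db) - x
    return np.linalg.norm(dx) / np.linalg.norm(x)

for A in (A_good, A_bad):
    # Eq. 361: the relative error in x can grow as kappa(A) times the perturbation
    print("kappa = %10.1f   relative change in x = %.2e"
          % (np.linalg.cond(A), relative_change(A)))
```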
• Determinant: Noting that for a 1 × 1 matrix, α = [a], the determinant is given by det(α) = a, the determinant of a square matrix A of dimension m × m can be defined recursively in terms of the determinants of related (m − 1) × (m − 1) matrices, A_1i, obtained by deleting the first row and the i-th column of A:

    det(A) = Σ_{i=1}^{m} (−1)^{i+1} a_1i det(A_1i)            (369)

where a_1i is the first element in the i-th column of the matrix. If det(A) = 0 then the matrix is singular. Also for two square matrices A and B it can be shown that det(AB) = det(A)det(B), and det(A^T) = det(A). In general a square matrix has a non-zero determinant if and only if its rows (and columns) are linearly independent.
• Eigenvalue: For a square n × n matrix, A, if there exists a non-zero n × 1 vector x and a scalar λ such that:

    Ax = λx                                                   (370)

then λ is an eigenvalue and x is an eigenvector of matrix A. See also Matrix Decompositions - Eigenanalysis.
• Eigenvalue Ratio: The ratio of the largest eigenvalue to the smallest eigenvalue, denoted κ(A), for a square symmetric positive definite matrix, A:

    κ(A) = Largest Eigenvalue / Smallest Eigenvalue = λmax / λmin         (371)

is more precisely known as the condition number of a matrix. The eigenvalue ratio (also known as eigenvalue spread) gives information about the general numerical behavior (good or otherwise!) of a data matrix A when a problem, usually of the form Ax = b, is solved for the unknown vector x, i.e. x = A^-1 b. See also Matrix Properties - Condition Number, Matrix Decompositions - Eigenvalue/Singular Value, Adaptive Signal Processing Algorithms.
• Eigenvalue Spread: See entry for Matrix Properties - Condition Number/Eigenvalue Ratio.
• Frobenius Norm: See Matrix Properties - Norm. See also Vector Properties - Norm.
• Hermitian (Symmetric): A complex matrix is often described as Hermitian if A = A^H.
Synonymous names are Hermitian symmetric, or complex-symmetric. Note that if the matrix A is real, then A^H = A^T and A would be described as symmetric. See Matrix Properties - Hermitian Transpose.
• Hermitian Transpose: For two complex matrices A (m × n) and B (n × m) the Hermitian transpose of the product can be written as:

    (AB)^H = B^H A^H                                          (372)

Note that:

    (A^H)^H = A                                               (373)

A “dagger” is often used as the Hermitian transpose symbol, i.e. A^H = A†. The matrix product R of an m × n matrix, A, and its Hermitian transpose, A^H, will always produce a conjugate symmetric m × m matrix, i.e. R = R^H:

    A = | (1+2j)  (−2+j)  (−1+4j) |   ⇔   A^H = | (1−2j)  (3−j)  |
        | (3+j)   (3+7j)  (1+5j)  |             | (−2−j)  (3−7j) |
                                                | (−1−4j) (1−5j) |

    ⇒   R = A A^H = |   27     25+31j |  =  R^H               (374)
                    | 25−31j     94   |

(also, if A is full rank, then R will be positive definite, otherwise R will be positive semi-definite). Note that if a matrix, B, has only real number elements, then the Hermitian transpose is equivalent to the normal matrix transpose, i.e. B^H = B^T. See also Complex Matrix, Complex Numbers, Matrix Properties - Hermitian.
• Ill-Conditioned: An m × n matrix, A, is said to be ill-conditioned when the condition number, calculated as the ratio of the maximum singular value to minimum singular value (or maximum eigenvalue to minimum eigenvalue for n × n matrices), is very high. A matrix that is not ill-conditioned is well-conditioned. For more detail see entry Matrix Properties - Condition Number. See also Matrix Decompositions - Eigenvalue/Singular Value.
• ∞-norm: See Matrix Properties - Norm.
• Inversion: For two square invertible matrices A and B:

    (AB)^-1 = B^-1 A^-1                                       (375)

See also Matrix Operations - Inversion.
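The conjugate symmetry of R = A A^H can be confirmed numerically; this NumPy sketch uses the example matrix A from the Hermitian Transpose entry above:

```python
import numpy as np

# The example matrix from the Hermitian Transpose entry
A = np.array([[1 + 2j, -2 + 1j, -1 + 4j],
              [3 + 1j,  3 + 7j,  1 + 5j]])

R = A @ A.conj().T                       # R = A A^H

assert np.allclose(R, R.conj().T)        # R is conjugate (Hermitian) symmetric
assert np.isclose(R[0, 0].real, 27.0)    # |1+2j|^2 + |-2+j|^2 + |-1+4j|^2 = 27
assert np.isclose(R[1, 1].real, 94.0)    # |3+j|^2 + |3+7j|^2 + |1+5j|^2 = 94

# A has full (row) rank, so R is also positive definite:
assert np.all(np.linalg.eigvalsh(R) > 0)
```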
• Inversion Lemma: If A and C are nonsingular square matrices and B and D are of compatible dimension such that:

    P = A + BCD                                               (376)

and P is non-singular, then the matrix inversion lemma allows P^-1 to be expressed as:

    P^-1 = A^-1 − A^-1 B (C^-1 + D A^-1 B)^-1 D A^-1          (377)

This identity can be confirmed by multiplying the right hand sides of Eq. 376 and Eq. 377 together:

    (A + BCD)(A^-1 − A^-1 B (C^-1 + D A^-1 B)^-1 D A^-1)
      = I + BCDA^-1 − B(C^-1 + DA^-1 B)^-1 DA^-1 − BCDA^-1 B(C^-1 + DA^-1 B)^-1 DA^-1
      = I + BCDA^-1 − B(I + CDA^-1 B)(C^-1 + DA^-1 B)^-1 DA^-1
      = I + BCDA^-1 − BC(C^-1 + DA^-1 B)(C^-1 + DA^-1 B)^-1 DA^-1
      = I + BCDA^-1 − BCDA^-1
      = I      QED                                            (378)

For some digital signal processing algorithms (such as the recursive least squares (RLS) algorithm) it is often the case that C is a 1 × 1 identity matrix, B is a vector and D is the same vector transposed. Also for notational reasons A is written as an inverse matrix. Therefore applying the matrix inversion lemma to:

    P = R^-1 + v v^T                                          (379)

gives

    P^-1 = R − R v (1 + v^T R v)^-1 v^T R                     (380)

• Non-negative Definite: See entry for Matrix Properties - Positive Definite.
• Nonsingular: See Matrix Properties - Singular.
• Norm: A matrix norm gives a measure of the overall magnitude of the matrix. The most common norms are the Frobenius norm and the set of p-norms. The Frobenius norm of an m × n matrix A is usually denoted ||A||_F and calculated as:

    ||A||_F = sqrt( Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij² )           (381)

The p-norms are generally defined in terms of vector p-norms and calculated as:

    ||A||_p = max_x ( ||Ax||_p / ||x||_p )                    (382)

This can also be expressed in the form:

    ||A||_p = max ||Au||_p   where ||u||_p = 1                (383)

On an intuitive level, the matrix 2-norm gives information on the amount by which a matrix will “amplify” the length (vector 2-norm) of any unit vector. Typically p = 1, 2 or ∞.
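The Frobenius and 2-norm definitions above can be checked against NumPy's built-in norms (a short sketch for illustration; the matrix is a hypothetical example):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# Frobenius norm (Eq. 381): square root of the sum of squared elements
fro = np.sqrt(np.sum(A**2))
assert np.isclose(fro, np.linalg.norm(A, 'fro'))

# 2-norm: the largest singular value -- the most a unit vector is "amplified"
two = np.linalg.svd(A, compute_uv=False).max()
assert np.isclose(two, np.linalg.norm(A, 2))

print(fro, two)
```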
Note that the matrix ∞-norm is easily calculated as the maximum absolute row sum of the matrix. See also Matrix Properties - Condition Number, Vector Properties - Norms.
• Null Space: The null space of A is defined as:

    null(A) = { x ∈ ℜ^n : Ax = 0 }                            (384)

Intuitively, the null space of A is the set of all vectors orthogonal to the rows of A. See also Matrix Properties - Rank/Range, Vector Properties - Space/Subspace.
• 1-norm: See Matrix Properties - Norm.
• Overdetermined System: The linear set of equations, Ax = b, where A is a known m × n matrix with linearly independent columns (i.e. rank(A) = n), b is a known m element vector and x is an unknown n element vector, is said to be overdetermined if m > n, meaning there are more equations than unknowns. An overdetermined system of equations has, in general, no exact solution for x. However by minimizing the 2-norm of the error vector e = Ax − b, i.e. minimizing ε = ||Ax − b||₂, the least squares solution is found:

    x_LS = (A^T A)^-1 A^T b                                   (385)

For example, given the overdetermined system of equations (note there is no exact solution):

    | 1 0 | | x1 |   | 3 |
    | 0 0 | | x2 | = | 4 |                                    (386)
    | 0 1 |          | 2 |

we can make a geometrical interpretation of the least squares solution by representing the various vectors and projected vectors in three dimensional space.

[Figure: the vector b, the projected vector Ax, and the error vector e drawn in three dimensional (x, y, z) space.]

Now considering the subspace defined by the matrix:

        | 1 0 |
    A = | 0 0 |                                               (387)
        | 0 1 |

the columns only span the x-z plane (y = 0) of the above three-dimensional space. Therefore the vector Ax_LS that minimizes the norm of the error vector, ε = ||Ax − b||₂, must lie on the x-z plane.
Using the least squares solution, and noting that here A^T A = I and A^T b = [3, 2]^T:

    x_LS = (A^T A)^-1 A^T b = | 3 |                           (388)
                              | 2 |

From the above geometrical representation it should be clear that because the vector Ax is constrained to lie in the x-z plane, if the 2-norm (Euclidean length) of the error vector e = Ax − b is to be minimized this will occur when e is perpendicular (orthogonal) to the x-z plane, i.e. the same solution as the least squares. For problems with more than three dimensions a geometric interpretation cannot be offered explicitly; however intuition gained from simpler examples is useful. See also Least Squares, Square System of Equations, Matrix Properties - Underdetermined System, Vector Properties - 2-norm.
• Positive Definite: An n × n square matrix, A, is positive definite if:

    x^T A x > 0                                               (389)

for all non-zero n element vectors, x. If

    x^T A x ≥ 0                                               (390)

then A is said to be positive semi-definite or non-negative definite. Note that if a matrix B has full column rank, then the matrix R = B^T B is always positive definite. R will also be symmetric. This can be simply seen by noting that:

    x^T R x = x^T B^T B x = ||Bx||₂²                          (391)

where ||Bx||₂² is the square of the 2-norm of the vector Bx, which is always a positive quantity for non-zero vectors x when B has full column rank. Note that any symmetric positive definite matrix can be decomposed into its square root, or Cholesky, form. See also Correlation Matrix, Matrix Decompositions - Cholesky, Vector Properties - Norm.
• Positive Semi-definite: See entry for Matrix Properties - Positive Definite.
• Pseudo-Inverse: If an m × n matrix A, where m > n, has rank(A) = n, then the system of equations Ax = b cannot be solved by calculating x = A^-1 b because A is not square and therefore not invertible. However the least squares solution can be found such that:

    x_LS = (A^T A)^-1 A^T b                                   (392)

If A is not full rank (i.e., rank(A) < n), however, then the inverse of (A^T A) will fail to exist.
In this case, the pseudo-inverse of A, A+, is used. The pseudo-inverse is defined from the singular value decomposition of A as:

    A+ = V | Σ^-1 0 | U^T                                     (393)
           |  0   0 |

where A has been decomposed (see Matrix Decompositions - Singular Value) into:

    A = U | Σ 0 | V^T
          | 0 0 |

with Σ being a rank r (r < n) diagonal matrix with a well-defined inverse. If A happens to be full rank then the pseudo-inverse can be directly related to A as:

    A+ = (A^T A)^-1 A^T   if m > n                            (394)

While we have focussed on the overdetermined problem here, we should note that the pseudo-inverse also provides a minimum norm solution for the underdetermined problem where A is rank deficient. See also Least Squares, Matrix Decompositions - Singular Value Decomposition, Overdetermined System, Underdetermined System.
• Rank: The rank of a matrix is equal to the number of independent rows or columns of the matrix. For an m × n matrix, A, where m ≥ n, rank(A) = n if and only if the column vectors are linearly independent; note that rank(A) = rank(A^T). Similarly, if m < n then rank(A) = m if and only if the row vectors of A are linearly independent. If rank(A) < min(m, n) then the matrix may be described as rank deficient. Note that for an m × m square matrix, if rank(A) < m then the matrix is singular. While in an analytical, academic framework (i.e., infinite precision) the concept of rank is clearly defined, it becomes somewhat more problematic to define rank when working with matrix based packages such as MatlabTM. Because of round-off errors, it is possible to have a test for matrix rank indicate a full rank matrix when the matrix is actually very poorly conditioned. In some cases software packages warn of rank deficiencies (especially on matrix inversions). However, in DSP applications the significance of low power dimensions is often very application specific. Therefore, it is generally a good idea to pay attention to the condition number of matrices with which you are working.
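The advice above — that a numerical rank test may miss near-singularity — can be demonstrated with a short NumPy sketch (the matrix is a hypothetical example):

```python
import numpy as np

# A nearly rank-deficient matrix: the rank test may still report full rank
eps = 1e-10
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + eps]])

rank = np.linalg.matrix_rank(A)   # reports 2 (full rank) at default tolerance
kappa = np.linalg.cond(A)         # but the condition number is huge (~4/eps)
print(rank, kappa)
```

This is why monitoring the condition number, rather than the reported rank alone, is the safer practice.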
As an example, if you are performing a least squares filter design and the coefficient magnitudes become enormous (say on the order of 10^15) when you were expecting much more reasonable numbers (say 10^-1, 10^1, etc.) this is a good indication of possible rank deficiency (in this case, the rank deficiency is unlikely to be detected by software monitoring). See also Matrix Properties - Range/Singular/Condition Number.
• Rank Deficient: See entry for Matrix Properties - Rank.
• Range Space: For an m × n matrix A, the subspace spanned by the column partitioning of the matrix A = [a1, a2, a3, …, an] is referred to as the range space of the matrix. Therefore:

    range(A) = { y ∈ ℜ^m : y = Ax, for any x ∈ ℜ^n }          (395)

See also Vector Properties - Space/Subspace.
• Singular: For a square matrix, A, if there exists no matrix X such that AX = I (where I is the identity matrix) then the inverse matrix A^-1 does NOT exist and the matrix is singular; otherwise the matrix is nonsingular. For example the matrix:

    A = | 1 0 |                                               (396)
        | 9 0 |

is singular as there exists no matrix X such that AX = I. For an n × n singular matrix, A, the rank will be less than n. See also Matrix Decompositions - Singular Value Decomposition, Matrix Properties - Pseudo-Inverse.
• Singular Value: See Matrix Decompositions - Singular Value Decomposition.
• Sherman-Morrison-Woodbury Formula: See Matrix Properties - Inversion Lemma.
• Space: See Vector Properties - Space.
• Square Root Matrix: If a symmetric matrix, R, is decomposed into its Cholesky factors:

    R = L L^T                                                 (397)

where L is a lower triangular matrix, L is often also called a square root matrix of R. There are many other definitions of matrix square root. For example, for the symmetric square matrix R:

    R^(1/2) ≡ V Λ^(1/2) V^T                                   (398)

where the eigen-decomposition of R is used and the square root of the diagonal matrix of eigenvalues is simply defined as the diagonal matrix of the square roots of the individual eigenvalues.
See also Matrix Decompositions - Cholesky/Eigenanalysis.
• Square System of Equations: The linear set of equations:

    Ax = b                                                    (399)

where A is a known non-singular n × n matrix (i.e., rank(A) = n), b is a known n element vector, and x is an unknown n element vector, represents a square system of equations which has an exact solution for x given by:

    x = A^-1 b                                                (400)

For example:

    | 3 2 | | x1 |   | 1 |       | x1 |   |  1 −2 | | 1 |   | −5 |
    | 1 1 | | x2 | = | 3 |   ⇒   | x2 | = | −1  3 | | 3 | = |  8 |       (401)

For large n it is usually not advisable to calculate A^-1 directly due to potential numerical instabilities, particularly if A is ill-conditioned. Equations of the form in Eq. 399 are best solved using orthogonal techniques such as the QR algorithm, or more general matrix decomposition techniques such as LU decomposition (with pivoting), or Cholesky decomposition if A is symmetric. If matrix A has m > n then the problem is overdetermined and if m < n then the problem is underdetermined. If the rank of A is less than n, then the pseudo-inverse is required. See also Least Squares, Matrix Decompositions - Cholesky/LU/QR/SVD, Matrix Properties - Ill-Conditioned/Overdetermined System/Pseudo-Inverse/Underdetermined System.
• Subspace: See Vector Properties - Subspace.
• Trace: The trace of a square n × n matrix, A, is defined as the sum of the diagonal elements of that matrix:

    trace(A) = Σ_{i=1}^{n} a_ii                               (402)

It is relatively straightforward (using matrix decompositions) to show that for any m × n matrix A, and any n × m matrix B:

    trace(AB) = trace(BA)                                     (403)

In DSP a particularly useful property of the trace is that trace(A) = λ1 + λ2 + … + λn, where λi is the i-th eigenvalue of an n × n matrix A. See also Matrix Decompositions - Eigenanalysis.
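The square-system example and the trace/eigenvalue property above can be verified with NumPy (np.linalg.solve factorizes the matrix rather than forming A^-1 explicitly):

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [1.0, 1.0]])
b = np.array([1.0, 3.0])

# Solve Ax = b directly (LU decomposition with pivoting internally)
x = np.linalg.solve(A, b)
print(x)   # [-5.  8.]

# trace(A) equals the sum of the eigenvalues of A
assert np.isclose(np.trace(A), np.sum(np.linalg.eigvals(A)))
```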
• Transpose: For two matrices A (m × n) and B (n × m) the transpose of the product can be written as:

    (AB)^T = B^T A^T                                          (404)

Note that:

    (A^T)^T = A                                               (405)

The product of any m × n matrix and its transpose gives an m × m square symmetric matrix:

    A = | 1  2 −3 |   ⇒   A A^T = |  14 −13 |                 (406)
        | 4 −1  5 |               | −13  42 |

• 2-norm: See Matrix Properties - Norm.
• Underdetermined System: The linear set of equations Ax = b is said to be underdetermined when A is a known m × n matrix with m < n, b is a known m element vector and x is an unknown n element vector. Essentially, there are fewer equations than unknowns. If A has linearly independent rows (i.e. rank(A) = m), then there are an infinite number of exact solutions. If rank(A) < m, however, then the set of equations may be inconsistent, i.e., no exact solution exists. In this latter case, an infinite number of least squares (inexact) solutions exists, with the pseudo-inverse giving the minimum norm solution. Consider the following underdetermined system of equations:

    [ a11 a21 ] | x1 | = b1 ,   i.e.   a11 x1 + a21 x2 = b1   (407)
                | x2 |

Choosing any value for x1, a value of x2 satisfying the underdetermined system of equations can be produced. Hence there is no unique solution and there are an infinite number of solutions. However some solutions are “better” than others, and the minimum norm solution, i.e. the solution with the smallest 2-norm ||x||₂, can be found using least squares techniques. The underdetermined problem can be usefully illustrated geometrically. Consider the following underdetermined system of equations:

    | 1 0 0 | | x1 |   | 3 |
    | 0 0 1 | | x2 | = | 2 |                                  (408)
              | x3 |

The solution set to Eq.
408 is:

    x1 = 3,   x2 = any real number,   x3 = 2                  (409)

[Figure: the solution set represented in three dimensional space; whatever the value of x2, the matrix A projects the vector x onto b.]

From a geometrical interpretation, regardless of the magnitude of x2, the matrix A will project the vector x onto b. The underdetermined least squares problem can however be uniquely solved using the minimum norm solution. If the 2-norm of the error vector e = Ax − b is minimized, i.e. ε = ||e||₂ = ||Ax − b||₂, then from the above geometrical interpretation the best solution occurs when x2 = 0. This solution is unique and best in the sense that the x vector has minimum norm. This solution can be calculated by using the least squares solution for underdetermined systems, noting that here A A^T = I:

    x_LS = A^T (A A^T)^-1 b = | 3 |
                              | 0 |                           (410)
                              | 2 |

See also Least Squares, Matrix Decompositions - Singular Value, Overdetermined System, Square System of Equations.
• Well-Conditioned: An m × n matrix, A, is said to be well-conditioned when the condition number, calculated as the ratio of the maximum singular value to minimum singular value (or maximum eigenvalue to minimum eigenvalue for n × n matrices), is low relative to the precision of the system on which the matrix is being manipulated. A matrix that is not well-conditioned is ill-conditioned. For more details see entry Matrix Properties - Condition Number. See also Matrix Decompositions - Eigenvalue/Singular Value.
• Woodbury’s Identity: See Matrix Properties - Inversion Lemma.
Matrix Scaling: See Matrix Operations - Scaling.
Matrix, Structured: A matrix that has regularly grouped elements and a specific structure of zero elements is called a structured matrix. When structured matrices are to be used in calculations, the zeroes in the structure can often be exploited to reduce the total number of computations, and the matrix storage requirements.
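The minimum norm solution of the underdetermined example above can be verified numerically (a NumPy sketch):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
b = np.array([3.0, 2.0])

# Minimum norm solution x = A^T (A A^T)^-1 b (Eq. 410)
x_mn = A.T @ np.linalg.inv(A @ A.T) @ b
print(x_mn)                       # [3. 0. 2.]
assert np.allclose(A @ x_mn, b)   # an exact solution ...

# ... and, among all solutions (x2 is free), the one with smallest 2-norm
x_other = x_mn + np.array([0.0, 5.0, 0.0])    # also solves Ax = b
assert np.allclose(A @ x_other, b)
assert np.linalg.norm(x_mn) < np.linalg.norm(x_other)
```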
A number of key structured matrices often found in linear algebra based DSP algorithms and analysis can be identified. See also Matrix Decompositions, Matrix Operations, Matrix Properties.
• Band: In a band matrix the upper right and lower left corners of the matrix are zero elements, and a band of diagonal elements is non-zero. For example a 5 × 6 matrix with band width of 3 may have the form:

    B = | b11 b12  0   0   0   0  |
        | b21 b22 b23  0   0   0  |
        |  0  b32 b33 b34  0   0  |                           (411)
        |  0   0  b43 b44 b45  0  |
        |  0   0   0  b54 b55 b56 |

• Bidiagonal: A matrix where only the main diagonal and the first diagonal (above or below the main) are non-zero. See also Bidiagonalization.

    E = | d1 g1  0  0 |
        |  0 d2 g2  0 |                                       (412)
        |  0  0 d3 g3 |
        |  0  0  0 d4 |

• Circulant: An n × n circulant matrix has only n distinct elements, where each row is formed by shifting the previous row by one element to the right in a circular buffer fashion. One interesting property of circulant matrices is that the eigenvalues can be determined by taking a DFT of the first row. The eigenvectors are given by the standard basis vectors of the DFT. See also Matrix, Structured - Toeplitz.

    C = | r0 r1 r2 r3 |
        | r3 r0 r1 r2 |                                       (413)
        | r2 r3 r0 r1 |
        | r1 r2 r3 r0 |

• Diagonal: A diagonal matrix has all elements, except those on the main diagonal, equal to zero. Multiplying an appropriately dimensioned matrix by a diagonal matrix is equivalent to multiplying the i-th row of the matrix by the i-th diagonal element. Diagonal matrices are usually square matrices, although this is not necessarily the case.

    D = | d11  0   0   0  |
        |  0  d22  0   0  |                                   (414)
        |  0   0  d33  0  |
        |  0   0   0  d44 |

For shorthand, a diagonal matrix is often denoted as D = diag(d1, d2, d3, d4) where di = dii.
• Identity: The identity matrix has all elements zero, except for the main diagonal elements which are equal to one. The identity matrix is almost universally denoted as I. For any matrix A multiplied by the appropriately dimensioned identity matrix, the result is A.
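The circulant eigenvalue property noted above — the eigenvalues are the DFT of the first row — can be checked with a short NumPy sketch (the numerical values are hypothetical):

```python
import numpy as np

r = np.array([4.0, 1.0, 2.0, 3.0])               # first row of the circulant matrix
C = np.array([np.roll(r, k) for k in range(4)])  # each row shifted right by one

eigs = np.linalg.eigvals(C)
dft = np.fft.fft(r)          # DFT of the first row

# Every DFT bin matches one eigenvalue (the ordering may differ)
for f in dft:
    assert np.min(np.abs(eigs - f)) < 1e-9
```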
Any matrix multiplied by its inverse gives the identity matrix. See also Diagonal Matrix, Matrix Inverse.

    I = | 1 0 0 0 |
        | 0 1 0 0 |                                           (415)
        | 0 0 1 0 |
        | 0 0 0 1 |

• Lower Triangular: A matrix where all elements above the main diagonal are equal to zero. Lower triangular matrices are useful in solving linear algebraic equations with algorithms such as LU (lower, upper) decomposition. Useful properties are that the product of two lower triangular matrices is a lower triangular matrix, and the inverse of a lower triangular matrix is a lower triangular matrix. See also Forward-substitution, Upper Triangular Matrix.

    L = | l11  0   0   0  |
        | l21 l22  0   0  |                                   (416)
        | l31 l32 l33  0  |
        | l41 l42 l43 l44 |

• Orthogonal: A matrix is called orthogonal (or orthonormal) if its transpose, Q^T, forms the inverse matrix Q^-1, i.e.

    Q^T = Q^-1   and   Q^T Q = I = Q Q^T                      (417)

It can also be said that the columns of the matrix Q form an orthonormal basis for the space ℜ^m. While the terms orthogonal and orthonormal are used interchangeably as applied to matrices, they have distinct meanings when applied to sets of functions or vectors -- with orthonormal indicating unit norm for every element in an orthogonal set. See also Matrix Decompositions - Eigenvalue/QR, Matrix Properties - Unitary Matrix.
• Orthonormal: See Orthogonal.
• Permutation: A matrix that is essentially the identity matrix with the row orders changed. Multiplying another matrix, A, by a permutation matrix, P, will swap the row orders of A. Multiplication of a matrix by a permutation matrix does not change fundamental quantities such as the singular values or condition number of the matrix. The permutation matrix is an orthogonal matrix.

    P = | 0 0 0 1 |
        | 0 1 0 0 |                                           (418)
        | 0 0 1 0 |
        | 1 0 0 0 |

• Rectangular: A matrix that does not have the same number of rows and columns.
• Sparse: Any matrix with a large proportion of zero elements is often termed a sparse matrix.
Matrices such as lower triangular, diagonal etc. can be described as structured sparse matrices. When performing matrix algebra on sparse matrices, the number of MACs required is usually greatly reduced over an equivalent operation using a fully populated matrix, given that many null operations (e.g. multiplies and additions that have one or two zero arguments) need not be performed.
• Square: A matrix with the same number of rows as columns. Covariance and correlation matrices are necessarily square.
• Symmetric: A matrix is symmetric if A = A^T. The line of symmetry is therefore through the main diagonal. Many matrices used in DSP algorithms are symmetric, such as the correlation matrix.

    S = | s11 s12 s13 s14 |
        | s12 s22 s23 s24 |                                   (419)
        | s13 s23 s33 s34 |
        | s14 s24 s34 s44 |

• Toeplitz: This matrix has constant elements along all diagonals. The correlation matrix of a stationary stochastic N element data vector forms an N × N Toeplitz matrix. See also Matrix, Structured - Circulant, Correlation Matrix, Covariance Matrix.

    T = | r0  r1  r2  r3 |
        | r−1 r0  r1  r2 |                                    (420)
        | r−2 r−1 r0  r1 |
        | r−3 r−2 r−1 r0 |

• Tridiagonal: A matrix where only the main, first upper and first lower diagonals are non-zero.

    T = | t11 s12  0   0  |
        | v21 t22 s23  0  |                                   (421)
        |  0  v32 t33 s34 |
        |  0   0  v43 t44 |

• Unitary: A complex matrix, U, is unitary if its Hermitian (conjugate) transpose forms the inverse matrix U^-1, i.e. U^H = U^-1 and therefore,

    U^H U = I = U U^H                                         (422)

The unitary property is the complex matrix equivalent of orthogonality. See also Eigenvalue Decomposition, QR Algorithm, Unitary Matrix.
• Upper Triangular: A matrix where all elements below the main diagonal are equal to zero. Upper triangular matrices are useful in solving linear algebraic equations with algorithms such as LU (lower, upper) decomposition.
Useful properties are that the product of two upper triangular matrices is an upper triangular matrix, and the inverse of an upper triangular matrix is an upper triangular matrix. See also Back-substitution, Lower Triangular Matrix.

        [ u11  u12  u13  u14 ]
    U = [ 0    u22  u23  u24 ]    (423)
        [ 0    0    u33  u34 ]
        [ 0    0    0    u44 ]

Matrix-Vector Multiplication: See Matrix Operations - Matrix-Vector Multiplication.

Maximum Length Sequences: If a binary sequence is produced using a pseudo random binary sequence generator, the sequence is said to be a maximum length sequence if, for an N bit register, the binary sequence is of length 2^N - 1 before it repeats itself. In a maximum length sequence the number of 1's is one more than the number of 0's. Also known as m-sequences. See also Pseudo-Random Binary Sequence.

Mean Value: The statistical mean value of a signal, x(k), is the average amplitude of the signal. The statistical mean is calculated using the statistical expectation operator, E{.}:

    E{x(k)} = Statistical Mean Value of x(k) = Σ_k x(k) p{x(k)}    (424)

where p{x(k)} is the probability density function of x(k). In real time DSP the probability density function of a signal is rarely known. Therefore to find the mean value of a signal the more intuitively obvious calculation of a time average, computed over a large and representative number of samples, N, is used:

    Time Average = (1/N) Σ_{k=0}^{N-1} x(k)    (425)

The time averaged mean value can be calculated by finding the average signal amplitude over a large and representative number of samples. If the signal is ergodic then the time averages and statistical averages are the same. See also Ergodic, Expected Value, Mean Squared Value, Wide Sense Stationarity.

Mean Squared Value: The statistical mean squared value of a signal, x(k), is the average squared amplitude (i.e. the average power) of the signal.
The statistical mean squared value is often denoted using the statistical expectation operator, E{.}, and is calculated as:

    E{x²(k)} = Statistical Mean Squared Value of x(k) = Σ_k x²(k) p{x(k)}    (426)

where p{x(k)} is the probability density function of x(k). In real time DSP the probability density function of a signal is rarely known, and therefore to find the mean squared value of a signal the more intuitively obvious calculation of a time average, calculated over a large and representative number of samples, N, is used:

    Average Squared Value = (1/N) Σ_{k=0}^{N-1} x²(k)    (427)

The time averaged mean squared value can be calculated by finding the average amplitude of the squared signal over a large and representative number of samples. If the signal is ergodic then the time averages and statistical averages are the same. Note that the mean squared value is always a positive value for any non-zero signal. See also Ergodic, Expected Value, Variance, Wide Sense Stationarity.

Memory: Integrated circuits used to store binary data. Most memory devices are CMOS semiconductors. For a DSP system memory will either be ROM or RAM. See also Static RAM, Dynamic RAM.

Message: The information to be communicated in a communication system. The message can be continuous (analog) or discrete (digital). If an analog message is to be transmitted via a digital communications system it must first be sampled and digitized. See also Analog to Digital Converter, Digital Communications.

MFLOPS: This measure gives the speed rating of a processor in terms of the number of millions of floating point operations per second (MFLOPS) it can perform. DSP processors can often perform more FLOPS than their clock speeds would suggest.
This counter-intuitive capacity results from the fact that the floating point operations are pipelined, with MFLOPS calculated as a time-averaged (best case) performance. The MFLOPS rating can be misleading for practical programs running on a DSP processor, which rarely attain the peak MFLOPS speed when performing peripheral functions such as data acquisition, data output, etc.

Middle A: See Western Music Scale.

Middle C: See Western Music Scale.

MiniDisc (MD): The MiniDisc was introduced to the audio market in 1992 as a digital audio playback and record format with the aim of competing with both the compact disc (CD), introduced in 1983, and the compact cassette, introduced in the 1960s. Sony developed the MiniDisc partly to break into the portable hi-fidelity audio market and therefore the format needed to be compact and resistant to vibration and mechanical knocks [155]. Compared to the very successful CD format, the MiniDisc offers the advantage of being much smaller by virtue of the smaller media required by psychoacoustically compressed data. In addition, it features a record facility. The MiniDisc is a competing format to Philips' DCC which also uses psychoacoustic data compression techniques. The MiniDisc is 64mm in diameter and uses magneto-optical techniques for recording. The size of the disc was kept small by using adaptive transform acoustic coding (ATRAC) to compress the original 44.1kHz, 16 bit PCM music by a factor of 4.83. One MiniDisc can store 64 minutes of compressed stereo audio, requiring around 140 Mbytes. Space is also made available for timing and track information. The MiniDisc encodes data using the same modulation and similar error checking as the CD, namely eight to fourteen modulation (EFM) and a slightly modified cross interleaved Reed-Solomon coding (CIRC). The risk of shock and vibration in everyday use is addressed by a 4 Mbit buffer capable of storing more than 14 seconds of compressed audio.
Therefore if the optical pickup loses its tracking the music can continue playing while the tracking is repositioned (requiring less than a second) and the buffer is refilled. In fact the pickup can read 5 times faster than the ATRAC decoder and therefore during normal operation the MiniDisc reads only intermittently.

[Figure: MiniDisc record/playback signal chain - L/R inputs via an ADC or digital I/O feed a three channel subband filter, modified discrete cosine transform, bit allocation/spectral quantizing, and error coding/data modulation, then a 4 Mbit data buffer and the read/write head; playback reverses the chain to the DAC. The MiniDisc (MD) compresses stereo 16 bit PCM audio signals sampled at 44.1 kHz by a factor of almost 5:1. MiniDiscs are read/writable and have a built in data buffer to resist mechanical shock.]

The MiniDisc can also be used for data storage and corresponds to a read-write disc of storage capacity 140 Mbytes. See also Adaptive Transform Acoustic Coding, Compact Disc (CD), Digital Audio, Digital Audio Tape (DAT), Digital Compact Cassette (DCC), Psychoacoustics.

Minimum Audible Field: A measure of the lowest level of sound detectable by the human ear. See entry Threshold of Hearing.

Minimum Norm Vector: See Vector Properties and Definitions - Minimum Norm.

Minimum Phase: A transfer function is minimum phase if all of its zeroes lie within the unit circle on the z-plane. See also Z-transform.

Minimum Residual: See Least Squares Residual.

Minimum Shift Keying (MSK): A form of frequency shift keying in which memory is introduced from symbol to symbol to ensure continuous phase. The separation in frequency between symbols is 1/(2T) Hz (for a symbol period of T seconds), allowing the maximum number of orthogonal signals in a fixed bandwidth. The fact that the MSK symbol stream is constrained to ensure continuous phase and has signals closely spaced in frequency means that MSK modulation is the most spectrally efficient form of FSK. MSK is sometimes referred to as Fast FSK since more data can be transmitted over a fixed bandwidth with MSK than with FSK.
Gaussian MSK (GMSK, as used in the GSM mobile radio system, for example) introduces a Gaussian pulse shaping on the MSK signals. This pulse shaping allows a trade-off between spectral overlap and interpulse interference. See also Frequency Shift Keying, Continuous Phase Modulation.

MIPS: This gives a measure of the number of millions of instructions per second that a DSP processor can execute.

Modem: A concatenation of MODulate and DEModulate. Modems are devices installed at both ends of an analog communication line (such as a telephone line). At the transmitting end digital signals are modulated onto the analog line, and at the receiving end the incoming signal is demodulated back to digital format. Modems are widely used for inter-computer connection and in FAX machines.

Modular Interface eXtension (MIX): MIX is a high performance bus to connect expansion modules to a VME bus or a Multibus II baseboard. A few companies have adopted this standard.

Modulo-2 Adder: Another name for an exclusive OR gate, with output z = a'b + ab' = a ⊕ b:

    a  b | z
    -----+---
    0  0 | 0
    0  1 | 1
    1  0 | 1
    1  1 | 0

See also Full Adder, Pseudo-Random Binary Sequence.

Monaural: This refers to a system that presents signals to only one ear (e.g. a hearing aid worn on only one ear is monaural). See also Binaural, Monophonic, Stereophonic.

Monaural Beats: When two tones with slightly different frequencies are played together, the ear may perceive a composite tone beating at the rate of the frequency difference between the tones. See also Beat Frequencies, Binaural Beats.

Monophonic: This refers to a system that has only one audio channel (although this single signal may be presented on multiple speakers). See also Monaural, Stereophonic, Binaural.

Moore-Penrose Inverse: See Matrix Properties - Pseudo-Inverse.

Mosaic: A hypertext browser used on the internet for interchange and exchange of information in the form of text, graphics, and audio. See also Internet, World Wide Web.
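The modulo-2 adder above is the feedback element in the pseudo-random binary sequence generators described under Maximum Length Sequences. A minimal sketch, assuming a 4 bit right-shifting register with a feedback tap pair that is known to give a maximal length sequence (all names and the tap choice are illustrative, not from the text):

```python
# Sketch: a 4-bit Fibonacci LFSR built from modulo-2 adders (XOR gates).
# With the tap pair below the output repeats only after 2**4 - 1 = 15 bits,
# i.e. a maximum length sequence with one more 1 than 0.

def lfsr_sequence(n_bits=4, taps=(0, 1), seed=0b1111):
    """Generate one full period of an m-sequence.

    taps are 0-based register bit indices XORed (modulo-2 added) to
    form the feedback bit shifted into the top of the register.
    """
    state = seed
    period = (1 << n_bits) - 1
    out = []
    for _ in range(period):
        out.append(state & 1)                      # output the LSB
        fb = 0
        for t in taps:                             # modulo-2 addition
            fb ^= (state >> t) & 1
        state = (state >> 1) | (fb << (n_bits - 1))
    return out

seq = lfsr_sequence()
print(len(seq), seq.count(1), seq.count(0))        # 15 8 7
```

As the entry states, the period is 2^N - 1 = 15 and the sequence contains eight 1's and seven 0's.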
Most Significant Bit (MSB): The bit in a binary number with the largest arithmetic significance. For example, in an 8 bit two's complement number the MSB carries a weight of -128:

    bit weight:  -128  64  32  16  8  4  2  1
    bit value:      0   0   0   1  1  0  1  1   =  16 + 8 + 2 + 1  =  27 (decimal)

See also Least Significant Bit, Sign Bit.

Motherboard: A DSP board that has its own functionality, and also spaces for smaller functional boards (extra processors, I/O channels) to be inserted, is called a motherboard. This is analogous to the main board in a PC system that is home to the processor and other key system components.

Moving Average (MA) FIR Filter: The moving average (MA) filter "usually" refers to an FIR filter of length N where all filter weights have the value of 1. (The term MA is however sometimes used to mean any (non-recursive) FIR filter, usually within the context of stochastic signal modelling [77].) The moving average filter is a very simple form of low pass filter often found in applications where computational requirements need to be kept to a minimum. A moving average filter produces an output sample at time, k, by adding together the last N input samples (including the current one). This can be represented as a signal flow graph (a delay line of N-1 elements feeding an adder) and by the discrete equation:

    y(k) = x(k) + x(k-1) + x(k-2) + x(k-3) + ... + x(k-N+1) = Σ_{n=0}^{N-1} x(k-n)

The moving average filter requires no multiplications, only N-1 additions.

[Figure: the linear and logarithmic magnitude frequency responses of a 10 weight moving average FIR filter, showing a lowpass mainlobe followed by a series of sidelobes with regularly spaced nulls.]
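The moving average filter just described can be sketched in a few lines; this is an illustrative implementation (function name ours) using a running sum, so each output costs one addition and one subtraction regardless of N:

```python
# Sketch: a length-N moving average FIR filter (all weights = 1),
# maintained as a running sum of the last N input samples.

def moving_average(x, N):
    """Return y(k) = sum of the last N input samples (no 1/N scaling,
    matching the unnormalized filter described in the entry)."""
    y = []
    acc = 0
    for k, sample in enumerate(x):
        acc += sample                 # add the newest sample
        if k >= N:
            acc -= x[k - N]           # drop the sample leaving the window
        y.append(acc)
    return y

# A constant (0 Hz) input of amplitude 1 is amplified by a factor of N
# once the delay line has filled:
print(moving_average([1] * 8, 4))     # [1, 2, 3, 4, 4, 4, 4, 4]
```

The steady-state output of N for a unit DC input shows the gain of N at 0 Hz discussed later in the entry.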
The peak of the first sidelobe of any moving average filter is always approximately 13dB below the gain at 0 Hz. In terms of the z-domain, we can write the transfer function of the moving average FIR filter as:

    H(z) = Y(z)/X(z) = 1 + z^-1 + z^-2 + ... + z^-(N-1) = Σ_{i=0}^{N-1} z^-i = (1 - z^-N)/(1 - z^-1)    (428)

recalling that the sum of a geometric series {1, r, r², ..., r^m} is given by (1 - r^(m+1))/(1 - r). The right hand form of Eq. 428 represents a transfer function with N zeroes and a single pole at z = 1, which is of course cancelled out by a zero at z = 1, since an FIR filter has no poles associated with it. We can find the zeroes of the polynomial in Eq. 428 by solving:

    1 - z^-N = 0  ⇒  z_n = (1)^(1/N),  where n = 0...N-1
                  ⇒  z_n = (e^(j2πn))^(1/N),  noting that e^(j2πn) = 1    (429)
                  ⇒  z_n = e^(j2πn/N)

which represents N zeroes equally spaced around the unit circle starting at z = 1, but with the z = 1 zero cancelled out by the pole at z = 1. For the above 10 weight moving average FIR filter:

    H(z) = Y(z)/X(z) = 1 + z^-1 + z^-2 + ... + z^-9 = Σ_{i=0}^{9} z^-i = (1 - z^-10)/(1 - z^-1)

[Figure: the pole-zero plot for a moving average filter of length 10.] As expected the filter has 9 zeroes equally spaced around the unit circle (save the one not present at z = 1). In some representations a pole and a zero may be shown at z = 1, however these cancel each other out. The use of a pole is only to simplify the z-transform polynomial expression. In general if a moving average filter has N weights then the width of the first (half) lobe of the mainlobe is f_s/N Hz, which is also the bandwidth of each of the sidelobes up to f_s/2. The moving average filter shown will amplify an input signal by a factor of N.
If no gain (i.e. gain = 1) is required at 0 Hz then the output of the filter should be divided by N. However one of the attractive features of a moving average filter is that it is simple to implement, and the inclusion of a division is not conducive to this aim. Therefore should 0 dB be required at 0 Hz, if the filter length is made a power of 2 (i.e. 8, 16, 32 and so on) then the division can be done with a simple shift right operation on the filter output, whereby each shift right divides by 2. The moving average FIR filter is linear phase and has a group delay of approximately half the filter length ((N-1)/2 samples). See also Comb Filter, Digital Filter, Exponential Averaging, Finite Impulse Response Filter, Finite Impulse Response Filters - Linear Phase, Infinite Impulse Response Filter.

Moving Picture Experts Group (MPEG): The MPEG standard comes from the International Organization for Standards (ISO) sub-committee (SC) 29, which is responsible for standards on "Coding of Audio, Picture, Multimedia and Hypermedia Information". Working Group (WG) 11 (ISO JTC1/SC29/WG11) considered the problem of coding of multimedia and hypermedia information and produced the MPEG joint standards with the International Electrotechnical Commission (IEC):

• ISO/IEC 11172: MPEG-1 (Moving Picture Coding up to 1.5 Mbit/s)
  Part 1: Systems
  Part 2: Video
  Part 3: Audio
  Part 4: Compliance Testing (CD)
  Part 5: Technical Report on Software for ISO/IEC 11172

• ISO/IEC 13818: MPEG-2 (Generic Moving Picture Coding)
  Part 1: Systems (CD)
  Part 2: Video (CD)
  Part 3: Audio (CD)
  Part 4: Compliance Testing
  Part 5: Technical Report on Software for ISO/IEC 13818
  Part 6: Systems Extensions
  Part 7: Audio Extensions

Some current work of ISO JTC1/SC29/WG11 is focussed on the definition of the MPEG-4 standard for Very-low Bitrate Audio-Visual Coding.
MPEG-1 essentially defines a bit stream representation for synchronized digital video and audio compressed to fit in a bandwidth of about 1.5 Mbits/s, which corresponds to the bit rate output of a CD-ROM or DAT. The video stream requires about 1.15 Mbits/s, with the remaining bandwidth used by the audio and system data streams. MPEG is also widely used on the Internet as a means for transferring audio/video clips. MPEG-1 has subsequently enabled the development of various multimedia systems and CD-DV (compact disc digital video). The MPEG standard is aimed at using intra-frame (as in JPEG) and inter-frame compression techniques to reduce the digital storage requirement of moving pictures, or video [72]. MPEG-1 video reduces the color subsampling ratio of a picture to one quarter of the original source values in order that the actual compression algorithms are less processor intensive. MPEG-1 video then uses a combination of the discrete cosine transform (DCT) and motion estimation to exploit the spatial and temporal redundancy present in video sequences and (depending on the resolution of the original sequence) can yield compression ratios of approximately 25:1 to give almost VHS quality video. The motion estimation algorithm efficiently searches blocks of pixels, and therefore can track the movement of objects between frames or as the camera pans around. The DCT exploits the physiology of the human eye by taking blocks of pixels and converting them from the spatial domain to the frequency domain, with subsequent quantization. As with JPEG, a zig-zag scan of the DCT coefficients yields long runs of zeros for the higher frequency components. This improves the efficiency of the run length encoding (also similar to JPEG). In general very high levels of computing power are required for MPEG encoding (of the order of hundreds of MIPS to encode 25 frames/s). However decoding is not quite as demanding and there are a number of single chip decoder solutions available.
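The zig-zag scan mentioned above can be sketched as follows; this is an illustrative implementation (not text from the standard) that orders an 8x8 DCT coefficient block along anti-diagonals, from the DC term to the highest frequency:

```python
# Sketch: zig-zag scan order for an 8x8 block of DCT coefficients.
# Traversing anti-diagonals orders coefficients from low to high frequency,
# grouping the (usually zero) high-frequency terms into long runs for
# efficient run length encoding.

def zigzag_indices(n=8):
    """Return (row, col) pairs in zig-zag scan order for an n x n block."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col, one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diag.reverse()                     # alternate traversal direction
        order.extend(diag)
    return order

scan = zigzag_indices(8)
print(scan[:6])    # DC term first, then the lowest frequency coefficients
```

Applying this ordering to a quantized DCT block leaves the many zero-valued high frequency coefficients at the end of the scan, where run length encoding compacts them.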
MPEG-2 is designed to offer higher than MPEG-1 quality playback at bit rates of between 4 and 10 Mbits/s, which is above the playback rate currently achievable using CD disc technology. MPEG-4 is aimed at very low bit rate coding for applications such as video-conferencing or video-telephony. See also Compression, Discrete Cosine Transform, H-Series Recommendations - H.261, International Organisation for Standards (ISO), Moving Picture Experts Group - Audio, Psychoacoustic Subband Coding, International Telecommunication Union, ITU-T Recommendations, Standards.

Moving Picture Experts Group (MPEG) - Audio: The International Organization for Standards (ISO) MPEG audio standards were based around the compression techniques developed for MUSICAM (Masking Pattern Adapted Universal Subband Integrated Coding and Multiplexing) and ASPEC (Adaptive Spectral Perceptual Entropy Coding). MPEG audio compression uses subband coding techniques with dynamic bit allocation based on psychoacoustic models of the human ear. By exploiting both spectral and temporal masking effects, compression ratios of up to 12:1 for CD quality audio (without too much degradation to the average listener) can be realized. The so-called MPEG-1, ISO 11172-3 standard describes compression coding schemes for hi-fidelity audio signals sampled at 48 kHz, 44.1 kHz or 32 kHz with 16 bit resolution in one of four modes: (1) single channel; (2) dual (independent or bilingual) channels; (3) stereo channels; and (4) joint stereo. The standard only defines the format of the encoded data, and therefore if improved psychoacoustic models can be found then they can be incorporated into the compression scheme. Note that the psychoacoustic modelling is only required in the coder; in the decoder the only requirement is to "unpack" the signals. Therefore the cost of an MPEG decoder is lower than an MPEG encoder.
The standard defines layers 1, 2 and 3, which correspond to different compression rates requiring different levels of coding complexity, and of course having different levels of perceived quality. The various parameters of the three layers (based on an input signal sampled at 48 kHz with 16 bit samples - a data rate of 768 kbits/s) are:

    MPEG Audio      Theoretical coding/   Target bit rate/    Compression   No. of subbands in     "Similar" compression
    ISO 11172-3     decoding delay (ms)   channel (kbits/s)   ratio         psychoacoustic model   schemes
    Layer 1         19                    192                 4:1           32                     PASC
    Layer 2         35                    128                 6:1           32                     MUSICAM
    Layer 3         59                    64                  12:1          576                    ASPEC

Layer 1 is the least complex to implement and is suitable for applications where good quality is required and audio transmission bandwidths of at least 192 kbits/s are available. PASC (precision adaptive subband coding), as used on the digital compact cassette (DCC) developed by Philips, is very similar to layer 1. Layer 2 is identical to MUSICAM. Layer 3, which achieves the highest rate of data compression, is only required when bandwidth is seriously limited; at 64 kbits/s the quality is generally good, however a keen listener will notice artifacts. In the MPEG-2, ISO 13818-3 standard, key advancements have been made over MPEG-1 ISO 11172 with respect to the inclusion of dynamic range controls, surround sound, and the use of lower sampling rates. Surround sound, or multichannel sound, is likely to be required for HDTV (high definition television) and other forms of digital audio broadcasting. Draft standards for multichannel sound formats have already been published by the International Telecommunication Union Radiocommunication Committee (ITU-R) and the European Broadcast Union (EBU). MPEG-2 is designed to transmit 5 channels, 3 front channels and 2 surround channels, in so called 3/2 surround format. Using a form of joint stereo coding, the bit rate for layer 2 of MPEG-2 will be about 2.5 times that of 2 channel MPEG-1 layer 2, i.e.
between 256 and 384 kbits/s. MPEG-2 was also aimed at extending psychoacoustic compression techniques to lower sampling frequencies (24 kHz, 22.05 kHz and 16 kHz), which will give good fidelity for speech-only type tracks. It is likely that this type of coding could replace techniques such as the ITU-T G.722 coding (see G-series recommendations). MPEG-4 will code audio at very low bit rates and is currently under consideration. See also Psychoacoustics, Precision Adaptive Subband Coding (PASC), Spectral Masking, Temporal Masking.

MPEG: See Moving Picture Experts Group.

Multichannel LMS: See Least Mean Squares Algorithm Variants.

Multimedia: The integration of speech, audio, video and data communications on a computer. For all of these aspects DSP co-processing may be necessary to implement the required computational algorithms. Multimedia PCs have integrated FAX, videophone, audio and TV - all made possible by DSP.

Multimedia and Hypermedia Information Coding Experts Group (MHEG): MHEG is a standard for hypermedia document representation. MHEG is useful for the implementation aspects of interactive hypermedia applications such as on-line textbooks, encyclopedias, and learning software such as are already found on CD-ROM [94]. The MHEG standard comes from the International Organization for Standards (ISO) sub-committee (SC) 29, which is responsible for standards on "Coding of Audio, Picture, Multimedia and Hypermedia Information". Working Group (WG) 12 (ISO JTC1/SC29/WG12) considered the problem of coding of multimedia and hypermedia information and produced the MHEG joint standard with the International Electrotechnical Commission (IEC): ISO/IEC 13522 MHEG (Coding of Multimedia and Hypermedia Information). See also International Organisation for Standards, Multimedia, Standards.
Multimedia Standards: The emergence of multimedia systems in the 1990s brings the communication and presentation of audio, video, graphics and hypermedia documents onto a common platform. The successful integration of software and hardware from different manufacturers requires that standards are adopted. For current multimedia systems a number of ITU, ISO and ISO/IEC JTC standards are likely to be adopted. A non-exhaustive sample list of suitable standards includes:

• ITU-T Recommendations:
  F.701 Teleconference service.
  F.710 General principles for audiographic conference service.
  F.711 Audiographic conference teleservice for ISDN.
  F.720 Videotelephony services - general.
  F.721 Videotelephony teleservice for ISDN.
  F.730 Videoconference service - general.
  F.732 Broadband videoconference services.
  F.740 Audiovisual interactive services.
  G.711 Pulse code modulation (PCM) of voice frequencies.
  G.712 Transmission performance characteristics of pulse code modulation.
  G.720 Characterization of low-rate digital voice coder performance with non-voice signals.
  G.722 7 kHz audio-coding within 64 kbit/s; Annex A: Testing signal-to-total distortion ratio for 7 kHz audio-codecs at 64 kbit/s.
  G.724 Characteristics of a 48-channel low bit rate encoding primary multiplex operating at 1544 kbit/s.
  G.725 System aspects for the use of the 7 kHz audio codec within 64 kbit/s.
  G.726 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM); Annex A: Extensions of Recommendation G.726 for use with uniform-quantized input and output.
  G.727 5-, 4-, 3- and 2-bit/sample embedded adaptive differential pulse code modulation (ADPCM).
  G.728 Coding of speech at 16 kbit/s using low-delay code excited linear prediction; Annex G: 16 kbit/s fixed point specification.
  H.221 Frame structure for a 64 to 1920 kbit/s channel in audiovisual teleservices.
  H.242 System for establishing communication between audiovisual terminals using digital channels up to 2 Mbit/s.
  H.261 Video codec for audiovisual services at p x 64 kbit/s.
  H.320 Narrow-band visual telephone systems and terminal equipment.
  T.80 Common components for image compression and communication - basic principles.
  X.400 Message handling system and service overview (same as F.400).

• Proprietary Standards:
  Bento Sponsored by Apple Inc for multimedia data storage.
  GIF Compuserve Inc graphic interchange file format.
  QuickTime Digital video replay on the Macintosh.
  RIFF Microsoft and IBM multimedia file format.
  DVI Intel's digital video.
  MIDI Musical instrument digital interface.

• International Organization for Standards:
  HyTime Hypermedia time based structuring language.
  IIF Image interchange format.
  JBIG Lossless compression for black and white images.
  JPEG Lossy compression for continuous tone, natural scene images.
  MHEG Multimedia and hypermedia information coding.
  MPEG Digital video compression techniques.
  ODA Open document architecture.

See also International Telecommunication Union, International Organisation for Standards, Standards.

Multiply Accumulate (MAC): The operation of multiplying two numbers and adding the result to another value, i.e. ((a × b) + c). Many DSP processors can perform (on average) one MAC in one instruction cycle. Therefore if a DSP processor has a clock speed of 20 MHz, then it can perform a peak rate of 20,000,000 multiply-accumulates per second. See also DSP Processor, Parallel Adder, Parallel Multiplier.

Multiprocessing: Using more than one DSP processor to solve a particular problem. The TMS320C40 has six I/O ports to communicate with other TMS320C40s with independent DMA. The term multiprocessing is sometimes used interchangeably with parallel processing.
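The MAC operation defined above is the core of FIR filtering; a minimal sketch (illustrative function names, not any particular processor's instruction set):

```python
# Sketch: an FIR filter inner product expressed as repeated
# multiply-accumulate (MAC) operations, one MAC per filter weight.

def mac(a, b, c):
    """One multiply-accumulate: (a * b) + c."""
    return a * b + c

def fir_output(weights, samples):
    """Dot product of filter weights and recent input samples via MACs."""
    acc = 0
    for w, x in zip(weights, samples):
        acc = mac(w, x, acc)          # accumulate w*x into the running sum
    return acc

print(fir_output([1, 2, 3], [4, 5, 6]))   # 1*4 + 2*5 + 3*6 = 32
```

An N weight FIR filter therefore needs N MACs per output sample, which is why a single-cycle MAC is the defining feature of a DSP processor.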
Multipulse Excited Linear Predictive Coding (MLPC): MLPC is an extension of LPC for speech compression that goes some way to overcoming the false synthesized sound of LPC speech.

Multipurpose Internet Mail Extensions (MIME): MIME is a proposed standard from the Internet Architecture Board and supports several predefined types of non-text (non-ASCII) message contents, such as 8 bit 8 kHz sampled µ-law encoded audio, GIF image files, and postscript, as well as other forms of user definable types. See also Standards.

Multirate: A DSP system which performs computations on signals at more than one sampling rate, usually to achieve a more efficient computational schedule. The important steps in a multirate system are decimation (reducing the sampling rate) and interpolation (increasing the sampling rate). Sub-band systems can be described as multirate. See also Decimation, Interpolation, Upsampling, Downsampling, Fractional Sampling Rate Conversion.

µ-law: Speech signals, for example, have a very wide dynamic range: harsh "oh" and "b" type sounds have a large amplitude, whereas softer sounds such as "sh" have small amplitudes. If a uniform quantization scheme were used then although the loud sounds would be represented adequately, the quieter sounds may fall below the threshold of the LSB and therefore be quantized to zero and the information lost. Therefore companding quantizers are used such that the quantization level at low input levels is much smaller than for higher level signals. Two schemes are widely in use: the µ-law in the USA and the A-law in Europe. The expression for µ-law compression is given by:

    y(x) = ln(1 + µx) / ln(1 + µ)    (430)

with y(x) being the compressed output for input x, and the function being negative symmetric around x = 0. A typical value of µ is 255. See also A-Law.

Music: Music is a collection of sounds arranged in an order that sounds cohesive and regular.
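The µ-law characteristic of Eq. 430 above can be sketched in a few lines (a minimal illustration with µ = 255; the negative symmetry about x = 0 is applied with copysign, and the function name is ours):

```python
import math

# Sketch: µ-law compression per Eq. 430, y(x) = ln(1 + µ|x|)/ln(1 + µ),
# extended with odd symmetry about x = 0. µ = 255 is the typical value.

def mu_law_compress(x, mu=255.0):
    """Compress x in [-1, 1] to y in [-1, 1] using the µ-law curve."""
    return math.copysign(math.log(1 + mu * abs(x)) / math.log(1 + mu), x)

# Small inputs are boosted well above a uniform (linear) characteristic,
# giving low-level signals more of the quantizer's resolution:
print(round(mu_law_compress(0.01), 3))   # 0.228
print(round(mu_law_compress(1.0), 3))    # 1.0
```

Following such a compressor with a uniform quantizer yields the small step sizes at low levels described in the A-law Compander entry.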
Most importantly, the sound of music is pleasant to listen to. Music has two main elements: a quasi-periodic set of musical notes and a percussive set of regular timing beats. Each musical note or discrete sound in music is characterized by a fundamental frequency and a rich set of harmonics, whereas the percussion sounds are more random (although distinctive) in nature [13], [14]. Many different ordered music scales (sets of constituent notes) exist. The most familiar is the 12 notes in an octave of the Western music scale on which most modern and classical music is played. The fundamental frequency of each note on the Western music scale can be related to the fundamental frequency of all other notes by a simple ratio. The same musical notes on different musical instruments are characterized by the harmonic content and the volume envelope. The following figure shows the characteristic waveform for a sampled 0.03 second segment of a C4 note played on a trumpet, guitar, violin and piano:

[Figure: digitally sampled time waveforms representing the variation in sound pressure level of 0.03 second segments of a C4 note (fundamental frequency of 261.6 Hz on the Western music scale) played on a trumpet, guitar, violin and piano. The samples were taken from the full notes shown in the figures below.]

Clearly, although all of the instruments have a similar fundamental frequency, the varying harmonic content gives them completely different appearances in the time domain. The volume envelope of
The volume envelope of 281 3 Trumpet, C3 1 0 -1 -2 -3 0.5 1 3 1.5 1 0 -1 -2 2 2.5 time/seconds Violin, C3 1 0 -1 -2 -3 0.5 1 3 1.5 2 2.5 time/seconds Piano, C3 2 Amplitude, p(k) 2 0 x 104 x 104 Guitar, C3 -3 0 Amplitude, v(k) 3 2 Amplitude, g(k) Amplitude, t(k) 2 x 104 x 104 a musical note also contributes to the characteristic sound, as shown in the following figure (from which the above 0.03 time segments were in fact taken): 1 0 -1 -2 -3 0 0.5 1 1.5 2 2.5 time/seconds 0 0.5 1 1.5 2 2.5 time/seconds Time waveforms showing the sound pressure level volume envelope of a C3 note (fundamental frequency of 261.6Hz on the Western music scale) played on a trumpet, guitar, violin and piano. The amplitude envelope of the different musical instruments can be clearly seen. DSPedia 282 Trumpet, C3 -10 -30 -40 -50 x 104 (dB) Magnitude, G(f) Magnitude, T(f) -20 x 104 (dB) 0 0 2 0 4 6 Violin, C3 -10 Guitar, C3 -10 -20 -30 -40 0 2 0 4 6 8 10 frequency / kHz Piano, C3 -10 -20 Magnitude, P(f) Magnitude, V(f) -20 -30 -40 -50 0 -50 8 10 frequency / kHz x 104 (dB) x 104 (dB) To see the harmonic content of each of the four musical instruments we can perform a 2048 point FFT on a representative portion of the waveform resulting in the following frequency domain plots: 0 2 4 6 8 10 frequency / kHz -30 -40 -50 0 2 4 6 8 10 frequency / kHz Frequency spectra of a C4 note (fundamental frequency of 261.6Hz on the Western music scale) for a trumpet, guitar, violin and piano. The spectra were generated from a 0.05 second segment of the note. Musical instruments are carefully designed to give them flexible tuning capabilities and, where possible, good natural frequency resonating. For example violins can be designed such that significant frequencies (such as A4, of fundamental frequency 440Hz) corresponds to the resonance of the lower body of the instrument which as a result will enhance the sound, and also the feeling and tactile feedback to the violinist [14]. 
Clearly the subtleties of the generation and analysis of music are very complex, although the appreciation of music is very simple! There are many other music scales, such as the 22 note Hindu scale, and many other different Asian scales. This perhaps explains why someone who has never experienced Chinese music may, on hearing it for the first time, perceive it as off key and dissonant: it contains various notes that are just not present in the familiar Western music scale. Another example of an instrument that does not quite play to the Western music scale is the Scottish bagpipes. The high notes on the chanter are not in fact a full octave (frequency ratio of 2:1) above the analogous lower notes. Hence the bagpipes can sound a little flat at the high notes. However, if the bagpipes were the sound to which we had become accustomed, then anything else might not sound right! Music synthesis is now largely achieved using digital synthesizers that use a variety of DSP techniques to produce an output. See also Digital Audio, Percussion, Music Synthesis, Sound Pressure Level, Western Music Scale.

Music Synthesis: Most modern synthesizers use digital techniques to produce simulated musical instruments. Most synthesis requires setting up the fundamental frequency components with appropriate relative harmonic content and a suitable volume profile. A good overview of this area can be found in [14], [32]. See also Attack-Decay-Sustain-Release, Granular Synthesis, LA Synthesis, Music.

N

n: "n" (along with "k" and "i") is often used as a discrete time index in DSP notation. See Discrete Time.

Narrowband: Signals are defined as narrowband if the fractional bandwidth of the signals is small, say <10%. See also Fractional Bandwidth, Wideband.

Nasals: One of the elementary sounds of speech, the others being plosives, fricatives, sibilant fricatives, and semi-vowels.
Nasals are formed by lowering the soft palate of the mouth, so blocking the mouth and forcing the air stream to pass out via the nose, as in the letter "m". See also Fricatives, Plosives, Semi-vowels, Sibilant Fricatives.

Natural Frequency: See Resonant Frequency.

Near End Echo: Signal echo that is produced by components in local telephone equipment. Near end echo arrives before far end echo. See also Echo Cancellation, Far End Echo.

Neper: The neper is a logarithmic measure used to express the attenuation or amplification of voltage or current where the natural logarithm (base e = 2.71828...) is used rather than the more usual base 10 logarithm:

Neper (Np) = ln(Vout / Vin)   (431)

A decineper is calculated by multiplying the neper quantity by 10 (rather than 20 as would be used for decibels):

Decineper (dNp) = 10 ln(Vout / Vin)   (432)

To convert from nepers to decibels simply multiply by 20 log10(e) = 8.686... . The neper should not be confused with the Scottish word for turnips (or swedes), which is the neep. Traditionally neeps are eaten on 25th January each year to celebrate the birthday of Robert Burns, the Scottish poet who popularized Auld Lang Syne as well as many other of his own songs and poems. Neeps can of course be eaten at other times of the year. There is no known means by which neeps can be converted to decibels.

Neural Networks: Over the last few years the non-linear processing techniques known as neural networks have been used to solve a wide variety of DSP related problems such as speech recognition and image recognition [18], [112], [24]. The simplest forms of neural network can be directly related to the adaptive LMS filter; however, even these simple multi-layer networks have very high computational loads. The name derives from the similarity of the computational model to a simplified model of the nervous system in animals.
The applications and implementation of neural networks in DSP are set to grow in the next few years.

Newton LMS: See Least Mean Squares Algorithm Variants.

Noise: An unwanted component of a signal which interferes with the signal of interest. Most signals are contaminated by some form of noise, either present before sensing, or actually induced by the process of sensing the signal (conversion to electrical form) or the sampling process (quantization noise). Computations on a DSP processor can also induce various forms of arithmetic noise (round-off noise). Most DSP algorithms assume that noise sources can be well modelled as additive, i.e., the noise is added to the signal of interest. See also Round-Off Noise, Truncation, White Noise, Additive White Gaussian Noise.

[Figure: A sine wave corrupted by additive noise, shown as three time waveforms: the sine wave, the noise, and the sine wave plus noise.]

Noise Cancellation: Using adaptive signal processing techniques, noise cancellation can be used to remove noise from a signal of interest in situations where a correlated reference of the noise signal is available:

[Figure: Generic adaptive signal processing noise canceller. The reference n'(k) drives an adaptive filter whose output is subtracted from the primary input d(k) = s(k) + n(k) to give the error e(k), which drives the adaptive algorithm. Signal s(k) is uncorrelated with n(k) or n'(k); however, n(k) and n'(k) are correlated.]

Noise cancellation techniques are found in biomedical applications where, for example, it is required to remove periodic mains hum noise from an ECG waveform:

[Figure: Adaptive noise cancellation of an ECG signal corrupted by mains hum.]

[Figure: Adaptive noise cancellation of a speech signal corrupted by noise, giving e(k) ≈ s(k). The reference microphone picks up the noise only, whereas the primary microphone picks up both noise and speech.]
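A minimal software sketch of the noise canceller in the figures, using the LMS algorithm; the correlating channel, the filter length L = 8 and the step size mu = 0.01 below are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20000

s = np.sin(2 * np.pi * 0.01 * np.arange(N))     # signal of interest s(k)
n_ref = rng.standard_normal(N)                  # reference noise n'(k)
n = np.convolve(n_ref, [0.8, -0.4, 0.2])[:N]    # correlated noise n(k) at the primary input
d = s + n                                       # primary input d(k) = s(k) + n(k)

L, mu = 8, 0.01                                 # filter length and step size (illustrative)
w = np.zeros(L)
e = np.zeros(N)
for k in range(L, N):
    x = n_ref[k - L + 1:k + 1][::-1]            # most recent L reference samples
    y = w @ x                                   # adaptive filter output: estimate of n(k)
    e[k] = d[k] - y                             # error e(k), which converges towards s(k)
    w += 2 * mu * e[k] * x                      # LMS weight update

# After convergence the residual noise power in e(k) - s(k) is small
print(np.mean((e[-2000:] - s[-2000:]) ** 2))
```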
Note that if the reference microphone also picks up speech then the adaptive noise canceller will try to also cancel the speech signal. (This is clearly not the desired effect!) See also Active Noise Control, Adaptive Line Enhancer, Adaptive Filter, Echo Cancellation, Least Mean Squares Algorithm, Recursive Least Squares.

Noise Control: See Active Noise Control, Noise Cancellation.

Noise Dosemeter: For persons subjected to noise at the workplace, a noise dosemeter or sound exposure meter can be worn which will average the "total" sound they are exposed to in a day. The measurements can then be compared with national safety standards [46].

Noise Shaping: A technique used in audio signal processing and sigma delta analog to digital converters where quantisation noise is high pass filtered out of the baseband. See also Oversampling, Sigma Delta.

Noncausal: See Causal.

Noncoherent: See Coherent.

Nonlinear: Not linear. See also Linear System, Non-linear System.

Non-linear System: A non-linear system is one that does not satisfy the linearity criteria, whereby if:

y1(n) = f[x1(n)]
y2(n) = f[x2(n)]   (433)

then for a linear system:

a1 y1(n) + a2 y2(n) = f[a1 x1(n) + a2 x2(n)]   (434)

For example, the system y(n) = 1.2x(n) + 3.4(x(n))^2 is nonlinear as it does not satisfy the above linearity criteria. Any system which introduces harmonic distortion or signal clipping is nonlinear. Non-linear systems can be extremely difficult to analyse both mathematically and practically. Nonlinear components that are relatively small in magnitude are often ignored in the analysis and simulation of systems. A simple way to test the linearity of a system is to input a single sine wave, vary its frequency over the bandwidth of interest, and observe the output signal. If the output contains any sine wave components other than at the frequency of the input sine wave then it is a nonlinear system.
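The single sine wave linearity test just described can be sketched as follows, using the example system y(n) = 1.2x(n) + 3.4(x(n))^2 from above; the test frequency and signal length are arbitrary choices:

```python
import numpy as np

fs, f0, N = 1000, 50, 1000                # 50 Hz test tone, exactly 50 cycles in N samples
x = np.sin(2 * np.pi * f0 * np.arange(N) / fs)

def system(x):
    # The example nonlinear system from the text
    return 1.2 * x + 3.4 * x ** 2

# FFT bin spacing is fs/N = 1 Hz, so the tone falls exactly in bin 50
Y = np.abs(np.fft.rfft(system(x))) / N
tone = Y[50]                              # component at f0 (amplitude 1.2/2 = 0.6)
harmonic = Y[100]                         # component at 2*f0, revealing the nonlinearity
print(round(tone, 2), round(harmonic, 2))
```

A linear system would leave bin 100 at (numerically) zero; here the sin² term contributes a DC shift and a cosine at 2f0 of amplitude 1.7, so bin 100 holds 0.85.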
The most common form of nonlinearity is called harmonic distortion. See also Distortion, Linear System, Total Harmonic Distortion, Volterra Filter.

[Figure: A simple nonlinear system y(t) = x(t) + (1/2)[x(t)]^2 with input x(t) = sin(2πf0t). The output y(t) = sin(2πf0t) + 1/4 − (1/4)cos(2π(2f0)t) contains a nonlinear component at 2f0, visible in the spectra |X(f)| and |Y(f)|.]

Non-negative Definite Matrix: See Matrix Properties - Positive Definite.

Non-Return to Zero (NRZ): When a stream of binary data is to be sent serially, such as in the transmission of PCM, the data can be sent as (half binary) return to zero (RZ), or (full binary) non-return to zero (NRZ). With RZ data streams, after a 1 has been sent the output waveform returns back to 0, whereas with NRZ the output remains at 1 for the duration of the bit period. The waveform assumed below is polar. See also Bipolar (2), Polar.

[Figure: The same sequence of bits, 1011110, transmitted as RZ and NRZ.]

Non-Simultaneous Masking: See Temporal Masking.

Nonsingular Matrix: See Matrix Properties - Nonsingular.

Non-Volatile: Semiconductor memory that does not lose information when the power is removed is called non-volatile. ROM is an example of non-volatile memory. Non-volatile RAM is also available.

Norm: See Vector Properties and Definitions - Norm.

1-norm: See Matrix Properties - 1-norm.

2-norm: See Matrix Properties - 2-norm.

2-norm of a Vector: See Vector Properties and Definitions - 2-norm.

Normal Equations: In least squares error analysis the normal equations are given by:

A^T A x_LS = A^T b   (435)

given the overdetermined system of equations:

Ax = b   (436)

where A is a known m × n matrix of rank n and with m > n, b is a known m element vector, and x is an unknown n element vector. See also Least Squares, Overdetermined System, Underdetermined System.

Normalised Step Size LMS: See Least Mean Squares Algorithm Variants, Step Size Parameter.
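Solving the normal equations of Eq. 435 above numerically can be sketched as below; the 3 × 2 system is purely illustrative:

```python
import numpy as np

# Overdetermined system Ax = b: m = 3 equations, n = 2 unknowns, rank(A) = 2
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.1, 1.9, 3.1])

# Least squares solution from the normal equations (A^T A) x_LS = A^T b
x_ls = np.linalg.solve(A.T @ A, A.T @ b)

# Cross-check against the library least squares routine
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_ls, x_ref))
```

Note that forming A^T A squares the condition number of the problem, which is one reason QR-based solvers are preferred for ill-conditioned systems (see Numerical Properties).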
Notch Filter: A notch filter, H(z), removes signal components in a very narrow band of frequencies:

[Figure: Gain response 20 log|H(f)| of a notch filter, showing a sharp dip; a notch filter removes a very narrow band of frequencies.]

Notch filters can be designed using standard filter design techniques for band-stop filters. One form of notch filter can be designed using an all-pass IIR digital filter of the form:

H_A(z) = (r^2 − 2r cosθ z^−1 + z^−2) / (1 − 2r cosθ z^−1 + r^2 z^−2)   (437)

in the configuration:

H(z) = Y(z)/X(z) = (1/2)(1 + H_A(z))

[Figure: Notch filter formed by adding the input x(k) to the output of the all-pass filter HA(z) and scaling by 0.5.]

The parameters cosθ and r are used to set the notch frequency and the bandwidth of the notch. The notch frequency, f_n, can be calculated from:

cos(2πf_n / f_s) = 2r cosθ / (1 + r^2)   (438)

which is obtained from Eq. 437 by noting the frequency at which the phase shift of the output of the all-pass filter is −π radians (see below). The above notch filter can be drawn more explicitly as a signal flow graph (SFG). Denoting the all-pass output by w(k), the filter computes:

w(k) = r^2 x(k) − 2r cosθ x(k−1) + x(k−2) + 2r cosθ w(k−1) − r^2 w(k−2)
y(k) = 0.5 [x(k) + w(k)]

[Figure: Signal flow graph for a notch filter based on an all-pass filter.]

In order to appreciate the notch filtering attribute of this filter, note that the all-pass filter H_A(z) has a phase response of the form:

[Figure: Typical (negative sigmoidal) phase response of the all-pass filter HA(z), falling from 0 to −2π radians over the range 0 to fs/2 and passing through −π at fn. The actual transition point through −π radians and the slopes of the curve are determined by setting the parameters r and cosθ.]

Therefore when the input signal is at the frequency f_n, the phase of the output signal of the all-pass filter is exactly −π.
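This can be checked numerically by evaluating Eq. 437 on the unit circle; the sketch below uses the design values fn = 1250 Hz, fs = 10000 Hz and r = 0.8 from the example that follows:

```python
import numpy as np

fs, fn, r = 10000.0, 1250.0, 0.8
# Rearranging Eq. 438: cos(theta) = (1 + r^2)/(2r) * cos(2*pi*fn/fs)
cos_theta = (1 + r ** 2) / (2 * r) * np.cos(2 * np.pi * fn / fs)

def H_A(z):
    # Second order all-pass section of Eq. 437
    num = r ** 2 - 2 * r * cos_theta * z ** -1 + z ** -2
    den = 1 - 2 * r * cos_theta * z ** -1 + r ** 2 * z ** -2
    return num / den

z_n = np.exp(2j * np.pi * fn / fs)        # point on the unit circle at fn
print(abs(np.angle(H_A(z_n))))            # pi: the all-pass phase is -pi (i.e. ±pi) at fn
print(abs(0.5 * (1 + H_A(z_n))))          # ~0: the gain of the notch filter at fn
```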
When added to the input signal x(k), the output y(k) is zero:

[Figure: When the output of the all-pass filter has a phase shift of −π radians for an input sine wave of fn Hz, the all-pass output cancels the input in the summation and the notch filter output y(k) is zero.]

As examples, using Eq. 438 we can design two notch filters with a notch frequency of f_n = 1250 Hz for a sampling rate of f_s = 10000 Hz. The first design has r = 0.8 and the second design has r = 0.99, thus giving different notch bandwidths. Setting r close to 1 is equivalent to putting the poles and zeroes of the all-pass filter very close to the unit circle.

[Figure: Gain and phase responses of a notch filter at fn = 1250 Hz with r = 0.8 and cosθ = (1.64/1.6) cos(π/4).]

[Figure: Gain and phase responses of a notch filter at fn = 1250 Hz with r = 0.99 and cosθ = (1.9801/1.98) cos(π/4). The notch bandwidth is smaller than that of the design with r = 0.8. Note that the phase shift is very small at frequencies other than those near the notch frequency.]

If a notch filter is to be used to remove a "single" frequency, then adaptive noise cancellation can often be a suitable alternative if a suitable correlated noise source is available. See also Adaptive Signal Processing, All-pass Filter, Digital Filter, Infinite Impulse Response Filter.

Noy: The noy is a measurement of noisiness similar in its measurement to a phon.
It is defined as the sound pressure level (SPL) of a band of noise from 910 Hz to 1090 Hz that subjectively sounds as noisy as the sound under consideration [46]. See also Equal Loudness Contours, Frequency Range of Hearing, Phons, Sound Pressure Level.

Null Space: See Vector Properties - Null Space.

Numerical Integrity: Instability in a DSP system can either be (1) a function of feedback causing large unbounded outputs, or (2) the result of very large numbers being divided by very small numbers, or vice versa. Instability of type (2) can cause a loss of numerical integrity when a result is smaller than the smallest number, or larger than the largest number, that can be represented on the DSP processor being used. In the case of a number that is too small, the result will likely be returned as zero. However, if this number is then used as a divisor, the result is a divide by zero error, which will cause the algorithm to stop, or to become unstable by generating a maximum amplitude quotient. As an example, consider a particular processor that works to 3 decimal places of precision, on which the following matrix algorithm is to be implemented:

C = [A^−1 + B]^−1   (439)

where:

A = | 1000  0 |     B = | 0  0 |
    |    0  1 |         | 0  1 |   (440)

Solving the problem on a processor with 3 decimal places of precision is straightforward and gives:

C = | 0.001  0 |^−1  =  | 1000  0   |
    | 0      2 |        |    0  0.5 |   (441)

However, if the same problem is solved on a processor with only 2 decimal places of precision, then A^−1 rounds to zero in the top left element, and:

C = | 0  0 |^−1  =  a non-invertible matrix
    | 0  2 |   (442)

and the algorithm breaks down. See also Ill-Conditioned.

Numerical Properties: The ability of a DSP algorithm to produce intermediate results that are within the wordlength of the processor being used indicates that the particular algorithm has good numerical properties.
If, for example, a particular DSP algorithm running on a 32 bit floating point DSP processor produces intermediate values that require more precision than 32 bit floating point, then clearly the final result will be in error by some margin. Therefore it is always desirable to use algorithms with good numerical properties. In linear algebra, for solving a linear set of equations the QR algorithm is recognised as having good numerical properties, whereas Gaussian elimination has very poor numerical properties. See also Round-Off Noise.

Numerical Stability: See Numerical Integrity.

Nyquist: The Nyquist rate is the minimum frequency at which an analog signal must be sampled in order that no information is lost (assuming the sampling process is perfect). Mathematically, it can be shown that the sampling frequency must be greater than twice the highest frequency component of the signal being sampled in order to preserve all information [10]. In practical terms, real-world signals are never exactly bandlimited; however, the energy that gets aliased is kept small in properly designed DSP systems. See also Aliasing.

O

Octave: An octave refers to the interval between two frequencies where one frequency is double the other. For example, from 125 Hz to 250 Hz is an octave, from 250 Hz to 500 Hz is an octave, and so on. It may seem strange that octave derives from the Latin prefix "oct", which means eight; however, this relates to the Western music scale, whereby an octave is a set of eight musical notes (of increasing frequency), and where the first note has half of the frequency of the last note. See also Decade, Logarithmic Frequency, Roll-off, Western Music Scale.

Odd Function: The graph of an odd function has point symmetry about the origin such that y = f(x) = −f(−x). For example, both the functions y = sin x and y = x^3 are odd functions. In contrast, an even function is symmetric about the y-axis such that y = f(x) = f(−x).
See also Even Function.

[Figure: Graphs of the odd functions y = sin x and y = x^3, each with point symmetry about the origin.]

Off-Line Processing: If recorded data is available on a hard disk and it is only required to process this data and then store it back to disk, then the computation is not time limited and is referred to as off-line processing. If, on the other hand, an output must be generated as fast as the input is received from a real world sensor, then this is real-time processing. See also Real Time Processing.

Offset Keyed Phase Shift Keying (OPSK or OKPSK): See Offset Keying.

Offset Keyed Quadrature Amplitude Modulation (OQAM or OKQAM): See Offset Keying.

Offset Keying: A modulation technique used with quadrature signals (i.e., those signals that can be described in terms of in-phase and quadrature, or cosine and sine, components). In offset keying, symbol transitions for the quadrature component are delayed one half of a symbol period from those for the in-phase component.

OnCE: Motorola on-chip emulator that allows easy debugging of the DSP56000 family of processors.

On-chip Memory: Most DSP processors (DSP56/96 series, TMS320, DSP16/32, ADSP 2100 etc.) have a few thousand words of on-chip memory which can be used for storing short programs and, significantly, data. The advantage of on-chip memory is that it is faster to access than off-chip memory. For DSP applications such as an FIR filter, where very high speed is essential, on-chip memory is very important. See also DSP Processor, Cache.

On-line Processing: See Real Time Processing.

Operational Amplifier (or Op-Amp): An integrated circuit differential amplifier that has a very high open-loop gain (of the order of 100000), a high input impedance (MΩ), and a low output impedance (100Ω) over a relatively small bandwidth. By introducing negative feedback around the amplifier, gain ratios of 1-1000 over a wide bandwidth can be set up. Op-amps are very widely used for many forms of signal conditioning in DSP audio, medical, and telecommunication applications.
[Figure: Schematic icon for an op-amp.]

Oppenheim and Schafer: Alan Oppenheim and Ronald Schafer are the authors of the definitive 1975 text Digital Signal Processing published by Prentice Hall. It remains a very relevant reference for DSP students and professionals, although many other excellent texts have since been published.

Order of a Digital Filter: See Digital Filter Order.

Order Reversed Filter: See Finite Impulse Response.

Orthogonal Matrix: See Matrix Properties - Orthogonal.

Orthonormal Matrix: See Matrix Properties - Orthogonal.

Orthogonal Vector: See Vector Properties and Definitions - Orthogonal.

Orthonormal Vector: See Vector Properties and Definitions - Orthonormal.

Otoacoustic Emissions: Sounds that are emitted spontaneously from the ear canal. Measurements of these emissions are used to diagnose hearing loss and other pathologies within the ear. The emissions are induced by stimulating the ear and then measured by recording the response produced after the stimulus.

Outer Product: See Vector Properties and Definitions - Outer Product.

Overdetermined System of Equations: See Matrix Properties - Overdetermined System of Equations.

Oversampling: If a signal is sampled at a much higher rate than the Nyquist rate, then it is oversampled. Oversampling can bring two benefits: (1) a reduction in the complexity of the analog anti-alias filter; and (2) an increase in the resolution achievable from an N-bit ADC or DAC. As an example of oversampling for reducing the complexity of the analog anti-alias filter, consider a particular digital audio system in which the sampling rate is 48 kHz.
The Nyquist criterion is satisfied by attenuating all frequencies above 24 kHz that may be output by certain musical instruments (or interfering electronic equipment) by at least 96 dB (equivalent to a 16 bit dynamic range). If it is decided that the low pass filter will cut off at 18 kHz, and if 96 dB attenuation is required at 24 kHz, then the filter requires a roll-off of 240 dB/octave, as shown in the following figure:

[Figure: For this audio application, sampling at 48 kHz requires that the anti-alias filter has a sharp cut-off at 18 kHz (a roll-off of 240 dB/octave) to attenuate by 96 dB at 24 kHz. For a system that oversamples by a factor of 4, i.e. at 192 kHz, the anti-alias analogue filter has a reduced roll-off specification (48 dB/octave), as only aliasing frequencies above 96 kHz must be removed to avoid baseband aliasing; a digital low pass filter can thereafter filter off the frequencies between 18 and 24 kHz prior to downsampling by 4.]

Clearly the 240 dB/octave specification corresponds to a 40th order filter, which is somewhat difficult to design reliably in analogue circuitry! (Please note the figures used here are for example purposes only and do not necessarily reflect actual digital systems.) However, if we oversample the music signal by 4 x's, i.e. at 4 × 48 kHz = 192 kHz, then an analog anti-alias filter is required with a roll-off of only 48 dB/octave, starting at 18 kHz and providing more than 96 dB attenuation at half of the oversampled rate of 96 kHz, as also shown in the above figure.
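The roll-off figures quoted above follow from the required attenuation divided by the number of octaves between the cut-off and the first alias frequency; a quick numerical check (the quoted 240 and 48 dB/octave are these values rounded up to a whole filter order at 6 dB/octave per order):

```python
import numpy as np

def rolloff_db_per_octave(att_db, f_pass, f_stop):
    # Required attenuation divided by the number of octaves available
    return att_db / np.log2(f_stop / f_pass)

r1 = rolloff_db_per_octave(96.0, 18e3, 24e3)   # Nyquist rate case: 18 kHz to 24 kHz
r2 = rolloff_db_per_octave(96.0, 18e3, 96e3)   # 4 x's oversampled case: 18 kHz to 96 kHz
print(round(r1), round(r2))                    # roughly 231 and 40 dB/octave
```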
(In actual fact the roll-off could be even lower, as it is very unlikely there will be any significant frequency components above 30 kHz in the original analogue music.) If an oversampled digital audio signal is input to a DSP processor, the processing must clearly run at the oversampled rate. This requires R x's the computation of the Nyquist rate counterpart (i.e. the impulse response length of all digital filters is increased by a factor of R), performed at a sampling frequency R x's higher. Hence the DSP processor may need to be R x's faster to do the same useful processing as the baseband sampled system: clearly undesirable, and a considerable disadvantage compared to the Nyquist rate system. Therefore the oversampled signal is decimated back to the Nyquist rate, first by digital low pass filtering, then by downsampling. In the above example, any frequencies between 18 and 96 kHz can be removed with a digital low pass filter prior to downsampling by a factor of 4; the complexity of the analogue low pass anti-alias filter has thus been reduced by effectively adding a digital low pass stage of anti-alias filtering. In general, for an R x's oversampled signal the only portion of interest is the baseband extending from 0 to f_n/2 Hz, where f_n is the Nyquist rate and f_s = R f_n. To reduce the processing rate to the baseband rate, the oversampled signal is first digitally low pass filtered to f_n/2 using a digital filter with a sharp cut-off. The resulting signal is then bandlimited to f_n/2 and can be downsampled by retaining only every R-th sample. This process has therefore reduced the specification of the analog anti-alias filter by introducing what is effectively a digital anti-alias filter.
The design tradeoff is the cost of the sharp cut-off digital low pass (decimation) filter versus the cost of the sharp cut-off analogue anti-alias filter.

As well as reducing cost, oversampling can be used to increase the resolution of an ADC or DAC. For example, if an ADC has a quantization level of q volts, the in-band quantization noise power can be calculated as:

Q_N = 2 q^2 f_B / (12 f_s)   (443)

[Figure: Power spectrum showing the baseband signal of interest from 0 to fB, and the total quantization noise power (q^2/12) spread uniformly from 0 to fs/2; the in-band portion is QN as given by Eq. 443.]

Therefore, in order to increase the baseband signal to quantisation noise ratio, we can either increase the number of bits in the ADC, or increase the sampling rate f_s a number of factors above the Nyquist rate. From the above figure it can be seen that oversampling a signal at 4 x's the Nyquist rate reduces the in-band quantization noise (assumed to be a flat spectrum between 0 Hz and f_s/2 Hz) to 1/4. This noise power is equivalent to that of an ADC with step size q/2, and hence the baseband signal resolution has been increased by 1 bit [8]. In theory, therefore, if a single bit ADC were oversampled by a factor of 4^15 (≈ 10^9) x's the Nyquist rate, then a 16 bit resolution signal could be realized! Clearly this sampling rate is not practically realisable. However, at a more intuitively useful level, if an 8 bit ADC were used to oversample a signal at 16 x's the Nyquist rate, then when using a digital low pass filter to decimate the signal to the Nyquist rate, approximately 10 bits of meaningful resolution could be retained at the digital filter output. See also Decimation, Noise Shaping, Quantisation Error, Sigma Delta, Upsampling, Undersampling.

P

P*64: Another name for the H.261 image compression/decompression standard.

Packet: A group of binary digits, including data and call control signals, that is switched by a telecommunications network as a composite whole.
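Returning briefly to oversampling: the resolution gain predicted by Eq. 443 above can be checked with a few lines of arithmetic (6.02 dB per bit is the standard quantisation SNR figure):

```python
import numpy as np

def extra_bits(R):
    # An R x's oversampled flat quantisation noise spectrum leaves 1/R of the
    # noise power in band (Eq. 443); each 6.02 dB of noise reduction is one bit.
    return 10 * np.log10(R) / 6.02

print(round(extra_bits(4)))        # 4 x's oversampling buys ~1 extra bit
print(round(extra_bits(16)))       # 16 x's buys ~2 bits: an 8 bit ADC yields ~10 bits
print(round(extra_bits(4 ** 15)))  # 4^15 buys ~15 bits: a 1 bit ADC yields ~16 bits
```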
Parallel Adder: The parallel adder is composed of N full adders and is capable of adding two N bit binary numbers to give an N+1 bit result. A four bit parallel adder is a linear array of four full adder (FA) logic circuits, with the carry output of each stage feeding the carry input of the next:

[Figure: Four bit binary addition performed using a linear array of full adder logic circuits. The general four bit addition is s4 s3 s2 s1 s0 = a3 a2 a1 a0 + b3 b2 b1 b0; for example 1101 (13) + 1011 (11) = 11000 (24).]

For an N bit addition, N full adders are required. Because the carry ripples from the LSB to the MSB (right to left), this structure is often called a ripple adder. The latency of the adder is calculated by finding the longest path through the adder. The above example is for simple unsigned arithmetic; however, the parallel adder can easily be converted to perform 2's complement arithmetic [20]. In general, inside a DSP processor the parallel adder will be integrated with the parallel multiplier and arithmetic logic unit, thereby allowing single cycle adds and single cycle multiply-add operations. See also Arithmetic Logic Unit, Full Adder, Parallel Multiplier, DSP Processor.

Parallel Multiplier: The key arithmetic element in all DSP processors is the parallel multiplier, which is essentially a digital logic circuit that allows single clock cycle multiplication of N bit binary numbers, where N is the wordlength of the processor. Consider the multiplication of two unsigned 4 bit numbers:

[Figure: Binary multiplication performed using the same partial product formation as used for decimal multiplication; for example 1101 (13) × 1011 (11) = 10001111 (143).]

This calculation can then be easily mapped onto an array of full adders, with single bit multiplication performed by a simple AND gate.
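The partial product scheme just described (single bit products formed by AND gates, summed with full adders) can be modelled behaviourally in a few lines; this is a software sketch of the logic, not a gate-level design:

```python
def full_adder(a, b, cin):
    # One cell: sum bit and carry-out bit of a 1 bit full adder
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_add(a_bits, b_bits):
    # LSB-first bit lists; the carry ripples from LSB to MSB (a ripple adder)
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

def multiply(a_bits, b_bits):
    # Form one shifted partial product per multiplier bit (an AND of each
    # bit pair) and sum the partial products with ripple adders
    n = len(a_bits) + len(b_bits)
    acc = [0] * n
    for shift, b in enumerate(b_bits):
        pp = [0] * shift + [a & b for a in a_bits]   # shifted partial product
        pp += [0] * (n - len(pp))
        acc = ripple_add(acc, pp)[:n]                # accumulate at fixed width
    return acc

# The worked example from the text: 1101 (13) x 1011 (11) = 10001111 (143)
p = multiply([1, 0, 1, 1], [1, 1, 0, 1])             # operands LSB first
print(p[::-1])                                       # MSB first: [1, 0, 0, 0, 1, 1, 1, 1]
```

A hardware array multiplier computes all of these cells in combinational logic within one clock cycle; the loop here is only a behavioural model of that array.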
(In practice, 2's complement multiplication is required in DSP calculations to represent both positive and negative numbers; however, for the illustrative purpose here the unsigned parallel multiplier should suffice, since the 2's complement multiplier requires only minor modification [20].) The above 4 bit calculation can be mapped onto an array of cells, where each cell contains a logical AND gate computing the single bit product z = a·b, and a full binary adder with sum output sout = (s ⊕ z) ⊕ c and the usual full adder carry output:

[Figure: Array multiplier for 4 bit operands producing the product bits p7...p0. Each cell of the parallel multiplier has a full binary adder and a logical AND gate. The multiplier forms the partial products and sums them together using the same mechanism as used in decimal multiplication. This multiplier is for positive integer values; some modification is required to produce a multiplier that operates on 2's complement arithmetic as required for DSP.]

The above 4 bit multiplier produces an 8 bit product and requires 4^2 = 16 cells. Therefore a 16 bit multiplier requires 16^2 = 256 cells and produces a 32 bit product, a 24 bit multiplier requires 24^2 = 576 cells and produces a 48 bit product, and so on. Given that about 12 logic gates may be required for each cell of the multiplier, and each gate requires say 5 transistors, the total transistor count, and therefore the silicon area required for the multiplier, can be very high as a percentage of the total DSP processor silicon area. Most general purpose processors do not have parallel multipliers and will perform multiplication using the processor ALU, forming one partial product per clock cycle to produce the product in N clock cycles (where N is the data wordlength). For some ASIC DSP designs a parallel multiplier may be too expensive, and therefore a bit serial multiplier may be implemented. These devices require only N cells; however, the latency is N clock cycles [12].
See also Division, DSP Processor, Full Adder, Parallel Adder, Square Root.

Parallel Processing: When a number of DSP processors are connected together as part of the same system, this is referred to as a parallel processing system, as the DSPs are operating in parallel. Although defined as a research area in its own right (for complex parallel systems), some simple parallel processing approaches to decomposing DSP algorithms are usually rather obvious where small numbers of DSPs are concerned.

Parseval's Theorem: The total energy in a signal can be calculated from either its time representation or its frequency representation. Given that the power calculated in both domains must be the same, this equality is called Parseval's theorem. From the Fourier series, recall that a signal, x(t), can be represented in terms of its complex Fourier series:

x(t) = Σ (n = −∞ to ∞) C_n e^(jnω0t)          (synthesis)
C_n = (1/T) ∫(0 to T) x(t) e^(−jnω0t) dt      (analysis)   (444)

The power in the signal, x(t), can be calculated by integrating over one time period, T:

P = (1/T) ∫(0 to T) x^2(t) dt   (445)

However, if we calculate the power from the powers of each of the complex exponential components, then the total power is:

P = Σ (n = −∞ to ∞) |C_n e^(jnω0t)|^2 = Σ (n = −∞ to ∞) |C_n|^2 |e^(jnω0t)|^2 = Σ (n = −∞ to ∞) |C_n|^2   (446)

given that the power in the complex exponential e^(jnω0t) = cos(nω0t) + j sin(nω0t) is 1. Hence, for the complex Fourier series representation of a signal, we can state Parseval's theorem as:

(1/T) ∫(0 to T) x^2(t) dt = Σ (n = −∞ to ∞) |C_n|^2   (447)

If the periodic signal x(t) is real valued, we can also state Parseval's theorem in terms of the amplitude/phase Fourier series representation.
Recalling that for a periodic signal:

x(t) = \sum_{n=0}^{\infty} M_n \cos(n\omega_0 t - \theta_n), \quad M_n = \sqrt{A_n^2 + B_n^2}, \quad \theta_n = \tan^{-1}(B_n / A_n)   (448)

where A_n and B_n are the Fourier coefficients, then:

P = \sum_{n=0}^{\infty} \overline{\left( M_n \cos(n\omega_0 t - \theta_n) \right)^2} = \sum_{n=0}^{\infty} \frac{M_n^2}{2}   (449)

and Parseval's theorem can be stated as:

\frac{1}{T}\int_0^T x^2(t)\,dt = \sum_{n=0}^{\infty} \frac{M_n^2}{2}   (450)

If a signal is aperiodic, then Parseval's theorem can be stated in terms of the total energy in the signal being the same in the time domain and the frequency domain:

E = \int_{-\infty}^{\infty} \left| x(t) \right|^2\,dt = \int_{-\infty}^{\infty} \left| X(f) \right|^2\,df   (451)

See also Discrete Fourier Transform, Fourier Series, Fourier Transform.

Passband: The range of frequencies that pass through a filter with very little attenuation. See also Filters.

PC-Bus: Plug-in DSP cards (or boards) for the IBM PC (AT) and compatibles conform to the PC-Bus standard. Through the PC-Bus, a DSP board is provided with power (12 V and 5 V), ground lines, and a 16 bit data bus for transfers between the DSP board and the PC. See also DSP Board.

Percentage Error: See Relative Error.

Perceptual Audio Coding: By exploiting well understood psychoacoustic aspects of human hearing, data compression can be applied to audio, thus reducing transmission bandwidth or storage requirements [30], [52]. When the ear perceives sound, spectral masking or temporal masking may occur; a simple example of spectral masking is having a conversation next to a busy freeway, where speech intelligibility is reduced as certain portions of the speech are masked by noisy passing vehicles. If a perceptual model can be set up which has masking attributes similar to those of the human ear, then this model can be used to perform perceptual audio coding, whereby redundant sounds (which will not be perceived) need not be coded, or can be coded with reduced precision.
See also Adaptive Transform Acoustic Coding, Audiology, Auditory Filters, Precision Adaptive Subband Coding (PASC), Psychoacoustics, Spectral Masking, Temporal Masking, Threshold of Hearing.

Percussion: Any instrument which can be struck to produce a sound can be described as percussive [14]. Percussion sounds are either pitched or unpitched. For example, drums and cymbals are usually unpitched instruments used to create and sustain the rhythm of music. Certain types of drum, however, such as timpani, actually have an associated pitch. Xylophones and marimbas are pitched percussion instruments with a range of three or four octaves. In the figures below the sound pressure level volume envelope, a short time segment, and a frequency domain representation are shown for a cymbal strike and a snare drum beat.

[Figure: The variation in sound pressure level for a drum beat and a cymbal strike. Both signals last for about 1.5 seconds. From a simple visual inspection the cymbal seems to have more sustain and is a "fuller" waveform.]

[Figure: A short 0.15 second segment of the drum and cymbal signals clearly shows the cymbal to contain a wider range of higher frequencies. Both signals are random in nature with little discernible periodic content.]

[Figure: Taking an FFT over a short 0.05 second segment of the drum and cymbal waveforms serves to illustrate the stochastic nature of the two sounds.]
From the above figures it can be seen that the drum beat and cymbal strike signals both appear to be stochastic in nature, although given that they produce sound from a resonating impulse there is clear quasi-periodic content. These signals also possess a degree of regularity in that successive strikes sound "similar". The drum exhibits a lower frequency content than the cymbal, which is consistent with its more "bassy" sound. The sound pressure level created by drums and cymbals depends on the force with which they are struck; both are capable of generating up to 100 dB at a distance of 1 metre. See also Music, Western Music Scale.

Perfect Pitch: The ability to exactly name a musical note being played on the Western music scale is called perfect pitch. Only very few individuals have perfect pitch, and there is still some debate as to whether such skills can be learned. Many individuals and musicians have good relative pitch, whereby given the name of one note in a sequence, they can correctly identify the others in the sequence. See also Music, Pitch, Relative Pitch, Western Music Scale.

Permanent Threshold Shift (PTS): When the threshold of hearing is raised due to exposure to excessive noise, a permanent threshold shift is said to have occurred. See also Audiology, Audiometry, Temporary Threshold Shift (TTS), Threshold of Hearing.

Permutation Matrix: See Matrix Structured - Permutation.

Period: The period, T, of a simple sine waveform is the time it takes for one complete wavelength to be produced. The inverse of the period gives the frequency, or the number of wavelengths in one second:

f = \frac{1}{T}   (452)

[Figure: A periodic waveform f(t), with the period T marked between successive cycles at T, 2T, 3T.]

Personal Computer Memory Card International Association (PCMCIA): The name given to bus slots that became almost standard on notebook and subnotebook PCs around 1994. PCMCIA cards were originally memory cards, but modems, small disk drives, digital audio soundcards, and DSP cards are now available.
The term PC Card is now used in preference to the rather unwieldy acronym PCMCIA [169].

Personal Digital Assistant (PDA): A consumer electronics category which classifies handheld computers that can decode handwritten information (pattern recognition) and communicate with other computers and FAX machines [169].

Phase: The relative starting point of a periodic signal, measured in angular units such as radians or degrees. Also, the angle a complex number makes relative to the real axis. A sine wave (occurring with respect to time) can be written as:

x(t) = A \sin(2\pi f t + \phi)   (453)

where A is the signal amplitude, f is the frequency in hertz, \phi is the phase, and t is time.

[Figure: A sine wave of amplitude A and period 1/f, starting from the value A sin(\phi) at t = 0.]

Phase Compensation: A technique to modify the phase of a signal while leaving the magnitude response unchanged. Phase compensation is usually performed using an all-pass filter. If the phase of a system is compensated to produce an overall linear phase, this is often referred to as group delay equalisation, as linear phase corresponds to a constant group delay. See also All-pass Filter, Equalisation, Finite Impulse Response Filter - Linear Phase.

Phase Delay: A term usually synonymous with group delay. See Group Delay.

Phase Jitter: In telephony, the measurement (in degrees out of phase) of how much an analog signal deviates from the referenced phase of the main data carrying signal. Phase jitter interferes with the interpretation of information by changing the timing or misplacing a demodulated signal in frequency. See also Clock Jitter.

Phase Modulation: One of the three ways of modulating a sine wave signal to carry information. The sine wave or carrier has its phase changed in accordance with the information signal to be transmitted. See also Amplitude Modulation, Frequency Modulation.

Phase Response: See also Fourier Series - Amplitude/Phase Representation, Fourier Series - Complex Exponential Representation.
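As a numerical illustration of (453) (our own example, not from the original text), the phase \phi of a sampled sine wave can be recovered from the angle of the DFT bin at the signal frequency:

```python
import cmath
import math

# Sample x(t) = A sin(2*pi*f*t + phi) over exactly `cycles` whole periods.
N, cycles, A, phi = 64, 5, 2.0, 0.7
x = [A * math.sin(2 * math.pi * cycles * k / N + phi) for k in range(N)]

# The DFT bin at the signal frequency is (A*N/2) * e^{j(phi - pi/2)},
# so phi is the bin angle plus pi/2 (a sine lags a cosine by pi/2).
X = sum(x[k] * cmath.exp(-2j * math.pi * cycles * k / N) for k in range(N))
recovered_phi = cmath.phase(X) + math.pi / 2
print(round(recovered_phi, 6))   # 0.7
```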
Phase Shift Keying (PSK): A digital modulation technique in which the information data bits are encoded in the phase of the carrier signal. The receiver recovers the data bits by detecting the phase of the received signal over a symbol period and decoding this phase into the appropriate data bit pattern. See also Amplitude Shift Keying, Differential Phase Shift Keying, Frequency Shift Keying.

Phasing: A musical effect whereby the phase of a signal is modified, mixed (or added) with the original signal, and the composite signal is then played [32]. See also Music, Music Synthesis.

Phons: The phon (pronounced "fone") is a (subjective) measure of loudness. A sound measures a given number of phons equal to the sound pressure level, in dB, of a 1000 Hz tone that a human listener judges to be equally loud. Hence to measure a particular sound in phons requires a listener to switch back and forth between a calibrated, variable 1000 Hz tone and the sound to be measured. See also Equal Loudness Contours, Equivalent Sound Continuous Level, Frequency Range of Hearing, Sound Pressure Level.

Piezoelectric: Piezoelectric materials can convert mechanical stress into electrical output energy, and hence are widely used as sensors. Piezoelectric crystals are also used in a feedback configuration to make very precise clocks.

Pipelining Execution: DSP processors having RISC architectures often implement a pipelining structure whereby instructions are executed by the processor in four stages: (1) Instruction Fetch, (2) Instruction Decode, (3) Memory Read, (4) Execute. Each stage takes one cycle of the processor clock, meaning that each instruction takes a minimum of 4 clock cycles. However, because the processor is designed to be pipelined, the four stages of four successive instructions can be performed simultaneously. This overlapping means that on average one instruction is executed every clock cycle.
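The throughput claim above can be illustrated with a toy timing model (an idealised stall-free pipeline; the function is our own):

```python
# Idealised 4-stage pipeline: instruction i is fetched on cycle i + 1 and
# leaves the Execute stage 4 cycles after it was fetched. With no stalls,
# the first result appears after 4 cycles and one more on every cycle after.

N_STAGES = 4   # Fetch, Decode, Memory Read, Execute

def completion_cycles(n_instructions, n_stages=N_STAGES):
    """Cycle on which each instruction completes in an ideal pipeline."""
    return [n_stages + i for i in range(n_instructions)]

done = completion_cycles(8)
print(done)   # [4, 5, 6, 7, 8, 9, 10, 11]
```

The first instruction shows the full 4 cycle latency; thereafter completions are one per cycle, i.e. an average throughput of one instruction per clock.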
Pink Noise: Pink noise is similar to white noise, except that rather than having a flat power spectrum, its power spectrum falls off at 10 dB per decade. Pink noise is sometimes referred to as 1/f noise.

Pitch: There are a number of varying definitions of pitch; the generic meaning is the subjective quality of a sound which positions it somewhere in the musical scale [14]. As the number of cycles per second of a musical note increases linearly, our perceived sense of pitch increases logarithmically. Although very similar to frequency, which is measured exactly, pitch is determined subjectively. For example, if two pure tones of slightly different frequencies are presented to a listener who is allowed to adjust the intensity level of one of them, it is likely that a level can be found at which both tones sound as if they have the same pitch. Pitch is therefore to some extent dependent on intensity. At louder levels the pitch of low frequency tones decreases with increasing intensity, whereas the pitch of high frequency tones increases with increasing intensity. See also Music, Perfect Pitch, Western Music Scale.

Pivoting: See Matrix Decompositions - Pivoting.

Plane Rotations: See Matrix Decompositions - Plane Rotations.

Plosives: One of the elementary sounds of speech, the classes being plosives, fricatives, sibilant fricatives, semi-vowels, and nasals. Plosives are formed by blocking the vocal tract so that no air flows, then suddenly removing the obstruction to produce a puff of air. Examples of plosive sounds are "p", "b", "t", "d", "g", and "k". See also Fricatives, Nasals, Semi-vowels, Sibilant Fricatives.

PN Sequence: See Pseudo-Random Noise Sequence.

Polar: Polar refers to a type of signalling method used for digital data transmission, in which the marks (ones) are indicated by positive polarity and the spaces (zeros) by negative polarity (or vice-versa). See also Bipolar (2), Non-return to Zero.
Poles: If the transfer function of a recursive system (one with feedback) is expressed in the z-domain, the poles of the function are found by factoring the denominator polynomial to find its roots. If any pole lies outside the unit circle, then the system is unstable. The transfer function H(z) of a simple two pole IIR filter with output y(n) = x(n) + 0.75y(n-1) - 0.125y(n-2) is stable:

[Figure: Signal flow graph of the filter, with feedback coefficients 0.75 and -0.125 applied to y(k-1) and y(k-2) respectively.]

H(z) = \frac{1}{1 - 0.75z^{-1} + 0.125z^{-2}} = \frac{1}{(1 - 0.5z^{-1})(1 - 0.25z^{-1})}   (454)

i.e. the poles are z = 0.25 and z = 0.5. If the roots lay outside the unit circle (having a magnitude greater than 1), then the system h(n) would be unstable.

Positive Definite Matrix: See Matrix Properties - Positive Definite.

Positive Semi-definite: See Matrix Properties - Positive Semi-definite.

Postmultiplication: See Matrix Operations - Postmultiplication.

Power Spectral Density (PSD): The power spectral density describes the frequency content of a stationary stochastic or random signal. The PSD can be estimated by taking the average of the magnitude squared DFT sample values (the periodogram). Many other DSP techniques have been developed for estimating signal frequency content; this area of research is collectively called spectral estimation. The PSD is calculated from the Fourier transform of the autocorrelation function:

S(f) = \sum_{n=-\infty}^{\infty} r(n)\,e^{-j2\pi f n}   (455)

where the autocorrelation function r(n) provides a measure of the predictability of a signal x(k):

r(n) = E\{x(k)x(k+n)\} = \sum_k x(k)x(k+n)\,p\{x(k), x(k+n)\}   (456)

where p\{x(k), x(k+n)\} is the joint probability density function of x(k) and x(k+n).
For signals assumed to be ergodic, the autocorrelation can be estimated as a time average:

r(k) = \frac{1}{2M-1}\sum_{n=0}^{2M-1} x(n)x(n+k) \quad \text{for large } M   (457)

If a particular autocorrelation function is estimated for n different time lags, then a PSD estimate can be computed as the DFT of these correlations. See also Autocorrelation, Discrete Fourier Transform.

Power Rails: The voltages used to power a DSP board will usually come from a number of voltage sources, which are often referred to as power rails. For a DSP board there are usually digital power rails (0 volts and 5 volts) to power the digital circuitry, and analog power rails (-12 volts, 0 volts, and +12 volts) to power the analog circuitry.

PQRST Wave: The name given to the characteristic shape of an electrocardiogram (heartbeat) signal waveform. See also Electrocardiogram.

[Figure: One beat (about 0.5 seconds) of a typical ECG waveform with the characteristic P, Q, R, S and T points marked; the R peak reaches about 0.5 mV.]

Precedence Effect: In a reverberant environment the sound energy received by the direct path can be much lower than the energy received by indirect, reflected paths. However, the human ear is still able to localize the sound source correctly by localizing on the first components of the signal to arrive. Later echoes arriving at the ear increase the perceived loudness of the sound, as they have the same general spectrum. This psychoacoustic effect is known as the precedence effect, the law of the first wavefront, or sometimes the Haas effect. The precedence effect applies mainly to short duration sounds or those of a discontinuous or varying form. See also Ear, Lateralization, Source Localization, Threshold of Hearing.

Precision Adaptive Subband Coding (PASC): A data compression technique developed by Philips and used in high fidelity digital audio systems such as the digital compact cassette (DCC). PASC is closely related to the audio compression methods defined in ISO/MPEG layer 1.
Listening tests have revealed that the overall quality of PASC encoded music is "almost identical to that of compact disc (CD)". In fact it has been argued that in terms of dynamic range DCC has improved performance, given that it compresses 20 bit PCM data, compared to the encoding of 16 bit PCM data by a CD [83]. Precision adaptive subband coding compresses audio by not coding elements of an audio signal that a listener will not hear. PASC is based mainly on two psychoacoustic principles. First, the ear only hears sounds above the absolute threshold of hearing, and therefore any sounds below this threshold do not need to be coded. Second, louder sounds spectrally mask quieter sounds of a "similar" frequency, such that the quiet sound is unheard in the simultaneous presence of the louder sound due to the psychoacoustic raising of the threshold of hearing. The following figure illustrates both principles:

[Figure: Left, a sound below the threshold of hearing - a 100 Hz narrowband noise at 10 dB SPL lies below the approximate absolute threshold of hearing and is not perceived. Right, simultaneous spectral masking by a 1000 Hz tone - the tone raises the threshold of hearing around it, so that a 600 Hz tone at 20 dB SPL is not perceived.]

In order to exploit psychoacoustic masking, the first stage of a PASC system splits the Nyquist bandwidth of a signal (of between 16 and 20 bit resolution) sampled at 48 kHz into 32 equal subbands, each of bandwidth 750 Hz. This is accomplished using a 512 weight prototype FIR low pass filter, h(n), of 3 dB bandwidth 375 Hz and stopband attenuation 120 dB. Note that to achieve 120 dB attenuation, 20 bit filter coefficients are required.
By modulating the impulse response h(n) with modulating frequencies of 375 Hz, 1125 Hz, 1875 Hz, and so on in 750 Hz intervals, a series of 32 bandpass filters, each with a 3 dB bandwidth of 750 Hz and centered on its modulating frequency, is produced. A polyphase subband filter bank is therefore set up:

[Figure: The 32-subband polyphase filter bank used for PASC, with subband centres at 375 Hz, 1125 Hz, 1875 Hz, ..., 23625 Hz. The filter bank is based on a 512 weight prototype FIR filter with stopband attenuation of 120 dB, i.e. 20 bits of resolution. Data is input in 8 ms blocks (384 samples at fs = 48 kHz) and each subband output is decimated to 12 samples (fs/32 = 1.5 kHz). Although aliasing occurs between adjacent subbands, the alias components are cancelled when the subbands are merged to reconstruct the original audio spectrum [49].]

The input data stream is subband filtered in blocks of 8 ms, which corresponds to 384 samples (48000 x 0.008). Therefore the output of each subband filter after decimation consists of 12 samples. With the signal in subband coded form, the second stage of the PASC system performs a comparison of the full audio spectrum with a model of the human ear. The subband filtering allows a simple (but coarse) spectral analysis of the signal, produced by calculating the power of the 12 sample values in each subband. If the power in a subband is below the threshold of hearing, then the subband is treated as empty and does not need to be coded for the particular 8 ms block being analyzed. If the power in a particular subband is above the threshold of hearing, then a comparison is made with the known masking threshold to calculate the in-band masking level. Following this, the level of masking caused by this signal in other neighboring subbands is established.
The overall masking calculation is accomplished using a 32 x 32 matrix containing the masking information defined in the ISO/MPEG standard. From the masking calculation results, a decision is made as to the number of bits allocated to represent the data in each subband, such that the quantization noise introduced is below the masking level (or raised threshold of hearing) and will therefore not be heard when the audio signal is reconstructed. The bit rate of a PASC encoded time frame of 8 ms is fixed at 96 bits/frame (for each subband, on average); therefore the bits must be allocated judiciously among the subbands. The subbands with the highest power relative to the masking level are allocated first, as these are likely to be the important and dominant sounds in the overall audio spectrum and to require the best resolution. If two subbands have the same ratio, the lower frequency subband is given priority over the higher one. An example of quantization noise masking is given below:

[Figure: A 1000 Hz signal in the 750-1500 Hz subband, quantized with 16 bits (left) and 8 bits (right). The 1000 Hz narrowband signal spectrally masks any components below the masking level (or raised threshold of hearing). Therefore, considering only this subband, when the signal is reproduced the higher level of quantization noise in the 8 bit signal will not be perceived. Hence the 8 bit signal has the same perceived quality as the 16 bit signal, and data compression has been achieved without noticeable loss in quality. Note that the masking effect of signals in nearby subbands may extend into the 750-1500 Hz subband, further raising the masking level and allowing even fewer bits to represent the signal.]
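The modulation step of the filter bank described above can be sketched as follows. This is illustrative only: the true PASC/MPEG layer 1 bank uses a specific 512 tap prototype in a polyphase implementation, whereas here a short windowed-sinc prototype (our own choice) stands in, normalised to unit passband gain.

```python
import math

FS = 48000.0   # sampling rate (Hz)
BW = 750.0     # subband bandwidth (Hz)
TAPS = 128     # illustrative prototype length (PASC uses 512)

def prototype_lowpass(cutoff, taps, fs):
    """Hamming-windowed sinc lowpass, normalised to unit DC gain."""
    h = []
    for n in range(taps):
        m = n - (taps - 1) / 2.0
        x = 2.0 * cutoff / fs
        s = x if m == 0 else math.sin(math.pi * x * m) / (math.pi * m)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(s * w)
    g = sum(h)
    return [c / g for c in h]

def subband_filter(h, k, fs=FS, bw=BW):
    """Modulate the lowpass prototype to the centre of subband k (0..31)."""
    fc = (k + 0.5) * bw   # centres at 375 Hz, 1125 Hz, 1875 Hz, ...
    return [2 * c * math.cos(2 * math.pi * fc * n / fs) for n, c in enumerate(h)]

h = prototype_lowpass(BW / 2, TAPS, FS)        # 375 Hz cutoff prototype
bank = [subband_filter(h, k) for k in range(32)]
```

Each entry of `bank` is a bandpass filter of nominal bandwidth 750 Hz; a real implementation would also decimate each subband output by 32, as described above.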
Rather than fixed point sample values (as used in the above illustrative example), PASC uses a simple block floating point number representation for sample values. The mantissa can be between 2 and 15 bits, and the exponent is a 6 bit value; the actual number of bits assigned to the mantissa depends on the masking calculations. This leads to an overall dynamic range from +6 dB to -118 dB (the extra 6 dB of headroom is required due to the subband filtering process), which is more than the 96 dB available from 16 bit linear coding. On average a psychoacoustic subband coded music signal rarely assigns bits to the subbands covering the frequency range 15 kHz to 24 kHz (i.e. they are usually empty!); around 3 to 7 mantissa bits are typically required for subbands covering the frequency range 5 kHz - 15 kHz, and between 8 and 15 bits for the frequency range 100 Hz - 5 kHz. The higher bit allocation for lower frequencies is as expected, since the masking effect is less pronounced at lower frequencies (see Spectral Masking). This allocation of precision would perhaps suggest that the initial subband structure should have a small bandwidth for low frequencies and a larger bandwidth for higher frequencies. However, the small bandwidth required at low frequencies would require a very long impulse response filter; to preserve phase, the output signals of the wider bandwidth subbands (which use shorter filters) would then need to be delayed to match. To implement this delay on chip requires such a large area that this solution is not economically attractive, albeit good compression ratios would be possible. After each 8 ms time frame has undergone the PASC coding and bit allocation, the data is stored in an encoded bit stream for recording to magnetic tape. Cross-interleaved Reed-Solomon code (CIRC) is used for error correction coding of PASC data when recorded onto DCC (digital compact cassette).
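A block floating point quantiser in the spirit just described can be sketched as follows (illustrative only; the exact PASC mantissa/exponent encoding and bit allocation are not reproduced, and the function names are our own):

```python
import math

# Illustrative block floating point quantiser: each block of samples shares
# one exponent, and each sample gets an m-bit signed mantissa.

def block_float_quantise(block, mantissa_bits):
    peak = max(abs(v) for v in block)
    if peak == 0.0:
        return 0, [0] * len(block)
    exponent = math.ceil(math.log2(peak))      # shared block exponent
    scale = 2.0 ** exponent                    # normalises block into [-1, 1)
    levels = 2 ** (mantissa_bits - 1)          # signed mantissa range
    mantissas = [max(-levels, min(levels - 1, round(v / scale * levels)))
                 for v in block]
    return exponent, mantissas

def block_float_dequantise(exponent, mantissas, mantissa_bits):
    levels = 2 ** (mantissa_bits - 1)
    return [m / levels * 2.0 ** exponent for m in mantissas]

# A 12-sample subband block, as produced by the PASC filter bank above.
block = [0.40 * math.sin(2 * math.pi * k / 12) for k in range(12)]
e, ms = block_float_quantise(block, 8)
rec = block_float_dequantise(e, ms, 8)
err = max(abs(a - b) for a, b in zip(block, rec))
print(e)   # -1 : shared exponent for a block peaking at 0.4
```

Because the exponent tracks the block peak, the quantisation step (and hence the error) scales with the signal level, as in the real coder.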
PASC techniques can also be applied to input data sampled at 32 kHz or 44.1 kHz. Because the data rate stays the same at 384 kbit/s, the subband filter bandwidth for these sampling frequencies reduces to 500 Hz and 689 Hz respectively. See also Adaptive Transform Acoustic Coding (ATRAC), Auditory Filters, Compact Disc, Data Compression, Digital Compact Cassette (DCC), Frequency Range of Hearing, Psychoacoustics, Spectral Masking, Subband Filtering, Temporal Masking, Threshold of Hearing.

Premultiplication: See Matrix Operations - Premultiplication.

Probability: The use of probabilistic measures and statistical mathematics in digital signal processing is very important. In particular, the concept of a random variable, characterised via a probability density function (PDF), is central: with probability, random signals can be characterised and information on their frequency content can be realised. In its simplest form, the probability of an event A happening, denoted p(A), can be determined by performing a large number of trials and counting the number of times that event A occurs:

P(A) = \lim_{n\to\infty} \frac{\text{no. of times } A \text{ occurred}, n_A}{\text{total no. of trials}, n}   (458)

A simple example is the shaking of a die to determine the probability of a 6 occurring. If, for example, 60 trials were done and a 6 occurred 8 times, then P_d(6) = 8/60, where the subscript "d" specifies the process name. Of course the true probability is P_d(6) = 1/6, which would be determined if an "infinite" number of trials were done. From the above simple definition, it can be noted that 0 <= P(A) <= 1. Clearly if P(A) = 0 (the null event) then the event (almost) never occurs, whereas if P(A) = 1 (the sure event) then it (almost) always occurs.
If you find the parenthetical "almosts" annoying, amusing, confusing, etc., remember that probability means never having to say you're certain (or was that statistics?). The joint probability that the event AB will occur is denoted P(AB). The following definitions are also useful:

- Bayes Theorem: The joint probability that an event AB occurs can be expressed as:

P(AB) = P(A)P(B|A) = P(B)P(A|B)   (459)

If two events A and B are independent, then P(A|B) = P(A) and P(B|A) = P(B).

- Conditional Probability: The probability that an event A occurs, given that an event B has already occurred, is denoted P(A|B).

- Independence: Two separate events, A and B, are independent if the probability of A and B occurring is obtained by multiplying the probability of A occurring by the probability of B occurring:

P(AB) = P(A)P(B)   (460)

- Joint Probability: The probability of two events, A and B, both occurring is:

P(AB) = \lim_{n\to\infty} \frac{\text{no. of times } AB \text{ occurred}, n_{AB}}{\text{total no. of trials}, n}   (461)

where the notation P(AB) can be read as "the probability of event A and event B". As an example, consider an experiment where a coin is flipped and a die is shaken at the same time. The probability that a head shows up, P_c(head), and the number 3, P_d(3), is:

P(\text{head and } 3) = P_d(3)P_c(\text{head}) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12}   (462)

The shaking of the die and the flipping of the coin are independent events, i.e. the outcome of the coin flip has no bearing on the outcome of the die shake. See also Ergodic, Expected Value, Mean Value, Mean Squared Value, Random Variable, Variance, Wide Sense Stationarity.

Probability Density Function: See Random Variable.
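Eq. (462) can be checked against the relative frequency definition (458) with a simple Monte Carlo simulation (our own illustration; the seed makes the run repeatable):

```python
import random

# Monte Carlo check of eq. (462): estimate P(head and 3) for an independent
# coin flip and die shake by counting relative frequency, as in (458)/(461).

random.seed(1)
trials = 200_000
hits = sum(1 for _ in range(trials)
           if random.random() < 0.5 and random.randint(1, 6) == 3)
estimate = hits / trials
print(estimate)   # close to 1/12 = 0.0833...
```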
Proportional Integral Derivative (PID) Controller: Process control applications monitor a variable such as temperature, level, or flow, and output a signal to adjust that variable towards some desired value. In a PID controller the difference between the desired and measured variable is found (the error), and the controlling output is built from three terms: one proportional to the error itself, one proportional to the integral (accumulation) of the error, and one proportional to the derivative (rate of change) of the error. PID controllers usually do not require the processing power of a DSP, as the data processing rates are well within that of microcontrollers.

Pseudo-Inverse: See Matrix Properties - Pseudo-Inverse.

Pseudo-Inverse Matrix: See Matrix Properties - Pseudo-Inverse.

Pseudo-Noise (PN): Analog pseudo-noise can be generated using a pseudo random binary sequence generator connected to a digital to analog converter (DAC):

[Figure: An N-bit pseudo random binary sequence shift register, clocked at fc (tc = 1/fc), drives an N-bit DAC followed by an analog reconstruction filter to produce the analog noise waveform x(t).]

The period of the pseudo noise is N tc seconds. There are of course other methods of producing analog "noise"; the term pseudo-noise usually indicates that the sequence was generated using a pseudo random noise sequence generating scheme. See also Pseudo-Random Noise Sequence, Pseudo-Random Binary Sequence.

Pseudo-Random Binary Sequence (PRBS): A PRBS is a binary sequence generated by the use of an r-bit sequential linear feedback shift register arrangement. PRBSs are sometimes called pseudo noise (PN) sequences or pseudo random noise (PRN) sequences. PRBSs are widely used in digital communications where, for example, both ends of a digital channel contain a circuit capable of generating the same PRBS, which allows the bit error rate of the channel to be measured, or adaptive equalization to be performed.
[Figure: Bit error rate measurement over a communications link (e.g. between Glasgow, Scotland and Duluth, Minnesota, USA). A PRBS sequence (e.g. 0100101110110...) is modulated and transmitted down the line (telephone, satellite, etc.), and the received and demodulated data sequence is compared, bit by bit via an exclusive-OR gate, with the output of a synchronised local PRBS generator producing the same sequence. If the exclusive-OR output is binary 1, an error has occurred.]

Other applications include using a PRBS for spread spectrum communications [9], for scrambling data, and for range finding via radar or sonar [116]. A PRBS is called pseudo random because the sequence in fact repeats after a large number of bits, and is therefore actually periodic; however, the short term behaviour of the sequence appears random. The general construction of a PRBS-producing linear feedback shift register of length r bits is:

[Figure: An r-stage shift register of single bit registers. The output of each stage is gated by a single bit multiplier C1, C2, ..., Cr (each 0 or 1), the selected outputs are combined by exclusive-OR gates and fed back to the register input, and the PRBS output p(k) is taken from the final stage.]

The register is clocked every T_c seconds (T_c is often denoted the chip interval), and the binary data signal p(k) is therefore output at a rate of f_c = 1/T_c. The longer the register, the longer the PRBS that can be generated. The values of the single bit multipliers C_r are either 0 or 1, and they can be represented in a convenient characteristic polynomial notation:

f(X) = 1 + \sum_{k=1}^{r} C_k X^k = C_r X^r + C_{r-1} X^{r-1} + \ldots + C_1 X + 1   (463)

By carefully choosing the polynomial it is possible to ensure that the shift register cycles through all possible states (or N-tuples), with the exception of the all zero state [40]. This produces a PRBS of 2^r - 1 bits (known as a maximal length sequence) before the cycle restarts. If the register ever enters the all zero state it will never leave it.
As an example, a 31 bit maximal length sequence can be produced from the polynomial:

X^5 + X^2 + 1   (464)

which specifies a 5 bit PRBS shift register with feedback taps after stages 5 and 2.

For a particular PRBS, a sequence of identical consecutive bits (either 1's or 0's) is referred to as a "run", and the number of bits in the run is its "length". For a maximal length sequence of N (= 2^r - 1) bits from an r bit register, it can be shown that the PRBS will contain one run of r 1's and one run of r - 1 0's. The number of shorter runs of 1's and 0's increases with powers of 2 as follows:

    Run Length    Runs of 1's    Runs of 0's
    r             1              0
    r-1           0              1
    r-2           1              1
    r-3           2              2
    :             :              :
    3             2^(r-5)        2^(r-5)
    2             2^(r-4)        2^(r-4)
    1             2^(r-3)        2^(r-3)

For example, an r = 4 bit shift register can be set up from the polynomial X^4 + X^3 + 1 to produce a 15 bit maximal length PRBS:

p(k) = 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1   (repeating every 15 bits)

Priming the shift register with 0001 causes it to cycle through 1000, 0100, 0010, 1001, 1100, 0110, 1011, 0101, 1010, 1101, 1110, 1111, 0111, 0011, and back to 0001. If the contents of the shift register are read as a binary number, the PRBS generator therefore passes through all binary numbers from 1 to 2^r - 1 in a "random" order; the all zero state is excluded, since the generator could never leave it. Note that, taken cyclically, the output sequence contains one run of four 1's, one run of three 0's, and so on, in accordance with the above table of run lengths.
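The X^4 + X^3 + 1 generator above can be simulated directly as a Fibonacci-style linear feedback shift register (a behavioural sketch; the function name is our own):

```python
# Simulation of the r = 4 LFSR defined by X^4 + X^3 + 1: the feedback bit is
# the XOR of stages 3 and 4, and the output bit is taken from stage 4.

def lfsr_prbs(seed=(0, 0, 0, 1), n_bits=15):
    s1, s2, s3, s4 = seed
    out = []
    for _ in range(n_bits):
        out.append(s4)                 # output the last stage
        fb = s3 ^ s4                   # taps from X^4 + X^3 + 1
        s1, s2, s3, s4 = fb, s1, s2, s3
    return out

seq = lfsr_prbs()
print(seq)   # [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
```

Starting from the primed state 0001, the generator reproduces the 15 bit maximal length sequence quoted above, and repeats with period 15.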
Feedback taps for some maximal length sequences using longer shift registers are shown in the table below:

Shift Register Length, r | Maximal Code Length, N | Maximal Sequence Generating Polynomial
5                        | 31                     | X^5 + X^3 + 1
8                        | 255                    | X^8 + X^6 + X^5 + X^4 + 1
10                       | 1023                   | X^10 + X^7 + 1
16                       | 65535                  | X^16 + X^15 + X^13 + X^4 + 1
20                       | 1048575                | X^20 + X^17 + 1
24                       | 16777215               | X^24 + X^23 + X^22 + X^17 + 1

Note that other polynomials can be used to generate other maximal length sequences of N bits. The actual number of maximal length generating polynomials can be calculated using prime factor analysis [116].

A useful property of a maximal length sequence is that the alternate bits of the sequence form the same sequence at half of the rate. Consider two runs of the above 15 bit PRBS sequence generated from the polynomial X^4 + X^3 + 1, and create a new sequence by retaining only every second bit (bits 1, 3, 5 and so on): the same PRBS is generated, but at half of the frequency. In turn, a PRBS at one quarter of the frequency can be produced from the half rate PRBS, and so on, decimating by any factor R, where R is a power of 2.
If a signal q(k) is derived from the PRBS signal p(k) such that:

q(k) = { +1 volt, if p(k) = 1;  -1 volt, if p(k) = 0 }    (465)

then the autocorrelation of a maximal length PRBS, q(k), of N bits is:

R_q(m) = (1/L) \sum_{n=0}^{L-1} q(n) q(n+m) = { 1, m = jN;  -1/N, m \neq jN },  where j = 0, 1, 2, \ldots    (466)

(for L large). R_q(m) is therefore two valued, peaking at 1 for lags that are multiples of N and taking the value -1/N at all other lags. It can further be shown that the autocorrelation of the continuous time waveform q(t) is also periodic, taking the form of a train of triangular pulses of height 1 and base 2T_c centred at lags \tau = jNT_c, with the value -1/N between the pulses. The power spectrum, P(f), obtained from the Fourier transform of the autocorrelation, is therefore a line spectrum with line spacing \Delta f = 1/(NT_c) and a (sin x / x)^2 envelope with nulls at multiples of 1/T_c.

Similar types of feedback shift register to the PRBS generator are also used for setting up cyclic redundancy check codes. See also Characteristic Polynomial, Cyclic Redundancy Check.

Pseudo-Random Noise Sequence (PRNS): A sequence of numbers that has properties that make the sequence appear to be random, in spite of the fact that the numbers are generated in a deterministic way and are therefore periodic. Linear feedback shift registers are often used to generate these sequences. Maximal Length (ML) binary sequences produce 2^N - 1 bit sequences (the longest sequence possible without repetition) from an N bit shift register. See also Pseudo-Random Binary Sequence.

Psychoacoustics: The study of how acoustic transmissions are perceived by a human listener. Psychoacoustics relates physical quantities such as absolute frequency and sound intensity levels to perceptual qualities, such as pitch, loudness and awareness. Although certain sounds may be presented to the ear, the human hearing mechanism and brain may not perceive these sounds.
For example, a simple psychoacoustic phenomenon is habituation, whereby a repetitive sound such as a clock ticking is not heard until attention is specifically drawn to it. Spectral masking is an example of a more complex psychoacoustic phenomenon, whereby loud sounds over a certain frequency band mask the presence of other quieter sounds with similar frequencies. Spectral masking is now widely exploited to allow data compression of music, such as in PASC, Musicam and ATRAC. See also Adaptive Transform Acoustic Coding, Audiology, Auditory Filters, Beat Frequencies, Binaural Beats, Binaural Unmasking, Equal Loudness Contours, Habituation, Lateralization, Monaural Beats, Precedence Effect, Perceptual Audio Coding, Precision Adaptive Subband Coding (PASC), Sound Pressure Level, Sound Pressure Level Weighting Curves, Spectral Masking, Temporal Masking, Temporary Threshold Shift, Threshold of Hearing.

Psychoacoustic Model: A model of the human hearing mechanism relating the human perception of different sounds to the actual sounds being played. For example, a psychoacoustic model of the phenomenon known as spectral masking has been realized and used to facilitate data compression techniques for the digital compact cassette (DCC) and for the mini-disc (MD). See also Psychoacoustics, Precision Adaptive Subband Coding (PASC), Spectral Masking, Temporal Masking, Threshold of Hearing.

Ptolemy: An object oriented framework for the design, testing and simulation of discrete event and DSP systems. Ptolemy is available from the University of California, Berkeley.

Pulse Amplitude Modulation (PAM): PAM is a term generally used to refer to communication via a sequence of analog values, such as would be needed to send the voltages corresponding to a sampled but not quantized analog signal. When the set of values the samples can take on is finite, the term Amplitude Shift Keying (ASK) is usually used to denote this digital modulation technique.
However, PAM is sometimes used interchangeably with ASK. See also Sampling, Amplitude Shift Keying.

Pulse Code Modulation (PCM): If an analog waveform is sampled at a suitable frequency, then each sample can be quantized to a value represented by a binary code (often 2's complement). The number of bits in the binary code defines the voltage quantization level, and the sampling rate should be at least twice the maximum frequency component of the signal (the Nyquist rate). See also Analog to Digital Converter, Digital to Analog Converter. See figure after Pulse Width Modulation.

Pulse Position Modulation (PPM): If an analog waveform is sampled at a suitable frequency, then the value of each sample can be represented by a single pulse whose position within the sample period is proportional to the analog value of the sample. Signals received in PPM can be converted back to analog by comparing the pulses with a sawtooth waveform: when a pulse is detected, the level of the sawtooth at that time represents the analog value. The earlier a pulse is detected, the lower the analog value. See figure after Pulse Width Modulation.

Pulse Train: A periodic train of single unit pulses. Pulse trains with a period equalling human voice pitch are used as excitation in vocoding (voice coding) schemes such as linear predictive coding (LPC). See Linear Predictive Coding, Square Wave.

Pulse Width Modulation (PWM): PWM is similar to Pulse Position Modulation except that the information is coded as the width of a pulse rather than its position in the symbol period. The pulse width is proportional to the analog value of that sample. The analog signal can be recovered by integrating the pulses.
A sampled waveform with quantized sample values 4, 1, 5 and 6 can be represented in Pulse Width Modulation form (pulse widths proportional to the sample values), in Pulse Position Modulation form (pulse positions proportional to the sample values), or in Pulse Code Modulation form (the binary codes 100, 001, 101 and 110).

Pythagorean (Music) Scale: Prior to the existence of the equitempered or Western music scale, a (major) musical key was formed using certain carefully chosen frequency ratios between adjacent notes, rather than the constant tone and semitone ratios of the modern Western music scale. The ancient C-major Pythagorean scale would have had the following frequency ratios:

C-major Scale:   C    D    E      F    G    A      B        C
Frequency ratio: 1/1  9/8  81/64  4/3  3/2  27/16  243/128  2/1

The frequency ratio gives the ratio of the fundamental frequency of the current note to that of the root note. The above ratios correspond to the Pythagorean music scale. Any note can be used to realise a Pythagorean major key or scale. However, using the Pythagorean scale it is difficult to form other major or minor keys without a complete retuning of the instrument. Instruments that are tuned and played using the Pythagorean scale will probably sound in some sense "ancient", as our modern appreciation of music is now firmly based on the equitempered Western music scale. See also Digital Audio, Just Scale, Music, Music Synthesis, Western Music Scale.

Q

Q Format: Representing binary numbers in the Q format ensures that all numbers have a magnitude between -1 and 1. The MSB of a Q15 number is the sign bit, and the bits following have bit values of 2^{-1} = 0.5, 2^{-2} = 0.25, 2^{-3} = 0.125, ..., 2^{-15} = 3.0517578 x 10^{-5}. The only difference between normal two's complement (binary point after the LSB) and Q format is the position of the binary point. The Q format is used in DSP to ensure that when two numbers are multiplied together their magnitude will always be less than 1. Therefore fixed point DSP processors can perform arithmetic without overflow.

QR: See Matrix Decompositions - QR.
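The Q15 arithmetic described in the Q Format entry can be illustrated with a short integer simulation of a 16 bit fixed point processor (the helper names are illustrative only, not part of any particular DSP tool):

```python
def to_q15(x):
    """Quantise a real value in [-1, 1) to a Q15 two's complement integer."""
    return max(-32768, min(32767, int(round(x * 32768))))

def from_q15(n):
    """Interpret a Q15 integer as the real value it represents."""
    return n / 32768.0

def q15_mul(a, b):
    """Q15 x Q15 -> Q15: the product of two Q15 values is a Q30 value,
    so the 32 bit product is shifted right by 15 to return to Q15."""
    return (a * b) >> 15
```

For example, `q15_mul(to_q15(0.5), to_q15(0.5))` gives the Q15 representation of 0.25, and because both operands have magnitude below 1, so does the product, which is why no overflow can occur.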
QR Algorithm: A linear algebraic technique that implicitly forms an orthogonal matrix Q to transform a matrix A into an upper triangular matrix R, i.e. A = QR. The QR algorithm is numerically stable and can be used for solving linear sets of equations in a variety of DSP applications, from speech recognition to beamforming. The algorithm is, however, very computationally expensive and not often used for real time DSP. See Matrix Decompositions - QR.

Quad: A prefix to mean "four of". For example the Burr Brown DAC4814 chip is described as a Quad 12 Bit Digital to Analog Converter (DAC), meaning that the chip has four separate (or independent) DACs. See also Dual.

Quadraphonic (or Quadrophonic): Using four independent channels for the reproduction of hi-fidelity music. Quadraphonic systems were first introduced in the 1970s as an enhancement to the stereophonic system; however, their success was limited. In the 1990s surround sound systems such as Dolby Prologic use four and more channels to encode the sound with 3-dimensional effect. Quadraphonic systems are now rarely implemented, and the term is rarely used. Note that a system which simply uses four loudspeakers (two left channels and two right channels) is not quadraphonic. See also Stereophonic, Surround Sound, Dolby Prologic.

Quadratic Equation: A polynomial equation is a quadratic equation if it has the form ax^2 + bx + c = 0, where x is a variable and a, b and c are constants. Note that the quantity x may be a vector, with a, b and c appropriately dimensioned vectors and matrices. For example, in calculating the Wiener-Hopf solution the following equation must be solved:

x^T R x + p^T x + c = 0    (467)

where x is an n x 1 vector, R is an n x n matrix, p is an n x 1 vector and c is a scalar constant.
Quadratic Formula: Given a quadratic polynomial equation ax^2 + bx + c = 0, the roots of the polynomial can be calculated from:

x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}    (468)

such that:

a \left( x + \frac{b + \sqrt{b^2 - 4ac}}{2a} \right) \left( x + \frac{b - \sqrt{b^2 - 4ac}}{2a} \right) = ax^2 + bx + c    (469)

Geometrically, the roots of the polynomial are where the graph of y = ax^2 + bx + c (which is parabolic in shape) cuts the x-axis; for example, the graph of y = x^2 - x - 2 cuts the x-axis at x = -1 and x = 2. Note that if the graph does not cut the x-axis, then the quantity b^2 - 4ac is negative, its square root is an imaginary number, and the roots are then complex numbers. See also Complex Roots, Poles, Polynomial, Zeroes.

Quadratic Surface: See Hyperparaboloid.

Quadrature: This term is used in reference to the four quadrants defined in two dimensions. Quadrature representations are particularly useful in communications because the cosine and sine components of a single frequency can be thought of as the two axes in the complex plane. By representing signals via in-phase (cosine) and quadrature (sine) components, all of the tools of complex number analysis are available to simplify the analysis and design of digital signal sets.

Quadrature Amplitude Modulation (QAM): When both the amplitude and the phase of a quadrature (two dimensional) signal set are varied to encode the information bits in a digital communication system, the modulation technique is often referred to as QAM. Common examples are rectangular signal sets defined on a two-dimensional Cartesian lattice, such as 16 QAM (4 bits per symbol), 32 QAM (5 bits per symbol), and 64 QAM (6 bits per symbol). QAM modulation techniques are used for many modem communication standards. See also V-Series Recommendations, Amplitude Shift Keying, Phase Shift Keying.
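The quadratic formula of Eq. 468, including the complex root case, can be sketched in a few lines of Python; `cmath.sqrt` returns a complex result when the discriminant b^2 - 4ac is negative:

```python
import cmath

def quadratic_roots(a, b, c):
    """Roots of a x^2 + b x + c = 0 via the quadratic formula (Eq. 468).
    cmath.sqrt yields complex roots when b^2 - 4ac < 0."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)
```

For the parabola y = x^2 - x - 2 discussed above, `quadratic_roots(1, -1, -2)` returns the roots 2 and -1, while `quadratic_roots(1, 0, 1)` (for y = x^2 + 1, which never cuts the x-axis) returns the complex pair +j and -j.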
Quadrature Mirror Filters (QMF): A type of digital filter which has special properties making it suitable for sub-band coding filters.

Quadrature Phase Shift Keying (QPSK): QPSK is a common digital modulation (phase shift keying) technique that uses four signals (symbols) that have equal amplitude and are successively shifted by 90 degrees in phase. See also Phase Shift Keying, Quadrature.

Quantization: Converting from a continuous value into a series of discrete levels. For example, a real value can be quantized to its nearest integer value (rounding), and the resulting error is referred to as the quantization error. The quantization error therefore reflects the accuracy of an ADC: the binary output follows a staircase characteristic against the analog input, with step size equal to the quantization level, q. Quantization introduces an irreversible distortion on an analogue signal. Quantizers are found somewhere at the heart of every lossy compression algorithm. In JPEG, for example, the quantizer appears when the DCT coefficients for an image block are quantized. See also Analog to Digital Converter, A-law Compander, Sample and Hold.

Quantization Error: The difference between the true value of a signal and the discrete value from the A/D at that particular sampling instant. If the quantization level is q volts, then the maximum error at each sample is q/2 volts. If an analog value x is quantized, it is convenient to represent the quantized value as the sum of the true analog value and a quantization error component, e, i.e. \hat{x} = x + e, where \hat{x} is the quantized value of x. See also Rounding Noise, Truncation Noise.

Quantization Noise: Assuming that an ADC rounds to the nearest digital level, the maximum quantisation error of any one sample is q/2 volts (see Quantization Error). If we assume that the error is equally likely to take any value between +q/2 and -q/2, then the probability density function for the error is flat.
The error PDF is therefore p(e) = 1/q for -q/2 \leq e \leq q/2, and zero otherwise. Treating the error as white noise, we can calculate the noise power of the error as:

n_{adc} = \frac{1}{q} \int_{-q/2}^{q/2} e^2 \, de = \frac{q^2}{12}    (470)

The quantisation noise will extend over the frequency range 0 to f_s/2, i.e. the full baseband. Low level signals may therefore be masked by the quantisation noise. Although it is assumed that the quantisation noise is uncorrelated with the signal, in practice for periodic signals this is not strictly true, and therefore the flat white spectrum is not strictly true either.

For an N-bit quantiser with voltage input levels between -1 and +1 volts, there are 2^N levels from the maximum to the minimum value, giving a quantization step size of q = 2/2^N. Therefore the mean square value of the quantisation noise power can be calculated as:

Q_N = 10 \log \frac{(2/2^N)^2}{12} = 10 \log 2^{-2N} + 10 \log \frac{4}{12} \approx -6.02N - 4.77 \ \textrm{dB}    (471)

Another useful measurement is the signal to quantisation noise ratio (SQNR). For the above ADC with voltage input levels between -1 and +1 volts, if the input signal is the maximum possible, i.e. a sine wave of amplitude 1 volt, then the average input signal power is:

\textrm{Signal Power} = E[\sin^2 2\pi f t] = \frac{1}{2}    (472)

Therefore the maximum SQNR is:

\textrm{SQNR} = 10 \log \frac{\textrm{Signal Power}}{\textrm{Noise Power}} = 10 \log \frac{0.5}{(2/2^N)^2 / 12} = 10 \log 2^{2N} + 10 \log \frac{3}{2} = 6.02N + 1.76 \ \textrm{dB}    (473)

For a perfect 16 bit ADC the maximum SQNR can therefore be calculated to be 98.08 dB. See also A-law compression, Signal to Noise Ratio.

Quantisation Noise, Reduction by Oversampling: Oversampling can be used to increase the resolution of an ADC or DAC.
If an ADC has a step size of q volts (see Quantisation Error) and a Nyquist sampling rate of f_n, then the maximum error, e(n), of a quantised sample is between -q/2 and q/2. Therefore, if the true sample value is x(n), then the quantised sample, y(n), is:

y(n) = x(n) + e(n)    (474)

If we assume that the quantisation error is equally likely to take any value in this range (i.e. it is white), then we can assume that the probability density function for the noise signal is uniform. Therefore the average quantisation noise power in the range 0 to f_n/2 can be calculated as the average squared value of e:

Q_N = \frac{1}{q} \int_{-q/2}^{q/2} e^2 \, de = \frac{1}{q} \left[ \frac{e^3}{3} \right]_{-q/2}^{q/2} = \frac{q^2}{12}    (475)

The same answer could be obtained from the time average:

Q_N \approx \frac{1}{M} \sum_{m=0}^{M-1} e^2(m) = \frac{q^2}{12}    (476)

In order to appreciate that the total quantisation noise does not decrease, note that the same approximate answer is obtained for a signal that is oversampled by R times:

Q_N \approx \frac{1}{MR} \sum_{r=0}^{MR-1} e^2(r) = \frac{q^2}{12}    (477)

For an oversampled system sampling at f_ovs and using the same converter, the total quantisation noise power will of course be the same, but because it is white (a flat spectrum) it is now spread over the range 0 to f_ovs/2. Evaluating Eqs. 475 or 476 for different sampling rates will give the same answer. The actual noise power in the baseband, Q_ovs, is now given as:

Q_{ovs} = \frac{q^2}{12} \cdot \frac{f_n / 2}{f_{ovs} / 2}    (478)

(Note that for the more common periodic and aperiodic signals, the quantisation noise spectrum is not "white"; however, for a "noisy" stochastic input signal the white quantisation noise assumption is "reasonably" valid.) From Eq. 478, in order to increase the baseband signal to quantisation noise ratio we can either increase the number of bits in the ADC or increase the sampling rate above the Nyquist rate.
By increasing the sampling rate, the total quantisation noise power does not increase, and as a result the in-band quantisation noise power will decrease. As an example, oversampling a signal by a factor of 4 times the Nyquist rate reduces the in-band quantisation noise to 1/4 of its former value, i.e. Q_ovs = Q_N/4: when a signal is oversampled the total level of quantisation noise does not change, so for every increase in sampling rate above Nyquist the baseband quantisation noise power will reduce. This level of baseband noise power is equivalent to that of an ADC with step size q/2:

Q_{ovs} = \frac{q^2 (f_n/2)}{12 (4 f_n / 2)} = \frac{Q_N}{4} = \frac{(q/2)^2}{12}    (479)

and hence the baseband signal resolution has been increased by 1 bit. For each extra bit of resolution the signal to quantisation noise ratio improves by 20 \log 2 = 6.02 dB. In theory, therefore, if a single bit ADC were oversampled by a factor of 4^{15} (\approx 10^9 \times f_n), then a 16 bit resolution signal could be realized! Clearly this sampling rate is not practically realisable. On a more pragmatic level, however, if a well trimmed, low noise floor 8 bit ADC were used to oversample a signal by a factor of 16 times the Nyquist rate, then, after using a digital low pass filter to decimate the signal to the Nyquist rate, approximately 10 bits of resolution could be obtained. Single bit oversampling ADCs can however be achieved using quantisation noise shaping strategies within sigma delta converters (see Sigma Delta).

To illustrate increasing the signal resolution by oversampling, consider the result of a simulation quantising a high resolution floating point white noise digital signal in the amplitude range -1 to +1 to 4 bits (i.e. 16 levels in the range -1 to +1), using a digital quantiser to simulate an ADC.
The bandwidth of interest is 0-5000 Hz, and hence the Nyquist rate is f_n = 10000 Hz; oversampling at 16 times gives f_ovs = 160000 Hz and should yield two "extra bits" of resolution. Comparing the quantisation noise spectra of the Nyquist rate and oversampled rate quantisers (ADCs) then reveals the expected 12 dB advantage of the oversampling strategy: the oversampling procedure produces a level of in-band quantisation noise that is 12 dB below that of the Nyquist rate quantiser. In this simulation the magnitude spectra were produced from a 1024 point FFT of the quantisation noise, smoothed by a window of length 8; the input white noise signal was 16384 samples long. See also Decimation, Interpolation, Oversampling, Quantization, Sigma Delta Converter.

Quarter Common Intermediate Format (QCIF): The QCIF image format is 144 lines by 176 pixels/line of luminance and 72 x 88 of chrominance information, and is used in the ITU-T H.261 digital video recommendation. A full size version of QCIF called CIF (Common Intermediate Format) is also defined in H.261. The choice between CIF or QCIF depends on available channel capacity and desired quality. See also Common Intermediate Format, H-series Recommendations, International Telecommunication Union.

Quicksilver: A versatile, if difficult to find, software package.

Quicktime: A proprietary algorithm for video compression using very low levels of processing to allow real time implementation in software on Macintosh computers [79]. Quicktime does not achieve the picture quality of techniques such as MPEG1. See also MPEG1.
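Returning to quantisation noise and oversampling: the relationships of Eqs. 470, 476 and 478 can be checked with a small simulation. This is only a sketch, not the book's FFT-based experiment: a white random input makes the quantisation error genuinely white, and an R point moving average stands in for the ideal digital low pass filter.

```python
import random

random.seed(1)
nbits, R = 4, 16                      # 4 bit quantiser, oversampling ratio 16
q = 2.0 / (1 << nbits)                # step size for a -1..+1 V quantiser

# Quantisation error of a white input is uniform on (-q/2, q/2)
e = []
for _ in range(50000):
    x = random.uniform(-1.0, 1.0)
    e.append(x - round(x / q) * q)    # error of a rounding quantiser

# Total quantisation noise power, cf. Eq. 476: ~ q^2/12
total = sum(v * v for v in e) / len(e)

# An R point moving average (a crude low pass filter) keeps roughly 1/R
# of the white noise power, cf. Eq. 478
lp = [sum(e[i:i + R]) / R for i in range(len(e) - R)]
inband = sum(v * v for v in lp) / len(lp)
```

With R = 16 the ratio `total/inband` is close to 16, i.e. the expected 10 log 16 = 12 dB reduction of in-band quantisation noise.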
R

Ramp Waveform (Continuous and Discrete Time): The continuous ramp waveform can be defined as:

ramp((t - t_0)/\tau) = { (t - t_0)/\tau  if 0 \leq (t - t_0) < \tau;  0 otherwise }    (continuous time)    (480)

i.e. the continuous ramp r(t) = ramp((t - t_0)/\tau) rises linearly from 0 at t = t_0 to 1 at t = t_0 + \tau. The discrete time ramp waveform can be defined as:

ramp((k - k_0)/\kappa) = { (k - k_0)/\kappa  if 0 \leq (k - k_0) < \kappa;  0 otherwise }    (discrete time)    (481)

See also Elementary Signals, Rectangular Pulse, Sawtooth Waveform, Square Wave, Triangular Pulse, Unit Impulse Function, Unit Step Function.

Random Access Memory (RAM): Digital memory which can be used to read or write binary data to. RAM is usually volatile, meaning that it loses information when the power is switched off. Non-volatile RAM is available. See also Non-Volatile, Static RAM, Dynamic RAM.

Random Variable: A random variable is a real valued function which is defined based on the outcomes of a probabilistic system. For example, a die can be used to create a signal based on the random variable of the die outcome. The probabilistic event is the shaking of the die, where each independent event is denoted by k, and there are 6 equally likely outcomes. A particular random variable x(.) can be defined by the following table:

Die Event | Random Variable x(.) | p(x(.))
1         | -15                  | 1/6
2         | -10                  | 1/6
3         | -5                   | 1/6
4         | +5                   | 1/6
5         | +10                  | 1/6
6         | +25                  | 1/6

The time average of the resulting random signal x(k), denoted as \bar{x}, can be calculated as:

\bar{x} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N} x(k) = 1.6666\ldots    (482)

The statistical mean, denoted as E[x(k)], where E[.] is the expectation operator, can be calculated as:

E[x(k)] = \sum_x p(x) x = 25 \cdot \frac{1}{6} + 10 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} - 5 \cdot \frac{1}{6} - 10 \cdot \frac{1}{6} - 15 \cdot \frac{1}{6} = 1.6666\ldots    (483)

The time average mean squared value, denoted as \bar{x^2}, can be calculated as:

\bar{x^2} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N} x^2(k) = 183.333\ldots    (484)

The statistical average squared value, denoted as E[x^2(k)], can be calculated from:

E[x^2(k)] = \sum_x p(x) x^2 = 625 \cdot \frac{1}{6} + 100 \cdot \frac{1}{6} + 25 \cdot \frac{1}{6} + 25 \cdot \frac{1}{6} + 100 \cdot \frac{1}{6} + 225 \cdot \frac{1}{6} = 183.333\ldots    (485)

If the random process generating x(k) is ergodic, then the statistical averages equal the time averages, i.e. \bar{x} = E[x(k)] and \bar{x^2} = E[x^2(k)].

For a particular random variable, x(k), a cumulative distribution function can be specified, where:

F(a) = P(x(k) \leq a) = \lim_{n \to \infty} \frac{\textrm{no. of values of } x(k) \leq a}{\textrm{total no. of values, } n}    (486)

i.e., F(a) specifies the probability that the value x(k) is less than or equal to a. For the above random variable, the cumulative distribution function is a staircase rising in steps of 1/6 at each of the values -15, -10, -5, +5, +10 and +25. The probability density function (PDF) is defined as:

p(x) = \left. \frac{dF(a)}{da} \right|_{a=x} = \left. \frac{dP(x \leq a)}{da} \right|_{a=x}    (487)

where the "(k)" has been dropped for notational convenience. The PDF for the random variable x(k) produced by the probabilistic events of a die shake therefore consists of dirac-delta functions of weight 1/6 located at the discrete values of the random variable, and the total area under the graph of p(x) is 1.

The above distributions are discrete, in that the random variable can only take on specific values: the distribution function increases in steps, and the PDF consists of dirac delta functions. There also exist continuous distributions, where the random variable can take on any real number within some range. For example, consider a continuously distributed random variable which denotes the exact voltage measured from a 9 volt battery. By measuring the voltage of a large number of batteries, a random variable y(.) denoting the battery voltages can be produced, and for a particular batch of a few thousand batteries a (continuous) cumulative distribution function F(a) and PDF p(y) can be obtained. If, for example, it is required to calculate the probability of a battery having a voltage between 6 and 7 volts, then the area under the PDF between y values of 6 and 7 can be calculated, or the appropriate values of the distribution function subtracted:

P(6 < y \leq 7) = \int_6^7 p(y) \, dy = F(7) - F(6) = 0.14    (488)

In DSP, signals with both discrete and continuous distributions are found. For example, thermal noise is a continuously distributed signal, whereas the sequence of character symbols typically sent by a modem has a discrete distribution. Some important discrete distributions in DSP are:

• Binomial;
• Poisson.

Some important continuous probability density functions in DSP are:

• Gaussian: p(x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-(x - \mu)^2 / 2\sigma^2}, with mean E[x] = \mu and variance E[(x - \mu)^2] = \sigma^2.

• Uniform: p(x) = 1/A for |x - \mu| \leq A/2 and 0 otherwise, with mean E[x] = \mu and variance E[(x - \mu)^2] = A^2/12.

The n-th moment of a PDF taken about the point x = x_0 is:

E[(x - x_0)^n] = \int_{-\infty}^{\infty} (x - x_0)^n p(x) \, dx    (489)

The second order moment around the mean, E[(x - E[x])^2], is called the variance or the second central moment.
See also Ergodic, Expected Value, Mean Value, Mean Squared Value, Probability, Variance, Wide Sense Stationarity.

Range of Matrix: See Matrix Properties - Range.

Rank of Matrix: See Matrix Properties - Rank.

Rate Converter: Usually referring to the change of the sampling rate of a signal. See Decimation, Downsampling, Fractional Sampling Rate Converter, Interpolation, Upsampling.

RBDS: An FM data transmission standard that allows radio stations to send traffic bulletins, weather reports, song titles or other information to a display on RBDS compatible radios. Radios will therefore be able to scan for a particular type of music. For emergency broadcasting, an RBDS signal can automatically turn on a radio, turn up the radio volume and issue an emergency alert.

RC Circuit: The very simplest form of analog low pass or high pass filter used in DSP systems. The 3dB point is at f_{3dB} = 1/(2\pi RC). An RC circuit is only suitable as a (low pass) anti-alias filter when the sampling frequency is considerably higher than the highest frequency present in the input signal; this is usually only the case for oversampled DSP systems, where the anti-alias process is primarily performed digitally. The roll-off for a simple low pass RC circuit is 6dB/octave, or 20dB/decade when plotted on a logarithmic frequency scale. An RC circuit can also be used as a differentiator, noting that the current through a capacitor is limited by the rate of change of the voltage across the capacitor:

i = C \frac{dV}{dt}    (490)

See also 3dB point, Decade, Differentiator, Logarithmic Frequency, Octave, Oversampling, Roll-off, Sigma Delta.
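The RC circuit's 3dB point and roll-off can be sketched numerically (the component values below are illustrative only):

```python
import math

R, C = 1e3, 100e-9                    # example values: 1 kOhm, 100 nF
f3db = 1 / (2 * math.pi * R * C)      # 3dB point, ~1591.5 Hz

def lp_gain_db(f):
    """Low pass RC magnitude response in dB: 1/sqrt(1 + (f/f3dB)^2)."""
    return 20 * math.log10(1 / math.sqrt(1 + (f / f3db) ** 2))

def hp_gain_db(f):
    """High pass RC magnitude response in dB: (f/f3dB)/sqrt(1 + (f/f3dB)^2)."""
    return 20 * math.log10((f / f3db) / math.sqrt(1 + (f / f3db) ** 2))
```

Evaluating `lp_gain_db` at f_{3dB} gives -3.01 dB, and the gain drops a further ~20 dB for every decade of frequency above f_{3dB}, confirming the quoted roll-off.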
Low Pass RC Filter (series R, shunt C): the magnitude response is

\frac{V_{out}}{V_{in}} = \frac{1}{\sqrt{1 + 4\pi^2 R^2 C^2 f^2}} = \frac{1}{\sqrt{1 + (f / f_{3dB})^2}}

which falls from unity gain at DC, through -3dB at f_{3dB}, and rolls off at 20dB/decade thereafter.

High Pass RC Filter (series C, shunt R): the magnitude response is

\frac{V_{out}}{V_{in}} = \frac{2\pi f R C}{\sqrt{1 + 4\pi^2 R^2 C^2 f^2}} = \frac{f / f_{3dB}}{\sqrt{1 + (f / f_{3dB})^2}}

which rises at 20dB/decade up to the -3dB point at f_{3dB}, and tends to unity gain above it.

Read Only Memory (ROM): Digital memory to which data cannot be written. ROM also retains information even when the power is switched off.

Reasoning, Circular: See Circular Reasoning.

Real Exponential: See Exponential, Complex Exponential.

Real Time Processing: Real time is the expression used to indicate that a signal must be processed and output again without any noticeable delay. For example, consider speech being sensed by a microphone before being sampled by a DSP system. Suppose it is required to filter out the low frequencies of the speech before sending the data down a telephone line. The filtering must be done in real time, otherwise new samples of data will arrive before the DSP system has finished its calculations on the previous ones! Systems that do not operate in real time are often referred to as off-line. See also Off-Line Processing.

Reciprocal Polynomial: Consider the polynomial:

H(z) = a_0 + a_1 z^{-1} + \ldots + a_{N-1} z^{-N+1} + a_N z^{-N}    (491)

The reciprocal polynomial is given by:

H_r(z) = a_N^* + a_{N-1}^* z^{-1} + \ldots + a_1^* z^{-N+1} + a_0^* z^{-N}    (492)

where a_i^* is the complex conjugate of a_i.
The polynomials are so called because the reciprocals of the zeroes of H(z) are the zeroes of H_r(z). If H(z) factorises to:

H(z) = (1 - \alpha_1 z^{-1})(1 - \alpha_2 z^{-1}) \ldots (1 - \alpha_{N-1} z^{-1})(1 - \alpha_N z^{-1})    (493)

then the zeroes of the reciprocal polynomial are \alpha_1^{-1}, \alpha_2^{-1}, \ldots, \alpha_{N-1}^{-1}, \alpha_N^{-1}, which can be seen from:

H_r(z) = z^{-N} H(z^{-1}) = z^{-N} (1 - \alpha_1 z)(1 - \alpha_2 z) \ldots (1 - \alpha_N z)
       = (z^{-1} - \alpha_1)(z^{-1} - \alpha_2) \ldots (z^{-1} - \alpha_N)
       = (-1)^N \alpha_1 \alpha_2 \ldots \alpha_N \, (1 - \alpha_1^{-1} z^{-1})(1 - \alpha_2^{-1} z^{-1}) \ldots (1 - \alpha_N^{-1} z^{-1})    (494)

Reciprocal polynomials are of particular relevance to the design of all pass filters. See All-pass Filter, Finite Impulse Response, Order Reversed Filter.

Reconstruction Filter: The analog filter at the output of a DAC to remove the high frequencies present in the signal (in the form of the steps between the discrete levels of the signal). The steppy output voltage from the digital to analog converter has a magnitude spectrum containing images of the baseband spectrum centred at multiples of the sampling frequency f_s; the (low pass) reconstruction filter, with cut-off around f_s/2, smooths out the high frequency steps, leaving the reconstructed analog signal.

Rectangular Matrix: See Matrix Structured - Rectangular.
Rectangular Pulse (Continuous and Discrete Time): The continuous time rectangular pulse can be defined as:

  rect((t − t₀)/τ) = 1 if |t − t₀| < τ/2, and 0 otherwise    (495)

[Figure: the continuous rectangular pulse p(t) = rect((t − t₀)/τ), of unit height between t₀ − τ/2 and t₀ + τ/2.]

The discrete time rectangular pulse can be defined as:

  rect((k − k₀)/κ) = 1 if |k − k₀| < κ/2, and 0 otherwise    (496)

[Figure: the discrete rectangular pulse q(k) = rect((k − k₀)/κ), of unit height between k₀ − κ/2 and k₀ + κ/2.]

A rectangular pulse can also be generated by the addition of unit step functions. The unit step function is defined as:

  u(k − k₀) = 0 if k < k₀, and 1 if k ≥ k₀    (497)

[Figure: a discrete rectangular pulse spanning samples k = 4 to 9, expressed as the difference of two unit steps, x(k) = u(k − 4) − u(k − 10).]

A rectangular pulse train, or square wave, can be produced by repeating a rectangular pulse in a non-overlapping fashion. See also Elementary Signals, Square Wave, Triangular Pulse, Unit Step Function.

Rectangular Pulse Train: See Square Wave.

Recursive LMS: See Least Mean Squares IIR Algorithms.

Red Book: The specifications for the compact disc (CD) digital audio format were jointly specified by Sony and Philips and are documented in what is known as the Red Book. The standards for CD are also documented in the IEC (International Electrotechnical Commission) standards BNN15-83095, IEC-958 and IEC-908.

Reed Solomon Coding: See Cross Interleaved Reed Solomon Coding.

Recruitment: See Loudness Recruitment.

Recursive Least Squares (RLS): The RLS algorithm can also be used to update the weights of an adaptive filter where the aim is to minimize the sum of the squared error signal.
Consider the adaptive FIR digital filter which is to be updated using an RLS algorithm such that as new data arrives the RLS algorithm uses this new data (the innovation) to improve the least squares solution:

[Figure: adaptive filter block diagram. The input signal x(k) drives the adaptive filter w(k) to produce the output y(k), which is subtracted from the desired signal d(k) to give the error e(k) driving the RLS adaptive algorithm: y(k) = Filter{x(k), w(k)}, w_{k+1} = w_k + e(k)f{d(k), x(k)}.]

For least squares adaptive signal processing the aim is to adapt the impulse response of the FIR digital filter such that the input signal x(k) is filtered to produce y(k), which, when subtracted from the desired signal d(k), minimises the sum of the squared error signal e(k) over time from the start of the signal at 0 (zero) to the current time k. Note: while the above figure is reminiscent of the Least Mean Squares (LMS) adaptive filter, the distinction between the two approaches is quite important: LMS minimizes the mean of the square of the output error, while RLS minimizes the actual sum of the squared output errors.

In order to minimize the error signal, e(k), consider minimizing the total sum of squared errors for all input signals up to and including time, k. The total squared error, v(k), is:

  v(k) = Σ_{s=0}^{k} [e(s)]² = e²(0) + e²(1) + e²(2) + … + e²(k)    (498)

Using vector notation, the error signal can be expressed in a vector format and therefore:

  e_k = [e(0), e(1), e(2), …, e(k−1), e(k)]^T = d_k − y_k    (499)

where d_k and y_k are the corresponding vectors of desired and output samples. Noting that the output of the N weight adaptive FIR digital filter is given by:

  y(k) = Σ_{n=0}^{N−1} w_n x(k−n) = w^T x_k = x_k^T w    (500)

where

  w = [w₀, w₁, w₂, …, w_{N−1}]^T    (501)

and

  x_k = [x(k), x(k−1), x(k−2), …, x(k−N+1)]^T    (502)

then Eq. 499 can be rearranged to give:

  e_k = d_k − [x₀, x₁, x₂, …, x_{k−1}, x_k]^T w = d_k − X_k w    (503)

where X_k is a (k+1) × N data matrix made up from input signal samples, with row s equal to x_s^T and samples before time zero taken as zero:

  X_k = [ x(0)     0        0        …  0
          x(1)     x(0)     0        …  0
          x(2)     x(1)     x(0)     …  0
          :        :        :           :
          x(k−1)   x(k−2)   x(k−3)   …  x(k−N)
          x(k)     x(k−1)   x(k−2)   …  x(k−N+1) ]

Note that the first N−1 rows of X_k are sparse. Equation 498 can be rewritten such that:

  v(k) = e_k^T e_k = ‖e_k‖₂²
       = [d_k − X_k w]^T [d_k − X_k w]
       = d_k^T d_k + w^T X_k^T X_k w − 2d_k^T X_k w    (504)

where ‖e_k‖₂ is the 2-norm of the vector e_k. From a first glance at the last line of Eq. 503 it may seem that a viable solution is to set e_k = 0 and then simply solve the equation w = X_k⁻¹d_k. However this is of course not possible in general, as X_k is not a square matrix and therefore not invertible. In order to find a “good” solution such that the 2-norm of the error vector, e_k, is minimized, note that Eq. 504 is quadratic in the vector w, and the function v(k) is an up-facing hyperparaboloid when plotted in N+1 dimensional space; there exists exactly one minimum point at the bottom of the hyperparaboloid where the gradient vector is zero, i.e.,

  ∂v(k)/∂w = 0    (505)

From Eq. 504:

  ∂v(k)/∂w = 2X_k^T X_k w − 2X_k^T d_k = −2X_k^T [d_k − X_k w]    (506)

and therefore:

  −2X_k^T [d_k − X_k w_LS] = 0  ⇒  X_k^T X_k w_LS = X_k^T d_k    (507)

and the least squares solution, denoted as w_LS and based on data received up to and including time, k, is given as:

  w_LS = [X_k^T X_k]⁻¹ X_k^T d_k    (508)

Note that because [X_k^T X_k] is a symmetric square matrix, then [X_k^T X_k]⁻¹ is also a symmetric square matrix.
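The batch least squares solution of Eq. 508 can be verified on synthetic data. A minimal sketch (the data, filter length and weight values are assumptions for illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical system identification problem: k+1 = 100 samples, N = 4 weights
N, K = 4, 100
x = rng.standard_normal(K)
w_true = np.array([0.5, -0.3, 0.2, 0.1])

# Build the (k+1) x N data matrix X_k of Eq. 503: row s = [x(s), ..., x(s-N+1)],
# with samples before time zero taken as zero (hence the sparse first rows)
X = np.zeros((K, N))
for s in range(K):
    for n in range(N):
        if s - n >= 0:
            X[s, n] = x[s - n]

d = X @ w_true                          # noise-free desired signal

# Normal equations of Eq. 507/508: solve (X^T X) w_LS = X^T d
w_ls = np.linalg.solve(X.T @ X, X.T @ d)
print(np.allclose(w_ls, w_true))        # True
```

In a noise-free problem with a full-rank data matrix the least squares solution recovers the generating weights exactly, which makes Eq. 508 easy to sanity-check.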
As with any linear algebraic manipulation a useful check is to confirm that the matrix dimensions are compatible, thus ensuring that w_LS is an N × 1 vector:

[Figure: dimensional check of Eq. 508. [X_k^T X_k]⁻¹ is N × N, X_k^T is N × (k+1) and d_k is (k+1) × 1, so that w_LS = [X_k^T X_k]⁻¹X_k^T d_k is N × 1.]

Note that in the special case where X_k is a square non-singular matrix, Eq. 508 simplifies to:

  w_LS = X_k⁻¹ X_k^{−T} X_k^T d_k = X_k⁻¹ d_k    (509)

The computation to calculate Eq. 508 requires about O(N³) MACs (multiply/accumulates) and O(N) divides for the matrix inversion, and O((k+1) × N²) MACs for the matrix multiplications. Clearly therefore, the more data that is available, the more computation is required. At time iteration k+1, the weight vector to use in the adaptive FIR filter that minimizes the 2-norm of the error vector e_k can be denoted as w_{k+1}, and the open loop least squares adaptive filter solution can be represented as the block diagram:

[Figure: open loop least squares adaptive FIR filter. A tapped delay line with weights w₀ … w_{N−1} filters x(k) to give y(k), which is subtracted from d(k) to give e(k); the weights are set by w_{k+1} = [X_k^T X_k]⁻¹X_k^T d_k.]

Note however that at time k+1, when a new data sample arrives at both the input, x(k+1), and the desired input, d(k+1), this new information should ideally be incorporated in the least squares solution with a view to obtaining an improved solution. The new least squares filter weight vector to use at time k+2 (denoted as w_{k+2}) is clearly given by:

  w_{k+2} = [X_{k+1}^T X_{k+1}]⁻¹ X_{k+1}^T d_{k+1}    (510)

This equation requires that another full matrix inversion is performed, [X_{k+1}^T X_{k+1}]⁻¹, followed by the appropriate matrix multiplications. This very high level of computation for every new data sample provides the motivation for deriving the recursive least squares (RLS) algorithm. The RLS algorithm has a much lower level of computation because it calculates w_{k+1} using the result of the previous estimate w_k.
Consider the situation where we have calculated w_k from:

  w_k = [X_{k−1}^T X_{k−1}]⁻¹ X_{k−1}^T d_{k−1} = P_{k−1} X_{k−1}^T d_{k−1}    (511)

where

  P_{k−1} = [X_{k−1}^T X_{k−1}]⁻¹    (512)

When the new data samples, x(k) and d(k), arrive we have to calculate:

  w_{k+1} = [X_k^T X_k]⁻¹ X_k^T d_k = P_k X_k^T d_k    (513)

However note that P_k can be written in terms of the previous data matrix X_{k−1} and the data vector x_k by partitioning the matrix X_k into X_{k−1} stacked above the new row x_k^T:

  P_k = [X_k^T X_k]⁻¹ = [X_{k−1}^T X_{k−1} + x_k x_k^T]⁻¹ = [P_{k−1}⁻¹ + x_k x_k^T]⁻¹    (514)

where, of course, x_k = [x(k), x(k−1), …, x(k−N+1)]^T as before in Eq. 502. In order to write Eq. 514 in a more “suitable form” we use the matrix inversion lemma (see Matrix Properties - Inversion Lemma) which states that:

  [A⁻¹ + BCD]⁻¹ = A − AB[C⁻¹ + DAB]⁻¹DA    (515)

where A is a non-singular matrix and B, C and D are appropriately dimensioned matrices. Using the matrix inversion lemma of Eq. 515 on Eq. 514, with A = P_{k−1}, B = x_k, D = x_k^T and C the 1 × 1 identity matrix, i.e. the scalar 1, then:

  P_k = P_{k−1} − P_{k−1}x_k[1 + x_k^T P_{k−1}x_k]⁻¹x_k^T P_{k−1}    (516)

This equation implies that if we know the matrix [X_{k−1}^T X_{k−1}]⁻¹ then the matrix [X_k^T X_k]⁻¹ can be computed without explicitly performing a complete matrix inversion from first principles. This, of course, saves computation effort. Equations 513 and 516 are one form of the RLS algorithm.

By additional algebraic manipulation, the computational complexity can be reduced even further. Substituting Eq. 516 into Eq. 513, and partitioning the vector d_k as [d_{k−1}^T, d(k)]^T so that X_k^T d_k = X_{k−1}^T d_{k−1} + x_k d(k), gives:

  w_{k+1} = [P_{k−1} − P_{k−1}x_k[1 + x_k^T P_{k−1}x_k]⁻¹x_k^T P_{k−1}] X_k^T d_k
          = [P_{k−1} − P_{k−1}x_k[1 + x_k^T P_{k−1}x_k]⁻¹x_k^T P_{k−1}] [X_{k−1}^T d_{k−1} + x_k d(k)]    (517)

Using the substitution w_k = P_{k−1}X_{k−1}^T d_{k−1}, and dropping the time subscripts for notational convenience, i.e. P = P_{k−1}, x = x_k, X = X_{k−1}, D = d_{k−1} (the vector of past desired samples) and d = d(k) (the new scalar sample), further simplification can be performed:

  w_{k+1} = [P − Px[1 + x^TPx]⁻¹x^TP] (X^TD + xd)
          = PX^TD + Pxd − Px[1 + x^TPx]⁻¹x^TPX^TD − Px[1 + x^TPx]⁻¹x^TPxd
          = w_k − Px[1 + x^TPx]⁻¹x^Tw_k + Pxd − Px[1 + x^TPx]⁻¹x^TPxd
          = w_k − Px[1 + x^TPx]⁻¹x^Tw_k + Pxd[1 − [1 + x^TPx]⁻¹x^TPx]
          = w_k − Px[1 + x^TPx]⁻¹x^Tw_k + Px[1 + x^TPx]⁻¹[[1 + x^TPx] − x^TPx]d
          = w_k − Px[1 + x^TPx]⁻¹x^Tw_k + Px[1 + x^TPx]⁻¹d
          = w_k + Px[1 + x^TPx]⁻¹(d − x^Tw_k)    (518)

and reintroducing the subscripts, and noting that y(k) = x_k^T w_k:

  w_{k+1} = w_k + P_{k−1}x_k[1 + x_k^T P_{k−1}x_k]⁻¹(d(k) − y(k))
          = w_k + m_k(d(k) − y(k))
          = w_k + m_k e(k)    (519)

where m_k = P_{k−1}x_k[1 + x_k^T P_{k−1}x_k]⁻¹ is called the gain vector. The RLS adaptive filtering algorithm therefore requires that at each time step the vector m_k and the matrix P_k are computed; the filter weights are then updated using the error output, e(k). The block diagram for the closed loop RLS adaptive FIR filter is:

[Figure: closed loop RLS adaptive FIR filter. A tapped delay line with weights w₀ … w_{N−1} filters x(k) to give y(k), which is subtracted from d(k) to give e(k); the update equations are:
  w_{k+1} = w_k + m_k e(k)
  m_k = P_{k−1}x_k / [1 + x_k^T P_{k−1}x_k]
  P_k = P_{k−1} − m_k x_k^T P_{k−1} ]

The above form of the RLS requires O(N²) MACs and one divide on each iteration.
See also Adaptive Filtering, Least Mean Squares Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares - Exponentially Weighted.

Recursive Least Squares (RLS) - Exponentially Weighted: One problem with the least squares and recursive least squares (RLS) algorithms derived in the entry Recursive Least Squares is that the minimization of the 2-norm of the error vector e_k calculates the least squares vector at time k based on all previous data, i.e. data from long ago is given as much relevance as recently received data. Therefore if at some time in the past a block of “bad” data was received, or the input signal statistics changed, then the RLS algorithm will calculate the current least squares solution giving as much relevance to the old (and probably irrelevant) data as it does to very recent inputs. The RLS algorithm therefore has infinite memory. In order to overcome the infinite memory problem, the exponentially weighted least squares and exponentially weighted recursive least squares (EW-RLS) algorithms can be derived. Consider again Eq. 498, where this time each error sample is weighted using a forgetting factor constant λ which is just less than 1:

  v(k) = Σ_{s=0}^{k} λ^{k−s}[e(s)]² = λ^k e²(0) + λ^{k−1}e²(1) + λ^{k−2}e²(2) + … + e²(k)    (520)

For example if a forgetting factor of 0.9 was chosen then data which is 100 time iterations old is premultiplied by 0.9¹⁰⁰ = 2.6561 × 10⁻⁵ and thus considerably de-emphasized compared to the current data. In dB terms, data that is 100 time iterations old is attenuated by 10 log(2.6561 × 10⁻⁵) = −46 dB. Data that is 200 time iterations old is therefore attenuated by around 92 dB, and if the input data were 16 bit fixed point, corresponding to a dynamic range of 96 dB, then such old data is on the verge of being completely forgotten. The forgetting factor is typically a value of between 0.9 and 0.9999. Noting the form of Eq. 504, we can rewrite Eq. 520 as:

  v(k) = e_k^T Λ_k e_k    (521)

where Λ_k is a (k+1) × (k+1) diagonal matrix, Λ_k = diag[λ^k, λ^{k−1}, λ^{k−2}, …, λ, 1]. Therefore:

  v(k) = [d_k − X_k w]^T Λ_k [d_k − X_k w]
       = d_k^T Λ_k d_k + w^T X_k^T Λ_k X_k w − 2d_k^T Λ_k X_k w    (522)

Following the same procedure as for Eqs. 505 to 508, the exponentially weighted least squares solution is easily found to be:

  w_LS = [X_k^T Λ_k X_k]⁻¹ X_k^T Λ_k d_k    (523)

In the same way as the RLS algorithm was realised, we can follow the same approach as Eqs. 511 to 519 and realise the exponentially weighted RLS algorithm:

  w_{k+1} = w_k + m_k e(k)
  m_k = P_{k−1}x_k / [λ + x_k^T P_{k−1}x_k]
  P_k = [P_{k−1} − m_k x_k^T P_{k−1}] / λ    (524)

[Figure: block diagram of the exponentially weighted RLS adaptive FIR filter, identical in structure to the closed loop RLS filter but using the update equations of Eq. 524.]

Compared to the Least Mean Squares (LMS) algorithm, the RLS can provide much faster convergence and a smaller error; however the computation required is a factor of N more than for the LMS, where N is the adaptive filter length. The RLS is also less numerically robust than the LMS. For more detailed information refer to [77]. See also Adaptive Filtering, Least Mean Squares Algorithm, Least Squares, Noise Cancellation, Recursive Least Squares.

Reflection: Sound can be reflected when a sound wave reaches a propagation medium boundary, e.g. from air to brick (a wall). Some of the sound may be reflected, and the rest will either be absorbed (converted to heat) or transmitted through the medium. See also Absorption.

Register: A memory location inside a DSP processor, used for temporary storage of data. Access to the data in a register is very fast as no off-chip memory movements are required.
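The attenuation figures quoted in the Recursive Least Squares - Exponentially Weighted entry above can be reproduced directly; a small check (the numbers follow from that entry, nothing new is assumed):

```python
import math

# Weight applied by forgetting factor lam to squared-error data that is
# 'age' iterations old: lam**age, i.e. 10*log10(lam**age) dB (cf. Eq. 520)
lam = 0.9

w100 = lam ** 100
print(f"{w100:.4e}")                          # 2.6561e-05
print(round(10 * math.log10(w100)))           # -46  (dB, data 100 iterations old)
print(round(10 * math.log10(lam ** 200)))     # -92  (dB, data 200 iterations old)
```

The 92 dB figure for 200-iteration-old data is close to the 96 dB dynamic range of 16 bit fixed point data, which is why such data is described as being on the verge of being completely forgotten.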
Relative Error: The ratio of the absolute error (the difference between the true value and the estimated value) to the true value of a particular quantity is called the relative error. For example consider two real numbers x and y that will be represented to only one decimal place of precision:

  x = 1.345    (525)
  y = 1000.345    (526)

The rounded values, denoted as x′ and y′, will be given by:

  x′ = 1.3    (527)
  y′ = 1000.3    (528)

The absolute errors, Δx and Δy, caused by the rounding are the same for both quantities, and given by:

  Δx = x − x′ = 1.345 − 1.3 = 0.045    (529)
  Δy = y − y′ = 1000.345 − 1000.3 = 0.045    (530)

The relative error, however, is defined as the ratio of the absolute error to the correct value. Therefore the relative errors of x′ and y′ can be calculated as:

  Δx/x = 0.045/1.345 = 0.0334    (531)
  Δy/y = 0.045/1000.345 = 4.5 × 10⁻⁵    (532)

Relative error is often quoted as a percentage error. Therefore in the above example x′ represents a 3.34% error, whereas y′ represents only a 0.0045% error. Relative errors are widely used in error analysis calculations where the results of computations on estimated, rounded or truncated quantities can be predicted by manipulating only the relative errors. See also Absolute Error, Error Analysis.

Relative Pitch: The ability to name musical notes on the Western music scale when the name of one of the notes is first given is known as relative pitch. Relative pitch skills are relatively common among singers and musicians. The ability to identify any musical note with no clues is known as perfect or absolute pitch and is less common. See also Music, Perfect Pitch, Pitch, Western Music Scale.

Resistor-Capacitor Circuit: See RC Circuit.

Resolution: The accuracy to which a particular quantity has been converted.
If the resolution of a particular A/D converter is 10 mV then this means that every analog quantity is resolved to within 10 mV of its true value after conversion.

Resonance: When an object is vibrating at its resonant frequency it is said to be in resonance. See Resonant Frequency.

Resonant Frequency: All mechanical objects have a resonant or natural frequency at which they will vibrate if excited by an impulse. For example, striking a bell or other metal object will cause a ringing sound (derived from the vibrations) at the bell’s resonant or natural frequency. If a component is excited by vibrations at its resonant frequency then it will start to vibrate in synchrony, which can lead to vibrations of a very large magnitude. This is referred to as sympathetic vibration. For example, if a tone at the same frequency as a bell’s resonant frequency is played nearby, the bell will start to ring in unison at the same frequency. Music is derived from instruments’ vibrating strings, membranes and columns of air at their resonant frequencies.

Resource Interchange File Format (RIFF): RIFF is a proprietary format developed by IBM and Microsoft. RIFF essentially defines a set of file formats which are suitable for multimedia file handling (i.e. audio, video, and graphics):

• Playing back multimedia data;
• Recording multimedia data;
• Exchanging multimedia data between applications and across platforms.

A RIFF file is composed of a descriptive header identifying the type of data, the size of the data, and the actual data. Currently well known forms of RIFF file are:

• WAVE: Waveform Audio Format (.WAV files)
• PAL: Palette File Format (.PAL files)
• RDIB: RIFF Device Independent Bitmap Format (.DIB files)
• RMID: RIFF MIDI Format (.MID files)
• RMMP: RIFF Multimedia Movie File Format

RIFF files are supported by Microsoft Windows on the PC. (Note that there is also a counterpart to RIFF called RIFX that uses the Motorola integer byte ordering format rather than the Intel format.) See also Standards.

Return to Zero: See Non-Return to Zero.

Reverberation: The multitude of reflected versions of a particular sound that add to the direct path sound wave slightly later in time, due to their longer (reflected) transmission paths. Virtually all rooms have some level of reverberation (compare a carpeted office with an indoor swimming pool to contrast a room with a short reverberation time to one with a long reverberation time). More formally, the reverberation time of a room is defined as the time it takes a sound to fall to one millionth (a reduction of 60 dB) of its initial sound intensity.

Ringing Tone: Tones at 440 Hz and 480 Hz make up the ringing tone for telephone systems. See also Dial Tone, Dual Tone Multifrequency.

Ripple Adder: See Parallel Adder.

RISC: RISC (reduced instruction set computer) refers to a microprocessor that implements a smaller core of instructions than a complex instruction set computer (CISC) in order that the silicon area can be filled with more application appropriate facilities. Some designers refer to DSP processors as RISCs, whereas others note that RISCs are subtly different and lack features such as internal DMA, multiple interrupt pins, single cycle MACs, wide accumulators and so on. RISCs are designed to perform a wide range of general purpose instructions, unlike DSPs, which are optimized for MACs. Texas Instruments describe their TMS320C31 DSP chip as a hybrid DSP, with features of both RISC and CISC. Best not to worry!

RS232: A simple serial communications protocol. A few DSP boards use RS232 lines to communicate with the host computer. The ITU (formerly CCITT) adopted a related version of the RS232 cable which is specified in recommendation V24.
Robinson-Dadson Curves: Robinson and Dadson’s 1956 paper [126] studied the definition of sound intensity, the subjective loudness of human hearing, and associated audiometric measurements. They repeated elements of earlier work by Fletcher and Munson in 1933 [73] and produced a set of equal loudness contours which showed the variation in sound pressure level (SPL) of tones at different frequencies that are perceived as having the same loudness. See also Equal Loudness Contours, Frequency Range of Hearing, Loudness Recruitment, Sound Pressure Level, Threshold of Hearing.

Roll-off: Common filter types such as low pass, band pass, or high pass filters have distinct regions: the passband, transition band(s) and stopband(s). The region of increasing attenuation above the 3dB point, between the passband and the stopband, is referred to as the transition band. The rate at which the filter response decreases from passband to stopband is called the roll-off of the filter. The higher the roll-off, the closer the filter is to the ideal filter, which would have an infinite roll-off from passband to stopband.

The roll-off of a simple analog (single pole) RC circuit is 6dB/octave at frequencies above the cut-off frequency, f3dB (or 3dB point). If two RC circuits are cascaded together to realise a second order (two pole) filter then the roll-off at frequencies above the cut-off frequency will be 12dB/octave or 40dB/decade. (To attain better roll-off it is unlikely that passive RC circuits would be cascaded together; it is more likely that a higher order active filter would be used.)
In general an N-th order/pole cascaded RC filter (which will have at least N capacitors) has, at frequencies high above f3dB, a roll-off of:

  Roll-off = 20 log₁₀ [ 1 / √(1 + a₁(f/f3dB)² + … + (f/f3dB)^{2N}) ] ≈ −20N log₁₀(f/f3dB)    (533)

i.e. 6N dB/octave or 20N dB/decade. For applications such as analog anti-alias filters, Bessel, Butterworth or Chebychev filters with sharp cut-off frequencies and a hard knee at f3dB are required, and the roll-off rate should be at least the same as the dynamic range of the digital wordlength. For example, using an ADC with a 16 bit wordlength and dynamic range 20 log 2¹⁶ = 96dB, it would be advisable to use an anti-alias filter of at least 96dB/octave such that any frequency components above fs are completely removed. Note that even with this sharp cut-off some frequency components between fs/2 and fs will still alias down to the baseband if f3dB is chosen to equal fs/2. If less selective filters are available, it is generally necessary to set f3dB to less than fs/2 (or use oversampling techniques). See also Active Filter, Decade, Decibels, Filter (Bessel, Butterworth, Chebychev), Knee (of a filter), Logarithmic Frequency, Logarithmic Magnitude, Octave.

[Figure: roll-off of a simple RC circuit plotted as 20log₁₀(Vout/Vin) in dB. Against log₁₀(f/f3dB) the roll-off is 20dB/decade, compared with the infinite roll-off of an ideal filter; against log₂(f/f3dB) the roll-off is 6dB/octave, and 12dB/octave for a second order active filter.]
The magnitude transfer function of the simple RC circuit is given by:

  |Vout/Vin| = 1/√(1 + (f/f3dB)²), where f3dB = 1/(2πRC)

Round-Off Error: When two N bit numbers are multiplied together, the result is a number with 2N bits. If a fixed point DSP processor with N bits resolution is used, the 2N bit number cannot be accommodated for future computations, which can operate on only N bit operands. Therefore, if we assume that the original N bit numbers were both constrained to be less than 1 in magnitude by using a binary point, then the 2N bit result is also less than 1. Hence if we round the least significant N bits up or down, this is equivalent to losing precision. This loss of precision is referred to as round-off error. Although the round-off error for a single computation is usually not significant, many errors added together can be significant. Furthermore, if the result of a computation yields the value of 0 (zero) after rounding, and this result is to be used as a divisor, a divide by zero error will occur. See also Truncation Error, Fractional Binary, Binary Point.

  Binary:  0.1011001 × 0.1010001 = 0.011100001010010, rounded to 0.0111000
  Decimal: 0.6953125 × 0.6328125 = 0.44000244140625, rounded to 0.4375

After multiplication of two 8 bit numbers, the 16 bit result is rounded to 8 bits, introducing a binary round-off error of 0.000000001010010, which in decimal is 0.00250244140625.

Round-Off Noise: When round-off errors are modelled as a source of additive noise in a system, the effect is referred to as round-off noise. This noise is usually discussed in terms of its mean power. See also Round-Off Error.

Row Vector: See Vector.

Run Length Encoding (RLE): If a data sequence contains a consecutive sequence of the same data word, then this is referred to as a “run”, and the number of data words is referred to as the “length” of the run.
DSPedia 352 Sawtooth Waveform: A sawtooth waveform is a periodic signal made up from individual ramp waveforms. See also Ramp Waveform. s(t) 0 2τ τ Continuous time sawtooth waveform with period, τ. 3τ t s( k) 0 κ 2κ 3κ k Discrete time sawtooth waveform with period, κ. SAXPY: This term is used in vector algebra to indicate the calculation: x = αx + y (534) SAXPY is a mnemonic for scalar alpha x plus y, and has its origins as part of the Linpack software.[15] Schur-Cohn Test: Given a z-domain polynomial of order N, the Schur-Cohn test can be used to establish if the roots of the polynomial are within the unit circle [77]. The Schur-Cohn test can therefore be used on IIR filters to check stability (i.e. all poles within the unit circle), or to test if a filter is minimum phase (all zeroes and poles within the unit circle). Schur Form: See Matrix Decompositions - Schur Form. Scrambler/Descrambler: A scrambler is either an analog or digital device used to implement secure communication channels by modifying a data stream or analog signal to appear random. A descrambler reverses the effect of the scrambler to recover the original signal. Many different techniques exist for scrambling signals and are of two main forms: frequency domain techniques, and time domain techniques. Second Order: Usually meaning two of a particular device cascaded together. Used in a nonconsistent way. Second order is often used to refer to a segment of a linear system that can be represented by a system polynomial of order 2. Semitone: In music theory each adjacent note in the chromatic scale differs by one semitone, which corresponds to multiplying the lower frequency by the twelfth root of 2, i.e. 2 1 / 12 = 1.0594631… . A difference of two semitones is a tone. See also Western Music Scale. Semi-vowels: One of the elementary sounds of speech, namely plosives, fricatives, sibilant fricative, semi-vowels, and nasals. 
Semi-vowels are relatively open sounds and formed via constrictions made by the lips or tongue. See also Fricatives, Nasals, Plosives, and Sibilant Fricatives. 353 Sensation Level (SL): A person’s sensation level for particular sound stimulus is calculated as a power ratio relative to their own minimum detectable level of that specific sound: Sound Intensity Sensation Level = 10 log ------------------------------------------------------------------------------------------- = dB (SL) Minimum Detectable Sound Level (535) Therefore if a sound is 40dB (SL) then it is 40dB above that person’s minimum detectable level of the sound. Clearly the physical intensity of a sensation level will differ from person to person [30]. See also Audiology, Hearing Level, Sound Level Units, Sound Pressure Level, Threshold of Hearing. Sensorineural Hearing Loss: If the cochlea, auditory nerve or other elements of the inner ear are not functioning correctly then the associated hearing loss is often known as sensorineural [30]. Typically the audiogram will reveal that the sensorineural hearing loss increases with increased frequency. Although a frequency selective linear amplification hearing aid will assist in some cases to reduce the impairment, in general the wearer will still have difficulty in perceiving speech signals in noisy environments. Such is the complex nature of this form of hearing loss. See also Audiology, Audiometry, Conductive Hearing Loss, Ear, Hearing Aids, Hearing Impairment, Loudness Recruitment, Threshold of Hearing. Sequential Linear Feedback Register: See Pseudo Random Binary Sequence. Serial Copy Management System (SCMS): The Serial Copy Management System provides protection from unauthorised digital copying of copyrighted material. The SCMS protocol ensures that only one digital copy is possible from a protected recording [128], [158]. 
Shading Weights: Coefficients used to weight the contributions of different sensors in a beamforming array (or the coefficients in an FIR filter). Shading weights control the characteristics of the sidelobes and mainlobe for a beamformer (or, analogously, an FIR filter). The use and design of shading weights is very similar to that for Data Windows and FIR filters. See also Beamforming, Windows, FIR Filters. Shannon, Claude Elwood: Claude Elwood Shannon can be justly described as the father of the digital information age by virtue of his mathematical genius in defining the important principles of what we now call information theory. Claude Shannon was born in Michigan on April 30th 1916. He first attended University of Michigan in 1932 and graduated with a Bachelor of Science degree in Electrical Engineering, and also in Mathematics. In 1936 he joined MIT as a research assistant, and in 1938 published his first paper “A Symbolic Analysis of Relay and Switching Circuits”. In 1948 he produced the celebrated paper “A Mathematical Theory of Communication” in the Bell System Technical Journal [129]. It is widely accepted that Claude Shannon profoundly altered virtually all aspects of communication theory and real world practice. Claude Shannon’s other interests have included “beat the dealer” gambling machines, mirrored rooms, robot bicycle riders, and a long time interest in the practical and mathematical aspects of juggling. Readers are referred to Shannon’s biography and collected papers [41] for more insights on this most interesting individual. Sherman-Morrison-Woodbury Formula: See Matrix Properties - Inversion Lemma. Shielded Pair: Two insulated wires in a cable wrapped with metallic braid or foil to prevent interference and provide reduced transmission noise. DSPedia 354 Sibilant Fricatives: One of the elementary sounds of speech, namely plosives, fricatives, sibilant fricative, semi-vowels, and nasals. 
Sibilant fricatives are the hissing sounds formed when air is forced over the cutting edges of the front teeth with the lips slightly parted. See also Fricatives, Nasals, Plosives, and Semi-vowels.

Sidelobes: In an antenna or sensor array processing system, sidelobes refer to the secondary lobes of sensitivity in the beampattern. For a filter or a data window, sidelobes refer to the stopband lobes of sensitivity. The lower the sidelobe level, the more selective or sensitive a given system is said to be. The level of the first sidelobe (relative to the main lobe peak) is often an important parameter for a data window, a digital filter, or an antenna system. Sidelobes are best illustrated by an example:

[Figure: a typical beampattern, plotting array gain as a function of angle on a 0 to -15 dB contour scale, showing the mainlobe and the surrounding sidelobes.]

See also Main lobe, Beamformer, Beampattern, Windows.

Sigma Delta (Σ−∆): Σ−∆ converters use noise shaping techniques whereby the quantization noise can be high pass filtered out of the baseband, so that the oversampling factor required to increase signal resolution is reduced from the factor of 4 per additional bit normally required when simply oversampling (see Oversampling). A simple first order Σ−∆ ADC requires only the analog components of an integrator, a summer, a 1 bit quantiser (a 1 bit ADC), and a single bit DAC in the feedback loop. A first order Σ−∆ DAC requires only the analog component of a 1 bit DAC:

[Figure: first order single bit Σ−∆ converter ADC and DAC. In the ADC the analog input x(t) is differenced with the fed back 1-bit DAC output, integrated, and quantised by the 1 bit ADC at rate f_ovs to give the single bit output y(k). In the DAC the digital N-bit input x(k) drives a digital integrator (with feedback delay z⁻¹) and quantiser, followed by a 1 bit DAC giving the analog output y(t). The 1 bit ADC intercepts the y-axis at the input maximum and minimum, and the quantiser (in the DAC) intercepts at ±2^(N−1).]

For the Σ−∆ ADC the integrator can be produced using a capacitive component, the summer using a simple summing amplifier, and the quantiser using a comparator.
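The first order ADC loop just described can be simulated behaviourally in a few lines. The sketch below (plain Python, with illustrative names of our own choosing; the quantiser output is taken as ±1 full scale) implements the loop directly, difference, integrate, 1 bit quantise, feed back, and shows that the local mean of the ±1 bit stream tracks a slowly varying (here DC) input:

```python
def sigma_delta_1st_order(x):
    """First order sigma-delta modulator sketch: for each input sample
    the fed-back 1-bit DAC output is subtracted, integrated, and
    quantised to +/-1 (the 1-bit ADC). Returns the bit stream."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for sample in x:
        integrator += sample - feedback               # summer + integrator
        feedback = 1.0 if integrator >= 0 else -1.0   # 1-bit quantiser
        bits.append(feedback)                         # 1-bit DAC in the loop
    return bits

# A DC input of 0.25 (full scale is +/-1): the density of +1 bits makes
# the mean of the bit stream approximate the input value.
bits = sigma_delta_1st_order([0.25] * 10000)
print(sum(bits) / len(bits))   # close to 0.25
```

Low pass filtering (averaging) this bit stream recovers a multibit approximation of the input, which is the role of the decimation filtering discussed later in this entry.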
Unlike conventional data converters, the non linear element (the quantiser) is within a feedback loop in a mixed analog/digital system, and as a result Σ∆ devices are difficult to analyze. However, as a first step to understanding the principle of operation of a Σ∆ device, consider the following representation of the ADC, which is similar to the one above but with the integrator now moved in front of the adder:

[Figure: modified first order sigma delta ADC. The analogue input is differenced with the fed back 1-bit DAC output, integrated, and quantised to 1 or -1 by the 1 bit ADC at rate f_ovs to give the single bit output y(k).]

Clearly the Σ∆ modulator tries to keep the mean value of the 1-bit high frequency signal equal to the mean of the input signal. Thus for an input frequency of 0 Hz, the mean output is not affected by quantisation noise. This simple result can be extended to inputs of "very low frequency" with respect to the sampling frequency, f_ovs, and we conclude that the output will be a "good" representation of the input.

Because of the non-linearities present, the simple first order Σ−∆ "loop" is actually very difficult to analyze. Therefore a linearized digital model, which is "reasonably" mathematically tractable, is used [8]. The quantizer is modelled as an additive, signal independent white noise source, n(k), of variance (power) q²/12 (where q is the step size of the single bit quantiser), and the analog integrator is approximated by a digital integrator such that:

y(k) = x(k) + y(k−1) = Σ (n=0 to k) x(n) ≈ ∫ (0 to t) x(τ) dτ    (536)

where t = kT and T is the sampling period. The following analysis models are therefore realised:

[Figure: the (identical) linearised digital models for a Σ∆ ADC and a Σ∆ DAC. The input x(k) is differenced with the delayed output, passed through the digital integrator (summer with feedback delay z⁻¹), and the additive noise n(k) is injected at the quantiser to give the 1 bit output y(k).]
The linearised model allows a more simple analysis of the behaviour of the circuits. Note that z⁻¹ represents a sample delay element of period t_ovs = 1/f_ovs. This z-domain model can be further simplified to:

[Figure: linearised z-domain model for Σ∆ ADCs and DACs. X(z) is differenced with z⁻¹Y(z), passed through the integrator 1/(1 − z⁻¹), and the noise N(z) is added to give the 1 bit output Y(z). Compared to the previous figure, the integrator is represented as a simple pole.]

The output of the above Σ∆ first order model is simply given by:

Y(z) = [ X(z) − z⁻¹Y(z) ] · 1/(1 − z⁻¹) + N(z)
⇒ Y(z) − z⁻¹Y(z) = [ X(z) − z⁻¹Y(z) ] + N(z)(1 − z⁻¹)    (537)
⇒ Y(z) = X(z) + N(z) − z⁻¹N(z)

Written in the time domain the output is therefore:

y(k) = x(k) + n(k) − n(k−1)    (538)

From Eq. 538 we can note that the input signal passes unaltered through the modulator, whereas the added noise is high pass filtered (for low frequency values of n(k), n(k) − n(k−1) ≈ 0). The total quantisation noise power of the 1 bit quantiser is actually increased by using the Σ∆ loop (doubled, i.e. increased by 3dB), but the low frequency quantisation noise power (i.e. at the baseband) is reduced if the sampling frequency is high enough. Compared to the 1 extra bit of resolution obtained for every increase in sampling frequency by a factor of 4 for a simple oversampling ADC (see Quantization Noise - Reduction by Oversampling), the first order Σ∆ loop brings the advantage of approximately 1.5 bits of extra resolution (in the baseband) for each doubling of the sampling frequency [8].

To illustrate the operation of a first order Σ∆ converter, a linear chirp signal with frequency increasing from 100 to 4800 Hz over a 0.1 second interval was input to the above sigma delta loop sampling at 64 times the Nyquist rate, i.e. f_ovs = 64 f_n = 640000 Hz. A 0.45ms (292 sample) segment of the sigma delta output and the chirp input signal is shown below:
[Figure: output of a first order sigma delta loop for a 0.45ms segment of the input chirp signal (when the signal frequency was around 3000 Hz) sampled at 640000 Hz. The single bit ±1 output is overlaid on the chirp input signal over the interval 64.1 to 64.5 ms; 292 single bit samples are shown.]

The power spectrum obtained from an FFT of about a 0.1s segment of the chirp signal, i.e. 65536 samples (zero padded from 64000), is:

[Figure: frequency domain output of a first order, R = 64 times oversampled sigma delta converter, plotted from 0 to 320 kHz with the 0 to 5000 Hz baseband marked. The Nyquist rate was f_s = 10000 Hz. The input signal was a linear chirp from 100 Hz to 4800 Hz over a 0.1 second interval (64000 samples) and 65536 points (≈ 0.1 seconds) were used in the (zero padded) FFT. The dotted line shows the first order noise shaping characteristic predicted by Eq. 538. By digitally low pass filtering this single bit signal, around 9-10 bits of resolution are achievable in the baseband of 0 to 5000 Hz.]

Clearly the quantisation noise has been high pass filtered out of the baseband, thus giving additional resolution. For this oversampling rate of R = 64 the signal to quantisation noise ratio in the baseband is about 55dB, giving between 9 and 10 bits of signal resolution (cf. 20 log10 2⁹ ≈ 54dB). If an oversampling single bit converter with no Σ∆ loop were used, 64 times oversampling would only allow about 3-4 bits of resolution. To extract the higher resolution signal, a digital low pass filter is required to select only the baseband. To obtain more than 9-10 bits of resolution without further increasing the sampling frequency, a higher order sigma delta converter can be used.
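The resolution figures quoted above can be estimated from a standard linearised-model rule of thumb (a textbook approximation, not taken verbatim from this entry): an ideal order-L Σ∆ modulator with an N-bit quantiser followed by perfect baseband filtering achieves roughly SNR ≈ 6.02N + 1.76 + (20L + 10) log10 R − 10 log10(π^2L / (2L + 1)) dB at oversampling ratio R. A small Python sketch:

```python
import math

def sd_baseband_snr_db(order, osr, n_bits=1):
    """Approximate baseband SNR of an ideal order-L sigma-delta
    modulator with an N-bit quantiser at oversampling ratio R
    (standard linearised white-noise-model result)."""
    L, R = order, osr
    return (6.02 * n_bits + 1.76 + (20 * L + 10) * math.log10(R)
            - 10 * math.log10(math.pi ** (2 * L) / (2 * L + 1)))

def effective_bits(snr_db):
    """Invert SNR = 6.02b + 1.76 to get equivalent bits."""
    return (snr_db - 1.76) / 6.02

for L in (1, 2):
    snr = sd_baseband_snr_db(L, 64)
    print(L, round(snr, 1), round(effective_bits(snr), 1))
# first order, R = 64: ~9 bits; second order: ~14 bits
```

These predictions (about 57dB and 85dB) agree with the roughly 55dB / 9-10 bit and 80dB / 13-14 bit figures observed in the chirp experiments in this entry.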
The circuit for a simplified second order sigma delta loop can be represented by the z-domain model:

[Figure: second order sigma delta modulator. The analog input passes through two integrator stages, each with a feedback summer and delay z⁻¹, followed by the quantiser, with a 1-bit DAC in the feedback path, producing the single bit output y(k) at rate f_ovs. The baseband noise is much lower than that of the first order sigma delta loop due to the more effective high pass quantisation noise filtering.]

Analytical and experimental studies of this system are considerably more complex than those of the first order loop. For each doubling of the sampling frequency the second order loop gives around an extra 2.5 bits of resolution. The z-domain output of the above converter is:

Y(z) = X(z) + (1 − z⁻¹)² N(z)    (539)

and it can be seen that this extra baseband resolution is a result of the second order high pass filtering of the quantisation noise compared to the first order loop. The result of inputting the same signal as previously (a linear chirp signal with frequency increasing from 100 to 4800 Hz over a 0.1 second interval, sampled at 64 times the Nyquist rate, i.e. f_ovs = 64 f_n = 640000 Hz) into a second order sigma delta modulator is:

[Figure: frequency domain output of a second order, R = 64 times oversampled sigma delta converter, plotted from 0 to 320 kHz with the 0 to 5000 Hz baseband marked. The input signal was a linear chirp from 100 Hz to 4800 Hz over a 0.1 second interval; 65536 data points were used in the FFT. The dotted line shows the second order noise shaping characteristic predicted by Eq. 539. By digitally low pass filtering this single bit signal, around 13-14 bits of resolution are achievable in the baseband of 0 to 5000 Hz.]

The signal to quantisation noise ratio in the baseband is now even higher, of the order of 80dB, therefore allowing between 13 and 14 bits of signal resolution to be obtained (cf. 20 log10 2¹³ ≈ 78dB).
Note that the design of higher than second order Σ−∆ loops must be done very "carefully" in order to ensure stability; a straightforward cascading of first order loops to produce higher order loops is ill advised [8].

At the output of a Σ−∆ ADC, the single bit oversampled signal is decimated, i.e. digitally low pass filtered to half of the Nyquist frequency, and then downsampled:

[Figure: decimation of a 64 times oversampled sigma delta signal at f_ovs = 64 × f_n = 640 kHz to the Nyquist rate of f_n = 10 kHz. The 1 bit Σ∆ ADC output (from the anti-alias filter and modulator) is digitally low pass filtered to remove the shaped quantisation noise above 5 kHz, giving a multibit oversampled PCM signal, then downsampled by 64 to give a multibit Nyquist rate (10 kHz) PCM signal. The accompanying spectra show the baseband signal, the shaped quantisation noise, and the aliased spectra after downsampling. Note that the decimated signal will be delayed by the group delay, t_d, of the digital low pass filter (which should be linear phase in the baseband).]

Note that in practice the low pass filtering and downsampling is done in stages, see Sigma Delta - Decimation. The number of bits of signal resolution in the final output stage is a function of the order of the Σ∆ converter, and the filtering properties of the low pass filter.
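The decimation step (low pass filter, then keep every R-th sample) can be sketched in a few lines of Python. Here a simple length-R moving average stands in for the real multistage decimation filtering (a deliberate simplification of what is actually used in practice):

```python
def decimate(bits, R):
    """Crude sigma-delta decimation sketch: average each block of R
    single-bit samples (a length-R all-ones low pass filter) and
    keep one output per block (downsample by R)."""
    return [sum(bits[i:i + R]) / R for i in range(0, len(bits) - R + 1, R)]

# Averaging a +/-1 bit stream recovers a multibit estimate of the
# low frequency signal it encodes.
bits = [1, -1, 1, 1, -1, 1, 1, 1] * 8   # toy bit stream with mean 0.5
print(decimate(bits, 8))                 # every output is 0.5
```

A real converter cascades comb stages and a sharp FIR stage instead of one block average, but the principle (low pass filter, then downsample) is the same.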
In order to produce a suitably noise shaped single bit data stream for input to a Σ∆ DAC, the reverse of the above process is performed:

[Figure: interpolation of a Nyquist rate signal sampled at f_n = 10 kHz to a sampling rate of 64 × f_n = 640 kHz. The multibit Nyquist rate PCM signal is upsampled by 64 and digitally low pass filtered to remove the aliased spectra, giving a multibit oversampled PCM signal which drives the Σ∆ DAC and then the analog reconstruction filter. Note that the interpolated baseband signal will be delayed by the group delay, t_d, of the digital low pass filter (which should be linear phase in the baseband).]

Note that in practice the low pass filtering and upsampling is done in stages, see Sigma Delta - Interpolation. The number of bits of signal resolution in the final output stage is a function of the order of the Σ∆ converter, and the properties of the low pass filter.

To use sigma delta converters in a DSP system computing at the Nyquist rate, the following components are required:

[Figure: a Σ∆ based DSP system. Analogue input, analogue anti-alias filter, Σ−∆ ADC (1 bit at f_ovs = R f_n), decimation (low pass filter and downsample) to 16 bit PCM at f_n, DSP processor, interpolation (upsample and low pass filter), Σ−∆ DAC (1 bit at f_ovs = R f_n), analogue reconstruction filter, analogue output.]

The analogue anti-alias and reconstruction filters are simple low order filters which match the order of the Σ∆ codec. The DSP processor runs at the Nyquist rate, f_n, and the interpolation and decimation stages are used to convert between the oversampled 1 bit digital signal and a multibit Nyquist rate digital signal.
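The interpolation stage (upsample, then low pass filter) mirrors decimation. The sketch below uses zero-stuffing followed by a toy length-R moving-sum filter, which for this particular filter reduces to a zero-order hold of each input sample; real converters use proper multistage interpolation filters instead:

```python
def interpolate(x, R):
    """Crude interpolation sketch: insert R-1 zeros after each sample
    (upsample by R), then low pass filter with a length-R moving sum
    (an all-ones FIR), which here holds each input sample for R outputs."""
    upsampled = []
    for sample in x:
        upsampled += [sample] + [0.0] * (R - 1)   # zero-stuffing
    padded = [0.0] * (R - 1) + upsampled          # zero initial conditions
    return [sum(padded[i:i + R]) for i in range(len(upsampled))]

print(interpolate([1.0, 2.0], 4))
# [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
```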
See also Decimation, Differentiator, Integrator, Oversampling, Interpolation, Quantisation Noise Reduction by Oversampling, Sigma Delta - Anti-Alias Filter, Sigma Delta - Decimation Filters, Sigma Delta - Reconstruction Filter.

Sigma Delta, Anti-Alias Filter: One of the advantages of using sigma delta converters is that the analogue anti-alias and reconstruction filters are very simple and therefore low cost. Consider a first order sigma delta loop oversampling at 64 times the Nyquist rate, with the quantiser modelled as a white noise source, n(k) (see Sigma Delta), and an input signal at full scale deflection (represented as 0dB) occupying the entire Nyquist bandwidth:

[Figure: the simple first order sigma delta model (summer, integrator, additive noise n(k), feedback delay) alongside the output magnitude spectrum Y(f) in dB from 0 to f_ovs/2, showing the baseband at f_ovs/128, the shaped quantisation noise rising towards f_ovs/2, and the roll-off of a first order RC anti-alias circuit. The quantisation noise is low in the region of the baseband, and the multibit signal representation can be extracted from the 1 bit signal by digital low pass filtering and downsampling by 64. To ensure aliasing does not occur, an analog anti-alias filter (a first order RC circuit) removing frequency components above f_ovs/2 is required.]

In order that aliasing does not occur, the analog anti-alias filter must cut off all frequencies above f_ovs/2. Noting that the digital low pass decimation filter (see Sigma Delta) will filter all frequencies between f_ovs/128 and f_ovs/2, the analog anti-alias filter only requires to cut off above f_ovs/2. By that frequency the anti-alias filter should be providing attenuation comparable to the baseband resolution of the converter. Therefore, noting that the power roll off of an RC circuit is 6dB/octave, if the 3dB frequency is placed at f_ovs/128, then at 64 times this frequency (6 octaves) 36dB of attenuation is produced at f_ovs/2.
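The 36dB figure follows from the magnitude response of a first order RC low pass filter; a quick numeric check in Python (function name ours):

```python
import math

def rc_attenuation_db(f, f3db):
    """Power attenuation of a first order RC low pass filter at
    frequency f, given its 3dB frequency: |H|^2 = 1/(1 + (f/f3db)^2)."""
    return 10 * math.log10(1 + (f / f3db) ** 2)

# 3dB point at fovs/128; evaluate 6 octaves up, at fovs/2.
print(round(rc_attenuation_db(64.0, 1.0), 1))   # ~36.1 dB
```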
Noting that the quantisation noise power is already about 20dB below the 0dB level at f_ovs/2, a total of 56dB of attenuation is produced. For a second order sigma delta converter, by a similar argument, a second order anti-alias filter is required (noting that the quantisation noise at f_ovs/2 is now increased due to the enhanced noise shaping). In general, for an n-th order sigma delta converter an n-th order anti-alias filter should be used. The same is true for the reconstruction filter used with a sigma delta DAC. See also Oversampling, Sigma Delta.

Sigma Delta Converter: See Sigma Delta.

Sigma Delta, Decimation Filters: Decimation for a sigma delta converter requires a low pass filter with a cut off frequency of 1/R-th of the oversampling frequency, where R is the oversampling ratio. This filter should also have linear phase in the passband. To implement a low pass FIR filter with 90dB stopband rejection and a passband of, for example, 1/64 of the sampling rate (R = 64) would require thousands of filter weights. Clearly this is impractical. Therefore the low pass filtering and downsampling is often done in stages, using initial stages of simple comb type filters in which all filter coefficients have value 1, leading to a simple FIR filter that requires only additions and no multiplications. After this initial coarse filtering, a sharp cut-off FIR filter (still of a hundred or more weights) can be used at the final stage:

[Figure: decimation of the output of a 3rd order sigma delta converter. The 1 bit output at f_ovs = 64 f_n is filtered by a digital low pass comb filter and downsampled by 16 to a 12 bit signal at 4 f_n, then filtered by a sharp cut off FIR low pass filter, running at only 4 times the Nyquist rate f_n, and downsampled by 4 to a 16 bit signal at f_n.]

Interpolation for a Σ∆ DAC is the effective reverse of the above process.
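A single comb (moving-sum) stage of the kind described above is easy to sketch: all coefficients are 1, so each output requires only additions. This is a simplified one-stage illustration (practical designs cascade several comb stages before the sharp FIR stage):

```python
def comb_decimate(x, R):
    """One comb filter stage: length-R moving sum (all filter
    coefficients equal to 1, so no multiplications are needed),
    followed by downsampling by R."""
    acc = [sum(x[i:i + R]) for i in range(0, len(x) - R + 1)]  # comb FIR
    return acc[::R]                                            # keep every R-th

# An alternating +/-1 bit stream (pure high frequency) is cancelled
# entirely by the comb sum over R = 4 samples.
print(comb_decimate([1, -1] * 8, 4))   # [0, 0, 0, 0]
```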
See also Comb Filter, Decimation, Sigma Delta, Sigma Delta - Anti-Alias Filter.

Sigma Delta (Σ−∆) Loop: A term sometimes used to indicate a first order sigma delta converter. The "loop" refers to the feedback from the converter output to an input summation stage. See Sigma Delta.

Sigma Delta, Reconstruction Filter: The order of the reconstruction filter for a sigma delta DAC should match the sigma delta order. For details see Sigma Delta - Anti-Alias Filter.

Sign Data/Regressor LMS: See Least Mean Squares Algorithm Variants.

Sign Error LMS: See Least Mean Squares Algorithm Variants.

Sign-Sign LMS: See Least Mean Squares Algorithm Variants.

Signal Conditioning: The stage where a signal from a sensor is amplified (or attenuated) and anti-alias filtered in order that its peak to peak voltage swing, V_pk-pk, matches the voltage swing of the A/D converter, and so that the signal components are not aliased upon sampling and conversion. Signals are also conditioned going the opposite way, from D/A converter to signal conditioning amplifier to actuator.

Signal Flow Graph (SFG): A simple line diagram used to illustrate the operation of an algorithm, particularly the flow of data. Signal flow graphs consist of annotated directed lines and splitting and summing nodes. It is very often easier to represent an algorithm in signal flow graph form than it is to represent it algebraically. See, for example, the Fast Fourier Transform signal flow graph. Below, a z-domain signal flow graph is illustrated for a 4 tap FIR filter:

[Figure: signal flow graph for a 4 tap FIR filter. The input x(n) passes along a delay line of three z⁻¹ elements; the four tap signals are weighted by w0, w1, w2 and w3 and combined at summing nodes to form the output y(n).]

Signal Primitives: See Elementary Signals.

Signal Space: Signal space is a convenient tool for representing signals (or symbols) used for encoding information to be sent over a channel.
The signal space approach to digital communication systems exploits the fact that a finite number of signals can be represented as points (or vectors) in a finite dimensional vector space. This vector space representation allows convenient matrix-vector notation (linear algebra) to be used in the design and analysis of these systems. See also Vector Space, Matrix.

Signal to Interference plus Noise Ratio (SINR): The ratio of the signal power to the interference power plus the noise power. Used especially in systems that experience significant interference components (e.g., intentional jamming) in addition to additive noise.

Signal to Noise Ratio (SNR, S/N): The ratio of the power of a signal to the power of contaminating (and unwanted) noise. Clearly a very high SNR is desirable in most systems. SNRs are usually given in dB and calculated from:

SNR = 10 log10 ( Signal Power / Noise Power ) dB    (540)

Simplex: Pertaining to the ability to send data in one direction only. See also Full Duplex, Half Duplex.

Similarity Transform: See Matrix Decompositions - Similarity Transform.

Simultaneous Masking: See Spectral Masking.

Sinc Function: The sinc function is widely used in signal processing and is usually denoted as:

sinc(x) = sin(x) / x    (541)

[Figure: plot of sin(x)/x for x from −5π to 5π, with peak value 1.0 at x = 0 and zero crossings at non-zero multiples of π.]

The logarithmic magnitude sinc function (which is symmetric about the y-axis) has the form:

[Figure: 20 log10 |sin(x)/x| plotted for x from 0 to 8π on a 0 to −60 dB scale.]

Note that the first sidelobe peak of 20 log10 |sin(x)/x| occurs at approximately −13 dB relative to the mainlobe (and at approximately −26 dB if the squared function (sin(x)/x)² is plotted on the same 20 log scale).

Singular Value: See Matrix Decompositions - Singular Value.

Sine Wave: A sine wave (occurring with respect to time) can be written as:

x(t) = A sin ( 2πft + φ )    (542)

where A is the signal amplitude, f is the frequency in Hertz, φ is the phase, and t is time.
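Sampling Eq. 542 at a rate f_s gives the discrete sequence x(n) = A sin(2πfn/f_s + φ). A minimal Python sketch (the function name and sample values are chosen purely for illustration):

```python
import math

def sine_wave(A, f, phase, fs, n_samples):
    """Sample x(t) = A sin(2*pi*f*t + phase) at rate fs (Eq. 542)."""
    return [A * math.sin(2 * math.pi * f * n / fs + phase)
            for n in range(n_samples)]

# One period of a 1 kHz tone sampled at 8 kHz: 8 samples per cycle.
x = sine_wave(A=1.0, f=1000, phase=0.0, fs=8000, n_samples=8)
print([round(v, 3) for v in x])
# [0.0, 0.707, 1.0, 0.707, 0.0, -0.707, -1.0, -0.707]
```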
[Figure: a sine wave of amplitude A and period 1/f plotted as voltage against time t; the value at t = 0 is A sin(φ).]

Sine Wave Generation: See Dual Tone Multifrequency - Tone Generation.

Single Cycle Execution: Many DSP processors can perform a full precision multiply and accumulate (MAC), (a.b + c), (e.g. 16 bit integer, or 32 bit floating point with 24 bit mantissa and 8 bit exponent) in a single cycle of the clock used to control the DSP processor. See DSP Processor, Parallel Multiplier.

Single Pole: If the input-output transfer function of a circuit has only one pole (in the s-domain), then it is often referred to as a single pole circuit. The magnitude frequency plot of a single pole circuit rolls off at 20dB/decade (6dB/octave). An RC circuit is a simple single pole circuit. See also Active Filter, RC Circuit.

Singular Matrix: See Matrix Properties - Singular.

Slope Overload: If the step size is too small when delta modulating a signal, then slope overload will occur, resulting in a large error between the coded signal and the original signal. Slope overload can be corrected by increasing the sampling frequency, or by increasing the delta (∆) step size, although the latter may lead to granularity effects. See also Delta Modulation, Granularity Effects.

[Figure: a rapidly rising signal x(n) against time, with the delta modulated staircase lagging behind it; the gap between the two is the slope overload error.]

Snap-In Digital Filter: The name used to mean a digital filter that can easily be introduced between the analog front end (A/Ds) and the user interface (the PC screen). A term introduced by Hyperception Inc.

Solenoid: A device that converts electro-magnetic energy into physical displacement.

Sones: A sone is a subjective measure of loudness which relates the logarithmic response of the human ear to SPL. One sone is the level of loudness experienced by listening to a sound of 40 phons. A measure of 2 sones will be twice as loud, 0.5 sones half as loud, and so on. See also Phons, Sensation Level, Sound Pressure Level.

Sound: Sound is derived from vibrations which cause the propagating medium's particles (usually air) to alternately rarefy and compress.
For DSP purposes sound can be sensed by a microphone and the electrical output sent to an analog to digital converter (ADC) for input to a DSP processor. Sound can be reproduced in a DSP system using a loudspeaker. When a loudspeaker produces a tone, the compression and rarefaction of air particles occurs in all directions of sound propagation:

[Figure: a loudspeaker producing a tone, with the resulting compressions and rarefactions of the air "particles" shown along one direction of propagation (for illustrative purposes only), together with the corresponding sound pressure level waveform over time.]

Sound waves are longitudinal, meaning that the wave fluctuations occur in the direction of propagation of the wave. As a point of comparison, electromagnetic waves are transverse, meaning the variation occurs perpendicular to the direction of propagation. Hence subtle differences exist between modelling acoustic wave propagation and electromagnetic wave propagation; for example, there is no polarization phenomenon for acoustic waves. See also Audio, Microphone, Loudspeaker, Sound Pressure Level, Speed of Sound.

Sound Exposure Meters: For persons subjected to noise at the workplace, a sound exposure meter can be worn which will average the "total" sound they are exposed to in a day; the measurement can then be compared with national safety standards [46].

Sound Intensity: Sound intensity is a measure of the power of a sound over a given area. The ear of a healthy young person can hear sounds between frequencies of around 1000 - 3000Hz at intensities as low as 10⁻¹² W/m² (the threshold of hearing) and as high as 1 W/m² (just below the threshold of pain). Because the human ear has a linear dynamic range of almost 10¹² (1,000,000,000,000), absolute sound intensity is rarely quoted.
Instead, a logarithmic measure called sound pressure level (SPL) is calculated by measuring the sound intensity relative to a reference intensity of 10⁻¹² W/m²:

SPL = 10 log10 ( I / I_ref ) dB    (543)

See also Audiology, Equal Loudness Contours, Infrasound, Sound Pressure Level, Sound Pressure Level Weighting Curves, Threshold of Hearing, Ultrasound.

Sound Intensity Meter: A sound intensity meter will use two or more identical microphones in order that simple beamforming techniques can be performed in an attempt to resolve the direction (as well as the magnitude) of a noise. This can be important in noisy environments where there are several noise sources close together rather than a single noise source. Typically a sound intensity meter will consist of two precision microphones with very similar performance mounted a fixed distance apart. The sound intensity meter measures both the amplitude and relative phase and then calculates the noise amplitude and direction of arrival. By dividing the frequency analysis into bands, multiple sources at different frequencies and from different directions can be identified. Sound intensity meters usually measure noise over one third octave frequency bands, and conform to standard IEC 1043:1993. See also Sound Intensity, Sound Pressure Level, Sound Pressure Level Weighting Curves [46].

Sound Level Units: There are a number of different units by which sound level can be expressed. The human ear can hear sounds at pressures as low as about 2 × 10⁻⁵ N/m² (approximately the threshold of hearing for a 1000Hz tone). Sound level can also be measured as sound intensity, which specifies dissipated power over area rather than a pressure; 2 × 10⁻⁵ N/m² is equivalent to 10⁻¹² W/m². Because of the very large dynamic range of the human ear, most sound level units and related measurements are given on a logarithmic dB scale.
See also Audiometry, Equivalent Sound Continuous Level, Hearing Level, Phons, Sones, Sound, Sound Exposure Meters, Sound Intensity, Sound Intensity Meter, Sensation Level, Sound Pressure Level, Sound Pressure Level Weighting Curves, Threshold of Hearing.

Sound Pressure Level (SPL): Sound Pressure Level (SPL) is specified in decibels (dB) and is calculated as the logarithm of a ratio:

SPL = 10 log10 ( I / I_ref ) dB    (544)

where I is the sound intensity measured in Watts per square meter (W/m²) and I_ref is the reference intensity of 10⁻¹² W/m², which is the approximate lower threshold of hearing for a tone at 1000Hz. Alternatively (and more intuitively, given the name sound "pressure" level), SPL can be expressed as a ratio of a measured sound pressure relative to a reference pressure, P_ref, of 2 × 10⁻⁵ N/m² = 20 µPa:

SPL = 10 log10 ( I / I_ref ) = 10 log10 ( P² / P_ref² ) = 20 log10 ( P / P_ref ) dB    (545)

since intensity is proportional to the squared pressure, i.e.

I ∝ P²    (546)

A logarithmic measure is used for sound because of the very large dynamic range of the human ear (a linear intensity scale of more than 10¹²) and because of the logarithmic nature of hearing. Due to the nature of hearing, a 6dB increase in sound pressure level is not necessarily perceived as twice as loud (see entry for Sones). Some approximate example SPLs are:

SPL (dB)   Intensity ratio I/I_ref   Pressure ratio P/P_ref   Example sound
120        10¹²                      10⁶                      Gun-fire (pain threshold)
100        10¹⁰                      10⁵                      The Rolling Stones
80         10⁸                       10⁴                      Noisy lecture theatre
60         10⁶                       10³                      Normal conversation
40         10⁴                       10²                      Low murmur in the countryside
20         10²                       10¹                      Quiet recording studio
0          1                         1                        Threshold of human hearing
-10        10⁻¹ = 0.1                10^(−1/2) = 0.316        The noise of a nearby spider walking

Table 2: Approximate example sound pressure levels.

It is worth noting that standard atmospheric pressure is around 101300 N/m², and the pressure exerted by a very small insect's legs is around 10 N/m².
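Eqs. 544 and 545 are each a one-liner in code. The illustrative Python sketch below shows that the intensity and pressure forms agree, as required by Eq. 546:

```python
import math

I_REF = 1e-12    # W/m^2, reference intensity (Eq. 544)
P_REF = 2e-5     # N/m^2 (20 uPa), reference pressure (Eq. 545)

def spl_from_intensity(i):
    """SPL in dB from sound intensity in W/m^2 (Eq. 544)."""
    return 10 * math.log10(i / I_REF)

def spl_from_pressure(p):
    """SPL in dB from sound pressure in N/m^2 (Eq. 545)."""
    return 20 * math.log10(p / P_REF)

# Normal conversation, ~60 dB SPL: the two forms agree because
# intensity is proportional to pressure squared (Eq. 546).
print(round(spl_from_intensity(1e-6), 1))   # 60.0
print(round(spl_from_pressure(2e-2), 1))    # 60.0
```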
The ear and other sound measuring devices are thus measuring extremely small variations in pressure. See also Audiology, Audiometry, Equivalent Sound Continuous Level, Hearing Level, Sones, Sound Intensity, Sound Pressure Level Weighting Curves, Threshold of Hearing.

Sound Pressure Level (SPL) Weighting Curves: Because the human ear does not perceive all frequencies of the same SPL with the same loudness, a number of SPL weighting scales were introduced. The most common is the A weighting curve (based on the average threshold of hearing), which attempts to measure acoustic signals in the same way that the ear perceives them. Sound pressure level measurements made using the A-weighting curve are indicated as dB(A) or dBA, although the use of this weighting is so widespread in SPL meters measuring environmental noise that the A is often omitted. Sounds above 0dB(A) over the frequency range 20-16000Hz are "likely" to be perceptible by humans with unimpaired hearing. As an example of using the weighting curve, a 100Hz tone at 100dB (SPL) will register about 78dB(A) on the A-weighting scale, and can be "loosely" interpreted as being 88dB above the threshold of hearing at 100Hz from the figure below. Other less commonly used weighting curves are denoted B, C and D. Standard weighting curves can be found in IEC 651: 1979, BS 5969: 1981, and ANSI S1.4-1983. See also Audiogram, Audiology, Hearing Level, Permanent Threshold Shift, Psychoacoustics, Sound Pressure Level, Spectral Masking, Temporal Masking, Threshold of Hearing.

[Figure: approximate sound pressure level weighting curves A, B, C and D, plotted as weighting in dB (+20 to −80) against frequency from 20Hz to 20000Hz; the A curve falls off most steeply at low frequencies.]

Source Coding: This refers to the coding of data bits to reduce the bit rate required to represent an information source (i.e., a bit stream).
While channel coding introduces structured redundancy to allow correction and detection of channel induced errors, source coding attempts to reduce the natural redundancy present in any information source. The lower limit for source coding (without loss of information) is set by the entropy of the source. See also Channel Coding, Huffman Coding, Entropy, Entropy Coding.

Source Localization: See Localization.

Space: See Vector Properties - Space.

Space, Vector: See Vector Properties and Definitions - Space.

Span of Vectors: See Vector Properties and Definitions - Span.

Sparse Matrix: See Matrix Structured - Sparse.

Spatial Filtering: Digital filters can be used to separate signals with non-overlapping spectra in the frequency domain. A DSP system with an array of sensors can also be set up to separate signals arriving from different spatial locations (or directions). This process is referred to as spatial filtering. See Beamforming, Beampattern.

[Figure: a microphone array feeding a DSP system, with a speaker of interest in the listener's look direction and competing speakers at oblique angles. The DSP system identifies the broadside (head-on) waveform and attempts to null out the interfering signals from the oblique angles, producing a spatially filtered signal which is sent to an amplifier and a small loudspeaker in the listener's ear.]

Spectral Analysis: Methods for finding the frequency content of signals, usually using the FFT and variants.

Spectral Decomposition: See Matrix Decompositions - Spectral Decomposition.

Spectral Leakage: When a segment of data is transformed into the frequency domain using the FFT (or DFT), there will be discontinuities at the start and end of the data window unless the window contains an integral number of periods of the waveform (this is rarely the case). The discontinuities manifest themselves in the frequency domain as sidelobes around the main peaks.
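This effect is easy to demonstrate numerically. The sketch below (an illustrative example; the 64 point direct DFT, the 4.5-cycle tone and the bin numbers are our own choices) transforms a tone that does not complete a whole number of cycles in the window, with and without a Hann (raised cosine) smoothing window, and compares the leakage level far from the spectral peak:

```python
import cmath
import math

def dft_mag(x):
    """Magnitude of the DFT of x, computed directly from the definition."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) for k in range(N)]

N = 64
sig = [math.sin(2 * math.pi * 4.5 * n / N) for n in range(N)]   # 4.5 cycles
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

rect_spec = dft_mag(sig)                                # rectangular window
hann_spec = dft_mag([s * w for s, w in zip(sig, hann)]) # Hann window

# Far from the peak (bins 4-5), the rectangular window leaks heavily
# while the smoothed window's spectrum is far cleaner.
print(rect_spec[20] > 10 * hann_spec[20])   # True
```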
Spectral leakage can be reduced (at the expense of wider peaks) by smoothing windows such as the Hanning, Hamming, Blackman-Harris, Harris, von Hann and so on. See also Discrete Fourier Transform - Spectral Leakage, Windows, Sidelobes.

Spectral Masking: Spectral masking refers to the situation where a very loud audio signal in a certain frequency band drowns out a quieter signal of similar frequencies. A very stark example of spectral masking is a conversation rendered inaudible when standing next to a revving jet engine! Spectral masking is almost always referred to simply as masking. Spectral masking also has more subtle and quantifiable effects whereby the presence of a signal causes the threshold of hearing of signals with a similar frequency to increase [30], [52]. For example, if a narrowband of noise of approximately 100Hz bandwidth and centered at 500Hz is played to a listener at various different sound pressure levels, the threshold of hearing around 500Hz is raised:

[Figure: Four panels of SPL (dB) versus frequency (Hz) showing the approximate threshold of hearing raised by 450-550Hz narrowband noise played at 20dB, 40dB, 60dB and 80dB SPL. The louder the level of the narrowband noise, the more pronounced is the masking effect on nearby frequencies.]

The higher the SPL, the more the threshold of hearing of nearby frequencies will be raised, i.e. the more pronounced the masking effect is. In the above example, when the 500Hz narrowband noise is at a level of 80dB, a 1000Hz tone at 20dB is inaudible to the human ear.
In general the effect of masking is more pronounced for frequencies above the frequency of the masking signal. For the above example of narrowband noise at 80dB SPL, the masking effect at frequencies above 500Hz stretches almost a full octave, falling off at around 60dB/octave, whereas for frequencies below 500Hz the masking effect falls off at around 120dB/octave. The bandwidth of the masking level is higher for high frequencies. For example, below 500Hz the masking level bandwidth is less than 100Hz, whereas for 10-15kHz the bandwidth of the masking level is around 4kHz:

[Figure: Two panels of SPL (dB) versus frequency (Hz) showing the raised threshold and the masking level bandwidth. The masking bandwidth is larger for higher frequencies: for the narrowband noise at 100Hz the masking bandwidth is less than 100Hz, whereas for the narrowband noise at 5000Hz the masking bandwidth is around 4000Hz.]

The auditory effects of spectral masking are the basis for signal compression techniques such as precision adaptive subband coding (PASC). See also Auditory Filters, Equal Loudness Contours, Psychoacoustic Subband Coding (PASC), Temporal Masking, Threshold of Hearing.

Spectrogram: A 2-D plot with time on the x-axis, and frequency on the y-axis. The magnitude at a particular frequency and a particular time on the spectrogram is indicated by a color (or grey scale) contour map. Widely used in speech processing.

Speech Compression: Using DSP algorithms and techniques to reduce the bit rate of speech for transmission or storage. Algorithms in wide use for communications related applications (usually speech sampled at 8kHz with 8 bit samples) that have been standardized include LPC10, CELP, MRELP, CVSD, VSELP and so on.
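The octave falloff figures quoted under Spectral Masking above can be turned into a toy raised-threshold function. This is purely illustrative: the function name, default parameters and straight-line falloff are our own simplification, not a real psychoacoustic model.

```python
import math

# Toy raised-threshold sketch using the falloff figures quoted above:
# around a masker at frequency fc and level L (dB SPL), the raised
# threshold falls off at roughly 60 dB/octave above fc and 120 dB/octave
# below fc. Illustrative only.

def masked_threshold(f, fc=500.0, level=80.0):
    """Approximate raised threshold (dB SPL) at frequency f near a masker."""
    octaves = math.log2(f / fc)
    if octaves >= 0:
        return level - 60.0 * octaves   # slower falloff above the masker
    return level + 120.0 * octaves      # steeper falloff below the masker

# One octave above an 80 dB masker at 500 Hz the toy threshold is 20 dB,
# consistent with the example above where a 20 dB tone at 1000 Hz is
# just masked.
```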
Speech Immunity: Dual tone multifrequency receivers must be able to discriminate between tone pairs, and speech or other stray signals that may be present on the telephone line. The capacity of a circuit to discriminate between DTMF and other signals is often referred to as the speech immunity. See also Dual Tone Multifrequency.

Speech Processing: The use of DSP for speech coding, synthesis, or speech recognition. Speech synthesis research is the more advanced, whereas speech recognition and natural language understanding continue to be very large areas of research.

Speech Recognition: Using DSP to interpret human speech and convert it into text or trigger particular control functions (e.g. open, close and so on).

Speech Shaped Noise: If a random noise signal has similar spectral characteristics to a speech signal it may be referred to as speech shaped noise. Speech shaped noise is unlikely to be intelligible and would mainly be used for DSP system testing and benchmarking. Speech shaped noise is also used in audiometry.

Speech Synthesis: The process of using DSP for synthesizing human speech. A simple method is to digitally record a dictionary of a few thousand commonly used words and cascade them together to form a desired sentence. This rudimentary form of synthesis will have no intonation and be rather difficult to listen to and understand for long messages. It will also require a large amount of memory. True speech synthesizers can be set up with a set of formant filters, a fricative formant and nasal unit, and associated control algorithms (for context analysis etc.).

Speed of Sound: The speed of sound in air is nominally taken as being 330m/s. In actual fact, depending on the air pressure and temperature, this speed will vary up and down. More generally the speed of sound will depend on the solid, liquid or gas in which it is travelling.
Some typical values for the speed of sound are:

Substance        Approximate Speed of Sound (m/s)
Air at -10°C     325
Air at 0°C       330
Air at 10°C      337
Air at 20°C      343
Water            1500
Steel            5000-7000
Wood             3000-4000

Table 3: Typical values for the speed of sound. See also Absorption, Sound, Sound Pressure Level.

SPOX: A signal processing operating system and the associated library of functions.

Spread Spectrum: Spread spectrum is a communication technique whereby the bandwidth of the modulated signal to be transmitted is increased, and thereafter reduced again at the receiver [9], [16].

Square Matrix: See Matrix Structured - Square.

Square Root: The square root is a rare operation in real time DSP as most compression, digital filtering, and frequency transformation type algorithms require only multiply-accumulates with the occasional divide. Square roots are, however, found in some image processing routines (rotation etc.) and in DSP algorithms such as QR decomposition. General purpose DSP processors do not perform square roots in a single cycle, as they do for multiplication, and successive approximation techniques are usually used. Consider the following iterative technique to calculate √a:

    x_(n+1) = (1/2) ( x_n + a/x_n )        (547)

Using an initial guess of x_0 = a/2, the algorithm converges asymptotically. The algorithm is often said to have converged when a specified error quantity is less than a particular value. Finding the square root of a = 15, using the iterative update:

    x_(n+1) = (1/2) ( x_n + a/x_n )        (548)

After only 6 iterations the algorithm has converged to within 0.03 of the correct solution.

[Figure: The variable x_n plotted against iteration n for a = 15, converging from the initial guess x_0 = 7.5 towards √15.]

Square Root Decomposition: See Matrix Decompositions - Cholesky.

Square Root Free Given's Rotations: See Matrix Decompositions - Square Root Free Given's Rotations.

Square Root Matrix: See Matrix Properties - Square Root Matrix.

Square System of Equations: See Matrix Properties - Square System of Equations.
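The iterative update of Eqs. 547 and 548 under Square Root can be sketched in a few lines; the function name and default iteration count are our own choices.

```python
# A sketch of the iterative square root technique of Eqs. 547/548
# (Newton's method for x^2 = a), using the initial guess x_0 = a/2
# given in the text.

def iterative_sqrt(a, iterations=6):
    """Approximate the square root of a using x_(n+1) = (1/2)(x_n + a/x_n)."""
    x = a / 2.0                    # initial guess x_0 = a/2
    for _ in range(iterations):
        x = 0.5 * (x + a / x)      # the update of Eq. 547
    return x
```

For a = 15 the estimate is within 0.03 of √15 = 3.8729... well inside the 6 iterations quoted in the text, since the error shrinks quadratically once the estimate is close.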
Square Wave: A sequence of periodic rectangular pulses. See Rectangular Pulse.

Stability: If an algorithm in a DSP processor is stable then it is producing bounded and perhaps useful output results from the applied inputs. If an algorithm or system is not stable then it is exhibiting instability and the outputs are likely to oscillate. See Instability.

Stand-Alone DSP: Most DSP application programs are developed on DSP boards hosted by IBM PCs. After development of, for example, a DSP music effects box, the system will be stand-alone as it is no longer hosted by a PC.

Standards: Technology standards are agreed definitions, usually at the international level, which allow the compatibility, reliable operation and interoperability of systems. Of relevance to DSP there are various standards on telecommunications, radiocommunications, and information technology, most notably from the ISO, ITU and ETSI. See also Bell 103/113, Bell 202, Bell 212, Bento, Blue Book, Comité Européen de Normalisation Electrotechnique, Digital Video Interactive, European Broadcast Union, European Telecommunications Standards Institute, F-Series Recommendations, G-Series Recommendations, Global Information Infrastructure, Graphic Interchange Format, H-Series Recommendations, HyTime, I-Series Recommendations, IEEE Standard 754, Image Interchange Facility, Integrated Digital Services Network, International Electrotechnical Commission, International Mobile (Maritime) Satellite Organization, International Organisation for Standards, International Telecommunication Union, ITU-R Recommendations, ITU-T Recommendations, J-Series Recommendations, Joint Binary Image Group, Joint Photographic Experts Group, Moving Picture Experts Group, Multimedia and Hypermedia Information Coding Experts Group, Multipurpose Internet Mail Extensions, Multimedia Standards, Red Book, Resource Interchange File Format, T-Series Recommendations, V-Series Recommendations, X-Series Recommendations.
Static Random Access Memory (SRAM): Digital memory which can be read from or written to. SRAM does not need to be refreshed as DRAM does. See also Dynamic RAM.

Statistical Averages: See Expected Value.

Stationarity: See Strict Sense Stationary, Wide Sense Stationarity.

Status Register (SR): See Condition Code Register.

Step Reconstruction: See Zero Order Hold.

Step Size Parameter: Most adaptive algorithms take small steps when changing filter weights, parameters or signals being estimated. The size of this step is often a parameter of the algorithm called the step size (or the adaptive step size). As an example, the step size in the LMS (Least Mean Squares) algorithm is almost always denoted by µ. The larger µ, the larger the adaptive increments taken by the processor with each update. Haykin (1991) suggests a normalized LMS step size parameter, α, that is equal to µ normalized by the power of the input signal. This allows appropriate comparison of adaptive LMS processors operating with different input signals. The step size parameter can also vary with time -- this "variable step size" often allows adaptive algorithms to achieve faster convergence times and lower overall misadjustment simultaneously. See also Adaptive Signal Processing, Least Mean Squares Algorithm, Least Mean Squares Algorithm Variants - Variable Step Size LMS.

Stereo: Within DSP systems stereo has come to mean a system with two input channels and/or two output channels. See also Dual, Stereophonic.

Stereophonic: This refers to a system that has two independent audio channels. See also Monaural, Monophonic, Binaural.

Stochastic Conversion: If an ADC with only single bit resolution, producing two levels of -1 and +1, is used, then this is often referred to as stochastic conversion. See also Analog to Digital Conversion, Dithering.

Stochastic Process: A stochastic process is a random process. Random signals are good examples of stochastic processes.
A number of measurements are associated with stochastic signals, such as mean, variance, autocorrelation and so on. Signals such as short speech segments can be described as stochastic.

Stopband: The range of frequencies that are heavily attenuated by a filter. See also Passband.

Strict Sense Stationary: A random process is strict sense stationary if it has a time invariant mean, variance, 3rd order moment and so on. For most stochastic signals, strict stationarity is unlikely (or difficult to show) and not (usually) a necessary criterion for analysis, modelling, etc. Usually wide sense stationarity will suffice. When texts or papers refer to a stationary process they almost always mean stationary in the wide sense unless explicitly stating otherwise. For DSP, particularly least mean squares type algorithms, the looser criterion of wide sense stationarity is referred to. Strict sense stationarity implies wide sense stationarity, but the reverse is not necessarily true. A wide sense stationary Gaussian process, however, is also strict sense stationary. See also Wide Sense Stationarity.

Subband Filtering: A technique where a signal is split into subbands and DSP algorithms are applied (usually independently) to each subband [49]. When a signal is split into subbands the sampling rate can be reduced, and very often the PCM resolution can be reduced. See also Precision Adaptive Subband Coding.

Subband Coding: A technique whereby a signal is filtered into frequency bands which are then coded using fewer bits than for the original wideband signal. Good sub-band coding schemes exist for signal compression that exploit psychoacoustic perception. See also Precision Adaptive Subband Coding.

Sub-Harmonic: For a given fundamental frequency produced by, for example, a vibrating string, the frequencies of the harmonics are integer multiples of the fundamental frequency, and the frequencies of the sub-harmonics are obtained by dividing the fundamental frequency by integers.
See also Fundamental Frequency, Harmonic, Music.

[Figure: Magnitude versus frequency (Hz) showing a sub-harmonic at f0/2, the fundamental frequency f0, and harmonics at 2f0, 3f0 and 4f0. The frequency domain representation of a fundamental frequency signal with its associated harmonics and sub-harmonics.]

Subspace: See Vector Properties and Definitions - Subspace.

Subspace, Vector: See Vector Properties and Definitions - Subspace.

Subtractive Synthesis: Traditional analogue technique of synthesizing music starting with a signal that contains all possible harmonics of a fundamental. Thereafter harmonic elements can be filtered out (i.e. subtracted) in order to produce the desired sound [32]. See also Music, Western Music Scale.

Successive Approximation: A type of A/D converter which converts from analog voltage to digital values using an approximation technique based on a D/A converter.

Super Bit Mapping (SBM): SBM (a trademark of Sony) is a noise shaping FIR filter algorithm developed by Sony for mastering of compact disks from 20 bit master sources. It is essentially a noise shaping FIR filter of order 12 which produces a high pass noise shaping curve.

Surround Sound: A number of systems have been developed to create the impression that sound is spread over a wide area with the listener standing in the centre. DSP techniques are widely used to create artificial echo and reverberation to simulate the acoustics of stadiums and theatres. Dolby Surround Sound is widely used on the soundtracks of many major film releases. To be truly effective the sound should come from 360° with loudspeakers placed at the front and back of the listener.

Sustain: See Attack-Decay-Sustain-Release.

Switch: A device with (typically) two states, e.g. off and on, high or low etc. Also a means of connecting/disconnecting two systems.

Symbol: In a digital communications system the transmission and reception of information occurs in discrete chunks.
The symbol is the signal (one from a finite set) transmitted over the channel during the symbol period. The receiver detects which of the finite set of symbols was sent during each symbol period. The message is recovered by the decoding of the received symbol stream. The packaging of the message into discrete symbols sent over regular intervals forms the fundamental basis of any digital communication system. See also Digital Communications, Message, Symbol Period.

Symbol Period: In a digital communication system, the symbol period defines the regular time interval over which symbols are transmitted. During a symbol period exactly one of a finite number of signals is transmitted over the communications channel. Accurate knowledge of when this period begins and ends (synchronization) is required at the receiver in a communications system. See also Symbol, Digital Communications.

Symmetric Matrix: See Matrix Structured - Symmetric.

Synchronous: Meaning a system in which all transitions are regulated by a synchronizing clock.

System Identification: Using adaptive filtering techniques, an unknown filter or plant can be identified. In an adaptive system identification architecture, when the error ε(k) has been adapted to a minimum value (ideally zero) then, in some sense, y(k) ≈ d(k), and therefore the transfer function of the adaptive filter is now similar to, or the same as, the unknown filter or system. An example application of system identification would be to identify the transfer function of the acoustics of a room. See also Adaptive Filtering, Inverse System Identification, LMS Algorithm, Active Noise Cancellation.

[Figure: Generic adaptive signal processing system identification architecture. The input x(k) drives both the unknown system, which outputs d(k), and the adaptive filter, which outputs y(k); the error ε(k) = d(k) − y(k) is fed back to the adaptive algorithm.]

Systolic Arrays: A generic name for a DSP system that consists of a large number of very simple processors interconnected to solve larger problems [25].
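The system identification architecture described above can be sketched with an LMS weight update. The 4-tap "unknown system", step size and training length below are arbitrary illustrative choices, not values from the text.

```python
import random

# A minimal sketch of adaptive system identification using the LMS
# algorithm: the same input x(k) drives an "unknown" FIR system giving
# d(k) and an adaptive filter giving y(k); the error e(k) = d(k) - y(k)
# drives the weight update. The coefficients, step size mu, seed and
# training length are illustrative only.

unknown = [0.8, -0.4, 0.2, 0.1]        # unknown system to be identified
N = len(unknown)
w = [0.0] * N                          # adaptive filter weights
mu = 0.05                              # LMS step size parameter

random.seed(1)
x_buf = [0.0] * N                      # tapped delay line of recent inputs
for k in range(5000):
    x = random.uniform(-1.0, 1.0)      # white training input x(k)
    x_buf = [x] + x_buf[:-1]
    d = sum(h * xi for h, xi in zip(unknown, x_buf))    # desired d(k)
    y = sum(wi * xi for wi, xi in zip(w, x_buf))        # filter output y(k)
    e = d - y                                           # error e(k)
    w = [wi + mu * e * xi for wi, xi in zip(w, x_buf)]  # LMS weight update

# After adaptation the weights approximate the unknown system response.
```

With a persistently exciting (white) input and no measurement noise, the weights converge essentially exactly to the unknown coefficients, illustrating the y(k) ≈ d(k) condition described in the entry.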
T

T-Series Recommendations: The T-series telecommunication recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T and formerly known as CCITT) provide standards for terminal characteristics, protocols for telematic services, and document transmission architecture. Some of the current recommendations (http://www.itu.ch) include:

T.0        Classification of facsimile apparatus for document transmission over the public networks.
T.1        Standardization of phototelegraph apparatus.
T.2        Standardization of Group 1 facsimile apparatus for document transmission.
T.3        Standardization of Group 2 facsimile apparatus for document transmission.
T.4        Standardization of Group 3 facsimile apparatus for document transmission (+ amendment).
T.6        Facsimile coding schemes and coding control functions for Group 4 facsimile apparatus.
T.10       Document facsimile transmissions over leased telephone-type circuits.
T.10 bis   Document facsimile transmissions in the general switched telephone network.
T.11       Phototelegraph transmissions on telephone-type circuit.
T.12       Range of phototelegraph transmissions on a telephone-type circuit.
T.15       Phototelegraph transmission over combined radio and metallic circuits.
T.22       Standardized test charts for document facsimile transmissions.
T.23       Standardized colour test chart for document facsimile transmissions.
T.30       Procedures for document facsimile transmission in the general switched telephone network (+ amendment).
T.35       Procedure for the allocation of CCITT defined codes for non-standard facilities.
T.42       Continuous colour representation method for facsimile.
T.50       Information technology - 7-bit coded character set for information interchange.
T.51       Latin based coded character sets for telematic services.
T.53       Character coded control functions for telematic services.
T.60       Terminal equipment for use in the teletext service.
T.62 bis   Control procedures for teletext and G4 facsimile services based on X.215 and X.225.
T.64       Conformance testing procedures for the teletext.
T.65       Applicability of telematic protocols and terminal characteristics to computerized communication terminals (CCTs).
T.70       Network-independent basic transport service for the telematic services.
T.71       Link Access Protocol Balanced (LAPB) extended for half-duplex physical level facility.
T.80       Common components for image compression and communication - Basic principles.
T.81       Information technology; digital compression and coding of continuous-tone still images; requirements and guidelines.
T.82       Information technology - Coded representation of picture and audio information; progressive bilevel image compression (+ T.82 Correction 1).
T.83       Information technology - digital compression and coding of continuous-tone still images: compliance testing.
T.90       Characteristics and protocols for terminals for telematic services in ISDN (+ amendment).
T.100      International information exchange for interactive videotex.
T.102      Syntax-based videotex end-to-end protocols for the circuit mode ISDN.
T.103      Syntax-based videotex end-to-end protocols for the packet mode ISDN.
T.104      Packet mode access for syntax-based videotex via PSTN.
T.105      Syntax-based videotex application layer protocol.
T.106      Framework of videotex terminal protocols.
T.122      Multipoint communication service for audiographics and audiovisual conferencing service definition.
T.123      Protocol stacks for audiographic and audiovisual teleconference applications.
T.125      Multipoint communication service protocol specification.
T.351      Imaging process of character information on facsimile apparatus.
T.390      Teletext requirements for interworking with the telex service.
T.400      Introduction to document architecture, transfer and manipulation.
T.41X/T.42X  Information technology - Open document architecture (ODA) and interchange format.
T.431      Document transfer and manipulation (DTAM) - Services and protocols - Introduction and general principles.
T.432      Document transfer and manipulation (DTAM) services and protocols - Service definition.
T.433      Document Transfer, Access and Manipulation (DTAM) - Services and protocols - Protocol specification.
T.434      Binary file transfer format for the telematic services.
T.441      Document transfer and manipulation (DTAM) - Operational structure.
T.50X      Document application profile for the interchange of various documents.
T.510      General overview of the T.510-series.
T.521      Communication application profile BT0 for document bulk transfer based on the session service.
T.522      Communication application profile BT1 for document bulk transfer.
T.523      Communication application profile DM-1 for videotex interworking.
T.541      Operational application profile for videotex interworking.
T.561      Terminal characteristics for mixed mode (MM) of operation.
T.562      Terminal characteristics for teletext processable mode (PM.1).
T.563      Terminal characteristics for Group 4 facsimile apparatus.
T.564      Gateway characteristics for videotex interworking.
T.571      Terminal characters for the telematic file transfer within teletext service.
T.611      Programming communication interface (PCI) APPLI/COM for facsimile Group 3, facsimile Group 4, teletext, telex, e-mail and file transfer services.

For additional detail consult the appropriate standard document or contact the ITU. See also ITU-T Recommendations, International Telecommunication Union, Standards.

Tactile Perception: Sounds below 20Hz (infrasonic or infrasound) cannot be heard by most humans; however, this low frequency infrasound can be felt tactilely. Some pipe organs can play notes lower than 20Hz which can enhance the overall appreciation of the rest of the music in the audible range.

Tap: The name given to a data line corresponding to a delayed version of the input signal. A tapped delay line has several points (i.e., taps) where delayed input samples are multiplied by the individual weights of a digital filter.
The number of taps in a digital filter is equal to the number of weights or coefficients. For example, a particular FIR filter may be described as having 32 taps or 32 coefficients. The terms taps and weights (or coefficients) are used interchangeably -- this usage is imprecise, but we usually "know what is meant." See also FIR filter, IIR filter, Adaptive Filter.

Tape Speed: See Cassette Tape.

Tempco: See Temperature Coefficient.

Temperature Coefficient: The temperature coefficient gives a measure of the voltage (or current) drift of a component with respect to temperature change. For example, if a particular 20 bit ADC (range of 2^20 = 1,048,576 levels) had a temperature coefficient of 1ppm/°C, then this means that for a change in temperature of 1°C, the output of the ADC would drift by less than 1 bit.

Temporal Masking: The human ear may not perceive quiet sounds which occur a short time before or after a louder sound. This masking effect is called temporal masking. When the quiet sound occurs just after the louder sound (forward temporal masking) it may be interpreted that the ear has not "recovered" from the louder sound. If the quiet sound comes just before the louder sound then backward temporal masking may occur; a simple interpretation of this effect is less obvious. The effects of temporal masking are still a topic of debate and research [30]. For forward temporal masking, the closer together the loud and quiet sounds, the greater the masking effect that is likely to be present. The amount of masking is influenced by the frequency and sound pressure levels of the two sounds, and masking effects may occur for up to 200ms. Temporal masking can be useful for perceptual coding of audio whereby the first few milliseconds of sounds (such as after loud drumbeats) are not fully coded.

[Figure: Masking effect (dB) versus time around the duration of a "loud" sound, showing backward masking before the sound and forward masking after it. Sounds occurring just after the loud sound may in fact be (forward) masked (i.e. rendered perceptually inaudible) to the listener. A less pronounced backward masking effect also occurs.]

See also Audiology, Audiometer, Binaural Unmasking, Moving Picture Experts Group - Audio, Psychoacoustics, Psychoacoustic Subband Coding (PASC), Sound Pressure Level, Spectral Masking, Temporary Threshold Shift, Threshold of Hearing.

Temporary Threshold Shift (TTS): When the threshold of hearing is raised temporarily (i.e., the threshold eventually returns to normal) due to exposure to excessive noise, a temporary threshold shift is said to have occurred. Recovery can be within a few minutes or take several hours. Many people have experienced this effect by attending a loud concert or shooting a gun. See also Audiology, Audiometry, Threshold of Hearing, Permanent Threshold Shift.

Terrestrial Broadcast: TV and radio signals are sent to consumers in one of three ways: terrestrial, satellite, or cable. Terrestrial broadcasts transmit electromagnetic waves modulated with the radio or TV signal from earth based transmitters, and are received by earth based aerials or antennas.

Third Octave Band: A typical bandwidth measure used when making measurements of sound intensity over a few octaves of frequency. The third of an octave is usually one third of the particular octave. For example, choosing octave frequencies at 125, 250, 500Hz and so on, the bandwidths of the third of an octave bands are approximately 42Hz, 86Hz, and 166Hz. To compute a third octave frequency band around frequency f0, note that from 2^(1/6)f0 down to 2^(-1/6)f0 the ratio of the high and low frequencies is 2^(1/3), i.e. one third of an octave (an octave being a doubling of frequency). The third octave bandwidth is computed as (2^(1/6) - 2^(-1/6))f0. Three consecutive third octaves make an octave.

Third Order: Usually meaning three of a particular device cascaded together. Used in a non-consistent way. See also Second Order.
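The band edge and bandwidth expressions under Third Octave Band can be sketched directly; the function name is our own.

```python
# A sketch of the third octave band computation described above: the band
# around centre frequency f0 runs from 2^(-1/6)*f0 up to 2^(1/6)*f0,
# giving a bandwidth of (2^(1/6) - 2^(-1/6))*f0.

def third_octave_band(f0):
    """Return (lower edge, upper edge, bandwidth) of the third octave band at f0."""
    lower = 2.0 ** (-1.0 / 6.0) * f0
    upper = 2.0 ** (1.0 / 6.0) * f0
    return lower, upper, upper - lower

# The upper/lower ratio is 2^(1/3), one third of an octave, and three
# consecutive third octaves span a full octave.
```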
Threshold Detection: One of the most rudimentary forms of signal analysis, where a particular signal is monitored to find the points at which its magnitude is larger than some predefined threshold. For example, an ECG signal may be monitored using threshold detection in order to calculate the heart rate (the inverse of the R to R time).

[Figure: Thresholding an ECG waveform to determine the heart rate; amplitude versus time (secs), detecting all occurrences of the signal above the threshold level.]

Threshold of Audibility: The level of a tone that is just audible defines the threshold of audibility for that frequency. For a more general sound, the threshold of audibility is the level at which it becomes just audible. See also Audiogram.

Threshold of Hearing: The threshold of hearing or minimum audible field (MAF) is a curve of the minimum detectable sound pressure level (SPL) of pure frequency tones plotted against frequency. There are a number of different methods for obtaining the lower threshold of hearing depending on the actual point on/in the ear where SPL is measured, whether headphones or loudspeakers were used, and of course the cross section of population over which the averaged curve is obtained, i.e. different age groups, including/excluding hearing impaired persons and so on. (Note that although SPL was originally defined as a sound pressure level relative to the minimum detectable 1000Hz tone, established at 10^-12 W/m^2, the average threshold of hearing at 1000Hz is actually around 5dB.)

[Figure: Approximate threshold of hearing; SPL (dB) versus frequency (Hz) from 20Hz to 20kHz, with the audible region above the curve and the inaudible region below.]

The curve shown above is based on the Fletcher-Munson [73] and Robinson-Dadson [126] curves and is now a well established shape, showing clearly that the ear is most sensitive to the range 1000-5000Hz where speech is found. At very low and very high frequencies the minimum thresholds increase rapidly.
It is worthwhile noting that the threshold of pain is around 120dB, and prolonged exposure to such high intensities will damage the ear. The upper frequency limit of hearing can be as high as 20kHz for very young children, but in adults is about 12-15kHz. The lower limit of hearing is often quoted as 20Hz, as further reduction in frequency is not perceived as a further reduction in pitch. Also at these frequencies high SPL sounds can be "felt" as well as heard [30]. Many animals have hearing ranges well above 20kHz, the most noted example being dogs, which respond to the sound made by dog whistles that humans cannot hear. Given that the bandwidth of hi-fidelity digital audio systems is up to 22.05kHz for CD and 24kHz for DAT, it would appear that the full range of hearing is more than covered. However this is one of the key issues of the CD versus analogue records debate. The argument of some analog purists is that although humans cannot perceive individual tones above 20kHz, when listening to musical instruments which produce harmonic frequencies above the human range of hearing these high frequencies are perceived in some "collective" fashion. This adds to the perception of live music; the debate will doubtless continue into the next century. See also Audiogram, Audiometry, Auditory Filters, Binaural Unmasking, Ear, Equal Loudness Contours, Equivalent Sound Continuous Level, Frequency Range of Hearing, Habituation, Hearing Aids, Hearing Impairment, Hearing Level, Infrasound, Permanent Threshold Shift, Psychoacoustics, Sensation Level, Sound Pressure Level (SPL), Spectral Masking, Temporal Masking, Temporary Threshold Shift (TTS), Ultrasound.

Timbre: (Pronounced tam-ber). The characteristic sound that distinguishes one musical instrument from another. Key components of timbre are the signal amplitude envelope and the harmonic content of the signal [14]. See also Attack-Decay-Sustain-Release, Music, Western Music Scale.
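The heart rate calculation described under Threshold Detection above can be sketched as follows; the synthetic spike positions, sampling rate and threshold value are illustrative choices of our own, standing in for a real ECG recording.

```python
# A sketch of simple threshold detection: find where a signal first
# crosses above a threshold, then estimate heart rate as the inverse of
# the R-to-R interval. The synthetic pulse train below stands in for an
# ECG; positions and threshold are illustrative only.

def detect_crossings(signal, threshold):
    """Indices where the signal first rises above the threshold."""
    crossings = []
    for i in range(1, len(signal)):
        if signal[i] > threshold >= signal[i - 1]:
            crossings.append(i)
    return crossings

fs = 200                                   # sampling rate (samples/sec)
sig = [0.0] * (4 * fs)
for beat in (100, 260, 420, 580):          # R peaks every 160 samples (0.8 s)
    sig[beat] = 4.0                        # tall, narrow QRS-like spikes

peaks = detect_crossings(sig, 2.0)
rr = (peaks[1] - peaks[0]) / fs            # R-to-R interval in seconds
heart_rate = 60.0 / rr                     # beats per minute (75 here)
```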
Time Invariant: A quantity that is constant over time. For example, if the mean of a stochastic signal is described as being time invariant, then the measured value of the mean will be the same if measured today and again tomorrow.

TMS320: The part number prefix for the Texas Instruments series of DSP processors. One of the early members of the family was the TMS320C10 in 1984.

Toeplitz Matrix: See Matrix Structured - Toeplitz.

Tonal Distortion: If an analogue signal with periodic or quasi-periodic components is converted to a digital signal and the output contains harmonics of the periodic signal that were not present in the original, then this is referred to as tonal or harmonic distortion. For example, the following digital signal is a 200Hz sine wave sampled at 48000Hz with an amplitude of 100. A 16384 point FFT confirms that there is no tonal distortion present.

[Figure: The time and frequency representations, amplitude y(n) versus time (ms) and magnitude |Y(f)| (dB) versus frequency (kHz), of a 200Hz sine wave of amplitude 100, sampled at 48000Hz, i.e. y(k) = 100 sin(2π200k/48000). The 16384 point FFT shows that there is no tonal distortion. Note that on the frequency graph an amplitude of 100 corresponds to about -50dB (= 20 log(100/32767)), where the full scale amplitude of 32767 (= 2^15 - 1) is 0dB.]

However, when the signal is clipped at an amplitude of 80, this non-linear operation causes tonal distortion, as can be seen in the frequency domain representation:

[Figure: The time and frequency representations, amplitude d(n) versus time (ms) and magnitude |D(f)| (dB) versus frequency (kHz), of a 200Hz sine wave of amplitude 100, sampled at 48000Hz, i.e. d(k) = 100 sin(2π200k/48000), which has been clipped at ±80. The 16384 point FFT shows that there is clearly tonal distortion at integer multiples of the signal frequency.]
Also, when a very low level periodic signal is converted from an analog to a digital representation, the quantisation error will be correlated with the signal, which manifests itself as tonal distortion:

[Figure: The time and frequency representations of a 100Hz sine wave of amplitude 5, sampled at 48000 Hz, i.e. v(k) = 5 sin(2π·100k/48000). The 16384 point FFT shows that there is clearly tonal distortion present.]

When a speech or music signal is converted from analog to digital, the quasi-periodic nature of the signal may result in tonal distortion components. This tonal distortion may be due either to non-linearities in the system or to analog-to-digital conversion of very low level signals. See also Dithering, Total Harmonic Distortion.

Tone (1): A pure sine wave (existing for all time, t).

Tone (2): In music theory each adjacent note in the chromatic scale differs by one semitone, which corresponds to multiplying the lower frequency by the twelfth root of 2, i.e. 2^(1/12) = 1.0594631… . A difference of two semitones is a tone. Coincidentally (or perhaps by design!) “tone” is an anagram of “note”, as in musical note. See also Western Music Scale.

Tone Generation: See Dual Tone Multifrequency - Tone Generation.

Total Error Budget: Virtually every component in a standard input/output DSP system will contribute some error, or noise, to a signal passing through. If a designer knows the tolerable error in the final system output, then from this total error budget, tolerances and allowable errors can be assigned to components. In a DSP system the designer will need to consider both analog and digital components in the total error budget.

Total Harmonic Distortion (THD): If a pure tone signal of M Hz is played into a system and the output is found to contain not only the original signal, but also small components at harmonic frequencies of 2M, 3M, and so on, then distortion has occurred.
The THD is calculated as the ratio of the total energy contained in the harmonics to the energy of the signal itself, and is usually expressed as a percentage or in dB. See also Total Harmonic Distortion plus Noise.

Total Harmonic Distortion plus Noise (THD+N): A measure often associated with ADCs and DACs, defined as the ratio of all spectral components over the specified bandwidth, excluding the input signal, to the rms value of the signal. See also Total Harmonic Distortion.

TP Algorithm: The Turning Point algorithm is a technique to reduce the sampling frequency of an ECG signal from 200 to 100 samples/sec. The algorithm developed from the observation that, except for the QRS portion of the ECG with its large amplitudes and slopes, a sampling rate of 100 samples/sec was more than adequate. The algorithm processes three points at once in order to identify where a significant turning point occurs.

Trace of a Matrix: See Matrix Properties - Trace.

Transceiver: A data communications device that can both transmit and receive data.

Transcoding: Converting from one form of coded information to another. For example, converting from MPEG1 compressed video to H.261 compressed video can be termed transcoding.

Transducer: A device for converting one form of energy into another, e.g. a microphone converts sound energy into electrical energy.

Transform Coding: For some signals, mathematical transformation of the data into another domain may yield a data set that is more amenable to compression techniques than the original signal. The transform is usually applied to small blocks of data which are compared with a standard set of blocks to produce a correlation function for each. The signal is decompressed by applying the correlation functions as a weighting to each standard block. It is possible to combine transform coding and predictive coding to yield powerful compression algorithms. The disadvantage is that the algorithms are computation intensive. See also JPEG, MPEG, DCT.
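A minimal sketch of block transform coding, using an orthonormal DCT-II matrix built directly from its definition. The block size, the number of retained coefficients and the test signal are illustrative choices:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis: C @ C.T == I, so the inverse transform is C.T."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def compress_block(x, c, keep):
    """Transform a block, zero all but the `keep` largest coefficients, invert."""
    coeffs = c @ x
    idx = np.argsort(np.abs(coeffs))[:-keep]  # indices of the smallest coefficients
    coeffs[idx] = 0.0
    return c.T @ coeffs

np.random.seed(0)
n = 16
c = dct_matrix(n)
x = np.sin(2 * np.pi * np.arange(n) / n) + 0.05 * np.random.randn(n)
x_hat = compress_block(x, c, keep=8)  # smooth signals survive coefficient truncation
```

Because the transform is orthonormal, zeroing coefficients can only remove energy (Parseval), and keeping all coefficients reconstructs the block exactly.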
Transfer Function: A description (usually in the mathematical Z-domain) of the function a particular linear system will perform on signals. For example, the transfer function of a very simple low pass filter, y(n) = x(n) + x(n-1), could be given as:

H(z) = Y(z)/X(z) = 1 + z^-1     (549)

See also Impulse Response.

Transients: When an impulse is applied to a system, the resulting signal is often referred to as a transient. For example, when a piano key is struck, the piano wire creates a transient as it continues to vibrate long after the key was struck. Sometimes, unexplained small currents and voltages within a system are described (and perhaps dismissed) as transients.

Transpose Matrix: See Matrix Operations - Transpose.

Transpose Vector: See Vector Properties and Definitions - Transpose.

Transputer: A microprocessor designed by INMOS Ltd. The first parallel processing chips (T212, T414, and T800) had four serial links to allow intercommunication with other Transputers. Since its launch in 1984 the Transputer, despite its catchy name, failed to set the computing world on fire. Although the Transputer was used for many DSP applications, its slow arithmetic restricted its use and it never became a general purpose DSP.

Trellis Coded Modulation (TCM): TCM is a digital modulation technique that combines convolutional coding and decoding techniques (including the Viterbi algorithm) with signal design to reduce transmission errors in a digital communication system while retaining the same average symbol energy and system bandwidth. TCM increases the number of signals in a signal set by some factor of two without increasing the signal space dimension (i.e., the system bandwidth). The coder and decoder exploit the increase in the number of signals by separating signals both by Euclidean distance in signal space as well as free distance in the convolutional code trellis.
The Viterbi algorithm is used with a Euclidean distance rather than a Hamming distance as the appropriate metric to minimize probability of error (for the additive white Gaussian noise channel). Trellis Codes are often referred to as Ungerboeck Codes, after G. Ungerboeck who is credited with their development. See also Viterbi Algorithm, Euclidean Distance, Hamming Distance.

Tremolo: Tremolo is the effect where a low frequency amplitude modulation is applied to the musical output of an instrument. Tremolo can be performed digitally using simple multiplicative DSP techniques [32]:

Tremolo Signal = cos(2π(f_t/f_s)k) s(k)     (550)

where f_s is the sampling frequency, f_t is the tremolo frequency of modulation and s(k) is the original digital music signal. In practice, however, the tremolo effect may require more subtle forms of modulation to produce an aesthetic sound. See also Music, Vibrato.

Triangular Pulse (Continuous and Discrete Time): The continuous time triangular pulse can be defined as:

tri((t - t_0)/τ) = 1 - |t - t_0|/τ  if |t - t_0| ≤ τ;  0 otherwise     (551)

[Figure: The continuous triangular pulse g(t) = tri((t - t_0)/τ), rising from 0 at t_0 - τ to a peak of 1 at t_0 and falling back to 0 at t_0 + τ.]

The discrete time triangular pulse can be defined as:

tri((k - k_0)/κ) = 1 - |k - k_0|/κ  if |k - k_0| ≤ κ;  0 otherwise     (552)

[Figure: The discrete time triangular pulse g(k) = tri((k - k_0)/κ).]

See also Elementary Signals, Rectangular Pulse, Square Wave, Unit Impulse Function, Unit Step Function.

Triangularization: See Matrix Decompositions - Cholesky/LU/QR.

Tridiagonal Matrix: See Matrix Structured - Tridiagonal.

Truncation Error: When two N bit numbers are multiplied together, the result is a number with 2N bits. If a fixed point DSP processor with N bits resolution is used, the 2N bit number cannot be accommodated for future computations, which can operate on only N bit operands.
Therefore, if we assume that the original N bit numbers were both constrained to be less than 1 in magnitude by using a binary point, then the 2N bit result is also less than 1. Hence, if we throw away the last N bits, we lose precision. This loss of precision is referred to as truncation error. Although the truncation error for a single computation is usually not significant, many errors added together can be significant. Furthermore, if the result of a computation yields the value of 0 (zero) after truncation, and this result is to be used as a divisor, a divide by zero error will occur. See also Round-Off Error, Fractional Binary.

Binary:  0.1101011 × 0.1000100 = 0.011100011011000, truncated to 0.0111000
Decimal: 0.8359375 × 0.53125 = 0.444091796875, truncated to 0.4375

After multiplication of two 8 bit numbers the 16 bit result is truncated to 8 bits, introducing a binary round off error of 0.000000011011000, which in decimal is 0.006591796875. If rounding had been used, then the result would have been 0.0111001, which is an error of 0.000000000101000, and in decimal an error of 0.001220703125.

Truncation Noise: When truncation errors are considered in terms of their mean power, this results in a measure of the truncation noise. See also Truncation Error.

Tweeter: The section of a loudspeaker that reproduces high frequencies is often called the tweeter. The name is derived from the high pitched tweet of a bird. See also Woofer.

Twisted Pair: The name given to a pair of twisted copper wires used for telephony. The gauge (and, consequently, the frequency response) of this type of transmission line will depend on the precise purpose and location. The “twist” is to improve common mode noise rejection.

Two’s Complement: The type of arithmetic used by most DSP processors which allows a very convenient way of representing negative numbers, and imposes no overhead on arithmetic operations.
In two’s complement the most significant bit is given a negative weighting, e.g. for a 16 bit number:

1001 0000 0000 0001_2 = -2^15 + 2^12 + 2^0 = -32768 + 4096 + 1 = -28671     (553)

See also Sign Bit.

Two-wire Circuit: A circuit formed of two conductors insulated from each other, providing a send and return path. Signals may pass in one or both directions, although not at the same time. See also Four Wire Circuit, Half Duplex, Full Duplex.

U

Ungerboeck Codes: See Trellis Coded Modulation.

Ultrasonic: Acoustic signals (speed in air, 330 ms^-1) having frequencies above 20kHz, the upper limit of human hearing. The ultrasonic spectrum extends up to MHz frequencies.

Underdetermined System: See Matrix Properties - Underdetermined System of Equations.

Unit Impulse Function (Continuous Time and Discrete Time): The mathematical definition of the continuous time unit impulse function is a signal with an infinite magnitude, an infinitesimal duration, and a unit area. The continuous time unit impulse function is often referred to as the Dirac impulse (or Dirac delta function) and is not physically realisable. The mathematical representation for the continuous time unit impulse function occurring at time t_0 is usually denoted by the Greek letter δ (delta) in the form:

δ(t - t_0) = 0 if t ≠ t_0; undefined if t = t_0     (554)

Graphically, the Dirac impulse, δ(t - t_0), can be represented by rectangular or triangular models where ε → 0:

[Figure: Rectangular (width ε, height 1/ε) and triangular (base 2ε, height 1/ε) models of the continuous time unit impulse function at t_0. As ε → 0 both models become infinitely tall and infinitesimally thin, but continue to maintain a unit area.]

Although the Dirac impulse does not exist in the real physical world, it does have significant importance in the mathematical analysis of signals and systems.
A useful mathematical definition of the continuous time unit impulse function is:

δ(t) = du(t)/dt     (555)

where u(t) is the unit step function. (To be mathematically correct, the impulse function is actually a distribution rather than a function of time. The distinction is that a function must be single valued, and for any time, t, the function has one and only one value.)

The discrete time unit impulse function has a magnitude of 1 at a specific (discrete) time. It is bounded for all time and is therefore physically realizable. The discrete time unit impulse function is often referred to as the Kronecker impulse (or Kronecker delta function). The mathematical representation for the discrete time unit impulse function occurring at (discrete) time k_0 is usually denoted by the Greek letter δ (delta):

δ(k - k_0) = 1 if k = k_0; 0 if k ≠ k_0     (556)

[Figure: The discrete time unit impulse function δ(k - k_0), a single sample of magnitude 1 at k = k_0.]

Both the discrete time and continuous time unit impulse functions exhibit a sampling property when a signal is multiplied by a unit impulse function and summed or integrated over time. Hence they are extremely useful mathematical tools for the analysis and definition of DSP sampled data systems. See also Elementary Signals, Fourier Transform Properties, Impulse, Rectangular Pulse, Sampling Property, Unit Step Function.
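The sampling property of the discrete time unit impulse can be checked numerically. A minimal sketch; the test signal is an arbitrary choice:

```python
import numpy as np

def kronecker_delta(k, k0):
    """Discrete time unit impulse delta(k - k0) evaluated over an index array k."""
    return np.where(k == k0, 1.0, 0.0)

k = np.arange(32)
x = np.cos(0.2 * np.pi * k)  # an arbitrary test signal
k0 = 7

# Sampling property: sum over k of x(k) * delta(k - k0) picks out x(k0)
sampled = np.sum(x * kronecker_delta(k, k0))
```

The sum selects the single sample x(k0), which is exactly the sense in which the impulse "samples" a signal.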
Unit Step Function (Continuous Time and Discrete Time): The mathematical representation for the continuous time unit step function occurring at time t_0 is usually denoted by the letter u, and defined by:

u(t - t_0) = 0 if t < t_0; 1 if t ≥ t_0     (557)

[Figure: The continuous time unit step function u(t - t_0), which switches from 0 to 1 at t = t_0.]

The unit step function can be mathematically derived from the unit impulse function, δ(t), as:

u(t) = ∫ from -∞ to t of δ(τ) dτ     (558)

The discrete time unit step function is denoted by:

u(k - k_0) = 0 if k < k_0; 1 if k ≥ k_0     (559)

[Figure: The discrete time unit step function u(k - k_0).]

Rectangular, or pulse, functions can be generated by the addition of unit step functions:

[Figure: The rectangular pulse x(k) = u(k - 4) - u(k - 10), which is 1 for k = 4 to 9 and 0 elsewhere.]

See also Elementary Signals, Fourier Transform Properties, Impulse, Rectangular Pulse, Sampling Property, Step Response, Unit Impulse Function.

Unit Step Response: See Step Response.

Unit Pulse Function: See Rectangular Pulse, Unit Step Function.

Unit Vector: See Vector Properties and Definitions - Unit Vector.

Unitary Matrix: See Matrix Properties - Unitary.

Unstable: See Instability.

Upper Triangular Matrix: See Matrix Structured - Upper Triangular.

Upsampling: Increasing the sampling rate of a digital signal by inserting zeroes between adjacent samples. To upsample a digital signal, x_k, sampled at f_s Hz to Mf_s Hz requires that M-1 zeroes are inserted between adjacent samples in the original signal. Upsampling in combination with a low pass filter to remove the aliased portions of the frequency spectra gives interpolation. Upsampling has no effect on the shape of the frequency spectrum of the signal.
(If upsampling was performed using a digital zero order hold, i.e. the value of x_k is inserted instead of zeroes, then the frequency spectrum of the output signal is modulated by a sinc function.) See also Downsampling, Decimation, Fractional Sampling Rate Converter, Interpolation, Sigma Delta Converter, Zero Order Hold.

[Figure: A 4 times upsampler. The input x(k), sampled at f_s = 1/t_s, becomes the output y(k) at four times the rate. The output spectrum |Y(f)| contains the input spectrum |X(f)| together with images repeated at multiples of f_s up to the new Nyquist frequency.]

V

V-Series Recommendations: The V-series recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T, and formerly known as CCITT) propose a number of standards for telecommunication based data transmission. Among the more well known of these standards from a DSP perspective are V.22bis (2400 bit/s modem), V.32bis (14400 bit/s modem), V.34 (28800 bit/s modem), and V.42bis (data compression procedures), which all feature advanced adaptive signal processing techniques for echo control and data equalisation. Some of the current ITU-T V-series recommendations (http://www.itu.ch) can be summarised as:

V.1 Equivalence between binary notation symbols and the significant conditions of a two-condition code.
V.2 Power levels for data transmission over telephone lines.
V.4 General structure of signals of International Alphabet No. 5 code for character oriented data transmission over public telephone networks.
V.7 Definitions of terms concerning data communication over the telephone network.
V.8 Procedures for starting sessions of data transmission over the general switched telephone network.
V.10 Electrical characteristics for unbalanced double-current interchange circuits operating at data signalling rates nominally up to 100 kbit/s.
V.11 Electrical characteristics for balanced double-current interchange circuits operating at data signalling rates up to 10 Mbit/s.
V.13 Simulated carrier control.
V.14 Transmission of start-stop characters over synchronous bearer channels.
V.15 Use of acoustic coupling for data transmission.
V.16 Medical analogue data transmission modems.
V.17 A 2-wire modem for facsimile applications with rates up to 14 400 bit/s.
V.18 Operational and interworking requirements for modems operating in the text telephone mode.
V.19 Modems for parallel data transmission using telephone signalling frequencies.
V.21 300 bits per second duplex modem standardized for use in the general switched telephone network.
V.22 1200 bits per second duplex modem standardized for use in the general switched telephone network and on point-to-point 2-wire leased telephone-type circuits.
V.22bis 2400 bits per second duplex modem using the frequency division technique standardized for use on the general switched telephone network and on point-to-point 2-wire leased telephone-type circuits.
V.23 600/1200-baud modem standardized for use in the general switched telephone network.
V.24 List of definitions for interchange circuits between data terminal equipment (DTE) and data circuit-terminating equipment (DCE). The V.24 standard is very similar to the RS232 standard.
V.25 Automatic answering equipment and/or parallel automatic calling equipment on the general switched telephone network, including procedures for disabling of echo control devices for both manual and automatic operation.
V.25bis Automatic calling and/or answering equipment on the general switched telephone network (GSTN) using the 100-series interchange circuits.
V.26 2400 bits per second modem standardized for use on 4-wire leased telephone-type circuits.
V.26bis 2400/1200 bits per second modem standardized for use in the general switched telephone network.
V.26ter 2400 bits per second duplex modem using the echo cancellation technique standardized for use on the general switched telephone network and on point-to-point 2-wire leased telephone-type circuits.
V.27 4800 bits per second modem with manual equalizer standardized for use on leased telephone-type circuits.
V.27bis 4800/2400 bits per second modem with automatic equalizer standardized for use on leased telephone-type circuits.
V.27ter 4800/2400 bits per second modem standardized for use in the general switched telephone network.
V.28 Electrical characteristics for unbalanced double-current interchange circuits.
V.29 9600 bits per second modem standardized for use on point-to-point 4-wire leased telephone-type circuits.
V.31 Electrical characteristics for single-current interchange circuits controlled by contact closure.
V.31bis Electrical characteristics for single-current interchange circuits using optocouplers.
V.32 A family of 2-wire, duplex modems operating at data signalling rates of up to 9600 bit/s for use on the general switched telephone network and on leased telephone-type circuits.
V.32bis A duplex modem operating at data signalling rates of up to 14400 bit/s for use on the general switched telephone network and on leased point-to-point 2-wire telephone-type circuits.
V.33 14400 bits per second modem standardized for use on point-to-point 4-wire leased telephone-type circuits.
V.34 A modem operating at data signalling rates of up to 28800 bit/s for use on the general switched telephone network and on leased point-to-point 2-wire telephone-type circuits.
V.36 Modems for synchronous data transmission using 60-108 kHz group band circuits.
V.37 Synchronous data transmission at a data signalling rate higher than 72 kbit/s using 60-108 kHz group band circuits.
V.38 A 48/56/64 kbit/s data circuit terminating equipment standardized for use on digital point-to-point leased circuits.
V.41 Code-independent error-control system.
V.42 Error-correcting procedures for DCEs using asynchronous-to-synchronous conversion.
V.42bis Data compression procedures for data circuit-terminating equipment (DCE) using error correction procedures.
V.50 Standard limits for transmission quality of data transmission.
V.51 Organization of the maintenance of international telephone-type circuits used for data transmission.
V.52 Characteristics of distortion and error-rate measuring apparatus for data transmission.
V.53 Limits for the maintenance of telephone-type circuits used for data transmission.
V.54 Loop test devices for modems.
V.55 Specification for an impulsive noise measuring instrument for telephone-type circuits.
V.56 Comparative tests of modems for use over telephone-type circuits.
V.57 Comprehensive data test set for high data signalling rates.
V.58 Management information model for V-series DCEs.
V.100 Interconnection between public data networks (PDNs) and the public switched telephone networks (PSTN).
V.110 Support of data terminal equipments with V-series type interfaces by an integrated services digital network.
V.120 Support by an ISDN of data terminal equipment with V-series type interfaces with provision for statistical multiplexing.
V.230 General data communications interface layer 1 specification.

For additional detail consult the appropriate standard document or contact the ITU. See also Bell 103/113, Bell 202, Bell 212, International Telecommunication Union, ITU-T Recommendations, Modem, Standards.

Variable Step Size LMS: See Least Mean Squares Algorithm Variants.

Variance: The variance of a signal is the mean of the square of the signal about the mean value. If the signal is ergodic the statistical averages will equal the time averages, and then:

m = Mean of x(k) = E{x(k)} = Σ_k x(k) p{x(k)} ≈ (1/N) Σ from k=0 to N-1 of x(k)     (560)

and

Variance of x(k) = E{(x(k) - m)^2} = Σ_k (x(k) - m)^2 p{x(k)} ≈ (1/N) Σ from k=0 to N-1 of [x(k) - m]^2     (561)

for large N.
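The time-average forms of Eqs. 560 and 561 can be sketched directly. The sinusoid-plus-offset test signal is an arbitrary choice for illustration:

```python
import numpy as np

N = 48000
k = np.arange(N)
x = 2.0 + np.sin(2 * np.pi * 100 * k / 48000)  # DC offset 2, AC amplitude 1

m = np.sum(x) / N                # time-average estimate of the mean (Eq. 560)
var = np.sum((x - m) ** 2) / N   # time-average estimate of the variance (Eq. 561)

# Variance measures the AC power: for a unit-amplitude sinusoid this is 1/2,
# independent of the DC offset.
```

Over an exact whole number of periods the estimates come out at m = 2 and var = 0.5, illustrating that the variance ignores the DC component.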
In a practical DSP situation where real signals are being used, the variance is often calculated using time averages. Variance gives a measure of the AC power in a signal. See also Ergodic, Expected Value, Mean Value, Mean Squared Value, Wide Sense Stationarity.

Vector: A vector is a set of ordered information. A vector is usually denoted in texts using boldface lower case letters, v (cf. matrices, denoted by upper case boldface), or with an underscore. A column vector has n rows and one column, i.e. n × 1 dimension, and a row vector has one row and n columns, i.e. 1 × n dimension. In DSP a vector is usually a set of ordered elements conveying information or data. For example, the last N samples of a signal, g_k, may be stored in a contiguous array of memory and referred to and operated on as a (data) vector:

g_k = [g_k, g_{k-1}, g_{k-2}, ..., g_{k-N+2}, g_{k-N+1}]^T     (562)

Vectors can be added, subtracted, multiplied, scaled and transposed. See also Data Vector, Matrix, Vector Operations, Vector Properties, Weight Vector.

Vector Addition: See Vector Operations - Addition.

Vector-Matrix Multiplication: See Vector Operations - Matrix-Vector Multiplication.

Vector Multiplication: See Vector Operations - Multiplication.

Vector Operations: Vectors of the appropriate dimension can be added, subtracted, multiplied, scaled, and transposed.

• Addition (Subtraction): If two vectors are to be added (or subtracted) then they must be of exactly the same dimension. Each element in one vector is added to (subtracted from) the analogous element in the other vector. For example:

[1, 6, 2]^T + [3, 0, 3]^T = [(1+3), (6+0), (2+3)]^T = [4, 6, 5]^T     (563)

Vector addition is commutative, i.e. a + b = b + a.

• Dot Product: See Vector Operations - Inner Product.

• Inner Product: When a row vector is multiplied by a column vector of the same dimension, the result is a scalar called the inner product. For example, an FIR filter forms an inner product by multiplying the weight vector by the data vector.
The inner product is sometimes referred to as the dot product. See also Outer Product.

w^T x = [w_0 w_1 w_2 w_3] [x_k, x_{k-1}, x_{k-2}, x_{k-3}]^T     (564)

• Multiplication: Two vectors, w and v, can be multiplied either to form the inner product, w^T v, or the outer product, w v^T. The inner product (also known as the dot product) of a 1 × n and an n × 1 vector is a scalar. For example:

[w_0 w_1 w_2] [x_0, x_1, x_2]^T = w_0 x_0 + w_1 x_1 + w_2 x_2     (565)

The outer product of an n × 1 and a 1 × n vector is a square n × n matrix. For example:

[w_0, w_1, w_2]^T [v_0 v_1 v_2] = [w_0 v_0, w_0 v_1, w_0 v_2; w_1 v_0, w_1 v_1, w_1 v_2; w_2 v_0, w_2 v_1, w_2 v_2]     (566)

The inner product is widely used for digital filter representation, and the outer product is found in a number of linear algebraic derived DSP algorithms such as Recursive Least Squares.

• Matrix-Vector Multiplication: An n × 1 vector can be premultiplied by an m × n matrix to give an m × 1 vector. For example:

[a_11, a_12; a_21, a_22; a_31, a_32] [b_1, b_2]^T = [a_11 b_1 + a_12 b_2, a_21 b_1 + a_22 b_2, a_31 b_1 + a_32 b_2]^T     (567)

A 1 × n vector can be postmultiplied by an n × m matrix to give a 1 × m vector:

[b_1 b_2] [a_11, a_21, a_31; a_12, a_22, a_32] = [a_11 b_1 + a_12 b_2, a_21 b_1 + a_22 b_2, a_31 b_1 + a_32 b_2]     (568)

Note that if Ab = c then b^T A^T = c^T. See also Matrix Operations.

• Scaling: A vector, a, is scaled by multiplying each element by a scale factor, s:

s a = s [a_1, a_2, a_3]^T = [s a_1, s a_2, s a_3]^T     (569)

• Transpose: The transpose of a row vector is obtained by writing the top to bottom elements as the left to right elements of a column vector, and vice-versa for the transpose of a column vector. The transpose of a vector a is denoted as a^T. For example, if:

b = [b_1, b_2, b_3, b_4]^T  then  b^T = [b_1 b_2 b_3 b_4]     (570)

Note that (b^T)^T = b.

• Subtraction: See Vector Operations - Addition.

• Vector-Matrix Multiplication: See Vector Operations - Matrix-Vector Multiplication.

See also Matrix, Vector Properties and Definitions.
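The vector operations above map directly onto array operations. A minimal numpy sketch with arbitrary example values:

```python
import numpy as np

w = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])

inner = w @ x            # inner (dot) product: a scalar, cf. Eq. 565
outer = np.outer(w, x)   # outer product: a 3 x 3 matrix, cf. Eq. 566

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])  # a 3 x 2 matrix
b = np.array([7.0, 8.0])

c = A @ b                # matrix-vector product, cf. Eq. 567: a 3-element vector
c_T = b @ A.T            # the transpose identity: if Ab = c then b^T A^T = c^T
```

The inner product line is exactly the computation an FIR filter performs on its weight and data vectors at each sample instant.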
Vector Properties and Definitions: A number of vector properties can be defined:

• Basis: A basis is a minimal set of linearly independent vectors which spans a particular subspace. Any vector in the subspace spanned by the basis vectors can be represented by a unique linear combination of the basis vectors.

• Cauchy-Schwartz Inequality: The Cauchy-Schwartz inequality as applied to the 2-norm of two vectors is given by:

|w^T x| ≤ ||w||_2 ||x||_2     (571)

A useful interpretation of this inequality is that the output of an FIR digital filter will have a magnitude less than or equal to the product of the 2-norms of the weight vector and data vector; this information can be useful in deciding the wordlength required by a DSP processor. See also Vector Properties - Norm, FIR Filter.

• ∞-norm: See Vector Properties - Norm.

• Linearly Dependent: See Linearly Independent entry.

• Linearly Independent: A set of vectors, {x_1, x_2, ..., x_N}, is linearly independent if:

Σ from j=1 to N of α_j x_j = 0  implies that  α_i = 0, for i = 1 to N     (572)

If this condition is not true, then the vector set {x_1, x_2, ..., x_N} is said to be linearly dependent. As an example consider the vector set:

{x_1, x_2, x_3} = { [1, 0, 0]^T, [0, 1, 0]^T, [0, 0, 1]^T }     (573)

There is clearly no linear combination of {x_1, x_2, x_3} such that:

Σ from j=1 to 3 of α_j x_j = α_1 x_1 + α_2 x_2 + α_3 x_3 = 0     (574)

other than the trivial solution of α_1 = α_2 = α_3 = 0. The set of vectors {x_1, x_2, x_3} is therefore linearly independent. However, the set of vectors:

{w_1, w_2, w_3} = { [1, 0, 0]^T, [0, 1, 0]^T, [1, 2, 0]^T }     (575)

is not linearly independent (and is therefore linearly dependent) as:

Σ from j=1 to 3 of α_j w_j = α_1 w_1 + α_2 w_2 + α_3 w_3 = 0     (576)

if α_1 = 1, α_2 = 2, α_3 = -1. See also Vector Properties - Basis, Subspace, Rank.

• Minimum Norm: A system of linear equations can be defined as:

Ax = b     (577)

where A is a known m × n matrix with rank(A) = min(m, n), x is an unknown n element vector, and b is a known m element vector.
If multiple solutions exist that give the same error between Ax and b, then the solution with the minimum 2-norm is typically desirable. This solution is referred to as the minimum norm solution and is given by:

x_LS = A^+ b     (578)

where A^+ is the pseudoinverse. See also Matrix Properties - Underdetermined/Overdetermined, Pseudoinverse, Vector Properties - Norm.

• Norm: The vector norm provides a measure of the magnitude or distance spanned by an n element vector in n-dimensional space. The most useful class of norms are the p-norms defined by:

||v||_p = ( |v_1|^p + |v_2|^p + ... + |v_n|^p )^(1/p)     (579)

The most often used of these norms is the 2-norm, also referred to as the magnitude of the vector v:

||v||_2 = ( v_1^2 + v_2^2 + ... + v_n^2 )^(1/2)     (580)

The square of the 2-norm is denoted as ||v||_2^2. For example, the 2-norm of a vector, x:

x = [3, 4, -7]^T,  ||x||_2 = √(9 + 16 + 49) = √74 = 8.602     (581)

Other norms occasionally used are the 1-norm, which is the sum of the magnitudes of all of the elements, and the ∞-norm, which returns the magnitude of the element with the largest absolute value:

||v||_1 = |v_1| + |v_2| + ... + |v_n|     (582)

||v||_∞ = max |v_i|, for i = 1 to n     (583)

For the above 3 element vector x, ||x||_1 = 14 and ||x||_∞ = 7. A p-norm unit vector is one for which ||x||_p = 1. See also Matrix Properties - Invariant Norm.

• One Norm: See Vector Properties - Norm.

• Orthogonal: A set of vectors (v_1, v_2, v_3, ..., v_n) is said to be orthogonal if:

v_i^T v_j = 0 for all i ≠ j     (584)

• Orthonormal: A set of vectors (v_1, v_2, v_3, ..., v_n) is said to be orthonormal if:

v_i^T v_j = δ_ij for all i, j     (585)

where δ_ij is the Kronecker delta (i.e., δ_ij = 1 if i = j and δ_ij = 0 otherwise). Orthogonal and orthonormal sets of vectors seem closely related, and they are. The important distinction between an orthogonal set of vectors and an orthonormal set of vectors is that the vectors from the orthonormal set all have a norm of one.
This is not necessarily the case for the set of orthogonal vectors.

• Outer Product: When a column vector (n × 1) is post-multiplied by a row vector (1 × n) the result is an n × n matrix. For example, for n = 3:

x x^T = [x_1, x_2, x_3]^T [x_1 x_2 x_3] = [x_1^2, x_1 x_2, x_1 x_3; x_2 x_1, x_2^2, x_2 x_3; x_3 x_1, x_3 x_2, x_3^2]     (586)

The outer product is used to realise estimates of the covariance matrix and/or correlation matrix and is widely used in adaptive digital signal processing formulations. See also Vector Properties - Inner Product.

• Subspace: Given an m-dimensional space, ℜ^m, and a set of m-dimensional vectors (v_1, v_2, v_3, ..., v_n), the set of all possible linear combinations of these vectors forms a subspace of ℜ^m. The form of the linear combination is given by:

Σ from i=1 to n of α_i v_i  where α_i ∈ ℜ     (587)

The subspace defined by the linear combination of the vectors is said to be the span of (v_1, v_2, v_3, ..., v_n). For example, consider the space ℜ^3. The set of vectors:

v_1 = [1, 0, 0]^T  and  v_2 = [0, 0, 2]^T     (588)

can only specify points on the x-z plane within the three dimensional [x, y, z] space. Hence v_1, v_2 specify a subspace.

[Figure: The x-z plane subspace of ℜ^3 spanned by the vectors [1, 0, 0]^T and [0, 0, 2]^T.]

There are effectively an infinite number of (plane) subspaces of ℜ^3. Note that a subspace of ℜ^3 could also be a straight line in three dimensional space if, for example, only v_1 = [1, 0, 0]^T is used to define the subspace. Since the form of the linear combination in Eq. 587 allows the scalars to be any value (including all zeros), it is clear that the origin has to be a point in any valid subspace.

• Space: Given a vector, v = [v_1, v_2, ..., v_m]^T, of dimension or length m, then if v_i ∈ ℜ for i = 1, 2, ..., m, where ℜ is the set of real numbers, v is contained in the space (or m-dimensional space) denoted as ℜ^m.
As examples, the space ℜ^2 can be visualised as the space consisting of all points on a two dimensional plane, and the space ℜ^3 as all possible points in three dimensional space. For spaces ℜ^4 and above it is impossible to visualise their physical existence, however their mathematical existence is assured to the reader! See also Vector Properties - Subspace, Matrix Properties - Range.

[Figure: space ℜ^2 consists of all points on the x-y plane; space ℜ^3 consists of all points in the x-y-z three dimensional space. For the vector v = [x_i, y_i]^T, if x_i, y_i ∈ ℜ then v ∈ ℜ^2; for the vector w = [x_i, y_i, z_i]^T, if x_i, y_i, z_i ∈ ℜ then w ∈ ℜ^3.]

• Span: Given a linearly independent set of m-dimensional vectors { x_1, x_2, … x_n }, the set of all linear combinations of these vectors is referred to as the span of { x_1, x_2, … x_n }, i.e.

span{ x_1, x_2, … x_n } = Σ_{i=1}^{n} α_i x_i  where α_i ∈ ℜ    (589)

Note that the span will define a subspace of ℜ^m where m > n. Note that if m = n then the vectors span the entire space ℜ^m. See also Vector Properties - Space/Subspace.

• Transpose Vector: The transpose of a vector is formed by interchanging the rows and columns and is denoted by the superscript T. For example, for the column vector x = [a, b, c]^T:

x^T = [ a  b  c ]    (590)

• 2-norm: See Vector Properties - Norm.

• Unit Vector: A unit vector with respect to the p-norm is one for which ||x||_p = 1. See also Vector Properties - Norm.

• Weight Vector: The name given to the vector formed by the weights of an FIR filter. See also Matrix, Vector Operations.

Vector Scaling: See Vector Operations - Scaling.

Vector Sum Excited Linear Prediction (VSELP): Similar to CELP vocoders except that VSELP uses more than one codebook. VSELP also has the additional advantage that it can be run on fixed point DSP processors, unlike CELP which requires floating point computation.

Vector Transpose: See Vector Operations - Transpose.
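The outer product of Eq. 586, and the rank-one structure that makes it useful for building correlation matrix estimates, can be sketched numerically (Python with NumPy; the vector values are illustrative only):

```python
import numpy as np

x = np.array([[1.0], [2.0], [3.0]])   # a 3x1 column vector

# Outer product xx^T (Eq. 586): a 3x3 matrix
X = x @ x.T
print(X)
# [[1. 2. 3.]
#  [2. 4. 6.]
#  [3. 6. 9.]]

# Every column of xx^T is a scalar multiple of x, so the matrix has rank one;
# averaging many such outer products yields a correlation matrix estimate.
print(np.linalg.matrix_rank(X))   # 1
```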
Vibration: A continuous to and fro motion, or reciprocating motion. Vibrations at audible frequencies give rise to sound.

Vibrato: A simple frequency modulating effect applied to the output of a musical instrument. For example, a mechanical arm on a guitar can be used to frequency modulate the output to produce a warbling effect. Vibrato can also be performed digitally by simple frequency modulation of a signal. See also Music, Tremolo.

Virtual Instrument: The terminology used by some companies for a measuring instrument that is implemented on a PC but is presented in a form that resembles the well known analog version of the instrument. For example, a virtual oscilloscope presents all of the normal controls as buttons and dials drawn on the screen, so that the instrument can immediately be used by an engineer whether or not they are familiar with DSP.

Virtual Reality: A virtual instrument (substitute) for living. Ultimately, this application of DSP image and audio may prove to be very addictive.

Visually Evoked Potential: See Evoked Potentials.

Viterbi Algorithm: This algorithm is a means of solving an optimization problem (that can be framed on a trellis -- or structured set of pathways) by calculating the cost (or metric) for each possible path and selecting the path with the minimum metric [103]. The algorithm has proven extremely useful for decoding convolutional codes and trellis coded modulation. For these applications, the paths are defined on a trellis and the metrics are Hamming distance for convolutional codes and Euclidean distance for trellis coded modulation. These metrics result in the smallest possible probability of error when signals are transmitted over an additive white Gaussian noise channel (this is a common modelling assumption in communications). See also Additive White Gaussian Noise (AWGN), Channel Coding, Trellis Coded Modulation, Euclidean Distance, Hamming Distance.
Viterbi Decoder: A technique for decoding convolutionally encoded data streams that uses the Viterbi algorithm (with a Hamming distance metric) to minimize the probability of data errors in a digital receiver. See Viterbi Algorithm. See also Channel Coding.

VLSI: Very Large Scale Integration. The name given to the process of integrating millions of transistors on a single silicon chip to realize various digital devices (logic gates, flip-flops) which in turn are used to make system level components, such as microprocessors, all on a single chip.

VME Bus: A bus found in SUN workstations, VAXs and others. Many DSP board manufacturers make boards for the VME bus, although they are usually a little more expensive than for the PC-Bus.

Vocoders: A vocoder analyzes the spectral components of speech to try to identify the parameters of the speech waveform that are perceived by the human ear. These parameters are then extracted, transmitted and used at the receiver to synthesize (approximately) the original speech pattern. The resulting waveform may differ considerably from the original, although it will sound like the original speech signal. Vocoders have become popular at very low bit rates (2.4 kbits/sec).

Volatile: Semiconductor memory that loses its contents when the power is removed is volatile. See also Non-Volatile, Dynamic RAM, Static RAM.

Volterra Filter: A filter based on the non-linear Volterra series, and used in DSP to model certain types of non-linearity. The second order Volterra filter includes second order terms such that the output of the filter is given by:

y(k) = Σ_{n=0}^{N-1} w_n x(k-n) + Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} w_ij x(k-i) x(k-j)    (591)

where the w_n are the linear weights and the w_ij are the quadratic weights. Adaptive LMS based Volterra filters are also widely investigated and a good tutorial article can be found in [109].
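A direct (unoptimised) evaluation of Eq. 591 can be sketched as follows (Python with NumPy; the weight values are invented for illustration, with only the quadratic weight w_00 non-zero so that the filter simply squares its input):

```python
import numpy as np

def volterra2(x, w_lin, w_quad):
    """Second order Volterra filter output (Eq. 591) at each sample k.

    w_lin:  length-N array of linear weights w_n
    w_quad: N x N array of quadratic weights w_ij
    """
    N = len(w_lin)
    y = np.zeros(len(x))
    for k in range(len(x)):
        # data vector [x(k), x(k-1), ..., x(k-N+1)], zero before the start
        xk = np.array([x[k - n] if k - n >= 0 else 0.0 for n in range(N)])
        y[k] = w_lin @ xk + xk @ w_quad @ xk   # linear + quadratic terms
    return y

x = np.array([1.0, 2.0, 3.0])
w_lin = np.zeros(2)
w_quad = np.array([[1.0, 0.0], [0.0, 0.0]])   # only w_00 = 1
print(volterra2(x, w_lin, w_quad))   # [1. 4. 9.]
```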
Voice Grade Channel: A communications channel suitable for transmission of speech, analog data, or facsimile, generally over a frequency band from 300 Hz to 3400 Hz.

Volume Unit (VU): VU meters have been used in recording for many years and give a measure of the relative loudness of a sound [14], [46]. In general a sound of long duration is perceived by the human ear as louder than a short duration burst of the same sound. VU meters have a rather "sluggish" mechanical response, and therefore have an in-built capability to model the human ear's temporal loudness response. An ANSI standard exists for the design of VU meters. See also Sound Pressure Level.

Von Hann Window: See Windows.

VXI Bus: A high performance bus used with instruments that can fit on a single PCB card. This standard is capable of transmitting data at up to 10 Mbytes/sec.

W

Waterfall Plot: A graphical 3-D plot that shows frequency plotted on the X-axis, signal power on the Y-axis, and time elapsing on the Z-axis (into the computer screen). As time elapses and segments of data are transformed by the FFT, the screen can appear like a waterfall as the 2-D spectra pass along the Z-axis.

Warble Tone: If an audible pure tone is frequency modulated (FM) by a smaller pure tone (typically a few Hz) the perceived signal is often referred to as a warble tone, i.e. the signal is perceived to be varying between two frequencies around the carrier tone frequency. Warble tones are often used in audiometric testing where stimulus signals are played to a subject through a loudspeaker in a testing room. If pure tones were used there is a possibility that a zone of acoustic destructive interference would occur at or near the patient's head, thus making the test erroneous. The use of warble tones greatly reduces this possibility as the zones of destructive interference will not be static.
To produce a warble tone, consider a carrier tone at frequency f_c, frequency modulated by another tone at frequency f_m:

w(t) = sin( 2πf_c t + β sin 2πf_m t ) = sin θ(t),  i.e.  θ(t) = 2πf_c t + β sin 2πf_m t    (592)

where β is the modulation index, which controls the maximum frequency deviation from the carrier frequency. For example, if a carrier tone f_c = 1000 Hz is to be modulated by a tone f_m = 5 Hz such that the warble tone signal frequency varies between 900 Hz and 1100 Hz at a rate of 5 times per second, then noting that the instantaneous frequency of an FM tone, f, is given by:

f = (1/2π) dθ(t)/dt = f_c + βf_m cos 2πf_m t    (593)

the modulation index required to give the required frequency swing is β = 20. See also Audiometer, Audiometry, Binaural Beats, Constructive Interference, Destructive Interference.

[Figure: amplitude against time (secs) of a warble tone, where an audible frequency carrier tone is modulated by a lower frequency modulating tone, usually of a few Hz.]

Watt: The surname of the Scottish engineer James Watt, who gave his name to the unit of power. In an electrical system power is calculated from:

P = V ⋅ I = I²R = V²/R    (594)

Waveform: The representation of a signal plotted (usually) as voltage against time, where the voltage will represent some analog time varying quantity (e.g. audio, speech and so on).

Waveform Averaging: (Ensemble Averaging) The process of taking a number of measurements of a periodic signal, summing the respective elements in each record and dividing by the number of measurements. Waveform averaging is often used to reduce the noise when the noise and periodic signal are uncorrelated. As an example, averaging is widely used in ECG signal analysis, where the process retains the correlated frequencies of the periodic signal and removes the uncorrelated noise to reveal the distinctive ECG complex.
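A minimal sketch of waveform averaging (Python with NumPy; the record length, noise level and record count are invented for illustration) shows the noise reduction directly, since the power of uncorrelated noise falls as 1/M for M averaged records:

```python
import numpy as np

rng = np.random.default_rng(0)

# A periodic "clean" record and 100 noisy measurements of it
t = np.arange(64)
clean = np.sin(2 * np.pi * t / 16)
records = [clean + rng.normal(0.0, 1.0, 64) for _ in range(100)]

# Ensemble average: sum respective samples and divide by the record count
average = np.mean(records, axis=0)

# The average is far closer to the clean waveform than any single record
err_single = np.mean((records[0] - clean) ** 2)   # about 1 (the noise power)
err_avg = np.mean((average - clean) ** 2)         # about 1/100 of that
print(err_avg < err_single)   # True
```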
Wavelet Transform: The wavelet transform is an operation that transforms a signal by integrating it with specific functions, often known as the kernel functions. These kernel functions may be referred to as the mother wavelet and the associated scaling function. Using the scaling function and mother wavelet, multi-scale translations and compressions of these functions can be produced. The wavelet transform actually generalizes the time frequency representation of the short time Fourier transform (STFT). Compared to the STFT, the wavelet transform allows non-uniform bandwidths or frequency bins and allows resolution to be different at different frequencies. Over the last few years DSP has seen considerable interest in and application of the wavelet transform, and the interested reader is referred to [49].

Web: See World Wide Web.

Weighted Moving Average (WMA): See Finite Impulse Response (FIR) filter. See also Moving Average.

Weight Vector: The weights of an FIR digital filter can be expressed in vector notation such that the output of a digital filter can be conveniently expressed as a row-column vector product (or inner product):

[Figure: a four weight FIR filter with weights w_0, w_1, w_2, w_3 acting on the input samples x_k, x_{k-1}, x_{k-2}, x_{k-3} to produce the output y_k.]

y_k = Σ_{n=0}^{3} w_n x_{k-n} = w_0 x_k + w_1 x_{k-1} + w_2 x_{k-2} + w_3 x_{k-3}

⇒ y_k = w^T x_k = [ w_0 w_1 w_2 w_3 ] [ x_k x_{k-1} x_{k-2} x_{k-3} ]^T

If the digital filter is IIR, then two weight vectors can be defined: one for the feedforward weights and one for the feedback weights. For further notational brevity the two weight vectors and the two data vectors can be respectively combined into a single weight vector, and a data vector consisting of past input data and past output samples:
[Figure: an IIR filter with feedforward weights a_0, a_1, a_2 acting on x_k, x_{k-1}, x_{k-2} and feedback weights b_1, b_2, b_3 acting on y_{k-1}, y_{k-2}, y_{k-3}.]

y_k = Σ_{n=0}^{2} a_n x_{k-n} + Σ_{n=1}^{3} b_n y_{k-n} = a_0 x_k + a_1 x_{k-1} + a_2 x_{k-2} + b_1 y_{k-1} + b_2 y_{k-2} + b_3 y_{k-3}

⇒ y_k = a^T x_k + b^T y_{k-1} = [ a_0 a_1 a_2 ] [ x_k x_{k-1} x_{k-2} ]^T + [ b_1 b_2 b_3 ] [ y_{k-1} y_{k-2} y_{k-3} ]^T

⇒ y_k = [ a^T b^T ] [ x_k ; y_{k-1} ] = w^T u_k

See also Vector Properties and Definitions - Weight Vector.

Weighting Curves: See Sound Pressure Level Weighting Curves.

Weights: The name given to the multipliers of a digital filter. For example, a particular FIR filter may be described as having 32 weights. The terms weights and coefficients are used interchangeably. See also FIR Filter, IIR Filter, Adaptive Filter.

Well-Conditioned Matrix: See Matrix Properties - Well Conditioned.

Western Music Scale: The Western music scale is based around musical notes separated by octaves [14]. If a note, X, is an octave higher than another note, Y, then the fundamental frequency of X is twice that of Y. From one octave frequency to the next in the Western music scale there are twelve equitempered frequencies which are spaced one semitone apart, where a semitone is a logarithmic increase in frequency (if the two octave frequencies are counted then there are thirteen notes). The Western music scale can be best illustrated on the well known piano keyboard, which comprises a full chromatic scale:

[Figure: a section of the familiar piano keyboard, from F3 to G5, with the names of the notes marked. One octave is twelve equitempered notes (sometimes called the chromatic scale), or eight notes of a major scale. The black keys represent the various sharps (#) and flats (b). The piano keyboard extends in both directions repeating the same twelve note scale. Neighboring keys (black or white) are defined as being a semitone apart. If one note separates two keys, then they are a tone apart.]
The letters A to G are the names given to the notes. The International Pitch Standard defines the fundamental frequency of the note A4 as being 440 Hz. The note A4 is the first A above middle C (C4), which is located near the middle of a piano keyboard. Each note on the piano keyboard is characterised by its fundamental frequency, f_0, which is usually the loudest component, caused by the fundamental mode of vibration of the piano string being played. The "richness" of the sound of a single note is caused by the existence of other modes of vibration which occur at harmonics (or integer multiples) of the fundamental, i.e. 2f_0, 3f_0 and so on. The characteristic sound of a musical instrument is produced by the particular harmonics that make up each note.

On the equitempered Western music scale the logarithmic difference between the fundamental frequencies of all adjacent notes is equal. Noting that in one octave the frequency of the thirteenth note in sequence is double that of the first note, if the notes are equitempered the ratio of the fundamental frequencies of adjacent notes must be 2^(1/12) = 1.0594631… . As defined, the ratio between the first and thirteenth note is then of course (2^(1/12))^12 = 2, or an octave. The actual logarithmic difference in frequency between two adjacent notes on the keyboard is:

log 2^(1/12) = 0.025085…    (595)

Two adjacent notes in the Western music scale are defined as being one semitone apart, and two notes separated by two semitones are a tone apart. For example, musical notes B and C are a semitone apart, whereas G and A are a tone apart as they are separated by Ab.
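The equitempered tuning rule can be checked in a few lines of code (Python; A4 = 440 Hz is the reference, and the helper function name is invented for illustration). Note that the computed middle C agrees with the tabulated values to within rounding:

```python
# Equitempered note frequencies relative to A4 = 440 Hz:
# each semitone multiplies the frequency by 2**(1/12).
SEMITONE = 2 ** (1 / 12)

def note_freq(semitones_from_a4):
    """Fundamental frequency a given number of semitones away from A4."""
    return 440.0 * SEMITONE ** semitones_from_a4

a5 = note_freq(12)     # one octave above A4: exactly double
c4 = note_freq(-9)     # middle C, nine semitones below A4

print(round(a5, 3))    # 880.0
print(round(c4, 3))    # 261.626
```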
Therefore the fundamental frequencies of 3 octaves of the Western music scale can be summarised in the following table, where the fundamental frequency of the next semitone is calculated by multiplying the current note fundamental frequency by 1.0594631...:

    Note   Fundamental     Note   Fundamental     Note   Fundamental
           frequency (Hz)         frequency (Hz)         frequency (Hz)
    C3     130.812         C4     261.624         C5     523.248
    C3#    138.591         C4#    277.200         C5#    554.400
    D3     146.832         D4     293.656         D5     587.312
    Eb3    155.563         Eb4    311.124         Eb5    622.248
    E3     164.814         E4     329.648         E5     659.296
    F3     174.614         F4     349.228         F5     698.456
    F#3    184.997         F#4    370.040         F#5    740.080
    G3     195.998         G4     392.040         G5     784.080
    Ab3    207.652         Ab4    415.316         Ab5    830.632
    A3     220             A4     440             A5     880
    Bb3    233.068         Bb4    466.136         Bb5    932.327
    B3     246.928         B4     493.856         B5     987.767

A correctly tuned musical instrument will therefore produce notes with the frequencies stated above. However, it is the existence of subtle fundamental frequency harmonics that gives every instrument its unique sound qualities. It is also worth noting that certain instruments may have some or all notes tuned "sharp" or "flat" to create a desired effect. Also, since pitch perception and frequency do not have a linear relationship, the high frequencies of certain instruments may be tuned slightly "sharp".

Music is rarely represented in terms of its fundamental frequencies; instead music staffs are used to represent the various notes that make up a particular composition. A piece of music is usually played in a particular musical key, which is a subset of eight notes of an octave where those eight notes have aesthetically pleasing perceptible qualities. The major key scales are realised by starting at a root note and selecting the other notes of the key in intervals of 1, 1, 1/2, 1, 1, 1, 1/2 tones (where 1/2 tone is a semitone).
For example, the C-major and G-major scales are:

[Figure: the C-major scale (C D E F G A B C) and the G-major scale (G A B C D E F# G) marked out on the chromatic scale, each following the step pattern 1, 1, 1/2, 1, 1, 1, 1/2 tones from its root note. Starting at any root note, X, of the chromatic scale, the X-major scale can be produced by selecting notes in steps of 1, 1, 1/2, 1, 1, 1, 1/2 tones. There are a total of 12 major scales possible.]

There are many other forms of musical keys, such as the natural minors, which are formed by the root note and then choosing notes in steps of 1, 1/2, 1, 1, 1/2, 1, 1 tones. For more information on the rather elegant and simple mathematics of musical keys, refer to a text on music theory.

[Figure: music notation for the C-major scale, which has no sharps or flats (i.e., only the white notes of the piano keyboard), showing C4 to G5 on the treble staff and C2 to D4 on the bass staff. Different notes are represented by different lines and spaces on the staff (the five parallel lines). The treble clef (the "g" like letter marking the G-line on the top left hand side of the staff) usually defines the melody of a tune, whereas the bass clef (the "f" like letter marking the F-line on the bottom left hand side of the staff) defines the bass line. Note that middle C (C4) is represented on a "ledger" line existing between the treble and bass staffs. On a piano the treble is played with the right hand, and the bass with the left hand. For other scales (major or minor), the required sharps and flats are shown next to the bass and treble clefs.]

Many musical instruments only have the capability of playing either the treble or bass, e.g. the flute can only play the treble clef, and the double bass can only play the bass clef.
[Figure: music notation for the G-major scale, which has one sharp (sharps and flats are the black notes of the piano keyboard), showing C4 to G5 on the treble staff and G2 to D4 on the bass staff. Whenever an F note is indicated by the music, an F# should be played in order to ensure that the G-major scale is used.]

So what are the qualities of the Western music scale that make it pleasurable to listen to? The first reason is familiarity. We are exposed to music from a very early age and most people can recognise and recall a simple major scale or a tune composed of notes from a major scale. The other reason is that the ratios of the frequencies of certain notes when played together are "almost" low integer ratios, and these chords of more than one note take on a very "full" sound. For example, the C-major chord is composed of the 1st, 3rd and 5th notes of the C-major scale, i.e. C, E, G. If we consider the ratios of the fundamental frequencies of these notes:

E/C = 2^(4/12) = 1.2599… ≈ 5/4
G/C = 2^(7/12) = 1.4983… ≈ 3/2    (596)
G/E = 2^(3/12) = 1.189…  ≈ 6/5

they can be approximated by "almost" integer ratios of the fundamental frequencies. (Note that on the very old scales -- the Just scale and the Pythagorean scale -- these ratios were exact.) When these three notes are played together the frequency differences actually reinforce the fundamental, which produces a rich strong sound. This can be seen by considering the simple trigonometric identities:

C + E = cos C_0 + cos( 2^(4/12) C_0 ) ≈ cos C_0 + cos( (5/4) C_0 )
      = 2 cos( ((5/4 - 1)/2) C_0 ) cos( ((5/4 + 1)/2) C_0 ) = 2 cos( (1/8) C_0 ) cos( (9/8) C_0 )    (597)

and

C + G = cos C_0 + cos( 2^(7/12) C_0 ) ≈ cos C_0 + cos( (3/2) C_0 )
      = 2 cos( ((3/2 - 1)/2) C_0 ) cos( ((3/2 + 1)/2) C_0 ) = 2 cos( (1/4) C_0 ) cos( (5/4) C_0 )    (598)

where C_0 = 2πf_C t and f_C is the fundamental frequency of the C note.
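The "almost" low integer ratios of Eq. 596 can be checked numerically (Python; a short illustrative sketch):

```python
# Ratios of the C-major chord notes (C, E, G) on the equitempered scale
semitone = 2 ** (1 / 12)

e_over_c = semitone ** 4   # E is 4 semitones above C
g_over_c = semitone ** 7   # G is 7 semitones above C
g_over_e = semitone ** 3   # G is 3 semitones above E

print(round(e_over_c, 4))  # 1.2599  (close to 5/4 = 1.25)
print(round(g_over_c, 4))  # 1.4983  (close to 3/2 = 1.5)
print(round(g_over_e, 4))  # 1.1892  (close to 6/5 = 1.2)
```

On the Just and Pythagorean scales these ratios would be exactly 5/4, 3/2 and 6/5.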
Adding together the C and E results in a sound that may be interpreted as a C three octaves below C (the cos((1/8)C_0) term) modulating a D (the cos((9/8)C_0) term). Similarly, the addition of the C and G results in a sound that may be interpreted as a C two octaves below C modulating an E. The existence of these various modulating subharmonics leads to the "full" and aesthetically pleasing sound of the chord. In addition to major chords there are many others, such as the minor, the seventh and so on. All of the chords have their own distinctive sound, to which we have become accustomed and with which we associate certain styles of music. Prior to the existence of the equitempered scale there were other scales which used perfect integer ratios between notes. Also, around the world there are still many other music scales to be found, particularly in Asia. See also Digital Audio, Just Music Scale, Music, Music Synthesis, Pythagorean Scale.

White Noise: A signal that (in theory) contains all frequencies and is (for most purposes) completely unpredictable. Most white noise is defined as being Gaussian, which means that it has definable properties of mean (average value) and variance (a measure of its power). White noise has a constant power per unit bandwidth, and is labelled white because of the analogy with white light (containing all visible light frequencies with nearly equal power). In a digital system, a white noise sequence has a flat spectrum from 0 Hz to half the sampling frequency.

Wide Sense Stationarity: If a discrete time signal, x(k), has a time invariant mean:

E{x(k)} = Σ_k x(k) p{x(k)}    (599)

and an autocorrelation function:

r(n) = Σ_k x(k) x(k-n) p{x(k)}    (600)

that is a function only of the time separation, n, and not of the absolute time, k, then x(k) is said to be wide sense stationary.
Therefore if the signal, x(k), is also ergodic, then:

E{x(k)} ≅ (1/(M_2 - M_1)) Σ_{k=M_1}^{M_2-1} x(k),  for any M_1 and M_2 where M_2 » M_1    (601)

and

E{x²(k)} ≅ (1/(M_2 - M_1)) Σ_{k=M_1}^{M_2-1} [x(k)]²,  for any M_1 and M_2 where M_2 » M_1    (602)

For derivation and subsequent implementation of least mean squares DSP algorithms using stochastic signals, assuming wide sense stationarity is usually satisfactory. See Autocorrelation, Expected Value, Least Mean Squares, Mean Value, Mean Squared Value, Strict Sense Stationary, Variance, Wiener-Hopf Equations.

Wideband: A signal that uses a large portion of a particular frequency band may be described as wideband. The classification into wideband and narrowband depends on the particular application being described. For example, the noise from a reciprocating (piston) engine may be described as narrowband as it consists of one main frequency (the drone of the engine) plus some frequency components around this frequency, whereas the noise from a jet engine could be described as wideband as it covers a much larger frequency band and is more white (random) in its make-up.

[Figure: sound pressure (dB) against frequency (0.1 to 25.6 kHz) for narrowband engine noise and for wideband engine noise.]

In telecommunications, wideband or broadband may describe a circuit that provides more bandwidth than a voice grade telephone line (300-3000 Hz), i.e. a circuit or channel that allows frequencies of up to 20 kHz to pass. These types of telecommunication broadband channels are used for voice, high speed data communications, radio, TV and local area data networks.

Widrow: Professor Bernard Widrow of Stanford University, USA, generally credited with developing the LMS algorithm for adaptive digital signal processing systems. The LMS algorithm is occasionally referred to as Widrow's algorithm.
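A minimal sketch of the LMS algorithm credited to Widrow (Python with NumPy; the "unknown" system, step size and iteration count are invented for illustration) identifying a two weight FIR system from its input and output:

```python
import numpy as np

rng = np.random.default_rng(1)

true_w = np.array([0.5, -0.3])   # unknown system to be identified

mu = 0.05                        # step size
w = np.zeros(2)                  # adaptive weights
x_hist = np.zeros(2)             # input delay line [x(k), x(k-1)]

for k in range(2000):
    x = rng.normal()
    x_hist = np.array([x, x_hist[0]])
    d = true_w @ x_hist                   # desired signal d(k)
    e = d - w @ x_hist                    # error e(k) = d(k) - y(k)
    w = w + 2 * mu * e * x_hist           # LMS weight update

print(np.round(w, 2))   # close to [ 0.5 -0.3]
```

Unlike the direct Wiener-Hopf calculation, the weights here descend the error performance surface one sample at a time, so changing signal statistics can be tracked.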
Wiener-Hopf Equations: Consider the following architecture based on an FIR filter and a subtraction element:

[Figure: the output, y(k), of an FIR filter with weights w_0, w_1, w_2, …, w_{N-2}, w_{N-1} and input x(k) is subtracted from a desired signal, d(k), to produce an error signal, e(k). If there is some correlation between the input signal, x(k), and the desired signal, d(k), then values can be calculated for the filter weights, w_0 to w_{N-1}, in order to minimize the mean squared error, E{e²(k)}.]

If the signals x(k) and d(k) are in some way correlated, then certain applications and systems may require that the digital filter weights, w_0 to w_{N-1}, are set to values such that the power of the error signal, e(k), is minimised. If weights are found that minimize the error power in the mean squared sense, then this is often referred to as the Wiener-Hopf solution.

To derive the Wiener-Hopf solution it is useful to use a vector notation for the input vector and the weight vector. The output of the filter, y(k), is the convolution of the weight vector and the input vector:

y(k) = Σ_{n=0}^{N-1} w_n x(k-n) = w^T x(k)    (603)

where,

w = [ w_0 w_1 w_2 … w_{N-2} w_{N-1} ]^T    (604)

and,

x(k) = [ x(k) x(k-1) x(k-2) … x(k-N+2) x(k-N+1) ]^T    (605)

Assuming that x(k) and d(k) are wide sense stationary processes and are correlated in some sense, then the error, e(k) = d(k) - y(k), can be minimised in the mean squared sense.
To derive the Wiener-Hopf equations consider first the squared error:

e²(k) = [ d(k) - y(k) ]²
      = d²(k) + [ w^T x(k) ]² - 2d(k) w^T x(k)    (606)
      = d²(k) + w^T x(k) x^T(k) w - 2 w^T d(k) x(k)

Taking expected (or mean) values we can write the mean squared error (MSE), E{e²(k)}, as:

E{e²(k)} = E{d²(k)} + w^T E{x(k) x^T(k)} w - 2 w^T E{d(k) x(k)}    (607)

Writing in terms of the N × N correlation matrix,

                     [ r_0      r_1      r_2      …  r_{N-1} ]
                     [ r_1      r_0      r_1      …  r_{N-2} ]
R = E{x(k) x^T(k)} = [ r_2      r_1      r_0      …  r_{N-3} ]    (608)
                     [ :        :        :        …  :       ]
                     [ r_{N-1}  r_{N-2}  r_{N-3}  …  r_0     ]

and the N × 1 cross correlation vector,

p = E{d(k) x(k)} = [ p_0 p_1 p_2 … p_{N-1} ]^T    (609)

gives,

ζ = E{e²(k)} = E{d²(k)} + w^T R w - 2 w^T p    (610)

where ζ is used for notational convenience to denote the MSE performance surface. Given that this equation is quadratic in w, there is only one minimum value. The minimum mean squared error (MMSE) solution, w_opt, can be found by setting the (partial derivative) gradient vector, ∇, to zero:

∇ = ∂ζ/∂w = 2Rw - 2p = 0    (611)

⇒ w_opt = R⁻¹p    (612)

[Figure: a simple block diagram for the Wiener-Hopf calculation. The input signal x(k) drives the FIR digital filter, y(k) = w^T x(k); the output signal y(k) is subtracted from the desired signal d(k) to give the error signal e(k), and the weights are set by calculating w = R⁻¹p. Note that there is no feedback and therefore, assuming R is non-singular, the algorithm is unconditionally stable.]

To appreciate the quadratic and single minimum nature of the error performance surface consider the trivial case of a one weight filter:

ζ = E{d²(k)} + rw² - 2wp    (613)

where E{d²(k)}, r, and p are all constant scalars. Plotting the mean squared error (MSE), ζ, against the weight, w, produces an upward facing parabola:

[Figure: the mean square error (MSE) performance surface, ζ, for a single weight filter. The MMSE solution occurs at the point of zero gradient, ∇ = dζ/dw = 2rw - 2p = 0, giving w_opt = r⁻¹p.]
If the filter has two weights the performance surface is a paraboloid which can be drawn in 3 dimensions:

[Figure: the mean square error (MSE) performance surface, ζ, for a two weight filter, plotted against w_0 and w_1. The point of zero gradient, ∇ = dζ/dw = 2Rw - 2p = 0, gives the MMSE solution w_opt = [ w_0(opt) w_1(opt) ]^T = [ r_0 r_1 ; r_1 r_0 ]⁻¹ [ p_0 p_1 ]^T.]

If the filter has more than two weights then we cannot draw the performance surface in three dimensions, however, mathematically there is still only one minimum point, which occurs when the gradient vector is zero. A performance surface with more than three dimensions is often called a hyperparaboloid.

To actually calculate the Wiener-Hopf solution, w_opt = R⁻¹p, requires that the R matrix and p vector are realised from the data x(k) and d(k), and that the R matrix is then inverted prior to premultiplying the vector p. Given that we assumed that x(k) and d(k) are stationary and ergodic, then we can estimate all elements of R and p from:

r_n = (1/M) Σ_{i=0}^{M-1} x_i x_{i+n}  and  p_n = (1/M) Σ_{i=0}^{M-1} d_{i+n} x_i    (614)

Calculation of R and p requires approximately 2MN multiply and accumulate (MAC) operations, where M is the number of samples in a "suitably" representative data sequence, and N is the adaptive filter length. The inversion of R requires around N³ MACs, and the matrix-vector multiplication, N² MACs. Therefore the total number of computations in performing this one step algorithm is 2MN + N³ + N² MACs. The computational load is therefore very high and real time operation is computationally expensive. More importantly, if the statistics of the signals x(k) or d(k) change, then the filter weights will need to be recalculated, i.e. the algorithm has no tracking capabilities. Hence direct implementation of the Wiener-Hopf solution is not practical for real time DSP implementation because of the high computational load, and the need to recalculate when the signal statistics change.
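A direct, block-based computation of w_opt = R⁻¹p can be sketched as follows (Python with NumPy; the "unknown" system h relating d(k) to x(k), the record length M and filter length N are invented for illustration). With white input the estimated R is close to the identity and w_opt recovers h:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 4, 50_000

# x(k): white input; d(k): output of an unknown FIR system driven by x(k)
h = np.array([0.9, -0.4, 0.2, 0.1])
x = rng.normal(size=M)
d = np.convolve(x, h)[:M]

# Estimate r_n and p_n by time averaging (Eq. 614), then build Toeplitz R
r = np.array([np.mean(x[:M - n] * x[n:]) for n in range(N)])
p = np.array([np.mean(d[n:] * x[:M - n]) for n in range(N)])
R = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])

w_opt = np.linalg.solve(R, p)     # the Wiener-Hopf solution w = R^-1 p
print(np.round(w_opt, 1))         # close to [ 0.9 -0.4  0.2  0.1]
```

Note that the whole data block must be processed before any weights are available, in contrast to sample-by-sample adaptive algorithms.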
For this reason, real time systems which need to minimize an error signal power use gradient descent based adaptive filters such as the least mean squares (LMS) or recursive least squares (RLS) type algorithms. See also Adaptive Filter, Correlation Matrix, Correlation Vector, Least Mean Squares Algorithm, Least Squares.

Whitening Filter: A filter that takes a stochastic signal and produces a white noise output [77]. If the input stochastic signal is an autoregressive process, the whitening filters are all-zero FIR filters. See also Autoregressive Model.

Window: A window is a set of numbers that multiply a set of N adjacent data samples. If the data was sampled at frequency f_s, then the window weights N/f_s seconds of data. There are a number of semi-standardized data weighting windows used to pre-weight data prior to frequency domain calculations (FFT/DFT). The most common are the Bartlett, Von Hann (Hanning), Blackman, Blackman-Harris, and Hamming:

• Bartlett Window: A data weighting window used prior to frequency transformation (FFT) to reduce spectral leakage. Compared to the uniform window (no weighting) the Bartlett window doubles the width of the main lobe, while attenuating the main sidelobe by 26 dB, compared to the 13 dB of the uniform window. For N data samples, the Bartlett window is defined by:

h(n) = 1.0 - |n|/(N/2)  for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (615)

• Blackman Window: A data weighting window used prior to frequency transformation (FFT), providing improvements over the Bartlett and Von Hann windows by increasing spectral leakage rejection. For N data samples, the Blackman window is defined by:

h(n) = Σ_{k=0}^{2} a(k) cos( 2knπ/N )  for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (616)

with coefficients: a(0) = 0.42659701, a(1) = 0.49659062, a(2) = 0.07684867

• Blackman-Harris Window: A type of data window often used in the calculation of FFTs/DFTs for reducing spectral leakage.
Similar to the Blackman window, but with four cosine terms:

h(n) = Σ_{k=0}^{3} a(k) cos( 2knπ/N )  for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (617)

with coefficients: a(0) = 0.3635819, a(1) = 0.4891775, a(2) = 0.1365995, a(3) = 0.0106411

• Hamming Window: A data weighting window used prior to frequency transformation (FFT) to reduce spectral leakage. Compared to the uniform window (no weighting) the Hamming window doubles the width of the main lobe, while attenuating the main sidelobe by 46 dB, compared to the 13 dB of the uniform window. Compared to the similar Von Hann window, the Hamming window sidelobes do not decay as rapidly. For N data samples, the Hamming window is defined by:

h(n) = 0.54 + 0.46 cos( 2nπ/N )  for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (618)

• harris Window: A data weighting window used prior to frequency transformation (FFT) to reduce spectral leakage (similar to the Bartlett and Von Hann windows). For N data samples, the harris window is defined by:

h(n) = Σ_{k=0}^{3} a(k) cos( 2knπ/N )  for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (619)

with coefficients: a(0) = 0.3066923, a(1) = 0.4748398, a(2) = 0.1924696, a(3) = 0.0259983

• Von Hann Window: A data weighting window used prior to frequency transformation (FFT). Compared to the uniform window (no weighting) the Von Hann window doubles the width of the main lobe, while attenuating the main sidelobe by 32 dB, compared to the 13 dB of the uniform window. For N data samples, the Von Hann window is defined by:

h(n) = 0.5 + 0.5 cos( 2nπ/N )  for n = -N/2, …, -2, -1, 0, 1, 2, …, N/2    (620)

Wold Decomposition: H. Wold showed that any stationary stochastic discrete time process, x(n), can be decomposed into two components: (1) a general linear regression of white noise; and (2) a predictable process.
The general linear regression of white noise, u(k), is given by:

    u(k) = v(k) + sum_{n=1}^{∞} b_n v(k-n),   with sum_{n=1}^{∞} |b_n| < ∞    (621)

and the predictable process, s(k), can be entirely predicted from its own past samples. s(k) and v(k) are uncorrelated, i.e. E{v(n)s(k)} = 0 for all n, k [77]. See also Autoregressive Modelling, Yule Walker Equations.

Woodbury’s Identity: See Matrix Properties - Inversion Lemma.

Wordlength: The size of the basic unit of arithmetic computation inside a DSP processor. For a fixed point DSP processor the wordlength is at least 16 bits, and in the case of the DSP56000, it is 24 bits. Floating point DSP processors usually use 32 bit wordlengths. See also DSP Processor, Parallel Multiplier.

World Wide Web (WWW): The World Wide Web (or the web) has become the de facto standard on the internet for storing, finding and transferring open information; hypertext (with text, graphics and audio) is used to access information. Most universities and companies involved in DSP now have web servers with home pages where the information available on a particular machine is summarised. There are also likely to be hypertext links available for cross referencing to additional information. The best way to understand the existence and usefulness of the World Wide Web is to use it with tools such as Mosaic or Netscape. Speak to your system manager or call up your phone company or internet service provider for more information.

Woofer: The section of a loudspeaker that reproduces low frequencies is often called the woofer. The name is derived from the low pitched woof of a dog. The antithesis to the woofer is the tweeter. See also Tweeter.

X

X-Series Recommendations: The X-series telecommunication recommendations from the International Telecommunication Union (ITU) advisory committee on telecommunications (denoted ITU-T and formerly known as CCITT) provide standards for data networks and open system communication.
For details on this series of recommendations consult the appropriate standard document or contact the ITU. The well known X.400 standards are defined for the exchange of multimedia messages by store-and-forward transfer. The X.400 standards therefore provide an international service for the movement of electronic messages without restriction on the types of encoded information conveyed. The ITU formed a collaborative partnership with the International Organization for Standards for the development and continued definition of X.400 in 1988 (see ISO 10021, Parts 1-7). A joint technical committee was also formed by the ISO and the International Electrotechnical Commission (IEC). See also International Electrotechnical Commission, International Organization for Standards, International Telecommunication Union, ITU-T Recommendations, Standards.

xk: xk or x(k) is often the name assigned to the input signal of a DSP system: x(k) → DSP System → y(k).

Y

yk: yk or y(k) is usually the name assigned to the output signal of a DSP system: x(k) → DSP System → y(k).

Yule Walker Equations: Consider a stochastic signal, u(k), produced by inputting white noise, v(k), to an all-pole filter:

[Figure: white noise v(k) input to an autoregressive model with weights {b1, b2, ..., bM}, producing the modelled signal, or autoregressive process, u(k).]

The output signal u(k) is referred to as an autoregressive process, and was generated by a white noise input at v(k). If the inverse problem is posed such that you are given the autoregressive signal u(k) and the order of the process (say M), then the autoregressive filter weights {b1, b2, ...
bM} that produced the given process from a white noise signal, v(k), can be found by solving the Yule Walker equations:

    b_AR = R^(-1) r    (622)

where the vector b = [b1 ... b(M-1) bM]^T, R is the M × M correlation matrix:

    R = E{u(k-1) u^T(k-1)} = [ r0       r1       ...  r(M-1) ]
                             [ r1       r0       ...  r(M-2) ]
                             [ :        :        ...  :      ]
                             [ r(M-1)   r(M-2)   ...  r0     ]    (623)

and r the M × 1 correlation vector:

    r = E{u(k) u(k-1)} = [r1 r2 ... rM]^T    (624)

where rn = E{u(k)u(k-n)} = E{u(k-n)u(k)}, and E{.} is the expectation operator. See also Autoregressive Modelling.

Z

Z-1: Derived from the z-transform of a signal, z^-1 is taken to mean a delay of one sample period. Sometimes denoted simply as ∆.

Zeroes: A sampled impulse response (e.g. of a digital filter) can be transferred into the Z-domain, and the zeroes of the function can be found by factorizing the polynomial to find the roots:

    H(z) = 1 - 3z^-1 + 2z^-2 = (1 - z^-1)(1 - 2z^-1)    (625)

i.e. the zeros are z = 1 and z = 2.

Zero Order Hold: If a signal is upsampled or reconstructed by holding the same value until the next sample value, then this is a zero order hold. Also called step reconstruction. See First Order Hold, Reconstruction Filter.

Zero-Padding: See Fast Fourier Transform - Zero Padding.

Zoran: A manufacturer and designer of special purpose DSP devices.

Z-transform: A mathematical transformation used for theoretical analysis of discrete systems. Transforming a signal or a system into the z-domain can greatly facilitate the understanding of a particular system [10].

Common Numbers Associated with DSP

In this section numerical values which are in some way associated with DSP and its applications are listed. The entries are given in an alphabetical type order, where 0 is before 1, 1 is before 2 and so on, with no regard to the actual magnitude of the number. Decimal points are ignored.
0 dB: If a system attenuates a signal by 0 dB then the signal output power is the same as the signal input power, i.e.

    10 log(Pout/Pin) = 10 log 1 = 0 dB    (626)

0x: Used as a prefix by Texas Instruments processors to indicate hexadecimal numbers.

0.0250858...: The base 10 logarithm of the ratio of the fundamental frequencies of any two neighboring notes (one semi-tone apart) on a musical instrument tuned to the Western music scale. See also Western Music Scale.

0.6366197: An approximation of 2/π. See also 3.92dB.

1 bit A/D: An alternative name for a Sigma-Delta (Σ-∆) A/D.

1 bit D/A: An alternative name for a Sigma-Delta (Σ-∆) D/A.

1 bit idea: An alternative name for a really stupid concept.

10^-12 W/m2: See entry for 2 × 10^-5 N/m2.

1004 Hz: When measuring the bandwidth of a telephone line, the 0 dB point is taken at 1004 Hz.

10149: The ISO/IEC standard number for the compact disc read only memory system description. Sometimes referred to as the Yellow Book. See also Red Book.

10918: The ISO/IEC standard number for JPEG compression.

1024: 2^10. The number of elements in 1k when referring to memory sizes, i.e. 1 kbyte = 1024 bytes.

1.024 Mbits/sec: The bit rate of a digital audio system sampling at fs = 32000 Hz with 2 (stereo) channels and 16 bits per sample.

1070 Hz: One of the FSK (frequency shift keying) carrier frequencies for the Bell 103, 300 bits/sec modem. Other frequencies are 1270 Hz, 2025 Hz and 2225 Hz.

103: The Bell 103 was a popular 300 bits/sec modem standard.

1.05946...: The twelfth root of 2, i.e. 2^(1/12). This number is the basis of the modern western music scale whereby the ratio of the fundamental frequencies of any two adjacent notes on the scale is 1.05946... See also Music, Western Music Scale.

10.8dB: Used in relation to quantisation noise power calculations; 10 log(1/12) = -10.8 dB.

11.2896 MHz: 2 × 5.6448 MHz and used as a clock for oversampling sigma delta ADCs and DACs.
5.6448 MHz sampling frequency can be decimated by a factor of 128 to 44.1 kHz, a standard hifidelity audio sampling frequency for CD players.

115200 bits/sec: The 115200 bits/sec modem is an eight times speed version of the very popular 14400 modem and became available in the mid 1990s. This modem uses echo cancellation, data equalisation, and data compression techniques to achieve this data rate. See also 300, 2400, V-series recommendations.

11544: The ISO/IEC standard number for JBIG compression.

11172: The ISO/IEC standard number for MPEG-1 video compression.

120 dB SPL: The nominal threshold of pain from a sound expressed as a sound pressure level.

1200 Hz: The carrier frequency of the originating end of the ITU V22 modem standard. The answering end uses a carrier frequency of 2400 Hz. Also one of the carrier frequencies for the FSK operation of the Bell 202 and 212 standards, the other one being 2400 Hz.

1209 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

12.288 MHz: 2 × 6.144 MHz and used as a clock for oversampling sigma delta ADCs and DACs. 6.144 MHz sampling frequency can be decimated by a factor of 128 to 48 kHz, a standard hifidelity audio sampling frequency for DAT.

128: 2^7.

12.8 MHz: 2 × 6.4 MHz and used as a clock for oversampling sigma delta ADCs and DACs. 6.4 MHz sampling frequency can be decimated by a factor of 64 to a sampling frequency of 100 kHz.

13 dB: The attenuation of the first sidelobe of the function 10 log(sin x / x) is approximately 13 dB. See also Sine Function.

1336 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

13522: The ISO/IEC standard number for MHEG multimedia coding.

13818: The ISO/IEC standard number for MPEG-2 video compression.

-13 dB: See 13 dB.

1.4112 Mbits/sec: The bit rate of a CD player sampling at fs = 44100 Hz, with 2 (stereo) channels and 16 bits per sample.
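The raw PCM bit rates quoted in entries such as 1.024 Mbits/sec and 1.4112 Mbits/sec all follow from sampling rate × channels × bits per sample. The short Python sketch below is an illustration added here (the function name is our own, not from any standard):

```python
def pcm_bit_rate(sample_rate_hz, channels, bits_per_sample):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * channels * bits_per_sample

# 32 kHz, stereo, 16 bit broadcast audio: 1.024 Mbits/sec
print(pcm_bit_rate(32000, 2, 16))   # 1024000
# CD audio, 44.1 kHz, stereo, 16 bit: 1.4112 Mbits/sec
print(pcm_bit_rate(44100, 2, 16))   # 1411200
# DAT, 48 kHz, stereo, 16 bit: 1.536 Mbits/sec
print(pcm_bit_rate(48000, 2, 16))   # 1536000
```

The same product also gives the single-channel figures quoted later (e.g. 44100 × 16 = 705600 bits/sec for one CD channel).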
14400 bits/sec: The 14400 bits/sec modem was a six times speed version of the very popular 2400 modem and became available in the early 1990s, with the cost falling dramatically in a few years. See also 300, 2400, V-series recommendations.

1.452 - 1.492 GHz: The 40 MHz radio frequency band allocated for satellite DAB (digital audio broadcasting) at the 1992 World Administrative Radio Conference in Spain. Due to other plans for this bandwidth, a number of countries selected other bands such as 2.3 GHz in the USA, and 2.5 GHz in fifteen other countries.

147: The number of the European digital audio broadcasting (DAB) project started in 1987, and formally named Eureka 147. This system has been adopted by ETSI (the European Telecommunication Standards Institute) for DAB and currently uses MPEG Audio Layer 2 for compression.

147:160: The ratio of the sampling rates of a CD player and a DAT player after division by their largest (integer) common divisor, i.e.

    44100/300 : 48000/300 = 147 : 160    (627)

1477 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

1.536 Mbits/sec: The bit rate of a DAT player sampling at fs = 48000 Hz, with 2 (stereo) channels and 16 bits per sample.

160: See 147.

1633 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

16384: 2^14.

1.76 dB: Used in relation to quantisation noise power calculations; 10 log 1.5 = 1.76 dB.

176.4kHz: The sample rate when 4× oversampling a CD signal where the sampling frequency fs = 44.1 kHz.

1800 Hz: The carrier frequency of the QAM (quadrature amplitude modulation) ITU V32 modem standard.

2 bits: American slang for a quarter (dollar).

2-D FFT: The extension of the (1-D) FFT into two dimensions to allow Fourier transforms on images.

2 × 10^-5 N/m2: The reference intensity, sometimes denoted as Iref, for the measurement of sound pressure levels (SPL).
This intensity can also be expressed as 10^-12 W/m2, or as 20 µPa (micropascals). This intensity was chosen as it is close to the absolute level of a tone at 1000 Hz that can just be detected by the human ear; the average human threshold of hearing at 1000 Hz is about 6.5 dB. The displacement of the eardrum at this sound power level is suggested to be 1/10th the diameter of a hydrogen molecule!

20 dB/octave: Usually used to indicate how well a low pass filter attenuates at frequencies above the 3 dB point. 20 dB per octave means that each time the frequency doubles the attenuation of the filter increases by a factor of 10, since 20 dB = 20 log 10. By comparison, 6 dB/octave is the same roll-off as 20 dB/decade. See also Decibels, Roll-off.

20 µPa (micropascals): See entry for 2 × 10^-5 N/m2.

205: The number of data points used in Goertzel’s algorithm (a form of discrete Fourier transform (DFT)) for tone detection.

2025 Hz: One of the FSK (frequency shift keying) carrier frequencies for the Bell 103, 300 bits/sec modem. Other frequencies are 1070 Hz, 1270 Hz and 2225 Hz.

2048: 2^11.

2100: The part number of most Analog Devices fixed point DSP processors.

21000: The part number of most Analog Devices floating point DSP processors.

2225 Hz: One of the FSK (frequency shift keying) carrier frequencies for the Bell 103, 300 bits/sec modem. Other frequencies are 1070 Hz, 1270 Hz and 2025 Hz.

24 bits: The fixed point wordlength of some members of the Motorola DSP56000 family of DSP processors.

2400 bits/sec: The 2400 bits/sec modems appeared in the early 1990s as low cost communication devices for remote computer access and FAX transmission. The bit rate of 2400 was chosen as it is a factor of 8 faster than the previous 300 bits/sec modem. Data rates of 2400 were achieved by using echo cancellation and data equalisation techniques. The 2400 bits/sec modem dominated the market until the cost of the 9600 modems started to fall in about 1992.
To ensure simple backwards compatibility all modems are now produced at multiples of 2400 bits/sec, i.e. 4800, 7200, 9600, 14400, 28800, 57600, 115200. See also V-series recommendations.

2400 Hz: The carrier frequency of the answering end of the ITU V22 modem standard. The originating end uses a carrier frequency of 1200 Hz. Also one of the carrier frequencies for the FSK operation of the Bell 202 and 212 standards, the other one being 1200 Hz.

256: 2^8.

26 dB: The attenuation of the first sidelobe of the function 20 log(sin x / x) is approximately 26 dB. See also Sine Function.

261.624 Hz: The fundamental frequency of middle C on a piano tuned to the Western music scale. See also 440 Hz.

2.718281...: The (truncated) value of e, the base of the natural logarithm.

28800 bits/sec: The 28800 bits/sec modem is a double speed version of the very popular 14400 modem and became available in the mid 1990s. This modem uses echo cancellation, data equalisation, and data compression techniques to achieve this data rate. See also 300, 2400, V-series recommendations.

2.8224 MHz: An intermediate oversampling frequency used for sigma delta ADCs and DACs used with CD audio systems. 2.8224 MHz can be decimated by a factor of 64 to 44.1 kHz, the standard sampling frequency of CD players.

3 dB: See 3.01dB.

3.01 dB: The approximate value of 10 log10 2 = 3.0103. If a signal is attenuated by 3 dB then its power is halved.

300: The largest (integer) common divisor of the sampling rates of a CD player and a DAT player, i.e.

    44100/300 : 48000/300 = 147 : 160    (628)

300 bits/sec: The bit rate of the first commercial computer modems. Although 28800 bits/sec is now easily achievable, 300 bits/sec modems probably outsell all other speeds of modems by virtue of the fact that most credit card telephone verification systems can perform the verification task at 300 bits/sec in a few seconds. See also Bell 103, 2400, V-series recommendations.
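The 205-point Goertzel tone detector mentioned under the 205 entry evaluates the signal power at a single frequency without computing a full DFT, which is why it suits DTMF detection. The Python sketch below is only an illustrative implementation (the function name and test tone are our own, not from the G.711/DTMF standards):

```python
import math

def goertzel_power(samples, target_hz, sample_rate_hz):
    """Squared magnitude of the DFT bin nearest target_hz, via Goertzel's recursion."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)        # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2             # second-order resonator update
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

fs, n = 8000, 205          # telephone sampling rate; block length from the 205 entry
tone = [math.sin(2 * math.pi * 770 * t / fs) for t in range(n)]  # 770 Hz DTMF row tone
print(goertzel_power(tone, 770, fs) > goertzel_power(tone, 1336, fs))  # True
```

A DTMF decoder would run one such detector per row and column frequency (697, 770, 852, 941 Hz and 1209, 1336, 1477, 1633 Hz) and pick the strongest pair.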
3.072 MHz: An intermediate oversampling frequency used for sigma delta ADCs and DACs used with DAT and other professional audio systems. 3.072 MHz can be decimated by a factor of 64 to 48 kHz, the current standard professional hifidelity audio sampling frequency.

32 kHz: A standard hifidelity audio sampling rate. The sampling rate of NICAM for terrestrial broadcasting of stereo audio for TV systems in the United Kingdom.

32 bits: The wordlength of most floating point DSP processors. 24 bits are used for the mantissa, and 8 bits for the exponent.

3.2 MHz: An intermediate oversampling frequency for sigma delta ADCs and DACs that can be decimated by a factor of 32 to 100 kHz.

320: The part number for most Texas Instruments DSP devices.

32768: 2^15.

3.3 Volt Devices: DSP processor manufacturers are now releasing devices that will function with 3.3 volt power supplies, leading to a reduction of power consumption.

350 Hz: Tones at 350 Hz and 440 Hz make up the dialing tone for telephone systems.

35786 km: The height above the earth of a satellite geostationary orbit. This leads to between 240 and 270 ms one way propagation delay for satellite enabled telephone calls. On a typical international telephone connection the round-trip delay can be as much as 0.6 seconds, making voice conversation difficult. In the likely case of additional echoes voice conversation is almost impossible without the use of echo cancellation strategies.

352.8 kbits/sec: One quarter of the bit rate of hifidelity CD audio sampled at 44.1 kHz, with 16 bit samples and stereo channels (44100 × 16 × 2 = 1411200 bits/sec). The data compression scheme known as PASC (psychoacoustic subband coding) used on DCC (digital compact cassette) compresses by a factor 4:1 and therefore has a data rate of 352.8 kbits/sec when used on data sampled at 44.1 kHz.

352.8kHz: The sample rate when 8× oversampling a CD signal where the sampling frequency is fs = 44.1 kHz.
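The 240-270 ms figure quoted under the 35786 km entry can be checked directly from the speed of light; the sketch below is only a lower bound, since it ignores ground-station routing, coding delay, and the longer slant range to non-equatorial earth stations:

```python
C = 299_792_458.0        # speed of light, m/s
H = 35_786_000.0         # geostationary altitude, m

hop = H / C              # ground -> satellite (or satellite -> ground), seconds
one_way = 2 * hop        # up and back down: one direction of a phone call
print(round(hop * 1000), round(one_way * 1000))   # 119 239 (milliseconds)
```

Doubling again for the reply gives a round trip approaching half a second, consistent with the 0.6 second figure quoted in the entry.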
384 kbits/sec: One quarter of the bit rate of hifidelity audio sampled at 48 kHz, with 16 bit samples and stereo channels (48000 × 16 × 2 = 1536000 bits/sec). The data compression scheme known as PASC (psychoacoustic subband coding) used on DCC (digital compact cassette) compresses by a factor 4:1 and therefore has a data rate of 384 kbits/sec when used on data sampled at 48 kHz.

3.92dB: The attenuation of the frequency response of a step reconstructed signal at fs/2. The attenuation is the result of the zero order hold “step” reconstruction which is equivalent to convolving the signal with a unit pulse of time duration ts = 1/fs, or in the frequency domain, multiplying by the sinc function, H(f):

    H(f) = sin(πf ts) / (πf ts)    (629)

Therefore at fs/2, the droop in the output signal spectrum has a value of:

    H(fs/2) = sin(π/2) / (π/2) = 2/π = 0.63662    (630)

which in dB can be expressed as:

    20 log(2/π) = -3.922398 dB    (631)

i.e. an attenuation of 3.92 dB.

4 dB: Sometimes used as an approximation to 3.92dB. See also 3.92dB.

4096: 2^12.

4294967296: 2^32.

440 Hz: The fundamental frequency of the first A note above middle C on a piano tuned to the Western music scale. Definition of the frequency of this one note allows the fundamental tuning frequency of all other notes to be defined. Also the pair of tones at 440 Hz and 350 Hz make up the telephone dialing tone, and 440 Hz and 480 Hz make up the ringing tone for telephone systems.

44.1kHz: The sampling rate of Compact Disc (CD) players. This sampling frequency was originally chosen to be compatible with U-matic video tape machines which had either a 25 or 30 Hz frame rate, i.e. 25 and 30 are both factors of 44100.

44.056kHz: A sampling rate closely related to the 44.1 kHz CD rate. The CD sampling rate was originally chosen to be compatible with U-matic video tape machines which had either a 25 or 30 Hz frame rate, i.e. 25 and 30 are both factors of 44100.
When master recording was done on a 29.97 Hz frame rate video machine, this required the sampling rate to be modified to a nearby number that was a multiple of 29.97, i.e. 44.056 kHz (≈ 29.97 × 1470). This sampling rate is redundant now.

4.76cm/s: The tape speed of compact cassette players, and also of digital compact cassette players (DCC).

4.77 dB: 10 log 3 ≈ 4.77 dB, i.e. a signal that has its power amplified by a factor of 3, has an amplification of 4.77 dB.

48kHz: The sampling rate of digital audio tape (DAT) recorders, and the sampling rate used by most professional audio systems.

480 Hz: The tone pair 480 Hz and 620 Hz make up the busy signal on telephone systems.

4800 bits/sec: The 4800 bits/sec modem was a double speed version of the very popular 2400 modem. Data rates of 4800 were achieved using echo cancellation and data equalisation techniques. See also 2400, V-series recommendations.

512: 2^9.

56000: The part number for most Motorola fixed point DSP devices.

5.6448 MHz: An oversampling frequency for sigma delta ADCs and DACs used with CD players. 5.6448 MHz can be decimated by a factor of 128 to 44.1 kHz, the standard hifidelity audio sampling frequency for CD players.

57600 bits/sec: The 57600 bits/sec data rate modem is a four times speed version of the very popular 14400 modem and became available in the mid 1990s. This modem uses echo cancellation, data equalisation, and data compression techniques to achieve this data rate. See also 300, 2400, V-series recommendations.

6dB/octave: The “6” is an approximation for 20 log10 2 = 6.0206. Usually used to indicate how well a low pass filter attenuates at frequencies above the 3 dB point. 6 dB per octave means that each time the frequency doubles the attenuation of the filter increases by a factor of 2, since 20 log 2 ≈ 6 dB. 6dB/octave is the same roll-off as 20dB/decade. See also Decibels, Roll-off.

6.144 MHz: An oversampling frequency for sigma delta ADCs and DACs used with DAT and other professional audio systems.
6.144 MHz can be decimated by a factor of 128 to 48 kHz, the current standard professional hifidelity audio sampling frequency.

620 Hz: The tone pair 480 Hz and 620 Hz make up the busy signal on telephone systems.

6.4 MHz: An oversampling frequency for sigma delta ADCs and DACs that can be decimated by a factor of 64 to 100 kHz.

64kBits/sec: A standard channel bandwidth for data communications. If a channel has a bandwidth of approximately 4 kHz, then the Nyquist sampling rate would be 8 kHz, and data of 8 bit wordlength is sufficient to allow good fidelity of speech to be transmitted. Note that 64000 bits/sec = 8000 Hz × 8 bits.

6.4 MHz: A common sampling rate for a 64 times oversampled sigma-delta (Σ-∆) A/D, resulting in up to 16 or more bits of resolution at 100 kHz after decimation by 64.

65536: 2^16.

697 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

705600 bits/sec: The bit rate of a single channel of a CD player, with 16 bit samples, and sampling at fs = 44100 Hz.

705.6 kHz: The sample rate when 16× oversampling a CD signal where the sampling frequency fs = 44.1 kHz.

7200 bits/sec: The 7200 bits/sec modem was a three times speed version of the very popular 2400 modem and became available in the early 1990s, with the cost falling dramatically in a few years. Data rates of 7200 were achieved using echo cancellation and data equalisation techniques. See also 2400, V-series recommendations.

741 Op-Amp: The part number of a very popular operational amplifier chip widely used for signal conditioning, amplification, and anti-alias and reconstruction filters.

768000 bits/sec: The bit rate of a single channel DAT player with 16 bits per sample, and sampling at fs = 48000 Hz.

770 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

8 kHz: The sampling rate of most telephonic based speech communication.
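The zero order hold droop quoted under the 3.92dB entry can be reproduced numerically; a minimal sketch (the function name is our own, added for illustration):

```python
import math

def zoh_response_db(f, fs):
    """Zero order hold (sinc) magnitude response in dB at frequency f, sample rate fs."""
    x = math.pi * f / fs
    return 20.0 * math.log10(math.sin(x) / x)

print(zoh_response_db(0.5 * 44100, 44100))   # about -3.92 dB droop at fs/2
```

The same function shows why heavy oversampling (e.g. the 352.8 kHz and 705.6 kHz CD rates above) helps: at 8× or 16× the audio band sits well below the new fs/2, where the sinc droop is negligible.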
8192: 2^13.

852 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

941 Hz: One of the frequency tones used for DTMF signalling. See also Dual Tone Multifrequency.

9.54dB: 20 log 3 ≈ 9.54 dB, i.e. a signal that has its voltage amplified by a factor of 3, has an amplification of 9.54 dB.

9600 bits/sec: The 9600 bits/sec modem was a four times speed version of the very popular 2400 modem and became available in the early 1990s, with the cost falling dramatically in a few years. Data rates of 9600 were achieved by using echo cancellation and data equalisation techniques. See also 2400, V-series recommendations.

96000: The part number for most Motorola 32 bit floating point devices.

Acronyms

ADC - Analogue to Digital Converter.
ADSL - Asymmetric Digital Subscriber Line.
ADSR - Attack-Decay-Sustain-Release.
AES/EBU - Audio Engineering Society/European Broadcast Union.
A/D - Analogue to Digital Converter.
ADPCM - Adaptive Differential Pulse Code Modulation.
ANC - Active Noise Cancellation.
ANSI - American National Standards Institute.
AIC - Analogue Interfacing Chip.
ARB - Arbitrary Waveform Generation.
ASCII - American Standard Code for Information Interchange.
ASIC - Application Specific Integrated Circuit.
ASK - Amplitude Shift Keying.
ASPEC - Adaptive Spectral Perceptual Entropy Coding.
ASSP - Acoustics, Speech and Signal Processing.
AVT - Active Vibration Control.
AWGN - Additive White Gaussian Noise.
BER - Bit Error Rate.
BISDN - Broadband Integrated Services Digital Network.
BPF - Band Pass Filter.
BPSK - Binary Phase Shift Keying.
CCR - Condition Code Register.
CCITT - Comité Consultatif International Télégraphique et Téléphonique (International Consultative Committee on Telegraphy and Telecommunication, now known as ITU-T).
CCIR - Comité Consultatif International Radiocommunication (International Consultative Committee on Radiocommunication, now known as ITU-R).
CD - Compact Disc.
CD-DV - Compact Disc Digital Video.
CELP - Code Excited Linear Prediction (vocoders).
CENELEC - Comité Européen de Normalisation Electrotechnique (European Committee for Electrotechnical Standardization).
CIF - Common Intermediate Format.
CIRC - Cross Interleaved Reed Solomon Code.
CISC - Complex Instruction Set Computer.
CPM - Continuous Phase Modulation.
CPU - Central Processing Unit.
CQFP - Ceramic Quad Flat Pack.
CRC - Cyclic Redundancy Check.
CVSD - Continuously Variable Slope Delta modulator.
D/A - Digital to Analogue Converter.
DAB - Digital Audio Broadcasting.
DAC - Digital to Analogue Converter.
dB - decibels.
DECT - Digital European Cordless Telephone.
DL - Difference Limen.
DARS - Digital Audio Radio Services.
DBS - Direct Broadcast Satellites.
DCC - Digital Compact Cassette.
DCT - Discrete Cosine Transform.
DDS - Direct Digital Synthesis.
DFT - Discrete Fourier Transform.
DLL - Dynamic Link Library.
DMA - Direct Memory Access.
DPCM - Differential Pulse Code Modulation.
DPSK - Differential Phase Shift Keying.
DRAM - Dynamic Random Access Memory.
DSL - Digital Subscriber Line.
DSP - Digital Signal Processing.
DTMF - Dual Tone Multifrequency.
DSfP - Digital Soundfield Processing.
ECG - Electrocardiograph.
EEG - Electroencephalograph.
EFM - Eight to Fourteen Modulation.
EMC - Electromagnetic Compatibility.
EPROM - Erasable Programmable Read Only Memory.
EEPROM - Electrically Erasable Programmable Read Only Memory.
EQ - Equalization (usually in acoustic applications).
ETSI - European Telecommunications Standards Institute.
FIR - Finite Impulse Response.
FFT - Fast Fourier Transform.
FSK - Frequency Shift Keying.
G - prefix meaning 10^9, as in GHz, thousands of millions of Hertz.
GII - Global Information Infrastructure.
GIF - Graphic Interchange Format.
GSM - Global System For Mobile Communications (Group Speciale Mobile).
HDSL - High speed Digital Subscriber Line.
http - Hypertext Transfer Protocol.
IEEE - Institute of Electrical and Electronic Engineers (USA).
IEE - Institution of Electrical Engineers (UK).
IEC - International Electrotechnical Commission.
IIR - Infinite Impulse Response.
IIF - Image Interchange Facility.
INMARSAT - International Mobile Satellite Organization.
ISDN - Integrated Services Digital Network.
ISO - International Organisation for Standards.
ISO/IEC JTC - International Organization for Standards/International Electrotechnical Commission Joint Technical Committee.
ITU - International Telecommunications Union.
ITU-R - International Telecommunications Union - Radiocommunication.
ITU-T - International Telecommunications Union - Telecommunication.
I/O - Input/Output.
JBIG - Joint Bi-level Image Group.
JND - Just Noticeable Difference.
JPEG - Joint Photographic Expert Group.
JTC - Joint Technical Committee.
k - prefix meaning 10^3, as in kHz, thousands of Hertz.
LFSR - Linear Feedback Shift Register.
LPC - Linear Predictive Coding.
LSB - Least Significant Bit.
M - prefix meaning 10^6, as in MHz, millions of Hertz.
MAC - Multiply Accumulate.
MFLOPS - Millions of Floating Point Operations per Second.
MIDI - Musical Instrument Digital Interface.
MAF - Minimum Audible Field.
MAP - Minimum Audible Pressure.
MIPS - Millions of Instructions Per Second.
MLPC - Multipulse Linear Predictive Coding.
MA - Moving Average.
MD - Mini-Disc.
MMSE - Minimum Mean Squared Error.
MHEG - Multimedia and Hypermedia Experts Group.
MPEG - Moving Picture Experts Group.
MRELP - M..
ms - millisecond (10^-3 seconds).
MSB - Most Significant Bit.
MSE - Mean Squared Error.
MSK - Minimum Shift Keying.
MIX - Modular Interface eXtension.
MUSICAM - Masking pattern adapted Universal Subband Integrated Coding And Multiplexing.
NRZ - Non Return to Zero.
ns - nanosecond (10^-9 seconds).
OKPSK - Offset-Keyed Phase Shift Keying.
OKQAM - Offset-Keyed Quadrature Amplitude Modulation.
OOK - On Off Keying.
OPSK - Offset-Keyed Phase Shift Keying.
OQAM - Offset-Keyed Quadrature Amplitude Modulation.
PAM - Pulse Amplitude Modulation.
PASC - Precision Adaptive Subband Coding.
PCM - Pulse Code Modulation.
PCMCIA - Personal Computer Memory Card International Association.
PN - Pseudo-Noise.
ppm - Parts Per Million.
PPM - Pulse Position Modulation.
PRBS - Pseudo Random Binary Sequence.
PSK - Phase Shift Keying.
PSTN - Public Switched Telephone Network.
PTS - Permanent Threshold Shift.
PWM - Pulse Width Modulation.
PDA - Personal Digital Assistant.
PGA - Pin Grid Array.
PID - Proportional Integral Derivative (controller).
PQFP - Plastic Quad Flat Pack.
PRNS - Pseudo Random Noise Sequence.
QAM - Quadrature Amplitude Modulation.
QPSK - Quadrature Phase Shift Keying.
RAM - Random Access Memory.
RBDS - Radio Broadcast Data System.
RELP - Residual Excited Linear Prediction Vocoder.
RIFF - Resource Interchange File Format.
RISC - Reduced Instruction Set Computer.
RLC - Run Length Coding.
RLE - Run Length Encoding.
ROM - Read Only Memory.
RPE - Recursive Predictor Error or Regular Pulse Excitation.
RZ - Return to Zero.
Rx - Receive.
SBM - Super Bit Mapping (a trademark of Sony).
SCMS - Serial Copy Management System.
SFG - Signal Flow Graph.
SGML - Standard Generalized Markup Language.
S/H - Sample and Hold.
SINR - Signal to Interference plus Noise Ratio.
SNR - Signal to Noise Ratio.
S/N - Signal to Noise Ratio.
S/P-DIF - Sony/Philips Digital Interface Format.
SR - Status Register.
SPL - Sound Pressure Level.
SRAM - Static Random Access Memory.
SRC - Sample Rate Converter.
TPDF - Triangular Probability Density Function.
TCM - Trellis Coded Modulation.
THD - Total Harmonic Distortion.
THD+N - Total Harmonic Distortion plus Noise.
TTS - Temporary Threshold Shift.
Tx - Transmit.
VSELP - Vector Sum Excited Linear Prediction.
VU - Volume Unit.
WMA - Weighted Moving Average.
WWW - World Wide Web.
µsec - microsecond (10^-6 seconds).

Standards Organisations

ANSI - American National Standards Institute.
BS - British Standard.
IEC - International Electrotechnical Commission.
IEEE - Institute of Electrical and Electronic Engineers.
ISO - International Organisation for Standards.

References and Further Reading

Textbooks

[1] S. Banks. Signal Processing, Image Processing and Pattern Recognition. Prentice Hall, Englewood Cliffs, NJ, 1990.
[2] T.P. Barnwell III, K. Nayebi, C.H. Richardson. Speech Coding: A Computer Laboratory Textbook. John Wiley and Sons, 1996.
[3] A. Bateman and W. Yates. Digital Signal Processing Design. Pitman Publishing, 1988.
[4] E.H. Berger, W.D. Ward, J.C. Morrill, L.H. Royster. Noise and Hearing Conservation Manual, 4th Edition. American Industrial Hygiene Association.
[5] R.L. Brewster. ISDN Technology. Chapman & Hall, London, 1993.
[6] R.G. Brown, P.Y.C. Hwang. Introduction to Random Signals and Applied Kalman Filtering. John Wiley and Sons, 1992.
[7] C.S. Burrus, J.H. McClellan, A.V. Oppenheim, T.W. Parks, R.W. Schafer, H.W. Schuessler. Computer Based Exercises for Signal Processing Using Matlab. Prentice Hall, 1994.
[8] J.C. Candy, G.C. Temes. Oversampling Delta-Sigma Data Converters. IEEE Press, Piscataway, NJ, 1992.
[9] L.W. Couch II. Modern Communication Systems: Principles and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1995.
[10] D.J. DeFatta, J.G. Lucas, W.S. Hodgkiss. Digital Signal Processing: A System Design Approach. John Wiley, New York, 1988.
[11] J.R. Deller, J.G. Proakis, J.H.K. Hansen. Discrete Time Processing of Speech Signals. MacMillan, New York, 1993.
[12] P.D. Denyer and D. Renshaw. VLSI Signal Processing - A Bit Serial Approach. Addison-Wesley, 1995.
[13] G. De Poli, A. Piccialli, C. Roads. Representations of Musical Signals. The MIT Press, Boston, USA, 1991.
[14] J.M. Eargle. Music Sound and Technology. Van Nostrand Reinhold, 1990.
[15] G.H. Golub, C.F. Van Loan. Matrix Computations. Johns Hopkins University Press, 1989.
[16] J.G. Gibson. The Mobile Communications Handbook. CRC Press/IEEE Press, 1996.
[17] S. Haykin. Adaptive Filter Theory (2nd Edition).
Prentice Hall, Englewood Cliffs, NJ, 1990. [18] S. Haykin. Neural Networks: A Comprehensive Foundation. MacMillan College, 1994. [19] D.R. Hush and B.G. Horne. Progress in supervisied neural networks. IEEE Signal Processing Magazine, Vol. 10, No. 1, pp. 8-39, January 1993. [20] K. Hwang, F. Briggs. Computer Architecture and Parallel Processing. McGraw-Hill, 1985. [21] E.C. Ifeachor, B.W. Jervis. Digital Signal Processing: A Practical Approach. Addison-Wesley, 1993. [22] N. Kalouptsidis,Theodoridis. Adaptive System Identification and Signal Processing Algorithms. Prentice Hall, 1993. 444 DSPedia [23] A. Kamas, E.A. Lee. Digital Signal Processing Experiments. Prentice-Hall, Englewood Cliffs, NJ, 1989. [24] S.Y. Kung. Digital Neurocomputing. Prentice-Hall, Englewood Cliffs, NJ, 1992. [25] S.Y. Kung. VLSI Array Processors. Prentice-Hall, Englewood Cliffs, NJ, 1987. [26] P.A. Lynn. An Introduction to the Analysis and Processing of Signals, 1982. [27] J.D. Martin. Signals and Processes: A Foundation Course. [28] C. Marven, G. Ewers. A Simple Approach to Digital Signal Processing. Texas Instruments Publication, 1993. [29] R.M. Mersereau, M.J.T. Smith. Digital Filtering. John Wiley, New York, 1993. [30] B.C.J Moore. An Introduction to the Psychology of Hearing. [31] A.V. Oppenheim, R.W. Schafer. Discrete Time Signal Processing. Prentice Hall, Englewood Cliffs, NJ, 1989. [32] R.A. Penfold. Synthesizers for Musicians. PC Publishing, London, 1989. [33] K. Pohlmann. Advanced Digital Audio. Howards Sams, Indiana, 1991. [34] K. Pohlmann. An Intoduction to Digital Audio, Howard Sams, Indiana, 1989. [35] T.S. Rappaport. Wireless Communications. IEEE Press, New York, 1996. [36] P. Regalia. Adaptive IIR Filtering. Marcel Dekker, 1995. [37] F. Rumsey. Digital Audio. Butterworth-Heinemann, 1991 [38] E. Rogers and Y. Li. Parallel Processing in a Control Systems Environment. Prentice Hall, Englewood Cliffs, NJ, 1993. [39] K. Sayood. Introduction to Data Compression. 
Morgan-Kaufman, 1995. [40] M. Schwartz. Information, Transmission, and Modulation Noise. McGraw-Hill. [41] N.J.A. Sloane, A.D. Wyner (Editors). Claude Elwood Shannon: Collected Papers. IEEE Press, 1993, Piscataway, NJ. ISBN 0-7803-0434-9. [42] M.J.T. Smith, R.M Mersereau. Introduction to Digital Signal Processing: A Computer Laboratory Textbook. John Wiley and Sons, 1992. [43] K. Steiglitz. A Digital Signal Processing Primer. Addison-Wesley, 1996. [44] C.A. Stewart and R. Atkinson. Basic Analogue Computer Techniques. McGraw-Hill, London, 1967. [45] N. Storey. Electronics: A Systems Approach. Addison-Welsey, 1992. [46] M. Talbot-Smith (Editor). Audio Engineer’s Reference Book, Focal Press, ISBN 0 7506 0386 0, 1994. [47] F.J. Taylor. Principles of Signals and Systems. New York; McGraw-Hill, 1994. [48] W.J. Tompkins. Biomedical Digital Signal Processing. Prentice Hall, Englewood Cliffs, NJ, 1993. [49] P.P. Vaidyanathan. Multirate Systems and Filter Banks. Prentice Hall, Englewood Cliffs, NJ,1993. [50] S.V. Vaseghi. Advanced Signal Processing and Digital Noise Reduction. John Wiley/B.G. Tuebner, 1996. [51] J. Watkinson. An Introduction to Digital Audio. Focal Press, ISBN 0 240 51378 9, 1994. [52] J. Watkinson. Compression in Video and Audio. ISBN 0240513940, Focal Press, April 1994. 445 [53] B. Widrow and S. Stearns. Adaptive Signal Processing. Prentice Hall, 1985. [54] J. Watkinson. The Art of Digital Audio, 2nd Edition. ISBN 0240 51320 7, 1993. Technical Papers [55] P.M. Aziz, H.V. Sorenson, J.V. Der Spiegel. An overview of sigma delta converters. IEEE Signal Processing Magazine, Vol. 13, No. 1, pp. 61-84, January 1996. [56] J.W. Arthur. Modern SAW-based pulse compression systems for radar application. Part 2: Practical systems. IEE Electronics Communication Engineering Journal, Vol. 8, No. 2, pp. 57-78, April 1996. [57] G.M. Blair. A Review of the Discrete Fourier Transform. Part 1: Manipulating the power of two. 
IEE Electronics and Communication Engineering Journal, Vol. 7, No.4, pp. 169-176, August 1995. [58] G.M. Blair. A review of the discrete Fourier Transform. Part 2: Non-radix algorithms, real transforms and noise. IEE Electronics Communication Engineering Journal, Vol. 7, No. 5, pp. 187-194, October 1995. [59] J.A. Cadzow. Blind deconvolution via cumulant extrema. IEEE Signal Processing Magazine, Vol. 13, No. 3, pp. 24-42, May 1996. [60] J. Cadzow. Signal processing via least squares error modelling. IEEE ASSP Magazine, Vol. 7, No. 4, pp 12-31, October 1990. [61] G. Cain, A. Yardim, D. Morling. All-Thru DSP Provision, Essential for the modern EE. Proceedings of IEEE International Conference on Acoutics, Speech and Signal Processing 93, pp. I-4 to I-9, Minneapolis, 1993. [62] C. Cellier. Lossless audio data compression for real time applications. 95th AES Convention, New York, Preprint 3780, October 1993. [63] S. Chand and S.L. Chiu (editors). Special Issue on Fuzzy Logic with Engineering Applications. Proceedings of the IEEE, Vol. 83, No. 3, pp. 343-483, March 1995. [64] R. Chellappa, C.L. Wilson and S. Sirohey. Human and machine recognition of faces. Proceedings of the IEEE, Vol. 83, No. 5 pp. 705-740, May 1995. [65] J. Crowcroft. The Internet: a tutorial. IEE Electronics Communication Engineering Journal, Vol. 8, No. 3, pp. 113122, June 1996. [66] J.W. Cooley. How the FFT gained acceptance. IEEE Signal Processing Magazine, Vol. 9, No. 1, pp. 10-13, January 1992. [67] J.R. Deller, Jr. Tom, Dick and Mary discover the DFT. IEEE Signal Processing Magazine, Vol. 11, No. 2, pp. 3650, April 1994. [68] S.J. Elliot and P.A. Nelson. Active Noise Control. IEEE Signal Processing Magazine, Vol. 10, No. 4, pp 12-35, October 1993. [69] L.J. Eriksson. Development of the filtered-U algorithm for active noise control. Journal of the Acoustical Society of America, Vol. 89 (No.1), pp 27-265, 1991. [70] H. Fan. A (new) Ohio yankee in King Gustav’s country. 
IEE Signal Processing Magazine, Vol.12, No. 2, pp. 3840, March 1995. [71] P.L. Feintuch. An adaptive recursive LMS filter. Proceedings of the IEEE, Vol. 64, No. 11, pp. 1622-1624, November 1976. [72] D. Fisher. Coding MPEG1 image data on compact discs. Electronic Product Design (UK), Vol. 14, No. 11, pp. 2633. November 1993. 446 DSPedia [73] H. Fletcher and W.A. Munson. Loudness, its definition, measurement and calculation. Journal of the Acoustical Society of America, Vol. 70, pp. 1646-1654, 1933. [74] M. Fontaine and D.G. Smith. Bandwidth allocation and connection admission control in ATM networks. IEE Electronics Communication Engineering Journal, Vol. 8, No. 4, pp. 156-164, August 1996. [75] W. Gardner. Exploitation of spectral redundancy in cyclostationary signals. IEEE Signal Processing Magazine, Vol. 8, No. 2, pp 14-36, April 1991. [76]H. Gish and M. Schmidt. Text independent speaker identification. Vol. 11, No. 4, pp 18- 32, October 1994. [77] P.M. Grant. Signal processing hardware and software. IEEE Signal Processing Magazine, Vol. 13, No. 1, pp. 8688, January 1996. [78] S. Harris. The effects of sampling clock jitter on Nyquist sampling analog to digital converters, and on oversampling delta sigma ADCs. Journal of the Audio Engineering Society, July 1990. [79] S. Heath. Multimedia standards and interoperability. Electronic Product Design (UK), Vol. 15, No. 9, pp. 33-37. November 1993. [80] F. Hlawatsch and G.F. Boudreaux-Bartels. Linear and quadratic time-frequency signal representations. IEEE Signal Processing Magazine, Vol. 9, No. 2, pp. 21-67, April 1992. [81]D.R. Hush and B.G. Horne. Progress in Supervised Neural Networks. IEEE Signal Processing Magazine, Vol. 10, No. 1, pp. 8-39, January 1993. [82] Special Issue on DSP Education, IEEE Signal Processing Magazine, Vol. 9, No.4, October 1992. [83] Special Issue on Fuzzy Logic with Engineering Applications. Proceedings of the IEEE, Vol. 83, No. 3, March 1995. [84] A. Hoogendoorn. 
Digital Compact Cassette. Proceedings of the IEEE, Vol. 82, No. 10, pp. 1479-1489, October 1994. [85] B. Jabbari (editor). Special Issue on Wireless Networks for Mobile and Personal Communications. Vol. 82, No. 9, September 1994. [86] D.L. Jaggard (editor). Special Section on Fractals in Electrical Engineering. Proceedings of the IEEE, Vol. 81, No. 10, pp. 1423-1523, October 1993. [87] N. Jayant, J. Johnston, R. Safranek. Signal compression based on models of human perception. Proceedings of the IEEE, Vol. 81, No. 10, pp. 1385-1382, October 1993. [88] C.R. Johnson. Yet still more on the interaction of adaptive filtering, identification, and control. IEE Signal Processing Magazine, Vol.12, No. 2, pp. 22-37, March 1995. [89] R.K. Jurgen. Broadcasting with Digital Audio. IEEE Spectrum, Vol. 33, No. 3, pp. 52-59. March 1996. [90] S.M. Kay and S.L. Marple. Spectrum Analysis - A Modern Perspective. Proceedings of the IEEE, Vol. 69, No. 11, pp 1380-1419, November 1981. [91] K. Karnofsky. Speeding DSP algorithm design. IEEE Spectrum, Vol. 33, No. 7, pp. 79-82, July 1996. [92] W. Klippel. Compensation for non-linear distortion of horn loudspeakers by digital signal processing. Journal of the Audio Engineering Society, Vol. 44, No. 11, pp 964-972, Novemeber 1996. [93] P. Kraniauskas. A plain man’s guide to the FFT. IEEE Signal Processing Magazine, Vol. 11, No. 2, pp. 24-36, April 1994. [94] F. Kretz and F. Cola. Standardizing Hypermedia Information Objects. IEEE Communications Magazine, May 1992. 447 [95] M. Kunt (Editor). Special Issue on Digital Television, Part 1: Technologies. Proceedings of the IEEE, Vol. 83, No. 6, June 1995. [96] M. Kunt (Editor). Special Issue on Digital Television, Part 2: Hardware and Applciations. Proceedings of the IEEE, Vol. 83, No. 7, July 1995. [97] T.I. Laakso, V. Valimaki, M. Karjalainen and U.K. Laine. Splitting the unit delay. IEEE Signal Processing Magazine, Vol. 13, No. 1, pp. 30-60, January 1996. [98] T.I. Laasko, V. Valimaki, M. 
Karjalainen, U.K. Laine. Splitting the Unit Delay. IEEE Signal Processing Magazine, Vol. 13, No. 1, pp. 30-60, January 1996. [99] P. Lapsley and G. Blalock. How to estimate DSP processor perfomance. IEEE Spectrum, Vol. 33, No. 7, pp. 7478, July 1996. [100]V.O.K. Li and X. Qui. Personal communication systems. Proceedings of the IEEE, Vol. 83, No. 9, pp. 1210-1243, September 1995. [101]R.P. Lippmann. An introduction to computing with neural nets. IEEE ASSP Magazine, Vol. 4, No. 2, pp. 4-22, April 1987. [102]G.C.P. Lokhoff. DCC: Digital Compact Cassette. IEEE Transactions on Consumer Electronics, Vol. 37, No. 3 pp 702-706, August 1991. [103]H. Lou. Implementing the Viterbi Algorithm. IEEE Signal Processing Magazine. Vol 12, No. 5 pp. 42-52, September 1995. [104].J. Lipoff. Personal communications networks bridging the gap between cellular and cordless phones. Proceedings of the IEEE, Vol. 82, No. 4, pp. 564-571, April 1994. [105]M. Liou. Overview of the p*64 kbit/s Video Coding Standard. Communications of the ACM, April 1991. G.K. Ma and F.J. Taylor. Multiplier policies for digital signal processing. IEEE ASSP Magazine, Vol. 7, No. 4, pp 6-20, January 1990. [106]G-K. Ma and F.J. Taylor. Multiplier policies for digital signal processing. IEEE ASSP Magazine, Vol. 7, No. 1, January 1990. [107]Y. Mahieux, G. Le Tourneur and A. Saliou. A microphone array for multimedia workstations. Journal of the Audio Engineering Society, Vol. 44, No. 5, pp. 331-353, May 1996. [108]D.T. Magill, F.D. Natali and G.P. Edwards. Spread-spectrum technology for commercial applications. Proceedings of the IEEE, Vol. 82, No. 4, pp. 572-584, April 1994. [109]V.J. Mathews. Adaptive Polynomial Filters. IEEE Signal Processing Magazine, Vol. 8, No. 3, pp. 10-26, July 1991. [110]N. Morgan and H. Bourland. Neural networks for statistical recognition of continuous speech. Proceedings of the IEEE, Vol. 83, No. 5, pp. 742-770, May 1995. [111]N. Morgan and H. Bourland. Continuous speech recognition. 
IEEE Signal Processing Magazine, Vol. 12, No. 3, pp. 24-42, May 1995. [112]N. Morgan and H. Bourland. Neural networks for statistical recognition of speech. Proceedings of the IEEE, Vol 83, No. 5, pp 742-770, May 1995. [113]A. Miller. From here to ATM. IEEE Spectrum, Vol. 31, No. 6, pp 20-24, June 1994. [114]Y.K. Muthusamy, E. Barnard and R.A. Cole. IEEE Signal Processing Magazine, Vol. 11, No. 4, pp. 33-41, October 1994. [115]R.N. Mutagi. Psuedo noise sequences for engineers. IEE Electronics Communication Engineering Journal, Vol. DSPedia 448 8, No. 2, pp. 79-87, April 1996. [116]R.N. Mutagi. Pseudo noise sequences for engineergs. IEE Electronics and Communication Engineering Journal, Vol. 8. No. 2, pp. 79-87, April 1996. [117]C.L. Nikias and J.M Mendel. Signal processing with higher order statistics. IEEE Signal Processing Magazine, Vol. 10, No. 3, p 10-37, July 1993. [118]P.A. Nelson, F. Orduna-Bustamante, D. Engler, H. Hamada. Experiments on a system for the synthesis of virtual acoustic sources. Journal of the Audio Engineering Society, Vol. 44, No. 11, pp 973-989, Novemeber 1996. [119]P.A. Nelson, F. Orduna-Bustamante, H. Hamada. Multichannel signal processing techniques in the reproduction of sound. Journal of the Audio Engineering Society, Vol. 44, No. 11, pp 973-989, Novemeber 1996. [120]P. Noll. Digital audio coding for visual communications. Proceedings of the IEEE, Vol. 83, No. 6, pp. 9 925-943, June 1995 [121]K.J. Olejniczak and G.T. Heydt. (editors). Special Section on the Hartley Transform. Proceedings of the IEEE, Vol. 82, No . 3, pp. 372-447, March 1994. [122]J. Picone. Continuous speech recognition using hidden Markov models. IEEE ASSP Magazine, Vol. 7, No. 3, pp. 26-41, July 1990. [123]M. Poletti. The design of encoding functions for stereophonic and polyphonic sound systems. Journal of the Audio Engineering Society, Vol. 44, No. 11, pp 948-963, November 1996. [124]P.A. Ramsdale. The development of personal communications. 
IEE Electronics Communication Engineering Journal, Vol. 8, No. 3, pp. 143-151, June 1996. [125]P. Regalia, S.K. Mitra, P.P. Vaidynathan. The digital all-pass filter: a versatile building block. Proceedings of the IEEE, Vol. 76, No. 1, pp. 19-37, January 1988. [126]D.W. Robinson and R.S. Dadson. A redetermination of the equal loudness relations for pure tones. British Journal of Applied Physics, Vol. 7, pp. 166-181, 1956. [127]R.W. Robinson. Tools for Embedded Digital Signal Processing. IEEE Spectrum, Vol. 29, No. 11, pp 81-84, November 1992. [128]C.W. Sanchez. An Understanding and Implementation of the SCMS Serial Copy Management System for Digital Audio Transmission. 94th AES Convention, Preprint #3518, March 1993. R. Schafer and T. Sikora. Digital video coding standards and their role in video communications. Proceedings of the IEEE, Vol. 83, No. 6, pp. 907-924, June 1995. [129]C.E. Shannon. A mathematical theory of Communication. The Bell System Technical Journal, Vol. 27, pp. 379423, July 1948. (Reprinted in Claude Elwood Shannon: Collected Papers [41].) [130]C.E. Shannon. The Bandwagon (Editorial). Institute of Radio Engineers, Transations on Information Theory, Vol. IT-2, p. 3 March 1956. (Reprinted in Claude Elwood Shannon: Collected Papers [41].) [131]J.J. Shynk. Frequency domain and multirate adaptive filtering. IEEE Signal Processing Magazine, Vol. 9, No. 1, pp. 10-37, January 1992. [132]J.J. Shynk. Adaptive IIR filtering. IEEE ASSP Magazine, Vol. 6, No. 2, pp. 4-21, April 1989. [133]H.F. Silverman and D.P. Morgan. The appliation of dynamic programming to converted speech recognition. IEEE ASSP Magazine, Vol. 7, No. 3, pp. 6-25, July 1990. [134]J.L. Smith. Data compression and perceived quality. IEEE Signal Processing Magazine, Vol. 12, No. 5, pp. 58-59, September 1995. 449 [135]A.S. Spanias. Speech coding: a tutorial review. Proceedings of the IEEE, Vol. 82, No. 10, pp. 1541-1582, October 1994. [136]A.O. Steinhardt. 
Householder transforms in signal processing. IEEE Signal Processing Magazine, Vol. 5, No. 3, pp. 4-12, July 1988. [137]R.W. Stewart. Practical DSP for Scientist. Proceedings of IEEE International Conference on Acoutics, Speech and Signal Processing 93, pp. I-32 to I-35, Minneapolis, 1993. [138]C.Stone. Infrasound. Audio Media, Issue 55, AM Publishing Ltd, London,June 1995. [139]J.A. Storer. Special Section on Data Compression. Proceedings of the IEEE, Vol. 82, No. 6, pp. 856-955, June 1994. [140]JP. Strobach. New forms of Levinson and Schur algorithms. IEEE Signal Processing Magazine, Vol. 8, No. 1, pp. 12-36, January 1991. [141].R. Treicher, I. Fijalkow, and C.R. Johnson, Jr. Fractionally spaced equalizers. IEEE Signal Processing Magazine, Vol. 13, No. 3, pp. 65-81, May 1996. [142]B.D.Van Veen and K. Buckley. Beamforming: A Versatile Approach to spatial filtering. IEEE ASSP Magazine, Vol. 5, No.2, pp. 4-24, April 1988. [143]V. Valimaki, J. Huopaniemi, M. Karjalainen and Z. Janosy. Physical modeling of plucked string instruments with application to real time sound synthesis. Journal of the Audio Engineering Society, Vol. 44, No. 5, pp. 331-353, May 1996. [144]V.D. Vaughn and T.S. Wilkinson. System considerations for multispectral image compression designs. IEEE Signal Processing Magazine, Vol. 12, No. 1, pp. 19-31, January 1995. [145]S.A. White. Applications of distributed arithmetic to digital signal processing: a tutorial review. [146]IEEE ASSP Magazine, Vol. 6, No. 3, pp. 4-19, July 1989. [147]W.H.W. Tuttlebee. Cordless telephones and cellular radios: synergies of DECT and GSM. IEE Electronics Communication Engineering Journal, Vol. 8, No. 5, pp. 213-223, October 1996. [148]Working Group on Communication Aids for the Hearing Impaired. Speech perception aids for hearing impaired people: current status and needed research. Journal of Acoustical Society of America, Vol. 90, No.2, 1991 [149]R.D. Wright. Signal processing hearing aids. 
Hearing Aid Audiology Group, Special Publication, British Society of Audiology, London, 1992. [150]F. Wylie. Digital audio data compression. IEE Electronics and Communication Engineering Journal, pp. 5-10, February 1995. [151]I. Wickelgren. The Strange Senses of Other Species. IEEE Spectrum, Vol. 33, No. 3, pp. 32-37. March 1996. [152]B. Widrow et al. Adaptive Noise Cancellation: Principles and Applications. Proceedings of the IEEE, Vol. 63, pp. 1692-1716, 1975. [153]B. Widrow et al. Stationary and Non-stationary learning characteristics of the LMS adaptive filter. Proc. IEEE, Vol 64, pp. 1151-1162, 1976. [154]T. Yamamoto, K. Koguchi, M. Tsuchida. Proposal of a 96kHz sampling digital audio. 97th AES Convention, October 1994, Audio Engineering Society preprint 3884 (F-5). [155]T. Yoshida. The rewritable minidisc system. Proceedings of the IEEE, Vol. 82, No. 10, pp. 1492-1500 October 1994. [156]Y.Q. Zhang, W. Li, M.L. Liou (Editors). Special Issue on Advances in Image and Video Compression. Proceedings DSPedia 450 of the IEEE, Vol. 83, No. 2, February 1995. [157]British Society of Audiology. Recommended procedures for pure tone audiometry. British Journal of Audiometry, Vol. 15, pp213-216, 1981. [158]IEC-958/ IEC-85, Digital Audio Interface / Amendment. International Electrotechnical Commission, 1990. [159]DSP Education Session. Proceedings of IEEE International Conference on Acoutics, Speech and Signal Processing 92, pp. 73-109, San Francisco, 1992. [160]Special Section on the Hartley Transform (Edited by K.J. Olejniczak and G.T. Heydt). Proceedings of the IEEE, Vol. 82, No. 3, March 1994. [161]Special Issue on Advances in Image and Video Compression (Edited by Y.Q. Zhang, W. Li and M.L. Liou). Proceedings of the IEEE, Vol. 83, No. 2, February 1995. [162]Special Issue on Digital Television Part 2: Hardware and Applications (Editor M. Kunt). Proceedings of the IEEE, Vol. 83, No. 7, July 1995. 
[163]Special Issue on Electrical Therapy of Cardiac Arrhythmias (Edited by R.E. Ideker and R.C. Barr). Proceedings of the IEEE, Vol. 84, No. 3, March 1996. [164]Special Section on Data Compression (Editor J.A. Storer). Proceedings of the IEEE, Vol. 82, No. 6, June 1994. [165]Special Section on Field Programmable Gate Arrays (Editor A. El Gamal). Proceedings of the IEEE, Vol. 81, No. 7, July 1993. [166]Special Issue on Wireless Networks for Mobile and Personal Communications (Editor B. Jabbari). Proceedings of the IEEE, Vol. 82, No. 9, September 1994. [167]Special Issue on Digital Television, Part 1: Technologies (Editor M. Kunt). Proceedings of the IEEE, Vol. 83, No. 6, June 1995. [168]Special Issue on Time-Frequency Analysis (Editor P.J. Loughlin). Proceedings of the IEEE, Vol. 84, No. 9, September 1996. [169]Technology 1995. IEEE Spectrum, Vol. 32, No.1, January 1995.
